
How Collate's New Workflow System Automates Data Governance




As modern organizations work to promote data quality, maintain compliance, and enable cross-team collaboration, the evolving nature of data and its governance presents major challenges: tighter regulations, vast amounts of data, and a lack of standardization and automation in governance processes. Most governance work is still manual and fragmented, and it cannot keep pace with the complexity of modern data ecosystems, which puts healthy data management at risk.

To address these challenges, Collate 1.6 reimagined Governance Workflows to help organizations automate and standardize governance processes. This release introduced:

  • An enhanced Glossary Approval workflow allowing greater customizability and automation so organizations can define one that fits their needs.

  • A new Certification feature that automatically categorizes the quality of assets based on given rules.

These updates strengthen Collate's existing governance capabilities and lay the groundwork for future AI and automation innovations.

This blog post explores why data governance is more critical than ever, how Collate 1.6 improves governance workflows, and what’s next for Collate in shaping the future of AI-powered data governance.

The Challenge of Data Governance

Today's data landscape is more complex than ever. Organizations must navigate stricter regulations, manage rapidly expanding data volumes, discover their most meaningful data assets, learn how to best use them, and optimize resource usage.

The first challenge is data discovery: most modern organizations deal with so much data that they don’t even know what they have. Before data can be governed, it must first be found and understood. This is typically done through data catalogs, which index data assets to make them searchable and easier to locate.

Once data is discovered, the next challenge is data observability: understanding the shape, quality, and reliability of the data. Without robust observability, companies risk making critical decisions based on stale, incomplete, or inaccurate data. Observability tracks key metrics such as:

  • Accuracy – Is the data correct?

  • Completeness – Are there missing values?

  • Consistency – Do different sources match up?

The final challenge is data governance: maintaining control over data assets. Governance teams must implement and enforce policies across multiple dimensions, including:

  • Data access: Who can do what, and for what use cases, with the data?

  • Stewardship and ownership: Who is responsible for maintaining quality?

  • Compliance: How does the organization align with regulations like GDPR, CCPA, and industry-specific standards?

  • Consistent business terminology: Are business terms standardized across teams? For example, does “sale” refer to gross revenue or net revenue after returns?

Manual metadata management just can’t keep up, creating significant inefficiencies and inaccuracies. We tackled part of this problem with Metadata Automations in Collate. Streamlining data governance processes comes down to addressing two critical issues:

  1. Inefficiency: Teams relying on manual processes incur ongoing, unscalable workloads to verify that metadata is correct, instead of having automated systems that let them respond only when issues arise.

  2. Inconsistency: Federated data environments can lead to fragmented processes as different teams “reinvent the wheel” by developing separate solutions to similar problems.

To address these issues, Collate 1.6 introduced an enhanced Governance Workflow System that automates and standardizes governance processes. With an improved glossary approval workflow and a new data asset certification feature, organizations can streamline governance efforts, promote consistency, and reduce manual overhead.

Reimagining Workflow Systems

When designing the new Data Governance Workflows, we wanted to automate governance while remaining flexible enough to support diverse use cases. To validate this approach, we started with the Glossary Approval Workflow as a proof of concept.

The Previous Glossary Approval System

Before the introduction of new Governance Workflows, the Glossary Approval process followed the structure shown here:

While powerful, this had several limitations:

  • No mechanism to mark terms as Draft: Reviewers were often prompted for terms that weren’t actually ready for review, leading to wasted time.

  • No automatic re-review trigger for updated terms: If a term was modified after approval, reviewers weren’t notified, increasing the risk of outdated or incorrect definitions.

  • Forced approval of terms without assigned reviewers: If no reviewer was assigned due to human error, terms were automatically approved, bypassing governance controls.

The New Governance Workflow

Rather than simply patching these issues, we redesigned the workflow system from the ground up to support a broader range of governance scenarios. The new Governance Workflow System is built on four key components:

  1. Entity Event Triggers

  2. Automated Attribute Validation

  3. Automated Attribute Assignment

  4. User Tasks

The updated Glossary Approval Workflow looks like this:

Key Improvements

This modular approach maintains backward compatibility while adding several crucial features:

  • The ability to re-review terms when they are updated means no more worrying about unauthorized updates to approved glossary terms.

  • Configurable rules give you flexibility to customize the workflow. For instance, if the term has no reviewer, instead of being approved automatically, it can be kept in Draft.

  • A new In Review state allows you to defer the review until the term meets the required conditions.

  • Visual workflow monitoring lets you manage the workflow configuration from the UI, making it easy to understand at a glance.

Putting It All Together

The new Governance Workflow System gives you full control over how glossary terms move through the approval process. If, for example, your organization requires a custom metadata property to be set before review, you can configure the Is Ready to be Reviewed? step to keep the term in Draft until that condition is met.
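As a rough illustration of how such a routing rule behaves, the sketch below models a term that stays in Draft until it has a reviewer and a required custom property, and that returns to review whenever it is updated. This is not Collate's API: the class and function names, the `dataDomain` property, and the exact readiness rule are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    DRAFT = "Draft"
    IN_REVIEW = "In Review"
    APPROVED = "Approved"


@dataclass
class GlossaryTerm:
    # Hypothetical, simplified model of a glossary term.
    name: str
    reviewers: list = field(default_factory=list)
    custom_properties: dict = field(default_factory=dict)
    status: Status = Status.DRAFT


def is_ready_for_review(term, required_property):
    """Automated attribute validation (assumed rule): the term needs at
    least one reviewer and a non-empty value for the required property."""
    has_reviewer = bool(term.reviewers)
    has_property = bool(term.custom_properties.get(required_property))
    return has_reviewer and has_property


def on_term_event(term, required_property="dataDomain"):
    """Entity event trigger: called on every create or update of a term.

    A ready term moves to In Review, where a user task would be opened for
    its reviewers. Anything else is kept in Draft instead of being approved
    automatically. Because this runs on updates too, an already approved
    term that changes goes back to In Review.
    """
    if is_ready_for_review(term, required_property):
        term.status = Status.IN_REVIEW
    else:
        term.status = Status.DRAFT
    return term.status


def approve(term):
    """User task outcome: a reviewer approves a term that is In Review."""
    if term.status is not Status.IN_REVIEW:
        raise ValueError("only terms in review can be approved")
    term.status = Status.APPROVED
    return term.status
```

In this sketch, a term created without reviewers stays in Draft, moves to In Review once a reviewer and the property are set, and is sent back to In Review if it is edited after approval.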

Automating these processes reduces governance bottlenecks, improves data quality, and helps ensure compliance across the organization.

The New Governance UI

Collate 1.6 introduced a new Governance Workflow UI, designed for visibility and configuration. The interface is structured into two tabs:

Workflow Tab: View and Configure Steps

This tab provides a visual representation of the workflow, allowing you to:

  • View the approval process at a glance

  • Modify the configuration of any workflow step

Configuration Tab: Configure the Trigger

This tab lets you fine-tune governance workflows by managing event triggers to automate workflow initiation:

Extending the System: Certification Workflows

Approving glossaries is not the only problem companies are trying to solve and automate. Organizations also need to help users trust the data and understand which assets have been curated and reviewed by the team.

To address these needs, we have introduced Table and Dashboard Certification Workflows built on top of the Governance Workflows framework.

Certification System Architecture

At the core of these new workflows is the Certification Badge, which helps categorize data assets based on their quality and governance status:

  1. Gold: Your highest-quality assets; well-documented, fully owned, and tested.

  2. Silver: High-quality assets, but missing some requirements for Gold status.

  3. Bronze: Assets in early-stage curation, perhaps with incomplete metadata.

The descriptions for each level are deliberately vague, as different organizations have different standards. What does a Gold data asset look like in your organization? What about a Silver or a Bronze one?

Through the Certification Workflows, these badges are assigned automatically based on automated attribute validations. The out-of-the-box workflows use the following rules:

  • Gold: The data asset is Tier 1 and has owners and a description

  • Silver: The data asset is Tier 2 and has owners and a description

  • Bronze: The data asset is not Tier 1 or Tier 2 but has owners and a description

  • Not set: The data asset does not have owners or a description

These rules are completely customizable so you can build your own definitions.
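The out-of-the-box rules above can be read as a small decision function. The following is a minimal sketch of that logic, not the product's implementation: the `Asset` class and `certify` function are hypothetical, and it assumes that an asset missing either owners or a description gets no badge.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Asset:
    # Hypothetical, simplified view of a table or dashboard.
    name: str
    tier: Optional[int] = None  # 1 for Tier 1, 2 for Tier 2, None if unset
    owners: list = field(default_factory=list)
    description: str = ""


def certify(asset):
    """Return "Gold", "Silver", "Bronze", or None (badge not set)."""
    # Assumption: every badge requires both owners and a description.
    if not asset.owners or not asset.description.strip():
        return None
    if asset.tier == 1:
        return "Gold"
    if asset.tier == 2:
        return "Silver"
    # Owned and described, but neither Tier 1 nor Tier 2.
    return "Bronze"
```

Customizing the rules would then amount to changing the conditions in this function, for example by also requiring passing tests for Gold.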

Technical Implementation of the Workflows

To set it all in motion, we created the Periodic Entity Batch trigger, a scheduled trigger that fetches the assets matching a set of filters and certifies them in batches. Again, the entire workflow (shown below) is controlled through the Workflow and Configuration tabs:

Workflow Tab

Configuration Tab
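To make the batch behavior concrete, here is a hedged sketch of what a single scheduled run might do: select the assets that match a filter, split them into fixed-size batches, and apply a certification rule to each one. The function names, the dictionary shape of an asset, and the default batch size are assumptions for illustration; scheduling itself is left out.

```python
from itertools import islice


def batched(iterable, size):
    """Yield consecutive lists of at most `size` items."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def run_periodic_batch(assets, matches_filter, certify, batch_size=100):
    """One run of a hypothetical periodic batch trigger.

    `assets` is an iterable of dicts with at least a "name" key,
    `matches_filter` decides which assets the workflow applies to, and
    `certify` maps an asset to its badge. Returns {asset name: badge}.
    """
    results = {}
    selected = (asset for asset in assets if matches_filter(asset))
    for batch in batched(selected, batch_size):
        # Each batch is processed as a unit, so a large catalog never has
        # to be handled in a single pass.
        for asset in batch:
            results[asset["name"]] = certify(asset)
    return results
```

A scheduler such as cron or an orchestration tool would call `run_periodic_batch` at the configured interval, with the filter restricting the run to, say, tables only.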

What’s Next for Data Governance Workflows

Collate 1.6 and the Governance Workflows framework mark the first phase of our workflow automation strategy, designed to provide a flexible, customizable way for users to interact with their metadata. Our goal is to continually enhance the platform, making data governance even more streamlined and effective.

When combined with Collate’s Metadata Automations and AI-powered description generation, upcoming enhancements will significantly reduce the manual effort needed to maintain data governance, improving accuracy and scalability. Future updates will include:

  • Expanded workflow customization, so you can not only change the configuration of existing steps but also create your own workflows from scratch for different needs.

  • Integration with additional services so you can use the framework to build workflows that interact with external services.

You can try Collate Governance Workflows today in the live product sandbox with demo data, or sign up for the Collate free tier to use them with your own data assets. Check out our Governance Workflows documentation for more information.
