Collate Blog

OpenMetadata 0.13.0 Release

OpenMetadata 0.13.0 Release — Data Insights & KPIs, Lineage Traceability, Data Lake Profiler, Search Improvements, and lots more.

Written By: Suresh Srinivas, Sachin Chaurasiya, Teddy Crépineau, Shailesh Parmar, Nahuel, Chirag M, Mayur Singal, Pere Miquel Brull, Mohit Yadav, Shilpa Vernekar, Sriharsha Chintalapani

With Thanksgiving just around the corner, we are elated to announce the OpenMetadata 0.13.0 release, which introduces a major feature: Data Insights. OpenMetadata was never intended to be yet another passive data cataloging tool. We’ve always focused on features that drive collaboration and bring data teams together to sort out the data deluge. With the built-in goal setting and tracking in Data Insights, you can now proactively drive the data culture of your company: set targets, monitor progress, and help teams reach their data goals.

The 0.13 release sets the stage for taking automation to the next level by laying the groundwork for well-defined policies for Bots. Bots now handle specific processes and can be given scoped access permissions accordingly. The lineage UI has been redesigned to improve the user experience: it now displays end-to-end lineage at the table and column levels to help with Impact Analysis.

Several other improvements have been made in the 0.13.0 release. A Domo connector has been added as a Database, Dashboard, and Pipeline service. Three more connectors were developed as part of Hacktoberfest, which has been an overwhelming success for the OpenMetadata community: AWS SageMaker, AWS Kinesis, and AWS QuickSight. We now parse Avro and Protobuf schemas to extract the fields from Kafka and Redpanda Messaging services. We also support profiling for data lakes such as Amazon S3. Single Sign-On with LDAP has been added in this release, and Advanced Search has been introduced to help discover assets quickly.

Join us for a webinar on How to Improve Data Culture using OpenMetadata on Thursday, December 1st at 9:00 AM PST. We’ll discuss how you can:

  • Enhance your data platform governance

  • Drive and monitor OpenMetadata platform usage

  • Engage your users with Data Insight KPIs

Community Update

We are grateful to the engaging community for their participation, feature recommendations, code contributions, and the awesome feedback we’ve been receiving. Thank you for being such a wonderful open-source community.

  • Crossed 1600+ GitHub stars

  • The Slack community reached 1800+ members

  • 127 Open source GitHub developers

  • 738 Commits were merged into the 0.13.0 release

OpenMetadata 0.13.0 Release Highlights

Data Insights and KPIs

Data Insights is a new feature that shifts teams from a passive approach to data toward a collaborative one that improves data culture. It aims to provide a single-pane view of the key metrics that best reflect the state of your data. OpenMetadata gathers metrics on the metadata you are extracting, the different types of data assets created, and how the data evolves over time. Based on these metrics, we provide analytics to assess the gathered data.

Admins can define the Key Performance Indicators (KPIs) and set goals within OpenMetadata to work towards better documentation, ownership, and tiering.

These goals apply to different types of data assets and set targets to be reached within a specified time. For example, Admins can set goals to have at least 60% documentation coverage, ownership, and tiering of data by the end of Q1 2023. Teams can view a time series report to monitor the health of their data and track progress toward the organization's goal.
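To make the 60% example concrete, here is a minimal sketch of how a coverage KPI could be computed and compared against a target. It is an illustration only, not how OpenMetadata computes its reports; the `Asset` class and the `coverage` and `kpi_met` helpers are hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    """Hypothetical stand-in for a data asset; not an OpenMetadata class."""
    name: str
    description: str = ""
    owner: str = ""


def coverage(assets, has_field):
    """Return the percentage of assets for which has_field(asset) is true."""
    if not assets:
        return 0.0
    return 100.0 * sum(1 for a in assets if has_field(a)) / len(assets)


def kpi_met(current_pct, target_pct):
    """A KPI is met once the current percentage reaches the target."""
    return current_pct >= target_pct


assets = [
    Asset("orders", description="Customer orders", owner="sales"),
    Asset("users", description="Registered users"),
    Asset("events"),
    Asset("payments", description="Payment records", owner="finance"),
    Asset("sessions"),
]

# 3 of 5 assets are documented, 2 of 5 have an owner.
doc_pct = coverage(assets, lambda a: bool(a.description))
own_pct = coverage(assets, lambda a: bool(a.owner))

print(f"documentation: {doc_pct:.0f}% (target met: {kpi_met(doc_pct, 60.0)})")
print(f"ownership:     {own_pct:.0f}% (target met: {kpi_met(own_pct, 60.0)})")
```

In this toy data set the documentation goal is just met while the ownership goal is not, which is the kind of gap a time series report would surface.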

In addition to the metrics on data, Admins can view aggregated user activity and get insights into user engagement and user growth. For example, Admins can check daily active users to see how OpenMetadata is being used.

The Data Insights Report is emailed weekly so that teams can assess their performance relative to the KPIs set at an organizational level to improve data culture on an ongoing basis.

Lineage

Trace end-to-end Lineage

The lineage UI has been redesigned to improve the user experience. Users can get a holistic view of an entity from the Lineage tab. When an entity is selected, the UI displays end-to-end lineage at the table and column levels. Just search for an entity and expand the graph to unfold its lineage. The graph displays the upstream and downstream of each node, enabling Impact Analysis so that data issues can be identified and fixed quickly.

The Lineage Tab UI supports two-finger scrolling to zoom in or zoom out.
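Conceptually, expanding upstream and downstream lineage is a graph traversal. The sketch below shows the idea with a breadth-first search over a small, made-up lineage graph. The table names and the `reachable` and `invert` helpers are assumptions for illustration and do not reflect OpenMetadata's internal implementation.

```python
from collections import deque


def reachable(edges, start):
    """Breadth-first traversal: every node reachable from start, excluding start."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


def invert(edges):
    """Flip every edge so that a downstream graph becomes an upstream graph."""
    inverted = {}
    for src, dsts in edges.items():
        for dst in dsts:
            inverted.setdefault(dst, []).append(src)
    return inverted


# Hypothetical lineage: each key feeds the entities listed in its value.
downstream_edges = {
    "raw.orders": ["staging.orders"],
    "raw.customers": ["staging.customers"],
    "staging.orders": ["mart.revenue"],
    "staging.customers": ["mart.revenue"],
    "mart.revenue": ["dashboard.sales"],
}

# Impact analysis: what is affected if staging.orders breaks?
impacted = reachable(downstream_edges, "staging.orders")

# Root-cause analysis: what does mart.revenue depend on?
sources = reachable(invert(downstream_edges), "mart.revenue")

print("impacted:", impacted)
print("sources:", sources)
```

The same traversal answers both questions; only the direction of the edges changes.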

Data Quality

Data Profiler for Data Lakes such as AWS S3 and Google Cloud Storage (GCS)

With the OpenMetadata UI, users can now create and deploy profiling workflows for the Datalake connector, which supports AWS S3 and GCS. In the next release, we’ll add support for running tests and extend coverage to Azure ADLS.

Advanced Search

OpenMetadata already supports advanced search syntax. Because it is syntax-driven, it is hard to use for anyone but advanced users. In 0.13, a Syntax Editor has been introduced for advanced search with And/Or conditions that help discover assets quickly. A huge thank you to Cristian Osiac from Bloomberg for helping with this feature.
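As a rough mental model of And/Or conditions, a search can be pictured as a tree of conditions evaluated against each asset. The sketch below is an assumption-laden illustration in plain Python: the condition format, field names, and `matches` helper are invented here and are not the query syntax that OpenMetadata uses.

```python
def matches(asset, condition):
    """Return True if the asset satisfies a nested And/Or condition tree."""
    if "and" in condition:
        return all(matches(asset, c) for c in condition["and"])
    if "or" in condition:
        return any(matches(asset, c) for c in condition["or"])
    # Leaf node: a simple equality test on a single field.
    return asset.get(condition["field"]) == condition["equals"]


# Hypothetical assets and fields, for illustration only.
assets = [
    {"name": "orders", "tier": "Tier1", "owner": "sales"},
    {"name": "users", "tier": "Tier2", "owner": "sales"},
    {"name": "revenue", "tier": "Tier1", "owner": "finance"},
    {"name": "clicks", "tier": "Tier1", "owner": "web"},
]

# Tier1 AND (owner is sales OR owner is finance)
query = {
    "and": [
        {"field": "tier", "equals": "Tier1"},
        {"or": [
            {"field": "owner", "equals": "sales"},
            {"field": "owner", "equals": "finance"},
        ]},
    ]
}

found = [a["name"] for a in assets if matches(a, query)]
print(found)
```

Nesting an Or group inside an And group is what lets a single search narrow by one attribute while staying broad on another.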

Security

With the addition of LDAP in the current release, OpenMetadata supports nine SSO options: LDAP, Google, Azure, Okta, OneLogin, Auth0, Amazon Cognito, Keycloak, and custom OIDC. In the 0.12.1 release, support was added for basic authentication, which lets users sign up with a Username/Password.

Until now, OpenMetadata Roles and Policies treated Bots as special users with access to all the APIs and entities, just like an Admin. Bots have been used for ingestion to extract metadata, for the data profiler, and so on. In the 0.13 release, we’ve created multiple bots to serve different scenarios: for example, the Ingestion, Lineage, Data Quality, and Profiler Bots.

Given the varying roles for specific bots, the policies and access control for bots have been redefined. Now, Bots can have their own Roles. For example, the Ingestion Bot can create and update entities. The Profiler Bot can only update the profile of a table; it has no policies for other entities and cannot, for example, update table descriptions.
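The idea of scoped bot permissions can be sketched as a lookup of allowed (entity type, operation) pairs per bot. This is a simplified illustration under stated assumptions: the bot identifiers, operation names, and the `is_allowed` helper below are hypothetical and are not OpenMetadata's actual policy model or API.

```python
# Hypothetical rules: each bot maps to the (entity_type, operation) pairs it may perform.
# "*" is used here as a wildcard meaning "any entity type".
BOT_RULES = {
    "ingestion-bot": {("*", "create"), ("*", "update")},
    "profiler-bot": {("table", "update_profile")},
}


def is_allowed(bot, entity_type, operation):
    """Return True if the bot's rules permit the operation on the entity type."""
    rules = BOT_RULES.get(bot, set())
    return (entity_type, operation) in rules or ("*", operation) in rules


checks = [
    ("ingestion-bot", "table", "create"),
    ("ingestion-bot", "dashboard", "update"),
    ("profiler-bot", "table", "update_profile"),
    ("profiler-bot", "table", "update_description"),
    ("profiler-bot", "dashboard", "update_profile"),
]

for bot, entity_type, operation in checks:
    verdict = "allowed" if is_allowed(bot, entity_type, operation) else "denied"
    print(f"{bot}: {operation} on {entity_type} -> {verdict}")
```

Unknown bots and unlisted operations are denied by default, which mirrors the move away from giving every bot Admin-like access.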

New Connectors

In the 0.13 release, we have introduced four new connectors:

  • Domo is a cloud-based dashboard service. The Domo Business Cloud is a low-code data app platform that takes the power of BI to the next level by combining all your data and putting it to work across any business process or workflow. OpenMetadata supports Domo as a Database, Dashboard, as well as a Pipeline service.

Hacktoberfest has been a complete success for the OpenMetadata community, with three connectors being developed as part of the event:

  • AWS SageMaker is a fully managed machine learning service, where data scientists and developers can quickly and easily build and train machine learning models, and then directly deploy them into a production-ready hosted environment.

  • AWS Kinesis is a cloud-based messaging service for processing large volumes of streaming data in real time.

  • AWS QuickSight is a cloud-scale business intelligence (BI) service that allows everyone in the organization to understand the data by asking questions in natural language, exploring through interactive dashboards, or automatically looking for patterns and outliers powered by machine learning.

Big thanks and congratulations to Michael Zhou for developing AWS QuickSight and to Tushar Mittal for adding both AWS SageMaker and AWS Kinesis.

Several improvements have been made to the ingestion framework. In the 0.12.1 release, we shipped the ability to add a custom service type. Users can now develop their own connector and run ingestion with it just like any other supported service. If you’d like to learn more, check out the demo.

Messaging Service Schemas

Major enhancements have been made to the way data is extracted from Kafka and Redpanda Messaging services. Previously, OpenMetadata extracted all the Topics in the messaging queue and also connected to the Schema Registry to get the Schemas. These schemas were taken as one payload and simply published to OpenMetadata. We now parse Avro and Protobuf Schemas to extract the fields.
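To illustrate the difference between publishing a schema as one payload and extracting its fields, here is a minimal sketch that walks an Avro record schema. Avro schemas are JSON documents, so the standard `json` module is enough for this example. The sample schema and the `extract_fields` helper are made up for illustration and are not the parser that OpenMetadata ships.

```python
import json


def extract_fields(schema, prefix=""):
    """Flatten an Avro record schema into (dotted_name, type_name) pairs."""
    fields = []
    for field in schema.get("fields", []):
        name = prefix + field["name"]
        ftype = field["type"]
        if isinstance(ftype, dict) and ftype.get("type") == "record":
            # Nested record: report it, then descend into its own fields.
            fields.append((name, "record"))
            fields.extend(extract_fields(ftype, prefix=name + "."))
        elif isinstance(ftype, dict):
            # Other complex types such as array, map, or enum.
            fields.append((name, ftype.get("type")))
        elif isinstance(ftype, list):
            # A JSON array of types is an Avro union, e.g. ["null", "string"].
            fields.append((name, "union"))
        else:
            # Primitive type given as a plain string.
            fields.append((name, ftype))
    return fields


# Hypothetical schema as it might be registered for a topic.
raw_schema = """
{
  "type": "record",
  "name": "Order",
  "fields": [
    {"name": "id", "type": "long"},
    {"name": "note", "type": ["null", "string"]},
    {"name": "customer", "type": {
      "type": "record",
      "name": "Customer",
      "fields": [{"name": "email", "type": "string"}]
    }}
  ]
}
"""

fields = extract_fields(json.loads(raw_schema))
for name, type_name in fields:
    print(f"{name}: {type_name}")
```

With the fields extracted individually, each one can carry its own description and tags instead of being buried inside a single schema blob.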

Other Changes

  • Soft-deleted entities can be restored. Currently, ML Models are the only entities for which this is not supported.

  • Soft deleted teams can be restored. When restoring a soft dele

Thanks to our Contributors

We are thankful for the overwhelming feedback and support we received from our community. We are grateful to the following community members for their code contributions:

Thanks to Abhishek, Ali Jbeili, Allen Haozi, Anna Malchow-Perryman, Ashish, Benjamin Meyer, Cometta, Daniel Fedak, Deepak Parashar, Dipankar, Flavio Altinier Maximiano da Silva, geoHeil, George, Guangyu Qu, Jromandv, Juan Suarez, Koabhi, Konnectr, Luis Jiménez Tortosa, Marcelo Sarmento, Marius Osiac, Martin Trillhaas, Michael Tiemann, Mukhesh Narra, Nathan, Pauline Tolstova, Raphael Lima, Sam Firke, Sidharth Reddy, Sisin, Steven Spadotto, Vishwajeet Kumar, and Vqcuong for raising GitHub issues that made it to the 0.13.0 release.

Please reach out to us on Slack if you have any questions about code, installation, and docs. For feature requests, please file a GitHub issue or reach out to us on Slack. Interested in contributing code? Here are some good starting issues to get you going.

Like what we are doing? Please give us a GitHub star. That’ll help OpenMetadata reach a wider audience.


OpenMetadata 0.13.0 Release was originally published in OpenMetadata on Medium.