In this blog, Ashwin Nayak, CTO of Zaloni, dives into the live questions asked during his recent webinar, Making the Business Case for Automated Data Governance. In the webinar, Ashwin discusses the importance of incorporating data governance into your data ecosystem to ensure compliance with the latest privacy laws and regulations and the growing need to provide data consumers with self-service access to data. Unfortunately, when these data governance processes are not automated, it is far more difficult for environments to scale as the volume of data increases. Ashwin refers to his firsthand experience and expertise in implementing a data platform and the right processes to facilitate successful data governance to overcome such challenges.
What are the technical architecture patterns to implement automated governance?
Several architecture patterns can be used when implementing data governance or a data platform. Before drawing a technical architecture diagram, it’s essential to have a business technology diagram of your entire data ecosystem. Doing so helps organizations pinpoint and map their data sources, which may serve different purposes: one source may be consumed directly by end users, while another may house an application that generates the data. A business technology diagram helps teams visualize the data ecosystem’s interconnectivity, develop processes, and understand how data consumers will benefit from the available data and the tools they’ll be using.
From a technical architecture standpoint, consider leveraging data mesh architecture patterns, where each department or domain has its own data zones for data discovery and processing to create trusted, refined datasets. Domains can share artifacts (such as rule definitions or templates) with other domains. The architecture can be implemented in a lakehouse compute environment to reduce data duplication, improve cost efficiency, and provide faster data access.
In addition, consider a robust metadata-led data fabric and tools that capture technical, operational, and business metadata for the entire data ecosystem, irrespective of the tech stack you implement. An active metadata fabric will allow you to implement data governance, policies, and access controls, along with working KPIs to measure success and improve adoption.
Where should governance be on a list of priorities for a CDO?
Governance should be part of the broader data analytics and AI strategy and of the data platform implementation. It enables production-grade analytics and AI, advances your data engineering process, improves the organization’s data literacy, and helps meet compliance requirements. Conversely, if data governance is treated as a stand-alone tool or process, it will most likely fail.
Who is the business sponsor of the data governance program?
We are finding that CDOs, or heads of governance or data, are the economic sponsors, while IT is the key technology partner and the data team is the user of the system.
Maintenance and usability are easier in the cloud than on-prem, but the cloud comes with multiple risks. On-prem is safer, as we can always use multiple borderline security, but ultimately it is slower to access and costs more. Ashwin, what is your opinion on this?
Whether to be in the cloud or on-prem has been a popular debate. At Zaloni and among our customers, we see increasingly hybrid data environments. Many of our customers are migrating some of their business units or their data platform to the cloud, while others have fully migrated. From a networking and security standpoint, the cloud has plenty of safeguards in place, along with zero-trust connectivity to applications and databases. Wherever the data resides, hybrid governance is becoming critical for many of our customers. Customers who already use data on-prem and are moving to the cloud want to keep the same tools and policies during the migration. A couple of things to consider as one migrates to the cloud:
In your opinion, how do you measure success in data governance?
During the webinar, I touched on the criteria for measuring data governance success within an organization. Below are the four pillars and metrics discussed:
Another vital piece of measuring success is communication. Find the right cadence to communicate the metrics and share progress.
When it comes to automating governance, I could imagine that adding metrics to sensitive attributes and combinations of attributes could be pretty important. Where do you start with that? What metric would you use for the sensitivity of an attribute?
Automating governance starts with classifying data to understand its confidentiality and sensitivity, and with implementing supervised processes in which data stewards accept or reject the classification recommendations. Once data stewards approve the result, the next set of automation workflows is triggered to perform actions such as pseudonymizing confidential data and applying data quality rules. The governance policies drive which actions to start and when. In addition, consider automating the tagging of data based on sensitivity and confidentiality.
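To make that flow concrete, here is a minimal Python sketch of the three steps: recommend a classification, let a steward approve it, then pseudonymize the approved columns. Everything in it is an assumption for illustration only: the regex patterns, the function names (`recommend_tags`, `pseudonymize`, `apply_approved`), and the salted-hash approach are hypothetical and do not represent how Zaloni Arena or any specific platform works. Data quality rules are left out to keep the example short.

```python
import hashlib
import re

# Hypothetical regex-based classifiers; a real platform would use richer detection.
PATTERNS = {
    "EMAIL": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "SSN": re.compile(r"^\d{3}-\d{2}-\d{4}$"),
}


def recommend_tags(columns):
    """Suggest a tag for each column whose sample values all match one pattern."""
    recommendations = {}
    for name, values in columns.items():
        for tag, pattern in PATTERNS.items():
            if values and all(pattern.match(v) for v in values):
                recommendations[name] = tag
                break
    return recommendations


def pseudonymize(value, salt="demo-salt"):
    """Replace a value with a truncated salted SHA-256 digest (illustrative only)."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()[:16]


def apply_approved(columns, recommendations, approved):
    """Pseudonymize only columns that were recommended AND approved by a steward."""
    result = {}
    for name, values in columns.items():
        if name in recommendations and name in approved:
            result[name] = [pseudonymize(v) for v in values]
        else:
            result[name] = list(values)
    return result


columns = {
    "email": ["ann@example.com", "bob@example.org"],
    "city": ["Durham", "Raleigh"],
}
recs = recommend_tags(columns)      # step 1: automated classification
approved = {"email"}                # step 2: steward accepts the recommendation
governed = apply_approved(columns, recs, approved)  # step 3: triggered action
```

The point of the design is that the action in step 3 is gated on the steward decision in step 2, so a rejected recommendation leaves the column untouched.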
The sensitivity of attributes can be determined based on data classification. You may want to consider measuring the attributes that are classified as PII, PCI, or PHI across the database or application.
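One simple way to turn that into a number is to count classified attributes and report their share of the total. The sketch below assumes a hypothetical catalog represented as a dictionary from attribute name to classification label; the attribute names and the `sensitivity_summary` function are made up for illustration and are not taken from the webinar.

```python
from collections import Counter

SENSITIVE_CLASSES = {"PII", "PCI", "PHI"}


def sensitivity_summary(catalog):
    """Summarize how many attributes carry a sensitive classification.

    catalog: dict mapping attribute name -> classification label (or None).
    """
    counts = Counter(
        label for label in catalog.values() if label in SENSITIVE_CLASSES
    )
    total = len(catalog)
    sensitive = sum(counts.values())
    return {
        "total_attributes": total,
        "sensitive_attributes": sensitive,
        "sensitive_share": sensitive / total if total else 0.0,
        "by_class": dict(counts),
    }


# Hypothetical catalog entries, for illustration only.
catalog = {
    "customers.email": "PII",
    "customers.card_number": "PCI",
    "visits.diagnosis_code": "PHI",
    "orders.total": None,
}
summary = sensitivity_summary(catalog)
```

Tracking this summary per database or application over time gives a rough view of where sensitive data is concentrated.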
How do you foster a collaboration and mutual success culture for data governance, data engineering, and IT?
After a data platform is implemented, we are seeing, especially with our customers, the importance of starting with “why”: communicating the business value of governance and why it needs to be part of your overall data management and data ecosystem plan. Governance initiatives involve many stakeholders, such as engineering, IT security, and governance-specific teams. These stakeholders should be present when the governance plan is created, so that each team knows how it will contribute to fulfilling the data governance plans and initiatives.
Additionally, and on a more proactive note, the team designated to execute data delivery is critical. For example, data engineers, data stewards, and data analysts from their respective departments should be incorporated into your daily scrum. Bringing these individuals into your day-to-day activities will help your organization succeed.
Lastly, we discussed in the presentation building a RACI model (as shown below). This model helps to define roles and responsibilities clearly.
If you enjoyed learning more about data governance and the value it can bring your organization while implementing a data platform, be sure to watch Ashwin’s webinar. Feel free to explore our website to learn about Zaloni, our Zaloni Arena data governance platform, and our many resources, like our complimentary Data Governance 101 e-learning course and Zaloni’s latest data governance research report with Dataversity.