You can’t truly demonstrate the value of your data lake investment until you enable more business users to access it. Our experience, as well as industry research, shows that if you build it, they will indeed come. In fact, according to Gartner, self-service business users are on a trajectory to produce more analytics output than data scientists!
The key to making self-service data a reality is right-sized data governance, so that you can control access while users can rely on data quality and understand the context around how data is being used. Before diving into building an enterprise-wide self-service data hub, however, it pays to step back and develop a strategy that will ensure ROI and avoid or solve challenges like data sprawl, ungoverned data, and data quality issues.
What is the best way to go about it? You’ve heard the saying “take it one day at a time.” When it comes to building an enterprise-wide data hub, we say “take it one line of business at a time.” Although this may seem like a slow way to do things, it’s the smart way, and it ultimately results in faster delivery of a successful project. Overall, we recommend a three-step implementation strategy.
We typically tie the steps of implementation to an organization’s level of data maturity. For Step 1, you would be at what we consider “Level 1” in our data hub maturity model, which you can see below. Step 1 involves implementing a data catalog to improve visibility and provide broader self-service access, typically role-based. We recommend building a traditional data catalog that is accessible enterprise-wide, by every line of business (LOB).
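To make the idea of a role-based catalog concrete, here is a minimal sketch in Python. It is not how Arena or any particular catalog product works; the class names, fields, roles, and datasets (`CatalogEntry`, `DataCatalog`, `sales_analyst`, and so on) are all hypothetical, chosen only to illustrate one enterprise-wide registry whose entries are filtered by a user's role.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CatalogEntry:
    """One dataset registered in the catalog (hypothetical schema)."""
    name: str
    owner_lob: str              # line of business that owns the dataset
    description: str
    allowed_roles: frozenset    # roles permitted to access the dataset


@dataclass
class DataCatalog:
    """A single enterprise-wide registry shared by every LOB (illustrative only)."""
    entries: dict = field(default_factory=dict)

    def register(self, entry: CatalogEntry) -> None:
        # Registering under an existing name replaces the earlier entry.
        self.entries[entry.name] = entry

    def visible_to(self, role: str) -> list:
        """Return the sorted names of datasets the given role may access."""
        return sorted(
            name
            for name, entry in self.entries.items()
            if role in entry.allowed_roles
        )


# Hypothetical example data: two datasets owned by different LOBs.
catalog = DataCatalog()
catalog.register(CatalogEntry(
    name="customer_orders",
    owner_lob="sales",
    description="Order history by customer",
    allowed_roles=frozenset({"sales_analyst", "data_steward"}),
))
catalog.register(CatalogEntry(
    name="payroll",
    owner_lob="hr",
    description="Monthly payroll records",
    allowed_roles=frozenset({"hr_analyst", "data_steward"}),
))

print(catalog.visible_to("sales_analyst"))
print(catalog.visible_to("data_steward"))
```

In this sketch, access rules live with the catalog entry itself, so every line of business searches the same registry but sees only what its roles allow.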
For Step 2, work with one LOB at a time to enrich the data that the LOB needs to achieve its goals. As you work to understand the group’s requirements and enrich the data using both internal and external sources, the key is to enable your business users to process the data themselves, quickly, and in a self-governed way. We find there is no point in doing this at an enterprise level because it takes too long and delivers minimal value.
Once you can do Step 2 successfully, move on to Step 3: “action” the enriched data one LOB at a time. This should be a discrete enhancement, done separately from Steps 1 and 2. This step is where you’ll see the power of the active data hub. By “active data hub,” we mean a catalog that also serves as the foundation of all self-service data activities. This, to us, is Level 3 in the data hub maturity model. Business users will be empowered to take the enriched data definitions and underlying data and deliver them to their business applications. Furthermore, the data hub keeps everything coordinated and updated as the enriched data builds and changes.
Taking the time to deliver a self-service data hub one line of business at a time enables you to build a stable and sustainable foundation based on real-world successes, while learning from challenges and mistakes that won’t sink the whole ship. Zaloni’s DataOps platform, Arena, is a key tool in this process: it enables management of data quality and governance from within the platform, and supports different levels of use and access for different business users as appropriate.
Learn more about Zaloni’s approach to transforming data catalogs into actionable, self-service data hubs, best practices for successful enterprise-wide implementation, and real-world case studies in our white paper, Achieving DataOps Success with a Collaborative Data Catalog.