February 28th, 2019
Working with data is a struggle. Data's fluid, ever-changing nature creates barriers that obstruct the extraction of useful insights.
Businesses have always used data, but today there are more sources and higher volumes than ever before. Users take on different roles and functions around the data, defined by their skills and specialties.
The tools available today that support these endeavors are arguably the best they have ever been. Whether your organization has users with very little technical skill or users who require highly advanced data processing, there is a tool for every person. If data availability, technology changes, and expanded job functions supporting data have helped breach these barriers, why is working with data still so difficult?
For these segments of data and processing to work cohesively, organizations need to consider modernizing the data platform as a whole. Let’s consider three key needs of a successful data platform and how they can unite data, processing, and users into a collaborative space that enables the discovery and delivery of value.
First, all users must have the same understanding of the data. It is not unusual for a single piece of data to have various meanings within a single organization. To further complicate matters, how that single piece of data relates or links to other pieces of data can also be a mystery.
I recall working with an organization on this very challenge; it led me to a single user who kept a local copy of the data definitions on his desktop. Users who were aware knew to go to Tim; for those who were not, new definitions based on their own understanding emerged. What if Tim and his spreadsheet were available 24/7? How much better would it be if every user had Tim’s understanding of the data? A data catalog provides exactly this. Users can search, tag, visualize, and, most importantly, understand the business context of the data. Irrespective of structure (or lack of it), users can read the glossary that defines the meaning of the data and its relationships and apply it to processing.
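To make the idea concrete, here is a minimal sketch of a shared glossary inside an in-memory catalog. The class names, fields, and search behavior are illustrative assumptions, not any particular product's API:

```python
from dataclasses import dataclass, field

@dataclass
class GlossaryEntry:
    """One business-glossary term in a hypothetical data catalog."""
    term: str
    definition: str
    tags: list = field(default_factory=list)
    related_terms: list = field(default_factory=list)

class DataCatalog:
    """Minimal in-memory catalog: every user reads the same definitions."""
    def __init__(self):
        self._entries = {}

    def register(self, entry: GlossaryEntry):
        self._entries[entry.term.lower()] = entry

    def search(self, keyword: str):
        """Return entries whose term, definition, or tags mention the keyword."""
        kw = keyword.lower()
        return [e for e in self._entries.values()
                if kw in e.term.lower()
                or kw in e.definition.lower()
                or any(kw in t.lower() for t in e.tags)]

catalog = DataCatalog()
catalog.register(GlossaryEntry(
    term="transaction_date",
    definition="Calendar date on which a customer transaction settled.",
    tags=["finance", "date"],
    related_terms=["transaction_id", "settlement_amount"],
))
```

The point is not the data structure itself but that search, tags, and relationships live in one shared place instead of on Tim's desktop.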
The data catalog helps all the various roles in the organization understand the data’s business and technical value while being able to access the right data through various open and third-party tools.
Second, as data flows in from source systems and we begin to understand its usefulness, the next step is to refine it. Refining is the process of removing the crude or unwanted portions of the data: shaping it to be more unified in its structure (for instance, removing gaps or missing data), more coherent in its format (for instance, applying a standard format to an attribute), and more secure for use across the organization.
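Both refinement steps named above can be sketched in a few lines. The field names and date formats here are assumptions for illustration: records with gaps are dropped, and a transaction-date attribute is coerced into one standard layout.

```python
from datetime import datetime

# Raw records with missing values and inconsistent date formats.
raw = [
    {"customer": "A-100", "txn_date": "03/02/2019"},
    {"customer": "A-101", "txn_date": "2019-02-28"},
    {"customer": None,    "txn_date": "Feb 27, 2019"},  # gap: will be dropped
]

KNOWN_FORMATS = ["%m/%d/%Y", "%Y-%m-%d", "%b %d, %Y"]

def standardize_date(value):
    """Coerce a date string into ISO 8601, trying each known format."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(value, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return None  # unparseable: treat as a quality failure

refined = [
    {**rec, "txn_date": standardize_date(rec["txn_date"])}
    for rec in raw
    if all(rec.values())  # structural refinement: remove records with gaps
]
```

In practice this logic would run inside the platform's pipeline rather than a script, but the shape of the work (filter gaps, normalize formats) is the same.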
As part of the modernization of a data platform, these data quality rules and methodologies must be applied at the correct time and place to be efficient. Whether the rules are simple or compound, organizations need to let business users develop the rules and IT orchestrate them seamlessly. This coupling between users supports collaboration and accelerates time to value. Data flows can then route bad data back for remediation instead of letting it reach downstream reporting. As a result, analytics are not hampered by poor-quality data.
A good data quality process will prevent access to the wrong data while at the same time sending well-formatted and secured sensitive data forward for business development needs.
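One common way to realize this split of responsibilities is to express each business rule as a simple named predicate and let the pipeline partition records into a clean stream and a remediation queue. This is a generic sketch with assumed field names, not a specific product's rule engine:

```python
# Each rule is a (name, predicate) pair a business user could define;
# the orchestration function below is what IT would schedule in the pipeline.
RULES = [
    ("amount_positive", lambda r: r["amount"] > 0),
    ("currency_known",  lambda r: r["currency"] in {"USD", "EUR", "GBP"}),
]

def apply_rules(records, rules):
    """Split records into a clean stream and a remediation queue."""
    clean, remediation = [], []
    for rec in records:
        failed = [name for name, check in rules if not check(rec)]
        if failed:
            # Annotate the record so remediation knows what to fix.
            remediation.append({**rec, "failed_rules": failed})
        else:
            clean.append(rec)
    return clean, remediation

records = [
    {"amount": 120.0, "currency": "USD"},
    {"amount": -5.0,  "currency": "JPY"},
]
clean, remediation = apply_rules(records, RULES)
```

Because each failure is tagged with the rule it broke, the remediation queue doubles as a data quality report.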
Finally, providing all users access to clean and secure data means supporting all the various avenues to the data. Many business users prefer a visualized way of communicating the value within the data which could be through the lens of data quality (How many formats for transaction date exist in the data?) or for more analytic purposes (What are the possible product purchase outcomes based on what we know of our customers?). In either case, the modern data platform needs to help users of all skills and disciplines collaborate easily across the data.
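The data quality question posed above ("How many formats for transaction date exist in the data?") can be answered with a small profiling pass. The regex signatures below cover only a few assumed layouts and are illustrative, not exhaustive:

```python
import re
from collections import Counter

# Regex signatures for a few common date layouts (assumed, not exhaustive).
FORMAT_PATTERNS = {
    "ISO (YYYY-MM-DD)":    re.compile(r"^\d{4}-\d{2}-\d{2}$"),
    "US (MM/DD/YYYY)":     re.compile(r"^\d{2}/\d{2}/\d{4}$"),
    "Long (Mon DD, YYYY)": re.compile(r"^[A-Z][a-z]{2} \d{1,2}, \d{4}$"),
}

def profile_date_formats(values):
    """Count how many transaction-date values match each known layout."""
    counts = Counter()
    for v in values:
        label = next((name for name, pat in FORMAT_PATTERNS.items()
                      if pat.match(v)), "unrecognized")
        counts[label] += 1
    return counts

dates = ["2019-02-28", "02/28/2019", "Feb 28, 2019", "28.02.2019"]
profile = profile_date_formats(dates)
```

A result with more than one bucket, or any values in "unrecognized", is exactly the kind of finding a business user would want surfaced visually.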
Self-service provides a business user with a quick and easy user interface to query and transform data while at the same time providing a data engineer the same abilities through an API or a data scientist a coding interface to complete the task. Self-service enables all users to collaborate across the data landscape through a variety of tools (native, open, and third-party) to understand the data and quickly arrive at a business outcome.
As the ever-changing data landscape grows and becomes more treacherous, new technologies and approaches are formed to help all users find value in their data. The key here is for organizations to modernize their data platform by providing all users the ability to access, transform, and provision the data. If you’re interested in modernizing your approach to data you’ll want to consider Zaloni Arena to accelerate your data strategy. Contact us to learn more.