Read the webinar transcription here:
[Eric Kavanagh] All right folks, welcome back once again to DM Radio. Yes, it’s your host Eric Kavanagh, dialing in from a different location today than usual, actually dialing in from Mercy Hospital in downtown Pittsburgh. Not the best of circumstances, but I’ll tell you what, the show must go on. We’re very excited about today’s topic. We’re going to talk all about volume, velocity, and variety. How about vision? Of course we’re talking big data, and I’m sure we’ll blend into a discussion about analytics today as well. And seeing as I am at a hospital, I’m going to put a little bit of a healthcare spin on it, at least for part of the show today. I am incredibly bullish on the power of analytics, and in particular deep learning, but machine learning in general, to help in the healthcare field. In fact, I would tend to think that in a few short years, we’re going to see deep learning do tremendous things for being able to understand very serious diseases like multiple sclerosis or ALS or some of these other really, really vexing and difficult diseases that some folks unfortunately have, and also just some of the basic stuff like fractures and broken bones and so forth. Once you can scan hundreds and thousands and even tens of thousands of MRIs and really start to analyze them using some deep learning models, we’ll be able to better understand what’s really going on: this particular lesion here, that particular lesion there. We’ll be able to have a better understanding of what all that stuff means, and I’m very excited about that. Now, there are some hurdles that have to be overcome. Obviously there are privacy issues, and there are lots of different issues to sort out.
But as a general rule, I’m very excited about this, and I think we’re going to see a lot of good stuff come out in the near future. Just in general, big data is a whole different ballgame than traditional small data, if you will, enterprise data. Big data is really reshaping the enterprise in many very, very interesting ways. I like to sometimes refer to it as real-world data at scale. We’re seeing all kinds of really interesting developments in terms of companies being able to understand the marketplace, being able to understand their own business, being able to understand consumer trends. For example, there’s this whole movement of alternative data these days, which is being used to fuel marketing operations, and it’s being used to fuel stock market investment choices. So there are lots of really interesting things going on, and we’ll be hearing from Matthew Monahan of a company called Zaloni. And so with that, let’s dive right in. So Matthew, tell us a bit about yourself and Zaloni, and what you folks are doing in the big data space.
[Matthew Monahan] Thank you, Eric, and thank you for having me on the show. As you said, I work for Zaloni. My name’s Matthew Monahan. I’ve been in the data space for quite a while now and really enjoy it; I absolutely love working with data. We’ve talked a lot about the volume, velocity, and variety of data, and it’s absolutely critical to have a plan for managing the data. Of course, you can’t have a plan be successful without having a clear vision, and vision always starts with a good understanding of how and why the data will be used. As you’ve indicated, there are so many more ways now that users can get access to data and figure out what to do with it, whether that’s in the medical space, or in finance, or in everyday life, e-commerce, other areas. It’s absolutely incredible. As the data management industry has evolved over the last decade, we’ve made great strides in moving data into data warehouses and data lakes; however, users still struggle with access to the resources that are there. But why is it that we still have trouble accessing the data? Access to data is not just about pulling it all into a central location, right? First, the heavy ETL process for data warehouses is just too slow for the pace of business today. It’s not uncommon for new data requests to take six or even eight weeks from the initial ask to final provisioning, when a user gets to do something. It’s also a brittle process, prone to failure when either the source or the destination systems change. Second, data lakes, or data swamps as they often become, just don’t have the right level of governance or metadata to be used with confidence.
The data may be there, but how can users be sure that the data they find is accurate, current, and complete? Shivani was talking a lot about that earlier. How can organizations ensure that data is only accessible to the right people for the right reasons, and that data, or access to the data, is retired and removed when appropriate? To enable our business teams to source, understand, and use data, we need to deliver a complete end-to-end solution which provides self-service access to data in hours or days, not weeks or months. And along the way we need to apply that right-sized governance to ensure that business decisions are made with confidence in the underlying data.
And our focus needs to shift to understanding where the needed data resides, what it means, how it relates to other data, where it originated, and its level of quality. We need to accomplish that while the data arrives at an ever-accelerating rate, increasing in volume and spread out among even more applications. But really, let’s focus on how that data is going to be used. Wayne was talking earlier about some of the data science aspects, and I think that’s really a critical key as we move forward to figuring out the real-time access to that data and what users are going to do with it.
[Eric Kavanagh] Yeah, right, and you know, it’s interesting. I think Zaloni is one of a handful of companies that is really focused on this data layer. Just to kind of explain things for our audience out there: I think everybody knows what a database is, and in the traditional architecture of some solutions and applications, you’d have the database layer, you’d have your application layer, and maybe you’d have some interface on top, like a special GUI, a graphical user interface, for example. But traditional databases are simply not up to snuff when it comes to big data. That’s why you had companies like Facebook and others invent their own; basically they rolled their own databases, and then of course we had to do some reverse engineering to deal with issues that came as part of that process. But the point is that this whole concept of a data fabric is now very mature, and with good reason, kind of to your point, Matthew, because if you’re going to leverage the full power of your information assets, you need to have this foundation, this data fabric, to fuel whatever the industry needs might be, right?
[Matthew Monahan] That’s absolutely right. You know, just a few years ago, the end of the data pipeline, as it was often called, was refined data at rest, right? These massive databases, whether they were a data warehouse or a data lake, could do massive processing of massive amounts of data, but it was still fundamentally data at rest. And as we’re starting to see with data science, and we talked about the pace of data, it’s extremely exciting to see the results. But the reality is that it takes years and years to get there, right? We’re building these models, we’re analyzing data, but the use cases today are really starting to revolve around real time. So it’s interesting to analyze the data and know what types of data, for example in medical imaging, might be there. But what you really want to know is: I’m taking an X-ray now, and I want to know what the results of that X-ray are now, right? And the same can be true for business situations as well. So let’s take a look at an e-commerce site. When a user arrives at that site, they want not only the recommendation, but they want to really feel like the offers that are being presented to them are relevant to them, not them five years ago, not them five days ago, maybe not even five minutes ago. And that’s where one of the things that we’ve really focused on at Zaloni comes in, this concept of an active data hub. So it’s not just about moving data from point A to point B, but it’s how your users are going to make use of the data. When a user says, “I need to get access to the data,” what they really mean is, “I have this business challenge. When someone comes to my website, when a patient walks in the door, how do I interact with them in a way that encapsulates everything I could possibly know about them in real time?” So we’re taking a look at the end of that data pipeline now as a business decision. It’s a real-time business decision, not just data at rest.
And so the vision that I think we have, as we move forward in the data management space, is that ability to give users self-service access to the data, to get it out exactly where they need it, with not just data, but data flows and tools in a sandbox environment where they can process it, whether it’s in Jupyter notebooks or Python and R, whatever the tools are that they’re using. And it may still be Excel, it may still be Tableau or MicroStrategy, but whatever those tools are, the users want that with the data, and they want it.
[Eric Kavanagh] Yeah, and I think the internet in general, and all these major properties that have come along, like your Facebooks and LinkedIns and Twitters and Instagrams and all these other sites, have really changed the expectations of the end user in the enterprise, in large organizations, to where they’re much less patient about getting access to things, right? The belief is, you should be able to get it; why can’t I get it? And if you can’t get it, let’s face it, you get frustrated and you move on and do something else. So to me it’s really imperative for large organizations, and even small and mid-sized companies these days, to provide some kind of foundational data fabric to feed whatever the use cases might be, right?
[Matthew Monahan] Absolutely. And, you know, it’s not just the consumer environment. I know you mentioned you’re in a hospital right now. One of the things that the medical community is doing is saying: that patient walks in the door, and I’m not just following a standard recipe of diagnosis; I’m helping to treat the patient not just as a human being, but as an individual, right? That means the ability to analyze their DNA and come up with the right treatments and medications to address that specific individual. And that really requires that real-time access to data and the ability to process all that. And you’re absolutely right, as a society, you know, we may poke fun at some of the millennials, but the reality is that society as a whole is getting used to the fact that everyone is an individual, and I can have exactly what I want, when I want it, and it’s going to be customized for me. To be able to do that effectively, just imagine consumer banking, for example. Somebody picks up the phone and talks to the bank. Wait a minute, no, they don’t pick up the phone anymore. They’re having a Twitter conversation when they’re frustrated, or they’re reaching out by email, or they’re going to a website. But more often than not, they’re doing all of them, right? They’re having all of those interactions, and you need the ability to pull in that variety of data so that whichever interaction you’re dealing with at that moment, you have the entire context of that person in full view. And one of the great things about where we are in data today is that we are really in the very early stages of exploring how to analyze and use all the data, right? For the past 10 years, maybe 20, we’ve just had this explosion, again to bring it back to the theme of volume, velocity, and variety, this explosion of data, and only now are we really just starting to figure out how to work with it and figure out what the data means. And so, again, to the point I was making about vision.
If you have a clear vision and understand what you might be able to do with the data, and get access to the right data, that’s really going to help us evolve. And then the next step after that, the next piece of that evolution, is even more data. So you start to see those repetitive use cases where more and more people are trying to do similar things with similar sets of data. Of course, the challenge is figuring out what data is actually similar and which use cases are similar. But as we evolve and as we figure those things out, packaging those results and making them available just continues to push that value out to more and more people in a faster and faster way. Just like today you can get the compute power and the storage and everything else that you need to do the processing, imagine another world in a few years where you can also just grab the models that you need, models that have had years of development and experience, and grab public data sets that support those models. It’s really going to change the landscape.
Now, all this talk of democratizing data is absolutely great and at the same time incredibly scary for a lot of organizations. So one of the things that we’ll need to keep in mind, and keep as a part of our vision as we move forward, is how we do the right amount of data governance on top of that. And better yet, let’s find ways to automate some of the data governance to make it easier for the organization to keep that pace of data moving and not have to rely on a lot of manual processes where people are involved in saying, “Yes, you can have access; no, you can’t do that.” That’s what the technologies, the AI and ML, will help us to do.
[Eric Kavanagh] So thank you all very much for your time, folks. What a wonderful show. We’ll catch you next week, folks. You’ve been listening to DM Radio.