
Reducing Time to Insight with Kevin Bohan

How can improving data management result in faster data access and sharper insights?

Real-time data may seem like nirvana. Everyone wants faster access to data. A recent TDWI Best Practices Report (BPR) about reducing time to insight and maximizing the benefits of real-time data highlights two major challenges: inadequate data quality and data fragmentation into silos. In one episode of a three-part podcast series reviewing this BPR, Kevin Bohan, director of product marketing with Denodo, began by discussing the challenges of real-time data. [Editor’s note: Speaker quotations have been edited for length and clarity.]

For Further Reading:

How to Overcome the Insights Gap with AI-Powered Analytics

How Generative AI and Data Management Can Augment Human Interaction with Data

3 Data Management Rules to Live By

“Real-time access is a big focus for many use cases related to customer sentiment -- trying to engage with customers before they leave your shop or click off the website.” However, Bohan noted that when it comes to real-time insights, most enterprises are more concerned about the delays typical business users face. “Users aren’t trying to get data in milliseconds, and there's an inherent complexity in the way most people have architected their data landscape today. There are lots of data silos; an IDC report recently said the average large enterprise has 367 different data systems and applications where information is stored. Think of the complexity when you have that many different systems.”

 How do you get access to that data? “A popular approach is bringing everything into a data warehouse.” Bohan explained that enterprises think, “Let's use processes such as ETL (or now ELT) and move data into a central location; then it's going to be easier to gain access to the data. However, creating those pipelines, finding the right information, bringing it into the system, and making it available to people is pretty complex.” He noted that “what the enterprise ends up doing is over-relying on a data team or a central IT department to get information to the end users. In many cases they'll wait weeks if not months to get access to a new data source.”

Once the data source is available to users, refreshing that data is typically a batch process that takes place every couple of hours or days -- sometimes on a weekly basis because the loads are large and enterprises don't want to bog down their networks. “What people are getting is information with latency built in, and it's just inherent to the process by which you're bringing the information together. At Denodo we believe there's a better way. All those systems are rebuildable.”

“I'm not suggesting that you don't need your data warehouses, data lakes, and centralized systems. What we believe is that you should embrace the fact that data is distributed. The real goal is to give users a simplified, easy way to access the data and to have a central location where you can enforce policies consistently throughout the organization. What you can do is establish a logical data access layer that sits atop your data silos and presents the information to consumers as though they're coming to one well-integrated system. You can present that data to the individual user in a format they're going to understand and be able to use.”
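
To make the idea of a logical data access layer concrete, the following is a minimal, hypothetical Python sketch -- it is not Denodo's implementation, and every file, table, and column name is invented for illustration. Two silos, a warehouse table and a CRM export, are presented to consumers through a single view-like function, so either source could later be swapped out without the consumer noticing.

import sqlite3
import pandas as pd

# Hypothetical silos for illustration: an on-premises warehouse (SQLite stands
# in here) and a CRM export in a CSV file. All names below are invented.

def load_warehouse_orders(db_path="warehouse.db"):
    with sqlite3.connect(db_path) as conn:
        return pd.read_sql_query(
            "SELECT customer_id, order_total, order_date FROM orders", conn
        )

def load_crm_customers(csv_path="crm_customers.csv"):
    # Expected columns: customer_id, name, region
    return pd.read_csv(csv_path)

def customer_revenue_view():
    # A "logical view": joins the two silos on request. Consumers call this
    # instead of querying either system directly, so the underlying sources
    # can be swapped later without changing anything on the consumer side.
    orders = load_warehouse_orders()
    customers = load_crm_customers()
    joined = orders.merge(customers, on="customer_id", how="left")
    return joined.groupby(
        ["customer_id", "name", "region"], as_index=False
    )["order_total"].sum()

In a data virtualization platform the same idea is expressed as virtual views defined over live sources, with the access layer itself handling query federation, caching, and governance.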

 To make it that simple, Bohan believes enterprises must leverage AI to automate and simplify many different tasks.

“What I’m describing is really a logical approach to data management that is added atop your existing infrastructure. It leverages the investments you’ve made. It's not a whole new approach, nor is it a major redesign of your infrastructure. A Forrester report found that this logical approach -- sitting atop your data sources -- improved delivery times over ETL processes by 65%. That's tremendous value for organizations looking to provide quicker access to information that's actionable.”

The conversation turned to whether deploying a data catalog can help organizations reduce delays and bottlenecks. Bohan explained that using a data catalog is like online shopping. “The site’s catalog allows the user to explore and discover what they're looking for. If you carry that analogy through, I think the best practice for an organization is to treat the data catalog as a way to leverage the experience e-commerce sites have already trained everyone in: how to think when they're looking for different things.”

When users shop online, they may receive personalized recommendations based on searches they've done or on what other people have searched for. “I might go to a data catalog looking for information about the success of a previous campaign and see a custom recommendation of the top-performing assets across campaigns, which would be of tremendous interest to me even though I didn't know it existed. The data catalog is making that recommendation or providing shortcuts to popular data assets that people are using or endorsing. If I know a colleague has noted ‘this is a good data source to use,’ I'm going to have more confidence in using it.”

The enterprise must build trust in its data. Knowing the source or lineage can help. “Enterprises can give users an experience that gives them faith in what they're getting. ‘If I know that customer information is coming from Salesforce, and we use that as the golden source for company data, I'm going to have more faith in it than if it were coming from a spreadsheet from somebody else.’

 “Thanks to ChatGPT-like functionality, more catalogs are adding AI capabilities; I don’t have to put the data into an analytics application. Instead, I can use a prompt to ask the system to show me the top-selling product on the West Coast and get the answer right away.”

 Bohan was asked what organizations should keep in mind to future-proof their data strategies.

“When it comes to future-proofing, sure there’s AI, but I would say it goes back to what we originally started talking about -- this abstraction layer that sits atop your data sources. Think of the power of that when you want to make a change -- if you no longer want to run a data warehouse on premises and you want to move it to the cloud. If all your data consumers come through this abstraction layer, they never know anything changed; you can basically swap out the underlying data infrastructure for whatever reason you might want. It doesn't affect the consumers. That's tremendously powerful.

“We'll often wait weeks or months because we can't make a change toward the end of a period. We have small windows where we can make incremental updates to the environment. By creating this abstraction layer, we get the same flexibility and agility that DevOps teams have had for a while and can quickly make changes. You can roll back if there are any issues … it also allows you to swap out environments. We have customers who were trying something with one cloud vendor and the data platform couldn't handle it, so they quickly moved over to another environment. They were able to do that within weeks, whereas their previous switch took closer to a year. That agility is incredibly powerful.”

 [Editor’s note: You can stream this episode on demand here.]
