Benefits and Best Practices for Data Virtualization in the Real World
Ralph Aloe, director of enterprise information management at Prudential Financial, Inc., explains how his enterprise put data virtualization to use, including how the technology fits in with their data fabric, benefits they enjoyed, and lessons they learned.
- By James E. Powell
- July 10, 2020
Upside: What are the benefits of data virtualization for a company such as Prudential?
Ralph Aloe: Prudential Financial, Inc., is a financial wellness leader and active global investment manager with more than $1 trillion in assets under management as of March 31, 2020. We have operations in the United States, Asia, Europe, and Latin America. As an older and established organization, Prudential has disparate systems across multiple lines of business, so finding the right data is time-consuming and not always reliable.
Data virtualization gave us a new way to combine data from different sources, which made access easier for our customers. In one case, we combined data from an external source and a data warehouse to produce new insights for our marketing teams. Virtualization cut the time to compile and deliver those insights from weeks to on-demand delivery. Getting to data faster and more reliably is just one example of the benefits we are starting to see.
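As a rough illustration of combining an external source with a warehouse at query time, the sketch below uses two in-memory SQLite databases as stand-ins for the two sources. This is not Prudential's implementation or any vendor's API; the table names, column names, and the `customer_insights` function are hypothetical. The point is only that the combined result is computed when it is requested rather than copied into a new store ahead of time.

```python
import sqlite3


def build_layer() -> sqlite3.Connection:
    """Create one connection that can see two separate stand-in sources."""
    conn = sqlite3.connect(":memory:")
    # Each ATTACH of ':memory:' creates its own independent database.
    conn.execute("ATTACH DATABASE ':memory:' AS warehouse")
    conn.execute("ATTACH DATABASE ':memory:' AS external")
    # Hypothetical schemas, assumed purely for illustration.
    conn.execute(
        "CREATE TABLE warehouse.customers "
        "(id INTEGER PRIMARY KEY, name TEXT, segment TEXT)"
    )
    conn.execute(
        "CREATE TABLE external.demographics (customer_id INTEGER, region TEXT)"
    )
    conn.executemany(
        "INSERT INTO warehouse.customers VALUES (?, ?, ?)",
        [(1, "Ada", "retail"), (2, "Lin", "group")],
    )
    conn.executemany(
        "INSERT INTO external.demographics VALUES (?, ?)",
        [(1, "Northeast"), (2, "West")],
    )
    return conn


def customer_insights(conn: sqlite3.Connection) -> list:
    """Join both sources on demand; nothing is materialized in advance."""
    query = """
        SELECT c.id, c.name, c.segment, d.region
        FROM warehouse.customers AS c
        JOIN external.demographics AS d ON d.customer_id = c.id
        ORDER BY c.id
    """
    return conn.execute(query).fetchall()


conn = build_layer()
rows = customer_insights(conn)
```

Because the join runs at request time, a row added to either source shows up in the next call to `customer_insights` without a separate load step.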
How does data virtualization fit into an enterprise data fabric in general, and into your enterprise data fabric specifically?
After our first implementation, we started to realize that data virtualization can support metadata management in a simple and straightforward way. We enhanced our data governance processes by giving data owners a single place to manage metadata more easily than they could before. We coupled this capability with a data catalog that allowed our customers across the enterprise to easily identify relevant data and incorporate it into their financial, HR, and insurance processes.
The ability of the virtualization platform to expose APIs created a new market for application and data integrations, including opportunities for near-real-time processing through our lines of business instead of creating more ETL. All these features combined became a foundational part of our data fabric, where we reduced the friction of delivering new data services and improved reusability without adding more overbearing governance.
What lessons did you learn in using data virtualization in this way at Prudential?
We initially envisioned virtualization as an abstraction layer for data delivery, meaning that we could use it to deliver data from anywhere to consuming systems downstream. However, we started to discover capabilities such as metadata management, easy embedding of custom enrichment services, API support, and more. We also learned that data virtualization can be used for data prep before getting into the data lake, not just as a delivery layer on the outbound side of the data lake or warehouse. The learning continues as we encounter new use cases and find ways to incorporate these capabilities into our data fabric.
What benefits did you realize?
We are realizing several benefits with our new data fabric and centralized data management processes. In a very simple case, we replaced a homegrown financial dashboard solution by migrating Excel scripts into the data virtualization platform. Now, that dashboard is updated on demand rather than every month. It also became a repeatable model for us to quickly replicate for several similar use cases, allowing us to deliver the solutions faster.
Another benefit of extending the virtualization platform was access to centralized data quality services, which enabled us to replace multiple single-use services across the enterprise. With better metadata management and our data catalog, we provided easier discovery and access management capabilities, increased governance, and raised security awareness. The combination of metadata management and our data catalog capabilities has allowed us to get closer to our vision of data democratization by helping our internal customers discover, govern, and access data in ways they couldn't before. We also anticipate it will increase our ability to mine data for our ML and AI initiatives.
What benefits did you reap that you hadn't anticipated?
One pleasant surprise was the extensibility of the platform. This enabled us to incorporate enrichment and hygiene services through a single point in the data supply chain and is allowing us to deprecate redundant, outdated, and single-use services across the enterprise.
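To make the "single point in the data supply chain" idea concrete, here is a minimal sketch in which every record passes through one place where enrichment and hygiene services are registered. It is an assumption-laden illustration, not the platform's extension mechanism: the `ServicePoint` class and the two example services are hypothetical names invented for this example.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Record = Dict[str, str]
Service = Callable[[Record], Record]


def trim_whitespace(record: Record) -> Record:
    """Hygiene service: strip leading and trailing spaces from every value."""
    return {key: value.strip() for key, value in record.items()}


def normalize_state(record: Record) -> Record:
    """Enrichment service: upper-case the state code when one is present."""
    out = dict(record)
    if "state" in out:
        out["state"] = out["state"].upper()
    return out


class ServicePoint:
    """Hypothetical single point where shared services are applied in order."""

    def __init__(self) -> None:
        self._services: List[Service] = []

    def register(self, service: Service) -> None:
        self._services.append(service)

    def deliver(self, records: Iterable[Record]) -> Iterator[Record]:
        # Every consumer receives records that went through the same services.
        for record in records:
            for service in self._services:
                record = service(record)
            yield record


point = ServicePoint()
point.register(trim_whitespace)
point.register(normalize_state)
cleaned = list(point.deliver([{"name": "  Ada ", "state": " nj"}]))
```

Registering a service once at this point is what lets separate, single-use copies of the same logic be retired elsewhere.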
We were also surprised at how our data virtualization platform facilitated metadata management. This capability helped expand and streamline our data governance processes. The data catalog brought us closer to data democratization by allowing our customers to discover and easily request access to existing data sets instead of creating the same assets repeatedly with more ETL.
What best practices can you recommend for other enterprises considering data virtualization?
As with any new technology, standards for the platform's processes and structures must be defined before starting. Working with enterprise developers and architects to define security models and development processes, and giving them clear procedures that fit into our agile methods, made all aspects of the platform easier to build, administer, and support. As with most standards, expect them to evolve and become more important over time.
At the beginning of our journey, we created a virtualization community for members to share ideas for improvement, brainstorm use cases, and help each other. The community also made it easier for us to gain insight and learn more about how we can make the platform better for the enterprise. The idea to build microservices into the platform was the result of a community conversation. That spark led to an internet search, which led to sample code, which led to a proof of concept, and within an hour we had a simple Java service extension and a process to create and manage more extensions.
I can't say enough about how our community has helped us improve on our investment in data virtualization, and I would highly recommend that others consider creating a similar community.
What would you do differently if you could do things all over again?
We did not anticipate the quick adoption of the platform when we started. Our initial implementation was small, without much of an expansion plan in place, but demand increased quickly and we found ourselves doubling the environment after about eight months. This is one area where better planning for expansion would have helped us stay ahead of the growth.
Do you plan to extend data virtualization into any other areas?
In terms of lines of business, data virtualization is already in use across most of the organization, supporting finance, HR, and our insurance businesses. One area of focus is to reduce our dependence on ETL as a means of data sharing and move towards more real-time integration with our data virtualization capabilities. We are starting to see interesting use cases for aggregating SharePoint data to be surfaced directly into dashboards or used as a reference data management tool and brought into our data lake for further processing.
In another use case, we are modernizing a mainframe HR system and plan to use data virtualization to abstract the data delivery layer first so that downstream systems won't be impacted when the mainframe system is migrated to a new platform. Our platform is currently deployed on premises. However, we have plans for migration to a hybrid architecture on our road map, and with the abstraction layer virtualization creates, we can manage and limit the impact to our processes as we continue to grow the platform.
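The value of abstracting the delivery layer before a migration can be sketched as follows. Downstream code depends only on a stable view, so the backing source can be swapped without changing consumers. Everything here is hypothetical and chosen for illustration (the class names, the fixed-width record layout, and the field names); it is not a description of Prudential's HR system.

```python
from typing import Dict, List, Protocol


class EmployeeSource(Protocol):
    """Contract every backing system must satisfy."""

    def fetch(self) -> List[Dict[str, str]]: ...


class MainframeSource:
    """Stand-in for the legacy system, using made-up fixed-width records."""

    def fetch(self) -> List[Dict[str, str]]:
        raw = ["0001ADA LOVELACE    ", "0002LIN CHEN        "]
        return [
            {"employee_id": line[:4], "name": line[4:].strip().title()}
            for line in raw
        ]


class NewPlatformSource:
    """Stand-in for the replacement system, with different native fields."""

    def fetch(self) -> List[Dict[str, str]]:
        rows = [
            {"id": "0001", "fullName": "Ada Lovelace"},
            {"id": "0002", "fullName": "Lin Chen"},
        ]
        return [{"employee_id": r["id"], "name": r["fullName"]} for r in rows]


class EmployeeView:
    """The abstraction layer: consumers only ever talk to this class."""

    def __init__(self, source: EmployeeSource) -> None:
        self._source = source

    def switch_source(self, source: EmployeeSource) -> None:
        self._source = source

    def employees(self) -> List[Dict[str, str]]:
        return self._source.fetch()


def headcount_report(view: EmployeeView) -> str:
    """A downstream consumer that knows nothing about the backing system."""
    return f"{len(view.employees())} employees"


view = EmployeeView(MainframeSource())
before = view.employees()
view.switch_source(NewPlatformSource())  # the migration, from the consumer's side
after = view.employees()
```

In this sketch `headcount_report` never changes; only the object passed to `switch_source` does, which is the property that limits downstream impact during a migration.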
James E. Powell is the editorial director of TDWI, including research reports, the Business Intelligence Journal, and the Upside newsletter. You can contact him via email.