TDWI Articles

Uncovering the ROI of a Data Fabric

Data fabric technology is attracting increasing scrutiny from enterprises, but with costs on every manager's mind, where can enterprises expect to find a positive return on their investment? Here are four places to look.

The golden road to rapid growth in the enterprise software space is public, credible ROI, particularly when bringing net new capabilities to market. But therein lies a chicken-and-egg dilemma: for new technologies, there is always a lag before ROI can be established. Even so, there are clear pathways to ROI that customers are already adopting.

For Further Reading:

Data Fabrics for Big Data

Benefits and Best Practices for Data Virtualization in the Real World

Modernizing a Data Warehouse with Real-Time Functions

Data fabric is making its public debut in the data management space. According to a Gartner report (Gartner Top 10 Data and Analytics Trends for 2021), as data becomes increasingly complex and digital business accelerates, data fabric is the architecture that will support composable data and analytics and its various components.

There are good reasons to believe this adoption is fueled both by growing awareness that there must be an alternative to "integration by physical consolidation" and by pandemic-driven global attention to digital transformation and the role unconnected data plays in achieving this transformation. All of this leads prospects and early adopters to start asking data fabric vendors about ROI.

The ROI of Data Fabric Is a Function of Speed, Resilience, and Efficiency

The aforementioned Gartner report says that data fabric reduces time for integration design by 30 percent, deployment by 30 percent, and maintenance by 70 percent because it enables data reuse and a variety of data integration patterns. Further, data fabric leverages existing investments in data hubs, data lakes, and data warehouses while introducing new approaches and tools for the future.
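To see how those percentages might combine for a single project, here is a minimal arithmetic sketch. The baseline person-hours are hypothetical figures chosen for illustration, not data from the Gartner report; only the 30/30/70 percent reductions come from the text above.

```python
# Hypothetical baseline effort (person-hours) for one integration project.
# These figures are illustrative assumptions, not data from the Gartner report.
BASELINE_HOURS = {"design": 200.0, "deployment": 100.0, "maintenance": 500.0}

# Fractional reductions quoted in the article from the Gartner report.
REDUCTION = {"design": 0.30, "deployment": 0.30, "maintenance": 0.70}


def hours_saved(baseline, reduction):
    """Return the hours saved in each phase given fractional reductions."""
    return {phase: baseline[phase] * reduction[phase] for phase in baseline}


saved = hours_saved(BASELINE_HOURS, REDUCTION)
total_saved = sum(saved.values())
total_baseline = sum(BASELINE_HOURS.values())
overall_reduction = total_saved / total_baseline

print(saved)
print(f"overall reduction: {overall_reduction:.1%}")
```

Because maintenance usually dominates lifetime effort, the blended reduction depends heavily on how much of the baseline is maintenance; with these assumed hours it works out to roughly 55 percent.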

It's still too early for an overwhelming volume of hard numbers, but we can see the outlines of where the ROI of data fabric is to be found. In short, if we focus carefully on the underlying value proposition of data fabric as a net new capability in data management, the ROI is going to come from four sources.

ROI Source #1: Promoting data reusability

Data fabric ROI is a function of data reusability at enterprise scale. Conventional data integration, which is based almost entirely on relational data models and data location in storage, has a poor track record of data and schema reuse. It's shockingly rare to see relational schemas reused extensively across an enterprise. What's worse, relational technologies are not good at managing complex data, particularly when hierarchical data, which is quite common, overlaps with non-hierarchical data or when multiple data hierarchies overlap each other.

Data fabric takes the approach that business value compounds when data is reused by leveraging existing data models. Modern data fabrics stress the importance of reusing data (by reusing data models), and best-of-breed data fabric implementations adopt semantic graph data modeling technology. The increased power of these data models lets modern enterprises manage inherent complexity by representing or modeling that complexity and then reusing those models extensively.

Once the various domains that exist in a large, complex enterprise have been modeled correctly, there's precious little reason to re-model them for each new application or requirement. Key to the ROI of a properly expressive data fabric are the efficiencies unlocked by data model-led data reuse. Data fabric solutions that provide for schema multitenancy and virtual transparency -- that is, the ability to host multiple data models or schemas simultaneously over the same data, and the ability to model and query data regardless of its location at the storage layer -- are key to making data model-led reuse a practical reality.
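The idea of hosting multiple data models over the same data can be sketched in a few lines. This is a toy illustration under stated assumptions, not any vendor's implementation: the facts, the `HR_MODEL` and `COMPLIANCE_MODEL` mappings, and the `view` helper are all hypothetical names invented for the example.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, value)

# One shared set of facts, stored once (hypothetical sample data).
FACTS: List[Triple] = [
    ("emp:1", "name", "Ada"),
    ("emp:1", "dept", "Risk"),
    ("emp:1", "salary_band", "B3"),
    ("emp:2", "name", "Lin"),
    ("emp:2", "dept", "Finance"),
    ("emp:2", "salary_band", "B2"),
]

# Two data models over the same facts. Each maps its own field names
# to the shared predicates; neither owns a separate copy of the data.
HR_MODEL = {"employee_name": "name", "band": "salary_band"}
COMPLIANCE_MODEL = {"person": "name", "business_unit": "dept"}


def view(facts: List[Triple], model: Dict[str, str]) -> Dict[str, Dict[str, str]]:
    """Project the shared facts through one model, building rows on demand."""
    wanted = {predicate: field for field, predicate in model.items()}
    rows: Dict[str, Dict[str, str]] = {}
    for subject, predicate, value in facts:
        if predicate in wanted:
            rows.setdefault(subject, {})[wanted[predicate]] = value
    return rows


hr_rows = view(FACTS, HR_MODEL)
compliance_rows = view(FACTS, COMPLIANCE_MODEL)
print(hr_rows["emp:1"])
print(compliance_rows["emp:2"])
```

Adding a third consumer here means writing a third mapping, not re-modeling or re-loading the underlying facts, which is the reuse argument in miniature.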

Modern enterprises, especially those in regulated industries subject to compliance and reporting requirements, are simply too complex to permit the compounding inefficiencies of not reusing data models and, hence, data itself.

ROI Source #2: Eliminating unnecessary data replication

Data fabric ROI can also be found in its ability to reduce the inefficiencies caused by excessive data replication and to ameliorate the nasty "second order effects" that are downstream of that replication. In addition to promoting data reuse, data fabric technology reduces the need for data movement, copying, and replication as a primary means of data integration. This is due in part to data modeling efficiencies and in part to data virtualization and query federation capabilities.

A recent IDC report on global data growth rates concludes that "the ratio of unique data (created and captured) to replicated data (copied and consumed) is roughly 1:9." We can further conclude that this ratio represents an enormous global opportunity for efficiencies and waste reduction with the right approach to data integration. Although it is not realistic to think that all data replication can be eliminated, when such a huge proportion of global data volume is replicated data, integration techniques that reduce or eliminate the need to replicate data will be huge drivers of ROI.
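The arithmetic behind that ratio is worth spelling out. The sketch below uses only the 1:9 figure quoted above; the 50 percent cut in replicas is a hypothetical scenario chosen for illustration, and the function names are invented for the example.

```python
# Ratio quoted in the article: 1 part unique data to 9 parts replicated data.
UNIQUE_PARTS = 1
REPLICATED_PARTS = 9


def replicated_share(unique, replicated):
    """Fraction of total data volume that is replicated."""
    return replicated / (unique + replicated)


def volume_after_cut(unique, replicated, cut_fraction):
    """Total volume, relative to today, after removing cut_fraction of replicas."""
    total = unique + replicated
    return (unique + replicated * (1 - cut_fraction)) / total


share = replicated_share(UNIQUE_PARTS, REPLICATED_PARTS)
# Hypothetical scenario: half of all replicas are avoided.
remaining = volume_after_cut(UNIQUE_PARTS, REPLICATED_PARTS, 0.5)

print(f"replicated share of all data: {share:.0%}")
print(f"volume remaining after halving replicas: {remaining:.0%}")
```

At a 1:9 ratio, nine-tenths of all data is a copy, so every percentage point of replication avoided removes about 0.9 percentage points of total volume.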

Data fabric is precisely a data integration technique that reduces the need for massive data replication. Leading data fabric vendors feature data virtualization and query federation capabilities that, together with some of the aforementioned capabilities, make it possible to query and connect data in place, whether in on-premises or multicloud scenarios, without moving or copying that data first. This capability is essential to a data fabric and to the future of data management.
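Querying data in place can be illustrated with a small sketch. It is a simplified stand-in for query federation, not a real data fabric product: two independent in-memory SQLite databases play the role of separate source systems, and the table names, sample rows, and `total_spend_by_customer` helper are hypothetical.

```python
import sqlite3

# Two independent sources, stand-ins for e.g. an on-premises system
# and a cloud store. Neither is copied into the other.
orders_db = sqlite3.connect(":memory:")
orders_db.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
orders_db.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 120.0), (2, 75.5), (1, 30.0)],
)

crm_db = sqlite3.connect(":memory:")
crm_db.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
crm_db.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "Acme"), (2, "Globex")],
)


def total_spend_by_customer():
    """Push aggregation down to each source, then join only the small results."""
    totals = dict(
        orders_db.execute(
            "SELECT customer_id, SUM(amount) FROM orders GROUP BY customer_id"
        )
    )
    names = dict(crm_db.execute("SELECT id, name FROM customers"))
    return {names[cid]: total for cid, total in totals.items() if cid in names}


result = total_spend_by_customer()
print(result)
```

The design point is that each source does its own filtering and aggregation, so only compact intermediate results cross the boundary between systems instead of whole tables being replicated into a central store first.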

ROI Source #3: Creating a digital twin of everything

Another source of data fabric ROI is its ability to function as a "digital twin for all the enterprise things." A richer data model, backed by the ability to query and connect data in place where it lives without moving or copying it, delivers business value by effectively creating a digital twin for the entirety of a complex enterprise.

Of course, digital twin is a concept that originally arose in the context of supply chain management (SCM) and systems engineering. It has since escaped into the rest of the digital economy, where it serves as a potent target for modern data management and digital transformation efforts. Just as with SCM, the rest of the economy is awakening to the absolute necessity of purely digital representations of the primary categories, entities, and processes of the enterprise, which will differ from firm to firm but which also have significant overlaps.

The point of enterprisewide digital twin efforts is to create digital, data-backed analogues or "twins" for real-world systems and to gain unprecedented insight and control of the real-world systems by controlling the digital twins, algorithmically. Let's consider a few examples.

Pharma and life sciences: Boehringer Ingelheim's Semantic Integration Project created a semantic layer atop Boehringer's data lake to enable a company-wide data fabric that provides a consolidated, one-stop shop for 90 percent of their R&D data. Another global pharmaceutical company is building a data fabric-powered system to span its entire supply chain to create a supply chain control tower that lets them observe, predict, and react to fragilities, disruptions, and weaknesses in the supply chain in real time by doing so first in the digital layer. This molecule-to-market initiative (which is only possible by choosing the right data management technology at the level of infrastructure) can predict, based on actual outages, impacts to its supply chain many quarters in advance, offering resilience, flexibility, and agility of response.

Insurance: Similarly, a digital twin effort created a semantics-enabled data fabric at one of the world's largest insurers to homogenize and rationalize risk and portfolio management across global markets.

Banking and financial services: A global bank is building a similar data fabric-powered system to create a layer of semantics for the firm that spans all operations and global business units. This layer, which decouples the physical storage systems from the business-level meaning of that data, allows real-time access, prediction, and control of the underlying real-world systems, focusing largely on managing operational risk and increasing resilience across the company.

The ROI achieved in each of these use cases owes much to the data fabric, which offers transformational-level insight and resilience.

ROI Source #4: Leveraging speed

Accelerating time to insight also contributes to data fabric ROI. When powerful data modeling and reusability capabilities lead to digital twins of the major systems and entities of an enterprise, the downstream effect that adds significant value is speed to insight. Part of being resilient is being strong, with many layers of adaptable defense. The other part of being resilient is being agile, flexible, and fast to react, respond, and adapt to new obstacles and challenges.

Speed to insight underlies all of these modes of enterprise resilience. In both pharmaceutical projects already mentioned, the impact on ROI is positive and undeniable, even when it is not easy to quantify in general terms.

Speed to insight is also critical in financial services, where the ability to manage operational risk while simultaneously maintaining a competitively aggressive investment posture is key. Data fabric is important to operational risk, including connected inventory, IT asset management, capital controls enforcement, and life cycle management. The ROI for a data fabric approach is made up of both quantifiable and non-quantifiable or strategic elements and includes efficiencies and superior decisions made faster, as well as increased assurance that, with a holistic or "risk 360" view, the firm is completely safeguarded.

A Final Word

The ROI around data fabric will be found as an aggregate of the business value of a new degree of automation of data-driven insight. As Rita Sallam, distinguished VP and lead analyst of the Gartner report, pointed out, "Data and analytics leaders must proactively examine how to leverage these trends into mission-critical investments that accelerate their capabilities to anticipate, shift, and respond." Essentially, they need to adopt a holistic net new capability to answer questions in real time that were either previously unanswerable or answerable only as a result of laborious manual work.
