TDWI | Training & Research | Business Intelligence, Analytics, Big Data, Data Warehousing

Executive Q&A: Data Virtualization and the Use Cases of Today and Tomorrow

Data virtualization has been around for decades but confusion among technologies remains. Denodo’s SVP and CMO Ravi Shankar helps clear things up.

By Upside Staff
April 4, 2022

Ravi Shankar, SVP and CMO of Denodo, dispels some common misconceptions about data virtualization and explains the role it plays today, as well as the role it is likely to play in the future.

Upside: There is some confusion about what data virtualization (DV) is and what it is not. For those looking to learn more about it, can you clear up any differences between data virtualization and other technologies that are sometimes confused with it?

For Further Reading:

Leveraging Data Virtualization for Digital Transformation

Benefits and Best Practices for Data Virtualization in the Real World

Uncovering the ROI of a Data Fabric

Ravi Shankar: Data virtualization has been around for over 20 years, yet some vendors -- primarily those offering a small subset of what data virtualization provides -- claim that they offer data virtualization solutions. The technologies that are often confused with data virtualization are data federation and SQL query acceleration.

It’s true that both data federation and data virtualization enable two or more databases, either on premises or in the cloud, to appear as a single database. The difference is that data virtualization establishes an enterprise-wide semantic layer above the disparate data sources, which abstracts away the complexities of data access, such as the need to know where the data is sourced. This semantic layer can be flexibly manipulated to meet a wide variety of use cases without affecting the source data.

Even though the data virtualization layer contains no source data, the critical metadata it holds for accessing the different sources lets it provide real-time access to the source data without moving that data to a consolidated repository. Unlike simple data federation, DV architecture enables enterprise governance capabilities and data catalog creation that not only can list data but also deliver it.
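To make the idea of a layer that holds metadata but no source data concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not how any particular product works: the `VirtualLayer` class, the view names, and the two in-memory SQLite databases standing in for separate source systems are all hypothetical.

```python
import sqlite3

# Two independent "sources" (hypothetical stand-ins for a CRM and an ERP system).
crm = sqlite3.connect(":memory:")
crm.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
crm.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Acme"), (2, "Globex")])

erp = sqlite3.connect(":memory:")
erp.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
erp.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 100.0), (1, 50.0), (2, 75.0)])


class VirtualLayer:
    """Hypothetical layer that stores only metadata: which source and query back each view."""

    def __init__(self):
        self.catalog = {}  # view name -> (connection, SQL text)

    def register(self, view, conn, sql):
        self.catalog[view] = (conn, sql)

    def fetch(self, view):
        conn, sql = self.catalog[view]
        return conn.execute(sql).fetchall()  # rows are read from the source at request time

    def customer_revenue(self):
        # Combine rows from both sources on the fly; nothing is persisted in the layer.
        names = dict(self.fetch("customers"))
        totals = {}
        for customer_id, amount in self.fetch("orders"):
            name = names[customer_id]
            totals[name] = totals.get(name, 0.0) + amount
        return totals


layer = VirtualLayer()
layer.register("customers", crm, "SELECT id, name FROM customers")
layer.register("orders", erp, "SELECT customer_id, amount FROM orders")
result = layer.customer_revenue()
print(result)
```

Because the layer only stores where and how to read each view, a row added to either source shows up in the next call to `customer_revenue()` with no reload step.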

Although data virtualization includes SQL query acceleration, products that are only SQL query accelerators are not true data virtualization technologies. Such query accelerators usually fast-track the use of data in data lakes for analytics. However, these tools are very specific and limited; they can’t work with multiple types of data sources, such as fast data stores that stream Internet of Things (IoT) data. Also, they do not provide strong security and universal governance capabilities like the data catalogs I mentioned. Finally, their data delivery is limited to SQL analytics, and they do not support operational use cases using APIs.

What is the current state of data virtualization, and can you provide some use cases?

Over the years, data virtualization has evolved into a mature data integration, management, and delivery technology that offers broad capabilities, including hybrid/multicloud data integration, query optimization, advanced semantics, unified security, artificial intelligence/machine learning (AI/ML)-powered recommendations, and enterprise data governance. For instance, data virtualization automates many of the common data integration and management functions using AI/ML. By learning the usage patterns of users and statistics of the queries executed, data virtualization streamlines the development of views with practical guidance. It uses active metadata to, for example, automatically infer relationships, boost performance with refined cost estimations, offer suggestions for joins and transformations, and perform smart autocompletion for frequently used SQL fragments.

Data virtualization also accelerates performance with summary tables. This enables business leaders to ask questions that rely on aggregate information, such as "What were the most profitable products last year in the Americas?" Data virtualization uses summary tables to rapidly return the required results without having to query millions of rows of transactional data. Summary tables are pre-aggregated data sets that are much smaller than the originals and can be rapidly transferred over the network to the visualization application. The best part is that the business leader will not even realize that the report/chart he or she is viewing uses data from summary tables.
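The pre-aggregation idea can be sketched in a few lines of Python. This is a simplified illustration only; the sample transactions, the `build_summary` and `top_products` helpers, and the choice of region, year, and product as grouping keys are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical transaction rows: (region, year, product, profit).
transactions = [
    ("Americas", 2021, "Widget", 120.0),
    ("Americas", 2021, "Gadget", 200.0),
    ("Americas", 2021, "Widget", 150.0),
    ("EMEA", 2021, "Widget", 500.0),
    ("Americas", 2020, "Gadget", 900.0),
]


def build_summary(rows):
    """Pre-aggregate profit per (region, year, product); done once, ahead of time."""
    summary = defaultdict(float)
    for region, year, product, profit in rows:
        summary[(region, year, product)] += profit
    return dict(summary)


def top_products(summary, region, year, n=1):
    """Answer the aggregate question from the small summary, not the raw rows."""
    matches = [
        (product, profit)
        for (r, y, product), profit in summary.items()
        if r == region and y == year
    ]
    return sorted(matches, key=lambda item: item[1], reverse=True)[:n]


summary = build_summary(transactions)
best = top_products(summary, "Americas", 2021)
print(best)
```

The summary here has one entry per distinct grouping key rather than one per transaction, which is why it stays small even as the raw data grows.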

As the concept of data fabric continues to evolve, what role does data virtualization play?

Data fabric emerged as an alternative to the traditional configuration in which all data sits in a single repository, such as a monolithic data warehouse or data lake. In a data fabric, data is distributed across the enterprise, and anyone in the organization can access the data by tapping into any individual “strand” of the fabric.

However, data still needs to be replicated, which takes time, causing the usual frustration. Data virtualization turns a data fabric into a “logical” data fabric, in that data virtualization makes data available in real time without replication. A logical data fabric knits a virtual view of data across applications by leaving it within its original sources while enabling a unified view of all enterprise data.

Data mesh is another hot topic in 2022, especially for organizations looking to modernize their data infrastructures. How does data virtualization support a data mesh architecture?

Data mesh is another concept that has recently emerged as an alternative to the traditional, consolidated paradigm I’ve described. In a data mesh, data is not stored in a single repository or owned by a single group. Instead, data is organized into different “data domains” that are owned and operated by different departments within the organization.

The data domains in a data mesh are not siloed, however; each data domain is supported by a core provisioning platform that enables data to be shared throughout the organization as “data products.” These data products are specially curated for consumption, like the products in a grocery store. Just as it is the natural foundation for a logical data fabric, data virtualization is the perfect fit for a data mesh.

By enabling the creation of highly customizable semantic models above an organization’s disparate data sources, data virtualization facilitates establishing full-featured data domains without changing the underlying data. In this way, data virtualization serves as the core provisioning platform of a data mesh, enabling data domains that serve curated, governed data products to the organization at large.

A new concept -- composable data architectures -- is another interesting topic. What does composable mean and what benefits does it offer?

Composable architecture emphasizes the composers, or multiple data-creation centers, within an organization. With the proliferation and growing importance of roles such as citizen analysts and citizen integrators, self-service infrastructure creation and self-service analytics are critical for many modern organizations. Certain business units or users are empowered to pick and choose their own low-code/no-code tools to build parts (or the entirety) of their required data infrastructure.

Composable architecture also implies a balance between collecting data and connecting to it via a logical infrastructure such as the one data virtualization enables. In this way, a logical data fabric is an inherently composable architecture. Composable data and analytics brings agility to an organization’s data and analytics environment by reducing IT dependency, making business users more self-sufficient, and reducing the time required to build the infrastructure.

Big data is already ubiquitous throughout the industry, yet organizations continue to struggle with using unstructured and structured data together. How do “small data analytics” and “wide data analytics” work with data virtualization to address these challenges?

Consumers and businesses alike are using small data analytics to do things such as create hyper-personalized experiences for their customers, so the enterprise can understand each individual customer’s sentiment around a specific product or service within a short time window. To ensure such analytics is successful, companies need to combine certain data sets in real time, which is only possible with data virtualization and cannot be done using legacy data integration methods that require physical consolidation.

Wide data analytics involves the combination of structured, unstructured, and semistructured data from various data sources; this often includes geospatial data, machine-generated data, text data, video data, temperature data, and the list goes on. Healthcare companies often combine lab data, X-ray data, R&D data, patient data, and many other types of data for clinical purposes and patient treatment as well as for offering data-as-a-service to their ecosystem partners. In such cases, traditional data integration methods often fall short, either because they are not good at handling less-structured or unstructured data or because there is a need for real-time data integration for quicker decision-making. In such scenarios, data virtualization is an absolute necessity.

Where is data virtualization headed? What role will it play 2-3 years from now?

Data virtualization will continue to push the envelope with AI/ML-driven functionality to the point where it will automatically infer changes at the individual data sources and where data is continuously created through business transactions. Soon, there will be no friction with accessing, combining, and using data, and no one will have to ask where the data is physically located or what format it is stored in at the source. Data will continue to grow in volume, velocity, and variety, but with data virtualization and real-time access across disparate systems, the conversation will shift more toward using data and less toward managing its complexity.
