TDWI | Training & Research | Business Intelligence, Analytics, Big Data, Data Warehousing

TDWI Articles

Data Management: 2016’s Hot Trends and What to Watch in 2017

The leading 2016 trends included Hadoop adoption, data lakes, and data warehouse modernization. In 2017 we'll see new activity around the SQL-ization of Hadoop, orchestrated data hubs, and managing IoT sensor data.

By Philip Russom
December 16, 2016

The leading trends in enterprise data management in 2016 carried over from recent years: Hadoop adoption and data warehouse modernization. The real surprise in 2016, however, was the data lake, which user organizations are suddenly taking seriously as the preferred design pattern for organizing data sets in Hadoop.

All these will continue into 2017 and be joined by new activity around the SQL-ization of Hadoop, orchestrated data hubs, and managing sensor data from the industrial Internet of Things (IoT). Let's look at each of these in detail.

2016 Top Trends

Increased Adoption of Hadoop

TDWI surveys in recent years have shown that Hadoop is making steady progress as a platform well suited to many purposes in data warehousing and analytics. Many early adopters have already integrated Hadoop clusters and tools into the architectures of their data warehouse environments.

Hadoop's massive, cheap storage offloads older systems by taking responsibility for data staging, ELT push down, and archiving of detailed source data (typically in the data lake design pattern). Hadoop also serves as a massively parallel execution engine for a wide variety of set-based and algorithmic analytics methods. These valuable use cases are driving the adoption of Hadoop.

TDWI has seen a giant step forward in adoption starting in late 2015 and continuing into 2016. The survey from TDWI Best Practices Report: Data Warehouse Modernization shows that 17 percent of data warehouse programs surveyed already have Hadoop in production in their data warehouse environment. This is up from earlier surveys, which showed 10 to 12 percent.

Even more dramatic, the survey shows that the percentage of organizations integrating Hadoop with a data warehouse will more than double within three years (up to 36 percent). In short, Hadoop is here to stay and will soon become common in data warehouse programs.

Hadoop-based Data Lakes

First and foremost, a data lake is a repository for raw data. A data lake tends to manage highly diverse data types and can scale to handle tens or hundreds of terabytes -- sometimes petabytes. It is optimized to ingest raw data quickly as received from both new and traditional sources.

A data lake manages data in its original raw state so that the details can be repurposed repeatedly as new business requirements and opportunities for new applications arise. After all, once data is remodeled, standardized, and otherwise transformed (as is required for report-oriented data warehousing), its applicability to other, unforeseen use cases is greatly narrowed.

With that in mind, you can see that analytics is the primary driver behind data lakes. For example, certain forms of advanced analytics work best with data in its original state with all its original details. These include analytics based on mining, statistics, predictive algorithms, and natural language processing.
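The idea of keeping raw details and applying structure only when a use case appears can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a Hadoop implementation: the in-memory list `RAW_LAKE`, the record fields (`device`, `temp_c`, `note`), and the functions `ingest`, `average_temp_by_device`, and `notes_mentioning` are all hypothetical names invented for this example.

```python
import json
from collections import defaultdict

# Hypothetical stand-in for the lake: raw records kept exactly as received.
RAW_LAKE = []

def ingest(raw_line: str) -> None:
    """Append the record untouched; no remodeling happens at ingest time."""
    RAW_LAKE.append(raw_line)

def average_temp_by_device() -> dict:
    """Use case 1: a numeric aggregate derived from the raw records on read."""
    totals = defaultdict(lambda: [0.0, 0])  # device -> [sum, count]
    for line in RAW_LAKE:
        rec = json.loads(line)
        if "temp_c" in rec:
            totals[rec["device"]][0] += rec["temp_c"]
            totals[rec["device"]][1] += 1
    return {dev: s / n for dev, (s, n) in totals.items()}

def notes_mentioning(word: str) -> list:
    """Use case 2: a text search over a field the aggregate ignores."""
    hits = []
    for line in RAW_LAKE:
        rec = json.loads(line)
        if word.lower() in rec.get("note", "").lower():
            hits.append(rec["device"])
    return hits

for line in (
    '{"device": "pump-1", "temp_c": 70.0, "note": "normal"}',
    '{"device": "pump-1", "temp_c": 74.0, "note": "Vibration detected"}',
    '{"device": "pump-2", "temp_c": 65.0}',
):
    ingest(line)

print(average_temp_by_device())
print(notes_mentioning("vibration"))
```

Both functions read the same untouched records. Had the records been reduced to a per-device average at load time, the free-text `note` field would no longer be available for the second use case.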

Hadoop has become an important enabling platform for data lakes because it scales linearly, supports a wide range of processing techniques, and costs a fraction of what similar relational configurations cost. For these reasons, Hadoop is now the preferred data platform for data lakes.

Data Warehouse Modernization

This was the hottest area for data management in 2015. Most approaches to data warehouse modernization are large, multiphase projects that take months or years to complete, so the heat has continued through 2016 and will linger into 2017.

The main drivers for this trend are to extend the warehouse (sometimes with a complementary Hadoop cluster) to accommodate big data and other new data (especially sensor data), to update the warehouse architecture, to add more real-time functionality, to enable logical data warehousing, and to modernize related systems (such as those for reporting, analytics, and data integration).

2017 Anticipated Hot Spots

The SQL-ization of Hadoop

Hadoop was originally designed for Internet environments that had no relational requirements. As we increasingly employ Hadoop in mainstream use cases, however, relational requirements are becoming pressing, especially the need for ANSI-standard SQL. In fact, SQL support for Hadoop is a "must have" for emerging practices that involve Hadoop data, such as data exploration, data prep, and SQL-based analytics.

A number of open source tools -- including Impala, Drill, Presto, and Spark -- seek to add ANSI SQL to Hadoop, and a few mature vendor tools (for reporting, analytics, and data integration) have been updated to do the same. It's still early days with SQL engines for Hadoop, however, so we're waiting for more functionality, performance, and interoperability. In 2017 we will witness improvements in these areas for both open source and vendor-built SQL for Hadoop capabilities.

Data Hubs

Instead of a Hadoop-based data lake, some organizations prefer to build a large relational data hub to achieve similar goals -- namely to provide a governable home for big data, analytics sandboxes, and collaborative data practices. Ambitious organizations are building data hubs that mix relational and Hadoop technologies such that it is hard to tell a data hub from a data lake.

Even so, here's a key differentiator: a true data hub is more than just another database. It also has significant tooling around it for data orchestration, publish and subscribe, security, auditing, and data integration and quality. The uptick in data hub deployments that started in 2016 will continue in 2017.
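To make the publish-and-subscribe and auditing roles concrete, here is a toy in-memory sketch in Python. It is an illustration only: the `DataHub` class, its `subscribe` and `publish` methods, the `audit_log` attribute, and the topic and subscriber names are hypothetical and do not correspond to any vendor's product or API.

```python
from collections import defaultdict
from typing import Callable

class DataHub:
    """Toy in-memory hub; all names here are hypothetical."""

    def __init__(self) -> None:
        self._subscribers = defaultdict(list)  # topic -> list of callbacks
        self.audit_log = []                    # (action, topic, detail) tuples

    def subscribe(self, topic: str, name: str,
                  callback: Callable[[dict], None]) -> None:
        """Register a consumer for a topic and record who subscribed."""
        self._subscribers[topic].append(callback)
        self.audit_log.append(("subscribe", topic, name))

    def publish(self, topic: str, record: dict) -> int:
        """Deliver the record to every subscriber; return how many received it."""
        self.audit_log.append(("publish", topic, record.get("id")))
        for callback in self._subscribers[topic]:
            callback(record)
        return len(self._subscribers[topic])

hub = DataHub()
sandbox, reports = [], []
hub.subscribe("orders", "analytics-sandbox", sandbox.append)
hub.subscribe("orders", "reporting", reports.append)
delivered = hub.publish("orders", {"id": 1, "amount": 250})
print(delivered, len(hub.audit_log))
```

The publisher never addresses the sandbox or the reporting consumer directly, and every subscription and delivery leaves an audit entry. That separation is what distinguishes the hub from a database that consumers simply query.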

Sensors in the Industrial Internet of Things (IoT)

TDWI sees IoT as emerging from its "hype cycle" earlier than anticipated. In particular, the industrial side of IoT (but not the consumer side) is ramping up aggressively, driven by an explosion of enterprise sensors in 2015 and 2016.

For example, utility companies (and other firms that monitor facilities closely) have long had many sensors; these firms have recently quadrupled their sensors so they can track processes in a more granular fashion and turn more manual tasks into digital ones. Manufacturing is a similar case; it has had robots for decades, but now the robots have more sensors for finer control -- now they can perform quality assurance, not just assembly.

As another example, multiple truck and rail freight companies have spoken at TDWI conferences about how sensor data (from vehicles and shipping containers) helps them make logistical operations and routes more efficient, keep customers happy with fast, auditable service, and reduce insurance rates by proving that vehicles and cargos are handled legally and safely.

About the Author

Philip Russom is director of TDWI Research for data management and oversees many of TDWI’s research-oriented publications, services, and events. He is a well-known figure in data warehousing and business intelligence, having published over 600 research reports, magazine articles, opinion columns, speeches, webinars, and more. Before joining TDWI in 2005, Russom was an industry analyst covering BI at Forrester Research and Giga Information Group. He also ran his own business as an independent industry analyst and BI consultant and was a contributing editor with leading IT magazines. Before that, Russom worked in technical and marketing positions for various database vendors. You can reach him at @prussom on Twitter and on LinkedIn at linkedin.com/in/philiprussom.
