TDWI | Training & Research | Business Intelligence, Analytics, Big Data, Data Warehousing


TDWI Blog

TDWI Blog: Data 360

Q&A RE: Data Warehouse Architecture Issues

Attendees of a recent TDWI Webinar asked excellent questions.
By Philip Russom, TDWI Research Director for Data Management

Recently, on Tuesday, April 15, 2014, I broadcast a TDWI Webinar in which I presented some of the findings from my new TDWI report, Evolving Data Warehouse Architectures in the Age of Big Data. You can download a free copy of the report in a PDF file. And you can replay the Webinar.

Attendees of the Webinar posed several very good questions about various issues in data warehouse architecture. Please allow me to share a few of the attendees’ questions and the answers I sent them via e-mail:

Q. As we update our data warehouse from more reporting to more analytics functions, should we design a brand new data warehouse architecture, or improve from the existing one?

If the existing data warehouse and its architecture fulfill business requirements and technical performance requirements (for speed and scale), then you should try to build out the existing architecture. For that to work, your existing vendor platform under the warehouse must perform well with multiple mixed workloads, including analytic workloads; ask your vendor representative for customer references who’ve succeeded with mixed workloads. Also, building up data sets for advanced analytics typically means loading large data volumes into the warehouse, which may cost more money with some licenses; again, ask your vendor if there are such ramifications under your current license.

If your current core warehouse platform cannot support mixed workloads with high performance (or adding analytic data costs too much money), you may decide to manage and process large data sets for advanced analytics on a separate standalone platform that integrates with your warehouse. But in that case, you still keep your existing data warehouse and most of its data structures intact, just making slight changes for better integration with the new additional platform(s) for advanced analytics.

Q. Given the lack of integration across this multi-platform [data warehouse] environment, how do we avoid the need to replicate DW transactional sources into the big data platforms, as transactions are required in mining?

Good question, and there are a number of issues here. First, a well-designed multi-platform environment won’t suffer a “lack of integration.” TDWI’s definition of “logical data warehouse” is that the logical design specifies integration schemes (not just data models) across physically distinct platforms, whether that integration takes a data model approach (as in shared or conformed dimensions, etc.) or a data integration approach (as in jobs for ETL, replication, etc.) or both. Second, I take your point, that replicating data more than needed can lead to a variety of problems, as data gets out of sync and loses integrity. A good architecture can minimize replication, and sometimes alleviate it. Third, for decades, users have faced the same decision you’re looking at: do we store, manage, and analytically process our rich, valuable collection of transactional data in the warehouse proper or on a standalone but integrated platform, such as the usual operational data store (ODS)?

For years, a solution I’ve seen users successfully adopt is to deploy a homegrown ODS that they’ve designed and optimized for transactions. The ODS is on a standalone platform that’s integrated with the core warehouse (plus other ODSs, marts, etc.), running on a relational DBMS atop commodity priced hardware. Note that the upcoming trend is toward ODSs atop Hadoop (but only if the data volumes are massive). The idea is to manage transactional data on a platform that’s much cheaper than the DW, on a standalone platform where the relentless sorting, updating, and processing of that data won’t degrade warehouse performance. Yet, the ODS is easily reached from all tools, plus through data federation and virtualization as well, which minimizes the replication of transactional data.

If you give the ODS the capacity it needs to persist multiple sort orders and data subsets in the ODS, then copying data outside the ODS is further reduced. Also, if you use data mining tools that can work on data “in situ” (i.e., in the ODS’s relational database) without moving data to the tool, then that also reduces copying and moving transactional data.
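As a rough illustration of the “in situ” point, here is a minimal Python sketch. It is not drawn from the report: the in-memory SQLite database, the `transactions` table, and the function names are all hypothetical stand-ins for an ODS and an analysis tool. It contrasts copying every row out to the tool with pushing the aggregation into the database so that only summary rows leave it.

```python
import sqlite3


def build_ods():
    """Create a tiny in-memory stand-in for an ODS table of transactions."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE transactions (customer_id INTEGER, amount REAL)")
    conn.executemany(
        "INSERT INTO transactions VALUES (?, ?)",
        [(1, 20.0), (1, 35.0), (2, 10.0), (3, 50.0), (3, 5.0)],
    )
    return conn


def totals_by_copying(conn):
    """Copy every row out of the database, then aggregate inside the tool."""
    totals = {}
    query = "SELECT customer_id, amount FROM transactions"
    for customer_id, amount in conn.execute(query):
        totals[customer_id] = totals.get(customer_id, 0.0) + amount
    return totals


def totals_in_situ(conn):
    """Push the aggregation into the database; only summary rows leave it."""
    query = (
        "SELECT customer_id, SUM(amount) "
        "FROM transactions GROUP BY customer_id"
    )
    return dict(conn.execute(query))


if __name__ == "__main__":
    ods = build_ods()
    print(totals_by_copying(ods))
    print(totals_in_situ(ods))
```

Both functions return the same totals; the difference is how much data crosses the boundary between the database and the tool, which is the copying the paragraph above is concerned with.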

Q. The need for data warehouses is never going to go away. But isn’t the separation between "operations" and "analytics" starting to blur? In other words, the future isn't DWE; it's a "data environment" that does both.

Operational BI is all about getting operational data into BI faster and more frequently, while also embedding BI functions in operational applications and their processes as well. Operational BI is a very popular practice. It has been for years, and will get even more popular, as organizations adjust their BI efforts to bring them closer to real time (to be more competitive, customer conscious, efficient, etc.). The widespread existence of operational BI corroborates that the line between operations and BI is already quite blurred and will become even more so.

In another trend, many organizations are purposefully evolving toward a more or less loosely unified data environment for most enterprise data. I say “more or less” and “loosely” because early adopters are quick to say that the architecture is not 100 percent of the enterprise and integration is spotty, on an “as needed” basis. As one architect joked, “it’s more archaeology than architecture, because the work usually consists of imposing a logical architecture over mature, preexisting systems.” For early adopters, it makes sense to architect data globally, when customer data and some other data domains are pervasively shared across multiple applications, departments, and processes. It also makes sense in firms where business processes ramble across multiple business units and IT systems. Obviously, there’s an infinitude of resulting enterprise data architectures.

The data warehouse environment (DWE) I’m describing is a local microcosm of such a broad and loosely unified multi-platform data architecture. However, in some organizations today, the data warehouse and similar data platforms are just a few among many other data platforms, integrated on an enterprise scale. But those organizations are as yet the minority, although we at TDWI expect it to be the norm for IT-intense organizations within five years. TDWI’s Vegas conference has been devoted to issues in enterprise-scale data architecture for years, and will continue to be. You might consider attending next February.

Q. Can you point us to white papers on the difference between reporting and analytics [and how that affects DW architecture]?

You can read my blog on the subject. Or you could read the new report on evolving data warehouse architectures, because I adapted material from the blog to become a section in the report, starting on page 24.

Q. What’s the role, or is there a role, for variants like an ODS in the new world [of data warehouse architectures]? Is it part of the real-time world?

Historically, some of the first standalone systems in a multi-platform data warehouse (going back to the mid-1990s) were ODSs deployed on their own hardware server with their own DBMS instances. These are still with us, and will continue to be with us, as data warehouse environments evolve into even more platforms used at once. An ODS can be designed and optimized by users for a wide range of data domains and uses (including real-time data), but I’m currently seeing a lot of users deploying ODSs for various types of big data and other data earmarked for advanced analytics.

Q. Saying Inmon vs. Kimball is no longer relevant is like saying Newton is no longer relevant in the world of physics today. It's still important, maybe not as fundamental as 1–2 decades ago.

For decades, Newton practiced alchemy in his copious spare time, because he was convinced that changing lead to gold was possible. Our heroes aren’t always 100 percent right.

Concerning Inmon and Kimball, see the top of page 7 in the report. Also please read the User Story on that same page. “No longer relevant” is your phrase, not mine. In my view, Inmon and Kimball’s innovations are as relevant as ever, and are still being applied daily. And they just keep giving: Inmon has recently extended our understanding of unstructured data and Kimball is currently working on new best practices for Hadoop.

It’s the users who’ve changed. Instead of arguing about which to choose, users choose to apply Inmon and Kimball techniques (and others, too) in the same extended warehouse environment. And that’s a wise choice on their part, since hybrids and diversity seem to be winning strategies for a growing number of user organizations and their diversified DW architectures nowadays.

Q. Some organizations consider Hadoop a replacement for their current DW appliance. How is this possible?

As I said in the Webinar, I’ve only found two organizations that took out a data warehouse and put Hadoop in its place. While that corroborates that a replacement is possible, it’s not likely, nor is it a compelling trend.

Instead of replacement, we at TDWI see far more users augmenting their data warehouse environment with the Hadoop Distributed File System (HDFS), plus related Hadoop tools, especially MapReduce, Hive, HBase, and Pig. In short, HDFS handles things that relational warehouses are not designed for, such as unstructured data, algorithmic analytics, millions of files, and petabyte-size data sets. But the relational warehouse is still best for the structured and multidimensional data that goes into standard reports, performance management, and set-based analytics (typically OLAP or SQL-based analytics).

Another possibility is that Hive atop MapReduce and HDFS makes a highly scalable “row store” type of database. Sometimes you don’t need a full-featured (and expensive) relational DBMS, and hence a row store will do just fine. For example, many of the ODSs found today in data warehouse environments are candidates for migration to Hadoop. That includes ODSs that manage large “archives” (I use the word loosely) of transactional data and other operational data that’s persisted and kept long-term for advanced analytics that just need simple tabular structures. Most standalone ODSs of that description today run on mature DBMSs, but could run almost as well (for less money) on Hadoop.

Finally, let’s remember that not all organizations need a data warehouse, as represented by 15 percent of survey respondents.

Q. Can you recommend any sample success stories on how to integrate Hadoop or similar big data into an existing data warehouse [environment]?

Yes, many real-world use cases and user stories are discussed in the 2013 TDWI report Integrating Hadoop into Business Intelligence and Data Warehousing.

Posted by Philip Russom, Ph.D. on April 30, 2014

Evolving Data Warehouse Architectures: An Overview in 35 Tweets

By Philip Russom
Research Director for Data Management, TDWI

To help you better understand the ongoing evolution of data warehouse architectures and why you should care, I’d like to share with you the series of 35 tweets I recently issued on the topic. I think you’ll find the tweets interesting because they provide an overview of big data management and its best practices in a form that’s compact, yet amazingly comprehensive.

Every tweet I wrote was a short sound bite or stat bite drawn from my recent TDWI report Evolving Data Warehouse Architectures in the Age of Big Data. Many of the tweets focus on a statistic cited in the report, while other tweets are definitions stated in the report.

I left in the arcane acronyms, abbreviations, and incomplete sentences typical of tweets, because I think that all of you already know them or can figure them out. Even so, I deleted a few tiny URLs, hashtags, and repetitive phrases. I issued the tweets in groups, on related topics; so I’ve added some headings to this blog to show that organization. Otherwise, these are raw tweets.

Basic Components of the Average Data Warehouse Architecture

Most DW Arch’s have 4 layers: logical, physical, hardware topology, data standards.
DW logical architecture is mostly about data models, entity models & relationships.
DW logical arch also defines standards for data models, dev practices, interfaces, etc.
DW physical architecture is mostly a plan for data deployment on servers.
DW physical arch also defines topology for hardware & software servers plus interfaces.

Users’ Views of Architectural Components

#TDWI SURVEY SEZ: Data standards & rules are highest priority (71%) of #EDW architecture.
#TDWI SURVEY SEZ: Logical design (66%) is the starting point of an #EDW architecture.
#TDWI SURVEY SEZ: Physical plan (56%) locates logical pieces in an #EDW architecture.
#TDWI SURVEY SEZ: Only 12% have #EDW that’s “collection of data & platforms without a plan.”
#TDWI SURVEY SEZ: Only 12% feel Inmon vs Kimball argument is priority for #EDW architecture.

The Evolution of Data Warehouse Architectures

#TDWI SURVEY SEZ: 79% say their #DataWarehouse has an architecture.
#TDWI SURVEY SEZ: #EDW arch is evolving dramatically (22%), moderately (54%) or slightly (22%)
#TDWI SURVEY SEZ: Driving #EDW arch evolution: #Analytics 57%, #BigData 56%, #RealTime 41%.
#TDWI SURVEY SEZ: Driving #EDW arch evolution: BizPerfMgt 38%, OLAP 30%, UnstrucData 25%.
#TDWI SURVEY SEZ: Driving #EDW arch evolution: competition 45%, compliance 29%, dep’ts 29%.

The Importance of Data Warehouse Architectures

#TDWI SURVEY SEZ: Architecture extremely (79%) or moderately (19%) important to #EDW success.
#TDWI SURVEY SEZ: #EDW Architecture is an opportunity (84%), not a problem (16%).

Benefits and Barriers for Data Warehouse Architecture

#TDWI SURVEY SEZ: Stuff that benefits from #DWarch: #analytics, biz value, data breadth.
#TDWI SURVEY SEZ: Barriers to #DWarch success: skills gap, sponsorship, #DataMgt, funding.

Multi-Platform Data Warehouse Environments

#EDWarch trend: more standalone platforms: #analytics DBMSs, columnar, appliances, #Hadoop, etc.
As #EDW workloads get more diverse, so do types of standalone data platforms in #EDW environment.
As types and numbers of data platforms grow in DW environs, architecture gets ever more distributed.
Distributed #EDWarch is good&bad: provides workload optimized platforms. But may spawn data silos.
Logical layer of #EDWarch more important than ever to unite big design across multi data platforms.

Single-Platform versus Multi-Platform DW Architectures

#TDWI SURVEY SEZ: Totally pure #EDWarchs are rare. Only 15% have central monolithic #EDW.
#TDWI SURVEY SEZ: Hybrid #EDWarchs are most common today = central #EDW + a few other data platforms (37%).
#TDWI SURVEY SEZ: 2nd most common Hybrid #EDWarch = central #EDW + many other data platforms (16%).
#TDWI SURVEY SEZ: Sometimes #EDW plays small role in #EDWarch compared to workload platforms (15%).
#TDWI SURVEY SEZ: Some organizations (15%) have many workload-specific data platforms, but no true DW.

Big Data’s Influence on Evolving DW Architectures

#TDWI SURVEY SEZ: 41% will extend existing core #EDW to handle #BigData.
#TDWI SURVEY SEZ: 25% will deploy new data platforms to handle #BigData.
#TDWI SURVEY SEZ: 23% have no strategy for their #EDW’s architecture, though they need one.
#TDWI SURVEY SEZ: Only 6% feel they don’t need a strategy for their #EDW’s architecture.

Reports and Analytics have Different DW Architecture Needs

Many users preserve #EDW for reporting, BizPerfMgt & OLAP, but take #analytics data elsewhere.
Data prep for reports differs from same for #analytics. So, many users prep data on separate platforms.

Want to learn more about evolving data warehouse architectures?

For a more detailed discussion—in a traditional publication!—get the TDWI Best Practices Report, titled Evolving Data Warehouse Architectures in the Age of Big Data, which is available in a PDF file via a free download.

You can also register for and replay my TDWI Webinar, where I present the findings of the TDWI report Evolving Data Warehouse Architectures in the Age of Big Data.

Posted by Philip Russom, Ph.D. on April 15, 2014

Big Data and the Public Cloud

TDWI just released my newest Checklist Report, Seven Considerations for Navigating Big Data Cloud Services. The report examines what enterprises should think about when evaluating the use of public cloud services to manage their big data. The cloud can play an important role in the big data world since horizontally expandable and optimized infrastructure can support the practical implementation of big data. In fact, there are a number of characteristics that make the cloud a fit for the big data ecosystem. Four of these include:

Scalability. Scalability with regard to hardware refers to the ability to go from small to large amounts of processing power with the same architecture. The cloud can scale to large data volumes. Distributed computing, an integral part of the cloud model, works on a “divide and conquer” plan. So if you have huge volumes of data, they can be partitioned across cloud servers.
Elasticity. Elasticity refers to the ability to expand or shrink computing resources in real time, based on need. This means that you have the potential to access as much of a service as you need, when you need it. This can be helpful for big data projects where you might need to expand the amount of computing resources to deal with the volume and velocity of the data.
Resource pooling. Cloud architectures enable the efficient creation of groups of shared resources that make the cloud economically viable.
Self-service. This refers to the ability of a user to run a set of cloud resources via a portal or browser interface. This is different than requesting it from your IT department.
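To make the “divide and conquer” idea under Scalability a little more concrete, here is a small Python sketch. It is only an illustration under stated assumptions, not something taken from the Checklist Report: the function names are hypothetical, and CRC32 hashing stands in for whatever partitioning scheme a real cloud service would use. Records are spread across a number of servers by a stable hash of their key, each server sums its own shard, and the partial sums are then combined.

```python
import zlib
from collections import defaultdict


def partition_for(key: str, num_servers: int) -> int:
    """Map a record key to a server index using a stable hash (CRC32)."""
    return zlib.crc32(key.encode("utf-8")) % num_servers


def partition_records(records, num_servers):
    """Group (key, value) records by the server that would hold them."""
    shards = defaultdict(list)
    for key, value in records:
        shards[partition_for(key, num_servers)].append((key, value))
    return shards


def distributed_total(records, num_servers):
    """Divide: each shard is summed on its own. Conquer: combine the partial sums."""
    shards = partition_records(records, num_servers)
    partial_sums = [sum(value for _, value in shard) for shard in shards.values()]
    return sum(partial_sums)


if __name__ == "__main__":
    data = [(f"user{i}", i) for i in range(100)]
    print(distributed_total(data, num_servers=4))
```

Adding servers only changes `num_servers`; the logic stays the same, which is the sense in which the same architecture goes from small to large.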

For instance, you might want to use a public cloud to run your real-time predictive model against high volumes of data because you don’t want to use your own physical infrastructure to do so. Additionally, some companies are using the public cloud to explore big data, and then move certain information to the data warehouse. In effect, the cloud extends the data warehouse. There are numerous use cases emerging for big data in the cloud.

TDWI is starting to see an uptick in interest in the public cloud for BI and analytics. For example, in our recent quick survey of users who attended our Las Vegas World Conference, only about 25% of respondents said they would never use the public cloud for BI or analytics. The rest were either currently using the cloud (about 18%) or were actively looking into it or considering it as a possibility. We saw a similar response in a quick survey we did at our Boston conference in the fall of 2013. This will be an active area of research for TDWI this year, so stay tuned!

For more on big data in the cloud, also refer to Big Data for Dummies.

Posted by Fern Halper, Ph.D. on March 3, 2014

Q&A RE: The State of Big Data Integration

It’s still early days, but users are starting to integrate big data with enterprise data, largely for business value via analytics.

By Philip Russom, TDWI Research Director for Data Management

A journalist from the IT press recently sent me an e-mail containing several very good questions about the state of big data relative to integrating it with other enterprise data. Please allow me to share the journalist’s questions and my answers:

How far along are enterprises in their big data integration efforts?

According to my survey data, approximately 38% of organizations don’t even have big data, in any definition, so they’ve no need to do anything. See Figure 1 in my 2013 TDWI report Managing Big Data. Likewise, 23% have no plans for managing big data with a dedicated solution. See Figure 5 in that same report.

Even so, some organizations have big data, and they are already managing it actively. Eleven percent have a solution in production today, with another 61% coming in the next three years. See Figure 6.

Does data integration now tend to be haphazard, or one-off projects, in many enterprises, or are architectural strategies emerging?

I see all the above, whether with big data or the usual enterprise data. Many organizations have consolidated most of their data integration efforts into a centralized competency center, along with a centrally controlled DI architecture, whereas a slight majority tend to staff and fund DI on a per-application or per-department basis, without an enterprise strategy or architecture. Personally, I’d like to see more of the former and less of the latter.

What are the best approaches for big data integration architecture?

Depends on many things, including what kind of big data you have (relational, other structures, human language text, XML docs, etc.) and what you’ll do with it (analytics, reporting, archiving, content management). Multiple big data types demand multiple data platforms for storing big data, whereas multiple applications consuming big data require multiple processing types to prepare big data for those applications. For these reasons, in most cases, managing big data and getting business use from it involves multiple data management platforms (from relational DBMSs to Hadoop to NoSQL databases to clouds) and multiple integration tools (from ETL to replication to federation and virtualization).

Furthermore, capturing and integrating big data can be challenging from a data integration viewpoint. For example, the streaming big data that comes from sensors, devices, vehicles, and other machines requires special event-processing technologies to capture, triage, and route time-sensitive data—all in a matter of milliseconds. As with all data, you must transform big data as you move it from a source to a target, and the transformations may be simple (moving a click record from a Web log to a sessionization database) or complex (deducing a fact from human language text and generating a relational record from it).
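For readers unfamiliar with sessionization, the “simple” transformation mentioned above can be sketched in a few lines of Python. This is a minimal sketch under assumptions that are not in the original answer: click records are taken to be `(user_id, timestamp)` pairs, a session is assumed to end after 30 minutes of inactivity, and the `sessionize` function is a hypothetical name.

```python
from datetime import datetime, timedelta

# Assumed inactivity timeout; real systems choose their own value.
SESSION_GAP = timedelta(minutes=30)


def sessionize(clicks):
    """Assign a per-user session number to each (user_id, timestamp) click.

    A new session starts when the gap since the user's previous click
    exceeds SESSION_GAP. Returns (user_id, session_number, timestamp) rows
    ordered by user and then by time.
    """
    last_seen = {}
    session_no = {}
    rows = []
    for user_id, ts in sorted(clicks, key=lambda c: (c[0], c[1])):
        previous = last_seen.get(user_id)
        if previous is None or ts - previous > SESSION_GAP:
            session_no[user_id] = session_no.get(user_id, 0) + 1
        last_seen[user_id] = ts
        rows.append((user_id, session_no[user_id], ts))
    return rows


if __name__ == "__main__":
    log = [
        ("u1", datetime(2014, 1, 22, 9, 0)),
        ("u2", datetime(2014, 1, 22, 9, 5)),
        ("u1", datetime(2014, 1, 22, 9, 10)),
        ("u1", datetime(2014, 1, 22, 10, 0)),
    ]
    for row in sessionize(log):
        print(row)
```

The complex case described above (deducing facts from human language text) would need far more than this; the sketch covers only the simple end of the range.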

What "traditional" approaches are being updated with new capabilities and connectors?

The most common data platforms being used for capturing, storing, and managing big data today are relational databases, whether based on MPP, SMP, appliance, or columnar architectures. See Figure 16 in the Managing Big Data report. This makes sense, given that in a quarter of organizations big data is mostly or exclusively structured data. Even in organizations that have diverse big data types, structured and relational types are still the most common. See Figure 1.

IMHO, we’re fortunate that vendors’ relational database management systems (RDBMSs) (from the old brands to the new columnar and appliance-based ones) have evolved to scale up to tens and hundreds of terabytes of relational and otherwise structured data. Data integration tools have likewise evolved. Hence, scalability is NOT a primary barrier to managing big data.

If we consider how promising Hadoop technologies are for managing big data, it’s no surprise that vendors have already built interfaces, semantic layers, and tool functionality for accessing a broad range of big data managed in the Hadoop Distributed File System (HDFS). This includes tools for data integration, reporting, analysis, and visualization, plus some RDBMSs.

What are the enterprise "deliverables" coming from users’ efforts with big data (e.g., analytics, business intelligence)?

Analytics is the top priority and hence a common deliverable from big data initiatives. Some reports also benefit from big data. A few organizations are rethinking their archiving and content management infrastructures, based on big data and the potential use of Hadoop in these areas.

How is the role of data warehousing evolving to meet the emergence of Big Data?

Big data is a huge business opportunity, with few technical challenges or downsides. See figures 2 through 4 in the report Managing Big Data. Conventional wisdom says that the opportunity for business value is best seized via analytics. So the collection, integration, and management of big data is not an academic exercise in a vacuum. It is foundational to enabling the analytics that give an organization new and broader insights. Any calculus for the business return on managing big data should be based largely on the benefits of new analytics applied to big data.

On April 1, 2014, TDWI will publish my next big report on Evolving Data Warehouse Architectures in the Age of Big Data. At that time, anyone will be able to download the report for free from www.tdwi.org.

How are the new platforms (such as Hadoop) getting along with traditional platforms such as data warehouses?

We say “data warehouse” as if it’s a single monolith. That’s convenient, but not very accurate. From the beginning, data warehouses have been environments of multiple platforms. It’s common that the core warehouse, data marts, operational data stores, and data staging areas are each on their own standalone platforms. The number of platforms increased early this century, as data warehouse appliances and columnar RDBMSs arrived. It’s now increasing again, as data warehouse environments now fold in new data platforms in the form of the Hadoop Distributed File System (HDFS) and NoSQL databases. The warehouse has always evolved to address new technology requirements and business opportunities; it’s now evolving again to assure that big data is managed appropriately for the new high-value analytic applications that many businesses need.

For an exhaustive discussion of this, see my 2013 TDWI report Integrating Hadoop into Business Intelligence and Data Warehousing.

Posted by Philip Russom, Ph.D. on January 22, 2014

Four Ways to Illustrate the Value of Predictive Analytics

My new (and first!) TDWI Best Practices Report was published a few weeks ago. It is called Predictive Analytics for Business Advantage. In it, I use the results from an online survey together with some qualitative interviews to discuss the state of predictive analytics, where it is going, and some best practices to get there. You can find the report here. The Webinar on the topic can be found here.

There were many great questions during the Webinar and I’m sorry I didn’t get to answer them all. Interestingly, many of the questions were not about the technology; rather they were about how to convince the organization (and the senior executives) about the value in predictive analytics. This jibes with what I saw in my research. For instance, “lack of understanding of predictive analytics” was cited as a key challenge for the discipline. Additionally, when we asked the question, “Where would you like to see improvements in your predictive analytics deployment?”, 70% of all respondents answered “education.” It’s not just about education regarding the technology. As one respondent said, “There is a lack of understanding of the business potential” for predictive analytics, as well.

Some of the questions from the audience during the Webinar echoed this sentiment. For instance, people asked, “How do I convince senior execs to utilize predictive analytics?” and “What’s the simple way to drive predictive analytics to senior executives?” and “How do we get key leaders to sponsor predictive analytics?”

There is really no silver bullet, but here are some ways to get started:

Cite research: One way is to point to studies that have been done that quantify the value. For instance, in the Best Practices Report, 45% of the respondents who were currently using predictive analytics actually measured top- or bottom-line impact or both (see Figure 7 in the report). That’s pretty impressive. There are other studies out there as well. For instance, academic studies (e.g., Brynjolfsson et al., 2011) point to the relationship between using data to make decisions and improved corporate performance. Industry studies by companies such as IBM suggest the same. Vendors also publish case studies, typically by industry, that highlight the value from certain technologies. These can all be useful fodder.
Do a proof of concept: However, these can’t really stand alone. Many of the end users I spoke to regarding predictive analytics pointed to doing some sort of proof of concept or proof of value project. These are generally small-scale projects with high business impact. The key is that there is a way to evaluate the impact of the project so you can show measurable results to your organization. As one respondent put it, “Limit what you do but make sure it has an impact.” Think through those metrics as you’re planning the proof of concept. Someone in the organization will also have to become the communicator/evangelist to get people in the organization excited about, rather than fearful of, the technology. One person told me that he made appointments with executives to talk to them about predictive analytics and show them what it could do.
BI foundation: Typically, organizations that are doing predictive analytics have some sort of solid BI infrastructure in place. They can build on that. For instance, one end user told me about how he built out trust and relationships by first establishing a solid BI foundation and making people comfortable with that and then introducing predictive analytics. Additionally, success breeds success. I’ve seen this countless times with various “new” technologies. Once one part of the organization sees something that works, they want it too. It grows from there.
Grow it by acting on it: As one survey respondent put it, “Analytics is not a magic pill if the business process is not set up.” That means in order to grow and sustain an analytics effort, you need to be able to act on the analytics. Analytics in a vacuum doesn’t get you anywhere. So, another way to show value is to make it part of a business process. That means getting a number of people in the organization involved too.

The bottom line is that it is a rare company that introduces predictive analytics and, behold, succeeds right out of the gate. Are there examples? Sure. Is it the norm? Not really. Is predictive analytics still worth doing? Absolutely!

Do you have any suggestions about how to get executives and other members of your organization to value predictive analytics? Please let me know.

Posted by Fern Halper, Ph.D. on January 20, 2014 | 0 comments

Emerging Technologies, Free of Hype

As my flight west from Orlando began its descent into San Francisco, I thought about how touching ground was a good metaphor for the just-completed TDWI World Conference. The theme of the conference was “Emerging Technologies 2014,” but one of my strongest impressions from the keynotes and sessions was the deflation of the hype surrounding those emerging technologies. Speakers addressed what’s new and exciting in business intelligence, big data, analytics, the “Internet of things,” data warehousing, and enterprise data management. However, they were careful to point out potential weaknesses in claims made by proponents of the new technologies and where spending on the new stuff just because it’s new could be an expensive mistake.

Setting the tone on Monday morning in their “Shiny Objects Show” keynote presentation, Marc Demarest and Mark Madsen debated pros and cons of new technologies, including cloud (the pursuit of “instant gratification”), in-memory computing, visualization, and Hadoop. Overall, they advised attendees to be wary of hype. “Strike out every adjective on the marketing collateral piece and see what’s left,” Demarest advised. The speakers were able to drill down to what are truly significant emerging trends, helping attendees focus on those instead of being distracted by the noise.

Evan Levy’s “Tipping the Sacred Cows of Data Warehousing” session was similarly educational. While deflating hype about various emerging technologies, Levy at the same time advised his audience to always question the value proposition of existing systems and practices to see if there might be a better way. He took particular aim at operational data stores (ODSs), noting that database and data integration technologies have matured to the point where maintaining an ODS is unnecessary.

I caught part of Cindi Howson’s session, “Cool BI: The Latest Innovations.” With guest appearances by some leading vendors to demo aspects of their products, the session covered promises and challenges inherent in several key emerging BI trends, including mobile BI, cloud BI, and visual data discovery. Cindi has just published the second edition of her book, Successful Business Intelligence, which offers a combination of interesting case studies and best practices advice to help organizations get BI projects off on the right foot and keep them going strong.

The Thursday keynote by Krish Krishnan and Fern Halper introduced TDWI’s Big Data Maturity Model Assessment Tool. Krish and Fern have been working on this project throughout 2013. It is a tool designed to help organizations assess their level of maturity across five dimensions important to realizing value from big data analytics: organization, infrastructure, data management, analytics, and governance. It is the first assessment tool of its kind. Taking such an assessment can help organizations look past the industry hype to gain a “grounded” view of where they are and what areas they need to address with better technologies and methods. Check it out!

Grounded: that’s where my plane is now, at SFO. Time to head home.

Posted by David Stodder on December 13, 2013 | 0 comments