TDWI | Training & Research | Business Intelligence, Analytics, Big Data, Data Warehousing

TDWI Articles

Technology Swaps: Why You Must Unlearn What You Think You Know

Are you considering swapping one technology or solution for another? That may be a wise move, but be sure you aren’t basing your decision on dangerous (and mistaken) assumptions.

By Mark Madsen
March 25, 2016

The IT market is facing a big change in technologies used to process, store, and manage data. It seems like everybody is talking about using new databases and platforms for different tasks. Terms such as “polyglot persistence” are being bandied about to describe the “best-of-breed” approach where many different databases are used to manage data across the organization.

Many of the vendors would like you to believe that changing one technology for another is easy. The reality can be far different because there are tradeoffs made in adopting these different technologies, tradeoffs that may not be apparent at the outset.

Swapping technology A for technology B rests on the assumption that what you're dealing with is (1) a technology problem and (2) one where the new technology can be substituted, like for like, for the old, possibly conferring additional benefits in the process. Sometimes this works, as when organizations swapped out their old hierarchical and network databases for relational databases. Sometimes it doesn’t, as when organizations tried to trade relational for object databases (a technology that rose and crashed in the mid-90s).

Tech swapping is the right response when you have a problem and are working with a similar type of technology -- for example, swapping a SQL Server RDBMS for an Oracle RDBMS. It's also the right decision when you're making a class change -- for example, you're swapping out an overmatched, traditional database for a massively parallel processing (MPP) system. In this case, as with the prior like-to-like example, the basic principles are the same: you're just shifting from a general-purpose, non-MPP relational database to a parallel relational database that's optimized for query-processing performance.

Tech swapping can also work if you're swapping in a better-suited but totally different type of technology. Imagine swapping out the costly Oracle database that's powering your under-performing website for a NoSQL database such as Cassandra. Imagine a similar swap, albeit one that involves replacing one RDBMS (Oracle) with another -- namely, a sharded MySQL database. (“Sharding” MySQL involves breaking up or distributing the data in one database into multiple databases across multiple computers. The term “sharding” comes from the pieces of glass, or shards, from breaking a mirror, a play on words to do with mirroring databases for read performance.)
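To make the idea of sharding concrete, here is a minimal sketch in Python. It is an illustration only, not how MySQL or any particular product does it: the names `shard_for` and `ShardedStore` are hypothetical, plain dictionaries stand in for the separate databases, and hash-based routing is just one assumed way of deciding which shard holds a key.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index with a stable hash (same key, same shard)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Toy stand-in for several separate databases; each dict is one shard."""

    def __init__(self, num_shards: int) -> None:
        self.shards = [{} for _ in range(num_shards)]

    def put(self, key: str, value) -> None:
        # A write goes to exactly one shard, chosen by the key.
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str):
        # A lookup by key also touches exactly one shard.
        return self.shards[shard_for(key, len(self.shards))].get(key)

    def scan_all(self):
        # A query that is not driven by the key has to visit every shard.
        for shard in self.shards:
            yield from shard.items()


store = ShardedStore(num_shards=4)
for user_id in ("alice", "bob", "carol", "dave", "erin"):
    store.put(user_id, {"visits": 1})
```

Even in this toy version a tradeoff shows up: reads and writes by key stay cheap, but anything that cuts across keys (the `scan_all` case) now has to fan out to every shard, which is work a single database used to do for you.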

Here there be dragons. When you make this kind of change, you run afoul of the things you don’t know that you don’t know. First, you discover that such things exist. Second, you learn that some of what you “know” about one type of technology won’t help you with the new one. In fact, the intuition you developed over years of experience may tell you to do the exact opposite of what you need to do. In other words, you must relearn and, more important, unlearn.

For example, if you are experienced at data modeling in an RDBMS world, your ideas about how to organize data for performance or to make change easier will be very different from what is needed in most NoSQL databases. Best practices for building data models in SQL Server and in Teradata are not all that different from each other, but they can be bad practices in a different type of database such as Cassandra.
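A small sketch may help show how the modeling instinct flips. It uses plain Python data structures rather than any real database, and the table and function names are invented for illustration. The first half follows the relational habit of normalizing and combining at query time; the second follows the query-first, denormalized style that is commonly recommended for stores such as Cassandra, where data is laid out by the key you will read it by.

```python
from collections import defaultdict

# Relational habit: normalized tables, combined at query time.
customers = {1: {"name": "Ada"}, 2: {"name": "Grace"}}
orders = [
    {"order_id": 10, "customer_id": 1, "total": 25.0},
    {"order_id": 11, "customer_id": 2, "total": 40.0},
    {"order_id": 12, "customer_id": 1, "total": 15.0},
]


def orders_for_customer_normalized(customer_id):
    """Join-style lookup: filter orders, then attach the customer name."""
    name = customers[customer_id]["name"]
    return [
        {"customer_name": name, "order_id": o["order_id"], "total": o["total"]}
        for o in orders
        if o["customer_id"] == customer_id
    ]


# Query-first habit: one table per query, grouped by the lookup key,
# with the customer name deliberately duplicated into every row.
orders_by_customer = defaultdict(list)
for o in orders:
    orders_by_customer[o["customer_id"]].append(
        {
            "customer_name": customers[o["customer_id"]]["name"],
            "order_id": o["order_id"],
            "total": o["total"],
        }
    )


def orders_for_customer_denormalized(customer_id):
    """Single-group read: no join, but a rename must update every copy."""
    return orders_by_customer.get(customer_id, [])
```

Both functions return the same rows. What changes is where the cost lands: the normalized form keeps one copy of each fact and pays at read time, while the denormalized form pays at write time and in duplicated data, which is exactly the kind of practice a relational modeler has been trained to avoid.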

Changing from one type of technology to another has deeper and broader repercussions. Swapping in new, dissimilar technology affects more than simple technical interfaces. There are different development techniques and different management practices. Fundamentally, change of this kind affects the architecture of your systems.

The most common mistake people make with technology procurement is failing to recognize when they're contemplating a change that will affect the architecture of the system they are managing. Exhibit A is when an organization decides to replace a database with Hadoop.

This can be a good idea, as when you need to support analytics model building and execution. This workload usually means there are a smaller number of users, but those users may read -- and more important, write back -- enormous volumes of data. The algorithms they use are often iterative: reading, calculating, and then re-reading and re-calculating all that data. Contrast that with the workload of a business intelligence system. These systems tend to have more users, reading but never writing data, in a single pass with no iteration. This workload is what parallel relational databases were designed (some might say perfected) to run.

Moving a BI workload like this to Hadoop is fraught with difficulties because the design and management techniques you have learned in the relational world do not always apply. The components don’t work the same way, and the dependencies between tools change, sometimes in obvious ways and sometimes in hidden ways that are only uncovered at the most inconvenient time.

You should approach a project that is framed as a simple substitution of one product for another with caution. Learn how the new technology works and what the underlying differences from your existing technology mean. I’ve been using databases as examples, but this applies to any sort of technology.

Identify and list the tradeoffs that each of your choices makes. In the process, you may uncover things you didn’t know about what you already have. What tradeoffs are good or bad for your use? Cross-reference these and be sure to look at the secondary impacts.

For example, a flexible schema (or schema-on-read) is great for some purposes, but the tradeoff it makes is to move the enforcement of data conformance and quality to the application. This has far-reaching implications for the architecture of the application and any downstream system that might use its data. Because the data is not guaranteed to be in the correct fields, with the correct data types and without problems (such as missing values), it is the responsibility of any consuming application to address those issues when the data is read. Sometimes this is important, as with BI systems, and sometimes it isn’t, as with analytics model building (because that involves data preparation unique to each model and choice of data).
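Here is a rough sketch of what that shift looks like from the consuming application's side. Everything in it is assumed for illustration: the record layout (`order_id`, `amount`, `region`), the `read_order` helper, and the choice to drop bad records rather than repair them. The point is only that checks a relational schema would have enforced at write time now have to be written, and maintained, by every reader.

```python
import json


def read_order(raw_line):
    """Parse one raw record and enforce types at read time.

    Returns a cleaned dict, or None if the record cannot be used.
    """
    try:
        record = json.loads(raw_line)
    except json.JSONDecodeError:
        return None
    if not isinstance(record, dict):
        return None
    try:
        return {
            "order_id": int(record["order_id"]),  # required, must be integer-like
            "amount": float(record["amount"]),    # required, must be numeric
            "region": str(record.get("region") or "UNKNOWN"),  # optional field
        }
    except (KeyError, TypeError, ValueError):
        return None


# Raw data as it might land in a schema-on-read store: nothing was checked on write.
raw_lines = [
    '{"order_id": 1, "amount": "19.99", "region": "EU"}',
    '{"order_id": "2", "amount": 5}',
    '{"order_id": 3, "amount": "n/a"}',
    '{"amount": 7.5}',
    'not json at all',
]

clean = [r for r in (read_order(line) for line in raw_lines) if r is not None]
rejected = len(raw_lines) - len(clean)
```

Two different consumers could easily make two different choices here (reject versus default, strict versus lenient typing), which is why the decision reaches into the architecture of everything downstream.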

As with any design decision, the trick is to start with the goal and the problem(s) you’re trying to solve. If your problem is performance, you may have a simple technology problem. In this case, tech swapping -- swapping in Tech B, a new and unknown thing of a different type for Tech A, your existing solution -- could be a mistake. I don't mean to sound like a grouch, but people do this all of the time. They see that Tech A is slow, unresponsive, and can't support high levels of concurrency. They see that Tech B is used for big things by big serious companies, can scale to huge numbers of nodes, and is said to support high concurrency. Therefore B should replace A.

What they don't consider is that their issues may be a function of poor system design or under-provisioned resources, still the two most common sources of performance problems in the BI market.

When someone gives you a recommendation to try a new technology, look at it carefully. (Look at it with especial care if the recommendation came from a senior executive or one of your internal application developers.) In a well-understood market with the same types and classes of technologies, this should be a relatively easy decision.

When it involves a recommendation for a technology of a similar type but different class, the decision is harder. A good example of this is a database optimized for an OLTP workload versus one optimized for BI workloads. When it's a recommendation for a technology of a completely different type, you need to really focus on the tradeoffs it makes and what problems it has chosen to favor over others. This has implications for the other areas of your architecture.

Pay special attention to the second hardest problem: that you don’t know what you don’t know. The hardest problem? As Mark Twain said, “It ain't what you don't know that gets you into trouble. It's what you know for sure that just ain't so.”

About the Author

Mark Madsen is a fellow at Teradata in the Technology and Innovation Office. He focuses on data science and analytics ecosystems, problems of large-scale application, and complex systems. Prior to that he was president of Third Nature, where he advised organizations on strategy and technology for data science and analytics.

Mark spent most of the past 25 years working in the analytics field, starting with AI at the University of Pittsburgh and autonomous robotics at Carnegie Mellon University. He is also involved in technology research, speaks internationally, and chairs several industry conferences.
