TDWI | Training & Research | Business Intelligence, Analytics, Big Data, Data Warehousing

Think
- Research & Resources
  - TDWI Playbook | Next Generation Data Science: The AI-Driven Data Science Life Cycle
  - TDWI Data Points | The Data Foundation for AI
  - TDWI Best Practices Report | Data Strategies and Foundations for Modern Data Management
  - TDWI Insight Accelerator | Adopting a Platform Approach for Gaining Insights from Unstructured Data
- Webinars
  - Expert Panel: What's Next in Data Integration: Powering the AI-Driven Enterprise August 25, 2025
  - Expert Panel: Improving Data Quality, Accuracy, and Consistency August 27, 2025
  - The State of Self-Service Analytics: Results from TDWI’s Latest Research September 8, 2025
  - Expert Panel: Building an AI-Driven Data Strategy September 15, 2025
- Virtual Summits
  - Virtual Events Keys to Making Your Data AI Ready September 10, 2025
  - Virtual Events Data Quality for BI, Analytics and AI October 22, 2025
  - Virtual Events Modern Data Strategy November 12, 2025
  - Virtual Events What’s Ahead in 2026 for Data & Analytics December 10, 2025
- By Topic
  - By Topic
    
    Explore the Latest AI, Analytics, and Data Research and Training by Topic
  - BI, Analytics, and Data Literacy
  - AI, Data Science, and Machine Learning
  - Data Management and Governance
  - Platforms and Architecture
  - Strategy and Methods
- Speaking of Data Podcast
  
  Current Research Surveys
Train
- In-Person Events
  - Conference TDWI Transform 2025 San Diego August 18, 2025
  - Executive Summit TDWI Modern Data Leader's Summit San Diego: AI in the Enterprise August 18, 2025
  - Conference TDWI Transform 2025 Orlando November 16, 2025
  - Executive Summit TDWI Data & AI Leaders Summit Orlando: Governing Data, Analytics, and AI November 17, 2025
- Virtual Live Seminars
  - Data Governance Week July 30, 2025
  - Platforms & Architecture Week July 30, 2025
  - AI Bootcamp Week July 30, 2025
- Online Learning
- By Topic
  - By Topic
    
    Explore the Latest AI, Analytics, and Data Research and Training by Topic
  - BI, Analytics, and Data Literacy
  - AI, Data Science, and Machine Learning
  - Data Management and Governance
  - Platforms and Architecture
  - Strategy and Methods
- Train Your TeamCustom solutions for training your team
  
  Get CertifiedEarn a professional credential in BI and Analytics, Data Governance, or AI
  
  TDWI MembershipExclusive access to the research, tools, training, and connections
Engage
- Connect
  - Connect and Contribute to Our Vibrant Community of Data Leaders
    
    Subscribe to TDWI Stay up to date on the latest news and events. Sign Up
    
    Become a TDWI Member Gain exclusive access to the research, tools, training, and connections to move your careers, teams, and projects forward. Learn More
    
    Become a Part of the TDWI Research Panel Make a difference in the data and analytics industry and earn incentives by sharing your insights with TDWI. Explore Now
    
    Speak at TDWI Events Share your expertise and build your personal brand as a speaker at a TDWI In-Person or Virtual Event. Submit a Proposal
    
    Become a TDWI Research Fellow Apply to be a member of TDWI’s industry leading research team. Apply Today
    
    Become a Member of the Data & AI Leaders Forum Engage in collaborative discussions, stay ahead of the curve, and stay in the know. Apply Now
    
    Showcase Your Data & AI Solutions Reach and engage with TDWI community through multi-channel marketing programs. Learn More

RESEARCH & RESOURCES

Brighthouse Data Warehouse Works Smarter to Accelerate Queries

Infobright says its Brighthouse data warehouse works "smarter, not harder" to accelerate query performance

By Stephen Swoyer
April 23, 2008

Data columnar databases are an overnight sensation more than 20 years in the making.

Where columnar databases were once the odd stepchildren of the data warehousing world, they've grown in popularity largely thanks to their fast analytic query performance.

Last year, no less than three columnar database releases arrived, from ParAccel Inc., Vertica Systems Inc., and Infobright Inc. These players join Sybase Inc. and its venerable Sybase IQ in offering a columnar (or mostly-columnar) take on data warehousing.

Where Sybase IQ was once a tough sell, columnar vendors are finding that -- thanks to the technology's extremely fast query performance and extremely small data footprint -- it's a technology that largely sells itself.

Each vendor touts a different spin on technology. ParAccel likes to tout that it was first to market (among the columnar upstarts) and says it has an impressive list of benchmarks and proofs-of-concept. Vertica trumpets the extensive R&D it put into its DW technology, which it says amounts to a 21st-century retrofitting of a 20-year-old technology. Infobright is proud of the native intelligence of its Brighthouse data warehouse, which it says works "smarter, not harder" to quickly process queries.

"We're very, very different from the other technologies in the marketplace," argues Miriam Tuerk, Infobright's president and CEO. "With most [competing solutions], the output of what you're getting is quite rigid, so if you want to look at data in the way that it's been structured, and run a query against a particular case, that works. But if you want something more flexible and changing over time, that's where you run into problems."

That's in part because mainstream data warehouses (DWs) are basically just gussied up relational database management (RDBMS) platforms, Tuerk argues. "They're using [their] all-purpose relational database systems, and in order to turn them into data warehouse systems, they [become] very resource-intensive -- they require a lot of data center operations time, a lot of server time, etc."

One way to address this problem has been to throw lots of hardware at it, Tuerk contends; still others use partitions, indexes, or materialized views.

These are smart approaches, Tuerk and Infobright concede, but they aren't smart enough: while they may accelerate the performance of traditional RDBMSes (used as data warehouses), they don't address the shortcomings of the RDBMS-as-a-DW paradigm.

That's where Infobright's version of "smarter" comes into the picture. "We're all about working smarter instead of harder. We actually use the intelligence of the data to help accelerate our performance," she comments. "There's a lot of information in the data that can give you roadmaps and visibility in terms of how to do things."

Infobright uses this information -- call it naturally-occurring metadata -- to accelerate queries, reduce data volumes (and, consequently, data loading times), and perform other tricks. Its Brighthouse DW can be implemented on either a standalone basis or alongside a traditional data warehouse. In the latter scenario, data can be offloaded from a data warehouse into Brighthouse, which can be used to accelerate particularly intensive analytic queries.

One upshot of this is better performance all around, Tuerk argues: vanilla DW users benefit, as the original data warehouse -- which isn't optimized for analytic queries -- performs workloads that it does excel at (e.g., transaction-based tasks), while intensive workloads run faster because they're offloaded to Brighthouse.

It works almost automatically, Tuerk claims: as data's fed into it, Brighthouse creates what Infobright calls a "Knowledge Grid" -- i.e., a kind of index or metadata layer -- across every column in the data warehouse.

"We just load the data in as it comes. We break the data into individual columns, then break [those] into data packs [of 65,000 elements]. Instead of just using a standard compression algorithm against the data, we look at it [the data] as we're loading it into the data packs and we figure out how to store it most efficiently," Tuerk explains. "We can optimize for individual data packs. Every single data pack in a column will have a different compression value. That's how we get an average of at least 10 to 1 compression in Brighthouse."

Brighthouse's Knowledge Grid isn't just a one-trick pony: it actually uses three different metadata layers. "It keeps information about every single data pack and the min/max [the average range of data] in every single data pack," Tuerk says. "The next layer is metadata about data that's inside the data pack, so we would have a histogram of the data inside the data pack. The third level is relationships about the data. When you're looking at data warehousing, relationships [among] data [are] very important."

How much of a bite does its metadata-laden Knowledge Grid take out of the 10:1 compression? A tiny sliver, says Tuerk, who claims that "the metadata layer is the size of one percent of the compressed data."

This one percent sliver delivers a big bonus, she argues. "Most of the time, the Knowledge Grid has sufficient information between its relationship level, its detail data level and its parameters around the data pack to actually give you [the answer to a query]," she comments. "The Knowledge Grid sits in memory, so if you take into account bad data packs -- [i.e., data packs which are] completely irrelevant to a query -- you can prioritize the order of operations [in memory], and, depending on how you do things, you can often times eliminate even the ugly ones.

"The [query] optimizer doesn't calculate the query path just once. It calculates the best query path, runs it against the Knowledge Grid, then it looks to see how it can reoptimize [for performance]. So the way that I resolve the query isn't to access the data by using more MIPS processing or more memory or more storage, [it's] to figure out how to do the query without using actually doing the calculation at all."

Brighthouse is based on a hot-rodded version of MySQL, Tuerk says, complete with substantial under-the-hood modifications. For this reason, it's able to take advantage of MySQL's interoperability with most business intelligence (BI) and performance management (PM) tools

"We wrapped MySQL around the software; everything is beneath the covers, so out of the gate we work with all of the business intelligence tools: Business Objects, SAS, Cognos, MicroStrategy." Brighthouse also helps optimize one of the most onerous aspects of building and deploying an enterprise data warehouse: designing a data model.

"When you design a data model, there are two parts: a logical data model and a physical data model. The second part [of a data model] is where I have to create indexes, partitions, materialized views, [and] it's that second component that is totally eliminated," she explains.

TDWI Membership

Get immediate access to training discounts, video library, research, and more.

Find the right level of Membership for you.

Learn More

↑

Research & Resources

Webinars

Virtual Summits

By Topic

In-Person Events

Virtual Live Seminars

Online Learning

By Topic

Connect and Contribute to Our Vibrant Community of Data Leaders

RESEARCH & RESOURCES

Brighthouse Data Warehouse Works Smarter to Accelerate Queries

TDWI Membership

Get immediate access to training discounts, video library, research, and more.

TDWI

Engage

Research