Question and Answer: How a Column-based Architecture Can Deliver Quick Returns
Companies that improve database infrastructures by moving from a row-based to a column-based architecture can expect immediate improvements in query responses and report generation speed.
By Linda L. Briggs
September 16, 2009
The primary limiting factor in obtaining good, fast results from data isn’t the BI or analytics tools in use or even the tuning and optimization. Rather, it’s the underlying database infrastructure. So says Dan Lahl, director of analytics for Sybase.
In this interview, Lahl talks about the increasing need for speed in analytics, and how moving away from using a traditional, row-based, transactional database to run critical reporting and analytics systems can deliver that critical speed -- and improve the quality of responses by allowing organizations to query far more data at once. In moving to a columnar database design, Lahl says, companies can dramatically boost the response time and usefulness of their most data-intensive reporting and analytics -- improvements that can quickly translate to the bottom line.
BI This Week: What is behind the heightened interest we’re seeing around BI in general and analytics in particular?
Dan Lahl: More and more organizations have become acutely aware that their futures will be determined not by their ability to make incremental improvements in operational efficiency but by achieving analytical excellence. Winners and losers in this volatile and competitive global climate will be determined by the quality of the decisions and predictions they can make based on all available data.
In this new business reality, the stakes have never been higher, the timeframes have never been shorter, the datasets have never been larger, and the sophistication of the analysis required has never been greater. Furthermore, the consequences of getting it wrong have never been more devastating. Decision-makers need to accurately analyze results, predict outcomes, and keep the business competitive.
Regarding both analytics and reporting issues, what new expectations are you hearing from customers?
From a reporting standpoint, organizations want faster, more flexible reporting systems that allow them to quickly access and analyze business data -- without limits on drill-downs and ad hoc queries, the amount of data that can be queried, the number of people allowed on the system at the same time, the type of BI tools they can use, and so forth.
Users just want fast answers, using the query tool or site of their choice, without hassles and without IT conflicts. IT wants to deliver on the business needs, but without having to invest big money in hardware upgrades, DBA time, tuning and optimization resources, and proprietary solutions that don’t integrate with their existing ecosystem.
From an analytics standpoint, organizations are looking for faster, more accurate, and more cost-effective analytics solutions. They want winning business results that attract new customers, grow revenues, and predict outcomes and risks.
We’re seeing more organizations turn to predictive models and data mining to give them insight about future outcomes and manage risk. Organizations often make significant investments to build sophisticated predictive models to unlock insights from the data. However, the accuracy of those models is largely determined by the amount of data that they can run against. Users are demanding access to more data -- to three, five, and even ten years’ worth of data.
Above all, businesses and government agencies expect to get reporting and analytics solutions up and running fast -- within 30 to 90 days.
Given that, are people getting what they need from traditional solutions? What are the issues?
People are starting to realize that the limiting factors in getting the kind of results they require are not the business intelligence or analytics tools, or the hardware systems, or even the tuning and optimization skills they can acquire. The biggest limiting factor is the underlying database infrastructure.
What we hear is that traditional transactional database technologies consistently provide disappointing results when they are used to run data-intensive reporting and analytics. You can get some of the answers you need, or you can get answers to some of the people who need them, or you can compromise on the timeliness or completeness of the answers you get. There are far too many limiting factors, especially when you consider the ongoing stream of excess money that organizations end up pouring into extra hardware, DBAs to tune each query, and optimization resources to get these systems to perform.
Using a traditional, row-based database to run critical reporting and analytics systems is like entering a delivery truck in a Grand Prix race. It’s just not what it was designed to do.
If row-based databases are part of the problem, what does column-based database design offer, and when can a column-oriented architecture be a superior approach?
When we’re talking about analytics and reporting, a column-based architecture is inherently superior to a row-based architecture. A row-based database is designed to process transactions, and it is very good at that workload. Processing transactions, however, is a fundamentally different activity from serving up fast reports, ad hoc queries, and analytics. A column-based database is optimized for analytical queries: the data is already arranged by exactly the kind of criteria that reporting and analytics use. Or, as we like to say, the data is the index.
A row-based database has to be tricked into providing fast answers by loading it up with indexes, summaries, aggregates, materialized views, and so forth. Not only does this massively expand the footprint of the reporting or analytics environment, it creates a huge administrative workload. It’s like deciding to strap 50 additional jet engines on a massive commercial airliner in order to get the passengers to their destination faster. Even if you can somehow get this to work, it doesn’t make economic sense to try.
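To make the row-versus-column distinction above concrete, here is a minimal Python sketch. It is only an illustration of the general idea, not a description of how Sybase IQ or any other product stores data; the sample table, column names, and helper functions are all hypothetical. It compares how many stored values an aggregate query has to touch under each layout.

```python
from typing import Dict, List, Tuple

# Hypothetical sample table: (order_id, region, amount).
ROWS: List[Tuple[int, str, float]] = [
    (1, "EMEA", 120.0),
    (2, "APAC", 80.0),
    (3, "EMEA", 200.0),
    (4, "AMER", 50.0),
]


def to_columns(rows: List[Tuple[int, str, float]]) -> Dict[str, list]:
    """Pivot row tuples into one list per column (a toy column store)."""
    order_ids, regions, amounts = zip(*rows)
    return {
        "order_id": list(order_ids),
        "region": list(regions),
        "amount": list(amounts),
    }


def total_amount_row_store(rows: List[Tuple[int, str, float]]) -> float:
    """Sum 'amount' by walking whole rows, as a row layout would."""
    return sum(row[2] for row in rows)


def total_amount_column_store(columns: Dict[str, list]) -> float:
    """Sum 'amount' by reading only the one column the query needs."""
    return sum(columns["amount"])


def values_touched_row_store(rows: List[Tuple[int, str, float]]) -> int:
    """Every field of every row is read, even fields the query ignores."""
    return sum(len(row) for row in rows)


def values_touched_column_store(columns: Dict[str, list], needed: List[str]) -> int:
    """Only the values in the requested columns are read."""
    return sum(len(columns[name]) for name in needed)


if __name__ == "__main__":
    cols = to_columns(ROWS)
    print(total_amount_row_store(ROWS), values_touched_row_store(ROWS))
    print(total_amount_column_store(cols), values_touched_column_store(cols, ["amount"]))
```

Both layouts return the same total, but the toy row store reads every field of every row while the toy column store reads only the `amount` values. On a wide table with many rows, that gap is the sort of work a row-based system tries to avoid by adding indexes, summaries, and aggregates.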
Where will the most immediate returns occur when a company invests in a column-based analytics server?
Using a smarter analytics server will immediately remove the limits imposed by traditional systems. The first thing we hear is how amazed customers are with the speed at which they can receive answers from the system when running their existing reports and analytics. Better analytics frees up people and hardware resources in the IT department, which immediately cuts costs and allows IT resources to be allocated to other priority projects. Users then start to envision business uses that seemed impossible or impractical before. Companies start to add more users, ask more complex questions, expand to new departments in the organization, add more years’ worth of data, offer new reporting and analytics services to both internal and external users -- and in the process, they drive better insights, reduce risks, increase customer satisfaction, increase competitive differentiation, and improve their margins.
For example, one of our customers is the securities division of a large European bank, BNP Paribas, which performs more than 32 million transactions a year and routinely provides 200,000 reports to more than 10,000 customers per month. Within a month of implementing Sybase IQ, the bank found that the speed of report generation had increased by a factor of three. Business users were delighted with faster, more accurate answers.
In fact, the performance of their analytics and reporting environment improved so much that they were able to generate twice as many ad hoc queries as they could before, allowing them to gain deeper insight into their business. They accomplished all this while freeing up resources on the OLTP system and greatly increasing -- via Sybase IQ’s data compression -- the amount of historical data that reports could include, all at a tremendous cost savings to the company.
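The interview does not say which compression techniques Sybase IQ uses, so the following is only a generic sketch of why column-oriented data tends to compress well. It assumes a simple run-length encoding over a hypothetical, already-sorted `region` column; the function names and data are illustrative, not part of any product.

```python
from itertools import groupby
from typing import List, Tuple


def rle_encode(column: List[str]) -> List[Tuple[str, int]]:
    """Collapse consecutive repeated values into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(column)]


def rle_decode(runs: List[Tuple[str, int]]) -> List[str]:
    """Expand (value, run_length) pairs back into the original column."""
    return [value for value, count in runs for _ in range(count)]


# Hypothetical column: values of a single attribute stored together,
# so identical neighbors form long runs.
REGION_COLUMN: List[str] = ["AMER"] * 3 + ["APAC"] * 2 + ["EMEA"] * 4

if __name__ == "__main__":
    runs = rle_encode(REGION_COLUMN)
    print(runs)
    print(len(REGION_COLUMN), "values stored as", len(runs), "runs")
```

Because a column holds values of one type and often many repeats, nine stored values shrink to three runs here. In a row layout the same values would be interleaved with other fields, so runs like these would not be adjacent.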
When we talk about analytics and column-based processing, what does Sybase bring to the table?
With Sybase IQ, we bring more than a decade of experience, with technology that has been proven and honed in thousands of data-intensive customer environments. Sybase IQ handles the most challenging business reporting and analytics requirements with ease, delivering dramatically faster and consistently more accurate analysis to address the growing need to anticipate risks and opportunities in a volatile global climate. We have more than 1,600 customers and over 3,100 customer installations worldwide. Sybase IQ delivers a smart approach that enables enterprises to turn raw data into actionable information through analytics.