Appliances—Data Mart or Enterprise Data Warehouse?

By Stuart Frost, CEO, DATAllegro

Appliances are becoming established in the data warehousing market, but some companies and analysts have positioned appliances as “just” suitable for data marts (DM). Is this true, or can they also be used for large-scale enterprise data warehouse (EDW) projects?

The answer is yes, they can, under certain circumstances. While few would claim that appliances are currently ready to handle a complex EDW, they are finding an interesting niche as an integral part of many EDW infrastructures.

DM and EDW Differences

Definitions of DMs and EDWs vary, but the most common differences lie in the number of business processes supported by a given system. A DM typically supports only one business process or subject area, whereas an EDW supports several, and in some cases is a true enterprise-wide system. In addition, DMs are often fed summarized information from the EDW in a hub-and-spoke architecture, although this varies across the industry.

EDW Challenges

A significant majority of Global 2000 companies have deployed data warehouses in the last 10 years, establishing the overall business value of analytics. However, many companies are now struggling to keep up with new demands on their data warehouse systems. Such challenges include:

  • Significant data growth due to:
    • New legislation (the Sarbanes-Oxley Act, EU data retention laws, etc.)
    • Mergers and acquisitions
    • The need to analyze growing volumes of point-of-sale or telecommunications transactions to remain competitive
  • Business demands for reduced latency, which translates into faster query times
  • Larger user bases
  • Demand for ever more complex, ad hoc queries to address fraud detection and anti-money laundering

As a result, many previously successful EDW installations on platforms such as Teradata, DB2, and Oracle are becoming overwhelmed by the need to support hundreds of users with a broad mix of query types against tens of terabytes of data. Upgrade quotes for these platforms can easily be tens of millions of dollars, and even then they may not meet business needs!

Using Appliances to Divide and Conquer the Problem

Since high-performance data warehouse appliances are now available at prices as low as $20,000 per terabyte, a number of EDW users are turning to this new technology as a potential solution. However, they are not relegating appliances to the role of mere data marts. Instead, they are using appliances as a low-cost front end to the EDW itself.
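To put the per-terabyte price in perspective, the arithmetic can be sketched in a few lines of Python. Only the $20,000-per-terabyte figure comes from this article; the warehouse sizes and the helper name `appliance_cost` are hypothetical, and real quotes would depend on the vendor and configuration.

```python
# Only this price is taken from the article; it is quoted as a lower bound.
PRICE_PER_TB_USD = 20_000


def appliance_cost(terabytes: float, price_per_tb: float = PRICE_PER_TB_USD) -> float:
    """Return the appliance cost at a flat per-terabyte price (hypothetical helper)."""
    if terabytes < 0:
        raise ValueError("terabytes must be non-negative")
    return terabytes * price_per_tb


# Assumed warehouse sizes, chosen purely for illustration.
for size_tb in (10, 25, 50):
    print(f"{size_tb:>3} TB -> ${appliance_cost(size_tb):,.0f}")
```

Under these assumptions, even a 50-terabyte appliance at the lowest quoted price comes to about $1 million, well below the tens of millions of dollars cited for upgrades to the established platforms.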

In a typical scenario, large-volume, fine-granularity transaction records are stored directly on the appliance. The appliance then handles tasks such as:

  • Data cleansing
  • Long-term storage of transaction details for compliance
  • Ad hoc queries
  • Applications such as fraud detection that require access to data at very fine granularity
  • Exports to external analytics systems such as SAS
  • Building large-scale aggregation or summary tables and exporting them to the EDW
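The last task in the list, building summary tables from fine-grained transactions for export to the EDW, can be illustrated with a minimal Python sketch. The record layout, the field names, and the `build_daily_summary` function are assumptions made for illustration; an actual appliance would typically do this work in SQL over far larger volumes.

```python
from collections import defaultdict
from datetime import date

# Hypothetical fine-granularity transaction records held on the appliance.
transactions = [
    {"store_id": "S1", "day": date(2024, 1, 1), "amount_cents": 1250},
    {"store_id": "S1", "day": date(2024, 1, 1), "amount_cents": 800},
    {"store_id": "S1", "day": date(2024, 1, 2), "amount_cents": 500},
    {"store_id": "S2", "day": date(2024, 1, 1), "amount_cents": 9900},
]


def build_daily_summary(rows):
    """Aggregate transactions into one summary row per (store, day)."""
    totals = defaultdict(lambda: {"txn_count": 0, "total_cents": 0})
    for row in rows:
        key = (row["store_id"], row["day"])
        totals[key]["txn_count"] += 1
        totals[key]["total_cents"] += row["amount_cents"]
    # Sort by store and day so the export order is deterministic.
    return [
        {"store_id": store_id, "day": day, **agg}
        for (store_id, day), agg in sorted(totals.items())
    ]


# The summary rows, not the raw transactions, are what would be sent to the EDW.
summary = build_daily_summary(transactions)
for row in summary:
    print(row)
```

The point of the sketch is the division of labor: the detailed rows stay on the appliance, and only the much smaller summary table travels to the EDW.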

By offloading these tasks from the EDW to the appliance, companies are greatly reducing the need for an expensive EDW upgrade. In addition, the specialized nature and advanced technology of the appliance enable these processes to run significantly faster, often by two orders of magnitude.

Since appliances are relatively easy and cheap to maintain, any additional complexity introduced by this divide-and-conquer approach is limited in nature and overwhelmed by the huge benefits.


New data warehouse appliance technologies have the potential to transform the data warehousing market. By acting as a high-performance, high-capacity, and low-cost front end to an established EDW, they can add significant value to an already successful installation while avoiding expensive upgrades.

If this all sounds too good to be true, many vendors offer free proofs of concept so you can check out their claims at minimal cost. What do you have to lose, apart from poor performance and high costs?

