Meet TPCx-BB -- A Benchmark for Assessing Big Data Performance
The Transaction Processing Performance Council has revised its official big data benchmark: the aptly titled TPCx-BB. Benchmarks matter -- but not for the reasons you might think.
- By Steve Swoyer
- June 28, 2016
The Transaction Processing Performance Council -- TPC for short -- has itself an official big data benchmark: the aptly titled TPCx-BB.
That's "BB" as in BigBench, a long-incubating proposed spec for a big data benchmark.
The TPC published revision 1.1.0 of TPCx-BB in late May. The TPCx-BB spec was first published in February. Since then, it's gone through no fewer than three (mostly editorial) revisions.
The gist is that TPCx-BB uses 30 pre-defined analytics queries to simulate real-world conditions in a specific context -- namely, that of a retailer with a combined online and brick-and-mortar presence. The queries are a mix of SQL statements (passed to Hive or Spark) and machine learning (ML) algorithms (implemented with ML libraries, user-defined functions, and procedural code).
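To make the idea of a mixed workload concrete, here is a minimal sketch in Python. It is not the TPCx-BB kit, and it does not reproduce the benchmark's queries or its scoring metric. The `Workload` class, the `time_workloads` function, the tiny `sales` table, and both demo queries are hypothetical, invented purely for illustration. SQLite stands in for Hive or Spark, and a plain Python average stands in for ML code.

```python
import sqlite3
import time
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Workload:
    name: str                  # hypothetical label, not a TPCx-BB query ID
    kind: str                  # "sql" or "ml"
    run: Callable[[], object]  # the work to be timed


def time_workloads(workloads: List[Workload]) -> Dict[str, float]:
    """Run each workload once; return total elapsed seconds per kind."""
    totals: Dict[str, float] = {}
    for w in workloads:
        start = time.perf_counter()
        w.run()
        elapsed = time.perf_counter() - start
        totals[w.kind] = totals.get(w.kind, 0.0) + elapsed
    return totals


def make_demo_workloads() -> List[Workload]:
    """Build one SQL-style and one procedural workload over toy retail data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (channel TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("online", 20.0), ("store", 35.0), ("online", 15.0)],
    )

    def revenue_by_channel():
        # Declarative part: an aggregate expressed entirely in SQL.
        return conn.execute(
            "SELECT channel, SUM(amount) FROM sales "
            "GROUP BY channel ORDER BY channel"
        ).fetchall()

    def mean_basket():
        # Procedural part: a stand-in for ML or user-defined code.
        amounts = [row[0] for row in conn.execute("SELECT amount FROM sales")]
        return sum(amounts) / len(amounts)

    return [
        Workload("revenue_by_channel", "sql", revenue_by_channel),
        Workload("mean_basket", "ml", mean_basket),
    ]


if __name__ == "__main__":
    for kind, seconds in time_workloads(make_demo_workloads()).items():
        print(f"{kind}: {seconds:.6f}s")
```

The sketch reports SQL time and procedural time separately, so the two kinds of work can be compared rather than folded into a single number.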
The original BigBench benchmark was tested in tandem with Teradata's Aster platform, a relational database with built-in graph and text analytics capabilities. Aster also boasts in-database support for several hundred analytics algorithms, from basic MapReduce to highly esoteric ML algorithms.
Unfortunately, the TPC's Web site indicates that TPCx-BB is designed to “measure the performance of Hadoop-based” systems. It also specifically refers to Hive or Spark for SQL query processing. It's conceivable, however, that platforms such as Aster, Hewlett Packard Enterprise's (HPE) Vertica, or IBM's Netezza could be configured to perform the TPCx-BB tests, too.
Believe it or not, two TPCx-BB results have already been submitted, both by HPE. Both entries measure the performance of a 12-node Cloudera Distribution for Apache Hadoop cluster on Intel's Xeon E5-2697 processors. (The second-place entry uses slightly older 14-core processors.)
Both of HPE's TPCx-BB results are currently under review by the TPC.
Benchmarks Matter but Not Necessarily for Obvious Reasons
Upticks in benchmarking activity tend to correlate with high levels of noise in the market.
If history -- the transaction processing and OLAP wars of the 1990s, for example -- is any indication, then we should expect a flurry of cost-is-no-object TPCx-BB benchmark results over the next few years. Although the value of benchmarks to customers is highly dubious, their value to vendors, particularly in an overcrowded, largely undifferentiated market, is anything but.
"The big data market is overcrowded. Lots of vendors are making the same claims about different products, some vendors are making wildly different claims about the same products. There's little else to differentiate on, so how to show? How to demonstrate the difference? A benchmark! An industry-standard benchmark is even better. Now [the vendors] can say 'Look how big and/or fast we are!'" says Mark Madsen, a research analyst with IT strategy consultancy Third Nature Inc.
For an example of the effect of benchmarks, let's rewind to the late 1990s, when large-scale SMP chipsets for Intel's Pentium Pro processor (and later for Intel's new Xeon chip) first became available. In 1996 and 1997, vendors began shipping eight-way Pentium Pro systems based on proprietary chipsets. ("Chipset" in this context describes the onboard logic that facilitates communications between processors, the onboard memory controller, and main memory; processors, memory, and system devices; and so on.)
The challenge was to demonstrate that one chipset (say, Corollary's Profusion) was better than another (NCR's OctaScale) -- or that one vendor's (e.g., Hewlett-Packard's) implementation of a certain chipset was better than another's (e.g., Compaq's).
The TPC-C benchmark, which measures transaction processing performance, became a preferred means of differentiation in this context.
The OLAP wars of the 1990s are even more illustrative. The late 1990s saw a period of OLAP market consolidation, with Oracle acquiring IRI Software (developer of the Express OLAP engine) and Arbor (which developed Essbase) merging with Hyperion Software to form Hyperion Solutions. At about the same time, other vendors -- such as Microsoft, which introduced an OLAP capability with its SQL Server 7 database in 1998 -- were also exerting competitive pressure.
In fact, there was no shortage of OLAP products, up to and including MicroStrategy's then-innovative ROLAP technology. (Don't forget the granddaddy of in-memory OLAP processing, Applix TM1, which has since been acquired by IBM.) At pains to differentiate their products in the midst of all of this noise, vendors produced a slew of OLAP benchmarks based on the OLAP Performance Council's APB-1 benchmark.
Later, after a period of dormancy, Oracle and (then-independent) Hyperion actually revived the APB-1 benchmark in the mid-2000s.
In April of 2003, Oracle laid claim to the OLAP performance crown with the first new APB-1 benchmark result in five years. Hyperion quickly struck back, publishing a new record APB-1 result just a few months after Oracle's benchmark "triumph."
A little over a year later, Hyperion rubbed some salt in Oracle's wounds by conducting still another record APB-1 benchmark. Ironically, the organization nominally responsible for vetting APB-1 -- the OLAP Performance Council -- had by then been defunct for several years.
This second OLAP war is a good example of how vendors pay more attention to benchmarks than consumers do. First, more context: Oracle hoped to establish the performance bona fides of Oracle 9i and its new in-database OLAP engine, so the company dusted off and exploited a defunct benchmark to make its case -- unwittingly inviting Hyperion to do the same.
The larger backdrop is that by the mid-2000s, OLAP performance had largely ceased to be a hot issue. The performance concerns (and performance-related marketing hype) that had fueled the OLAP wars of the 1990s had ceased to matter. Microsoft's SQL Server Analysis Services (SSAS) and, indeed, Oracle 9i OLAP had helped to commoditize OLAP technology.
The APB-1 benchmark, dormant or not, provided a means for Hyperion, HyperRoll, and other vendors to differentiate their technologies in an extremely crowded market.
Big Data Benchmark's Value
What are we to make of this first industry-standard benchmark for big data? For one thing, it doesn't exactly track with the direction of the marketplace or the priorities of buyers. "This is a lot like the APB[-1] benchmark. It's hard to imagine a context where it's applicable," says Third Nature's Madsen.
"They're trying to show how to test a combined analytics and SQL workload on the same underlying database right at a time [when] the market has determined that one should use the engine that best matches the workload," he continues. "That means one should run the analytics jobs separate from the SQL queries, and there should be two benchmarks to test two distinct things. Why blend them?"
The TPCx-BB's biggest value will likely be to vendors. HPE and Cloudera can both claim bragging rights -- however temporary -- in an extremely noisy marketplace.
Expect more TPCx-BB benchmarks to come.
Stephen Swoyer is a technology writer with 20 years of experience. His writing has focused on business intelligence, data warehousing, and analytics for almost 15 years. Swoyer has an abiding interest in tech, but he’s particularly intrigued by the thorny people and process problems technology vendors never, ever want to talk about. You can contact him at firstname.lastname@example.org.