Bringing Data Back Home to Big Iron
It's cheaper than ever to host data processing workloads on IBM's zSeries mainframes. Now Rocket Software is touting the hitherto unthinkable: data virtualization for big iron.
- By Steve Swoyer
- August 8, 2016
Data warehousing (DW) specialists used to love to hate big iron.
Back in the day, getting at mainframe data required special gateway technology or kludgy FTP scripts, to say nothing of getting the mainframe team to hand-code file extractions.
Moreover, the mainframe guys could be touchy about access. Many DW technologists would have liked to pull the plug on big iron long ago, but luckily it wasn't their decision. I say luckily because thanks to the way IBM Corp. prices its systems, it's cheaper than ever to host data-processing workloads on zSeries mainframes.
Data Virtualization on the Mainframe?
These inexpensive mainframe workloads are the raison d'être for Rocket Data Virtualization, a mainframe data integration offering from Rocket Software that does exactly what its name suggests.
"When we were designing this, it was all about building a high-performance data architecture which at its core did not require you to use ETL. Obviously, we built ... parallelism into the solution, and we built in highly parallel data movement. We use IBM hardware for some significant compression advantages. We can do this inexpensively by taking advantage of some of the technologies [IBM] provide[s] for System z," says Gregg Willhoit, managing director and general manager with Rocket Software.
"You can use the product as an extract and load tool, but in this case you don't have to run a bunch of complex extract, transform, and load scripts. Rocket Data Virtualization handles the transformations. It virtualizes in-place, in-memory, then loads the data into the database."
Specialty Processors Have History on System z
For decades, Big Blue has offered price breaks for its System z mainframes in the form of so-called specialty processor engines. What's so special about a specialty processor engine? It's incredibly cheap.
Fifteen years ago, IBM unveiled its Integrated Facility for Linux (IFL), which made it possible to cost-effectively run Linux workloads on the mainframe. IFL capacity was priced at a fraction of the cost of the capacity used for online z/OS COBOL applications, as was IBM's zSeries Application Assist Processor (zAAP), a specialty engine released in 2004 to accelerate Java workloads.
At the time, Java workloads (or more precisely, Java runtimes) were comparatively inefficient, such that it was prohibitively -- if not ridiculously -- costly to allocate expensive online mainframe capacity to Java processing. ("It is not unrealistic that you might need at least 20 times as many MIPS to support the same number of users in a Java environment," one user told me in a 2005 interview.)
From a data warehousing perspective, the mainframe became a platform of renewed interest in early 2006, when IBM announced its zSeries Integrated Information Processor (zIIP), a specialty engine for data-processing workloads. Think of zIIP as a software license to host processing-intensive workloads on z/OS at a substantially reduced cost.
"We built a [data virtualization] engine optimized for [System] z that runs completely on zIIP," explains Willhoit, who notes that IBM developed zIIP to stanch the flood of workloads leaving the mainframe for cheaper platforms.
"Ninety percent of [their customers] were doing ETL off of the mainframe or off of Oracle [on the mainframe] or whatever. IBM said to them, 'What you're doing is expensive, the data is no longer current, it's no longer as secure, you have multiple degrees of separation -- and it's unnecessary, because we're giving you this [zIIP engine],'" says Willhoit.
Specialty Processors and Virtualization Benefit IBM Faithful
With zIIP, as with some of its other specialty processors (e.g., IFL), IBM is being incredibly -- even uncharacteristically -- generous.
Willhoit explains, "Back in 2010, you buy a new z196 mainframe. Say you also decide to buy eight zIIPs with that. At this point, you've paid a little less than $500,000 for tons of [data-processing] capacity, but say your capacity just keeps growing and growing, so when IBM announced its new z13 mainframe [this year], you buy that. Your z13 gives you all of this new capacity and speed, but guess what? Those eight zIIPs [for which] you paid $500,000 on your z196? You can bring those forward, at no cost, to your new z13, which is a much faster platform."
What's more, the zIIP on Big Blue's newest mainframe systems is now twice as fast, thanks to IBM's implementation of simultaneous multithreading. This means each upgraded zIIP engine is able to process two threads simultaneously, instead of just one. Add it up, Willhoit argues, and there are a slew of reasons why big iron customers would want to virtualize access to their mainframe platforms.
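To make Willhoit's "add it up" concrete, here is a rough back-of-the-envelope sketch in Python. It uses only the figures quoted above: eight zIIPs, a purchase price of a little less than $500,000 (treated here as an upper bound), a free carry-forward to the z13, and two threads per engine with simultaneous multithreading. The variable and function names are illustrative, and the thread count is a count of concurrent threads, not a throughput measurement.

```python
# Figures come from the quotes above. The $500,000 is an upper bound
# ("a little less than"), so the per-engine cost is an upper bound too.
ZIIP_COUNT = 8
PURCHASE_PRICE_USD = 500_000       # paid once, with the z196
CARRY_FORWARD_COST_USD = 0         # "at no cost" when moving to the z13
THREADS_PER_ZIIP_BEFORE = 1        # one thread per engine before SMT
THREADS_PER_ZIIP_WITH_SMT = 2      # two threads per engine with SMT


def cost_per_engine(total_usd: int, engines: int) -> float:
    """Average price per specialty engine."""
    return total_usd / engines


def concurrent_threads(engines: int, threads_per_engine: int) -> int:
    """Number of threads the engines can process at the same time."""
    return engines * threads_per_engine


per_engine = cost_per_engine(PURCHASE_PRICE_USD, ZIIP_COUNT)
total_outlay = PURCHASE_PRICE_USD + CARRY_FORWARD_COST_USD
threads_before = concurrent_threads(ZIIP_COUNT, THREADS_PER_ZIIP_BEFORE)
threads_after = concurrent_threads(ZIIP_COUNT, THREADS_PER_ZIIP_WITH_SMT)

print(f"Cost per zIIP (upper bound): ${per_engine:,.0f}")
print(f"Total outlay across both machines: ${total_outlay:,}")
print(f"Concurrent threads: {threads_before} before SMT, {threads_after} with SMT")
```

The point of the sketch is that the outlay stays fixed across the upgrade while the number of concurrent threads doubles.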
"The neat part about having a data virtualization server on zSeries is that you can also support mainframe apps, such as CICS or IMS. All of the mainframe environments, all of the mainframe data [stores] are supported, they're all consuming clients of our data virtualization engine that can use our SQL interface," he comments, noting that VSAM is one such data source.
"Usually when you talk about data virtualization, you're talking about SQL, NoSQL, Web services interfaces, and REST, but the applicable data sources should include mainframe applications such as CICS and IMS, too. You can get to data on z[Series] and off, so if you wanted to join data in Oracle with [data] in CICS or some other source, you could do that. We're the only solution that addresses this requirement."
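The kind of cross-source join Willhoit describes can be illustrated with a toy Python sketch. It is not Rocket's product or API: the fixed-width records stand in for a mainframe data source, an in-memory SQLite table stands in for a relational database such as Oracle, and every name and sample value is hypothetical. The idea it shows is that both sources are read in place and joined in memory, with no intermediate extract file.

```python
import sqlite3

# Hypothetical fixed-width records standing in for a mainframe file:
# the first 6 characters are a customer id, the rest is the name.
MAINFRAME_RECORDS = [
    "000001Acme Corp",
    "000002Globex",
]


def read_mainframe_customers(records):
    """Parse fixed-width records one at a time, without a staging copy."""
    for rec in records:
        yield {"customer_id": int(rec[:6]), "name": rec[6:].strip()}


def open_relational_source():
    """Build an in-memory SQLite table as a stand-in for a relational source."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [(1, 250.0), (1, 100.0), (2, 75.5), (3, 10.0)],
    )
    return conn


def federated_join(records, conn):
    """Join customer names from one source with order totals from the other."""
    # Build an in-memory lookup from the mainframe-style source.
    customers = {
        row["customer_id"]: row["name"]
        for row in read_mainframe_customers(records)
    }
    query = (
        "SELECT customer_id, SUM(amount) FROM orders "
        "GROUP BY customer_id ORDER BY customer_id"
    )
    results = []
    for customer_id, total in conn.execute(query):
        # Keep only orders whose customer exists in the other source.
        if customer_id in customers:
            results.append((customers[customer_id], total))
    return results


joined = federated_join(MAINFRAME_RECORDS, open_relational_source())
for name, total in joined:
    print(f"{name}: {total:.2f}")
```

A real data virtualization layer would push this work into its own engine and expose it through SQL. The sketch only shows the shape of the join.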
IBM Embraces Analytics on the Mainframe
Given the cost, shops that don't already have mainframe systems aren't likely to add them. In large mainframe shops, however, big iron probably isn't going anywhere. (According to Gartner Inc., small customers -- companies that have 1,000 MIPS or less of mainframe capacity -- are likely candidates for mainframe migration. Shops with 2,000 or more MIPS tend to expand their big iron capacity investments, not reduce them.)
To the extent that data processing -- be it data transformations or analytical processing -- can be cost-effectively colocated with or at the site of mainframe data, it makes sense to do so, Willhoit and Rocket Software argue. In fact, IBM has given mainframe shops plenty of reasons to bring data processing and analytics workloads back home to the mainframe.
There's IBM z/OS Platform for Apache Spark -- announced just this April -- for starters, along with IBM's DB2 Analytics Accelerator for z/OS. Elsewhere, Big Blue offers its Smart Analytics Optimizer for DB2 for z/OS and its InfoSphere BigInsights for Linux for System z, along with Cognos BI for z/OS and SPSS for Linux for System z. Most if not all of these workloads are eligible to run in the context of System z's zIIP specialty processor.
The new mainframe status quo empowers vendors such as Rocket Software to champion the hitherto unthinkable. "There's no reason to take data off the mainframe or put it into a different format," says Calvin Fudge, product marketing director with Rocket Software.
In light of big iron's staying power, Fudge argues, data management and data warehousing technologists should instead look to leverage the mainframe's unique strengths. "The question becomes: How do you make mainframe data more usable across the enterprise? We're just providing a better way to share the data with other platforms, a way to make mainframe data universally available."
Stephen Swoyer is a technology writer with 20 years of experience. His writing has focused on business intelligence, data warehousing, and analytics for almost 15 years. Swoyer has an abiding interest in tech, but he’s particularly intrigued by the thorny people and process problems technology vendors never, ever want to talk about. You can contact him at firstname.lastname@example.org.