GridGain Goes In-Memory One Better
Vendors such as GridGain say they want to democratize in-memory computing, but that democratization may already be underway: the prerequisites are falling into place.
- By Stephen Swoyer
- May 27, 2014
Start-up GridGain Inc. says its distributed, in-memory SQL database has the potential to "democratize" in-memory computing, but in-memory's democratization could well be a fait accompli. The technological, economic, and cultural prerequisites all seem to be falling into place. Besides, that most reliable bellwether of the mainstream -- Microsoft Corp. -- shipped an in-memory compute facility (dubbed Hekaton) with its recent SQL Server 2014 release.
GridGain is saying and doing the right things. In March, for example, it made the non-Enterprise Edition of its in-memory computing platform available (under the Apache Software Foundation's Apache 2.0 license) on the GitHub code repository. That gives it a free and accessible software delivery channel. In addition, officials argue, GridGain's distributed architecture gives it a kind of democratic applicability: the GridGain In-Memory Computing Platform was designed with highly distributed RESTful apps in mind. GridGain's SQL database technology supports streaming, has a Hadoop accelerator, and is ACID-compliant. This, they claim, is another key differentiator.
"I believe that the change from traditional application architectures to in-memory is going to be as widespread and profound as the Web was -- and as cloud was and still is," argued Jon Webster, vice president of business development, during an interview with TDWI late last year.
"What [GridGain has] is a memory-first architecture, and what that means simply stated is that we designed our products with the assumption that RAM was going to be our primary storage mechanism. When you do that, you can make different optimizations and different design decisions," Webster continued. "This isn't to say that disks aren't part of an in-memory architecture, but really what they're there for is durable storage, almost like the tape drives of yesteryear."
What's happening with hardware has also been a boon to democratization, Webster argued. Processor performance and memory capacity have historically increased while costs have declined dramatically. "Traditionally, [running] in-memory was expensive. RAM was really expensive. Right now, we're seeing our customers purchase the hardware components for this infrastructure for about $25,000 per terabyte. If you get 10 terabytes [of physical memory], you can put pretty much any operational data or working data in that," Webster said.
On the other hand, market uptake hasn't always tracked with technological innovation; you can lead human beings to bigger, faster, and more powerful technology, but you can't always get them to use it.
Take 64-bit computing, for example. Advanced Micro Devices Inc. (AMD) introduced the first commodity x86-64 microprocessor (i.e., a 64-bit chip that could also run 32-bit x86 code) in 2003; Intel followed suit a little over a year later. However, pervasive uptake of 64-bit computing is a phenomenon of the last half-decade. If anything, this owed much to the cost of memory: in 2005, 1 GB of memory cost more than $150, and there just wasn't a compelling economic reason to move from 32- to 64-bit. The apps weren't there, operating system support was comparatively spotty (Linux had supported x86-64 since 2001, but the first 64-bit editions of Windows Server didn't ship until early 2005), and the most common workloads wouldn't benefit from a greater-than-4 GB memory address space.
By 2010, however, RAM prices had declined significantly. Hence Webster's point: the market's ready, the workloads are there, and conditions are ripe for pervasive uptake of in-memory computing. If history is any indication, they will only get better. Historically, memory and storage costs have declined exponentially: in 1995, 1 GB of RAM cost more than $30,000; in 2010, it cost just over $12.
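As a rough illustration of what "exponentially" means here, the two price points quoted above imply a halving period for RAM prices. This is back-of-the-envelope arithmetic on the article's own figures, not vendor data:

```python
import math

# Illustrative arithmetic only, using the two price points cited in the
# article: roughly $30,000 per GB of RAM in 1995 and about $12 in 2010.
p_1995, p_2010 = 30_000.0, 12.0
years = 2010 - 1995

ratio = p_1995 / p_2010             # total price decline factor (2500x)
halvings = math.log2(ratio)         # number of price halvings (~11.3)
months_per_halving = years * 12 / halvings

print(f"decline factor: {ratio:.0f}x")
print(f"price halves roughly every {months_per_halving:.1f} months")
# → decline factor: 2500x
# → price halves roughly every 15.9 months
```

In other words, a 2,500-fold drop over 15 years works out to prices halving roughly every 16 months, which is why a terabyte-scale in-memory deployment moved from exotic to plausible.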
GridGain isn't the only game in town. In fact, it isn't necessarily the only game of its kind in town. Pivotal markets GemFire, a distributed, in-memory database and data management offering that bears a superficial similarity to what GridGain proposes to do.
"Our Compute Platform is an in-memory, distributed database -- we call it a database, but really it's a 'database-plus-plus,'" explained Webster, referring to the engines and services -- such as a streaming database component and a Hadoop accelerator -- that ship with GridGain's In-Memory Computing Platform.
"We embed the ability to process directly on our data node. Whether it's one machine, 10 machines, or 1,000 machines pulled together in one space. What this allows you to do is it allows you to invert the normal thought about programming, which is data resides over here, compute is over there. This is inherently unscalable. What we do instead with the compute embedded in place [is] we process in place," Webster noted.
That sounds much like Hadoop, which bundles a distributed file system (the Hadoop Distributed File System, or HDFS) and a built-in compute facility. It also sounds a bit like SpliceMachine Inc., which markets a similar offering (running on top of HDFS) that it, too, claims is ACID-compliant.
Webster agreed that GridGain's approach is Hadoop-like, although he dismissed the suggestion that his company and SpliceMachine (which, like GridGain, also uses multi-version concurrency control to ensure transactional consistency) are similar. There's a reason GridGain's model is similar to that of Hadoop, he argued: combining compute and storage is fundamentally a sound idea.
"Hadoop does the same thing but over spinning disks, which are going to have just much, much higher latency" than reading from and writing to physical DRAM, he said. "You can still do MapReduce over our in-memory store and achieve very low latencies. The most expensive thing I can do with distributed data is to move it, so we'd rather send computation out to the data. That's the most efficient way of doing things; it allows you to maximize the throughput and processing."
Right now, the business intelligence (BI) industry is basically agog over in-memory technology. For a long time, there were just a few specialty in-memory players: companies such as the former Applix Inc., which marketed an in-memory OLAP engine; the former TimesTen, which marketed an in-memory database; Kognitio, which markets a memory-optimized analytic database; and QlikTech Inc., which markets a columnar, in-memory BI discovery platform. More recently, almost all BI and DW players have caught in-memory fever. IBM Corp., MicroStrategy Inc., SAP AG, SAS Institute Inc., and Teradata Corp. now claim to offer "in-memory" BI or data warehouse technologies.
Webster acknowledged that it's a crowded house, but argued that GridGain's model differs fundamentally from those of established players. "The reality of it is that the industry right now is obsessed with in-memory computing, but I think in-memory has to be inherent in the software from the get-go. Otherwise you're going to get only very modest [performance] improvements. We came at this with a clean-room approach; we came at this with an in-memory approach.
"In the absence of GridGain, folks have to pin together three or four or five different technologies or vendors. They have to handle the integration burden, they have to do all of this heavy lifting. That has nothing to do with what their core [business] is. We give [customers] a single in-memory platform with SQL query, ACID compliance, streaming support, and a well-integrated API."
Recently, the in-memory space got a bit more crowded when Microsoft Corp. released its SQL Server 2014 database, which incorporates an in-memory facility dubbed Hekaton. Remember, Microsoft's entries into the OLAP market (with Plato, the OLAP technology that shipped with SQL Server 7), the reporting market (with Reporting Services, which shipped in early 2004), and the x86-64 market (with x64 versions of its Windows Server 2003 products) arguably augured the commoditization of those spaces.