In-Memory NoSQL Brings Real-Time Big Data to More Businesses

By Monica Pal, CMO, Aerospike


The Internet and mobile computing have paved the way to new applications that interact with consumers based on what they are interested in, what they are doing, and where they are located.

Early pioneers of interactive applications, dominated by companies in Internet marketing and advertising, turned to databases utilizing in-memory and NoSQL technology to deliver contextual big data in real time. Many of these early deployments have been marked by the ability to handle extremely large volumes of information within milliseconds.

Typical examples are Internet advertising and marketing companies that rely on the Aerospike in-memory NoSQL database:

  • Each of The Trade Desk’s data centers processes millions of transactions per second (TPS) for hundreds of application servers and manages billions of records and terabytes of data across multiple clusters
  • eXelate runs its databases on Internap bare-metal cloud servers in four Internap data centers to manage 700 million unique user profiles across 60 billion transactions per month, while accessing 20 TB of real-time context
  • Federated Media has databases in five data centers to manage more than 180 million monthly ad impressions across 145,000 sites

At the same time, companies are realizing significant gains in efficiency. Among the most notable is AppNexus, whose trading desk handles up to 50 million ad impressions per day. Using the Aerospike database and Intel solid-state drives, it has reduced its cluster size by 84 percent, from 50 servers to 8.
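
The 84 percent figure follows directly from the server counts quoted above: (50 - 8) / 50 = 0.84. A minimal Python sketch of that arithmetic, where the function name is illustrative and not taken from any vendor tool:

```python
def reduction_percent(before: int, after: int) -> float:
    """Return the percentage reduction when going from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    # Multiply first so that whole-number results stay exact in floating point.
    return 100 * (before - after) / before


# Server counts quoted in the article: 50 before, 8 after.
print(reduction_percent(50, 8))
```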

More Businesses Rely on Real-Time Big Data

Databases built for real-time big data can efficiently use server and storage resources, so engaging in customer interactions based on large volumes of contextual data is no longer restricted to businesses with large multi-data-center deployments.

A case in point is Snapdeal, India’s largest online marketplace, which has a network of more than 20,000 sellers serving over 20 million members—one out of every six Internet users in India.

The Snapdeal.com platform enables sellers to list products for sale on the site, manage inventory, and make pricing changes in real time based on what is happening in the marketplace. High volume—for example, a pair of shoes sells every 30 seconds—means that thousands of sellers are making dynamic price adjustments. This results in Snapdeal’s inventory and pricing management system processing more than 500 writes per second.

“In two years, we have scaled more than 200 times: the number of products listed, the number of sellers, the amount of business they do, the number of servers, storage, and the technology team. Everything has grown 200 times,” said Amitabh Misra, Snapdeal vice president of engineering.

From a seller standpoint, Misra explained, “An efficient marketplace requires that sellers be able to push their updates in real time. As more sellers sign up and more products are listed and more price changes are made, we knew we needed to scale the system.”

Faster Responses from Fewer Database Servers

To support its inventory and pricing system, Snapdeal initially deployed 10 MongoDB NoSQL database servers with 5 GB of data in DRAM as a cache in front of MySQL. However, as the business scaled and more sellers made price adjustments on more products, the MongoDB response times shot up from five milliseconds to more than a full second. This not only compromised the consumers’ shopping experience, but it also led to lost revenue opportunities.

As a result, Misra said, “We decided to evaluate a variety of SQL and NoSQL technologies, including Aerospike’s in-memory NoSQL database.”

Today, the Java-based Snapdeal inventory and pricing management system uses Aerospike to provide predictable sub-millisecond responses while managing 100 million-plus objects stored in 32 GB of DRAM. The data stored includes seller and product IDs, inventory, seller rankings, and pricing attributes. Product and price changes are made to both Aerospike and a MySQL database while seller rankings and product details are read from Aerospike.
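
The write and read paths described above can be sketched in a few lines. This is only an illustration of the general dual-write pattern, not Snapdeal's implementation (which the article describes as Java-based): a plain Python dict stands in for the in-memory NoSQL store, SQLite stands in for MySQL, and the class, method, and column names are all hypothetical. The order of the two writes is also an assumption; the article does not say which store is updated first.

```python
import sqlite3


class PricingStore:
    """Hypothetical dual-write store for seller price and inventory updates."""

    def __init__(self):
        # Stand-in for the in-memory NoSQL database (assumption: a dict).
        self.fast_store = {}
        # Stand-in for the relational database (assumption: in-memory SQLite).
        self.sql = sqlite3.connect(":memory:")
        self.sql.execute(
            "CREATE TABLE prices ("
            "seller_id TEXT, product_id TEXT, price REAL, inventory INTEGER, "
            "PRIMARY KEY (seller_id, product_id))"
        )

    def update_price(self, seller_id, product_id, price, inventory):
        # Every change is written to both stores, as the article describes.
        with self.sql:  # commits on success, rolls back on error
            self.sql.execute(
                "INSERT OR REPLACE INTO prices VALUES (?, ?, ?, ?)",
                (seller_id, product_id, price, inventory),
            )
        self.fast_store[(seller_id, product_id)] = {
            "price": price,
            "inventory": inventory,
        }

    def read_price(self, seller_id, product_id):
        # Reads are served only from the in-memory store in this sketch.
        return self.fast_store.get((seller_id, product_id))


store = PricingStore()
store.update_price("seller-1", "shoe-42", 499.0, 10)
store.update_price("seller-1", "shoe-42", 449.0, 9)  # a later price change
print(store.read_price("seller-1", "shoe-42"))
```

Keeping reads off the relational database is what lets buyer-facing latency stay flat while sellers push updates. A production system would also need to handle the case where one of the two writes fails, which this sketch does not attempt.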

The implementation runs on two Linux servers on the Amazon Elastic Compute Cloud (EC2), and it takes advantage of Amazon Elastic Block Store (EBS) for persistent block-level cloud storage.

“Aerospike has really come out with flying colors, and between 90 to 99 percent of the time, we have been getting the same consistent numbers,” Misra said.

“With our past database, whenever there was a surge in concurrent price updates from many services, we saw degradation in the buyer experience,” Misra continued. “Now with Aerospike, we can push through huge price changes while maintaining the same response time experience on the buyer’s side, even with millions of buyers. That has been the biggest advantage.”
