Transactional Safeguards: An ACID and BASE Primer
NoSQL technologies were developed to address the scaling and availability issues unique to Web application development. We explore the ACID vs. BASE dilemma for enterprises.
- By Steve Swoyer
- March 22, 2016
A foolish consistency, Ralph Waldo Emerson once observed, is the hobgoblin of little minds.
Be that as it may, there's nothing foolish about database consistency. Consistency forms part of the Tetragrammaton that is the conceptual foundation of data management: ACID, shorthand for the key properties -- atomicity, consistency, isolation, and durability -- of a transaction processing database.
In the NoSQL Age, we're learning to ask a new question: how much consistency is too much consistency? Before we unpack this question -- and it's a doozy -- a brief recap is in order.
NoSQL technologies such as Hadoop, MongoDB, and Cassandra (to name just a few) were developed to address the scaling and availability issues that are unique to Web application development. Many NoSQL databases claim to be ACID-compliant; measured against the axiomatic definition developed (and later implemented) by ACID creator Jim Gray, however, NoSQL systems relax ACID guarantees. Dan Pritchett, now director of engineering with Google Inc., helped popularize the term "BASE" -- or basically available, soft state, eventual consistency -- to distinguish the NoSQL approach to transactional safeguards from the traditional ACID tetrad.
For many kinds of applications, NoSQL proponents -- many of them vendors -- argue, BASE transactional guarantees are more than sufficient. This isn't an uncontested position, however. There are two primary issues here: first, are there, in fact, applications for which BASE consistency is sufficient? Second, if so, what are they? A third, related issue has to do with edge cases -- i.e., the anomalies and exceptions that can complicate BASE transactional guarantees in practice. They do exist, as they do in the traditional ACID model, and inevitably entail trade-offs.
A helpful comparison is the problem of designing and building aircraft in the pre- and post-computer ages. (This analogy ultimately breaks down, but it's illustrative if used to frame, not to define, the issue.) Prior to the advent of computer-aided design, computer-based flight and stress simulations, and the production of lighter, stronger materials and alloys, aircraft were typically over-engineered. Put simply, engineers couldn't reliably model the stresses to which a plane was subject during normal and abnormal flight conditions. Instead, they factored a big margin of error into their designs using more materials while being careful to balance weight and loads to engineer stronger, more resilient aircraft. In short, they over-engineered the planes they built.
The case for BASE and NoSQL is that ACID-compliant transactional guarantees are no less over-engineered. Rigid ACID transactional guarantees are absolutely critical for certain kinds of applications (such as a simultaneous debit and credit on a bank account) but much less critical for many other kinds of transactions. It isn't that ACID is categorically "better" than BASE; rather, it's that strict adherence to the ACID specification imposes limitations with respect to the scalability, availability, performance, and versatility of a database system. For applications in which strict all-or-nothing ACID guarantees are a non-negotiable requirement, this is an acceptable compromise. For other applications, an eventual consistency model in which transactions are at some point reconciled isn't just sufficient, it's desirable.
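To make the "all-or-nothing" guarantee concrete, here is a minimal sketch of the debit-and-credit case using Python's built-in sqlite3 module. The table layout, account names, and the `transfer` helper are assumptions made purely for illustration; they do not come from any system discussed in this article.

```python
import sqlite3

def make_demo_db():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("checking", 100), ("savings", 0)]
    )
    conn.commit()
    return conn

def transfer(conn, src, dst, amount):
    """Debit src and credit dst as a single all-or-nothing unit."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                # Abort: the debit above is undone along with everything else.
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False

def balances(conn):
    """Return {account name: balance} for inspection."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))

if __name__ == "__main__":
    db = make_demo_db()
    print(transfer(db, "checking", "savings", 60), balances(db))
    print(transfer(db, "checking", "savings", 60), balances(db))
```

The second transfer fails partway through, and the rollback leaves both balances exactly as they were after the first. That is the atomicity a bank cannot do without, and the kind of guarantee a BASE system may leave to the application.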
Because NoSQL designs eschew compliance with ACID requirements, they can scale to address a range of non-traditional data storage, data management, and data processing requirements.
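What "at some point reconciled" can look like in practice is sketched below. This is one common reconciliation strategy, last-writer-wins, chosen here only as an illustration; the article does not prescribe it, and the replica contents, keys, and timestamps are hypothetical.

```python
def merge(replica_a, replica_b):
    """Reconcile two replicas that accepted writes independently.

    Each replica maps key -> (timestamp, value). For every key the entry
    with the larger (timestamp, value) pair wins, so the result is the same
    regardless of which replica is passed first.
    """
    merged = dict(replica_a)
    for key, entry in replica_b.items():
        if key not in merged or entry > merged[key]:
            merged[key] = entry
    return merged

if __name__ == "__main__":
    # Hypothetical state: both replicas took writes while out of contact.
    replica_a = {"review:42": (1, "Great tacos")}
    replica_b = {
        "review:42": (2, "Great tacos, slow service"),
        "review:43": (1, "Closed on Mondays"),
    }
    print(merge(replica_a, replica_b))
```

Until the merge runs, the two replicas can return different answers for the same key (the "soft state" in BASE). Once it runs, the older version of a conflicting write is silently discarded, which is harmless for a restaurant review and unacceptable for an account balance.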
"There's definitely a class of transactions that doesn't require the same level of [ACID] guarantees. Where you get into trouble is when do you need those guarantees and when you don't," says Mark Madsen, a research analyst with information management consultancy Third Nature Inc.
"Say you and I share a bank account and we're taking money out of the same bank account at the same time. You want to be able to serialize these debits, so your debit and my debit go one after the other instead of simultaneously, in order to protect us from taking all of the money out of the account," he explains. "It turns out there are other types of transactions that we didn't used to do before. There's stuff we do on websites such as Yelp, when we write a review, or when we tweet a message, or post something on Facebook.
"None of those things has any real effect in terms of transactional coordination, because they are only ever isolated to what the individual user is doing. It's very unlikely that you're going to be logged into your phone and your laptop simultaneously tweeting something. Even if you are, what's the damage? The risk is so low that it doesn't matter."
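Madsen's shared-account scenario can be sketched in a few lines. The example below uses an in-process lock as a stand-in for the isolation a transactional database provides; the `SharedAccount` class and the amounts are hypothetical, and a real system would enforce this in the database rather than in application memory.

```python
import threading

class SharedAccount:
    """A toy account whose debits are forced to run one after the other."""

    def __init__(self, balance):
        self.balance = balance
        self._lock = threading.Lock()

    def debit(self, amount):
        # The lock serializes the check and the update, so two callers can
        # never both see the same "sufficient" balance and both withdraw.
        with self._lock:
            if self.balance < amount:
                return False
            self.balance -= amount
            return True

def run_concurrent_debits(account, amounts):
    """Attempt every debit from its own thread and collect the outcomes."""
    results = [None] * len(amounts)

    def worker(index, amount):
        results[index] = account.debit(amount)

    threads = [
        threading.Thread(target=worker, args=(i, amount))
        for i, amount in enumerate(amounts)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results

if __name__ == "__main__":
    account = SharedAccount(100)
    print(run_concurrent_debits(account, [80, 80]), account.balance)
```

Whichever thread gets there first succeeds and the other is refused, so the account is never overdrawn. A tweet or a Yelp review has no equivalent shared balance to protect, which is why the same coordination buys so little there.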
Except when it does. As some users of NoSQL databases have discovered, strict ACID compliance doesn't really matter until it really matters. Madsen uses the example of Digital Millennium Copyright Act (DMCA) takedown notices to which Internet service providers (ISPs) are legally bound to respond. ISPs have to do more than just respond, however, he notes. First, ISPs create a record of the DMCA takedown notice; second, they must alert the offending party that their content has been flagged for takedown; third, they can (a) continue to host the content if that party contests the DMCA takedown notice or (b) take down the content as requested if that party doesn't respond.
The DMCA use case is illustrative in that it doesn't involve financial data but is transactional in scope, Madsen points out. "The primary thing is that you're required as a website to take that takedown notice, record it, notify the rights' holder, do the due process, and if you don't hear from the other party, then you have to take it down. Several big Internet companies got into legal trouble over this because ... if you're a big high-traffic website such as Pinterest or Instagram, you actually can get significant volumes of takedown notices and if you have a database that doesn't guarantee transactional durability, the 'D' in ACID, you're going to have problems," he notes.
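The durability requirement in the takedown example can be sketched as follows: a notice is acknowledged only after it has been committed to disk, and it can be read back by a completely separate connection. The schema, the notice identifiers, and the `record_notice` and `load_notice` helpers are assumptions for illustration, with SQLite standing in for whatever ACID-compliant store a site might actually use.

```python
import os
import sqlite3
import tempfile

def record_notice(db_path, notice_id, url):
    """Persist a takedown notice; return True only after the commit succeeds."""
    conn = sqlite3.connect(db_path)
    try:
        # Ask SQLite to sync to disk before a commit is reported as done.
        conn.execute("PRAGMA synchronous = FULL")
        conn.execute(
            "CREATE TABLE IF NOT EXISTS notices ("
            "id TEXT PRIMARY KEY, url TEXT NOT NULL, status TEXT NOT NULL)"
        )
        with conn:  # commit on success, roll back on error
            conn.execute(
                "INSERT INTO notices VALUES (?, ?, 'received')", (notice_id, url)
            )
    finally:
        conn.close()
    return True

def load_notice(db_path, notice_id):
    """Read a notice back through a fresh connection, as after a restart."""
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute(
            "SELECT url, status FROM notices WHERE id = ?", (notice_id,)
        ).fetchone()
    finally:
        conn.close()

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "notices.db")
        record_notice(path, "N-1", "https://example.com/post/1")
        print(load_notice(path, "N-1"))
```

The point of the design is the ordering: nothing tells the sender "we have your notice" until the record would survive a crash. A store that acknowledges writes before they are durable can lose exactly the records a site is legally obliged to keep.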
"You find these edge cases in the apps over time. When it turns out that you need [ACID-like safeguards] but you've dedicated yourself to a NoSQL database, your solution is to either rewrite your entire app to run against a new [ACID-compliant] database or figure out workarounds," Madsen continues. "In the latter case, what the database doesn't do for you becomes instantiated as new code in your app. This was the gist of the Google Spanner and F1 papers," he concludes, referring to Google's so-called "NewSQL" database design. "They basically said, 'We realized after years of doing this that the app keeps accreting more and more code to deal with edge cases until it becomes very, very brittle. We're spending more and more time dealing with brittle code because we aren't getting the guarantees out of our persistence layer that we need.'"
It isn't an either/or proposition, however, Madsen stresses. NoSQL addresses issues of scale and availability that OLTP database systems can't, at least not cost-effectively. It's a question of anticipating problems -- edge cases, such as the DMCA-takedown example -- and planning for them in the apps or processes you design. "It turns out that there's a whole swath of stuff that doesn't really need the level of ACID-compliant guaranteed-consistency stuff. In other words, there are a lot of good trade-offs to be made. You just have to be smart or imaginative enough to anticipate those trade-offs."
About the Author
Stephen Swoyer is a technology writer with 20 years of experience. His writing has focused on business intelligence, data warehousing, and analytics for almost 15 years. Swoyer has an abiding interest in tech, but he’s particularly intrigued by the thorny people and process problems technology vendors never, ever want to talk about.