TDWI Articles

Apache Cassandra: Where Is the Open Source Database Headed?

More enterprises today see pure open source software such as Apache Cassandra not as a budget-friendly compromise solution but as the best solution available.

Pure open source Apache Cassandra already provides many advantages for enterprises’ mission-critical use cases, but now we’re at an especially pivotal time for the NoSQL database. For starters, an uncertain economy is putting pricey (and less flexible) proprietary and open core databases under the microscope. I suspect we’ll continue to see widespread momentum around open source project adoption up and down the stack -- and Cassandra is one that’s particularly ready to replace existing solutions.

Even outside of the macroeconomic forces that may play a role in Cassandra migrations, all enterprise tech leaders and their data and analytics teams should pay attention to three Cassandra trends right now.

Trend #1: The adoption of Cassandra 4.0 is (deservedly) growing and 4.1 is on the doorstep

Technology leaders and data management teams that hesitated to become early adopters during the database’s major 2021 upgrade now see a more mature tool with substantial features ready for scale. At its launch, Cassandra 4.0 wasn’t yet supported by key tools such as Cassandra Reaper, so some organizations weren’t quite ready to make the move to 4.0. That was especially true among those that were satisfied with their existing Cassandra implementations and unsure what advantages the upgrade had to offer. Reaper now supports Cassandra 4.0, with other critical tools following suit. Across just about any implementation, the choice to move up to 4.0 should now be clear and well worth the effort to upgrade.

Cassandra 4.0 benefits include improved performance (especially around indexing speed), which meaningfully accelerates data queries and reporting so that analytics results reach end users sooner. Virtual tables similarly make it easier to query Cassandra’s own performance data with ordinary CQL, which supports better data management (and optimizing applications accordingly). Improved security, added Java 11 support, and a valuable audit log round out the new features that IT decision-makers should be paying close attention to. Beyond these enhancements, Cassandra 4.0 was built with a core focus on reliability -- with the goal of appealing to open source doubters and fighting misconceptions by offering a nearly bug-free solution. The initial release achieved its goal of including as few bugs as possible, and its reliability has only improved from there.
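As a rough illustration of what virtual tables make possible, the sketch below reads the system_views.thread_pools virtual table and lists the thread pools that have queued work. It assumes a session object with an execute(query) method that returns rows exposing name and pending_tasks attributes, as a connected session from the DataStax Python driver would; the helper name backlogged_thread_pools and the FakeSession stand-in (with its made-up numbers) are hypothetical and exist only so the example runs without a cluster. Virtual tables report on the node that answers the query, so a real check would need to ask each node in turn.

```python
from collections import namedtuple
from typing import Iterable, List

# CQL against a Cassandra 4.0 virtual table. The result describes only
# the node that serves the query, not the whole cluster.
THREAD_POOLS_QUERY = "SELECT name, pending_tasks FROM system_views.thread_pools"


def backlogged_thread_pools(session, min_pending: int = 1) -> List[str]:
    """Return the names of thread pools with at least `min_pending` queued tasks.

    `session` is assumed to be any object with an execute(query) method whose
    rows expose `name` and `pending_tasks` attributes.
    """
    rows = session.execute(THREAD_POOLS_QUERY)
    return sorted(row.name for row in rows if row.pending_tasks >= min_pending)


# Hypothetical stand-in so the sketch runs without a live cluster.
Row = namedtuple("Row", ["name", "pending_tasks"])


class FakeSession:
    def execute(self, query: str) -> Iterable[Row]:
        # Made-up sample rows, not real measurements.
        return [
            Row("MutationStage", 12),
            Row("ReadStage", 0),
            Row("CompactionExecutor", 3),
        ]


if __name__ == "__main__":
    print(backlogged_thread_pools(FakeSession()))
```

Because the function only depends on execute(), the stand-in can be swapped for a real driver session without changing the filtering logic.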

It's worth noting that Cassandra 4.1, which focuses on improving Cassandra operations and the day-to-day data management experience, is also coming soon. With the addition of a Guardrails framework (plus improved rate limiting and deny listing), organizations can be more confident in how they use Cassandra and work with their data model. There are also many improvements across lightweight transactions (LWTs), hint behavior, permissions and security, CQL, and other areas of the project that are worth a look.

Trend #2: Cassandra 4.0 is becoming the go-to database option for an expanding set of enterprise use cases

In particular, Cassandra 4.0 is seeing accelerating adoption among technology leadership within the banking and financial services industries, where its strong audit trail offers compelling advantages. Other sectors in need of strict auditing as part of their security policies will find Cassandra 4.0 similarly appealing. Power distribution companies provide another use case ripe for Cassandra migration; for example, 4.0 is currently enabling many smart meter systems by serving as the back end, able to collect and manage data for analysis at a particularly high scale. Cassandra 4.0’s excellent write performance and high availability mean that these companies can deploy infrastructure with tens of thousands of smart meters, each sending usage metrics multiple times each day, with no bottlenecks or costly downtime.
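To put the smart meter scenario in perspective, here is a back-of-the-envelope sketch of the average write rate such a fleet generates, along with one common way to bucket time-series readings into partitions. The fleet size (50,000 meters), the reporting frequency (24 readings per day), and the meter-plus-day partition scheme are illustrative assumptions, not figures or a data model from any deployment described here.

```python
from datetime import datetime, timezone
from typing import Tuple

SECONDS_PER_DAY = 24 * 60 * 60


def average_writes_per_second(meters: int, readings_per_meter_per_day: int) -> float:
    """Average sustained write rate for a fleet of meters (peaks will be higher)."""
    return meters * readings_per_meter_per_day / SECONDS_PER_DAY


def partition_key(meter_id: str, reading_time: datetime) -> Tuple[str, str]:
    """Bucket readings by meter and UTC day so each partition holds one meter-day.

    This is an assumed modeling choice that keeps partitions bounded in size.
    """
    day = reading_time.astimezone(timezone.utc).strftime("%Y-%m-%d")
    return (meter_id, day)


if __name__ == "__main__":
    # Hypothetical fleet: 50,000 meters reporting hourly.
    rate = average_writes_per_second(50_000, 24)
    print(f"{rate:.1f} writes/second on average")

    reading = datetime(2023, 1, 15, 23, 30, tzinfo=timezone.utc)
    print(partition_key("meter-0001", reading))
```

Under these assumptions the average load is modest; what matters more in practice is how bursts are absorbed when many meters report at once, which is where write throughput and availability come into play.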

Cassandra 4.0 is also increasing its appeal in these areas by aligning with the prevailing database trend toward simplified installation and operational management (this will be even more apparent with 4.1, as mentioned). A look at Cassandra enhancement proposals (planned or in consideration) shows an open source community actively pursuing simplification. At the same time, the Cassandra community is enabling more use cases by pushing to enhance indexing (especially secondary indexing), implement storage-attached indexes, and achieve simpler and better integration with Kubernetes.

Trend #3: The gulf in cost between pure open source Cassandra and proprietary “open core” versions is expanding

Commercial open core Cassandra offerings will only become more expensive. In direct contrast, open source Cassandra will remain freely available as a more cost-effective path forward. The open core solutions begin from open source Cassandra code and add their own proprietary features to justify steep licensing fees. Those commercial features simply aren’t necessary for enterprises, which nevertheless often find themselves locked into multiyear contracts even though a free enterprise-grade alternative is readily available. To succeed with their business model of charging for free software, open core vendors pursue every opportunity to hold on to long-term customers through vendor and technical lock-in.

To strike a more positive note, enterprises are increasingly recognizing open core for what it is and acknowledging open source Cassandra as a fully enterprise-ready technology. As many organizations understand -- and more will in 2023 as budgets tighten -- proprietary open core features come at high cost and offer minimal benefit. Many businesses have even introduced specific corporate mandates to stop paying for proprietary solutions when viable open source alternatives are available. That’s absolutely the case with open source Cassandra.

Change Is Underway

The enterprise mindset has undergone a massive shift in recent years, from considering pure open source software a budget-friendly compromise to viewing it as not only viable but often the most robust solution available. Enterprises are no longer validating Cassandra 4.0 as a powerful database: they know it is. Their internal conversations right now are about proving Cassandra is the best fit for their data management, analytics, and other data-layer use cases, and then moving to it as quickly as possible.

About the Author

Anil Inamdar is the VP and head of data solutions at Instaclustr by NetApp, which provides a managed platform around open source data technologies. Anil has 20+ years of experience in data and analytics roles. He regularly speaks on Cassandra topics and best practices. Prior to Instaclustr, he held data and analytics leadership roles at Dell EMC, Accenture, and Visa, among others.

