TDWI | Training & Research | Business Intelligence, Analytics, Big Data, Data Warehousing

Database Evolution Meets the Era of Microservices

Why the future of your databases involves both multicloud and multiregion.

By Karthik Ranganathan
January 27, 2020

Major events have shaped how the database industry has evolved over the past 40 years, but none perhaps so disruptive as the era of microservices. From the simple web applications of the past and the interactive mobile apps of today, to microservices-based architectures of the future, Structured Query Language (SQL) has evolved to thrive in a globally distributed, hybrid and multicloud world.

For Further Reading:

Five Database Requirements for Digital Transformation

Under the Covers: Databases in Devices

Why Blockchain Will Never Kill the Database

When we look back at how legacy databases have grown to respond to major industry advances, we see a pattern emerge that may point us towards the future of the database and how new technologies may try (and perhaps fail) to better support cloud-based applications.

SQL is the universal language of relational databases, but to understand how far we've come, we should look back on the days of the single-node relational database management system (RDBMS).

The A.I. (After Internet) Years

Made famous by Oracle, the RDBMS ran on a single node and, over time, became feature-rich. During the 1980s and 1990s, developers were focused on client-server applications (such as sales) to manage businesses. If you look at market share, these databases are still the winners today. However, back then it was more about enterprise applications -- standard apps people built for business productivity such as managing customer relationships, supply chains, and human resources. These monolithic applications were a perfect fit for a monolithic database -- for a little while at least.

The next wave of change came after the advent of the internet. Suddenly, the growth of web applications exploded and developers needed a faster, better, and cheaper way to meet the new demand of connected applications. This heralded the era of open source databases that filled the technology void that the stagnation of monolithic databases had created.

As consumption of the web increased, so too did the consumption of databases. As instances kept increasing, the feature possibilities of the web grew, placing new demands on the underlying database that challenged the capabilities of the existing technology. The mass adoption of the web drove a proliferation of data services, where fragmented databases supported web features such as transactions, aggregations, and search.

This continued into the early 2000s, when applications had grown so enormous that when developers tried to store the data in an RDBMS, they had to do a sharded deployment. This meant taking incoming data, manually partitioning it into subsets, and storing each partition in a different RDBMS instance while still keeping the data available.
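The manual partitioning described above can be illustrated with a minimal sketch. It assumes hash-based routing on a row key, which is only one of several ways to shard. The shard names, the `user_id` key, and the helper functions are hypothetical and stand in for separate RDBMS instances.

```python
import zlib

# Hypothetical shard names; in a real deployment each would be a separate RDBMS instance.
SHARDS = ["rdbms-0", "rdbms-1", "rdbms-2"]

def shard_for(key, shards=SHARDS):
    """Map a row key to one shard using a stable hash (CRC32, chosen for illustration)."""
    return shards[zlib.crc32(key.encode("utf-8")) % len(shards)]

def partition(rows, key_field="user_id", shards=SHARDS):
    """Group incoming rows by the shard that would store them."""
    placement = {name: [] for name in shards}
    for row in rows:
        placement[shard_for(row[key_field], shards)].append(row)
    return placement

# Illustrative data only.
rows = [{"user_id": f"user-{i}", "total": i * 10} for i in range(6)]
placement = partition(rows)
for shard, stored in placement.items():
    print(shard, [r["user_id"] for r in stored])
```

Because rows for different keys can land on different instances, any query or transaction that spans keys has to be coordinated by the application rather than by a single database.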

In a nutshell, this sharding on SQL meant developers had to compromise on transactions, referential integrity, JOINs, and other major SQL features. As Web 2.0 exploded, so did the demand for a way to manage massive amounts of data without losing SQL's flexibility for ever-changing applications.

Web 2.0

The all-pervasive era of the internet created a demand for online services such as spam detection and e-commerce site recommendations. These services required so much data that manual sharding became impossible. Developers needed a natively scalable approach and they found it in NoSQL.

NoSQL was the first class of distributed databases and a huge evolutionary step. Giving developers the ability to run applications at scale -- even if it meant a decrease in consistency -- provided a way ahead. High availability was more important than consistency and prioritization was key to keeping pace with the expanding online landscape.

This brings us closer to today, where the 2010s still see NoSQL solving scale and availability issues but continuing to compromise on the SQL feature set and consistency. Monolithic databases haven't gone away, but transactional applications originally based on Oracle, SQL Server, or PostgreSQL now need to be rewritten to adapt to more modern architectures such as the public cloud and Kubernetes. Today's customers expect a richer experience that can only be found in the cloud.

Where We're Going and How to Get There Faster

Today's user-facing applications need relational data modeling, high consistency, low latency, on-demand scale, and continuous availability. This inflection point is as industry-changing as the birth of the internet or Web 2.0. Digital transformation has pushed forward the need for microservices-based design; the move to the public cloud has meant another explosion of applications, requiring more databases, machines, and data centers -- and new challenges.

One such challenge is rising database costs, as the licensing and support costs of most providers rise exponentially with the amount of data. This has also created a disparity in the value-to-cost ratio and given rise to the open source model of database transparency. Even here, the model has limitations because many offerings hold back critical technology for a price that most enterprises would be required to pay eventually.

Another major challenge faces legacy database users. On the on-premises side, users have fewer machines to deal with and higher quality tools, but that's expensive, requires significant human labor, and carries risks. If you run an on-premises database in the cloud, there's a much higher probability of failure and certain decline of availability and consistency, not to mention the barrier to scale. To do it correctly requires effort that is better put into improving and building better applications. It's a possible avenue for very large companies with time, money, and resources to spare, but the window on this approach is quickly closing.

For users of legacy databases, the ROI of moving large amounts of existing data to the cloud can be low due to the time and resources required to make the move. However, these companies are choosing alternatives to their legacy architecture more frequently, not just for higher reliability, scale, and performance, but because of two additional trends that are pushing the monoliths out: multicloud and multiregion.

The Future Is Multicloud and Multiregion

As multicloud deployments grow in popularity -- and in their requirements, thanks to regulations such as the GDPR -- developers have another checkbox to tick in their search for an effective and affordable database, adding "geographic distribution with consistency" to a requirements list that already includes availability, scalability, and features. Geographic data distribution reduces user latency by serving data from the nearest region, and it improves security and compliance.
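A toy sketch can make the "nearest region" idea concrete. It assumes the application already has measured round-trip latencies per region and an optional set of regions where a user's data is permitted to live. The region names, latency figures, and the `nearest_region` helper are all hypothetical, and this is not how any particular database implements geo-distribution.

```python
from typing import Dict, Optional, Set

def nearest_region(latency_ms: Dict[str, float],
                   allowed: Optional[Set[str]] = None) -> str:
    """Pick the lowest-latency region, optionally limited to permitted regions."""
    candidates = {
        region: ms for region, ms in latency_ms.items()
        if allowed is None or region in allowed
    }
    if not candidates:
        raise ValueError("no region satisfies the residency constraint")
    # min() over the dict keys, ordered by each region's measured latency.
    return min(candidates, key=candidates.get)

# Hypothetical measurements from one user's location.
measured = {"us-east": 80.0, "eu-west": 12.0, "ap-south": 140.0}

print(nearest_region(measured))                                   # eu-west
print(nearest_region(measured, allowed={"us-east", "ap-south"}))  # us-east
```

The second call shows the trade-off: a residency rule can force a request away from the lowest-latency region.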

Kubernetes is increasingly proving itself as the most effective approach to taking advantage of multicloud. Running stateless applications as well as stateful databases in portable containers means that developers can take advantage of the different strengths that exist in different clouds, managed and operated simultaneously with Kubernetes. Kubernetes allows users to deploy applications in a cloud-neutral manner, which means leveraging both public and private clouds as and when needed.

Regardless of geographic location, this cloud-native architecture is shaping everything today's developers are doing. How do you choose the best database today? There are three key questions developers should ask:

Does the database serve the sophisticated relational data modeling needs of microservices?
Does the database provide zero data loss and low-latency guarantees even under a highly ephemeral infrastructure environment such as Kubernetes?
Does the database scale on demand with ease in a multizone, multiregion, and multicloud deployment in response to planned and unplanned business events?

A decade from now, we'll recall the microservices era of database evolution and how vital it was to the acceleration of application development. To ensure we get there, we must look ahead -- not at the past -- for new ways to meet the challenges of the cloud.

About the Author

Karthik Ranganathan is the co-founder and CTO of Yugabyte, the company behind YugabyteDB - the open source project delivering a distributed PostgreSQL database for modern applications. Karthik has played a key role in driving distributed SQL database adoption and bringing together NoSQL and SQL capabilities into a single relational database. Before Yugabyte, Karthik was one of the original database engineers at Facebook, responsible for building distributed databases like Cassandra and HBase. He is an Apache HBase committer and was an early contributor to Cassandra before it was open-sourced by Facebook.
