Executive Perspective: Database Directions
More real-time data, more demands on that data, and a world where cloud computing and data are now mainstream. How do these trends affect what's ahead for databases and database management? Yugabyte's CTO, Karthik Ranganathan, shares his ideas.
- By James E. Powell
- May 15, 2020
Upside: How has the proliferation of data and data types shaped the evolution of the database?
Karthik Ranganathan: Today, operating internet-scale apps has become the norm for many businesses. However, the evolution to get here, from beginning to end, involved nearly a complete metamorphosis of many of the underlying technologies -- including databases -- used along the way. Before the internet, businesses had apps and data (and Rolodexes and paper memos and Oracle v3), but the environment and the stakes have completely changed now that businesses overwhelmingly deliver their services -- or important aspects of their services -- online.
Back in the day, if a service went down, it was possible for businesses to creatively work around outages for a period of time. Today, a company's reputation is almost synonymous with the online services it delivers, and if a service doesn't work (or doesn't work as expected), the consequences can be severe. Now that microservices are the norm for delivering digital experiences, the data feeding those services is critical. Data -- and there's more of it now than ever before -- needs to be highly available, on demand, and without a detectable delay to the services it feeds. That said, the evolution of the database and the evolution of data both follow the life of an application itself.
Before the internet and in the beginnings of the internet era, data existed in monoliths and relied on a relational database management system (RDBMS). As the internet became mainstream, and with the introduction of microservices, the demand for scale led many to NoSQL databases. Today, we live in a mainstream cloud world where microservices are cloud native, and we are beginning to see a shift away from NoSQL (where developers have to compromise on speed or performance) to distributed SQL (which is built for these cloud-native applications).
What's the most competitive approach to database management today? What's the most innovative/aggressive?
Online services are becoming the core essence of most organizations, and as a result, "delighting the user" is a common mandate. This includes providing a high-performance experience that rivals or exceeds that of the competition while keeping costs accessible for the enterprise. To that end, the most competitive and aggressive approach to database management is to improve performance (reduce latency, increase throughput, and eliminate downtime) at every layer within the application, including the data tier. This also needs to include the evaluation and implementation of an architecture that helps keep infrastructure costs to a minimum.
For example, we work with Plume, a whole-home mesh Wi-Fi system that intelligently manages the internet service for all devices and rooms in your home. Their legacy database, MongoDB, could no longer handle the increasing volumes of event data Plume's customer base was generating, and operating and managing MongoDB at scale was too time-consuming. The company tried switching to Apache Cassandra, but this increased expenses, required significant operational resources, and delivered high latency due to an unpredictable data infrastructure.
Plume's subsequent move to a distributed SQL database let the company scale its daily operations seamlessly and manage more than 35TB of data while significantly reducing latency. As a result of this competitive approach, Plume can perform rolling upgrades without downtime, spend less time managing the database and more on their core business, and accept and integrate new customers rapidly.
How has the cloud changed database management?
The cloud and cloud-native technologies such as Kubernetes unlocked a myriad of benefits for the management of databases. However, to reap those benefits, the database technology needs to be able to respond in kind.
When it comes to stateful applications (applications that store data; a database is one example), Kubernetes can offer the benefits of scale and fault tolerance, but the stateful app itself needs to be orchestration-ready and deliver on those promises. The stateful app has to be ready to be scalable and fault-tolerant, all without losing data.
This means that the agile nature of Kubernetes requires the underlying database tier to be equally agile. Otherwise, the application will see outages, slowdowns, and -- worst of all -- data loss and incorrect results. Most SQL databases cannot exploit the benefits of Kubernetes because the database was never designed to do so. However, distributed SQL databases enable enterprises to take advantage of the inherent benefits that Kubernetes offers. For example, by constantly monitoring and rebalancing the data shards across the available nodes, even in a highly dynamic environment such as a Kubernetes cluster, a distributed SQL database can guarantee that applications never experience outages, slowdowns, or data loss.
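To make the idea of rebalancing shards across available nodes concrete, here is a minimal sketch in Python. It is an illustration only, not YugabyteDB's actual algorithm: the function name `rebalance`, the shard and node names, and the "even shard count" policy are all assumptions, and the sketch ignores replication, consensus, and the cost of physically moving data.

```python
def rebalance(assignment, live_nodes):
    """Return a new shard -> node mapping spread evenly over live_nodes.

    assignment: dict mapping shard id -> node name (may name nodes that are gone)
    live_nodes: list of unique node names that are currently available
    Hypothetical helper for illustration; not a real database API.
    """
    if not live_nodes:
        raise ValueError("no live nodes to place shards on")

    # Keep shards that already sit on a live node; collect the rest.
    by_node = {node: [] for node in live_nodes}
    orphans = []
    for shard, node in sorted(assignment.items()):
        if node in by_node:
            by_node[node].append(shard)
        else:
            orphans.append(shard)  # its node disappeared, so it must move

    # Place each orphaned shard on the currently least-loaded node.
    for shard in orphans:
        target = min(live_nodes, key=lambda n: len(by_node[n]))
        by_node[target].append(shard)

    # Move shards from the most-loaded to the least-loaded node until
    # shard counts differ by at most one.
    while True:
        most = max(live_nodes, key=lambda n: len(by_node[n]))
        least = min(live_nodes, key=lambda n: len(by_node[n]))
        if len(by_node[most]) - len(by_node[least]) <= 1:
            break
        by_node[least].append(by_node[most].pop())

    return {shard: node for node, shards in by_node.items() for shard in shards}
```

Calling `rebalance` again whenever the set of live nodes changes (for example, after Kubernetes reschedules a pod) yields an even placement without dropping any shard. A production system would additionally keep multiple replicas of each shard and coordinate moves so that no data is lost in transit.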
What database technology is being most widely adopted now and what is next in advancements?
Open source databases claimed zero percent of the market ten years ago, and they now make up more than seven percent. As technology rapidly advances, open source technologies, including databases, dramatically remove barriers to enterprise adoption and are very attractive to developers and DevOps engineers building applications on cloud-native platforms. Under alternatives such as freemium models, software takes significantly longer to mature to the same level as a true open source offering.
Additionally, Gartner predicts that by 2022 more than 70 percent of new in-house applications will be developed on an open source database management system (OSDBMS) or OSDBMS-based database platform as a service. It's clear that open source has proven to be the most successful approach to developing and distributing business-critical infrastructure software, and in the next decade, the adoption of open source databases will rise. This will be driven by the benefits of reduced total cost of ownership; access to high-velocity, collaborative, community-driven support and development; and innovation and compatibility with commercial databases, making it easy to migrate towards an open source solution.
Another advancement we're likely to see in the next decade is the proliferation of distributed cloud and multicloud adoption as the result of the geodistribution of data. Edge data centers are already starting to offer data services with geodistribution, and it's almost certain that geodistributed workloads will also increase in turn.
What's the biggest challenge enterprises face when it comes to database management and strategy? What's the answer to overcoming it?
Inertia. Legacy databases are too often seen as "good enough" and this holds an enterprise back from evolving its database and strategy. The answer to overcoming inertia starts with greenfield opportunities: look at what modern alternatives are available, investigate their potential, and move steadily into a new architecture.
Where do you see the database industry growing in the next decade?
The operational DBMS industry has been experiencing the distributed SQL revolution over the last few years. Unlike specialized databases, distributed SQL appeals broadly to both start-ups and large enterprises looking to adopt cloud-native technologies for their entire software stack, which includes database infrastructure. Application development and operations teams are excited that they no longer have to give up the data modeling flexibility and transactional capabilities of SQL in the process of going cloud native.
What does this mean for the broader industry? For one thing, the impact of Oracle will completely diminish in the next decade. We're seeing this unfold today with users selecting cloud-native databases rather than Oracle, shrinking the user base of the monolithic legacy provider. Overall, Oracle has lost market share every year since 2013, and legacy relational database players have dropped about five percentage points per year. Major organizations, such as Amazon and Salesforce, have already figured out that it's to their benefit to use less, not more, Oracle, and this is just the beginning of the giant's undoing.
Describe your product/solution and the problem it solves for enterprises.
YugabyteDB is a Google Spanner-inspired, cloud-native distributed SQL database that is 100 percent open source and provides the speed, scale, and performance developers need to deploy global, cloud-native applications. It helps developers take advantage of multicloud or solve for the geodistribution of data by developing, deploying, and operationalizing their modern applications with high performance and scale. With YugabyteDB, organizations don't have to give up data modeling flexibility and transactional capabilities by going cloud native.
YugabyteDB 2.1 includes two-data-center (2DC) deployments for reducing write latency, read replicas for reducing read latency, enterprise-grade security enhancements, and TPC-C and YCSB benchmark results confirming a 10x increase in performance.