Data Management: A Look Back At 2021
Data management has evolved into a very different discipline from what it was just 10 years ago. Chief data management trends in the year now ending included a shift toward platforms that are adept at processing practically any type of information and those that are distributed across cloud infrastructures of growing complexity.
In 2021, TDWI followed several trends that clearly signaled the waning importance of SQL platforms in the larger outlook for enterprise data management.
Trend #1: Multimodel platforms hastened the end of the war between SQL and NoSQL
Customer engagement and other modern digital applications blend shifting mixes of relational and nonrelational data. Heterogeneous data is the lifeblood of transactional analytics, multichannel customer engagement, real-time next best action, recommendation engines, and other mainstream applications.
In 2021, enterprise IT professionals saw a new normal emerge. An increasingly popular no-copy architecture, the "multimodel" database, hastened the end of the war between SQL and NoSQL. Multimodel databases combine relational and nonrelational data and seamlessly execute analytics, transactions, and other workloads in a single platform with scalability, performance, high availability, and unified management.
The power of multimodel data platforms lies in their ability to enable various analytics engines and their associated data structures to be optimized for various use cases. The engines can be selectively loaded into memory according to use case. In addition, all engines can access the same data, eliminating the need to store multiple copies of the same data or to incur the overhead of transferring data between engines. This approach also facilitates NoSQL processing in a microservices environment so the engines can efficiently communicate state, events, and data with each other.
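As a rough illustration of the no-copy idea, the sketch below keeps one shared record store and lets two hypothetical "engines" (a relational-style filter and a graph-style traversal) read from it without duplicating any records. The class and function names are assumptions made for illustration and do not correspond to any vendor's API.

```python
from dataclasses import dataclass, field


@dataclass
class SharedStore:
    """Single in-memory store that every engine reads; nothing is copied."""
    records: dict = field(default_factory=dict)

    def put(self, key, record):
        self.records[key] = record


def relational_select(store, predicate):
    """Relational-style engine: filter records as if they were rows."""
    return [key for key, rec in store.records.items() if predicate(rec)]


def graph_neighbors(store, key, depth=1):
    """Graph-style engine: follow 'follows' links stored on the same records."""
    frontier, seen = {key}, {key}
    for _ in range(depth):
        frontier = {
            nxt
            for cur in frontier
            for nxt in store.records[cur].get("follows", [])
            if nxt not in seen
        }
        seen |= frontier
    return seen - {key}


# Hypothetical customer records holding both row-like and graph-like fields.
store = SharedStore()
store.put("ann", {"city": "Oslo", "follows": ["bob"]})
store.put("bob", {"city": "Lima", "follows": ["cy"]})
store.put("cy", {"city": "Oslo", "follows": []})

oslo_customers = relational_select(store, lambda r: r["city"] == "Oslo")
reachable = graph_neighbors(store, "ann", depth=2)
```

Both calls operate on the same `store.records` dictionary, which is the point of the sketch: each engine applies its own access pattern to one copy of the data.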
Having proved their worth in managing nonrelational data, NoSQL vendors embraced multimodel architectures more completely in 2021. Many established NoSQL vendors now offer multimodel platforms that consolidate relational, key-value, document, graph, wide column, geospatial, time series, and other data models on scalable back ends. They have also been competing more fiercely to offer their customers the ability to add new data sources, analytics engines, transaction models, immersive interactivity, and real-time contextualization capabilities.
As multimodel data platforms mature in coming years, TDWI expects them to become most enterprises' converged environment for data lakehouses, hybrid transactional and analytics processing, and other environments that require tight integration of transactional, analytics, and operational data applications. Data professionals will increasingly deploy multimodel platforms to unify data warehouses and data lakes, DataOps and MLOps pipelines, and business intelligence and advanced analytics platforms. A key focus of vendor differentiation in the multimodel arena will be AIOps tools for automating end-to-end management across increasingly complex, containerized multicloud fabrics.
Trend #2: Serverless APIs became a principal on-ramp for many cloud data applications
In the past year, cloud data continued to penetrate more deeply into enterprise applications. In the process, a wider range of cloud data platform providers offered serverless APIs to facilitate on-demand, event-driven, stateless, programmatic access to a wide range of data engineering, analytics, and other functions.
Serverless environments continued to prove their suitability for a wide range of core data management and analytics workloads. These serverless-ready, data-centric functions include creation of APIs that return data from back-end cloud microservices in response to requests; sending event notifications in response to specific analytics application behaviors; and serving data through interactive response dialogues, such as the outputs of AI-driven face, voice, and image recognition models.
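To make the first of these functions concrete, here is a minimal sketch of a stateless, event-driven function that returns data in response to a request. The `handler(event, context)` shape, the `_ORDERS` data set, and the response fields are all assumptions chosen for illustration, not the interface of any particular cloud provider.

```python
import json

# Hypothetical read-only data set standing in for a back-end cloud microservice.
_ORDERS = {"1001": {"status": "shipped"}, "1002": {"status": "pending"}}


def handler(event, context=None):
    """Stateless function: everything it needs arrives in `event`."""
    order_id = event.get("order_id")
    order = _ORDERS.get(order_id)
    if order is None:
        return {"statusCode": 404, "body": json.dumps({"error": "not found"})}
    return {"statusCode": 200, "body": json.dumps({"order_id": order_id, **order})}


# Each invocation is independent; no state carries over between calls.
found = handler({"order_id": "1001"})
missing = handler({"order_id": "9999"})
```

Because the function holds no state between invocations, a platform is free to run it on whatever back-end resources happen to be available when a request arrives.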
As organizations transition their data environments to more completely embrace serverless cloud architectures, they are taking advantage of this approach's back-end operational advantages. These include the following abilities:
- Dynamically manage the allocation of data and other back-end machine resources
- Boost the density and efficiency of storage, CPU, and other back-end resources
- Provide a language-neutral platform for querying, analyzing, and manipulating data
- Abstract away the physical locations and operating platforms from which the back-end application logic is being served
- Eliminate the need for programmers to write the logic that manages containers, virtual machines, and other back-end runtime engines to which execution of application logic will be dynamically allocated
In 2021, enterprises continued to adopt data platforms that are either architected primarily for serverless cloud access or that provide serverless APIs in addition to SQL and other cloud-native interfaces. Key announcements from serverless cloud data providers in 2021:
- Amazon Aurora Serverless Version 2 went into preview. Currently limited to MySQL-compatible Amazon Aurora deployments, this enhancement lets Aurora users automatically adjust data capacity as needed for applications that have irregular, unpredictable, or infrequent traffic patterns, while supporting auto-scaling to avoid purchasing more capacity than needed.
- CockroachDB Serverless went into public beta release. Improving how developers work with relational data infrastructure, the service eliminates database operations, is fully elastic and globally distributed, scales automatically, and is designed to affordably provide the exact resources needed.
- Databricks Serverless SQL went into public preview on AWS. This new capability for Databricks SQL provides instant compute capacity to users for their BI, SQL, and lakehouse workloads, with minimal management required and capacity optimizations that can lower overall costs.
- DataStax Astra became generally available. This is a serverless instance of DataStax's open source Apache Cassandra database that has been reimplemented as a set of microservices deployed on top of Kubernetes clusters accessing an object-based storage system.
- Microsoft Azure Cosmos DB's serverless interfaces became generally available. Cosmos DB's serverless containers can serve thousands of requests per second with no minimum charge and no capacity planning required, and are recommended for users who expect intermittent and unpredictable data traffic with long idle times.
- MongoDB Atlas 5.0 went into general preview for existing customers. With this release, serverless instances of the database-as-a-service offering can automatically provision the database resources that users need based on workload demands.
TDWI expects vendors in every niche of the data platform market to follow their lead in 2022 and roll out serverless APIs for their respective database-as-a-service offerings.
Trend #3: Hyperledger data platforms retrenched toward focused applications
In 2021, some specialized niches of the data platform market retrenched into their core strategic focus areas. Hyperledger databases -- a segment that includes commercial blockchain platforms -- were in that category.
As 2021 draws to a close, the hype surrounding this segment is rapidly waning. Though it appears that there's still strong demand for hyperledger databases in niche, distributed, and community applications that require immutable logs, mainstream enterprise applications for the technology seem to be eluding solution providers in this segment.
This past year, the cloud solution providers who've been trying to drive hyperledger data platforms into broader enterprise opportunities have noticeably slowed or retrenched on their go-to-market strategies. Most notably:
- IBM, which made a big splash into the blockchain market five years ago, was reported earlier this year to have substantially shrunk its product unit devoted to the technology. Nevertheless, the company claims that blockchain has a multiplier effect on demand for revenue-bearing enterprise cloud services.
- Microsoft, which discontinued its six-year-old Azure Blockchain Service this year, has shifted its strategy, stating it has "made the decision to shift our focus from a product-oriented offering to a partner-oriented solution." However, just a few weeks later, it announced a preview of the new Azure Confidential Ledger Service, stating that the new offering "doesn't replace Azure Blockchain Service but is another distributed ledger that can be used by customers who want the maximum level of privacy afforded to them."
- Amazon Web Services has continued to offer credible solutions in this segment, primarily Amazon Managed Blockchain and Amazon Quantum Ledger Database. In 2021, it announced general availability of Ethereum on Amazon Managed Blockchain, but has otherwise offered no enhancements, additions, or other significant announcements.
- Google Cloud has been developing its own blockchain offering for the past few years and has also seemed to be softening that strategy in favor of prioritizing hyperledger vendor partnerships with the likes of Dapper Labs and Polygon Networks.
From an enterprise perspective, the most noteworthy news in the hyperledger market in 2021 was Microsoft's November announcement of SQL Server 2022. In this upcoming release, the database platform will take advantage of the distributed ledger tables that Microsoft has already released in Azure SQL Database. The tables, to be made available in both on-premises and Azure VM deployments for SQL Server 2022, will persist the full transactional history in immutable distributed storage. This will provide additional levels of integrity to data in databases by maintaining, in Azure Blob storage, an audit record that DBAs cannot overwrite.
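The general idea of a tamper-evident audit record can be sketched with a simple hash chain, in which each entry's hash covers the previous entry's hash. This is an illustrative assumption about how such integrity checks commonly work, not a description of Microsoft's implementation; the `Ledger` class and its methods are hypothetical.

```python
import hashlib
import json


def _digest(prev_hash, payload):
    """Hash the previous entry's hash together with this entry's payload."""
    data = prev_hash + json.dumps(payload, sort_keys=True)
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


class Ledger:
    """Append-only log in which each entry commits to the one before it."""

    GENESIS = "0" * 64

    def __init__(self):
        self.entries = []

    def append(self, payload):
        prev_hash = self.entries[-1]["hash"] if self.entries else self.GENESIS
        self.entries.append({"payload": payload, "hash": _digest(prev_hash, payload)})

    def verify(self):
        """Recompute the chain; an edited payload no longer matches its stored hash."""
        prev_hash = self.GENESIS
        for entry in self.entries:
            if entry["hash"] != _digest(prev_hash, entry["payload"]):
                return False
            prev_hash = entry["hash"]
        return True


ledger = Ledger()
ledger.append({"op": "UPDATE", "row": 7, "balance": 100})
ledger.append({"op": "UPDATE", "row": 7, "balance": 250})
intact = ledger.verify()

ledger.entries[0]["payload"]["balance"] = 999  # simulate tampering with history
tampered_ok = ledger.verify()
```

In this sketch, verification succeeds on the untouched ledger and fails once an earlier payload is altered, which is the property that makes such a record useful as an audit trail.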
Nevertheless, the year now ending saw increasing competition among start-ups focused on financial, cybersecurity, and other vertical or niche applications of blockchain and hyperledger data platforms. One likely trend in the coming year is that the confidential computing industry, in which leading cloud and enterprise data platform vendors have taken a prominent role, will drive demand for hyperledger data platforms as trustworthy cybersecurity audit logs. Much of this demand will come from the edge computing industry, which is rapidly adding hyperledger tech as a distributed data platform for trustworthy transactions involving Internet of Things (IoT), edge, embedded, mobile, gaming, and multiverse devices.
One recent announcement in the hyperledger arena that may be of interest to data analytics professionals was Nokia's adoption of blockchain to facilitate trusted sharing of data and AI models between enterprises across distributed global supply chains. This past May, the vendor announced the launch of Data Marketplace, a blockchain-based service providing real-time B2B access to massive trusted data sets. The service complements the Nokia Worldwide IoT Network Grid, which offers global IoT connectivity and vertical applications for supply chains.
This sort of initiative isn't anything radically new in the hyperledger data market. TDWI has seen blockchain and other hyperledger technologies creeping into MLOps pipelines for years, and 2021 simply continued this trend.
James Kobielus is senior director of research for data management at TDWI. He is a veteran industry analyst, consultant, author, speaker, and blogger in analytics and data management. At TDWI he focuses on data management, artificial intelligence, and cloud computing. Previously, Kobielus held positions at Futurum Research, SiliconANGLE Wikibon, Forrester Research, Current Analysis, and the Burton Group. He has also served as senior program director, product marketing for big data analytics for IBM, where he was both a subject matter expert and a strategist on thought leadership and content marketing programs targeted at the data science community. You can reach him by email (firstname.lastname@example.org), on Twitter (@jameskobielus), and on LinkedIn (https://www.linkedin.com/in/jameskobielus/).