Balancing the Challenges and Opportunities of Multiplatform Data Architectures
Today's architectures achieve a level of complexity that is difficult to manage and afford -- but they can enable greater data diversity, analytics, and business innovation.
By Philip Russom
April 23, 2018
We are experiencing a time of great change in data, its management, and its range of uses. In addition to traditional enterprise sources, data now also comes from new sources, such as machines, social media, and the Internet of Things (IoT). These diverse sources produce data of every type, from unstructured to multistructured, and at every latency, from batch to real time.
At the same time, many organizations are modernizing their management of data -- not only to capture the data from new sources but also to employ data-driven practices such as business monitoring, performance management, multichannel marketing, and especially advanced analytics.
Diverse Data Management Tools and Storage Platforms
As a strategy for capturing and leveraging new data assets for new practices, many user organizations are diversifying their portfolios of data management tools and storage platforms. The idea is that no single data storage platform can be optimized for the extreme diversity of data structure, latency, and analytics purpose we face today. Instead, many organizations prefer the ability to select the most appropriate tool and platform combination for any given type of data and its use cases.
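The selection idea above can be sketched in code. This is a minimal illustration, not a recommendation: the `DataProfile` fields, the rule order, and the platform category names are all assumptions introduced here, and a real organization would choose its own criteria and platforms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataProfile:
    """Hypothetical description of a data set and its intended use."""
    structure: str  # "structured", "multistructured", or "unstructured"
    latency: str    # "batch" or "real_time"
    purpose: str    # "reporting" or "advanced_analytics"


def choose_platform(profile: DataProfile) -> str:
    """Return an illustrative platform category for the given profile.

    The rules below are assumptions made for this sketch only.
    """
    # Data that must be handled as it arrives goes to stream processing.
    if profile.latency == "real_time":
        return "stream_processing"
    # Structured batch data used for reporting fits a relational DBMS.
    if profile.structure == "structured" and profile.purpose == "reporting":
        return "relational_dbms"
    # Everything else lands on a platform that tolerates varied structure.
    return "hadoop"


orders = DataProfile("structured", "batch", "reporting")
sensor_feed = DataProfile("multistructured", "real_time", "advanced_analytics")
call_notes = DataProfile("unstructured", "batch", "advanced_analytics")
```

Under these assumed rules, `orders` maps to a relational DBMS, `sensor_feed` to stream processing, and `call_notes` to Hadoop. The point is only that the choice is made per data type and use case rather than once for the whole enterprise.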
This multiplatform strategy for data management drives organizations toward environments consisting of numerous data platforms where data is physically distributed across multiple database servers, file systems, and storage. Extreme complexity results from the number of systems involved, which regularly include multiple brands of database management systems (DBMSs, both old and new), Hadoop and NoSQL platforms, and tools for data integration, analytics, and stream processing. These may be on-premises, in the cloud, or in hybrid combinations of the two. Tools and platforms may originate from software vendors, the open source community, homegrown development, or all of these.
The result is an eclectic mix of old and new data, managed on traditional and modern platforms with tools from many providers, stitched together by some form of distributed data architecture.
Defining Multiplatform Data Architectures (MDAs)
When diverse tools, technologies, platforms, and data sets are integrated this way, the result is a "multiplatform data architecture" (MDA). Note that MDAs are not new; they emerged with the earliest client/server implementations. What's new is that MDAs have recently achieved an extreme diversity, sophistication, and complexity that make early MDAs pale in comparison.
We assume that data is heavily distributed in an MDA. In other words, data is strewn physically across the many databases, clouds, and other storage platforms of the MDA. However, we also assume there should be some form of large-scale, cross-platform architecture that unifies an MDA and its data on a logical level. Ideally, the architecture should be actively designed by data architects and guided by some form of governance. Without such direction and control, an MDA can deteriorate into an unmanageable and ungoverned swamp that delivers minimal business value at high risk.
For the many user organizations embracing a multiplatform data architecture, its complexity makes it challenging to design, maintain, govern, and integrate with other systems. Yet users succeed with MDAs by relying on best practices in data architecture and data governance, plus technologies that stitch together the grand design and make it high-performance -- namely, data and application integration, central metadata and cataloging, and virtualization techniques.
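To make the idea of logical unification over physically distributed data more concrete, here is a minimal sketch of a central catalog that maps logical data set names to physical platforms and resolves reads through that mapping. Everything in it is hypothetical: the `Catalog` and `CatalogEntry` names, the `owner` field standing in for governance, and the in-memory dictionaries that stand in for a warehouse and a data lake. Real catalog and virtualization products work differently in detail.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, object]


@dataclass(frozen=True)
class CatalogEntry:
    """Hypothetical metadata record for one logical data set."""
    logical_name: str  # name that consumers use
    platform: str      # platform that physically holds the data
    location: str      # platform-specific address (table name, path, ...)
    owner: str         # accountable steward, a stand-in for governance


class Catalog:
    """Central metadata plus a thin virtualization layer (illustrative)."""

    def __init__(self) -> None:
        self._entries: Dict[str, CatalogEntry] = {}
        self._readers: Dict[str, Callable[[str], List[Row]]] = {}

    def register_platform(self, platform: str,
                          reader: Callable[[str], List[Row]]) -> None:
        # A reader knows how to fetch rows from one platform.
        self._readers[platform] = reader

    def register_dataset(self, entry: CatalogEntry) -> None:
        # Refuse data sets on platforms the architecture does not know about.
        if entry.platform not in self._readers:
            raise ValueError(f"unknown platform: {entry.platform}")
        self._entries[entry.logical_name] = entry

    def locate(self, logical_name: str) -> CatalogEntry:
        return self._entries[logical_name]

    def read(self, logical_name: str) -> List[Row]:
        # Consumers ask by logical name; the catalog resolves the platform.
        entry = self.locate(logical_name)
        return self._readers[entry.platform](entry.location)


# In-memory stand-ins for two physical platforms (assumptions for the demo).
warehouse = {"sales.orders": [{"order_id": 1, "amount": 120.0}]}
lake = {"/raw/clicks": [{"order_id": 1, "clicks": 7}]}

catalog = Catalog()
catalog.register_platform("warehouse", lambda loc: warehouse[loc])
catalog.register_platform("data_lake", lambda loc: lake[loc])
catalog.register_dataset(
    CatalogEntry("orders", "warehouse", "sales.orders", "finance"))
catalog.register_dataset(
    CatalogEntry("clickstream", "data_lake", "/raw/clicks", "marketing"))
```

A consumer calling `catalog.read("orders")` or `catalog.read("clickstream")` does not need to know which platform holds the data. In this sketch, that indirection is what lets the physical distribution change without breaking the logical view.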
TDWI regularly encounters modern data warehouses that are actually multiplatform environments of integrated platform types. Data warehouse professionals have dealt with complex data environments since the early 1990s. They are well equipped to succeed with the multiplatform warehouses required for the diverse requirements of advanced analytics while still satisfying requirements for traditional practices such as reporting.
Complexity aside, an MDA can incur substantial costs for its range of tools and platforms. This challenge is mitigated by the relatively low cost of open source and cloud-based data platforms and tools, which are especially common with hybrid MDAs.
As businesses deploy more sensors, customer channels, applications, and social media, the diversity of source data increases. Relational and other structured data types are being joined by a widening range of unstructured and multistructured data types. The benefit of an MDA is that it provides options for handling rapidly diversifying data and its business uses.
Many businesses are expanding their use of analytics, reports, and data-driven business monitoring, all of which have unique requirements for data capture, storage, processing, and delivery. These requirements, in turn, drive the need for diverse data platforms and related tools.
TDWI regularly sees users succeeding with MDAs. Large, complex, and hybrid MDAs are already common in data warehousing, analytics, multichannel marketing, the digital supply chain, IoT, and other data-driven enterprise programs.
For many organizations, the changes in data and its management are driving up the scale, scope, and complexity of modern data ecosystems. Yet the changes also present new opportunities for data management professionals and their business counterparts who are willing and able to embrace new data-driven practices, especially those for big data and advanced analytics.
For more information, view the TDWI Webinar "Defining the Multiplatform Data Architecture and What It Means to You."
You can also explore MDAs and related data-driven practices at the TDWI Leadership Summit "The Data Renaissance" to be held in Chicago in May 2018. Register at tdwi.org/chicagosummit.
Philip Russom is director of TDWI Research for data management and oversees many of TDWI’s research-oriented publications, services, and events. He is a well-known figure in data warehousing and business intelligence, having published over 500 research reports, magazine articles, opinion columns, speeches, Webinars, and more. Before joining TDWI in 2005, Russom was an industry analyst covering BI at Forrester Research and Giga Information Group. He also ran his own business as an independent industry analyst and BI consultant and was a contributing editor with leading IT magazines. Before that, Russom worked in technical and marketing positions for various database vendors. You can reach him at @prussom on Twitter and on LinkedIn at linkedin.com/in/philiprussom.