How to Get More from Your Data in 2020
As organizations look for ways to drive flexibility, agility, and innovation, they can expect to see these three trends in the coming year.
- By Ravi Shankar
- January 2, 2020
Advances in technologies such as the Internet of Things (IoT), wearable technologies, self-driving vehicles, and mobile technologies, supported by 5G connectivity, have led to the generation of large volumes of data. That data has grown unwieldy across systems, spanning data centers, the cloud, and, more recently, the edge.
In 2020, businesses will increasingly demand capabilities that enable them to achieve digital transformation through secure storage, faster integration, and better data discovery processes. The adoption of AI-driven analytics supported by cognitive computing capabilities such as machine learning (ML) has moved business insight delivery from low-latency to real-time analytics, resulting in faster time-to-market.
Companies looking to leverage these up-to-the-minute actionable insights in 2020 have to adopt newer trends such as data fabric/mesh, digital twins, and a multicloud strategy to stay ahead of the competition. As organizations continue to look for ways to drive flexibility, agility, and innovation, they can expect to see these three trends in the coming year:
2020 Trend #1: Data fabric goes dynamic to become data mesh
Data fabric allows unimpeded access and sharing of data across distributed computing systems by means of a single, secured, and controlled data management framework. Many large companies run multiple applications for their business requirements, resulting in the collection of large volumes of structured, semistructured, and unstructured data. This data is siloed across diverse data sources such as transactional databases, data warehouses, data lakes, and cloud storage.
A data fabric architecture is designed to stitch together historical and current data across multiple data silos to produce a uniform and unified business view of the data. It provides an elegant solution to the complex IT challenges of handling enormous amounts of data from disparate sources without having to replicate all of the data into yet another repository. This feat is accomplished through a combination of data integration, data virtualization, and data management technologies that create a unified semantic data layer to aid many business processes (such as accelerating data preparation and facilitating data science).
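The idea of a unified layer that reads from silos at query time, rather than copying them into a new repository, can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the two in-memory lists stand in for a data warehouse (historical data) and a transactional database (current data), and the `VirtualView` class and field names are invented for illustration and do not reflect any particular product's API.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List

Row = Dict[str, Any]

# Hypothetical silos. In practice these would be live connections,
# each with its own native schema.
WAREHOUSE_ORDERS: List[Row] = [
    {"order_id": 1, "cust": "acme", "total": 120.0, "yr": 2018},
    {"order_id": 2, "cust": "globex", "total": 75.5, "yr": 2019},
]
TRANSACTIONAL_ORDERS: List[Row] = [
    {"id": 3, "customer_name": "acme", "amount": 42.0, "year": 2020},
]


def warehouse_orders() -> Iterator[Row]:
    """Map warehouse rows to the shared schema at read time."""
    for r in WAREHOUSE_ORDERS:
        yield {"order_id": r["order_id"], "customer": r["cust"],
               "amount": r["total"], "year": r["yr"]}


def transactional_orders() -> Iterator[Row]:
    """Map transactional rows to the shared schema at read time."""
    for r in TRANSACTIONAL_ORDERS:
        yield {"order_id": r["id"], "customer": r["customer_name"],
               "amount": r["amount"], "year": r["year"]}


class VirtualView:
    """A unified view over several sources; no rows are copied or stored."""

    def __init__(self, sources: Iterable[Callable[[], Iterator[Row]]]) -> None:
        self._sources = list(sources)

    def query(self, predicate: Callable[[Row], bool] = lambda row: True) -> Iterator[Row]:
        # Each source is read lazily when the query runs.
        for source in self._sources:
            for row in source():
                if predicate(row):
                    yield row


orders = VirtualView([warehouse_orders, transactional_orders])
acme_rows = list(orders.query(lambda row: row["customer"] == "acme"))
acme_total = sum(row["amount"] for row in acme_rows)
```

Here a single query returns both the historical and the current order for one customer, even though the two silos use different column names. A real data virtualization layer would also push filters down to each source and enforce security, which this sketch leaves out.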
Increasingly, as data fabric shifts from static to dynamic infrastructure, it develops into what is called a data mesh. Data mesh is a distributed data architecture that follows a metadata-driven approach and is supported by machine learning capabilities. It is a tailor-made distributed ecosystem with reusable data services, a centralized governance policy, and dynamic data pipelines. The central idea of data mesh is that ownership of domain data is distributed across different business units in a self-serve, consumable format. In other words, data is owned at the domain level, and these domain datasets are made available for efficient use across different teams.
Another important aspect of data mesh is its globally available centralized discovery system (also known as its data catalog). Using the data catalog, multiple teams looking for insight can access the data discovery system to obtain information such as available data in the system, point of origin, data owner, sample datasets, and metadata information. The domain data is indexed in this centralized registry system for quick discoverability. Finally, for data to be congruous across domains, data mesh focuses on delivering interoperability and standards for addressability between domain datasets in a polyglot ecosystem.
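A centralized discovery system of this kind can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `DatasetEntry` and `DataCatalog` classes, the `domain/name` address format, and the sample entries are all hypothetical and are not taken from any specific data mesh product.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class DatasetEntry:
    """What a team can learn about a dataset without reading the data itself."""
    domain: str                      # owning business domain
    name: str
    owner: str                       # data owner
    origin: str                      # point of origin
    metadata: Dict[str, str] = field(default_factory=dict)   # column -> type
    sample: List[Dict[str, Any]] = field(default_factory=list)

    @property
    def address(self) -> str:
        # Hypothetical addressing standard shared by every domain.
        return f"{self.domain}/{self.name}"


class DataCatalog:
    """Central registry that indexes domain datasets for discovery."""

    def __init__(self) -> None:
        self._index: Dict[str, DatasetEntry] = {}

    def register(self, entry: DatasetEntry) -> None:
        if entry.address in self._index:
            raise ValueError(f"{entry.address} is already registered")
        self._index[entry.address] = entry

    def lookup(self, address: str) -> DatasetEntry:
        return self._index[address]

    def search(self, keyword: str) -> List[str]:
        """Return addresses whose name or column names contain the keyword."""
        needle = keyword.lower()
        return sorted(
            address
            for address, entry in self._index.items()
            if needle in entry.name.lower()
            or any(needle in column.lower() for column in entry.metadata)
        )


catalog = DataCatalog()
catalog.register(DatasetEntry(
    domain="sales", name="orders", owner="sales-data-team",
    origin="order-service",
    metadata={"order_id": "int", "customer_id": "int"},
    sample=[{"order_id": 1, "customer_id": 7}],
))
catalog.register(DatasetEntry(
    domain="support", name="tickets", owner="support-analytics",
    origin="helpdesk-export",
    metadata={"ticket_id": "int", "customer_id": "int"},
))

shared_key_datasets = catalog.search("customer")
```

Searching for "customer" surfaces both domain datasets because each exposes a `customer_id` column, which is the kind of cross-domain discoverability the catalog is meant to provide. The data itself stays with the owning domain; only descriptive information is indexed centrally.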
2020 Trend #2: Digital twins at the edge lead to real-time analytics
A digital twin is a digital facsimile or virtualized simulation of a physical object or system, built from telemetry data collected by sensors and assembled with modeling software such as computer-aided design (CAD). Digital twins can mimic the behavior of an automobile, industrial device, or human, and can be combined to mimic an engineering operation. With the emergence of technologies such as AI, ML, cognitive computing, and the Industrial Internet of Things (IIoT), digital twin technology is creating unparalleled possibilities that lead to innovative business concepts.
Digital twins can be integrated to form an intricate real-world setting while allowing companies to connect with individual digital twins that cater to an internal and/or external mechanism of an operation. Dynamic and historical data generated from IoT sensors offer insights about actual industrial operation and environment through real-time data feeds that can be leveraged by IoT applications (such as edge computing) later.
Digital twins are expected to respond as accurately as their physical counterparts. Live data generated and processed at the local device level by edge computing enables latency-sensitive operations. Moving digital twins from the cloud to the outer boundaries of the network enables low latency, real-time analytics, data privacy, detection of operational anomalies, and failure prediction. Digital twins, complemented by edge computing, enabled by affordable sensor technology, and augmented by greater computational capabilities, help companies accelerate the product development process, boost efficiency, reduce cloud storage costs, and build a comprehensive portfolio of products.
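To make the edge scenario concrete, here is a minimal Python sketch of a twin that mirrors a device's latest reading and flags unusual values locally, without a round trip to the cloud. The `PumpTwin` class, the rolling-window rule (a reading more than three standard deviations from the recent mean), and the temperature values are all assumptions chosen for illustration; real twins typically rely on far richer physical or ML models.

```python
from collections import deque
from statistics import mean, pstdev
from typing import Deque, Optional


class PumpTwin:
    """Hypothetical edge-side twin of a pump, fed by a temperature sensor."""

    def __init__(self, window: int = 5, threshold: float = 3.0) -> None:
        self._recent: Deque[float] = deque(maxlen=window)
        self._threshold = threshold
        self.temperature_c: Optional[float] = None  # mirrored device state

    def ingest(self, temperature_c: float) -> bool:
        """Update the twin and return True if the reading looks anomalous."""
        anomalous = False
        # Only judge a reading once the window holds enough history.
        if len(self._recent) == self._recent.maxlen:
            mu = mean(self._recent)
            sigma = pstdev(self._recent)
            anomalous = sigma > 0 and abs(temperature_c - mu) > self._threshold * sigma
        self.temperature_c = temperature_c
        self._recent.append(temperature_c)
        return anomalous


twin = PumpTwin()
readings = [70.0, 70.5, 69.8, 70.2, 70.1, 70.3, 95.0]
flags = [twin.ingest(value) for value in readings]
```

In this run only the final reading is flagged, because it sits far outside the range of the preceding window. Because the check runs next to the sensor, only the flagged events (rather than every raw reading) would need to be sent upstream, which is one way edge processing can reduce cloud storage costs.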
2020 Trend #3: Multicloud provides best-in-class solutions
Multicloud involves using cloud services from multiple public cloud managed service providers (MSPs) in a single network architecture to attain the optimal mix of low latency, cost, and other key metrics.
Multicloud adoption was initially driven by availability and performance as well as avoidance of vendor lock-in so organizations could benefit from best-in-class solutions. These days, companies look to cloud MSPs to support services for better security and failover, to meet data governance requirements, and to avoid downtime.
Another factor stimulating the adoption of a multicloud strategy is data fabric. Data fabric integrates disparate data in real time across multiple public cloud platforms. For example, a company using the services of multiple cloud platforms can run one application in Azure and a different application in AWS, with workloads dispersed across several cloud networks. A multicloud strategy also enhances reliability through better disaster recovery by backing up data on more than one cloud provider.
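The disaster recovery benefit can be sketched as follows. This is a simplified illustration, not a reference implementation: `InMemoryCloud` is a hypothetical stand-in for a provider's object storage, and no real cloud SDK calls are used. A production version would call each provider's own storage client behind the same `put`/`get` interface.

```python
from typing import Dict, List


class InMemoryCloud:
    """Hypothetical stand-in for one provider's object storage."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.available = True
        self._objects: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        if not self.available:
            raise ConnectionError(f"{self.name} is unavailable")
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        if not self.available:
            raise ConnectionError(f"{self.name} is unavailable")
        return self._objects[key]


class MultiCloudBackup:
    """Writes every backup to all providers; restores from the first that answers."""

    def __init__(self, providers: List[InMemoryCloud]) -> None:
        if len(providers) < 2:
            raise ValueError("multicloud backup needs at least two providers")
        self._providers = providers

    def backup(self, key: str, data: bytes) -> List[str]:
        stored_on = []
        for provider in self._providers:
            try:
                provider.put(key, data)
                stored_on.append(provider.name)
            except ConnectionError:
                continue  # keep going; other providers may still succeed
        if not stored_on:
            raise RuntimeError("backup failed on every provider")
        return stored_on

    def restore(self, key: str) -> bytes:
        for provider in self._providers:
            try:
                return provider.get(key)
            except (ConnectionError, KeyError):
                continue  # fail over to the next provider
        raise LookupError(f"{key!r} is not recoverable from any provider")


primary = InMemoryCloud("provider-a")
secondary = InMemoryCloud("provider-b")
backups = MultiCloudBackup([primary, secondary])

stored_on = backups.backup("orders-2020-01.csv", b"order_id,amount\n1,42.0\n")
primary.available = False          # simulate an outage at the first provider
recovered = backups.restore("orders-2020-01.csv")
```

Even with the first provider offline, the restore succeeds from the second copy. The trade-off is that every backup is stored (and paid for) more than once, so the set of providers and the data chosen for replication would normally be decided by policy.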
Finally, companies leveraging a multicloud architecture need a centralized governance policy to avoid unauthorized instances of cloud services. Cloud services should be governed by the company's IT department, not by line-of-business employees, because the latter model leads to shadow IT and can drive up cloud costs.
Closing Thoughts for the New Year
It is clear from these three trends that large volumes of siloed data are the underlying problem, which has created a need for businesses to find better ways to access, store, and process high-volume data for latency-sensitive applications and swift business solutions. Analysts, data scientists, and business users will adopt these data-powered technologies and bring them into the mainstream in 2020.
Ravi Shankar is the chief marketing officer at Denodo, a provider of data virtualization software. Ravi brings to his role more than 25 years of proven marketing leadership and product management, business development, and software development expertise with enterprise software leaders such as Oracle, Informatica, and Siperian. You can contact the author at firstname.lastname@example.org.