2019 Industry Predictions for Data Professionals
As 2019 begins, here are key trends and themes for data and knowledge management professionals to watch.
- By Sean Martin
- January 7, 2019
During 2018, companies began to learn that they can monetize far more of their existing data and information assets by deploying enterprise-grade information fabrics to catalog them and simplify access. As predicted, 2018 was also the "Year of the Graph," with many new graph products. Enterprises became increasingly comfortable deploying solutions based on Knowledge Graph technologies.
As we begin 2019, what key themes and trends will emerge across the industry? Here are my three predictions for data and knowledge management professionals in this new year.
Graph Databases Will Deliver Increased Value to Enterprise Information Fabrics
Graph databases focus on connecting data so that it can be accessed through its relationships to other data. They provide the means to harmonize two or more datasets into a single view, which lets us ask questions across the data sets as a whole rather than against siloed source parts.
Data sourcing and preparation is a huge bottleneck for analytics. It is even more so for new data-science and machine-learning initiatives. Today, the majority of data generated is still siloed in both structured and unstructured formats. This year we will see companies and data scientists alike unite this disparate data -- not by breaking down the silos -- but by making all of it far more accessible and understandable through a unifying data fabric overlay, built on graph technologies.
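A toy sketch of this unifying-overlay idea, using networkx as a stand-in for an enterprise graph engine (the datasets, field names, and library choice here are illustrative assumptions, not anything named in the article): two siloed record sets are linked through a shared customer id, and a question can then be asked across both at once.

```python
import networkx as nx

# Two hypothetical siloed datasets that share a customer identifier.
crm = [("c1", "Acme", "EMEA"), ("c2", "Globex", "APAC")]        # id, name, region
tickets = [("t1", "c1", "open"), ("t2", "c2", "closed")]        # id, customer, status

# Overlay both silos onto one graph, linking tickets to customers.
G = nx.DiGraph()
for cid, name, region in crm:
    G.add_node(cid, kind="customer", name=name, region=region)
for tid, cid, status in tickets:
    G.add_node(tid, kind="ticket", status=status)
    G.add_edge(tid, cid, rel="raised_by")

# A cross-silo question: which EMEA customers have open tickets?
answer = sorted(
    G.nodes[cid]["name"]
    for tid, cid in G.edges
    if G.nodes[tid].get("status") == "open"
    and G.nodes[cid].get("region") == "EMEA"
)
print(answer)  # ['Acme']
```

Neither silo is physically merged or broken down; the graph layer simply makes the relationship between them queryable.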
Many Fortune 500 companies have already started these initiatives as they work to catalog and interconnect their data from separate sources as a knowledge graph, making use of it wherever it may reside. Data-intensive industries -- especially oil and gas, financial services, life sciences, and even government entities -- stand to benefit the most from this new approach to the democratization of data.
Graph Databases Will Power Machine Learning for Data Engineering
If 2018 was truly the "Year of the Graph," then 2019 may easily become the "Year When Machine Learning Meets Graph." Machine learning continues to revolutionize the processes of data analysis and software engineering, but there is an increasingly strong interest in how graph technology can directly assist and speed up this movement.
New graph-based tools for data discovery, harmonization and prep -- think of knowledge graphs and automatic query generation against them, for example -- are removing the data-related roadblocks to achieving success with machine learning initiatives. They can greatly accelerate much of the manual work that data scientists otherwise need to do. Graphs provide the means to train machine learning algorithms against broader accumulations of data, ensuring that these intelligent algorithms find wider application, faster. For machine learning to live up to the "hype" of usefulness that it enjoyed in 2018, vendors will turn to graph-based data discovery tools in 2019 that are capable of more quickly driving feature engineering for training useful machine learning models.
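As a minimal illustration of graph-driven feature engineering (networkx and the toy graph below are my assumptions for the sketch, not tools the article names), structural properties of each node can be computed directly from the graph and handed to a model as training features, replacing work a data scientist would otherwise do by hand against flat tables.

```python
import networkx as nx

# Hypothetical small graph standing in for an enterprise knowledge graph.
G = nx.Graph()
G.add_edges_from([("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")])

# Structural properties of each node become candidate ML features.
pagerank = nx.pagerank(G)  # global importance of each node in the graph
features = {
    n: {"degree": G.degree(n), "pagerank": pagerank[n]}
    for n in G.nodes
}
```

Node "c" is the best connected here, so its degree and PageRank scores are highest; a downstream classifier or regressor would receive such per-node feature vectors as its inputs.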
Increased Use of Multi-Cloud and Hybrid-Cloud Architecture to Prevent Vendor Lock-in
As new cloud-as-infrastructure services such as Amazon Web Services, Microsoft Azure, and Google Compute Engine have enjoyed wide adoption, enterprises in turn have begun to understand how this approach can fundamentally shift how they buy computing infrastructure. The cloud providers all started with proprietary APIs for tapping their respective infrastructure, but this is starting to change. Initial experimentation with cloud infrastructure has been wildly successful for most businesses, but they are now beginning to think carefully about how adopting this approach will affect them in the long run.
Businesses have little interest in being tied to any one of the giant cloud vendors through proprietary APIs and vendor-specific infrastructure: doing so means losing control of their IT costs, and it becomes expensive down the road whether they stay or move.
The cloud infrastructure vendors have exactly the opposite problem: to avoid what they sell becoming commoditized, they are working hard to quickly provide differentiated, non-commodity services, become "sticky," and achieve a degree of customer lock-in. Both sides know that once an IT system is up and running successfully, it is both risky and expensive to change.
In recent months we have seen a significant uptick in interest in both multivendor cloud strategies and hybrid-cloud technologies. Hybrid cloud is an IT architecture that allows businesses to seamlessly shift applications and workloads between infrastructure the customer owns on premises, behind its own firewalls, and any of the public cloud services. A dramatic expression of this was IBM's $34 billion acquisition of Red Hat, which lays the foundation for how we'll see cloud vendors move toward more open-source and open-standards software and cloud service technologies in 2019.
In 2019, I anticipate that we'll see far more action with customers insisting on multicloud and multivendor support. Cloud customers will increasingly expect to seamlessly move their workloads between cloud vendors. This means that customers will be embracing a number of new abstractions, the most important of which is Kubernetes, "a portable, extensible open-source platform for managing containerized workloads and services, which facilitates both declarative configuration and automation."
Launched in 2014 by Google, Kubernetes boasts a rapidly growing ecosystem that orchestrates computing, networking, and storage for user workloads. This common API for managing containers, applications, and microservices ensures that customers can manage multiple IT services and workflows across multiple clouds at the same time.
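To make the "declarative configuration" Kubernetes provides concrete, a minimal Deployment manifest might look like the sketch below (the application name, image, and replica count are placeholders of my own, not details from the article); the same manifest can be applied unchanged to a conformant cluster on any of the major clouds or on premises.

```yaml
# Illustrative Kubernetes Deployment: the desired state is declared,
# and the cluster continuously works to match it.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                  # placeholder application name
spec:
  replicas: 3                # desired number of identical pods
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: nginx:1.15    # placeholder container image
        ports:
        - containerPort: 80
```

Because the manifest describes what should run rather than how to run it on a particular provider, it is one of the abstractions that lets workloads move between cloud vendors.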
Can We Declare 2019 the Year of the Data Fabric?
Whether we see 2019 emerge as the "Year of the Data Fabric" will depend on market developments and what other key technologies emerge. Nevertheless, enterprise information architects, chief data officers, and other key IT decision makers should be aware of the importance of data fabrics, the impact of graph databases on machine learning initiatives, and the exciting possibilities that multiple and hybrid clouds hold as they execute their business plans and strategies throughout the coming year. Here's to an exciting road ahead in 2019!
Sean Martin serves as chief technology officer at Cambridge Semantics. In his career, Martin has pioneered the use of semantic technologies and enterprise knowledge graphs to solve data integration and application development problems. Prior to founding Cambridge Semantics in 2007, he spent 15 years with IBM Corporation, where he was a founder of the IBM Advanced Internet Technology Skunkworks group.