Enterprise Data Architecture Trends for 2019
The coming year will be one of big change in enterprise data architecture. Here are the trends you should build into your plans and expectations now.
- By William McKnight
- December 20, 2018
The world of data is rapidly changing. Data is the prime foundational component of any meaningful corporate initiative. How to manage this prime asset is a key decision that competitive organizations make continually. Incorporating new information into this process is required, and tradeoffs must be considered.
A large part of this change can be traced to the integration of the cloud into data architecture-related products. Cloud use has become paramount to corporate efficiency, and the solutions that are tightly integrated with the cloud provide the most value.
It is a fascinating, fast-moving time for data architecture. The two key drivers of the market today are the explosive growth of data science and cloud computing. The mix includes preparing data for artificial intelligence and machine learning, which cannot happen without cloud computing.
However, this is only the beginning of the journey toward data engineering, which embodies what is to come in enterprise data architecture in 2019.
Operational Analytics
The line between the operational and analytics realms in the organization is blurring. Although most personnel in 2019 will still identify their projects with one or the other, many will begin to realize that the distinction does not matter to their application. Not only will the hard distinction between major silos be removed, it will also need to be removed from intrasystem flows. Analytics needs to flow throughout the ecosystem, and it is the enterprise data ecosystem as a whole that organizations will increasingly consider in 2019.
Data Warehousing Morphs
Data warehousing is still the face of reporting in most organizations. It is still where we find the most data investment "bang for the buck." However, the ingestion tier is dramatically changing.
Some future processing is going to occur in a cloud storage tier that excludes the data warehouse. The data moving to the data warehouse will be a subset of data lake data, and as long as the cloud storage data lake exists, there will still be a strong data warehouse in the mix. Gone are ETL tools for moving this data; fast languages (such as Python) and Spark will take up the data warehouse load cycles.
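To make the idea of moving only a subset of lake data into the warehouse concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a description of any particular product: the raw events, the `orders` table, and the function names are all hypothetical, and an in-memory SQLite database stands in for a real data warehouse.

```python
import json
import sqlite3

# Hypothetical raw events as they might sit, unmodeled, in a data lake
# (one JSON document per line). The last line is deliberately malformed.
RAW_LAKE_LINES = [
    '{"event_id": 1, "customer": "acme", "type": "order", "amount": 120.0}',
    '{"event_id": 2, "customer": "acme", "type": "page_view"}',
    '{"event_id": 3, "customer": "globex", "type": "order", "amount": 75.5}',
    'not valid json',
]


def select_for_warehouse(lines):
    """Yield only well-formed order events; everything else stays in the lake."""
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # Skip malformed records rather than failing the load.
        if record.get("type") == "order" and "amount" in record:
            yield (record["event_id"], record["customer"], float(record["amount"]))


def load_orders(conn, rows):
    """Load the selected subset into a modeled table (SQLite as a stand-in)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(event_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT OR REPLACE INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


def run_pipeline():
    """Run the whole load cycle and return what landed in the warehouse."""
    conn = sqlite3.connect(":memory:")
    load_orders(conn, select_for_warehouse(RAW_LAKE_LINES))
    return conn.execute(
        "SELECT customer, amount FROM orders ORDER BY event_id"
    ).fetchall()


if __name__ == "__main__":
    print(run_pipeline())
```

Only the two order events reach the modeled table; the page view and the malformed line remain in the lake for other uses, such as data science work.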
The elegant data warehouse architecture is columnar and uses a considerable amount of memory. It's in the cloud for sure, and it utilizes all the benefits of the cloud. Rapid provisioning, elastic scalability, and the separation of compute and storage will be givens for major data warehouse activity in 2019.
Data Team Dynamics
What has happened to IT lately has been pretty dramatic. With the uptake of interest in data and the accessibility of the cloud, business departments have clearly staked a claim in building their architectures. What hasn't changed is that they still need dedicated technology professionals to do the work. The notion of an "IT professional" is alive and well, although the reporting structure is more complicated than ever.
Organizations are beginning to respond to the resulting talent shortage by creating divisions of responsibility for each level. There is still room for some centralized functions, such as overseeing the enterprise data ecosystem mentioned above, so centralized IT is still in the mix. In 2019, however, companies will finally acknowledge in their organization charts the need for data deployments to be near the business unit.
Furthermore, strategists and implementors are seeing a reduction in the challenges posed by internal friction and resistance to change. Dependence on certain individuals is lessened with the cloud, and in 2019, many will declare their organization unshackled from resistance to progress. We'll see an acceleration of acceptance and some challenging personnel moments inside the data apparatus in organizations.
Rise of the Data Lake and Cloud Storage
Whether the goal is to take data ingestion cycles off the ETL tool and the data warehouse, as noted, or to facilitate competitive data science and algorithm building in the organization, a place for vast, unmodeled, pre-data-warehouse data will be provisioned widely in 2019.
The Hadoop experiment warmed us up to the idea that something other than a relational database could be relevant to data. Organizations have weighed in that the data lake will be built in cloud object storage: Amazon S3, Google Cloud Storage, or Microsoft Azure Blob Storage.
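Object stores address data by key rather than by table, so a data lake's organization often lives in its key naming. The sketch below shows one possible convention in Python. The `zone/source=.../dt=.../part-N.json` layout and both function names are assumptions made for illustration; they are not a standard, and no cloud API is called.

```python
from datetime import date


def lake_object_key(zone, source, event_date, part):
    """Build a partition-style object key (hypothetical naming convention)."""
    return (
        f"{zone}/source={source}/dt={event_date.isoformat()}/part-{part:05d}.json"
    )


def keys_for_day(keys, source, event_date):
    """Select the raw-zone keys for one source and day by matching a key prefix."""
    prefix = f"raw/source={source}/dt={event_date.isoformat()}/"
    return [key for key in keys if key.startswith(prefix)]


if __name__ == "__main__":
    keys = [
        lake_object_key("raw", "crm", date(2019, 1, 15), 0),
        lake_object_key("raw", "crm", date(2019, 1, 16), 0),
        lake_object_key("raw", "web", date(2019, 1, 15), 0),
    ]
    print(keys_for_day(keys, "crm", date(2019, 1, 15)))
```

Putting the source and date into the key lets a load job read a single day's data by prefix instead of scanning the whole lake.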
Data Manipulation Tooling
With the rise of the data lake, new cloud storage toolsets will emerge in 2019 that are more user friendly than the current set.
In general, data access needs to accommodate media with more rapid turnaround, at the new speed of business. This will include voice to some degree (more so after 2019), which will open up the requirements for data access exponentially. Imagine trying to answer "Show me all of William's touchpoints for the last 3 years, highlighting the ones that really matter." This is largely a pipe dream today. You had better have all your data in a data lake for this one!
Although it's easy to suggest dramatic change across the board, as I have, I don't do so casually. I'm in the real world with my clients, and I've seen the pace of change pick up over the last few years. Businesses will compete and succeed or fail based on their data programs, which suggests lots of activity and change.
Buckle up!
About the Author
McKnight Consulting Group is led by William McKnight. He serves as strategist, lead enterprise information architect, and program manager for sites worldwide utilizing the disciplines of data warehousing, master data management, business intelligence, and big data. Many of his clients have gone public with their success stories. McKnight has published hundreds of articles and white papers and given hundreds of international keynotes and public seminars. His teams’ implementations from both IT and consultant positions have won awards for best practices. William is a former IT VP of a Fortune 50 company and a former engineer of DB2 at IBM, and holds an MBA. He is author of the book Information Management: Strategies for Gaining a Competitive Advantage with Data.