Best Practices to Modernize Your Data Management
Does your data management strategy need an upgrade? These five principles can help you get started.
- By Troy Abraham
- June 21, 2021
Data management platforms have changed dramatically over the past decade. They now offer massive improvements in performance, scalability, reliability, and security. Automated, fully managed SaaS infrastructure frees data teams from the time-consuming ETL administration and maintenance required by legacy platforms. Organizations also benefit from lower costs and consumption-based pricing models, which make it easy to adopt data infrastructure at scale or by use case.
These five tenets will help you start crafting a more modern, holistic, and impactful data management strategy to take full advantage of modern data platforms:
1. Know your data sources and uses
Don't assume that the CIO is the sole gatekeeper for -- or is even aware of -- all the data sources, flows, and uses. Data is created not only by your employees but also by customers, partners, and digital processes, through both internal and external systems. Everyone now has some stake in data creation, curation, and consumption. Assume that new data sources and use cases will be added constantly, and plan for them.
2. Think about automation carefully
Understand the many roles in your data stack and the ways you can leverage this stack to avoid manual data management methods whenever possible. Analysis by Andreessen Horowitz posits three different data architectures or blueprints, all of which have numerous moving parts and complex interdependencies that will take time to fully understand and incorporate into your data management strategy.
From the get-go, keep the eventual goal in mind: a modular design that can automatically connect to and structure a variety of data sources and analysis methods.
Taking a modular approach is critical to providing dynamic, reliable, and automated data pipelines that deliver consistent, fit-for-purpose data outputs. This approach lets data teams keep pace with the constantly changing data consumption needs of the business rather than spend the majority of their time maintaining brittle data pipelines.
Yet this approach is lacking in most organizations. In a study conducted by Dimensional Research of more than 500 data professionals from midsize and large enterprises, almost everyone recognized they had numerous challenges when building pipelines, which often take weeks to complete, require multiple tools, and rely heavily on scripting.
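One way to picture the modular approach described above is a pipeline in which sources and transformation steps are registered as interchangeable parts. The following is a minimal sketch, not a description of any particular product: the `Pipeline` class, the `crm_source` and `billing_source` functions, and the sample records are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List

Record = Dict[str, object]
Source = Callable[[], Iterable[Record]]
Transform = Callable[[Record], Record]


@dataclass
class Pipeline:
    """Hypothetical modular pipeline: sources and transforms are pluggable."""

    sources: Dict[str, Source] = field(default_factory=dict)
    transforms: List[Transform] = field(default_factory=list)

    def add_source(self, name: str, source: Source) -> None:
        # Adding a source does not require touching existing ones.
        self.sources[name] = source

    def add_transform(self, transform: Transform) -> None:
        # Transforms are applied to every record, in registration order.
        self.transforms.append(transform)

    def run(self) -> List[Record]:
        output: List[Record] = []
        for name, source in self.sources.items():
            for record in source():
                # Tag each record with the source it came from.
                row = dict(record, _source=name)
                for transform in self.transforms:
                    row = transform(row)
                output.append(row)
        return output


# Illustrative stand-ins for real connectors (assumed, not from the article).
def crm_source() -> List[Record]:
    return [{"Email": "A@Example.com", "plan": "pro"}]


def billing_source() -> List[Record]:
    return [{"Email": "b@example.com", "amount": 120}]


def normalize_keys(row: Record) -> Record:
    # Lowercase field names so every source uses the same schema.
    return {key.lower(): value for key, value in row.items()}


def normalize_email(row: Record) -> Record:
    # Lowercase email values so records can be matched across sources.
    row = dict(row)
    if "email" in row:
        row["email"] = str(row["email"]).lower()
    return row


pipeline = Pipeline()
pipeline.add_source("crm", crm_source)
pipeline.add_source("billing", billing_source)
pipeline.add_transform(normalize_keys)
pipeline.add_transform(normalize_email)
rows = pipeline.run()
```

In a design like this, supporting a new data source means registering one more function rather than rewriting the pipeline, which is the property that makes pipelines less brittle.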
3. Plan to scale up
Architect and manage for scale from the beginning. We know that data volumes will continue to grow, both in raw size and in the variety of sources. Two things drive this growth: specialized applications and the staff, partners, and customers who use them. As the number of siloed or loosely coupled applications grows, the amount of data will increase exponentially. Analyzing this data also becomes a more universal activity, spread across a larger pool of front-line workers. As new data becomes available, deeper insights should naturally result, fueling data-driven, revenue-impacting business decisions.
Scaling up used to be an issue back when we had to buy racks of gear to sit in our raised-floor data centers. Now that many businesses use cloud-based resources, scaling isn't as great an issue, but you still have to understand the implications of adding (or subtracting) 10 or 1,000 times the storage, processing power, and network traffic to your overall architecture and process flows.
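To make the implications of a 10x or 1,000x change concrete, a rough sizing exercise can help. The sketch below is a back-of-the-envelope model only: the `Footprint` class, the baseline figures, and the unit rates are all made-up assumptions, and it assumes that storage, compute, and network traffic scale linearly with the growth factor, which real workloads often do not.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Footprint:
    """Hypothetical resource footprint of a data platform."""

    storage_tb: float
    daily_compute_hours: float
    daily_network_tb: float

    def scaled(self, factor: float) -> "Footprint":
        # Assumption: every resource grows linearly with the factor.
        return Footprint(
            self.storage_tb * factor,
            self.daily_compute_hours * factor,
            self.daily_network_tb * factor,
        )

    def monthly_cost(self, rates: Dict[str, float], days: int = 30) -> float:
        # Storage is billed per TB-month; compute and network per day of use.
        return (
            self.storage_tb * rates["storage_tb_month"]
            + self.daily_compute_hours * days * rates["compute_hour"]
            + self.daily_network_tb * days * rates["network_tb"]
        )


# Illustrative numbers only; substitute your own measurements and pricing.
baseline = Footprint(storage_tb=5.0, daily_compute_hours=40.0, daily_network_tb=0.5)
rates = {"storage_tb_month": 20.0, "compute_hour": 2.0, "network_tb": 50.0}

estimates = {
    factor: baseline.scaled(factor).monthly_cost(rates)
    for factor in (1, 10, 1000)
}

for factor, cost in estimates.items():
    print(f"{factor:>5}x -> estimated monthly cost: {cost:,.0f}")
```

Even a crude model like this shows which resource dominates the bill at each scale and where the architecture or process flows might need to change before growth arrives.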
4. Consider the desired business impact of your data
Who are the right stakeholders, and how do they want to integrate and consume data? It is important to empower what we now call citizen data integrators -- end users who have no programming background. They have two primary usage scenarios: 1) highly productionized, timely, and accurate reporting that represents an enterprise source of truth and 2) dynamic and ongoing business discovery through self-service ad hoc analysis.
Supporting both scenarios requires a balance between highly standardized, centralized data management capabilities and decentralized, end-user-driven data analysis. Modern data management solutions and processes must provide both in order to drive maximum business impact from your data and to enable highly leveraged data teams that consistently meet the demands of the business.
5. Incorporate the concepts of data governance, stewardship, and control
These three concepts should become intrinsic to your data management solution. That means having well-defined but dynamic processes and user roles, along with an understanding of the responsibilities and expectations that today's complex data environments require. Along with mapping the value stream of data from sources to use, these approaches help eliminate waste in the data pipeline, enable businesses to be more responsive, and allow for a greater focus on better data stewardship.
A Final Word
The key to making all five of these principles work is ensuring that your strategy, tools, and processes enable dynamic responsiveness to constantly changing data requirements. This means your data management strategy has to help rather than hinder your entire organization as it seeks to become data-driven.
Troy Abraham is the global VP of the customer solutions group at Fivetran, where he is responsible for leading integrated teams across sales engineering and solution architecture.