Adopting Agility for DataOps
Once you assemble a high-functioning DataOps team, it's time to establish agile mindsets, skill sets, and toolsets. Second part of a two-part series.
- By Mark Marinelli
- February 12, 2019
To create a data-driven organization, you first need the right DataOps team, a topic we covered in part 1 of this series. The next step is equally important. True transformation takes agile mindsets, skill sets, and toolsets.
The Agile Mindset
In the past 10 years, we've seen the emergence of "DevOps." It's an approach to agile software development that accelerates the build life cycle (formerly known as release engineering) using automation. DevOps focuses on continuous integration and continuous delivery of software by leveraging on-demand IT resources and by automating integration, testing, and deployment of code. This merging of software development and IT operations ("DEVelopment" and "OPerationS") reduces time to deployment, decreases time to market, minimizes defects, and shortens the time required to resolve issues.
A key aspect of embracing an agile mindset is for data engineering teams to start thinking of themselves not as technicians who move data from source A to report B but rather as software developers who employ agile development practices to rapidly build data applications. As such, teams of data engineers and data scientists must create a sister discipline -- data operations (DataOps) -- to address the needs of data professionals and transform organizations through data. DataOps is the extension of DevOps values and practices into the data analytics world.
The DevOps philosophy emphasizes seamless collaboration between developers, quality assurance teams, and IT ops admins. DataOps does the same for the admins and engineers who store, analyze, archive, and deliver data.
Encourage your DataOps team to work the way software engineers using DevOps methods do: build something that works and get feedback before moving forward. This way, the DataOps team learns from its mistakes early and often and pressure-tests its assumptions about the data with real-world usage. Be sure the team embraces agile practices, getting quick wins and iterating instead of engaging in longer-term projects that don't deliver value until they're "done."
The Agile Skill Set
Any DataOps team must include data scientists because today's analytics demands are far more sophisticated than what's possible with traditional reports and dashboards. It also must include data engineers, who know more about the business than the data source owners and more about the sources than the business consumers. Think of them as the bridge between business and functional teams.
The good news: filling these roles is getting easier. The skill set of the average data professional is expanding. More college students learn data analytics and data science, and more traditional information workers are building their skills -- adding data analytics and data science to their repertoires.
The organization and management of the work need to change along with the new approach. Teams acting on individual projects need ways to share tools and best practices as they engage in each new initiative. Projects four and five should benefit from what was learned in projects one, two, and three, and that won't happen without solid coordination built into the project management process.
The Agile Toolset
Aligning the right mindset and skills won't get you very far unless you've selected technologies that fuel a new way of working involving agility, scalability, automation, and collaboration.
People have been managing data for a long time, but we're at a point where the quantity, velocity, and variety of data available to a modern enterprise can no longer be managed without a significant change in the fundamental infrastructure. Organizations must manage thousands of sources that are not controlled centrally and that frequently change their schemas without notice. It's therefore crucial to choose best-in-class tools for each part of the data supply chain rather than traditional stack-vendor solutions that deliver pre-integrated but subpar technology.
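To make the "schema changes without notice" problem concrete, here is a minimal Python sketch of the kind of automated check a DataOps team might run on each incoming batch. The function name diff_schema, the column names, and the type labels are all hypothetical and chosen for illustration; a real deployment would more likely lean on the checks built into its catalog or ETL tool.

```python
from typing import Dict, List


def diff_schema(expected: Dict[str, str], observed: Dict[str, str]) -> Dict[str, List[str]]:
    """Compare an expected column -> type mapping with the one seen in a new batch.

    Hypothetical helper for illustration only; not part of any named product.
    """
    shared = set(expected) & set(observed)
    return {
        # Columns that appeared in the source without warning
        "added": sorted(set(observed) - set(expected)),
        # Columns the pipeline relies on that have disappeared
        "removed": sorted(set(expected) - set(observed)),
        # Columns still present but with a different declared type
        "retyped": sorted(col for col in shared if expected[col] != observed[col]),
    }


# Assumed example schemas: what the pipeline was built against vs. today's batch
expected = {"customer_id": "int", "email": "str", "signup_date": "date"}
observed = {"customer_id": "str", "email": "str", "region": "str"}

drift = diff_schema(expected, observed)
if any(drift.values()):
    print("Schema drift detected:", drift)
```

Running a check like this on every load, instead of discovering the change in a broken report, is one small way the automation habits of DevOps carry over to data.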
Everyone on the DataOps team needs at least one of the following tools to acquire, organize, prepare, and analyze/visualize data:
- Data access/ETL (e.g., Talend, StreamSets)
- Data catalog (e.g., Alation, Waterline)
- Enterprise data unification (e.g., Tamr)
- Data preparation (e.g., Alteryx, Trifacta)
- Data analysis and visualization (e.g., Tableau, Qlik)
All of these tools need to be interoperable by design. Seek out tools with strong API support and open data exchange because you'll need to combine them to build efficient data workflows. This comes at an initial cost, but it brings huge benefits in your ability to mix and match a data pipeline to meet your organization's unique needs. For example, you can replace individual components of the solution as better ones become available.
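The mix-and-match idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not any vendor's API: each stage is modeled as a plain function that takes and returns a list of records, and the stage names (normalize_email, dedupe_keep_first, dedupe_keep_last) and the sample data are hypothetical. Because every stage honors the same simple contract, one can be swapped for another without touching the rest of the pipeline.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
# The shared contract: every stage takes a list of records and returns a list of records
Stage = Callable[[List[Record]], List[Record]]


def run_pipeline(records: List[Record], stages: List[Stage]) -> List[Record]:
    """Pass the records through each stage in order."""
    for stage in stages:
        records = stage(records)
    return records


def normalize_email(records: List[Record]) -> List[Record]:
    """Preparation stage: trim whitespace and lowercase the email field."""
    return [{**r, "email": r["email"].strip().lower()} for r in records]


def dedupe_keep_first(records: List[Record]) -> List[Record]:
    """Unification stage, version 1: keep the first record seen per email."""
    seen = set()
    unique = []
    for r in records:
        if r["email"] not in seen:
            seen.add(r["email"])
            unique.append(r)
    return unique


def dedupe_keep_last(records: List[Record]) -> List[Record]:
    """Unification stage, version 2: keep the most recent record per email."""
    latest: Dict[str, Record] = {}
    for r in records:
        latest[r["email"]] = r
    return list(latest.values())


raw = [
    {"email": " Ann@Example.com ", "name": "Ann"},
    {"email": "ann@example.com", "name": "Ann B."},
    {"email": "bob@example.com", "name": "Bob"},
]

# Same pipeline, with one component replaced by a different implementation
v1 = run_pipeline(raw, [normalize_email, dedupe_keep_first])
v2 = run_pipeline(raw, [normalize_email, dedupe_keep_last])
```

In practice the contract between tools would be an API or an open data format rather than a Python function signature, but the payoff is the same: replacing the deduplication step changed nothing upstream or downstream of it.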
A Final Word
Today, it's not about why or when to start turning data into a strategic asset, but how to do it efficiently, effectively, and really, really quickly. It's time to get started by building your team and establishing agile practices, skills, and tools to guide them. Then find some challenging problems and start building the modern, DataOps-driven generation of compelling data applications to solve them. Move fast and fix things!
Mark Marinelli is head of product with Tamr, which builds innovative solutions to help enterprises unify and leverage their key data. A 20-year veteran of enterprise data management and analytics software, Mark has held engineering, product management, and technology strategy roles at Lucent Technologies, Macrovision, and most recently at Lavastorm, where he was chief technology officer.