Comprehensive and Agile End-to-End Data Management
The trend toward integrated platforms of multiple tools and functions enables broader designs and practices that satisfy new requirements.
By Philip Russom, Senior Research Director for Data Management, TDWI
Earlier this week, I spoke in a webinar run by Informatica Corporation and moderated by Informatica’s Roger Nolan. I talked about trends in user practices and vendor tools that are leading us toward what I call end-to-end (E2E) data management (DM). My talk was based on three assumptions:
- Data is diversifying into many structures from new and diverse sources.
- Business wants to diversify analytics and other data-driven practices.
- End-to-end data management can cope with the diversification of data, analytics, and business requirements in a comprehensive and agile manner.
In our webinar, we answered a number of questions pertinent to comprehensive and agile end-to-end data management. Allow me to summarize some of the answers for you:
What is end-to-end (E2E) data management (DM)?
End-to-end data management is one way to adapt to data’s new requirements. In this context, “end-to-end” has multiple meanings:
End-to-end DM functions. Today’s diverse data needs diverse functions for data integration, quality, profiling, event processing, replication, data sync, MDM, and more.
End-to-end tool platform. Diverse DM functions (and their user best practices) must be enabled by a portfolio of many tools, which are unified in a single integrated platform.
End-to-end agility. With a rich set of DM functions in one integrated toolset, developers can very quickly on-board data, profile it, and iteratively prototype, in the spirit of today’s agile methods.
End-to-end DM solutions. With multiple tools integrated in one platform, users can design single solutions that bring to bear multiple DM disciplines.
End-to-end range of use cases. With a feature-rich tool platform and equally diverse user skills, organizations can build solutions for diverse use cases, including data warehousing, analytics, data migrations, and data sync across applications.
End-to-end data governance. When all or most DM functions flow through one platform, governance, stewardship, compliance, and data standards are greatly simplified.
End-to-end enterprise scope. End-to-end DM draws a big picture that enables the design and maintenance of enterprise-scope data architecture and DM infrastructure.
What is the point of E2E DM?
End-to-end (E2E) data management (DM) is all about being comprehensive and agile:
- Comprehensive -- All data management functions are integrated for development and deployment, with extras for diverse data structures and business-to-DM collaboration.
- Agile -- Developers can very quickly on-board diverse data and profile it, and both business and technical people can iteratively prototype and collaborate, in today’s agile spirit.
What’s an integrated tool platform? What’s it for?
An integrated platform supports many DM tool types, but with tight integration across them. The end-to-end functionality seen in an integrated DM platform typically has a data integration and/or data quality tool at its core, with additional tools for master data management, metadata management, stewardship, changed data capture, replication, event processing, data exchange, data profiling, and so on.
An integrated platform supports modern DM architectures. For example, the old way of architecting a DM solution is to create a plague of small jobs, then integrate and deploy them via scheduling. The new way (which requires an integrated toolset) architects fewer but more complex solutions, where a single data flow calls many different tools and DM functions in a controlled and feature-rich fashion.
An integrated tool platform supports many, diverse use cases. Furthermore, the multiple integrated tools of the end-to-end platform support the agile reuse of people, skills, and development artifacts across use cases. Important use cases include: data warehousing, analytics, application modernization, data migration, complete customer views, right-time data, and real-time data warehousing.
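To make the "single data flow" idea concrete, here is a minimal sketch in Python. It is not taken from any vendor tool: the step names (`standardize`, `deduplicate`), the record fields, and the `run_flow` helper are all hypothetical, chosen only to show one flow calling several DM functions in sequence instead of scheduling many separate jobs.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]
Step = Callable[[List[Record]], List[Record]]


def standardize(records: List[Record]) -> List[Record]:
    # Data quality step (illustrative): trim whitespace and normalize name casing.
    return [{**r, "name": str(r["name"]).strip().title()} for r in records]


def deduplicate(records: List[Record]) -> List[Record]:
    # Greatly simplified master-data step: keep the first record per customer_id.
    seen = set()
    unique = []
    for r in records:
        if r["customer_id"] not in seen:
            seen.add(r["customer_id"])
            unique.append(r)
    return unique


def run_flow(records: List[Record], steps: List[Step]) -> List[Record]:
    # One data flow: each step receives the output of the previous one.
    for step in steps:
        records = step(records)
    return records


raw = [
    {"customer_id": 1, "name": "  ada lovelace "},
    {"customer_id": 1, "name": "Ada Lovelace"},
    {"customer_id": 2, "name": "alan turing"},
]
clean = run_flow(raw, [standardize, deduplicate])
```

In a real integrated platform the steps would be tool invocations rather than local functions, but the shape is the same: one controlled flow, many disciplines.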
How does an integrated toolset empower agile methods?
Multiple data disciplines supported in one integrated toolset means that developers can design one data flow (instead of dozens of jobs) that includes operations for integration, quality, master data, federation, and more.
The reuse of development artifacts is far more likely with one integrated toolset than when working with tools from multiple vendors.
Daily collaboration between a business subject-matter expert and a technical developer is the hallmark of agile development; an integrated DM platform supports this.
Feature-rich metadata management propels the collaboration of a business person (acting as a data steward) and a data management professional, plus self-service for data.
Self-service data access and data prep presented in a visual environment (as seen in mature integrated toolsets) can likewise propel the early prototyping and iterative development assumed of agile methods.
Automated testing and data validation can accelerate development. Manual testing distracts from the true mission, which is to build custom DM solutions that support the business.
Develop once, deploy at any latency. Reuse development artifacts, but deploy them at the speed required by specific business processes, whether batch, trickle feed, or real time.
Reinventing the wheel bogs down development. Mature integrated toolsets include rich libraries of pre-built interfaces, mappings, and templates that plug and play to boost developer productivity and agility.
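The "develop once, deploy at any latency" point above can be sketched roughly as follows. This is an illustration under assumptions, not a description of how any particular platform works: `cleanse`, `run_batch`, and `run_streaming` are hypothetical names, and the email field is invented for the example. The idea is that the transformation logic is written once and then wrapped by different execution modes.

```python
from typing import Dict, Iterable, Iterator, List

Record = Dict[str, str]


def cleanse(record: Record) -> Record:
    # The reusable artifact: written once, independent of how it is deployed.
    return {**record, "email": record["email"].strip().lower()}


def run_batch(records: List[Record]) -> List[Record]:
    # Batch deployment: process a full extract in one pass.
    return [cleanse(r) for r in records]


def run_streaming(source: Iterable[Record]) -> Iterator[Record]:
    # Trickle-feed or real-time deployment: process records as they arrive.
    for record in source:
        yield cleanse(record)


rows = [{"email": " Ada@Example.COM "}, {"email": "alan@example.com"}]
batch_result = run_batch(rows)
stream_result = list(run_streaming(iter(rows)))
```

Because both modes call the same `cleanse` function, a fix or enhancement to the logic reaches every deployment without being re-implemented.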
What’s the role of self-service in agile development methods?
Self-service data access for business users. For example, think of a business person who also serves as a data steward and therefore needs to browse data. Or consider a business analyst who is capable of ad hoc queries, when given the right tools.
Data prep for business users, analytics, and agility. Users want to work fast and independently – at the speed of thought – without need for time-consuming data management development. To enable this new best practice, the tools and platforms that support self-service data access now also support data prep, which is a form of data integration, but trimmed down for reasons of agility, usability, and performance.
Self-service and data prep for technical users. For example, self-service data exploration can be a prelude to the detailed data profiling of new data. As another example, the modern, agile approach to requirements gathering involves a business person (perhaps a steward) and a data professional, working side-by-side to explore data and decide how best to get business value from the data.
What’s the role of metadata in self-service and agile functionality?
We need complete, trusted metadata to accomplish anything in DM. And DM is not agile when development time is burned up creating metadata. Hence, a comprehensive E2E DM platform must support multiple forms of metadata:
- Technical metadata – documents properties of data for integrity purposes. Required for computerized processes and their interfaces.
- Business metadata – describes data in ways biz people understand. Absolutely required for self-service data access, team collaboration, and development agility.
- Operational metadata – records access by users and apps. Provides an audit trail for assuring compliance, privacy, security, and governance relative to data.
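As a rough illustration of how the three forms of metadata listed above might sit side by side for a single data element, consider the following Python sketch. All class and field names here are hypothetical and deliberately simplified; real metadata repositories carry far more detail.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass
class TechnicalMetadata:
    # Properties that computerized processes and interfaces rely on.
    column: str
    data_type: str
    nullable: bool


@dataclass
class BusinessMetadata:
    # A description that business people can understand.
    business_name: str
    definition: str


@dataclass
class OperationalMetadata:
    # One access event, forming part of an audit trail.
    accessed_by: str
    accessed_at: datetime


@dataclass
class CatalogEntry:
    technical: TechnicalMetadata
    business: BusinessMetadata
    access_log: List[OperationalMetadata] = field(default_factory=list)

    def record_access(self, user: str) -> None:
        # Append an audit record each time a user or application reads the data.
        self.access_log.append(
            OperationalMetadata(user, datetime.now(timezone.utc))
        )


entry = CatalogEntry(
    TechnicalMetadata(column="cust_dob", data_type="DATE", nullable=True),
    BusinessMetadata("Customer Date of Birth", "Birth date as given at signup."),
)
entry.record_access("steward_jane")
```

Keeping all three in one catalog entry is what lets a steward search by business name, a job validate against the technical type, and an auditor review who accessed the data.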
If you’d like to hear more, please click here to replay the Informatica Webinar.
Posted on June 30, 2016