A More Agile Approach to Data Quality

Agile and data quality could make for an especially organic fit. After all, DQ already has an agile ace in the hole: the data steward.

Agile software development and project management methods emphasize continuous business involvement from the get-go. So, too, should an agile data quality (DQ) program.

It sounds like a no-brainer. If you're doing agile anything, you have buy-in and backing from the line of business. In the case of an enterprise DQ project, however, it's easy to get sidetracked. This is in part because DQ demands a higher degree of rigor -- i.e., more planning, more coordination, and more centralized control -- than do other IT or quality improvement disciplines. Agile methods aren't incompatible with rigor; few things are more rigorous than a Scrum sprint, after all. What agile is typically inimical to is the kind of contract-negotiated rigor that used to define the software development status quo.

How does one do agile data quality? A good starting place might be Agile Data Quality: Speed, Flexibility, and Business Alignment for Improving Data, a new report written by Philip Russom, research director for data management (DM) with TDWI Research. Russom provides a checklist of action items for "doing" agile DQ.

When you think about it, agile and DQ could make for an especially organic fit. For one thing, data quality already has an agile ace in the hole: like most agile methodologies, DQ emphasizes constant interaction between IT and business stakeholders. It even delegates a specific role for this purpose: that of the data steward.

In an agile context, Russom writes, data stewards are line-of-business representatives "who know exactly what a department, business unit, or business process needs from data and its quality for business success." Data stewards are also charged with ensuring that the requisite people and process changes -- in many cases, the most difficult aspect of any DQ project -- "are put in place to reinforce the IT changes delivered," Russom writes.

The first step in any agile DQ effort should be to link agility -- i.e., the speed of development -- with business goals. Again, this might seem obvious, but IT organizations often fail to make the connection. In an agile context, it's easy to conflate the rapid development and maturation of a prototype -- e.g., the on-time production of deliverables; the clockwork regularity, measured in time-constrained intervals, of iteration itself -- with progress toward a business goal.

Agile isn't just about building things faster, Russom notes; it's about putting working prototypes into the hands of the line of business -- chiefly to solicit feedback and help identify, optimize, or narrow requirements. In addition, agile's rapid rate of iteration can help identify potentially disastrous failures early on, Russom points out: "If a DQ solution is inappropriate to users' requirements, it's best to know this early in the project while there's still time to regroup and make a course correction. Agile methods should embrace change based on user feedback, instead of pedantically following a plan."

The second step is to identify and eliminate "time sinks," which Russom says introduce harmful latency into the agile DQ process. The most common of these is communication, he says. "[M]any data management projects take 90 days, yet this sometimes equates to only a week of actual work. That's because of the unfortunate latency that results from bureaucratic communication channels that travel through multiple layers of management and across too many 'FYI only' team members," Russom writes.

It's in this respect that DQ's built-in advantage -- the data steward -- proves most valuable. "Agile DQ is therefore lean compared to traditional methods that communicate through a cumbersome command-and-control organizational structure," Russom continues. "Agile DQ cuts the wasted time from a development process by enabling direct collaboration between a business person acting as a data steward and a technology lead representing the DQ development team."

Russom's third recommendation is to put a DQ tool with self-service capabilities into the hands of a business subject matter expert (SME) for a particular domain. This helps eliminate what he alliteratively calls "wasteful waiting." The SME, Russom argues, "can accurately identify and prioritize problems and opportunities pertaining to the quality of data."

Clearly, homegrown or commodity DQ tools won't do. "A data steward of this type needs a DQ tool with business-friendly functionality that enables the analysis of data and the mechanisms for recording and communicating what was found in that analysis," he writes. "In other words, the modern data steward is not content to rely solely on technical people to analyze the data, because technical people don't know the full impact of data shortcomings on the specific business processes that rely on them."

Russom's report includes several additional key steps, starting with the periodic delivery of improved data sets; the practice of "data-driven" documentation; a shift to a DQ services layer, which promotes reuse; and the selection of agile DQ tools.

You can download the complete report here. A short registration is required for users downloading a report from TDWI's website for the first time.
