Agile Data Integration with Extreme Scoping
Do your data integration projects need to be more responsive to your users? Extreme Scoping may be just what you need to be more agile, and it's easy to incorporate its principles into your current methodology.
- By James E. Powell
- June 17, 2014
[Editor's note: Larissa Moss will discuss her data-centric agile method for data integration projects at the TDWI World Conference in Boston (July 20-25, 2014). The seven steps of Extreme Scoping can help you turn any existing data-driven methodology into an agile approach.]
What is Extreme Scoping?
Extreme Scoping is a data-centric agile method specifically designed for enterprise-class data integration projects such as data warehousing, business intelligence, master data management, and so on. Data integration projects are very different from development projects in that 80 percent of the work effort is spent on performing data management activities, not coding activities. That means that the vast majority of project time is spent on modeling the data from an enterprise perspective (not an application perspective), standardizing and integrating the data, cleansing and enhancing the data, and preparing the data. In order to manage an enterprise-class data integration project, one needs a data-centric method that includes all data management activities.
What are Extreme Scoping's characteristics?
Extreme Scoping has two unique features. The first is that it is based on a soup-to-nuts, data warehouse-specific development methodology and is therefore accepted by most IT managers and auditors as robust enough to pass IT and audit standards. The second is that by following the seven steps of Extreme Scoping you can turn any other data-driven methodology into an agile approach.
All methodologies do two distinct things. The first is to prescribe a list of all potential activities and artifacts that may apply to a development project; the second is to provide guidelines about how to execute these activities: which activities and artifacts depend on other activities and artifacts, what resources are needed and what skills they require, how to organize the work and whom to assign it to, how to manage the work effort, how to put a project plan together, and how to report progress. Extreme Scoping does not prescribe a list of all potential activities, but it does provide guidelines on how to organize the project work and the project team in an agile way so that it follows the Agile Manifesto and the 12 Agile Principles as closely as possible (with a few minor adjustments for data-centric activities).
To illustrate how Extreme Scoping works, I use my Business Intelligence Roadmap methodology as an example, but the beauty of Extreme Scoping is that it can be used with any other data-driven methodology that companies already have in-house.
What are the seven steps of Extreme Scoping?
In the first step, the project team members try to determine how large the new EDW/BI service request is in terms of functional and data deliverables. In other words, they want to know if they are being asked to build an elephant, a tiger, or a field mouse.
In the second step, the team members break the large service request into smaller releases. They do this by weighing five factors: the business value of the requested features, the data effort required to build those features, any technology or architecture considerations, any project constraints, and any project interdependencies.
In the third step, team members identify the activities and artifacts that apply to only the first release. In fact, steps 3 to 7 apply only to one release at a time, either the first or next release.
In the fourth step, the project team organizes the development track teams (work groups) who will perform the selected activities of the first (or next) release. You want to have as many small work groups working in parallel as possible to complete the release as quickly as possible.
In the fifth step, team members plan the work for the first (or next) release. They already know what activities and artifacts they want to work on, and they know what team members will team up on which development tracks. Now they lay out the activities on the calendar between the start and end dates of the release. They do this by creating weekly milestones starting with the deadline and working backwards.
In the sixth step, the team members create their own internal "micro" project plan which they will use to track their own progress and to address hurdles and scope changes as they arise.
In the seventh and final step, the project team members create an external "macro" project plan, which is basically the milestone chart that they will use for progress reporting to management.
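The backward planning described in the fifth step can be illustrated with a minimal Python sketch. It assumes milestones fall exactly one week apart and that the release start and end dates are already agreed; the function name weekly_milestones and the sample dates are hypothetical and are not part of Extreme Scoping itself.

```python
from datetime import date, timedelta


def weekly_milestones(release_start: date, release_deadline: date) -> list[date]:
    """Return weekly milestone dates, computed backwards from the deadline.

    Illustrative assumption: one milestone per week, with the last
    milestone falling on the release deadline itself.
    """
    if release_deadline <= release_start:
        raise ValueError("deadline must fall after the start date")

    milestones = []
    current = release_deadline
    # Start at the deadline and step back one week at a time
    # until the release start date is reached.
    while current > release_start:
        milestones.append(current)
        current -= timedelta(weeks=1)

    # Present the milestones in calendar order.
    milestones.reverse()
    return milestones


if __name__ == "__main__":
    # Hypothetical release window, chosen only for illustration.
    plan = weekly_milestones(date(2014, 7, 7), date(2014, 8, 15))
    for number, due in enumerate(plan, start=1):
        print(f"Milestone {number}: {due.isoformat()}")
```

Because the dates are anchored to the deadline rather than the start, any slack in the schedule shows up in the first week of the release, not the last.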
Who is involved in a team?
The project team in Extreme Scoping is organized into three distinct types of teams.
At the core is the project core team, which is also the project management team. The team members comprising this core team represent and have working expertise in four perspectives: business, data, technology, and project management. The core team members meet every day to discuss the progress of the project and to decide if they need to make any course corrections.
The second type of team is called the development track team. There are several development track teams working in parallel. The most common ones are ETL back-end, BI application front-end, and metadata repository. There could also be a scouting team doing ongoing requirements analysis, data modeling, data profiling, or even data cleansing. If data mining is an important function at the company, there could be a separate data mining team, and so on. Each development track team meets every day to discuss its track-specific issues.
The third type of team is considered the extended team with roles such as tech support, auditor, business sponsor, and other IT and business stakeholders. Members of this team participate on the project on a scheduled or as-needed basis.
What training is required?
Extreme Scoping is very easy to learn because it consists of only seven steps that can be applied to any existing data-driven methodology. The larger part of the learning curve is taking the examples given in Extreme Scoping and transferring them to the methodology currently in use at a company. If a company does not have its own methodology, the project team can use Extreme Scoping in combination with the Business Intelligence Roadmap methodology, which is already embedded in Extreme Scoping.
What are some best practices for moving to agile in general and Extreme Scoping in particular?
Almost everyone in the data warehousing industry recognizes that traditional methodologies have two major flaws. First, most traditional methodologies do not adequately address the data management activities that make up 80 percent of the work on enterprise-class data integration projects. Second, they take too long to deliver something tangible to the users.
The first flaw was addressed by the spiral data warehouse methodologies of the mid to late 1990s. In fact, TDWI used to have several seminars presenting those spiral methodologies, and project teams eagerly adopted them. However, the unintended consequence of adding data-related activities was that it took even longer to deliver something tangible to the users. The second flaw remained, which is why all roads must lead to agile.
There are distinct advantages to using Extreme Scoping for enterprise-class data integration projects. First, Extreme Scoping is based on a robust spiral data warehouse methodology and is therefore accepted by most IT managers and business people who are skeptical of agile methods. Second, it can be attached, modified, and molded in many ways to fit the company's standards and work habits. Basically, it gives EDW/BI project teams a data-centric agile alternative.