The Analytics-Driven Organization: Allocating Effort
Successful project development requires an integration of the IT and quantitative strengths of the organization with domain-specific knowledge.
[Editor’s Note: Thomas A. Rathburn is leading an all-day session, Supporting the Analytics-Driven Organization, at the TDWI World Conference in Boston (October 20-25, 2013). This vendor-neutral course presents analytics topics and their roles in enterprise decision support.]
By Thomas A. Rathburn, Senior Consultant and Training Director, The Modeling Agency
Predictive analytics, and projects in related areas, are predominantly approached from a technology perspective. We focus on obtaining and developing increasingly sophisticated capabilities in the hope of utilizing them effectively. Our techno-centric view of the world has evolved steadily over the past thirty years since the personal computer put its capabilities and potential in the hands of every decision maker.
Every business has a unique set of objectives. Our success is measured on performance metrics that are unique to each organization. Business deals in relationships. There is no inherent underlying order, as we would see in the hard sciences, where the focus is on physical systems. Quantitative techniques do not give us a “right” answer; they give us a mathematically correct answer. They deliver a potentially beneficial contribution to a decision process that allows us to pursue a continuing evolution of incremental enhancement.
We are not doing engineering. We are doing analysis.
We are in pursuit of achieving a higher level of performance than we are currently capable of. Our efforts are goal directed. Our business objectives sit at the top of the pyramid (see below). Our data is the raw material for the construction of information-based decision processes. Our quantitative techniques are our tools for turning the information content in our data into a useful contribution to our daily operations.
Unfortunately, the hype associated with the potential of advanced technologies has focused on the development of highly sophisticated capabilities. Our math becomes increasingly eccentric. Our techniques become increasingly esoteric. We build dashboards that are nothing more than elegant reports of historical data rather than insightful contributions to enhanced decision making. We transition from a transaction-based data environment to an analytics-based environment. We focus on the exponentially expanding availability of the raw data resources, and we obsess on the technical challenges.
The single most effective way to utilize these technologies is to enhance the resource allocation decisions we face when dealing with the wide array of relationships we maintain. We need to spend our time and money on those relationships that benefit us and avoid expending resources on the relationships that hurt us.
Our efforts need to evolve beyond the creation of the unachievable perspectives of what-if analysis. We need to move beyond discovering “new, previously unknown relationships in our data” and pursue a focused effort to do a better job of spending our time and our money by asking “Who’s next?”
We have invested in developing our capabilities in information, data, and quantitative techniques. We must now learn to leverage those capabilities. Organizations that successfully utilize technology to achieve their objectives will always surpass organizations that pursue technology enhancement without a clear understanding of their goals, objectives, and performance metrics.
The Six-Phase Development Methodology
The six-phase development methodology presented here provides a practitioner’s approach to the planning, development, and implementation of business predictive analytics projects. We generally begin with the Plan phase. However, project teams can transition from their current methodology at any point in the cycle.
The six phases are implemented in sequence, though the process is highly iterative. It is common for project teams to develop insights during one phase that require modifying the choices made and conclusions reached in prior phases.
Plan: In the “plan” phase, we define all of the business aspects of the project design, developing a complete blueprint for everything that follows. We identify our business objectives and the performance metrics used to evaluate our efforts. We document our current decision process and our current level of performance on those metrics. We evaluate the hardware and software to be utilized during the project. We prototype our delivery system and detail its design requirements. We develop a specification for the experimental design, determine our data sources, and project the data transformation and representation requirements for the project. We determine the project type to be pursued and the analytics techniques expected to be used in developing our models. This phase captures the business requirements for our projects and requires the active participation of the business decision makers.
Prepare: In the “prepare” phase, we develop a “data sandbox” for use throughout the project. During this phase, we physically implement the data representation, transformation, and design requirements specified during the plan phase. This phase is typically the most expensive in terms of overall effort.
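As a minimal, purely illustrative sketch of the “prepare” phase, the snippet below aggregates raw transaction records into one model-ready row per customer. All field names, channels, and derived features here are hypothetical stand-ins for whatever the plan phase actually specified.

```python
# Hypothetical "prepare" phase sketch: build a flat data sandbox from
# raw transaction records. Field names and rules are illustrative only.

raw_transactions = [
    {"customer": "A", "amount": 120.0, "channel": "web"},
    {"customer": "A", "amount": 35.5,  "channel": "store"},
    {"customer": "B", "amount": 220.0, "channel": "web"},
]

def build_sandbox(transactions):
    """Aggregate raw transactions into one row per customer."""
    rows = {}
    for t in transactions:
        row = rows.setdefault(t["customer"], {"customer": t["customer"],
                                              "total_spend": 0.0,
                                              "n_transactions": 0,
                                              "web_transactions": 0})
        row["total_spend"] += t["amount"]
        row["n_transactions"] += 1
        if t["channel"] == "web":
            row["web_transactions"] += 1
    # Derived representation: share of activity on the web channel.
    for row in rows.values():
        row["web_share"] = row["web_transactions"] / row["n_transactions"]
    return list(rows.values())

sandbox = build_sandbox(raw_transactions)
```

In practice this step is the physical implementation of the representation and transformation decisions already made in the plan phase, which is why it dominates overall effort.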
Build: During the “build” phase, we develop our candidate pool of models and run them against the test data set to evaluate their relative performance. The bulk of the work in this phase is computer time, not project hours.
Confirm: The “confirm” phase completes our validation studies. It typically requires multiple validation efforts before a successful candidate is identified. In many cases, it is necessary to return to prior phases, consider alternative data representation and transformation strategies, and rebuild the candidate pool of models before a model finally survives validation.
Adopt: Once a model has been successfully validated, we move into the “adopt” phase, during which we will complete development of the delivery system and confer with domain experts and end users to ensure that the implementation is consistent with desired business practices.
Replace: Once our model is successfully developed, we will need to monitor its performance. All models of human behavior degrade over time. When performance drops beyond a defined margin, a challenger model should be readied for installation.
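The monitoring logic of the “replace” phase reduces to a simple check: compare the champion model’s recent performance to its validated baseline and flag it when the drop exceeds the defined margin. The baseline, margin, and accuracy history below are assumed values for illustration.

```python
# Hypothetical "replace" phase sketch: flag a degrading champion model
# so a challenger can be readied. All numbers are illustrative.

BASELINE_ACCURACY = 0.80   # validated performance at adoption (assumed)
DEGRADATION_MARGIN = 0.05  # acceptable drop before replacement (assumed)

def needs_replacement(window_accuracies):
    """True if the most recent window falls below baseline minus margin."""
    return window_accuracies[-1] < BASELINE_ACCURACY - DEGRADATION_MARGIN

history = [0.81, 0.79, 0.77, 0.73]  # illustrative monthly accuracies
```

The margin keeps ordinary period-to-period variance from triggering a replacement; only a sustained drop past the threshold does.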
Allocation of Effort
Active involvement by business users is critically important to the “plan” and “adopt” phases of the development process. The graphic below provides a typical allocation of the effort for each phase of a successful project. Actual percentages vary from project to project, but relative contribution by skill set typically remains consistent.
Plan: It is essential that the business decision makers not only participate in the plan phase but drive the fundamental decisions involved in the design of the project, including the specification of the study guide for our algorithms. Any inconsistencies in this specification result in highly sophisticated solutions to the wrong problem.
Prepare: The prepare phase is dominated by the construction of the data sandbox that will serve as the basis for the technical development of our analytics project. It is typically built by the IT specialist on the project team, with active participation from the quantitative specialist.
Build: The build phase is the primary focus of the quantitative specialist. It typically involves developing and testing a large number of challenger models to replace the current champion. As noted above, the bulk of the work in this phase is computer time, not project hours. The phase is typically implemented by the IT staff with strong guidance from the quantitative specialists.
Confirm: In the confirm phase, we complete our validation studies to establish our expectations of model performance and variance. The quantitative specialists typically dominate this phase, focusing on a model’s capability to generalize to experience not seen during the build phase.
Adopt: During the adopt phase, we complete development of the delivery system and confer with domain experts and end users to ensure that the implementation is consistent with desired business practices. The primary focus of the adopt phase is determining the acceptability of the approaches developed by the modeling effort and the appropriateness of their recommendations to the organizational unit.
Replace: Once a model is deployed, we need to monitor its performance. Our efforts in the replace phase address model monitoring and replacement, as well as anticipated enhancements as the business decision process evolves.
Successful project development requires an integration of the IT and quantitative strengths of the organization with domain-specific knowledge. We are completing analysis for incremental enhancement of our business goals, utilizing the strengths of our technical capabilities.
Without a project design that incorporates the business goals and objectives, our projects capture only sophisticated technical solutions to the wrong problem.
Thomas A. Rathburn is a senior consultant and training director at The Modeling Agency. He has a strong track record of innovation and creativity, with over two decades of experience in the application of predictive analytics in business environments assisting commercial and government clients internationally to develop and implement applied analytics solutions. Mr. Rathburn is a regular presenter of the data mining and predictive analytics tracks at the TDWI World Conferences.