Creating a Successful High-Level Data Model
An overview of a ten-step approach that will help you successfully build a high-level data model.
Editor’s Note: The following is a condensed excerpt from Data Modeling for the Business (2009, Technics Publications) by Steve Hoberman, Donna Burbank, and Chris Bradley, in which the authors describe the importance of a high-level data model and how to master the techniques for building one. This chapter, written by Hoberman and Burbank, provides an overview of a comprehensive, ten-step approach to the task.
A data model is a visual representation of the people, places, and things of interest to a business. It is used to facilitate communication between business people and technical staff. A data model is composed of symbols that represent the concepts that must be communicated and agreed upon, and is therefore often referred to as a blueprint for data. Like a building architect, who creates a series of diagrams or blueprints from which a house can be constructed, a data modeler/architect creates diagrams from which a database may be built.
The blueprint analogy is often used because there are many parallels between blueprints, which many people are familiar with, and a data model, which few people outside of IT have seen. The most obvious parallel is that a blueprint translates a very complex and technical undertaking into a set of visual diagrams that a layperson can understand. This is the goal of a data model, as well -- to take business concepts and the complex rules required to create a database and simplify them into an intuitive picture that both business people and technical engineers can understand. Just as homeowners are involved in the design of their house before the technical design and building takes place, so, too, should business people be involved in the design of the data models from which the databases that run their organization are built.
A high-level data model conveys the core concepts and/or principles of an organization in a simple way, using concise descriptions. The advantage of developing the high-level model is that it facilitates arriving at common terminology and definitions of the concepts and principles.
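As a rough illustration of what such a model holds, the sketch below represents a high-level data model as a set of named concepts with concise definitions, plus the rules that connect them. A real HDM is a diagram, not code; the class names (Concept, Relationship, HighLevelDataModel) and the Customer/Order example are hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Concept:
    """A business concept with a concise, agreed-upon definition."""
    name: str
    definition: str


@dataclass(frozen=True)
class Relationship:
    """A business rule connecting two concepts, readable as a short phrase."""
    source: str
    verb: str
    target: str

    def sentence(self) -> str:
        return f"{self.source} {self.verb} {self.target}"


@dataclass
class HighLevelDataModel:
    """Hypothetical container: a purpose, the concepts, and the rules between them."""
    purpose: str
    concepts: dict[str, Concept] = field(default_factory=dict)
    relationships: list[Relationship] = field(default_factory=list)

    def add_concept(self, name: str, definition: str) -> None:
        self.concepts[name] = Concept(name, definition)

    def relate(self, source: str, verb: str, target: str) -> None:
        # Reject rules that mention a concept nobody has defined yet,
        # since the point of the model is agreed-upon terminology.
        for name in (source, target):
            if name not in self.concepts:
                raise KeyError(f"Undefined concept: {name}")
        self.relationships.append(Relationship(source, verb, target))


model = HighLevelDataModel(purpose="Capture existing business terminology and rules")
model.add_concept("Customer", "A person or organization that buys our products.")
model.add_concept("Order", "A request by a customer to purchase products.")
model.relate("Customer", "places", "Order")

for rule in model.relationships:
    print(rule.sentence())  # prints "Customer places Order"
```

Requiring every concept to be defined before it can appear in a rule mirrors the article's emphasis on common terminology and definitions; it is a design choice of this sketch, not a requirement stated by the authors.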
There are ten steps that are required to successfully develop a high-level data model (HDM). Although you can start some of the steps out of sequence, they need to be completed in the order they appear. For example, you might find yourself jotting down stakeholders (Step 2) before identifying the purpose of the model (Step 1). However, you will need to revisit your model stakeholder list after finalizing the purpose of the model.
The ten steps for completing the HDM follow.
Step 1: Identify Model Purpose
Determine and agree on the primary reason for having a HDM. Always begin with the end in mind.
Remember to focus the purpose of the high-level data model around a business need or process improvement. Data models are built to ensure everyone has a precise understanding of terminology and business rules.
One of the fascinating outcomes of this first step is realizing that the model’s stakeholders see the world very differently from each other. It is not worth investing time and money in the other nine steps without a clear, agreed-upon reason for the model. That doesn’t mean the high-level data model cannot have more than one purpose, but there should be one primary purpose for building it.
Once there’s consensus on the purpose of the data model and it is documented, you need to determine whether a top-down, bottom-up, or hybrid approach is ideal. Matching the right factors with the right modeling approach will dramatically increase the probability of having a successful model.
Here are the most common reasons for building a HDM (remember, communication is the main reason behind each of these):
- Capture existing business terminology and rules
- Capture proposed business terminology and rules
- Capture existing application terminology and rules
- Capture proposed application terminology and rules
Step 2: Identify Model Stakeholders
Document the names and departments of those who will be involved in building the HDM, as well as those who will use it after its completion.
A HDM stakeholder is someone who will be affected directly or indirectly by the model that is produced during the modeling sessions.
As you might expect, when the purpose of the HDM is to capture an existing or proposed section of the business, the builders tend to be people who know the business, such as business analysts and business users. Similarly, when the purpose of the HDM is to capture an existing or proposed application, the builders tend to be more technical, such as developers and database administrators. The users of the model, though, could be anyone from business and/or IT.
Those with more of a business-oriented background can help build the business-focused view and those with more of a technical background can help build the application-focused view.
All or some of those users should also be your stakeholders and are required to sign off on the model.
Step 3: Inventory Available Resources
Leverage the results of Step 2 to determine what people will be involved in building the HDM and also identify any documentation that could provide useful content to the HDM.
Now that you have identified why you are building the model and who will be involved in building and using it, you need to identify the resources you will be using. The two types of resources are people and documentation.
People include representatives from both business and IT. Business people may be management and/or knowledge users. IT resources can span the entire IT spectrum, from analysts through developers, from program sponsors to team leads.
Documentation includes systems documentation or requirements documents. Systems documentation can take the form of standard vendor documentation for a packaged piece of software, or documentation written to support a legacy application. Requirements documents span business, functional and technical requirements and can be an essential input to building the HDM.
Step 4: Determine Type of Model
Determine which of the four types of HDMs will work best based on the purpose of the model and the available resources.
The purpose of the model identified in Step 1 aids in determining the type of model to build in Step 4. The four different variations include:
- Relational data model: A relational data model describes the operational databases that support business processes.
- Dimensional data model: A dimensional model is used exclusively for reporting.
- Business perspective: A business perspective is a high-level data model of a defined portion of the business; choose the business perspective for any of the following situations: understanding a business area, designing an enterprise model, or starting a new development effort.
- Application perspective: An application perspective is a high-level data model of a defined portion of a particular application; choose the application perspective for any of the following situations: understanding an application or starting a new development effort.
Step 5: Select Approach
Choose either a top-down, bottom-up, or hybrid approach based on the purpose of the model and the available resources.
Even though the three approaches for building a high-level data model sound completely different from each other, there is really quite a lot in common across them. In fact, the major difference between the approaches lies in the initial information-gathering step.
The top-down approach starts with purely a business need perspective. The business should aim for the sky. Ideas are accepted even if you know there is no way to deliver the requirement in today’s application environment.
Example: If a new system is being built from scratch and there are business experts eager to participate in the project, a top-down approach would be appropriate.
The bottom-up approach, on the other hand, temporarily ignores what the business needs and instead focuses on the existing systems environment. You build an initial high-level data model by studying the systems that the business is using today. It can include operational systems that run the day-to-day business or it can include reporting systems that allow the business to view how well the organization is doing.
Example: If there are minimal business resources available and ample systems documentation, and the purpose of the model is to understand an existing application, a bottom-up approach is ideal.
The hybrid approach is iterative and usually completes the initial information gathering step by starting with some top-down analysis and then some bottom-up analysis, and then some top-down analysis, etc., until the information gathering is complete.
The whole process is a constant loop of reconciling what the business needs with what information is available.
Example: If a new system is being planned, or an upgrade to an existing system, and business expertise is available and required, a hybrid approach is best.
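The three examples above can be condensed into a small decision sketch. It is a deliberate simplification that looks at only two factors, whether business experts are available and whether there is existing systems documentation to study; the function name, parameter names, and the rule for the case where neither resource exists are assumptions made for illustration, not part of the authors' method.

```python
from enum import Enum


class Approach(Enum):
    TOP_DOWN = "top-down"
    BOTTOM_UP = "bottom-up"
    HYBRID = "hybrid"


def select_approach(business_experts_available: bool,
                    systems_documentation_available: bool) -> Approach:
    """Pick an information-gathering approach from the available resources.

    Simplified, hypothetical rule of thumb:
      - experts only        -> top-down (start from business need)
      - documentation only  -> bottom-up (start from existing systems)
      - both                -> hybrid (alternate between the two)
    """
    if business_experts_available and systems_documentation_available:
        return Approach.HYBRID
    if business_experts_available:
        return Approach.TOP_DOWN
    if systems_documentation_available:
        return Approach.BOTTOM_UP
    # Assumption of this sketch: with neither resource there is nothing
    # to gather information from, so revisit Step 3 first.
    raise ValueError("No people or documentation available; inventory resources first")


print(select_approach(True, False).value)   # prints "top-down"
print(select_approach(False, True).value)   # prints "bottom-up"
print(select_approach(True, True).value)    # prints "hybrid"
```

In practice the purpose of the model from Step 1 also weighs on the choice, so a function like this would be a starting point for discussion rather than a final answer.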
Step 6: Complete the Audience-View HDM
Produce a HDM using the terminology and rules that are clear to those who will be using the model.
Once you are confident about which approach you should take, you need to build the audience-view HDM. This is the first high-level model to build. The purpose is to capture the viewpoint of the audience without complicating information capture by including how their viewpoint fits with other departments or with the organization as a whole.
The purpose here is to simply capture their view of the world; the next step will reconcile the deliverable from this step with enterprise terminology.
Step 7: Incorporate Enterprise Terminology
Now that the model is well-understood by the audience, ensure the terminology and rules are consistent with the organizational perspective.
Once you’ve captured your stakeholders’ view in the boxes and lines of the audience high-level model, you can move on to the enterprise perspective. To build the enterprise perspective, modify the audience model to be consistent with enterprise terminology and rules. Ideally, this enterprise perspective is captured within an enterprise data model.
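One way to picture this reconciliation is as a lookup from each audience term to its enterprise equivalent, with anything unmatched flagged for follow-up. The sketch below assumes a hypothetical enterprise glossary keyed by lowercase term; the function name and the sample terms (Client, Purchase, Product) are illustrative only.

```python
def to_enterprise_terms(audience_terms, enterprise_glossary):
    """Map audience terms to enterprise terms.

    Returns a (mapping, unmapped) pair: `mapping` gives the enterprise
    term for every audience term (unchanged when no match exists), and
    `unmapped` lists the audience terms that need a modeling decision.
    """
    mapping = {}
    unmapped = []
    for term in audience_terms:
        # Glossary keys are lowercase (an assumption of this sketch),
        # so compare case-insensitively.
        enterprise_term = enterprise_glossary.get(term.lower())
        if enterprise_term is None:
            unmapped.append(term)
            mapping[term] = term
        else:
            mapping[term] = enterprise_term
    return mapping, unmapped


# Hypothetical glossary: audience wording -> enterprise wording.
glossary = {"client": "Customer", "purchase": "Order"}

mapping, unmapped = to_enterprise_terms(["Client", "Purchase", "Product"], glossary)
print(mapping)   # prints {'Client': 'Customer', 'Purchase': 'Order', 'Product': 'Product'}
print(unmapped)  # prints ['Product']
```

Terms that come back unmapped are the interesting ones: each is either a genuinely new concept for the enterprise model or a synonym the glossary has not recorded yet, and deciding which is a conversation with the stakeholders rather than something code can settle.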
Step 8: Signoff
Require and obtain approval from the stakeholders that the model is correct and complete.
After the initial information gathering, make sure the model is reviewed both for data modeling best practices and for whether it meets the requirements. The sign-off process on a HDM does not require the same formality as signing off on a physical design, but it should still be taken seriously. Usually email verification that the model looks accurate will suffice.
Step 9: Market
Similar to introducing a new product, advertise the data model so that all those who can benefit from it know about it.
Think of yourself as a product vendor of sorts -- the best product on the market won’t necessarily sell unless it is marketed effectively.
In building a successful high-level modeling project, it is important to treat the marketing aspect as a project in and of itself. To that end, make sure to create a specific communication plan as part of your project’s deliverables. This communication plan outlines both the message and the target community.
Step 10: Maintain
HDMs require little maintenance, but they do require some. Make sure the model is kept up to date.
Remember that even after the model is complete, there is still a maintenance task that you must stay on top of. The HDM will not change often, but it will change. You need to have formal processes for keeping the model up-to-date and aligned with the other model levels. You also want to make sure that the HDM is actively used by other groups and processes in the organization and doesn’t become a passive artifact.
Mastering the ten-step approach to building a high-level data model will increase your awareness of the factors and constraints that heavily influence the actual modeling process. Understanding and weighing these factors and constraints will help you choose the modeling approach that best suits your business’s needs.
Donna Burbank is senior principal product marketing manager for CA’s ERwin Data Modeling business unit and has more than 12 years of experience in the areas of data management, metadata management, and enterprise architecture. Donna currently is the director of product marketing for CA’s ERwin data modeling business unit. Previously, she served in key brand strategy and product management roles at Computer Associates (now CA) and Embarcadero Technologies and as a senior consultant for PLATINUM technology’s information management consulting division in both the U.S. and EMEA.
Steve Hoberman is a widely recognized innovator and thought leader in the field of data modeling. He has worked as a business intelligence and data management practitioner and trainer since 1990. He is the author of Data Modeler’s Workbench and Data Modeling Made Simple. He is also the founder of the Design Challenges group and the inventor of the Data Model Scorecard.