Know Thy Data
What do you do when your IT or business department decides data analysis is a luxury they can't afford?
By Wes Flores
August 16, 2016
Data analysis is the process of evaluating data to better understand and use it, and it is a critical step in any data project. Projects that depend on it include master data management, data integration, big data, business intelligence, and analytics-centric projects. These projects thrive on the data itself, so understanding the data is foundational to their success.
During the data analysis phase of a project, data profiling practices and tools are often used to learn about the origin, consistency, correctness, and function of the data. Through this process you can gain valuable insights such as key business rules, back-door processes, and hidden business issues.
Time and cost pressures on a project may cause a team to question the need for data analysis, especially when its value is not fully understood. If your leadership is pushing to drop the data analysis phase or significantly shorten it, that should raise a red flag.
When others challenge the need for meaningful data analysis, remind them that the cost of resolving an issue rises significantly the later in the project life cycle it is found. For example, fixing an issue in the requirements or design phase may cost $10, but fixing that same issue in testing may cost $100, and significantly more in production.
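To make that arithmetic concrete, here is a minimal sketch in Python. The $10 and $100 figures are the illustrative ones above; the $1,000 production figure and the issue counts in both scenarios are assumptions chosen only for illustration, not measured data.

```python
# Illustrative cost to fix one issue, by the phase in which it is fixed.
# The "production" value is an assumption; the text only says "significantly more".
COST_PER_ISSUE = {"design": 10, "testing": 100, "production": 1000}


def total_fix_cost(issues_by_phase):
    """Sum the cost of fixing issues, given a count of issues fixed per phase."""
    return sum(COST_PER_ISSUE[phase] * count
               for phase, count in issues_by_phase.items())


# Hypothetical scenarios: the same 25 issues, caught at different times.
with_analysis = {"design": 20, "testing": 4, "production": 1}
without_analysis = {"design": 5, "testing": 12, "production": 8}

print(total_fix_cost(with_analysis))     # 1600
print(total_fix_cost(without_analysis))  # 9250
```

The number of issues is identical in both scenarios; only the phase in which they surface changes, and that alone drives the difference in total cost.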
Why Teams Consider Skipping Data Analysis
Here are a few misconceptions I have heard over the years:
- We already know the data well
- The timeline does not allow for data analysis
- Data analysis is an optional phase of the project
- We will end up analyzing the data during development anyway, so let's not add more time to the schedule
- Testing will find any issues; that's the purpose of the testing phase
- The company is already using this data for other critical needs without issues
Misconceptions: A Closer Look
The team already knows the data.
Although it's often the case that some team members are working with the data every day, this only means that you have a great head start on data analysis activities. It does not mean you cannot learn new things about the data or benefit greatly from the results of analysis.
The timeline does not allow for data analysis, or it is considered optional.
The data analysis phase should not add significant time to your overall project plan, and it will improve your ability to maintain project timelines.
Data profiling practices can provide a structured way of analyzing the data. If you have a data profiling tool, leveraging it will simplify the analysis and help you handle more data in a shorter time frame.
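If no profiling tool is available, even a small script gives you a structured first pass. The following is a minimal sketch using only the Python standard library; the function name `profile_column` and the sample customer records are hypothetical, and a real profiling effort would cover far more (data types, ranges, patterns, cross-column rules).

```python
from collections import Counter


def is_missing(value):
    """Treat None and blank or whitespace-only strings as missing."""
    return value is None or str(value).strip() == ""


def profile_column(rows, column):
    """Summarize one column across a list of dict records."""
    values = [row.get(column) for row in rows]
    present = [v for v in values if not is_missing(v)]
    counts = Counter(present)
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(counts),
        "top_values": counts.most_common(3),
    }


# Hypothetical sample records, for illustration only.
customers = [
    {"id": 1, "state": "TX"},
    {"id": 2, "state": "tx"},
    {"id": 3, "state": ""},
    {"id": 4, "state": "TX"},
    {"id": 5},
]

profile = profile_column(customers, "state")
print(profile)
```

Even on five rows, the summary surfaces two questions worth asking the business: why two records have no state, and whether "TX" and "tx" are meant to be the same value.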
When a project centers on the data, knowing more about the data helps mitigate overall project risk and ensure timely delivery. Experienced practitioners often won't touch a data-centric project unless some degree of data analysis has been performed. Incorrect assumptions about data quality can undermine architectural design decisions, causing significant rework after a project is completed or even preventing completion.
We will end up doing data analysis in the development or testing phase.
This view reflects inexperience working with data and the complexities that often arise. There may be a few data issues that will not totally derail a project if caught during development or testing, but if you need to perform detailed data analysis during these phases, you are fundamentally changing their purpose. Additionally, you will add significant time to these phases, especially in the case of rework or redesign.
The company is already using this data for other critical needs without issues.
You have probably heard the saying "garbage in, garbage out." Some teams assume it is fine to have "garbage out" if that is how the data was provided to them. Compare it to building a house: no matter how great the design is, it will not make up for poor-quality materials. Just as building materials determine the quality of a house, the quality of your data determines the success of your project.
Data Understanding at the Core
Your data needs to be understood, appreciated, and made the core of your project. If members of your project team question the need to analyze the data, especially in the early phases of your project, you need to address their concerns. Explain how you will save money, save time, and create quality output when you understand and leverage data analysis.
Wes Flores of McKnight Consulting Group has over 20 years of experience in the data management field. Specializing in enterprise data warehousing, MDM, analytics, and BI programs, he has worked with companies ranging from mid-sized to Fortune 15, with a passion for promoting data as an asset. You can contact the author at Wflores@mcknightcg.com.