The Secret to Organizational Data Science Success: Data Literacy
In today’s world, organizational data literacy is a critical component of a successful data science project. Here are the skills your end users must develop to become data literate.
- By Troy Hiltbrand
- February 25, 2019
Data literacy -- the ability to read, write, and communicate data in context -- is a leading factor that determines the success or failure of a data science project, according to a recent Gartner survey of CDOs across industries. Aside from the technology and the staffing requirements that are internally focused aspects of project success, you must ensure that your constituents have the foundational understanding of how to use data. Without it, your analytics project efforts will be in vain. (The Gartner survey results, Fostering Data Literacy and Information as a Second Language: A Gartner Trend Insight Report, are available to Gartner account holders.)
Even with advances in machine learning and artificial intelligence, which have the potential to move data further along the path to value creation, the ability to convert output into decisions is a skill that your user base needs. Analytics leaders are taking steps to educate their users better in the skills needed to accomplish this goal.
How do you develop such an education program? Take a lesson from your local elementary school, which divides content into separate subjects that educate students from the ground up. Your training program should work the same way.
Here are five classes that will structure your content for data literacy: reading, art, writing, arithmetic, and science.
The first class you’ll teach is reading -- reading and comprehending the data, that is.
If you observe how the education system teaches reading, you will notice that teachers follow a pattern. They start by teaching children words and then they move on to sentences. Once children master the basics of reading and comprehending words and sentences, reading takes on a whole new life, and the focus moves to the identification of themes and patterns -- ultimately leading to children creating their own stories to share with others.
This instructional pattern is the same with data literacy. The first skill you need to teach is how to read data, which includes teaching how to consume the individual data element and make sense of it. It also includes teaching how to identify the difference between categorical, interval, and ordinal data, and how to combine data elements to represent series or patterns. Teaching this skill is the data equivalent of teaching a child how to get meaning from words and sentences.
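As a rough illustration of why the type of a data element matters, the sketch below summarizes one field of each kind. The records, field names, and values are hypothetical and chosen only for the example; the point is that each type supports a different kind of summary.

```python
from collections import Counter
from statistics import mean

# Hypothetical survey records; field names and values are made up for illustration.
records = [
    {"region": "West", "satisfaction": "high", "temp_f": 68},
    {"region": "East", "satisfaction": "low", "temp_f": 75},
    {"region": "West", "satisfaction": "medium", "temp_f": 71},
    {"region": "South", "satisfaction": "high", "temp_f": 82},
    {"region": "West", "satisfaction": "medium", "temp_f": 64},
]

# Categorical: labels with no inherent order, so counting is the meaningful summary.
region_counts = Counter(r["region"] for r in records)
most_common_region = region_counts.most_common(1)[0][0]

# Ordinal: labels with an order but no fixed spacing, so rank them and take the middle one.
# (Taking the middle element works here because the record count is odd.)
ORDER = ["low", "medium", "high"]
ranks = sorted(ORDER.index(r["satisfaction"]) for r in records)
median_satisfaction = ORDER[ranks[len(ranks) // 2]]

# Interval: numbers with equal spacing, so differences and averages make sense.
mean_temp = mean(r["temp_f"] for r in records)

print(most_common_region, median_satisfaction, mean_temp)
```

Averaging the region labels would be meaningless, and averaging the satisfaction labels would assume the gap between "low" and "medium" equals the gap between "medium" and "high." Recognizing those limits is part of reading data correctly.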
Once your user base is comfortable with the basics of reading data, then comes the next step -- data storytelling. Users show mastery of this skill when they can spot patterns in the data and combine them with their contextual knowledge of the business to explain what the data means to their peers. As they form narratives from the data, they can effectively communicate with other stakeholders across the business, and the conversation can move from what is happening with the business to what should happen and how to achieve that state.
The next class to teach is art -- both art creation and art interpretation.
As with reading, the educational system offers a great model of how children learn about art. In schools, teachers start by teaching kids to create art and then move on to teaching them how to interpret what others are trying to convey through their art.
To start your course, you will need to train your users on how the elements of design are used to communicate data patterns. Charts, graphs, and dashboards are common artistic delivery mechanisms. Instruction will include demonstrating how different elements (such as color, gradient, line, direction, shape, size, and position) can be used independently to represent data, or combined to represent it far more compactly and legibly than words and numbers alone could.
Once they have mastered the basic design elements, the next step is to teach users how to interpret the results they receive from the analytics team. Your users will likely spend more time consuming this data art than they will creating it -- and the better they are at this skill, the better prepared they will be to utilize the analytics your team delivers. As they take the basics they have learned about the elements of design and how these elements communicate what is happening in the data, they will be able to monitor the health of the business by observing and interpreting your charts, graphs, and dashboards.
The third subject to teach is writing. Writing is about putting ideas into precise words to facilitate communication.
Because your business users are experts in their respective domains and your analytics team members are not, it is important that your users be able to communicate their needs clearly and effectively. To facilitate successful communication between the teams, you need to train your users to write requirements that do two things:
- Explain key assumptions and constraints within the business that will have a bearing on how the analytics team develops solutions
- Describe what success looks like for the delivered analytics results
Your goal in this course is to help your users learn to define and structure good business requirements and to develop effective success criteria for these requirements.
Successful instruction would include providing them with both good and bad examples of requirements, analyzing what differentiates a good requirement from a bad one, and having them practice creating requirements of their own to improve their overall effectiveness.
The fourth class is arithmetic, or mathematics more broadly.
In the education system, mathematics is a fundamental subject taught at every grade level. Each year builds on the year before. Students start with very basic concepts and build upon those year after year until they arrive at higher levels of mathematics. Even moving from primary to secondary and ultimately to college-level education, this trajectory of building on the concepts previously learned to move into higher levels is the norm.
As organizations implement analytics, they utilize all levels of mathematics to extract information from data, but not everyone in the organization needs to understand all aspects of mathematics at the same level. The data analytics team can retain the more advanced elements, but your users do need to have a solid understanding of the basics of mathematics and understand enough of what the analytics team is doing to have confidence in the results.
Because the analytics you deliver are the culmination of multiple steps in the ETL (extract, transform, load) process, each with different mathematical transformations, your users need to understand the basics of what these transformations look like and what they do. This understanding builds trust and confidence in the results.
In this course, you can focus on teaching your user base the fundamentals of statistics, such as mean, median, standard deviation, and z-score. Many people have had exposure to these concepts through schooling, but a refresher course on a subject they might not frequently be using can be helpful. You can also review the basics of algebra and geometry, if these methods can help your user base to be more effective at their jobs.
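A refresher on these statistics can be grounded in a very small worked example. The sketch below uses Python's standard `statistics` module; the order counts are hypothetical, and the last value is deliberately unusual so the z-score has something to flag. It uses the population standard deviation (`pstdev`), which is an assumption of this example; a sample standard deviation (`stdev`) may suit your data better.

```python
from statistics import mean, median, pstdev

# Hypothetical daily order counts; the last value is deliberately unusual.
orders = [10, 12, 11, 13, 12, 11, 50]

avg = mean(orders)       # arithmetic average, pulled upward by the outlier
mid = median(orders)     # middle value, barely affected by the outlier
spread = pstdev(orders)  # population standard deviation


def z_score(value, data):
    """Return how many standard deviations `value` lies from the mean of `data`."""
    return (value - mean(data)) / pstdev(data)


z_last = z_score(orders[-1], orders)

print(f"mean={avg}, median={mid}, std dev={spread:.2f}, z-score of last day={z_last:.2f}")
```

The gap between the mean and the median is a useful talking point: one extreme day moves the mean well above a typical day, while the median stays put.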
In addition to the basic mathematical transformations performed on the data, you will also want to cover higher order transformations, including such areas as probability and time-based series forecasting, at least at a conceptual level.
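To make time-based forecasting concrete at a conceptual level, one of the simplest possible approaches is a moving average: predict the next value as the mean of the most recent few. The sketch below is only an illustration of that idea, not a recommendation of a forecasting method; the function name, the sales figures, and the three-period window are all hypothetical.

```python
def moving_average_forecast(series, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    if window <= 0 or len(series) < window:
        raise ValueError("need a positive window no longer than the series")
    return sum(series[-window:]) / window


# Hypothetical monthly sales figures with a steady upward trend.
monthly_sales = [100, 110, 120, 130, 140, 150]

next_month = moving_average_forecast(monthly_sales, window=3)
print(next_month)
```

Because the series is trending upward, this forecast comes out below the most recent month. That lag is a good prompt for discussing with users why different forecasting techniques exist and what assumptions each one makes.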
As the organization moves into advanced analytics and machine learning, your users need to understand theoretically how these algorithms work to convert data into predictions. The more your users view these processes as a black box, the more they will distrust them and doubt the validity of the results. They don’t need to leave this course understanding the intricacies of the math behind these algorithms. Instead, a basic understanding of how decision trees, neural networks, k-means clustering, or other methodologies work can give them confidence that the analytics team is meeting their business objectives. Which topics you teach will depend on the algorithms you use in your analytics and the needs of your business users.
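One way to open the black box is to walk through a stripped-down version of an algorithm. The sketch below is a minimal one-dimensional k-means, assuming hypothetical customer spend values and hand-picked starting centers; production libraries handle multiple dimensions, initialization, and convergence checks that are omitted here.

```python
def kmeans_1d(values, centers, iterations=10):
    """Cluster numbers around the given starting centers by alternating two steps."""
    centers = list(centers)
    groups = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: each value joins the group of its nearest center.
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Update step: each center moves to the mean of its group
        # (a center with an empty group stays where it is).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


# Hypothetical customer spend values with two visibly separate clusters.
spend = [20, 22, 25, 80, 85, 90]
centers, groups = kmeans_1d(spend, centers=[20, 90])
print(centers, groups)
```

Seeing that the algorithm is just "assign to the nearest center, then move each center to the average of its group, and repeat" is often enough for business users to reason about what a cluster does and does not mean.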
The last course on the docket is science.
As advanced analytics becomes more prevalent as a method of solving complex business problems, organizations are adopting data science practices at an increasing rate. One of the fundamentals of data science is the ability to apply the scientific method to the process of extracting value from data.
The scientific method includes creating a hypothesis, running tests against the hypothesis, and validating the test results. The data analytics team needs to work jointly with end users to determine a reasonable hypothesis and to validate the results to evaluate their accuracy and integrity. Instructing users about this process can help bridge the gap to success.
These hypothesis statements often take the form of “if we do x with the data, we will improve the outcome y in the business.” The testing includes doing x with the data and observing whether y happens. Validating the results means confirming not only that y occurred, but that y occurred because of the changes to x.
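As a sketch of what validating such a hypothesis can look like, the example below runs a simple one-sided permutation test: it asks how often a difference at least as large as the observed one would appear if the group labels were shuffled at random. The data, the function name, the trial count, and the fixed random seed are all hypothetical choices for illustration.

```python
import random


def permutation_p_value(control, treated, trials=2000, seed=0):
    """Estimate how often random relabeling yields a difference >= the observed one."""
    rng = random.Random(seed)  # fixed seed so the illustration is repeatable
    observed = sum(treated) / len(treated) - sum(control) / len(control)
    pooled = list(control) + list(treated)
    hits = 0
    for _ in range(trials):
        rng.shuffle(pooled)
        fake_control = pooled[:len(control)]
        fake_treated = pooled[len(control):]
        diff = (sum(fake_treated) / len(fake_treated)
                - sum(fake_control) / len(fake_control))
        if diff >= observed:
            hits += 1
    # Add one to numerator and denominator so the estimate is never exactly zero.
    return observed, (hits + 1) / (trials + 1)


# Hypothetical outcome y measured without and with change x.
without_x = [10, 11, 9, 10, 12, 10, 11, 9]
with_x = [14, 15, 13, 16, 14, 15, 13, 14]

observed_diff, p_value = permutation_p_value(without_x, with_x)
print(observed_diff, p_value)
```

A small p-value suggests the difference is unlikely to be chance alone, but it does not by itself show that x caused y. That conclusion also depends on how the two groups were formed, for example whether x was applied to a randomly chosen group.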
In this course, you can teach users how to formulate a hypothesis statement that represents something of value to the business, how to identify what changes need to happen and what data can be leveraged to make those changes, how to evaluate whether the desired outcome occurs, and how to differentiate between correlation and causation in the results.
A Final Word
Because data literacy is a vital element of successful data science projects, it is becoming increasingly important for analytics organizations to focus on improving the capabilities of their user base. To effectively achieve this target, an analytics team needs to structure its training and educate its audience. Focusing on the areas of reading, art, writing, arithmetic, and science provides you a basis for developing an effective training program.