Solving the Top 4 Data Pain Points in 2021
We don't need new technologies or new app features. In 2021 we must face the 500-pound gorilla in the room: these data pain points.
- By Stan Pugsley
- December 11, 2020
As we enter 2021, rather than predict new technologies and features, I would like to focus on four key obstacles that will (or should) take center stage in the product road maps of vendors and budgeting of customers.
At this point in the evolution of our data capabilities, there is no shortage of technology and we have an overabundance of vendors. What is needed is a linkage of technologies and workflows to enable adoption: fewer new features, more simplicity and enablement.
Pain Point #1: I want to migrate to the cloud but I can't deal with the collateral impact.
There are few remaining industries or company sizes with a valid reason for not migrating to the cloud. Your peers are all doing it and the providers have knocked down any significant security or availability concerns. Cloud migration costs are being successfully managed by optimizing storage and compute patterns. Even so, migrating isn't a simple process in most environments.
Complicated workflows and system integrations need to be rewritten, with potential disruptions across the company. If you move one piece, all the integrations turn into potential spin-off projects requiring skills and resources you may not have.
Cloud software vendors will need to focus attention and resources on building bridges and interfaces that will enable transition with minimal disruption. This could mean more APIs and fewer ODBC connections. The solutions won't be easy, but will find a ready market in companies looking for a smoother path to cloud migration.
Prediction: On-premises gateways and connectors will become more intuitive and less disruptive.
Pain Point #2: I have a data lake -- now what?
Many tools are making it nearly painless to assemble a data lake. Point the tool to a data source and within minutes you have a data-extract robot running, depositing the data in a location of your choice. However, the next step is the complicated one. Many AI/ML/data science ideas never make it out of the lab and into the production environment. Those that do make it often struggle to show ROI by moving the needle on business outcomes. Basically, building a data lake is becoming much easier than using it. Small and medium-sized businesses don't have the time or resources to hire specialists and explore custom solutions.
Solutions to this pain point will target specific business functions and industry needs, building pre-configured models that allow customers to train and apply the model by simply adjusting parameters.
Prediction: Investment will go to turnkey solutions that are industry- and business function-specific.
Pain Point #3: A data warehouse is more work than I thought!
Anyone who thinks a data warehouse is an IT project that can be completed in a (virtual) backroom will be in for a big surprise when they launch a new project.
Conforming (merging) data sets is, by definition, a compromise between two systems or departments. A compromise takes discussions, bridging the gap between technical and business users, and an ongoing relationship between IT and business sponsors. The need for communication will never go away, but vendor solutions can make the process simpler.
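To make the compromise concrete, here is a minimal sketch of conforming two customer lists into one shared dimension. Everything in it is an assumption for illustration: the CRM and billing systems, their field names, the territory-to-region mapping, and the rule that CRM values win when both systems describe the same customer. Each of those choices is exactly the kind of decision that has to be negotiated between departments before any code is written.

```python
# Hypothetical source rows: two departments describe the same customers differently.
CRM_ROWS = [
    {"cust_id": "C-001", "name": "Acme Corp ", "region": "West"},
    {"cust_id": "C-002", "name": "Globex", "region": "East"},
]
BILLING_ROWS = [
    {"account": "001", "customer_name": "ACME CORP", "territory": "W"},
    {"account": "003", "customer_name": "Initech", "territory": "E"},
]

# Assumed business rule: billing territory codes map onto CRM region names.
TERRITORY_TO_REGION = {"W": "West", "E": "East"}


def conform_crm(row):
    """Translate a CRM row into the shared (conformed) layout."""
    return {
        "customer_key": row["cust_id"].split("-")[-1],  # "C-001" -> "001"
        "customer_name": row["name"].strip().title(),
        "region": row["region"],
    }


def conform_billing(row):
    """Translate a billing row into the shared (conformed) layout."""
    return {
        "customer_key": row["account"],
        "customer_name": row["customer_name"].strip().title(),
        "region": TERRITORY_TO_REGION[row["territory"]],
    }


def build_customer_dimension(crm_rows, billing_rows):
    """Merge both sources by customer_key.

    Assumed rule: billing is loaded first and CRM second, so CRM values
    overwrite billing values when both systems know the customer.
    """
    dimension = {}
    for source, rows, conform in (
        ("billing", billing_rows, conform_billing),
        ("crm", crm_rows, conform_crm),
    ):
        for row in rows:
            conformed = conform(row)
            key = conformed["customer_key"]
            sources = dimension.get(key, {}).get("sources", [])
            dimension[key] = {**conformed, "sources": sources + [source]}
    return dimension


if __name__ == "__main__":
    for key, row in sorted(build_customer_dimension(CRM_ROWS, BILLING_ROWS).items()):
        print(key, row)
```

The code itself is short. The hard part is getting both departments to agree on the mapping table and on which system wins, which is why the communication never goes away.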
There are two competing approaches to simplifying a data warehouse transformation. On the one hand, you have low-code, visual solutions that require minimal SQL knowledge. On the other hand, you have code-acceleration solutions like dbt that essentially write code behind the scenes to accelerate development. As those two approaches develop, companies will be able to match the skills and work patterns of their data teams to the right software package.
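The code-acceleration idea can be sketched in a few lines: the developer declares what the transformation should do, and a generator writes the repetitive SQL. The function below is a hypothetical illustration only. It is not dbt's API, and the table and column names are made up.

```python
def render_staging_model(source_table, column_map):
    """Generate a SELECT that renames source columns to conformed names.

    column_map maps each source column to its conformed name
    (hypothetical names, for illustration only).
    """
    select_list = ",\n".join(
        f"    {src} AS {dst}" for src, dst in column_map.items()
    )
    return f"SELECT\n{select_list}\nFROM {source_table}"


if __name__ == "__main__":
    print(
        render_staging_model(
            "raw.billing_accounts",
            {"account": "customer_key", "customer_name": "customer_name"},
        )
    )
```

A team comfortable with SQL and version control may prefer this style because the generated code stays readable and reviewable. A team without those skills may be better served by the visual, low-code route.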
Prediction: Tools will shorten the time from business decision to validation-ready data, and they will document the business rules and dependencies along the way.
Pain Point #4: We've upgraded our reporting tool but we didn't upgrade our sloppy practices.
The story has been told a thousand times -- a company hates their old reporting/visualization tools because of the clutter of poorly made, untrusted, slow reports. They see a product demo of the latest tool and like what they see. They buy the new tool and in six months find themselves descending into the same mess with a new tool. The cycle repeats, accumulating unused report baggage along the way.
One way to break this cycle is to create a peer review and validation step before reports are published to the production workspace. Just as we see reviews, certifications and ratings on our favorite shopping sites, reports can be vetted and approved in a visible manner by business users. Users will see those ratings by peers whom they trust, and have confidence in using the reports. IT teams will have an objective measure of which reports are in active use and which have been abandoned.
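A review gate of this kind could be modeled roughly as follows. This is a sketch under stated assumptions, not a feature of any particular visualization tool: the two-approval threshold, the rule that authors cannot certify their own work, and the 90-day usage window are all hypothetical choices.

```python
from dataclasses import dataclass, field

REQUIRED_APPROVALS = 2  # assumed threshold; each organization would set its own


@dataclass
class Report:
    name: str
    author: str
    approvals: set = field(default_factory=set)
    views_last_90_days: int = 0  # assumed usage metric supplied by the tool

    def approve(self, reviewer):
        """Record a peer approval; authors may not certify their own work."""
        if reviewer == self.author:
            raise ValueError("authors cannot certify their own reports")
        self.approvals.add(reviewer)

    @property
    def certified(self):
        return len(self.approvals) >= REQUIRED_APPROVALS


def publish(report, production):
    """Add the report to the production workspace only if it is certified."""
    if not report.certified:
        return False
    production[report.name] = report
    return True


def abandoned(production, min_views=1):
    """List published reports with fewer than min_views recent views."""
    return sorted(
        name for name, r in production.items() if r.views_last_90_days < min_views
    )


if __name__ == "__main__":
    workspace = {}
    report = Report(name="Weekly Sales", author="dana")
    report.approve("lee")
    print(publish(report, workspace))  # one approval is not enough
    report.approve("sam")
    print(publish(report, workspace))  # two peer approvals: published
    print(abandoned(workspace))        # no views recorded yet
```

The approvals give business users a visible signal of trust, and the usage check gives IT the objective list of reports that can be retired.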
Prediction: Visualization tools will develop better approaches for rating and certifying reports. Data storytelling best practices will be embedded in the workflow of the tool.
A Final Word
Here's to a great New Year -- 2021 can be the year we make progress on these data obstacles.
Stan Pugsley is an independent data warehouse and analytics consultant based in Salt Lake City, UT. He is also an Assistant Professor of Information Systems at the University of Utah Eccles School of Business. You can reach the author via email.