The Big Data Honeymoon is Over
What's driving our sudden, intense interest in data management and analytics?
By Aaron Fuller, Principal Consultant and Owner, Superior Data Strategies
The data community has been through an interesting hype cycle over the last couple of years. Hype cycles are nothing new for us; the pattern has repeated for a while. Client/server, object-oriented, service-oriented architecture, and agile are all buzzwords that come to mind.
This cycle is different. It isn't just data professionals reacting. The rest of the world -- the non-technology community -- has joined the conversation in a new way.
Our leaders are not only recognizing but also voicing the importance of big data. It's like they're saying, "The big data! It's everywhere. It's everywhere, and we need it!" They have the "gimme gimme" mentality, but what they're missing is an understanding of what they want or how to get it.
When you start digging into what data professionals mean by big data, you'll realize how odd this situation is.
Are people really that excited about the continuation of the exponential trend in data volume growth? Are they actually interested in the fact that we're managing more of our data in key-value pair and document structures? Are they as fascinated by the idea of open source frameworks for statistical analysis as we are? If these questions leave you with a skeptical smirk on your face, our opinions are alike -- it just doesn't make sense.
What, then, is it? Why do we suddenly have so much interest in data management and analytics?
It probably isn't nearly as sudden as it seems. What was sudden was the way the marketing term "big data" took off. Sometimes with marketing that just happens; someone hits the publicity jackpot. This time they hit the Mega Powerball. When the term "big data" emerged, there had already been a growing appreciation for data management, business intelligence, data integration, and analytics in the wider community. With the addition of a few big scandals related to data collection -- most notably those involving the NSA and the hacking of consumer credit data -- big data has now reached a critical mass where people whose jobs are not primarily about data are realizing the true implications of the capabilities we possess to collect, store, and use information to change our world.
This seemingly sudden -- but truthfully not-so-sudden -- mass realization has led to some heady times for companies, the government, and their vendors. New companies are popping up that sell all kinds of specialized software and cloud services, and existing software companies are innovating their marketing, and sometimes even their products, to meet the rising demand. Enterprises in all types of industries are assembling new analytics teams, hiring data scientists and chief data officers, and spending more money with the vendors. Consultants with the right skills and experience are more valuable than ever, inevitably causing some consultants to overstate their ability to help.
However, we're at the end of the honeymoon. We've basked in the glory of all of this attention and funding for long enough.
Now business leaders are starting to ask to see the value. Those of us who manage data for a living are noticing that all this unbridled enthusiasm has led people to build data systems that don't adhere to the well-established best practices of an enterprise information architecture. We're being challenged again to figure out how we're going to make a new set of technologies and approaches work within our systems, and to prove to organizations that our work produces sustainable, productive assets.
All is not well in most organizations when it comes to the sustainability of their big data solutions. The trend toward using databases that are less structured has challenged traditional data modeling processes and, unfortunately, rather than figuring out how to adjust, many data teams have abandoned data modeling. Because we don't fully understand the content of our big data sources when we dump them into our data lake, we can't capture all of the data dictionary and data lineage metadata early in a project the way we're used to, and many of us have stopped documenting this metadata altogether. In our haste to turn proofs of concept into production systems, there tends to be a real lack of rigor when it comes to data profiling and data quality assessment. This means there's high risk, and many of us know that we're holding a ticking bomb.
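For readers who want something concrete, basic data profiling does not have to be heavyweight. The following is a minimal sketch, not a prescribed method: it assumes records arrive as a list of Python dictionaries, and the names `profile_column`, `profile`, and the `sample` records are hypothetical, invented purely for illustration. It counts missing values, distinct values, and the value types seen in each field, which is often enough to surface surprises before a proof of concept goes to production.

```python
from collections import Counter


def profile_column(rows, column):
    """Return simple profile statistics for one field across all records."""
    values = [row.get(column) for row in rows]
    # Treat None and empty strings as missing (an assumption for this sketch).
    non_null = [v for v in values if v not in (None, "")]
    return {
        "count": len(values),
        "null_count": len(values) - len(non_null),
        "distinct_count": len(set(non_null)),
        "types": dict(Counter(type(v).__name__ for v in non_null)),
    }


def profile(rows):
    """Profile every field that appears in at least one record."""
    columns = sorted({key for row in rows for key in row})
    return {col: profile_column(rows, col) for col in columns}


# Hypothetical sample records, as they might look after landing in a data lake.
sample = [
    {"customer_id": 1, "state": "MI"},
    {"customer_id": 2, "state": ""},
    {"customer_id": "3", "state": "MI"},
    {"customer_id": 4},
]

report = profile(sample)
for column, stats in report.items():
    print(column, stats)
```

Even on four records, a report like this would flag two things worth asking about: `customer_id` arrives as both integers and strings, and `state` is missing in half the rows.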
In data as in marriage, we need to get back to the basics and remember why we fell in love in the first place. We need to get back to the things we've known for some time will create value. We need to ask ourselves, "Do we understand where the data came from? Do we understand the content of the data? Do we know its purpose and how it will be used?"
Let's say you've forgotten about these foundations -- you're through the honeymoon and you're back to the day-to-day grind. It might be time for couples' therapy -- take the time to sit down and analyze what you've been doing.
Deciding to tackle big data opens up a fun new world, but as with relationships, it's not about the wedding and honeymoon -- it's about the marriage. What comes after the new venture? If you're looking for success -- in both relationships and data -- it's essential to have a solid foundation in place.
Aaron Fuller is the principal consultant and owner at Superior Data Strategies and is responsible for guiding clients toward reliable and valuable business solutions for their data warehousing, business intelligence, and enterprise architecture programs. Fuller is skilled in dozens of software products, databases, standards, and methodologies and serves as a faculty member at TDWI. You can reach him at firstname.lastname@example.org