Who Owns Your Data?
New decisions about the meaning of data ownership are urgently necessary in the context of big data and the Internet of Things.
- By Barry Devlin
- July 25, 2016
Data ownership has long been a topic of considerable interest in data warehousing -- at least among proponents of good data governance. Considerable disinterest might be a more appropriate description among business people (who have on occasion been seen fleeing meetings on the topic).
To some extent, their flight might be justified: data ownership is not as obvious as it sounds, and moving forward, the difficulty of determining who owns a piece of data will only increase.
I am using "data" here in the colloquial sense of IT usage. In fact, we are discussing a subset of information ownership, which is a bigger and more challenging topic and beyond the scope of this article. (The distinction is explained in a previous article.)
Until the middle of the last decade, the vast majority of the data stored in traditional business computing systems came from the operational systems that run the business. Such data represents the legally binding position and, to some extent, history of the business. Intellectual property (IP) ownership of this data clearly resides with the business as a corporate entity.
In terms of responsibility for its quality and care, most businesses assign data ownership to the functional units (and their executives) that gather and create it. It is the ongoing and extensive work of this second meaning of data ownership -- the care and maintenance of data -- that business executives have been known to flee. Malcolm Chisholm describes data ownership as a poor analogy for the actual situation.
Data warehousing combines data from multiple sources, creating new data and building a complete -- in theory, at least -- historical record of the business. Much of the resulting data is of higher value and broader applicability than the original. Who, then, should own such data? IP ownership doesn't change, but care becomes more problematic as data is combined and enhanced through various processes operated by different parts of the organization.
The temptation is to assign the duty of care to the organization as a whole, perhaps in the person of the CEO. Of course, this is impractical, plus it tends to confuse the issue by conflating the two types of ownership -- IP and care -- into a single concept.
The emergence of big data and, in particular, the Internet of Things complicates matters even further. In the modern, digitalized business, an increasing proportion -- indeed, the majority -- of data comes from external sources. In such circumstances, who "owns" the data, in both meanings of the word?
Adam Rendle, a lawyer and IP expert, explains the legal position (in U.K. law, at least) regarding who owns data from the Internet of Things in a February 2014 article: "The answer is: no one -- there is no property right in a piece of data itself. The owner of a smart thermostat does not, for example, own the data about how he uses it."
He further clarifies that in EU law, an aggregation or collection of data can be owned, and that it is owned by the party that invested in its collection, aggregation, and organization into a "database." (Note that this is only a layman's interpretation of a two-year-old legal position! Similar or different positions may exist in the U.S. and other jurisdictions.)
Today, society has consented -- perhaps unwittingly -- to the universal and inescapable collection of data from every aspect of our lives. From the words we utter in the presence of our smart TVs to our minute-by-minute location on the highway, from the temperature in our bedrooms to the emotions on our faces in the supermarket, everything is fair game for commercial and governmental collection, aggregation, and organization into databases -- and for subsequent use and misuse.
As a result, privacy has belatedly become a live issue, although as far back as 1999, Sun CEO Scott McNealy proclaimed: "You have zero privacy anyway. Get over it." Assuming privacy is something we want to preserve (or resurrect), new decisions about the meaning and implementation of data ownership -- in both senses of the word -- are mandatory.
In essence, the advent of pervasive Internet of Things data demands that legal IP ownership of data instances (as opposed to data collections) be revisited. Legal ownership of personal data, in the broadest sense, must be vested in the person to whom it refers. That person may license the right to use some or all of this data to other parties -- individual, commercial, and indeed governmental -- under defined, binding, and enforceable conditions.
Care of data licensed in this way remains the responsibility of the party holding the data. The consequences of failing to fully meet this duty of care become onerous in these circumstances. Furthermore, the idea of collecting and storing all data in the hope of discovering some business-changing nugget of information within it becomes less attractive as the risks associated with storing such a vast amount of data increase significantly.
If data governance becomes a requirement that has clear financial and reputational consequences, data ownership will finally get the business attention it deserves.
Dr. Barry Devlin defined the first data warehouse architecture in 1985 and is among the world’s foremost authorities on BI, big data, and beyond. His 2013 book, Business unIntelligence, offers a new architecture for modern information use and management.