Governance in a Changing Data World
Governance is often put on the back burner when organizations move into new technologies because it's becoming more complex. Learn the two biggest ways governance complexity may be affecting your enterprise.
- By Fern Halper, Ph.D.
- June 4, 2013
There is no doubt that the information technology world is getting much more complex. Mobile, social, cloud, and big data can benefit organizations, yet they add complexity to data management. Governance -- which is basically about using processes and controls to ensure that corporate and governmental rules and policies are followed -- is becoming even more important in this changing data landscape. Governance activities might include understanding regulatory issues, dealing with data quality, maintaining standards, ensuring accountability, and keeping data secure and compliant in what is rapidly becoming a larger data ecosystem. Traditional governance will need to expand to address the complexities of these new environments.
Here are just two of the ways that data governance gets more complex in this new world.
Governance Complexity #1: Enter the Cloud
Governance can get tricky in a cloud environment, especially if it is a hybrid cloud. In a hybrid cloud, multiple touch points between private cloud and public cloud services can exist. An on-premises application might connect to a public cloud service. There might be connectivity between clouds (e.g., your data might reside in two SaaS services and you want to pull it into an analytics platform on your premises). You might be loading your ERP or CRM data into a cloud service for business analytics, and that service provider might also pull in social media data that you can use. All of this means that there can be multiple parties involved in handling "your" data. It also means that there is a series of risks associated with this data. These include:
- Compliance risks: These might include regulatory issues such as data jurisdiction. For instance, your company may not be allowed to move information across borders. Can your cloud provider ensure that this won't happen? There are also data access risks to consider (i.e., who is allowed to see the data). All of the data assets must also be properly controlled and maintained. This includes maintaining an audit trail.
- Security risks: These risks include data integrity (i.e., no tampering with the data), data loss, as well as data privacy and confidentiality risks. These might be linked to regulatory compliance risks (such as in the healthcare industry). Data quality controls are also important here. Some cloud providers may have better security in place than you do because security is central to their business, but that does not relieve you of responsibility for your data.
- Contract risks: When you use the cloud, you enter into a contract with an internal provider (i.e., your IT department) or an external provider. That means that you need to read that contract carefully to see how it can affect your data. For example, if the service provider goes out of business, how will you be able to access your data? If there is an incident that affects your data, what is going to be done about it?
In terms of data governance in the cloud, you might use your existing governance board and data stewards, but you need to understand what changes in terms of your processes, policies, and controls. Issues of enforcement with external parties will also be critical. That may mean adding new members to the governance team or additional responsibility for current members. Additionally, although your cloud provider may be managing your data in some form, it is still your company's responsibility to govern your data. A key question is how you can monitor your cloud provider(s) to ensure that your policies are enforced. This monitoring may also involve new members of the governance team.
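As one illustration of what monitoring a provider against a jurisdiction policy might look like, here is a minimal sketch in Python. Everything in it is hypothetical: the dataset names, provider names, region labels, and the `ALLOWED_REGIONS` table are invented for the example, and in practice the inventory of where data lives would have to come from your providers' own reporting or audit interfaces.

```python
from dataclasses import dataclass

# Hypothetical policy table: the regions in which each dataset may be stored.
ALLOWED_REGIONS = {
    "crm_customers": {"eu-west", "eu-central"},
    "erp_orders": {"eu-west", "us-east"},
}


@dataclass(frozen=True)
class StorageRecord:
    """One place where a dataset is held, as reported by a provider."""
    dataset: str
    provider: str
    region: str


def find_violations(records):
    """Return records stored outside their allowed regions.

    A dataset with no policy entry is also flagged, on the assumption
    that ungoverned data should be reviewed rather than ignored.
    """
    violations = []
    for rec in records:
        allowed = ALLOWED_REGIONS.get(rec.dataset)
        if allowed is None or rec.region not in allowed:
            violations.append(rec)
    return violations


# Hypothetical inventory assembled from provider reports.
inventory = [
    StorageRecord("crm_customers", "saas-a", "eu-west"),
    StorageRecord("crm_customers", "saas-b", "us-east"),
    StorageRecord("web_logs", "saas-a", "eu-west"),
]

violations = find_violations(inventory)
for v in violations:
    print(f"{v.dataset} at {v.provider} in {v.region}: outside policy")
```

Flagging datasets that have no policy at all is a design choice in this sketch; it reflects the point that the company, not the provider, remains responsible for governing its data.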
Governance Complexity #2: Enter Big Data
The advent of big data is another area that will impact data governance. There are all kinds of data that might make up your big data environment. These include documents, e-mail messages, social media, audio, video, as well as geospatial data and sensor data. Some of this data is internal to your organization, some of it is not. You may already have processes for data security, privacy, and governance in place for your existing structured databases and data warehouses. These processes may need to be extended for your big data implementation.
For example, for some of their first big data implementations, companies use data they had already collected but didn't know what to do with. You will need to ask yourself whether the processes, policies, and controls you had in place when you were simply collecting and storing the data are enough now that you might be doing something with it.
Although traditional data governance can be a good starting point for big data governance, new infrastructure, new data sources, and new kinds of analytics all introduce new risks. For example:
- Data privacy: The beauty of big data and big data analytics is that you can collect lots of data about people and behavior and answer questions that you couldn't before and at a level of detail that wasn't possible. However, this comes with its own set of data privacy issues. Who is going to keep on top of privacy legislation as part of the governance process? You certainly don't want any data you collect to be used in the wrong way.
- New sources of data: Another benefit of big data technology is that it makes it possible to support and analyze new forms of data, such as streaming sensor data and social media data. However, someone will have to set and implement the policies that deal with processing, managing, and storing this data. For example, you might need to have a new data steward to deal with streaming data from sensors. A different individual might be needed to monitor social media data or geospatial data because it has a different origin and different structure than traditional relational data. You will also need to make sure you have the right policies in place if you integrate this data with traditional data sources to ensure that data quality is not compromised. Retention and prioritization policies for these new sources of big data will also have to be put in place.
- Data in a proof-of-concept (POC) project: Organizations typically experiment with big data analytics or complete some POC work before the projects go mainstream. Smart organizations will have thought ahead and provided controlled test beds for big data experimentation. In this way, if the system does go live, its data is already being handled in a governed and controlled way.
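The retention policies mentioned above for new data sources could be written down in a form that software can check. The following is a minimal sketch, not a recommended policy: the source names and retention periods in `RETENTION_DAYS` are made-up values chosen only to show the idea.

```python
from datetime import date, timedelta

# Hypothetical retention periods, in days, for each new data source.
RETENTION_DAYS = {
    "sensor_stream": 30,
    "social_media": 90,
    "geospatial": 365,
}


def is_expired(source, collected_on, today):
    """Return True if data from `source` has outlived its retention period.

    Raises KeyError when no policy exists for the source, so that a new
    kind of data cannot be retained silently without a governance decision.
    """
    days = RETENTION_DAYS.get(source)
    if days is None:
        raise KeyError(f"no retention policy defined for source {source!r}")
    return today - collected_on > timedelta(days=days)


# Example: sensor data collected on January 1, checked on June 4.
print(is_expired("sensor_stream", date(2013, 1, 1), date(2013, 6, 4)))
```

Refusing to answer for a source with no policy, instead of defaulting to "keep forever," is one way to force the question of who owns each new data source.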
A Final Word
Clearly, traditional governance processes will need to grow as the data ecosystem continues to evolve. Yet governance is often put on the back burner as organizations move into new technologies. The problem is that sometimes it is not brought to the forefront until the organization is already suffering the consequences of poor governance.