From Legacy to Hybrid Data Platforms: Managing Disparate Data in the Cloud (Part 2 of 2)
Important applications for an adaptive analytics fabric include privacy regulation compliance and self-service business intelligence.
- By David P. Mariani
- January 23, 2020
In the first part of this discussion, we explored how an adaptive analytics fabric can enable companies to leverage the power of their legacy investments, surfacing legacy data alongside data from newer systems as though it is all from a single source. There is no doubt that by unlocking insights from this collective data faster, businesses will be able to respond to new opportunities. However, one of the biggest reasons the fabric is needed goes beyond efficiency and performance gains: data in transit can easily be stolen for nefarious purposes.
The "Lift and Shift" Approach Sets You Up for Danger
IT shops need to marry a massive amount of siloed data without re-engineering the entire IT architecture. In a traditional data migration, this is accomplished with ETL. Data is extracted, transformed, and loaded into a new system, from one environment where the data is not optimized for analytics to another that is. The problem with this approach is that the data is at risk of being compromised by human error or malicious intent while in transit between cloud and on-premises environments.
These risks can cost you dearly. Consider the GDPR (General Data Protection Regulation). Just a year after going into effect, the regulation has proven it has real teeth. In July 2019, the Information Commissioner's Office (the U.K.'s data protection authority) announced its intention to fine British Airways $228 million for exposing the personal data of 500,000 customers.
Beyond the GDPR, organizations also have to maintain compliance with various laws and regulations related to their industry. For example, compliance acts for financial institutions, such as Gramm-Leach-Bliley, put the onus of securing data in the cloud on the organizations themselves, not on the cloud providers.
An adaptive analytics fabric enables you to easily integrate structured, semistructured, and unstructured data in all formats, no matter where the data is located. Because the data remains in its place, it does not have to be copied between environments, and any data that does travel is encrypted in flight without human handling. In short, the risks that come with the ETL approach are eliminated.
Data Needs to Be Locked Down Without Locking Out Users
However, your data cannot be allowed to stay locked up inside legacy databases or remain encrypted, and you shouldn't require IT teams to extract data and then share it with BI teams. When shifting from a legacy mindset to a hybrid data platform approach, all databases, whether on premises or in the cloud, have to be equally discoverable and usable by analysts.
First, users need a central location where they can go to find trustworthy data. The analytics fabric provides visibility into every corner of the organization's data and becomes the single source of truth, effectively eliminating the need to create local extracts of questionable accuracy that may slowly go stale on the user's machine.
Second, when users can query the online data directly, it is imperative that the analytics fabric be equipped to translate the varying semantic idiosyncrasies of each business intelligence tool. Essentially, the analytics fabric must provide a universal semantic layer to allow users to work with data through their preferred analytical tools, confident that similar users working with different tools will get consistent results. In this fashion, the analytics fabric unlocks secure, convenient, self-service analytics for the organization's business users and analysts.
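To make the idea of a universal semantic layer concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not a description of any particular product: the `Metric` class, the `METRICS` registry, the table and column names, and the two request formats ("tool A" and "tool B") are all hypothetical. The point it shows is that a metric is defined once, and every tool's request is translated into the same query.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    """A business metric defined once in a hypothetical semantic layer."""
    name: str
    expression: str  # aggregate expression over physical columns
    table: str       # physical table the metric reads from


# Hypothetical registry: the single shared definition of each metric.
METRICS = {
    "net_revenue": Metric(
        name="net_revenue",
        expression="SUM(amount - discount)",
        table="sales.orders",
    ),
}


def compile_query(metric_name: str, group_by: str) -> str:
    """Translate a tool-agnostic request into one SQL statement."""
    metric = METRICS[metric_name]
    return (
        f"SELECT {group_by}, {metric.expression} AS {metric.name} "
        f"FROM {metric.table} GROUP BY {group_by}"
    )


def from_tool_a(request: dict) -> str:
    """Made-up format for tool A: a dict with 'measure' and 'dimension'."""
    return compile_query(request["measure"], request["dimension"])


def from_tool_b(request: str) -> str:
    """Made-up format for tool B: a string such as 'net_revenue BY region'."""
    measure, dimension = request.split(" BY ")
    return compile_query(measure.strip(), dimension.strip())


# Both tools ask the same question in different ways and get the same SQL.
sql_a = from_tool_a({"measure": "net_revenue", "dimension": "region"})
sql_b = from_tool_b("net_revenue BY region")
assert sql_a == sql_b
```

Because the definition of `net_revenue` lives in one place, changing it (for example, to exclude refunds) changes the result for every tool at once, which is what keeps analysts on different tools consistent with one another.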
Secure, Convenient, Self-Service Analytics
By leveraging an adaptive analytics fabric to unite data across systems and unlock fast, consistent insights for BI teams, an enterprise is primed to save big with greater efficiencies.
All existing security solutions and policies governing data remain in place, and global security and compliance policies are applied across all data. Taking this approach instead of a traditional "lift and shift" migration reduces your enterprise's exposure to a GDPR penalty.
Dave Mariani is the founder and chief technology officer of AtScale. Prior to AtScale, Dave ran data and analytics for Yahoo!, where he pioneered the development of Hadoop for analytics and created the world's largest multidimensional analytics platform. He also held the position of CTO for Bluelithium, where he managed one of the first display advertising networks delivering 300M ads per day powered by a multiterabyte behavior targeting data warehouse. Dave is a big data visionary and serial entrepreneur. You can contact the author at LinkedIn.