How to Get More From Your Data in 2019
These three trends will lead the way to exploiting the value in your data.
- By Angel Viña
- December 20, 2018
It’s no surprise that data has become a critical asset for most organizations. Its ascension has been propelled by a succession of technologies: the database in the 1980s, the data warehouse in the 1990s, cloud and social data in the 2000s, and big data in the 2010s. Over the next decade, natural language processing (NLP), information catalogs, and multicloud architectures will drive the innovations that enable organizations to extract even greater value from data.
Data professionals can ride this wave to a better career by paying attention to these data trends!
Trend #1: Transforming queries with natural language processing
SQL has been the bedrock of querying analytical engines for the past 30 years. Just as structured data has given way to unstructured data over the past decade, the way analysts query data warehouses and cubes is now in flux. Instead of constructing a SQL query that joins the products and point-of-sale tables, sums sales for the current year, and selects products above a certain profit threshold, analysts can simply type, “What are the most profitable products in the current year?” Thanks to natural language processing (NLP), data analysts no longer have to be SQL jockeys!
NLP parses the textual command and tokenizes the sentence into keywords that can be used to construct a query. A real-time engine, such as a data virtualization layer, can then take that construct and create a SQL query under the covers. It does this by reaching across multiple data sources, extracting the relevant data from each, and combining the results into a set that answers the question. The data analyst can then view the profitable products as a list or a chart within a visualization tool. By letting people query data in plain English, NLP will drive an explosion of self-service among analysts and business users alike.
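To make the tokenize-then-construct idea concrete, here is a minimal sketch in Python. It is not how any particular product works; the keyword mappings, table names, and columns (`products`, `sales`, `revenue`, `cost`) are all invented for illustration, and a real NLP engine would use a trained parser rather than a lookup table.

```python
import re

# Hypothetical keyword-to-SQL mappings; a production engine would use a
# trained language model or grammar, not a hand-built dictionary.
METRIC_MAP = {"profitable": "SUM(s.revenue - s.cost) AS profit"}
ENTITY_MAP = {"products": ("products p", "p.product_id = s.product_id")}

def question_to_sql(question: str) -> str:
    """Tokenize an English question into keywords and assemble a SQL query."""
    tokens = re.findall(r"[a-z]+", question.lower())
    # Pick out the metric and entity keywords recognized in the question.
    metric = next(METRIC_MAP[t] for t in tokens if t in METRIC_MAP)
    table, join = next(ENTITY_MAP[t] for t in tokens if t in ENTITY_MAP)
    return (
        f"SELECT p.name, {metric} "
        f"FROM {table} JOIN sales s ON {join} "
        "WHERE s.year = 2019 "
        "GROUP BY p.name ORDER BY profit DESC"
    )

print(question_to_sql("What are the most profitable products in the current year?"))
```

Even this toy version shows the division of labor the trend depends on: the NLP layer only has to recognize intent, while the query-generation layer owns the joins, filters, and aggregation.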
Trend #2: Enabling uniform data governance with information catalogs
Thus far, data governance regimes have been siloed across technology projects such as data quality, master data management, or a data warehouse. However, the need for enterprise-wide data governance that transcends all technologies already exists. Now, thanks to information catalogs, business users and IT have a means to discover, organize, and describe enterprise data assets in a single place. Businesses employ information catalogs to search for data assets relevant to analysis as an alternative to checking numerous siloed, disparate data sources.
Today’s advanced, dynamic information catalogs provide business users with self-service interfaces for performing Google-like searches. An information catalog integrated with data virtualization is a good first step toward helping any business user easily locate relevant data from which to extract actionable insights.
Beyond simple searches, business users can trace the lineage of the data: which source it originated from and how it was combined with other data to yield the final result set. Users can also view the associations across related data. By providing a common vocabulary across the enterprise, information catalogs enable uniform data governance across the entire organization.
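The search-plus-lineage behavior described above can be sketched with a tiny in-memory catalog. This is an illustrative data structure only, assuming made-up assets (`sales`, `products`, `profit_by_product`) and source systems; real catalogs also track owners, tags, and glossary terms.

```python
from dataclasses import dataclass, field

@dataclass
class CatalogEntry:
    name: str
    source: str          # system of origin (e.g. a hypothetical "ERP")
    description: str
    derived_from: list = field(default_factory=list)  # upstream asset names

class InformationCatalog:
    def __init__(self):
        self._entries = {}

    def register(self, entry: CatalogEntry):
        self._entries[entry.name] = entry

    def search(self, keyword: str):
        """Google-like keyword search over asset names and descriptions."""
        kw = keyword.lower()
        return [e for e in self._entries.values()
                if kw in e.name.lower() or kw in e.description.lower()]

    def lineage(self, name: str):
        """Walk derived_from links back to the original source systems."""
        entry = self._entries[name]
        if not entry.derived_from:
            return [entry.source]
        sources = []
        for parent in entry.derived_from:
            sources.extend(self.lineage(parent))
        return sources
```

Registering a derived asset such as `profit_by_product` with `derived_from=["sales", "products"]` lets `lineage()` answer exactly the question the paragraph raises: which systems the final result set ultimately came from.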
Trend #3: Easing the move to the cloud with a hybrid, multilocation architecture
As more and more data migrates to the cloud, its center of gravity shifts: the balance tips toward cloud platforms, with data spread across both cloud and on-premises sources. IT teams are discovering that migration is not a matter of lifting and shifting the data in one step and then sunsetting the on-premises system, because doing so would cause downtime for business users. The move has to be a gradual transition, requiring a hybrid approach that accesses data both on premises and in the cloud for a limited period.
Integration of the data across these two universes must transition to a multilocation architecture. Data virtualization facilitates a low-risk, phased approach to cloud migration. A multilocation architecture provides many benefits, such as data location transparency and data abstraction. Because the underlying systems are abstracted away, business users never need to know or care whether the data they need originated in the data center or in the cloud. IT teams can gradually shift data from on premises to the cloud, ensuring that security and governance are not compromised.
A Final Word
Data professionals can certainly benefit by becoming aware of, and embracing, these trends. As data becomes ubiquitous within organizations, having a penchant for deriving value from data using these innovative developments will propel both the business and your professional career to greater heights.
About the Author
Angel Viña is CEO and founder of Denodo, a leading provider of data virtualization software. To learn more, visit www.denodo.com or follow the company on Twitter or Facebook.