Q&A: Big Data Trends
How are enterprises gearing up to conquer big data? What issues are they worried about? How do they plan to use big data? Read about the changes ahead.
- By James E. Powell
- January 19, 2016
As the director of product marketing and strategy for the Progress Data Connectivity and Integration team, Paul Nashawaty keeps a close eye on what enterprises are doing about big data. Paul is responsible for applying practical business methodologies using technological solutions to drive success in organizations. His focus is on understanding technological concepts and applying their value to a business's bottom line, resulting in uniquely driven solutions with rapid time to value. He takes ideas from inception to market using an agile approach.
We recently asked him about the top big data trends he's seeing now and what enterprises can expect from the big data market in the months ahead.
BI This Week: As an observer of big data trends, what do you see happening with big data in the enterprise?
Paul Nashawaty: The big data landscape is constantly evolving -- with new products and new startups seeming to enter the market every day, big data is undergoing major changes. As a whole, the industry is starting to shift away from relational environments to accommodate the variety of data sources, a shift that has ultimately created a fragmented data market. However, in the next few years the market will start to normalize through consolidation, while best-of-breed solutions will continue to live on.
In the meantime, factors such as application performance, cost-saving measures, distributed features, security, and the rise of self-service business intelligence tools will continue to impact the businesses that utilize big data.
Let's tackle the first trend you mentioned. What cost-saving measures are enterprises taking in order to enter the big data market?
Relational databases, the dominant technology for storing and managing data, are not designed to handle big data. As such, enterprises are exploring different big data sources and databases that have proven to be more cost-effective given their less-intrusive licensing costs. Enterprises looking to streamline resources and be more cost conscious often rely on a pay-as-you-go model as opposed to a perpetual licensing model -- the familiar OpEx vs. CapEx debate.
For example, if you're a small business or entrepreneur, you may want to consider going OpEx, so you can get off the ground quickly with no licensing ties. On the other hand, well-established businesses and enterprises may benefit from taking the CapEx route because it can give them more resources and competitive differentiation as well as tax benefits for investors. Ultimately, businesses focused on big data can no longer rely on the one-size-fits-all relational model; they must look toward new databases better designed to handle current workloads.
What application performance issues are enterprises worried about? For instance, are they concerned about having a large enough window to complete processing of big data overnight or are they more concerned about processing big data in real time? What changes to their environment are they considering in order to have sufficient computing power?
Processing data is a phrase that implies data is moving, so the trend that we're starting to see is the ability to access real-time data in its place. The ability to access data in real time provides users with critical insight into the data. Such insight includes how the information is flowing, security details, and trends that can ultimately impact important business decisions and provide an important competitive advantage within respective industries.
Additionally, enterprises are moving away from the big infrastructures typically found within a traditional environment and into a lower-cost model where information resides in the cloud or a hybrid environment. As such, enterprises are able to outsource some of their computing power to data centers so they are only using resources when they need them.
When you mention "futureproofing" your data sources, what do you mean? What kind of data sources are enterprises actively incorporating now and which types are they putting off?
Traditionally, data sources have used relational environments, which are a collection of data items organized as a set of formally described tables from which data can be accessed or reassembled in many ways without having to reorganize the database tables. The standard user and application program interface to a relational database is Structured Query Language (SQL). Although that probably won't change for some time, we're starting to see enterprises adopt more non-traditional sources. For example, technologies such as Hadoop and NoSQL provide platforms that enable scalable, flexible, cost-effective, rapid, and "resilient to failure" solutions.
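The relational model described here -- formally described tables whose data can be "reassembled in many ways" through SQL without reorganizing the tables themselves -- can be sketched with Python's built-in sqlite3 module. The table and column names are invented for illustration:

```python
import sqlite3

# Two formally described tables; the schema is purely illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY,
                         customer_id INTEGER REFERENCES customers(id),
                         total REAL);
    INSERT INTO customers VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO orders VALUES (1, 1, 250.0), (2, 1, 125.0), (3, 2, 300.0);
""")

# SQL reassembles the data -- here, revenue per customer -- without any
# change to how the underlying tables are organized.
rows = conn.execute("""
    SELECT c.name, SUM(o.total) AS revenue
    FROM customers c JOIN orders o ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY revenue DESC
""").fetchall()
print(rows)  # [('Acme', 375.0), ('Globex', 300.0)]
```

The same two tables could be recombined into any number of other shapes by changing only the query -- the property that made SQL the standard interface the answer refers to.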
Many organizations will combine structured and unstructured data, and the platforms I've mentioned offer the space to host and handle these differing datasets. These platforms allow businesses to cope with large amounts of disparate data. They also help in creating a real-time environment for data. If real-time or near-real-time analytics are required, it's important to select a vendor that can actually make this a reality. With all of the emerging trends around big data and analytics, IT organizations need to create conditions that will allow analysts and data scientists to experiment to find the best data solution for their businesses.
When you talk about big data, are you talking about big data from existing sources or smaller data sources that are combined and integrated into a bigger overall data source? What types of sources are enterprises most interested in today?
Big data means a lot of things to a lot of people. From our perspective, big data is the ability to access information where it lives and to get the applications to the data regardless of source. Big data can reside anywhere and can be a combination of any set of data. Whether it's a relational database or a data lake, it's going to reside in many places, so users need to be sure that their data environment has the ability to access all of the information despite the number of different sources it may be coming from. Companies are most interested in anything to do with business intelligence and analytics, as this is often the data that lives within the enterprise. This can include customer information, retail analytics, or transactional data depending on the nature of the business.
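The idea that data "can reside anywhere" -- and that the same analysis should reach it regardless of source -- can be sketched by combining a relational table with a semi-structured record set side by side. The schema, records, and field names below are invented for illustration:

```python
import json
import sqlite3

# Source 1: transactional data in a relational database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE sales (region TEXT, amount REAL)")
db.executemany("INSERT INTO sales VALUES (?, ?)",
               [("east", 100.0), ("west", 80.0), ("east", 40.0)])

# Source 2: semi-structured records as they might sit in a data lake.
lake_records = json.loads('[{"region": "west", "amount": 60.0},'
                          ' {"region": "east", "amount": 25.0}]')

# Combine both sources in place, without first copying either one
# into a central warehouse.
totals: dict[str, float] = {}
for region, amount in db.execute("SELECT region, amount FROM sales"):
    totals[region] = totals.get(region, 0.0) + amount
for rec in lake_records:
    totals[rec["region"]] = totals.get(rec["region"], 0.0) + rec["amount"]

print(totals)  # {'east': 165.0, 'west': 140.0}
```

Real deployments would use a federation or virtualization layer rather than hand-written loops, but the shape of the problem -- one answer drawn from several differently structured sources -- is the same.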
It's one thing to collect big data, it's another to make use of it. What plans do enterprises typically have when it comes to analytics?
Enterprises are using big data to better determine how to run their operations more efficiently by giving more people an easier way to access the data. The essential reason why having lots of data is helpful to a business is that it allows you to find answers to potentially revealing questions. If big data is used effectively (i.e., in cloud applications, dashboards, and real-time analysis), it will generate important insight that can often give the business a competitive advantage and help guide important C-suite decisions.
There is an endless amount of data within organizations to be processed -- not only present data to analyze but also a backlog of historical information that can provide new insights into customer behaviors. Organizations must consider all of the elements that come together in a big data ecosystem and ensure that they play well together, creating a robust, fit-for-purpose whole.
Is the industry shifting more to self-service tools, and if so, why? What does the future look like for traditional BI tools?
Historically, businesses ran their BI in-house against a single data source from a relational environment. This method makes sense for structured data from a single source that can be classified into the columns and rows of a data warehouse, usually geared toward managers and directors. However, the explosion of data has forced companies to be more strategic about how they access information in real time in order to be competitive. The days of moving data from external sources to a specific data warehouse are gone as we continue to see the amount of data growing at an incredible rate.
In the past, BI was a competitive advantage for enterprises that had access to all of the data, but self-service tools have allowed SMBs to compete on an even playing field, providing low cost of entry and supporting infrastructure. Previously, businesses paid a pricey fee upfront for traditional BI, but now businesses of all sizes can utilize subscription-based models for real-time access to data to help support every business decision. The only organizations that may want the pendulum to swing back to traditional BI are those that have now lost their competitive advantage due to self-service tools.
James E. Powell is the editorial director of TDWI, including the Business Intelligence Journal and Upside newsletter.