4 Changes with Big Data Analytics
Analyzing big data is different from other analytics. Here are four things worth knowing about analytics at scale.
- By Fern Halper, Ph.D.
- April 7, 2015
Organizations are facing a data deluge. Just think about all of the data out there. There is structured machine-generated data such as transaction data, sensor data, point-of-sale data, and financial data. There is human-generated structured data such as clickstream data. Then there is text and video data, some generated by machines, some generated by humans.
All of this combined represents massive amounts of increasingly disparate and real-time data. TDWI Research indicates that organizations are very interested in analyzing and (hopefully) acting on this data. In many ways, this mega-data is driving big data analytics, also known as analytics at scale. The capability to do analytics at scale, that is, to accommodate the growth in your data while bringing the right tools to bear on it at the right performance, can provide unique opportunities for your organization. It means expanding the kind of analysis you can do. It means deriving real value from your data.
Big data also means that certain things change about analyzing it -- so what changes with big data analytics? Here are four changes worth noting:
Change #1: Algorithms get refactored. Refactoring an algorithm means changing its internal structure without changing its external behavior (i.e., a regression is still a regression). Vendors are refactoring algorithms to work with big data, for instance, by rewriting them to run in a distributed environment without the algorithms losing integrity. These algorithms might also be parallelized in order to maintain performance. Think about statistical algorithms that once worked on a thousand rows of data now working on a billion rows, and you can see why refactoring is important.
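To make the idea concrete, here is a minimal sketch of one way a simple linear regression could be refactored for partitioned data. It is an illustration, not any vendor's implementation. Each partition is reduced to a handful of running sums, the sums are merged, and the fit is computed once at the end. The names `Stats`, `partial_stats`, `merge`, and `fit` are hypothetical, the data is synthetic, and the built-in `map` stands in for workers on separate machines.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass
class Stats:
    """Running sums that are sufficient to fit y = slope * x + intercept."""
    n: int = 0
    sx: float = 0.0
    sy: float = 0.0
    sxx: float = 0.0
    sxy: float = 0.0


def partial_stats(rows):
    """Summarize one partition of (x, y) rows; this step can run anywhere."""
    s = Stats()
    for x, y in rows:
        s.n += 1
        s.sx += x
        s.sy += y
        s.sxx += x * x
        s.sxy += x * y
    return s


def merge(a, b):
    """Combine two partition summaries; the order of merging does not matter."""
    return Stats(a.n + b.n, a.sx + b.sx, a.sy + b.sy, a.sxx + b.sxx, a.sxy + b.sxy)


def fit(s):
    """Ordinary least squares for one predictor, computed from the merged sums."""
    slope = (s.n * s.sxy - s.sx * s.sy) / (s.n * s.sxx - s.sx ** 2)
    intercept = (s.sy - slope * s.sx) / s.n
    return slope, intercept


# Synthetic data on the line y = 3x + 2, split into four partitions.
data = [(float(x), 3.0 * x + 2.0) for x in range(1000)]
partitions = [data[i:i + 250] for i in range(0, len(data), 250)]

# "Distributed" fit: summarize each partition, merge, then solve once.
slope, intercept = fit(reduce(merge, map(partial_stats, partitions)))

# Single-pass fit over all rows, for comparison.
single_slope, single_intercept = fit(partial_stats(data))
```

The internal structure changes (summaries per partition, then a merge), but the external behavior does not: both fits return the same slope and intercept.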
Change #2: New algorithms are introduced. New forms of disparate data need to be analyzed, and this may lead to the development of new algorithms. For instance, consider the case of preventive maintenance on an oil rig. The rig might have sensors on different components that collect data about temperature, pressure, viscosity, and flow rates. Each sensor creates a stream of time-series data, and there can be many of these streams that need to be analyzed together.
There might also be video images that become part of the data picture. It's easy to picture the need for some new algorithms to help analyze this kind of data. Similarly, in the case of disease spread, it might be important to add geospatial data to the mix in a predictive model and display the results accordingly. This might change the algorithm.
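As a rough illustration of what analyzing several sensor streams together can involve, the sketch below aligns streams on shared timestamps and flags readings that stray far from their recent history. This is a deliberately simple technique (a trailing-window deviation check), not a preventive maintenance algorithm from any product. The function names, window size, threshold, and readings are all assumptions made up for the example.

```python
from collections import deque
from statistics import mean, pstdev


def align(streams):
    """Join {sensor: [(t, value), ...]} on timestamps present in every stream."""
    common = set.intersection(*(set(t for t, _ in s) for s in streams.values()))
    lookup = {name: dict(s) for name, s in streams.items()}
    return [(t, {name: lookup[name][t] for name in streams}) for t in sorted(common)]


def rolling_anomalies(rows, window=5, threshold=3.0):
    """Flag (t, sensor) when a value is more than `threshold` standard
    deviations from the mean of that sensor's previous `window` readings."""
    history = {}
    flagged = []
    for t, readings in rows:
        for name, value in readings.items():
            past = history.setdefault(name, deque(maxlen=window))
            if len(past) == window:
                mu, sigma = mean(past), pstdev(past)
                if sigma > 0 and abs(value - mu) > threshold * sigma:
                    flagged.append((t, name))
            past.append(value)
    return flagged


# Synthetic readings: temperature spikes at t=8; pressure has one extra
# timestamp (t=10) that alignment drops because temperature lacks it.
temperature = [(t, 95.0 if t == 8 else 80.0 + t % 2) for t in range(10)]
pressure = [(t, 30.0 + t % 3) for t in range(11)]

rows = align({"temperature": temperature, "pressure": pressure})
flagged = rolling_anomalies(rows)
```

With many streams, the alignment step is often the hard part: sensors report at different rates and with gaps, which is one reason new or adapted algorithms may be needed.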
Change #3: Analytics moves closer to the data. This is a concept that has been talked about a lot in the past few years. With all of the data being created, it would be inefficient to lift that data out of wherever it is housed in order to run analytics on it, so many vendors offer the capability to push the analytics down into the system that stores the data. Some are even putting the analytics into Hadoop.
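The difference between moving the data and moving the analytics can be sketched with a small example. Here Python's built-in `sqlite3` module stands in for whatever system houses the data (it is not Hadoop or a vendor platform), and the `readings` table and its contents are invented for the illustration. The first approach pulls every row out and averages it in the client; the second pushes the aggregation down so only the results come back.

```python
import sqlite3

# An in-memory database stands in for the system where the data is housed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (sensor TEXT, value REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [("temperature", 80.0 + i % 5) for i in range(10_000)]
    + [("pressure", 30.0 + i % 3) for i in range(10_000)],
)

# Approach 1: lift all the data out, then analyze it in the client.
rows = conn.execute("SELECT sensor, value FROM readings").fetchall()
totals = {}
for sensor, value in rows:
    count, total = totals.get(sensor, (0, 0.0))
    totals[sensor] = (count + 1, total + value)
pulled = {sensor: total / count for sensor, (count, total) in totals.items()}

# Approach 2: push the analysis down; only one row per sensor is returned.
pushed = dict(
    conn.execute("SELECT sensor, AVG(value) FROM readings GROUP BY sensor").fetchall()
)

conn.close()
```

Both approaches give the same averages, but the first transfers 20,000 rows and the second transfers two. At a billion rows, that gap is the argument for moving the analytics to the data.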
Change #4: Open source becomes more popular. Open source languages such as R and Python are taking their place in the big data analytics ecosystem. Open source is important because it allows a wide community to engage in innovation. For instance, R is a free, open source software environment for statistics, graphics, and more. R is known for being memory constrained, however, so it does not scale well on its own. To address this, vendors are providing tools to make R more production ready.
Python is also gaining steam for analysis. Python is an open source programming language that comes with an extensive standard library. It is gaining popularity because many consider it a good programming language that is relatively easy to use.
What You Need to Know
These are just a few of the changes that big data analytics brings. For more information about big data analytics, tune in to the replay of our recent Webinar, Analytics at Scale: What You Need to Know. [Editor's note: Short registration required for first-time access.]