LESSON - Emerging Technologies for Big Data Analytics
By Jon Bock, Director of Product Marketing, Aster Data
With more than 60 percent data growth per year in many enterprise applications, and over 100 percent in most Internet applications, addressing the challenges of “big data” has become a top priority. Organizations that want to gain an edge over the competition need to cost-effectively load and store terabytes to petabytes of data, perform rich analysis on these massive data sets, and do it at ultrafast speeds. Driven by uses such as fraud detection, customer behavior analysis, trending and forecasting, scenario modeling, and deep clickstream analysis, faster and deeper analysis of massive data sets has become a mainstream requirement.
Enabling this new generation of big data analytics requires overcoming the performance, scalability, and manageability limitations of traditional approaches. In the past, these limitations forced data analysts to accept limited and lower-quality analytics, but emerging technology in three key areas is enabling a new generation of analysis.
High Performance, Scalable Data Platforms
Traditional architectures struggle to provide fast results to queries on large data sets because they rely on monolithic architectures designed in an era of smaller data sets and slower data growth. They are typically expensive and cumbersome to scale as data grows: systems are inflexible, upgrades are often forced into "super-size" steps, downtime is required, and upgrade processes are error-prone.
Scaling data and analytics to manage terabytes or petabytes of data requires pervasive parallelism. Organizations are turning to massively parallel database architectures that parallelize all functions of the system, from loading to query processing. They are also turning to frameworks designed for large-scale data processing. Google, one of the first to face the challenge of analyzing petabyte-scale data, pioneered a software framework called MapReduce for fast processing of large amounts of data that is increasingly being leveraged for big data analytics.
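The MapReduce model mentioned above can be illustrated with a toy word-count sketch in Python. This is not Google's implementation; it only shows the two phases the framework parallelizes: a map phase that emits key-value pairs independently per input record, and a reduce phase that combines all values sharing a key. In a real cluster, both phases run across many machines at once.

```python
from collections import defaultdict

def map_phase(documents):
    # Map: emit a (word, 1) pair for every word in every document.
    # Each document can be processed on a different machine.
    for doc in documents:
        for word in doc.split():
            yield (word, 1)

def reduce_phase(pairs):
    # Reduce: sum the counts for each distinct word.
    # Each key's values can be reduced independently, in parallel.
    counts = defaultdict(int)
    for word, count in pairs:
        counts[word] += count
    return dict(counts)

docs = ["big data analytics", "big data platforms"]
result = reduce_phase(map_phase(docs))
# result: {"big": 2, "data": 2, "analytics": 1, "platforms": 1}
```

Because the map function sees one record at a time and the reduce function sees one key at a time, the framework can split terabytes of input across thousands of nodes without the programmer writing any parallel code.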
Traditional architectures for data management and analytics are simply not built to move terabytes to petabytes of data through the data pipeline to the analytic application for processing. The larger the data volume, the greater the time and effort needed to move it from one location to another. The resulting performance and latency problems are so severe that application developers and analysts commonly compromise the quality of their analysis by avoiding big data computations.
“It’s really innovative, and I don’t use those terms lightly. Moving application logic into the data warehousing environment is a logical next step.” —James Kobielus, Forrester Research
New technology allows analytic applications to be pushed into the database so they are co-located with the data. Businesses can embed existing analytic application logic in the database where the data resides, eliminating data movement and delivering major improvements in the speed and depth of analysis, and making it possible to analyze complete, very large data sets rather than relying on sampling.
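The difference between pulling data out to the application and pushing the computation into the database can be sketched with Python's built-in SQLite engine. This is only a toy stand-in for a parallel analytic database such as Aster Data nCluster, but the contrast is the same: in the pull-based approach every raw row crosses the database boundary, while in the push-based approach only the small aggregated result does.

```python
import sqlite3

# A tiny clickstream table standing in for terabytes of event data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clicks (user_id INTEGER, page TEXT)")
conn.executemany("INSERT INTO clicks VALUES (?, ?)",
                 [(1, "home"), (1, "cart"), (2, "home"), (2, "home")])

# Pull-based: move every raw row out of the database, then aggregate
# in application code -- cost grows with the raw data volume.
app_side = {}
for user_id, page in conn.execute("SELECT user_id, page FROM clicks"):
    app_side[(user_id, page)] = app_side.get((user_id, page), 0) + 1

# Push-based: the aggregation runs inside the engine, next to the data;
# only the (much smaller) result set crosses the boundary.
db_side = {
    (u, p): c for u, p, c in conn.execute(
        "SELECT user_id, page, COUNT(*) FROM clicks "
        "GROUP BY user_id, page"
    )
}

assert app_side == db_side  # same answer, far less data movement at scale
```

At four rows the two approaches are indistinguishable; at petabyte scale, the pull-based loop becomes the bottleneck the article describes, which is why co-locating the logic with the data matters.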
Tools for Rapid Development
Traditional analytic application development within organizations is commonly plagued by complexity and inefficiency throughout the application lifecycle, from coding and integration to testing and deployment. These challenges are caused by the number of complex manual steps required, limited resources available for testing, and the need to involve multiple people and groups throughout the process.
Addressing these challenges requires simplifying and automating the process of creating and deploying analytic applications. Integrated development environments, desktop testing environments, and reusable building blocks for analytic applications are becoming available to address this critical pain point.
The need to provide fast and rich analytics on big data is forcing enterprises to rapidly evolve their data management and analytic architectures. Adopting technology that addresses the performance, scalability, and complexity challenges of big data analytics is critical to deliver the insights necessary to leverage the value in organizations’ data.
For a free white paper on this topic from Aster Data, click here and choose the title "Aster Data nCluster: A New Architecture for Large-Scale Data Analytics." For more free white papers, click here.