View online: tdwi.org/flashpoint
February 5, 2015
Discovery vs. Decision Analytics
Nauman Sheikh
When a big data analytics project is initiated, the first thing your organization needs to understand is which type of analytics to start with: discovery or decision analytics. This article explains the differences between the two approaches, as well as the strengths and weaknesses of each, so you can make a decision and set expectations accordingly.

Discovery Analytics

Here is an example, from the film Iron Man 3, of how visualization and insights work. The character Tony Stark uses natural language to ask his computer, Jarvis, to:
Jarvis produces a visualization that plots all explosion reports across time and geography on a 3-D plane, with shades of red marking the intensity of each explosion. From this plot, Tony Stark is able to identify the explosions that stood out, and then goes after the earliest explosion to investigate.

Although the example is fiction, reality is not far removed. You can build a big data platform that collects information from various Internet sources and downloads all the structured and unstructured data to a Hadoop cluster, where processing algorithms will filter out the relevant information. Then a visualization tool can plot the output.

The two most important characteristics of discovery analytics, as evident from this example, are:
In this example, insight was discovered by plotting the intensity of the explosions. If instead the number of deaths, time of day, or weather were used, there is no guarantee anything would have stood out. This is a significant challenge: a lot of trial and error may be needed before the desired insight is evident. It can also be tricky to identify when you have discovered a useful insight because there isn’t always a precise definition of “useful.”

The second characteristic of discovery analytics is causation, or the underlying causes of the insight discovered. The insight is the “effect” and causation is what leads to a business action.

Decision Analytics

Let’s look at an example to understand decision analytics. Large and complex machines require routine maintenance. Through years of experience and analysis, engineers and their managers have developed a maintenance and part-replacement schedule. The decision to send a field engineer with the requisite parts and service equipment to the machine’s location is part of a fully integrated ecosystem across manufacturers, customers, resellers, and suppliers. Predictive analytics simply complements this existing process using a model that calculates the probability of a machine breaking down and provides proactive input to a manager to schedule the field staff.

Decision analytics relies heavily on predictive modeling to calculate the probability of an event occurring. It uses the data and information already available within the business process and data warehouse or data mart. The value proposition for such an analytics project will be immediately evident to the business. The only challenge will be when the business asks whether the model is accurate or how many false positives it will generate. You don’t want your field engineers running around fixing things that were not likely to break.
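To make the accuracy and false-positive question concrete, here is a minimal sketch of how predicted breakdown probabilities could be checked against historical outcomes. Everything in it is an illustrative assumption rather than part of any actual maintenance system: the `MachineRecord` fields, the `backtest` helper, the 0.5 dispatch threshold, and the five sample records are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MachineRecord:
    """One historical observation (hypothetical fields for illustration)."""
    machine_id: str
    predicted_failure_prob: float  # model output between 0 and 1
    actually_failed: bool          # what happened in the historical data


def backtest(records, threshold=0.5):
    """Tally outcomes if an engineer is dispatched whenever the
    predicted probability is at or above the threshold."""
    tp = fp = fn = tn = 0
    for r in records:
        dispatched = r.predicted_failure_prob >= threshold
        if dispatched and r.actually_failed:
            tp += 1  # visit was needed
        elif dispatched:
            fp += 1  # false positive: engineer sent, machine was fine
        elif r.actually_failed:
            fn += 1  # missed breakdown
        else:
            tn += 1  # correctly left alone
    total = tp + fp + fn + tn
    return {
        "true_positives": tp,
        "false_positives": fp,
        "false_negatives": fn,
        "true_negatives": tn,
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
    }


# Made-up sample history, purely for demonstration.
history = [
    MachineRecord("M1", 0.91, True),
    MachineRecord("M2", 0.72, False),
    MachineRecord("M3", 0.15, False),
    MachineRecord("M4", 0.40, True),
    MachineRecord("M5", 0.08, False),
]

result = backtest(history, threshold=0.5)
print(result)
```

Raising the threshold generally reduces wasted visits at the cost of more missed breakdowns, so the business, not the model builder, usually has to decide where that trade-off should sit.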
Predictive analytics, whether done using traditional statistical methods such as regression or using data mining methods such as neural networks, has strong model validation and testing methodologies widely available. Models can be run on historic data and compared with decisions already taken to demonstrate how the predictive model complements the existing decision-making process.

The benefit of decision analytics is the simplicity of its problem statement and solution explanation, but the challenges are tied to the art of predictive model building, which requires years of experience in the same problem domain. There is no guarantee that the model will perform at an acceptable level. For example, if a model makes the correct decision less than 50 percent of the time during validation, it implies a coin toss would be better suited to decision making.

Decision analytics is easier to adopt but will eventually require discovery analytics for model improvement. Predictive models need to be fine-tuned and tested with newer variables over time as underlying business dynamics change.

Nauman Sheikh is a specialist in data and business analytics with a core focus on intelligent applications for risk management, big data, consumer analytics, and innovative use of predictive modeling. He is the author of Implementing Analytics: A Blueprint for Design, Development and Adoption (Morgan Kaufmann, 2013).

Big Data Strategy Approaches:

Since McKinsey Global Institute popularized the term big data in 2011, an explosion of marketing and sales activity has blurred the meanings of, and distinctions between, business intelligence, analytics, data warehousing, and big data. All manner of product companies, cloud providers, systems developers, and consulting firms have aligned their offerings to the new terminology. For example, older, packaged reporting applications are now being marketed as big data analytics.
New open source technical products designed to solve very specific data management challenges (such as management of unstructured data) of Internet-based companies are often promoted as potential replacements for traditional RDBMS products. The result is a combination of excitement at the potential for leveraging unstructured data, uncertainty as to big data’s business value, and architectural confusion. This article sorts through the terminology and provides a straightforward way to formulate a company-specific big data strategy that optimizes business value and avoids undue risk.

Learn more: Read this article by downloading the Business Intelligence Journal, Vol. 19, No. 4
Why Is Real Time Important?

An appreciable 24% of respondents feel that real-time operation is not a pressing issue at this time. To the contrary, however, the majority of respondents consider real-time data, BI, and analytics to be extremely important for success (33%) or moderately important (43%).

Read the full report: Download Real-Time Data, BI, and Analytics (Q4 2014)
Mistake: Failing to Expand User Access to New Big Data Sources

Organizations today are under pressure to enable access to multiple structured data sources as well as big data: semi-structured and unstructured sources such as text, social media data, and streaming machine data. This is particularly important for personnel in marketing, e-commerce, and customer sales and service functions who are interested in analyzing customer behavior across channels. If your organization is seeking to democratize BI and analytics to these users, you need to provide access to a wider range of data. Business leadership should identify relevant sources and work with IT to determine which users and functions need to view which data sources.

The tried-and-true strategy is to consolidate data from multiple sources into an enterprise data warehouse (EDW) that serves various users and business functions. Tasks for data mapping, profiling, discovery, ETL, and quality improvement then focus on what will go into the EDW. However, this approach can be slow and impractical.

Organizations should complement their consolidation strategy with data federation or virtualization. These modes use global metadata or master data to access “data in place” without having to move it to a central store. This is useful if data cannot be moved for regulatory reasons. The tasks to prepare the data and ensure its quality can be performed either through middleware or at the sources themselves.

Hadoop technologies offer another alternative. Organizations can implement Hadoop files to store “data lakes,” where the data is not restricted (as it is with an EDW) to just the acceptable types or structures. Working with raw data, users can be freeform in applying analytics. If users identify data that is worth moving into the EDW, IT can create ETL routines to make that happen.

Read the full issue: Download Ten Mistakes to Avoid When Democratizing BI and Analytics (Q4 2014)
Copyright 2015. TDWI. All rights reserved.