By Philip Russom, TDWI Research Director
[NOTE: The following article was published in the TDWI Trip Report of May 2012.]
The Technology Survey that TDWI circulated at the recent World Conference in Chicago asked attendees to answer a few questions about analytic database management systems and how these fit into their overall data warehouse architecture. Here’s some background information about analytic databases, plus a sampling of attendees’ responses to the survey:
Posted by Philip Russom, Ph.D. on June 8, 2012
By Philip Russom, TDWI Research Director
High performance continues to intensify as a critical success factor for user implementations in data warehousing (DW), business intelligence (BI), data integration (DI), and analytics. Users are challenged by big data volumes, new and demanding analytic workloads, growing user communities, and business requirements for real-time operation. Vendor companies have responded with many new and improved products and functions for high performance—so many that it’s hard for users to grasp them all.
Posted by Philip Russom, Ph.D. on May 18, 2012
Blog by Philip Russom
Research Director for Data Management, TDWI
All kinds of people have recently weighed in with their definitions and descriptions of so-called “big data,” including journalists, industry analysts, consultants, users, and vendor representatives. Frankly, I’m concerned about the direction that most of the definitions are taking, and I’d like to propose a correction here.
Especially when you read the IT press, definitions stress data from Web, sensor, and social media sources, with the insinuation that all of it is collected and processed via streams in real time. Is anyone actually doing this? Yes, they are, but the types of companies out there on the leading edge of big data (and the advanced analytics that often go with it) are what we usually call “Internet companies.” Representatives from older Internet companies (Google, eBay, Amazon) and newer ones (Comshare, LinkedIn, LinkShare) have stood up at recent TDWI conferences and described their experiences with big data analytics; therefore I know it’s real and firmly established.
So, if Internet companies are successfully applying analytics to big data, what’s my beef? It is exactly this: a definition of big data biased toward best practices in Internet companies ignores big data best practices in more mainstream companies.
For example, I recently spoke with people at three different telcos – you know, telephone companies. For decades, they’ve been collecting big data about call detail records (CDRs), at the rate of millions (sometimes billions) of records a day. In some regions, national laws require them to collect this information and keep it in a condition that is easily shared with law enforcement agencies. But CDRs are not just for regulatory compliance. Telcos have a long history of success analyzing these vast datasets to achieve greater performance and reliability from their utility infrastructure, as well as for capacity planning and understanding their customers’ experiences.
Federal government agencies also have a long history of success with big data. For example, representatives from IRS Research recently spoke at a TDWI event, explaining how they were managing billions of records back in the 1990s, and have recently moved up to multiple trillions of records. (Did you catch that? I said trillions, not billions. And that’s just their analytic datasets!) More to the point, IRS data is almost exclusively structured and relational.
I could hold forth about this interminably. Instead, I’ve summarized my points in a table that contrasts a mainstream company’s big-data environment with that of an Internet-based one. My point is that there’s ample room for both traditional big data and for the new generation of big data that’s getting a lot of press at the moment. Eventually, many businesses (whether mainstream, Internet, or whatnot) will be an eclectic mix of the two.
| Traditional Big Data | New Generation Big Data |
| Tens of Terabytes, sometimes more | Hundreds of Terabytes, soon to be measured in Petabytes |
| Mostly structured and relational data | Mixture of structured, semi-structured, and unstructured data |
| Data mostly from traditional enterprise applications: ERP, CRM, etc. | Also from Web logs, clickstreams, sensors, e-commerce, mobile devices, social media |
| Common in mid-to-large companies: Mainstream today | Common in Internet-based companies: Will eventually go mainstream |
| Real-time as in Operational BI | Real-time as in Streaming Data |
I’m sorry that I’m foisting yet another definition of big data on you. Heaven knows, we have enough of them. But I feel we need to replace the Internet-biased definition with one that’s broad enough to encompass big-data best practices in mainstream companies, as well. For one thing, let’s give credit where credit is due: a lot of mainstream companies are successful with a more traditional definition of big data. For another, we run the risk of alienating people in mainstream companies, which could impair the mainstream adoption of big-data best practices. That, in turn, would stymie the cause of leveraging big data (no matter how you define it) for greater business value. And that would be a pity.
So, what do you think? Let me know!
===============================
Some of the material of this blog came from my recent Webinar: “Big Data and Your Data Warehouse.” You can replay it from TDWI’s Webinar Archive.
Want to learn more about Big Data Analytics? Attend the TDWI Forum on Big Data Analytics, coming to Orlando November 12-13, 2012.
Posted by Philip Russom, Ph.D. on May 1, 2012
Blog by Philip Russom
Research Director for Data Management, TDWI
To raise awareness of what the Next Generation of Master Data Management (MDM) is all about, I recently issued a series of 35 tweets via Twitter, over a two-week period. The tweets also helped promote a TDWI Webinar on Next Generation MDM. Most of these tweets triggered responses to me or retweets. So I seem to have reached the business intelligence (BI), data warehouse (DW), and data management (DM) audience I was looking for – or at least touched a nerve!
To help you better understand Next Generation MDM and why you should care about it, I’d like to share these tweets with you. I think you’ll find them interesting because they provide an overview of Next Generation MDM in a form that’s compact, yet amazingly comprehensive.
Every tweet I wrote was a short sound bite or stat bite drawn from TDWI’s recent report on Next Generation MDM, which I researched and wrote. Many of the tweets focus on a statistic cited in the report, while other tweets are definitions stated in the report.
I left in the arcane acronyms, abbreviations, and incomplete sentences typical of tweets, because I think that all of you already know them or can figure them out. Even so, I deleted a few tiny URLs, hashtags, and repetitive phrases. I issued the tweets in groups, on related topics; so I’ve added some headings to this blog to show that organization. Otherwise, these are raw tweets.
Defining the Generations of MDM
1. #MDM is inherently a multigenerational discipline w/many life cycle stages. Learn its generations in #TDWI Webinar
2. User maturation, new biz reqs, & vendor advances drive #MDM programs into next generation. Learn more in #TDWI Webinar
3. Most #MDM generations incrementally add more data domains, dep’ts, data mgt tools, operational apps.
4. More dramatic #MDM generations consolidate redundant solutions, redesign architecture, replace platform.
Posted by Philip Russom, Ph.D. on April 13, 2012
These days have been a whirlwind of projects. One of the biggest for me is the TDWI Best Practices Report I am working on, entitled “Customer Analytics in the Age of Social Media.” This report looks at what organizations are doing and could be doing to analyze information sources to improve their knowledge of and engagement with customers. Social media data is the revolutionary force in this realm; marketing functions are highly focused on how to take advantage of social media both as a new channel and as a critical source of information about customer and market behavior. The heart of this report will be about how customer intelligence and analytics efforts are being reshaped by the influence of social media. This is exciting stuff.
Posted by David Stodder on April 12, 2012
Blog by Philip Russom
Research Director for Data Management, TDWI
[NOTE -- I recently completed a TDWI Best Practices Report titled Next Generation Master Data Management. The goal is to help user organizations understand MDM lifecycle stages so they can better plan and manage them. TDWI will publish the 36-page report in a PDF file in early April 2012, and anyone will be able to download it from www.tdwi.org. In the meantime, I’ll provide some “sneak peeks” by blogging excerpts from the report. Here’s the fifth excerpt, which is the Executive Summary at the beginning of the report.]
Posted by Philip Russom, Ph.D. on March 30, 2012