Managing Big Data: An Overview in 30 Tweets
By Philip Russom
Research Director for Data Management, TDWI
To help you better understand new practices for managing big data and why you should care, I’d like to share with you the series of 30 tweets I recently issued on the topic. I think you’ll find the tweets interesting, because they provide an overview of big data management and its best practices in a form that’s compact, yet amazingly comprehensive.
Every tweet I wrote was a short sound bite or stat bite drawn from my recent TDWI report “Managing Big Data.” Many of the tweets focus on a statistic cited in the report, while other tweets are definitions stated in the report.
I left in the arcane acronyms, abbreviations, and incomplete sentences typical of tweets, because I think that all of you already know them or can figure them out. Even so, I deleted a few tiny URLs, hashtags, and repetitive phrases. I issued the tweets in groups, on related topics; so I’ve added some headings to this blog to show that organization. Otherwise, these are raw tweets.
Types of Multi-Structured Data Managed as Big Data
1. #TDWI SURVEY SEZ: 26% of users manage #BigData that’s ONLY structured, usually relational.
2. #TDWI SURVEY SEZ: 31% manage #BigData that’s eclectic mix of struc, unstruc, semi, etc.
3. #TDWI SURVEY SEZ: 38% don’t have #BigData by any definition. Hear more in #TDWI Webinar Oct.8 noonET http://bit.ly/BDMweb
4. Structured (relational) data from traditional apps is most common form of #BigData.
5. #BigData can be industry specific, like unstruc’d text in insurance, healthcare & gov.
6. Machine data is special area of #BigData, with as yet untapped biz value & opportunity.
Reasons for Managing Big Data Well
7. Why manage #BigData? Keep pace w/growth, biz ROI, extend ent data arch, new apps.
8. Want to get biz value from #BigData? Manage #BigData for purposes of advanced #analytics.
9. #BigDataMgt yields larger samples for apps that need it: 360° views, risk, fraud, customer seg.
10. #TDWI SURVEY SEZ: 89% feel #BigDataMgt is opportunity. Mere 11% think it’s a problem.
11. Key benefits of #BigDataMgt are better #analytics, datasets, biz value, sales/marketing.
12. Barriers to #BigDataMgt: low maturity, weak biz support, new design paradigms.
13. #BigDataMgt non-issues: bulk load, query speed, scalability, network bandwidth.
Strategies for Users’ Big Data Management Solutions
14. #TDWI SURVEY SEZ: 10% have #BigDataMgt solution in production; 10% in dev; 20% prototype; 60% nada. #TDWI Webinar Oct.8 http://bit.ly/BDMweb
15. #TDWI SURVEY SEZ: Most common strategy for #BigDataMgt: extend existing DataMgt systems.
16. #TDWI SURVEY SEZ: 2nd most common strategy for #BigDataMgt: deploy new DataMgt systems for #BigData.
17. #TDWI SURVEY SEZ: 30% have no strategy for #BigDataMgt though they need one.
18. #TDWI SURVEY SEZ: 15% have no strategy for #BigDataMgt cuz they don’t need one.
Ownership and Use of Big Data Management Solutions
19. Some depts. & groups have own #BigDataMgt platforms, including #Hadoop. Beware teramart silos!
20. Trend: #BigDataMgt platforms supplied by IT as infrastructure. Imagine shared #Hadoop cluster.
21. Who does #BigDataMgt? analysts 22%; architects 21%; mgrs 21%; tech admin 13%; app dev 11%.
Tech Specs for Big Data Management Solutions
22. #TDWI SURVEY SEZ: 97% of orgs manage structured #BigData, followed by legacy, semi-struc, Web data etc.
23. Most #BigData stored on trad drives, but solid state drives & in-memory functions are gaining.
24. #TDWI SURVEY SEZ: 10-to-99 terabytes is the norm for #BigData today.
25. #TDWI SURVEY SEZ: 10% have broken the 1 petabyte #BigData barrier. Another 13% will within 3 years.
A Few Best Practices for Managing Big Data
26. For open-ended discovery-oriented #analytics, manage #BigData in original form wo/transformation.
27. Reporting and #analytics are different practices; managing #BigData for each is, too.
28. #BigData needs data standards, but different ones compared to other enterprise data.
29. Streaming #BigData is easy to capture & manage offline, but tough to process in #RealTime.
30. Non-SQL, non-relational platforms are coming on strong; BI/DW needs them for diverse #BigData.
Want to learn more about managing big data?
For a much more detailed discussion (in a traditional publication!), get the TDWI Best Practices Report, titled Managing Big Data, available in a PDF file via a free download.
You can also register for and replay my TDWI Webinar, where I present the findings of Managing Big Data.
Please consider taking courses at the TDWI World Conference in Boston, October 20–25, 2013. Enroll online.
Philip Russom is the research director for data management at The Data Warehousing Institute (TDWI). You can reach him at [email protected] or follow him as @prussom on Twitter.
Posted by Philip Russom, Ph.D. on October 11, 2013