Q&A: Considering Analytics in the Cloud
Teradata's director of cloud strategy and deployment discusses some of the benefits -- and issues to consider -- around analytics in the cloud.
- By Linda L. Briggs
- January 12, 2016
Issues to consider in using the cloud for analytics include price, performance, compliance, and security. In this interview, Teradata's cloud evangelist, Marc Clark, discusses some of those factors. "Cloud for analytics is no different than cloud for anything else," Clark says. "There are still a lot of issues you have to deal with, and I think that's one thing that is sometimes missed when we discuss using the cloud for analytics."
As director of cloud strategy and deployment for Teradata, Clark is responsible for driving Teradata's cloud strategy in collaboration with engineering, sales, marketing, and product management. Clark started the AITP Cloud Conference Committee in San Diego and has served as its chair for the past seven years.
BI This Week: Let's start with an interesting blog item you wrote titled, "The Cloud Isn't a Silver Bullet for Analytics." Are there misunderstandings around using the cloud for analytics?
Marc Clark: Recently, my boss was presenting to some Teradata partners and he remarked, very tongue-in-cheekily, "You know, with analytics in the cloud, you no longer have to worry about data integrity..." He was being sarcastic, of course, but making the point that cloud for analytics is no different than cloud for anything else. There are still a lot of issues you have to deal with. If you're moving to infrastructure-as-a-service or even platform-as-a-service for analytics, there is still plenty that you have to do -- or find a partner to do with you. I think that's one thing that is sometimes missed when we discuss using the cloud for analytics.
What about ownership of data within the cloud? How do things change with the cloud?
I don't actually think it changes that much. You -- meaning the customer -- always own your data. Now, it does depend on what kind of cloud you're using. If you're going to use an industry cloud, where somebody is going to take data and give you access to certain data that is publicly accessible, or that is industry data -- of course, you don't own that data. However, as long as it is customer data -- and by customer, I mean the company -- if a company puts data into the cloud, they always own that. If there's ever a cloud provider who suggests otherwise, then I wouldn't walk, I would run from that provider.
The time when data ownership comes into question is on the consumer side. The question that comes up is: Do I as a consumer own my social data? Do I own my Google searches? I don't deal on the consumer side too much with cloud, but I'd say the consensus seems to be that you don't. When a service is free, such as Google or Facebook, it actually isn't free. Instead of paying for that service, you're trading your information, your habits. That's being used as currency. In general, most people would agree with that -- you use Gmail for free, realizing that there's a reason why it's free. They're selling access to you, to a certain extent.
With Teradata, even if we were to put that data in someone else's cloud, the expectation is 100 percent that you own your own company's proprietary data.
In a recent Webinar with TDWI and Teradata, Fern Halper, TDWI's research director for advanced analytics, cited TDWI statistics that show 23 percent of companies are using analytics in the cloud already, and another 40 percent are thinking about it. Do those numbers surprise you?
Not at all. Of course, a lot of it depends on who you survey. Our very biggest clients are less inclined to be looking at cloud than our smallest clients (or potential clients) who are often in the midmarket, which some define as $1 to $5 billion [in revenue]. These are companies that don't have rows of data scientists or data analysts and experts or even the budget to buy an enterprise data warehouse. They're looking for other ways to analyze data -- data that they haven't had access to until now. There's a much broader market that has opened up now, and it's feeding into the need for cloud analytics. It's getting people who have never thought about being able to afford or use a data warehouse to start thinking about it. We certainly look at [the growth] as a good thing.
Where is adoption of analytics in the cloud on a curve toward maturity?
I think cloud in general is pretty high on the maturity curve, although I don't think we're quite at the top. It depends on the application. It depends on what people are doing and how they are doing it.
With cloud analytics, we're still climbing up the hill. Over the next 24 to 36 months, I think [we'll see lots of growth] there. I don't mean just BI tools -- I think those will actually mature much faster because they're easier to use. However, when you look at SAP in the cloud or Teradata in the cloud or data warehousing in the cloud, they will mature a bit slower.
With Hadoop in the cloud, there are some very bullish predictions about where that's going to go. Even so, I think it will take a couple of years. At Teradata, we truly believe there isn't any one tool that meets everyone's needs. Whether it's Teradata and Hadoop, or Oracle and Hadoop, or Teradata and Spark, in general, I think we'll see many combinations.
The key is the flexibility of the cloud, and the option, especially for midmarket companies, of having cloud vendors able to deploy for you. The big guys can do it themselves on premises, but for the midmarket, the cloud is just so enticing. Companies understand what it offers. I hear potential customers say, "I want that. I have social media data and text data and other kinds of unstructured data, and I have structured data as well, but I don't have the expertise to do it myself. I want to do it in the cloud."
They understand that they're going to have to do data analytics because if they don't, they're not going to be able to compete, but they have a dilemma of not being able to afford to do it on premises. They're caught between a rock and a hard place, so cloud analytics offers a solution.
You mentioned that Hadoop growth is looking bullish right now. What are the advantages of Hadoop in the cloud?
Well, I don't think anyone, even Hadoop's most vocal advocates, will dispute that Hadoop is difficult to use. Hadoop is hard, and not just the data analysis part of it. Putting Hadoop into production can be challenging. That's why vendors are popping up to offer professional services to help companies with Hadoop.
Consequently, you're starting to see more people say, "Hey, I don't want to do this on premises. It's difficult, and I don't want to manage it. Let's put it in the cloud and layer some expertise on top of it." Again, it's the midmarket that will struggle with Hadoop more than the biggest players, so I think you'll see Hadoop in the cloud quite a bit, with services around it to help midmarket users in particular.
Let's talk about security and the cloud. In a recent blog post, you said that concern about cloud security is the No. 1 reason given by more than 60 percent of companies that have not moved to the cloud.
In that blog posting, I go on to say that security concerns about the cloud are completely misguided. That doesn't mean that there shouldn't be any concerns. To me, the concerns are really no different than the kinds of concerns around security when you're deploying on-premises.
When you move to a cloud provider -- let's take Amazon, for example -- they have [security] certifications up to the virtualization layer. They've done everything they can from a data center perspective, from the hardware layer all the way up to the virtual layer. They've really taken care of that part for you as the customer.
However, you still have a lot of responsibility on your part, and those responsibilities are no different from your responsibilities with an on-premises rollout. You can consider that what AWS or Teradata or any good cloud provider has done up to the virtualization layer is as good as -- if not better than -- what most people do on-premises.
You've been working on cloud computing for a while now, and you're Teradata's cloud evangelist. Any parting thoughts on analytics and the cloud?
This is my personal perspective on cloud: people need to be more diligent and in general make sure their eyes are wide open when they're looking at moving to cloud, whether it's for analytics or for anything else. Some of the things we've talked about here echo that point of view.
I'm a cloud evangelist; I get my paycheck because I know cloud and I know it's wonderful. That said, the other thing I know about cloud is that it's not always the right solution. I don't always tell [potential] Teradata customers, "Oh, yeah, let's put that in the cloud" no matter what. That would be disingenuous and it would be wrong. My job includes helping people consider the hidden costs of cloud, potential performance issues, and other ramifications. Too often, people don't go into a cloud situation with their eyes wide open, and when they do that, they set themselves up for problems.