CEO Q&A: Modern Platforms for Data Science
Too many enterprises try to implement data science applications without first deploying the right platform. Iguazio's Asaf Somekh explains why platforms are so important.
- By James E. Powell
- May 24, 2019
In our continuing series of CEO interviews, we recently spoke with Asaf Somekh of Iguazio about the current state of data management and up-and-coming tactics for deploying data science.
What technology or methodology must be part of an enterprise's data strategy if it wants to be competitive today? Why?
Asaf Somekh: A business must be working efficiently with data before it can create applications based on data science. An efficient data pipeline analyzes data from many sources and types, correlates fresh data with large sets of historical data in real time, and provides easy enrichment -- all in a single unified platform.
For a real competitive edge, a platform must provide freedom to work securely with multimodel data at any volume while eliminating silos and complexities. A powerful and fast data engine is the number one enabler of intelligent applications running on top of it.
What one emerging technology are you most excited about and think has the greatest potential? What's so special about this technology?
That would be serverless. Serverless frameworks allow developers to focus on building and running auto-scaling applications without worrying about managing servers because server provisioning and maintenance are taken care of behind the scenes. With serverless computing, users minimize development and maintenance overhead and automate deployment.
What is the single biggest challenge enterprises face today? How do most enterprises respond (and is it working)?
We see that enterprises looking to implement data science and AI in business applications struggle with complex, siloed data pipelines that require very long deployment periods at extremely high costs. Many companies have tried to harness Hadoop to build AI applications, but according to Gartner, the great majority of Hadoop deployments fail to meet cost savings and revenue generation objectives due to skill and integration challenges. Most businesses compromise time to market and try to deal with the problem by hiring more data scientists and data engineers instead of fixing the architecture from the ground up.
Is there a new technology in data and analytics that is creating more challenges than most people realize? How should enterprises adjust their approach to it?
It's not for nothing that Kubernetes is the leading orchestration tool today, and Iguazio's managed services run over it. However, the problem begins when data engineers think that they're all set just by working with Kubernetes. We believe that Kubernetes is a great base for a modern functioning platform, but ideally data scientists shouldn't deal with any installation and management at all. In order to focus solely on data science, data scientists need managed platforms with preinstalled services over Kubernetes. These services include AI tools such as Jupyter Notebook, TensorFlow, and PyTorch; data services; serverless computing; and a friendly UI that manages it all.
Where do you see analytics and data management headed in 2019 and beyond? What's just over the horizon that we haven't heard much about yet?
Data science is still a nascent technology. We believe that data science will be implemented in more and more data-driven applications as its promise becomes easier to realize. Once the development and deployment process is more seamless, businesses will be launching new intelligent applications on a monthly basis.
What is your company's product/solution and what problem does it solve for enterprises?
Most data science projects today can't make an impact due to development and deployment complexities. By the time they reach production, too many compromises have been made and projects lose their magic. Iguazio's Data Science Platform was built from the ground up for production. Its production-native architecture enables fast development and deployment of data science applications, while retaining their full capabilities. Users correlate many different data types to enrich data, work with leading preinstalled AI tools, and deploy easily and automatically in any environment using a serverless framework. With Iguazio, data scientists finally stop focusing on infrastructure and start making an impact.
James E. Powell is the editorial director of TDWI, including research reports, the Business Intelligence Journal, and Upside newsletter. You can contact him via email here.