3 Steps to Achieving Usable Data with AI Software (Part 2 of 2)
AI software can help businesses obtain usable data. This article explains how to get started
- By Koert Kuipers
- November 10, 2022
In Part 1, I discussed the need for AI to help businesses obtain usable data and achieve their digital ambitions. In this article, I will discuss how to use AI with your data successfully. Before jumping into the good stuff, though, take a step back and consider four key tasks to complete before you start your AI software implementation.
Define your “why.” Have a clear goal for what you want to gain from your data as well as the value you ascribe to it. If your goal is vague or unspecified, what success looks like will be unclear as well. Get specific, and work backward from the outcome you’d like to achieve with your data.
Be incremental. As lovely as it would be to solve every problem and answer every business question, you can’t boil the ocean. Start with something difficult to solve by manual means alone and leverage a simple yet sophisticated system to achieve your goal with accuracy and efficiency. Remember, AI isn’t a silver bullet. Although it can be an incredibly valuable tool, it has to be used intentionally in order to get the most out of it.
Recognize that your environment is key to implementing software. Where is your data? For instance, there are pros and cons to having your data in one cloud versus several. A multicloud approach lets users leverage the unique innovation that each platform offers. Platforms move at different paces and each has its own benefits, which should be evaluated in relation to how you intend to use them. For example, Azure may have the fastest network but Amazon may provide the most flexibility. On the other hand, that uniqueness can also create difficulties. Each cloud provider has its own version of a database, automation tools, and other technology. If deep diving into one takes a great deal of time, imagine the time you’ll need to evaluate or use several.
Beware of “tech debt.” Proprietary lock-in is a form of debt that comes due when you need to move. Working with a software provider that doesn’t rely on proprietary lock-in technology, formats, or structures keeps you agile and flexible, so you can transition workloads between cloud providers and operate the same way in each one without issue.
AI Implementation Steps
The core steps for successful AI software implementation fall into three main categories: people, process, and product.
The state of business has been rocky, to say the least. With a recession looming, layoffs occurring more often than anyone would like, and corporate spending under scrutiny, the need to do more with less has never been greater. During times like these, people need to adopt a more data-centric mindset (and use automation wherever possible).
Change can be hard, especially when it involves challenging a legacy way of thinking. There will always be a strong bias for the familiar way of doing things, even if it is dated and doesn't keep pace with the growth of data. However, failing to harness the full transformational power of data will inevitably inhibit real innovation. Tackling this issue head-on requires people from the top down to shift to a data-first mentality, putting data at the center of every business decision rather than reserving it for individual, one-off use cases.
You also need to take a look at how you think about data as a whole. There is admittedly a dearth of data engineering talent that can embrace and understand the next-generation technology needed to make data usable at scale. Moreover, resources that can enable this innovation are few and far between and can be prohibitively expensive.
If your organization has limited access to talent and resources, invest in training to improve the skills of your current staff. This can seem like a daunting endeavor because of the sheer amount there is to learn about data engineering, so it’s important to take things incrementally. Prioritize the core skill sets your team needs to be successful. This can start with topics as broad as how the cloud works, what DevOps is as it relates to building and operating in a cloud environment, or how to leverage the open source community.
Just as you wouldn’t skip regularly brushing your teeth, you can’t neglect your data hygiene. It is important to have the right processes in place to govern and manage data appropriately, and this starts at the individual bit level. Poor governance and management of different feeds and sources of data can lead to substantial challenges downstream. This process aligns with three core steps: ingest, enrich, and distribute.
Ingest. Before you can use your data, you need to know what you have and where it is located. This involves the “ingest” piece -- corralling all of your data, no matter where it’s stored. This step is made easier with automation, which gives you back the time otherwise spent on standardizing formats, cleansing data, and preparing it for use. A data-centric mindset means you go all-in with all your data and put every last bit of it to work, which, given the volume of data today, is only feasible with automation.
Enrich. When aggregated, raw bits of data are just that -- a collection of meaningless bits. A raw bit by itself has no value but has endless potential in its assembly, combinations, and permutations. To realize that potential, you must employ a flexible, schema-less model whereby value and opportunity can be recognized across the many-to-many relationships among your bits. This way, the same bits can be assembled and assigned values automatically across multiple value stores and use cases without the need for redesign efforts.
Distribute. To be useful, intelligence needs to be shared. When your data is easily distributed, you can foster true ideation and innovation that can give you valuable business intelligence, a real competitive advantage, and the ability to provide products and services your customers actually want. By automating these parts of the process, data scientists and analysts can stop wasting their valuable time on the mundane task of data preparation and instead focus on high-value activities that provide a greater competitive edge for their organizations.
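To make the three steps concrete, here is a minimal Python sketch of an ingest, enrich, and distribute flow. It is an illustration under stated assumptions, not a description of any particular product: the sample feeds, field names (customer_id, sku, amount), and function names are all hypothetical, and a real pipeline would read from files, databases, or APIs rather than inline strings.

```python
import csv
import io
import json
from collections import defaultdict

# Hypothetical sample feeds standing in for two differently formatted sources.
CRM_CSV = "customer_id,name\n1,Ada\n2,Grace\n"
ORDERS_JSON = (
    '[{"customer_id": "1", "sku": "A-100", "amount": 25.0},'
    ' {"customer_id": "1", "sku": "B-200", "amount": 10.0},'
    ' {"customer_id": "2", "sku": "A-100", "amount": 25.0}]'
)


def ingest(crm_csv, orders_json):
    """Corral records from every source into plain dicts tagged by origin."""
    records = []
    for row in csv.DictReader(io.StringIO(crm_csv)):
        records.append({"source": "crm", **row})
    for row in json.loads(orders_json):
        records.append({"source": "orders", **row})
    return records


def enrich(records):
    """Assemble the same raw records into several views, with no fixed schema."""
    by_customer = defaultdict(
        lambda: {"name": None, "skus": set(), "total_spend": 0.0}
    )
    by_sku = defaultdict(set)  # many-to-many: each SKU maps to many customers
    for rec in records:
        cid = str(rec["customer_id"])
        if rec["source"] == "crm":
            by_customer[cid]["name"] = rec["name"]
        elif rec["source"] == "orders":
            by_customer[cid]["skus"].add(rec["sku"])
            by_customer[cid]["total_spend"] += float(rec["amount"])
            by_sku[rec["sku"]].add(cid)
    return {"by_customer": by_customer, "by_sku": by_sku}


def distribute(views):
    """Serialize each view as JSON so other teams and tools can consume it."""
    shareable = {
        "by_customer": {
            cid: {**view, "skus": sorted(view["skus"])}
            for cid, view in views["by_customer"].items()
        },
        "by_sku": {sku: sorted(cids) for sku, cids in views["by_sku"].items()},
    }
    return json.dumps(shareable, indent=2, sort_keys=True)


if __name__ == "__main__":
    print(distribute(enrich(ingest(CRM_CSV, ORDERS_JSON))))
```

The point of the sketch is that the enrich step builds two views (per customer and per product) from the same ingested records, so a new use case means adding a view rather than redesigning the input format.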
To be successful, this process has to be flexible and scalable so you can iterate quickly. Flexibility allows you to add data (internal, external, third-party) and use cases as needed without a huge investment upfront. Scalability allows you to refresh your outputs whenever you make incremental improvements.
The products you use should be simple, flexible, scalable, and fast. The simpler and fewer the steps, the lower the cost to implement, operate, and maintain. Flexibility allows systems to be built to accommodate change as data is always evolving. Your products also shouldn’t hold you back -- your business doesn’t run in batch mode, so why should your data assets?
Your software needs to scale to handle your growing data volumes. Many products provide only a whittled-down view of your data, but limiting scale also limits your understanding and opportunity. For example, if your company has hundreds of thousands (if not millions) of customers, your tools must support the volume of information generated with every transaction and interaction.
A digital company is one that can analyze all the data, all the time, and embed it into every product and service offered. Solving for scale is what really enables a company to become digital. The tech giants of the world can tailor services and offerings down to individual users (as opposed to groups or subsets) for hyper-personalization, allowing them to recommend tailored services or products that make sense for one individual, not a cohort, millions and billions of times over.
As sexy as being a digital business may appear, it first requires knowing how to apply the “sexy” stuff (such as sophisticated ML and AI) to the slightly less glamorous yet necessary steps for creating usable data at scale. Put differently, to be digital, you have to put everything you have -- all of your data -- to work. This is why the digital behemoths of our generation are all built around automated data platforms: Google with the world’s online data, Meta with the world’s social data, Amazon with the world’s purchasing data, and so on. Becoming just as digital and data-first is only achievable through automated data management that allows for the successful storage, processing, and monetization of all data at absolute scale.