TDWI Upside - Where Data Means Business

The Next Thing to Look For in AI Vendors: Interoperation

AI vendors that interoperate and partner well should significantly outperform their competitors.

When products first come to market, it isn’t unusual for them to play badly with others. Development is proprietary, and a new product is kept secret until launch to avoid premature competition. Once launched, however, it becomes critical that the product pivot to being “open”: disclosing how it works, earning trust, and helping complementary applications interoperate with it.

This opportunity was highlighted by the recent announcement of ElevenLabs’ AI-powered text-to-speech offering, which gave ChatGPT videos sound. The resulting demo video is impressive, but not as impressive as it might have been had the video and sound been created simultaneously rather than sequentially.

Let me explain.

Why Integration During Creation Is Important

If you look at this ChatGPT/ElevenLabs result, the issue is that ElevenLabs’ application doesn’t appear to have access to the code behind the ChatGPT video, so it is reacting only to the result, not to the underlying code that defines that result. Rather than knowing things such as the speed of the vehicles, the internal workings that produce the realistic physics, and any annotations in the code that might better inform how the sound should be created, it can only work from the resulting image.

This reduces the potential accuracy of the result and needlessly increases the time needed to create it. If the two programs were integrated, the user could define both the visuals and the sound in the prompt and then let the various AIs craft the complete result, which should be both more believable and faster to produce.

For now, most of what we are seeing comes from simple prompts, so the lack of information passthrough isn’t significant. That’s why we aren’t seeing the problem today. However, as we advance the use of AI, we will begin feeding it complete scripts with direction to create better commercial-grade offerings. Those scripts include vocal direction. If you were to start with book content, for example, the emotional context in the text isn’t conveyed in the video, so a sound tool working only from the video will need far more iterations and time than otherwise would be required.

In this example, both ChatGPT and ElevenLabs are parts of a better solution, but only if they function as partners during the creation of the result will that result be optimized. Otherwise, whoever is going second is getting a degraded set of instructions that will result in a degraded outcome.

Let me give you an example. Let’s say you direct the AI to create a walking scene between two people with dialog and action in the directions -- for instance, by providing a paragraph from the book you are turning into a video. ChatGPT will see the entire paragraph, but ElevenLabs only sees the video, not the dialog, which then will need to be added in post production. If both applications get access to the source, that saves a step and any additional context ChatGPT has left out is provided to ElevenLabs, which could use it for vocal inference.

Finally, facial expressions don’t always convey the energy in the words being said -- for instance if the speaker is walking away or not always looking at the camera. Without seeing the original direction, which might have been in a book’s prior paragraph, ElevenLabs may not have the information it needs to initially get the tone right, which also would have to be addressed in post, adding time and effort and potentially introducing avoidable errors.
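To make the difference concrete, here is a minimal sketch of the two workflows. It is only an illustration: the `SceneSource` structure and the three functions are hypothetical stand-ins invented for this example, not ChatGPT or ElevenLabs APIs, and the "generators" simply pass text around rather than create media.

```python
from dataclasses import dataclass


@dataclass
class SceneSource:
    """The user's original direction (hypothetical structure for illustration)."""
    action: str
    dialog: list[str]
    tone: str


def render_video(source: SceneSource) -> dict:
    # Stand-in for a video generator: only the visuals survive in its output.
    return {"frames": f"video of: {source.action}"}


def sound_sequential(video: dict) -> dict:
    # The second tool sees only the rendered video, so dialog and tone are unknown
    # and would have to be supplied later in post.
    return {"based_on": video["frames"], "dialog": None, "tone": None}


def sound_integrated(video: dict, source: SceneSource) -> dict:
    # The second tool also receives the original direction, so nothing is lost.
    return {"based_on": video["frames"], "dialog": source.dialog, "tone": source.tone}


source = SceneSource(
    action="two people walking",
    dialog=["We should go back."],
    tone="tense",
)
video = render_video(source)
print(sound_sequential(video))
print(sound_integrated(video, source))
```

In the sequential path the dialog and tone fields come back empty because that information never reached the second tool; in the integrated path both tools work from the same source.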

Isolated AI Vendors Will Likely Fall Behind

As we move toward generative AI maturity, it’s likely two types of vendors will survive and prosper: vendors such as NVIDIA and IBM, which both create complete ecosystems and partner well, and smaller vendors that partner effectively to create complete ecosystems. Vendors that have partial solutions or don’t partner well are likely to fail in the market because solutions will be increasingly complex and customers will require a complete solution regardless of where that solution comes from.

I’m seeing far too many vendors, big and small, that are ignoring parts of solutions that should be on a must-do list and are instead treating them as discretionary. This is not unusual in an early market like this one. However, with the massive number of companies rushing to market and the realization that, as in earlier technology waves, most will fail, picking those with the greatest probability of success becomes a critical part of any vendor selection process.

We will undoubtedly see vendor after vendor claiming they are faster, better, and cheaper, but if they aren’t open source or don’t interoperate or partner well in a market that requires both, their other stats won’t matter. They’ll be destined to get bought out or go under due to an inability to complete the solutions that customers want.

Wrapping Up

We have a massive wave of AI vendors rushing to market; most will be proprietary and will not know how to partner or interoperate well. Those that come to market embracing open source concepts and that interoperate and partner effectively should significantly outperform vendors of any size that don’t, given how critical interoperation and partnering (particularly with customers) are to creating successful AI deployments.

In the end, how well an AI vendor works with others to complete solutions matters more than how great its own part is, because parts aren’t solutions, and most solutions will continue to require the support of several interoperating vendors.

About the Author

Rob Enderle is the president and principal analyst at the Enderle Group, where he provides regional and global companies with guidance on how to create a credible dialogue with the market, target customer needs, create new business opportunities, anticipate technology changes, select vendors and products, and practice zero-dollar marketing. You can reach the author via email.
