Anticipating Generative AI and Moving to Lower-Cost Hybrid Models
Generative AI will force many changes on both the client and the data center side. It’s not too early to assess the likely changes and be prepared.
- By Rob Enderle
- May 25, 2023
Morgan Stanley’s recent report -- “How Large Are the Incremental AI Costs ... and 4 Factors to Watch Next” -- highlights how generative AI, although beneficial for productivity, comes with a significant cost increase. (Brian Nowak, a managing director at the company, discusses the report in this video clip.)
You’d expect this given the size of generative AI models and the compute they need for training and inference. Although the report focuses primarily on Google and the problem of significant cost increases arriving about a year ahead of the associated revenue growth, it points to a significant problem for companies rolling out this technology internally. Generative AI is resource-intensive and thus will bring some ugly cost surprises for IT shops that haven’t prepared line management for the related CapEx and OpEx increases.
I’m not saying the costs won’t be worth it. The technology is better than most expected it would be at this point, and the problems with quality, staffing, and current vendor competence will be mitigated over time. However, the cost issue will need to be addressed up front, which suggests that companies and providers evaluating this technology should jump ahead and examine how costs are already being reduced. For example, at a recent analyst meeting, IBM said it reduced related IT costs by 90% by aggressively using its own blended, multicloud, and hybrid-cloud solutions.
For generative AI, much of the focus is on end users, which means the hybrid solution is less about servers and more about pushing as much load as possible into the client device, be that a smartphone or a laptop.
Personal Generative AI Hybrid Client
Using the client hybrid approach would seem to have several advantages. One is that the technology on the client side is already baked in. As companies such as Qualcomm, Microsoft, AMD, Intel, and NVIDIA build client hardware focused on generative AI, these client devices will be upgraded on their regular refresh cycles. As during the pandemic, we could see a push to accelerate the refresh cycle to reduce the related load on data centers and mitigate degraded performance.
Granted, this hardware is rare now, but with so many vendors working on solutions, I expect that by next year we’ll have plenty of choices in AI-ready client devices. Given the newness of this technology, most AI deployments are likely one or two years out anyway, suggesting you have time. Because most firms upgraded or replaced their client devices during the pandemic, having to do it again in one to two years will be a bit premature for some.
However, a benefit of doing so is that hybrid generative AI should work with reduced functions even if the device is offline. Because the device is closer to where information is being captured, it can do some of the initial processing before a request is sent to the data center, such as refining the query, compressing any related images, and summarizing the content. When the data center provides a response, that response can be compressed if the device can decompress it, thereby lowering data traffic -- which could work with both text and image content.
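To make the client-side steps above concrete, here is a minimal Python sketch of the idea, not a description of any vendor’s implementation. The function names (`refine_query`, `build_request`, `read_response`) and the JSON payload layout are assumptions made for illustration, and general-purpose `zlib` compression stands in for whatever text or image compression a real product would use.

```python
import json
import zlib


def refine_query(raw: str) -> str:
    """Hypothetical refinement step: collapse redundant whitespace."""
    return " ".join(raw.split())


def build_request(raw_query: str, context: str) -> bytes:
    """Refine the query on the client, then compress the whole payload."""
    payload = {"query": refine_query(raw_query), "context": context}
    return zlib.compress(json.dumps(payload).encode("utf-8"))


def read_response(compressed: bytes) -> dict:
    """Decompress and decode a payload that was sent compressed."""
    return json.loads(zlib.decompress(compressed).decode("utf-8"))


# Repetitive context compresses well, so fewer bytes cross the network.
context = "quarterly sales were flat. " * 100
request = build_request("  summarize   this report ", context)
uncompressed_size = len(json.dumps({"query": "summarize this report",
                                    "context": context}).encode("utf-8"))
```

In this sketch the same `read_response` helper works in either direction, since the device only needs to be able to decompress what the data center compressed. How much traffic is actually saved depends on the content and the compression method.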
Generative AI is going to force many changes on both the client and the data center side. It would be wise to begin assessing the likely changes now so you don’t end up wishing you had waited and deployed a better-optimized solution for the coming loads. In particular, ask about vendor road maps for AI processing capabilities in upcoming client and data center hardware to make sure you won’t have to prematurely replace systems that were just installed.
You should also increase your prioritization of AI skill acquisitions so your staff can make informed decisions about future hardware, services, and vendors.
Given what we’ve seen so far, I expect this generative AI rollout will be faster and far more disruptive than what we saw with the Internet. As one of my professors used to say: the wise person confirms the direction before pressing the gas pedal. I recommend you follow his advice.
Rob Enderle is the president and principal analyst at the Enderle Group, where he provides regional and global companies with guidance on how to create a credible dialogue with the market, target customer needs, create new business opportunities, anticipate technology changes, select vendors and products, and practice zero-dollar marketing. You can reach the author via email.