AI Ecosystem – Opening Salvo

OpenAI jumps in just before everyone else.

  • OpenAI’s new model has very little to do with improving raw performance and everything to do with making it easier and more fun to use, in a move designed to drive engagement and establish OpenAI as the premier AI ecosystem.
  • OpenAI fired the first shots in the developer conference season that will see Google, Microsoft and Apple all lay out their AI prowess in an attempt to keep users in their ecosystems as well as encourage developers to create generative AI services using their platforms.
  • This is the beginning of the battle for the AI ecosystem, which is currently dominated by Nvidia and where control of the platform is likely to migrate away from silicon to the foundation model.
  • OpenAI launched its new model, GPT-4o, in which the real improvements are in how users interact with the model rather than in its raw performance.
  • I see this as a deliberate move to capitalise on the market awareness that it has and to drive engagement with the model, clearing the way for it to become the development platform of choice for generative AI.
    • First, voice: GPT-4o has had a substantial upgrade to its ability to understand and produce voice communication.
    • The voice system can understand and reproduce emotion and dramatic effect in voice communication, which, if the demos prove accurate, represents a big step up in the quality of voice as a man-machine interface.
    • Users can now interrupt the model which makes the interaction appear much more realistic.
    • However, latency is clearly still a problem, as the device used for the demo had to be hardwired to the Internet to ensure a realistic response time.
    • Furthermore, I suspect that the communication remained within OpenAI’s infrastructure, further highlighting that latency is going to be a problem when people start using the service in the real world.
    • Second, local apps: these will be available on desktops and smartphones within a relatively short time.
    • These apps will not contain the model but will be conduits to the cloud where the requests will be processed.
    • However, they will add functionality, such as the ability to capture data from the device and paste it into the app, once again making the user experience easier and more fun.
    • Third, vision: GPT-4o can recognise images, writing and facial emotions.
    • This makes it much easier to share information with the model, which in turn improves the model’s utility and makes it stickier for those who choose to use it.
    • Fourth, reasoning: where OpenAI once again attempted to convince the audience that its model can reason.
    • This is crucial because reasoning requires an understanding of causality which is the single greatest weakness of all of the large language models (LLMs).
    • OpenAI managed to improve the quality of the illusion of reasoning with its linear-equation demonstration, but it presented no evidence at all that the model was in fact reasoning, as the answer to that equation could easily have been present in the training data.
  • GPT-4o is an iteration of GPT-4 and, as such, its performance is only marginally better than that of GPT-4 Turbo, Claude 3, Gemini Ultra 1.0 and so on (see here).
  • This is being called out as a failure to improve the performance of the model or produce GPT-5, but I don’t think that this is the point.
  • The point here is to get out ahead of everyone else, leverage the fact that everyone knows what ChatGPT is, and offer users a user experience good enough that they don’t bother to check out what everyone else is doing.
  • This is one side of the equation, with the other being appealing to developers such that they choose to develop services on GPT rather than on Gemini, Claude or anyone else’s model.
  • If successful, this will be a potent combination as more apps mean more users which means more apps and so on which is exactly how the big smartphone digital ecosystems were established.
  • Here, OpenAI has an advantage: it has 100m users as well as global name recognition, giving it a great head start in the AI ecosystem race.
  • The net result is that OpenAI has set the standard for the coming developer conference season, which kicks off tomorrow with Google I/O, continues with Microsoft Build on May 21 and concludes with Apple WWDC on June 10.
  • I suspect that the reality here is that when these services are in the wild, the conversation with the bot will be far more stilted and slow due to the time it takes to relay the request over the network, process it and return the response to the user.
  • This is just one reason why I think that, at the end of the day, these sorts of models will be deployed on the device rather than in the cloud, as this will be the only way to deliver the kind of user experience that OpenAI has just promised.
  • Furthermore, it will be far more economical for OpenAI to figure out how to run these models on devices, as it would then not have to pay the inference compute cost.
  • GPT-4o also raises the bar yet again for what is available free of charge, which I think will heap further pressure on the $20-per-user-per-month business model that many companies have raised money on.
  • The net result is that this is a consumerisation of generative AI designed to drive engagement and cement OpenAI’s position in the emerging AI ecosystem rather than a performance upgrade.
  • We can expect that everyone else will do their best to dislodge and discredit OpenAI’s claims over the next few weeks as they set out their stalls in the market.
  • An interesting few weeks are coming up.
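The latency argument above comes down to simple arithmetic: a voice conversation only feels natural if the network round trip plus inference time stays within a tight budget, and moving the model on-device removes the network term entirely. A minimal sketch of that trade-off, using purely hypothetical figures (none of these numbers are measurements of GPT-4o or of any real network):

```python
# Back-of-envelope latency budget for conversational voice AI.
# All figures below are illustrative assumptions, not measured data.

def response_latency_ms(network_rtt_ms: float, inference_ms: float) -> float:
    """Time from the end of the user's speech to the start of the reply."""
    return network_rtt_ms + inference_ms

# Hypothetical cloud path: mobile-network round trip plus server-side inference.
cloud = response_latency_ms(network_rtt_ms=150.0, inference_ms=300.0)

# Hypothetical on-device path: no network hop, but slower local inference.
on_device = response_latency_ms(network_rtt_ms=0.0, inference_ms=400.0)

print(f"cloud: {cloud:.0f} ms, on-device: {on_device:.0f} ms")
```

Under these assumed figures the network round trip consumes a third of the cloud path's total, which is why a hardwired connection in the demo, and on-device deployment in the long run, make such a difference to how natural the conversation feels.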

RICHARD WINDSOR

Richard is the founder and owner of the research company Radio Free Mobile. He has 16 years of experience in sell-side equity research. During his 11-year tenure at Nomura Securities, he focused on equity coverage of the global technology sector.