Microsoft Copilot – Risk averse

A very cautious update.

  • Microsoft has launched an updated version of Copilot, and while the company remains extremely risk-averse, I think it has produced a better product than OpenAI even though both are using the same underlying technology.
  • Copilot exists in two forms (cloud and device) and this update makes it clear that the two iterations will increasingly be indistinguishable from one another and will often work together.
  • The updates include:
    • First, Copilot Voice: which adds voice to Copilot so that the user can talk to it rather than having to type.
    • The problem is that a decent user experience must closely simulate a conversation with another human, which means a virtually instantaneous response.
    • Both Meta and OpenAI have demonstrated this on a mobile phone but because their models are in the cloud, they had to hardwire the smartphone to the server to get a fast enough response.
    • That is clearly not a workable solution, meaning that the only answer for implementing a generative AI voice agent is to run it on the device where the request is being made (a rough latency budget illustrating why is sketched after this list).
    • This is where Copilot+ comes in: a runtime that sits on the device (Copilot+ PCs) and contains several smaller LLMs that are capable of different tasks.
    • Microsoft has not clarified where Copilot Voice is running in this version (which leads me to suspect that it is in the cloud) but the obvious endgame is that it will reside on the device.
    • This will give a substantial uplift to the user experience, meaning that Microsoft needs to do this sooner rather than later.
    • Second, Copilot Vision: which can see what is on the screen or look through the camera (eventually), analyse it and then discuss it with the user.
    • This is very similar to what Google is offering with Gemini and so a lot will depend on how well this works.
    • In its quest for safety and security, Microsoft has hamstrung Copilot Vision’s capabilities: it will only work on certain websites and with certain content, meaning that there will be lots of genuine use cases where Copilot frustratingly declines to do what is asked of it.
    • This is symptomatic of what RFM refers to as The Black Box Problem (see here) where the difficulty in controlling LLM behaviour means that overall functionality is reduced when controls are implemented.
    • Third, Think Deeper: which looks like an implementation of the OpenAI technology launched with o1 (see here) that OpenAI claims delivers reasoning to LLMs.
    • This involves breaking down a problem into a series of steps, and OpenAI claims that this gives it the ability to correct its own mistakes as it goes along and to take a different approach when one is not working (a toy sketch of this pattern appears after this list).
    • Hence at a high level, o1 appears to offer a substantial improvement in the ability to reason.
    • However, the reality is that there is no empirical evidence of this whatsoever; in many areas, o1 remains as stupid as its predecessors and, in some cases, stupider.
    • This will be great for students of maths and those learning how to write computer code, but it is not a step forward towards super-intelligent machines.
    • Fourth, Recall: which is finally making an appearance, but with much stricter controls, and Microsoft is rolling it out very timidly.
    • Recall is a feature that tracks and categorizes everything that the user does on a PC and makes it easy to find later when one has forgotten when or where an activity took place or what the details were (a minimal sketch of the idea appears after this list).
    • Windows and Office search functions are pretty awful, which makes this a very relevant feature to add.
    • For me, this is one of the most interesting features of Copilot+, given how much time I spend digging around for data and then forgetting where I found something, but I can see how it makes some users nervous.
    • The controls have been greatly strengthened to give the user complete control over his or her data but whether this hampers the performance of the product remains to be seen.
  • The net result is that Microsoft is in a much better position to distribute OpenAI technology than OpenAI itself, which has no on-device product that can run inference locally.
  • If voice is the next big thing in generative AI, and RFM has long thought that it is, then inference at the edge becomes an even stronger theme.
  • This will add appeal to Copilot and serve as the first real competitive challenge to Google’s AI dominance in many years.
  • Who wins and who loses will be determined not only by the quality of the AI offering but also by the strength of the existing digital ecosystem into which ecosystem owners can deliver their AI and ensure that it is set as the default.
  • Microsoft remains very strong in PCs, but in smartphones Apple, Meta, Google and Tencent have a big advantage given their dominance of those ecosystems.
  • Hence, I think that the hybrid approach of inference on the device, combined with the greater power of the cloud when needed, is the right one to take, and so far Microsoft and Google look to be the most advanced in this race.
  • Apple is behind, but this is not yet a problem as no one is going to surrender their iPhone just because the AI is rubbish or non-existent.
  • The AI companies remain very expensive, and the digital ecosystems are fairly valued and fairly uninteresting, leaving me with my now well-known adjacencies of inference at the edge (Qualcomm and soon MediaTek), which voice further strengthens, and nuclear power.
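
To make the latency argument under Copilot Voice concrete, here is a rough sketch of the time budget a conversational voice agent has to work with. The stages and millisecond figures are illustrative assumptions, not measurements of Copilot, and the roughly half-second target is a common rule of thumb rather than anything Microsoft has specified.

```python
# Illustrative latency budget for a voice reply that still feels conversational
# (often taken to be roughly 500 ms to the first audible word). All figures are
# assumptions for the sake of the comparison, not measurements of Copilot.

def time_to_first_audio(asr_ms, network_rtt_ms, queue_ms, first_token_ms, tts_ms):
    """Sum the stages between the user finishing speaking and hearing a reply."""
    return asr_ms + network_rtt_ms + queue_ms + first_token_ms + tts_ms

# Cloud-hosted model: speech recognition, a round trip to the data centre,
# time spent in a shared serving queue, first-token generation, then synthesis.
cloud = time_to_first_audio(asr_ms=120, network_rtt_ms=150, queue_ms=200,
                            first_token_ms=300, tts_ms=80)

# On-device small model: no network hop or shared queue, but a slightly slower
# first token because the NPU is running a smaller, quantised model.
local = time_to_first_audio(asr_ms=120, network_rtt_ms=0, queue_ms=0,
                            first_token_ms=250, tts_ms=80)

print(f"cloud: {cloud} ms, on-device: {local} ms")  # cloud: 850 ms, on-device: 450 ms
```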
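
The step-by-step, check-and-backtrack pattern described under Think Deeper can be sketched as follows. This is not OpenAI's actual method, which has not been published; the propose_steps and verify_step callables are hypothetical stand-ins for model calls.

```python
# Toy sketch of the "reason in steps, check, and backtrack" pattern. This is not
# OpenAI's o1 algorithm (which is not public); propose_steps and verify_step are
# placeholders that would be model calls in a real system.

from typing import Callable, List, Optional


def solve_with_backtracking(problem: str,
                            propose_steps: Callable[[str, int], List[str]],
                            verify_step: Callable[[str], bool],
                            max_attempts: int = 3) -> Optional[List[str]]:
    """Try successive decompositions of a problem and restart with a different
    approach whenever an intermediate step fails its own check."""
    for attempt in range(max_attempts):
        steps = propose_steps(problem, attempt)      # a new decomposition each attempt
        if all(verify_step(step) for step in steps):
            return steps                             # every intermediate step checked out
    return None                                      # no approach survived verification


# Trivial usage: the first proposed answer to 2 + 2 fails its check, the second passes.
result = solve_with_backtracking(
    "2 + 2",
    propose_steps=lambda p, attempt: [f"{p} = {3 + attempt}"],
    verify_step=lambda step: eval(step.replace("=", "==")),
)
print(result)  # ['2 + 2 = 4']
```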
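
Finally, the idea behind Recall, a local and searchable timeline of what was on screen, can be illustrated with a small sketch. This is not Microsoft's implementation; it simply drops hypothetical text snippets into a local SQLite full-text index to show why such an index beats ordinary file search when you half remember seeing something but not where.

```python
# Minimal sketch of the idea behind Recall: a local, time-stamped, full-text index of
# what was on screen. This is not Microsoft's implementation; it assumes text has
# already been extracted from each snapshot and that SQLite was built with FTS5
# (true for most standard Python distributions).

import sqlite3
import time

db = sqlite3.connect("activity.db")
db.execute("CREATE VIRTUAL TABLE IF NOT EXISTS activity USING fts5(ts, app, content)")


def record_snapshot(app: str, content: str) -> None:
    """Store text captured from the current window, stamped with the capture time."""
    db.execute("INSERT INTO activity VALUES (?, ?, ?)",
               (time.strftime("%Y-%m-%d %H:%M"), app, content))
    db.commit()


def find(query: str):
    """Full-text search over everything captured so far, newest first."""
    return db.execute(
        "SELECT ts, app, content FROM activity WHERE activity MATCH ? ORDER BY ts DESC",
        (query,)).fetchall()


# Hypothetical captures and a later "where did I see that?" query.
record_snapshot("Edge", "Q3 smartphone shipment data by vendor and region")
record_snapshot("Excel", "handset ASP model for 2025")
print(find("smartphone shipment"))
```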

RICHARD WINDSOR

Richard is the founder and owner of the research company Radio Free Mobile. He has 16 years of experience in sell-side equity research. During his 11-year tenure at Nomura Securities, he focused on equity coverage of the Global Technology sector.
