OpenAI could debut a multimodal AI digital assistant soon

OpenAI has been showing some of its customers a new multimodal AI model that can both talk to you and recognize objects, according to a new report from The Information. Citing unnamed sources who’ve seen it, the outlet says this could be part of what the company plans to show on Monday.

The new model reportedly interprets images and audio faster and more accurately than OpenAI’s existing separate transcription and text-to-speech models can. It would apparently be able to help customer service agents “better understand the intonation of callers’ voices or whether they’re being sarcastic,” and “theoretically,” the model can help students with math or translate real-world signs, writes The Information.

The outlet’s sources say the model can outdo GPT-4 Turbo at “answering some types of questions,” but is still susceptible to confidently getting things wrong.

It’s possible OpenAI is also readying a new built-in ChatGPT ability to make phone calls, according to developer Ananay Arora, who posted a screenshot of call-related code. Arora also spotted evidence that OpenAI had provisioned servers intended for real-time audio and video communication.

If any of this is unveiled next week, it won’t be GPT-5. CEO Sam Altman has explicitly denied that the company’s upcoming announcement has anything to do with the model that’s supposed to be “materially better” than GPT-4. The Information writes that GPT-5 may be publicly released by the end of the year.

Altman also said the company isn’t announcing a new AI-powered search engine. But if what The Information reports is what’s revealed, it could still take some wind out of Google’s I/O developer conference sails. Google has been testing using AI to make phone calls. And one of its rumored projects is a multimodal Google Assistant replacement called “Pixie” that can look at objects through a device’s camera and do things like give directions to places to buy them or offer instructions on how to use them.

Whatever OpenAI plans to unveil, it plans to do so via livestream on its site on Monday at 10AM PT / 1PM ET.
