On the 13th, OpenAI, the developer of the conversational AI "ChatGPT," announced the development of a new AI called "GPT-4o."
GPT-4o processes input twice as fast as the previous version at half the operating cost. It is also said to be able to converse at a response speed comparable to a human's. It can process not only voice but also visual and text input in real time, and its performance far exceeds that of previous AI models.
The "o" in GPT-4o stands for "omni," the prefix found in words such as omnibus and omnichannel, and is said to mean "the whole" or "all directions."
OpenAI CEO Sam Altman said, "The new voice (and video) mode is the best computer interface I've ever used. It feels like an AI from a movie. I'm still a little surprised that it's real. Reaching human-level response times and expressiveness is a big change."
CTO Mira Murati also said, "GPT-4o is a step towards changing the future of human-machine interaction. This model makes collaboration much more natural and easier."
GPT-4o outperforms OpenAI's previous model, GPT-4 Turbo, as well as competing large language models such as Anthropic's Claude 3 Opus and Google's Gemini 1.5 Pro. Altman confirmed last month that the chatbot being tested on Chatbot Arena under the codename "gpt2" was GPT-4o.
Features of GPT-4o
What sets GPT-4o apart is that it integrates speech recognition, intelligence, and text-to-speech. This significantly improves its response speed and lets it follow the elements of a complex conversation, such as interruptions, background noise, multiple speakers, and tone of voice. It is a multimodal AI that can generate output in any combination of text, voice, and images.
However, features such as video recognition and voice recognition will be rolled out in stages, and for now the service is limited to text and image recognition.
In the livestreamed demo accompanying the GPT-4o announcement, Murati and others showed off its features by talking to GPT-4o running on an iPhone.
In the demos, GPT-4o translated spoken Italian into English in real time, and it read mathematical formulas handwritten on paper and offered hints on how to solve them in natural dialogue.
Reference: Announcement, Altman's blog