Google has reclaimed the top spot in a valued AI benchmark table, knocking OpenAI into second place.
On the respected Chatbot Arena leaderboard, the Alphabet-owned company has taken the lead with the introduction of its experimental Gemini-Exp-1206 model. Previously, Sam Altman’s company was in pole position with its ChatGPT-4o offering, just shading Gemini-Exp-1114, which was released on November 15.
Those competing LLMs were effectively matched, with Google appearing to close the gap on its nearest competitor.
Chatbot Arena reported that the new Google Gemini version showed significant improvement across important categories, including mathematics, creative writing, and visuals, with a 40-point improvement on previous offerings. Despite this, TechCrunch has outlined how the current AI benchmarking approach could vastly oversimplify model evaluation.
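For a rough sense of what a 40-point gap means in head-to-head terms, the sketch below applies the standard Elo logistic formula. It assumes the leaderboard score behaves like a conventional Elo rating on a 400-point scale. The function name and the sample gaps are illustrative, not taken from Chatbot Arena's own tooling.

```python
def expected_win_rate(rating_gap: float, scale: float = 400.0) -> float:
    """Expected win probability for the higher-rated model.

    Uses the standard Elo logistic curve; the 400-point scale is an
    assumption about how the leaderboard score is calibrated.
    """
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / scale))


if __name__ == "__main__":
    # Illustrative gaps only; 40 matches the improvement quoted in the article.
    for gap in (0, 40, 100):
        print(f"gap={gap:>3} -> expected win rate {expected_win_rate(gap):.1%}")
```

Under that assumption, a 40-point lead corresponds to the higher-rated model being preferred in roughly 56% of pairwise comparisons, a noticeable edge but far from a sweep.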
What a way to celebrate one year of incredible Gemini progress — #1🥇across the board on overall ranking, as well as on hard prompts, coding, math, instruction following, and more, including with style control on.
Thanks to the hard work of everyone in the Gemini team and… https://t.co/BnjfUIexpU pic.twitter.com/i4Fof5KoWc
— Jeff Dean (@🏡) (@JeffDean) December 6, 2024
Google trumps OpenAI with free-to-use model
That is a separate issue to contend with, and Google will not be worrying too much at present, now that Gemini-Exp-1206 is available with such impressive credentials. OpenAI has been a market leader in advanced AI models for some time, but Google has its rival firmly in its sights.
The free-to-use Exp-1206 can process and make sense of video content, unlike key competitors ChatGPT and Claude, which are limited to images. Google’s model has a 2M-token context window, meaning it can work through more than one hour of video content.
Google has undercut its main opponent by offering Gemini-Exp-1206 for free via Google AI Studio and the Gemini API, while OpenAI moved to increase the price of its top-tier service.
This matters because users could save $200 a month on a product that is essentially on the same level. This performance at no extra cost will make the market sit up and take notice, as well as open the doors to wider AI accessibility.