Gemini 1.5: Google's new AI model already has a major update

Gemini 1.5: Google’s new AI model already has a major update

Google has released a new version of Gemini, its multimodal large language models, only a week after rolling out its Gemini 1.0 Ultra model. Gemini 1.5 is the tech giant’s next-generation model, which is said to have “dramatically enhanced performance.” According to DeepMind, Alphabet’s AI division, 1.5 Pro can process up to one million tokens of information.

Introducing Gemini 1.5: our next-generation model with dramatically enhanced performance. It also achieves a breakthrough in long-context understanding.

The first release is 1.5 Pro, capable of processing up to 1 million tokens of information. 🧵 https://t.co/qT0aXdFL0n pic.twitter.com/xA0ib11f00

— Google DeepMind (@GoogleDeepMind) February 15, 2024

In a blog post on Google’s website, CEO Sundar Pichai said, “Our teams continue pushing the frontiers of our latest models with safety at the core. They are making rapid progress.

“This new generation also delivers a breakthrough in long-context understanding,” he continued, stating that the model set could now run up to one million tokens consistently, “achieving the longest context window of any large-scale foundation model yet.”

WIRED reported that Demis Hassabis, CEO of Google DeepMind, drew parallels between its immense input capacity and a person’s working memory, reflecting on insights he gained as a neuroscientist years ago.

The AI chatbot and assistant is said to introduce significant performance improvements and incorporates a more efficient training and serving process, which Google calls the Mixture-of-Experts (MoE) architecture.

What can Google’s Gemini 1.5 do?

According to a thread posted on Google’s DeepMind profile on X, data scientists tested Gemini 1.5 using a series of text, code, image, audio, and video evaluations and found that 1.5 Pro outperformed 1.0 Pro on 87% of the benchmarks used for developing their LLMs.

One million tokens are reportedly equivalent to over 700,000 words, more than 30,000 lines of code, 11 hours of audio, or one hour of video. This exceeds the capabilities of other AI models, such as OpenAI’s GPT-4, which runs ChatGPT.

The team claimed that 1.5 Pro was even able to learn a new skill from information given in a prompt. They said, “When provided with a grammar manual for Kalamang, a language with fewer than 200 speakers worldwide, it could translate from English at a similar level to people learning from the same content.”

The report also mentioned that during the analysis of a 402-page PDF containing the Apollo 11 communications transcript, the model was tasked with identifying humorous segments. It highlighted several instances, including a moment when astronauts joked that a delay in communications was because of a sandwich break.

The department’s Research and Deep Learning Lead Oriol Vinyals called it a “drastic model” but cautioned that “like with any machine learning model, it sometimes doesn’t get it right.”

To show what’s possible with the drastically huge context window in Gemini 1.5 Pro, we prompted it with the three.js examples code – over 100,000 lines of code/800k+ tokens!

(That’s not even the max, it can handle millions of tokens 😀)

Gemini was able to process all the code… pic.twitter.com/N2VYgn2JFJ

— Oriol Vinyals (@OriolVinyalsML) February 15, 2024

How do I access Google Gemini 1.5?

A limited number of developers and cloud customers have been granted access to use 1.0 Ultra along with its Gemini API in AI Studio and Vertex AI, in a private preview. There is no date yet for a general release.

Gemini, previously known as Bard, was released this week for Android and iOS users in the U.S. after being rebranded. The rapid advancements in generative AI is seen to be in sharp contrast with concerns about the technology’s potential risks. Google asserts that it has subjected Gemini Pro 1.5 to “extensive evaluations” and believes that offering restricted access serves as a means to collect feedback on possible dangers.

In addition to this, the company said it has granted researchers at the UK’s AI Safety Institute access to its most advanced models for evaluation purposes.

Featured image: Canva / DALL·E