What is Google Gemini? How the AI model and chatbot works in 2024

TLDR

  • Google's Gemini AI, launched as Bard's successor, powers multiple Google products, including Android.
  • Gemini’s multimodal model integrates text, images, audio, and video for richer context understanding.
  • Free and premium versions of the Gemini app are available on mobile, enhancing Google Assistant and Messages features.
  • Gemini features a long context window, supporting extensive documents and complex data handling.
  • Future plans may include "Project Jarvis," a tool to automate web tasks, which may be previewed as early as December.

Google has emerged as a leading force in artificial intelligence (AI) and chatbot technology, alongside the makers of Claude and ChatGPT. It’s currently embracing its “Gemini era” after rebranding its chatbot, formerly known as Bard. In typical Google fashion, it has also applied its family of multimodal AI models to many of its other products.

Here’s what we know about Google Gemini.

What is Google Gemini?

Google Gemini came onto the AI scene in February 2024 and quickly made waves. But it was the release of Gemini Live at the “Made by Google” event in August that truly captured attention. ReadWrite reported that Gemini Live brings conversational AI directly to Android phones, letting users talk through complex topics in real time using voice instead of typing, a much more natural and interactive experience.

At its core, Gemini is Google’s large language model (LLM), which powers a range of AI tools similar to those you may have seen, like OpenAI’s ChatGPT. Just as OpenAI’s GPT-4 model powers ChatGPT Plus, Gemini powers Google’s AI chatbot and tools. However, Gemini is more than just an AI model: it’s also the new identity for Google’s chatbot, previously called Bard. This rebranding simplifies things by unifying the model and chatbot under the Gemini name.

So, what can Gemini do? It can answer questions, summarize text, write code, translate, and create images (on mobile, not in the free browser version). Google’s also working on Imagen 3, its response to Midjourney, which will likely be integrated into Gemini soon for even more creative power.

Beyond being a conversational tool, Gemini is also integrated across various Google apps, adding intelligent features to Google Workspace tools like Gmail, Google Docs, and other productivity apps for paying users.

Developers can even incorporate Gemini’s capabilities into their own applications. Gemini could eventually replace Google Assistant, possibly offering an improved, AI-powered assistant that interacts seamlessly with Google’s ecosystem.
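To give a sense of what that integration can look like, here is a minimal sketch that builds a text request for the Gemini REST API using only Python's standard library. The endpoint path, model name, and request shape reflect Google's public API as we understand it and may change, so treat them as assumptions and check the current documentation. The helper names `build_request` and `extract_text` are ours, chosen for illustration.

```python
import json
import urllib.request

# Assumed endpoint and model name; verify against Google's current API docs.
API_ROOT = "https://generativelanguage.googleapis.com/v1beta"
MODEL = "gemini-1.5-flash"


def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a generateContent request for a text prompt."""
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url=f"{API_ROOT}/models/{MODEL}:generateContent",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )


def extract_text(response_json: dict) -> str:
    """Join the text parts of the first candidate in a parsed response."""
    parts = response_json["candidates"][0]["content"]["parts"]
    return "".join(part.get("text", "") for part in parts)
```

Sending the request is then a matter of passing it to `urllib.request.urlopen` and feeding the parsed JSON to `extract_text`. Keeping construction and parsing separate from the network call makes both easy to test without an API key.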

How does it compare to ChatGPT?

Google has shared some interesting insights into how Gemini, its AI model, works. Like many leading AI models, Gemini uses a transformer architecture and applies both pretraining and fine-tuning techniques. However, what makes Gemini unique is that it was trained on multiple types of media (text, images, audio, and video) all at once, rather than focusing on each individually.

This approach aims to give Gemini a more nuanced understanding of language and context. Imagine a phrase like “small talk.” If an AI is simply trained to associate images of “small” and “talk,” it might take it literally, generating an image of short people conversing. But because Gemini’s training integrates language and visuals simultaneously, it should grasp the playful undertones of “small talk.”

This multimodal training helps Gemini “seamlessly understand and reason about all kinds of inputs from the ground up.” It can, for example, read charts alongside captions, interpret signs, and blend information across text, images, and more. While these features were innovative when Gemini launched, other models, like Claude 3.5 and GPT-4o, now have similar multimodal capabilities.

Another major feature of Gemini is its long context window. With Gemini 1.5 Pro, you can include up to two million tokens in a single prompt, accommodating extensive documents, databases, and complex contracts. This is particularly handy if you’re working with large text resources or building a retrieval-augmented generation (RAG) pipeline—though costs could add up if you use the full capacity regularly.
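To make the trade-off concrete, here is a rough sketch of how you might check whether a set of documents fits in that window and what a prompt of that size could cost. The four-characters-per-token ratio is a common rule of thumb for English text, not Gemini's actual tokenizer. The output reserve and the price passed to `estimate_cost` are placeholders rather than Google's real figures.

```python
# Rule-of-thumb ratio for English text (an assumption, not Gemini's tokenizer).
CHARS_PER_TOKEN = 4
# Gemini 1.5 Pro's context window, per the figure quoted in this article.
CONTEXT_WINDOW_TOKENS = 2_000_000


def estimate_tokens(text: str) -> int:
    """Approximate the token count of a string, rounding up."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(documents: list[str], reserved_for_output: int = 8_192) -> bool:
    """Check whether all documents plus an output reserve fit in the window."""
    total = sum(estimate_tokens(doc) for doc in documents)
    return total + reserved_for_output <= CONTEXT_WINDOW_TOKENS


def estimate_cost(tokens: int, price_per_million: float) -> float:
    """Cost of a prompt given a (hypothetical) price per million input tokens."""
    return tokens / 1_000_000 * price_per_million
```

For real budgeting you would swap the character heuristic for the token count reported by the API itself, since actual counts vary with language and content type.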

In terms of performance, benchmarks show that Gemini 1.5 Pro is slightly behind the top models like GPT-4o and Claude 3.5 Sonnet but on par with models like Llama 3 70B. The lighter version, Gemini 1.5 Flash, is comparable to GPT-4o Mini and Claude 3 Haiku, making it a solid option among mid-range models.

Is Google Gemini free?

There’s now a free Gemini app for Android, which can replace Google Assistant on your phone if you choose. iPhone users can find Gemini in the Google app, and it’s accessible to everyone through any web browser.

In addition to the free version, Google offers a premium option called Gemini Advanced. This subscription, part of the Google One AI Premium plan, gives access to a more powerful model, Gemini Ultra. Subscribers get extra perks, like using Gemini Live on mobile—a hands-free, voice-controlled AI experience for Android. So, whether you’re using the free version or the upgraded one, there are plenty of ways to access Gemini across devices.

What is Gemini in Google Messages?

Google’s focus with Gemini has been on integrating it into productivity apps like Docs and Gmail, but now it’s made its way into Google Messages—an app most Android users rely on daily. Originally announced at I/O 2024, Gemini in Messages makes it easy to get AI help with everything from drafting texts to planning your weekend.

Before you can start chatting with Gemini in Messages, you’ll need to meet a few requirements: you should be 18 or older, have RCS chats enabled, use a personal Google Account, have an Android phone with at least 6GB of RAM, and have your language set to either English (in supported countries) or French (Canada).
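The requirements above amount to a simple checklist, which can be sketched as a small function that reports whichever conditions are not met. The class and field names here are hypothetical, and the language check is reduced to a single flag because the list of supported countries isn't given.

```python
from dataclasses import dataclass


@dataclass
class MessagesUser:
    # Hypothetical fields, named for this illustration only.
    age: int
    rcs_enabled: bool
    personal_account: bool
    ram_gb: float
    language_supported: bool  # English (supported countries) or French (Canada)


def unmet_requirements(user: MessagesUser) -> list[str]:
    """Return a message for each requirement the user does not meet."""
    checks = [
        (user.age >= 18, "must be 18 or older"),
        (user.rcs_enabled, "RCS chats must be enabled"),
        (user.personal_account, "needs a personal Google Account"),
        (user.ram_gb >= 6, "needs at least 6GB of RAM"),
        (user.language_supported, "language must be English or French (Canada)"),
    ]
    return [message for ok, message in checks if not ok]
```

An empty list means every condition is satisfied. Otherwise the returned messages say what to fix first.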

Once you’re set, here’s how to chat with Gemini:

  • Open Google Messages
  • Tap “Start chat” in the bottom right corner
  • Select Gemini at the top as the contact
  • Pick a sample prompt or type your request
  • Chat until you get the text or image you need

Gemini is also behind Magic Compose, a feature Google introduced in 2023 to help you rewrite and tweak message styles. While Magic Compose can adjust your messages in a few ways, its flexibility is more limited than a full chat with Gemini.

While Gemini in Messages means you don’t have to switch to the dedicated Gemini app or set it as your default assistant, it’s not quite the full experience. Responses are formatted like texts, which can lead to a few hiccups. For now, it’s a convenient tool for quick ideas and responses, even if it lacks some of the versatility you’d find in other Gemini-powered apps.

Is it any good?

Google Gemini is holding its own in the AI race, especially with its strong multimodal abilities and seamless integration across Google’s apps. Meanwhile, ChatGPT is making strides with its new SearchGPT feature, which provides real-time data access for the first time.

Google, however, has a significant advantage in its extensive search index, covering hundreds of billions of pages—a strong foundation for its reliability. It’s also reportedly working on a new AI tool, codenamed “Project Jarvis,” designed to operate a web browser for managing daily tasks.

The project may be previewed as early as December, along with Google’s next flagship Gemini model, which is expected to power Jarvis. If successful, it could leapfrog the other models in AI capabilities, but we’ll have to wait and see how it performs.

Featured image: Google / Canva

Suswati Basu
Tech journalist

Suswati Basu is a multilingual, award-winning editor and the founder of the intersectional literature channel, How To Be Books. She was shortlisted for the Guardian Mary Stott Prize and longlisted for the Guardian International Development Journalism Award. With 18 years of experience in the media industry, Suswati has held significant roles such as head of audience and deputy editor for NationalWorld news, digital editor for Channel 4 News and ITV News. She has also contributed to the Guardian and received training at the BBC. As an audience, trends, and SEO specialist, she has participated in panel events alongside Google. Her…