Mistral launches its first multimodal AI model called Pixtral 12B

French AI startup Mistral has released its first multimodal model, Pixtral 12B, putting it in competition with the likes of OpenAI and Anthropic. The 12-billion-parameter model can process both images and text, and is built on the company’s existing text model, Nemo 12B.

Pixtral 12B is expected to be integrated into the company’s chatbot, Le Chat, and its API platform, La Plateforme, according to the company’s head of developer relations.

The model is said to be 24GB in size and, in theory, should be able to perform tasks such as captioning images and counting the objects in a photo. Mistral released the model through its official account on X, sharing a magnet link in a post.
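The reported 24GB figure is roughly what one would expect for a 12-billion-parameter model if each weight is stored in 16 bits (2 bytes). The storage format is an assumption here, not something stated in the announcement, and the helper name `weights_size_gb` is purely illustrative. A minimal back-of-the-envelope sketch:

```python
def weights_size_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Estimate the size of a model's weights in gigabytes (1 GB = 1e9 bytes).

    bytes_per_param=2 assumes 16-bit weights (an assumption, not a
    confirmed detail of the Pixtral 12B release).
    """
    return num_params * bytes_per_param / 1e9


# 12 billion parameters at 2 bytes each
pixtral_estimate = weights_size_gb(12e9)
print(f"Estimated weights size: {pixtral_estimate:.0f} GB")
```

The same arithmetic suggests the download would be about twice as large at 32-bit precision and about half as large at 8-bit precision.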

Pixtral 12B’s performance and accessibility

Pixtral 12B is available for download, fine-tuning, and use under an Apache 2.0 license without restrictions. It can be obtained through a torrent link on GitHub and on Hugging Face, the AI and machine learning development platform.

A Reddit user shared benchmark scores for Pixtral 12B, which appear to show that the model surpasses both Claude-3 Haiku and Phi-3 Vision in multimodal abilities on the ChartQA benchmark. It also reportedly outperforms competing AI models in multimodal knowledge and reasoning on the Massive Multitask Language Understanding (MMLU) benchmark.

Pixtral benchmarks results, posted by u/kristaller486 in r/LocalLLaMA

The Amazon-backed company is already known for Codestral, a large language model that helps developers write code, as well as Mistral Large. ReadWrite reported on Mistral Large in February, when it was described as a “cutting-edge text generation model” with “top-tier reasoning capabilities.”

Most generative AI models, such as those from Mistral, are trained on extensive amounts of public data from the web, much of which is under copyright. While some providers of these models claim that “fair use” allows them to collect any public data, numerous copyright holders contest this practice. As a result, AI firms like OpenAI and Midjourney have faced lawsuits aimed at stopping it.

In December, the open-source startup received $414 million in funding, closing the investment window with a valuation of $2 billion. By May, the Paris-based company had closed a $645 million funding round led by General Catalyst that valued it at $6 billion.

Featured image: Canva


Suswati Basu
Tech journalist
