Multimodal AI becomes accessible: new model runs on your laptop

A new open-source artificial intelligence model named Obsidian, announced in an Oct. 30 Reddit post, represents a breakthrough in multimodal AI accessibility. Obsidian is the first 3B-parameter multimodal AI model, which makes it compact enough to run efficiently on a regular laptop.

Multimodal AI refers to AI systems that can process and connect data from different modes, such as text, images, audio, and video. In this case, the model accepts text and pictures as input, much like OpenAI’s GPT-4V. While multimodal AI models like DALL-E 3 and GPT-4 have shown impressive capabilities, their enormous size makes them resource-intensive to run, requiring expensive high-end hardware. Their weights are also a closely guarded secret, so you could not run them yourself even if you had the necessary specialized hardware.

Obsidian packs multimodal intelligence into a standard laptop’s memory

Obsidian changes this by packing multimodal intelligence into a model small enough to fit into a standard laptop’s memory and run at practical speeds. At 3 billion parameters, Obsidian builds upon the Capybara-3B model architecture, which achieves state-of-the-art performance compared to similarly sized models. The developer also announced on Reddit that a multimodal model based on the highly praised open-source Mistral 7B model will soon follow.
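To make the "fits in a laptop's memory" claim concrete, here is a rough back-of-the-envelope sketch. It only estimates the memory needed to hold the weights, and it rests on assumptions not stated in the announcement: the function name `weight_memory_gib` is made up for illustration, the precisions listed are common choices rather than Obsidian's confirmed format, and the estimate ignores activations, caches, and runtime overhead.

```python
def weight_memory_gib(num_params: int, bits_per_param: float) -> float:
    """Approximate memory needed to hold the model weights alone, in GiB.

    Assumption: every parameter is stored at the same precision. Real
    deployments also need memory for activations and runtime overhead.
    """
    bytes_total = num_params * bits_per_param / 8
    return bytes_total / 2**30


# Parameter counts taken from the article: 3B for Obsidian, 7B for the
# planned Mistral-based follow-up.
MODELS = {"3B": 3_000_000_000, "7B": 7_000_000_000}

# Common storage precisions (illustrative, not confirmed for Obsidian).
PRECISIONS = {"fp32": 32, "fp16": 16, "int8": 8, "4-bit": 4}

for name, params in MODELS.items():
    for label, bits in PRECISIONS.items():
        print(f"{name} @ {label:>5}: {weight_memory_gib(params, bits):6.2f} GiB")
```

Under these assumptions, the weights of a 3-billion-parameter model take about 5.6 GiB at 16-bit precision and about 1.4 GiB at 4 bits, which is why a model of this size can plausibly sit in the memory of an ordinary laptop.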

Obsidian’s compact size is thanks to techniques adapted from the LLaMA model architecture. According to the Reddit post announcing Obsidian, it was pre-trained on a diverse synthesized multimodal dataset, including text paired with corresponding images. This training methodology allowed it to develop strong language and vision capabilities despite its small parameter count.

The result is an AI assistant with conversational skills and visual understanding that can fit in your backpack. Obsidian breaks down barriers to accessing AI, opening up new possibilities for on-device intelligence.

While still an early version, Obsidian’s efficient form factor sets an exciting precedent. It demonstrates that multimodal AI does not have to be locked up in giant data centers but can be made compact enough to be distributed widely.

Featured Image Credit: From Image Creation at Aimesoft; Thank you!

Radek Zielinski
Tech Journalist

Radek Zielinski is an experienced technology and financial journalist with a passion for cybersecurity and futurology.
