0.3.0

Introducing Image-to-Text: The Multimodal, Private AI

Buddhi AI 0.3.0 Release

We are excited to announce a monumental upgrade to Buddhi AI: moving beyond text, this release introduces multimodal capabilities with the launch of the Image-to-Text feature. It is powered by a completely new, high-performance, and privacy-focused engine.

🚀 Next-Generation Engine for Vision and Text

This release marks a major technological transition, unlocking true multimodal processing while maintaining our core privacy promise.

We have migrated our core framework from the WebLLM library to Google's MediaPipe library. MediaPipe provides a robust, cross-platform architecture optimized for efficient on-device machine learning, giving the engine a more stable and powerful foundation than the one it replaces.
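
To make the migration concrete, here is a minimal sketch of how an on-device engine can be initialized with the MediaPipe LLM Inference task in the browser. The CDN URL, model path, and sampling parameters are illustrative assumptions, not our exact production configuration.

```typescript
// A minimal sketch of initializing the MediaPipe LLM Inference task in the
// browser. The CDN URL, model path, and sampling settings below are
// illustrative assumptions, not Buddhi AI's exact production configuration.
import { FilesetResolver, LlmInference } from "@mediapipe/tasks-genai";

async function createEngine(): Promise<LlmInference> {
  // Resolve the WebAssembly runtime files that the MediaPipe tasks run on.
  const genaiFileset = await FilesetResolver.forGenAiTasks(
    "https://cdn.jsdelivr.net/npm/@mediapipe/tasks-genai/wasm"
  );

  // Create the on-device inference task; the model file is served locally,
  // so prompts and responses never leave the browser.
  return LlmInference.createFromOptions(genaiFileset, {
    baseOptions: { modelAssetPath: "/models/gemma-3n-e2b-it.task" }, // hypothetical path
    maxTokens: 1024,
    topK: 40,
    temperature: 0.8,
  });
}

// Plain text generation with the new engine.
async function ask(engine: LlmInference, prompt: string): Promise<string> {
  return engine.generateResponse(prompt);
}
```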

The second critical change is the integration of Google's Gemma 3n E2B instruction-tuned model. This state-of-the-art model is inherently multimodal, which is what makes our new vision features possible. Moving from a text-only model to this architecture lets us deliver cutting-edge AI features right in your browser.

✨ New Feature Spotlight: Image-to-Text

With the power of the new Gemma 3n E2B model, the unified Chat interface can now process visual data: include an image in your conversation and receive a text response about it, generated entirely on-device, as sketched below.
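
As a rough illustration of how an image might flow from the chat UI to the on-device model, here is a sketch. The names `MultimodalEngine` and `describeImage`, and the shape of the multimodal prompt, are assumptions made for this example; consult the MediaPipe Tasks GenAI documentation for the exact vision-call signature.

```typescript
// Illustrative sketch of an image-to-text request from the chat UI.
// Decoding the uploaded file with createImageBitmap is a standard browser API;
// the multimodal prompt shape below is an assumption for this sketch and may
// differ from the actual MediaPipe signature.
type MultimodalEngine = {
  generateResponse(prompt: Array<string | ImageBitmap>): Promise<string>;
};

async function describeImage(
  engine: MultimodalEngine,
  file: File,
  question: string
): Promise<string> {
  // Decode the user's uploaded image entirely in the browser; the image bytes
  // never leave the device.
  const image = await createImageBitmap(file);

  // Ask the on-device model to generate text about the image.
  return engine.generateResponse([question, image]);
}

// Example wiring from a file <input> in the chat UI (names are illustrative):
// const file = fileInput.files![0];
// const caption = await describeImage(engine, file, "Describe this image.");
```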

🔒 Core Commitment: Privacy Reaffirmed

Despite this massive increase in capability, our commitment to privacy remains absolute: all multimodal and text processing runs entirely in your browser, on your local hardware. Your data, prompts, and chat history stay private and secure on your machine.