0.3.0

Introducing Image-to-Text: The Multimodal, Private AI

Buddhi AI 0.3.0 Release

We are excited to announce a monumental upgrade to Buddhi AI: moving beyond text, this release introduces multimodal capabilities with the launch of the Image-to-Text feature. It is powered by a completely new, high-performance, and privacy-focused engine.

🚀 Next-Generation Engine for Vision and Text

This release marks a major technological transition, unlocking true multimodal processing while maintaining our core privacy promise.

We have migrated our core framework from the WebLLM library to Google's MediaPipe library. MediaPipe provides a robust, cross-platform architecture optimized for efficient on-device machine learning, giving the engine a more stable and powerful foundation than the one it replaces.
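
To make the migration concrete, here is a minimal sketch of how an on-device engine can be initialized with the MediaPipe LLM Inference task in the browser. The CDN URL, model path, and sampling parameters are illustrative assumptions, not our exact production configuration.

```typescript
// A minimal sketch of initializing the MediaPipe LLM Inference task in the
// browser. The CDN URL, model path, and sampling settings below are
// illustrative assumptions, not Buddhi AI's exact production configuration.
import { FilesetResolver, LlmInference } from "@mediapipe/tasks-genai";

async function createEngine(): Promise<LlmInference> {
  // Resolve the WebAssembly runtime files that the MediaPipe tasks run on.
  const genaiFileset = await FilesetResolver.forGenAiTasks(
    "https://cdn.jsdelivr.net/npm/@mediapipe/tasks-genai/wasm"
  );

  // Create the on-device inference task; the model file is served locally,
  // so prompts and responses never leave the browser.
  return LlmInference.createFromOptions(genaiFileset, {
    baseOptions: { modelAssetPath: "/models/gemma-3n-e2b-it.task" }, // hypothetical path
    maxTokens: 1024,
    topK: 40,
    temperature: 0.8,
  });
}

// Plain text generation with the new engine.
async function ask(engine: LlmInference, prompt: string): Promise<string> {
  return engine.generateResponse(prompt);
}
```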

The second critical change is the integration of Google's Gemma 3n E2B instruction-tuned model. This state-of-the-art model is inherently multimodal, which is what makes our new vision features possible. Moving from a text-only model to this architecture lets us deliver cutting-edge AI features right in your browser.

✨ New Feature Spotlight: Image-to-Text

With the power of the new Gemma 3n E2B model, the unified Chat interface can now process visual data: include an image in your conversation and receive a text response about it, generated entirely on-device, as sketched below.
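
As a rough illustration of how an image might flow from the chat UI to the on-device model, here is a sketch. The names `MultimodalEngine` and `describeImage`, and the shape of the multimodal prompt, are assumptions made for this example; consult the MediaPipe Tasks GenAI documentation for the exact vision-call signature.

```typescript
// Illustrative sketch of an image-to-text request from the chat UI.
// Decoding the uploaded file with createImageBitmap is a standard browser API;
// the multimodal prompt shape below is an assumption for this sketch and may
// differ from the actual MediaPipe signature.
type MultimodalEngine = {
  generateResponse(prompt: Array<string | ImageBitmap>): Promise<string>;
};

async function describeImage(
  engine: MultimodalEngine,
  file: File,
  question: string
): Promise<string> {
  // Decode the user's uploaded image entirely in the browser; the image bytes
  // never leave the device.
  const image = await createImageBitmap(file);

  // Ask the on-device model to generate text about the image.
  return engine.generateResponse([question, image]);
}

// Example wiring from a file <input> in the chat UI (names are illustrative):
// const file = fileInput.files![0];
// const caption = await describeImage(engine, file, "Describe this image.");
```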

🔒 Core Commitment: Privacy Reaffirmed

Despite this massive increase in capability, our commitment to privacy remains absolute: all multimodal and text processing runs entirely in your browser, on your local hardware. Your data, prompts, and chat history stay private and secure on your machine.