The Problem
More than 19,500 languages and dialects are spoken as mother tongues in India, according to the 2011 Census. Tools like Google Translate exist, but they have two critical flaws in rural contexts: they depend largely on an internet connection, and they offer little or no support for low-resource tribal languages. Text-based translation apps are also inaccessible to non-literate populations in remote areas, which creates barriers in healthcare, education, and governance.
The Solution
We developed a standalone, offline speech-to-speech translator that runs entirely on an edge device (a Raspberry Pi 5). It enables real-time bidirectional communication (e.g., Hindi ↔ English) without Wi-Fi or mobile data.
System Architecture
- Speech-to-Text (STT): Vosk, an offline speech recognition toolkit, provides lightweight audio transcription.
- Neural Machine Translation (NMT): IndicTrans2, a transformer-based model trained for Indian languages, translates the transcribed text with high semantic accuracy.
- Text-to-Speech (TTS): IndicTTS synthesizes natural-sounding speech output, making the device usable for non-literate users.
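The three stages above can be chained as a simple pipeline. The following is a minimal sketch, not the project's actual code: the `SpeechTranslator` class, the stage signatures, and the `vosk_transcriber` helper are hypothetical names introduced for illustration, and the model directories passed to Vosk are placeholders. Only the STT stage is shown against a real library (Vosk's `Model` and `KaldiRecognizer`); the NMT and TTS stages are left as pluggable callables because their integration details are not described here.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical stage signatures, chosen for this sketch only.
Transcribe = Callable[[bytes, str], str]      # (pcm_audio, lang) -> text
Translate = Callable[[str, str, str], str]    # (text, src_lang, tgt_lang) -> text
Synthesize = Callable[[str, str], bytes]      # (text, lang) -> pcm_audio


@dataclass
class SpeechTranslator:
    """Chains STT -> NMT -> TTS for one direction of a conversation."""
    transcribe: Transcribe
    translate: Translate
    synthesize: Synthesize

    def run(self, audio: bytes, src: str, tgt: str) -> bytes:
        text = self.transcribe(audio, src)
        if not text.strip():
            return b""  # nothing recognized, so nothing to speak
        translated = self.translate(text, src, tgt)
        return self.synthesize(translated, tgt)


def vosk_transcriber(model_dirs: Dict[str, str], sample_rate: int = 16000) -> Transcribe:
    """Build an STT stage backed by Vosk.

    model_dirs maps a language code to an unpacked Vosk model directory
    (placeholder paths; assumes 16-bit mono PCM input at sample_rate).
    """
    from vosk import Model, KaldiRecognizer  # lazy import; needs `pip install vosk`

    models = {lang: Model(path) for lang, path in model_dirs.items()}

    def transcribe(audio: bytes, lang: str) -> str:
        rec = KaldiRecognizer(models[lang], sample_rate)
        rec.AcceptWaveform(audio)
        return json.loads(rec.FinalResult()).get("text", "")

    return transcribe


# Stub stages stand in for the real models so the wiring can be checked anywhere.
def fake_stt(audio: bytes, lang: str) -> str:
    return audio.decode("utf-8")

def fake_nmt(text: str, src: str, tgt: str) -> str:
    return f"[{src}->{tgt}] {text}"

def fake_tts(text: str, lang: str) -> bytes:
    return text.encode("utf-8")

translator = SpeechTranslator(fake_stt, fake_nmt, fake_tts)
reply = translator.run(b"namaste", "hi", "en")  # Hindi -> English direction
```

Bidirectional use amounts to calling `run` with the language codes swapped for the other speaker. Keeping each stage behind a plain callable also makes it possible to swap a model for a smaller one without touching the rest of the pipeline.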
Impact & Results
The prototype demonstrated that modern speech and translation models can run on affordable edge hardware (total cost of about ₹11,000) in service of social goals.
- Offline Capability: The system runs fully offline, eliminating data costs and the connectivity issues common in rural India.
- Optimization: We tuned the pipeline for the Raspberry Pi's ARM architecture, balancing model size against translation latency.
- Social Impact: By bridging the language gap, this tool supports SDG 10 (Reduced Inequalities) and SDG 16 (Peace, Justice and Strong Institutions), helping marginalized communities access public services in their native language.
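The trade-off between model size and latency noted under Optimization is easier to manage when each stage is timed separately. The sketch below shows one way this could be done, under the assumption that each stage is a plain Python callable; `timed` and `mean_latency` are hypothetical helper names, and the stand-in stage is not a real model.

```python
import time
from typing import Any, Callable, Dict, List


def timed(name: str, fn: Callable[..., Any], timings: Dict[str, List[float]]) -> Callable[..., Any]:
    """Wrap a stage so every call appends its wall-clock duration to timings[name]."""
    def wrapper(*args: Any) -> Any:
        start = time.perf_counter()
        result = fn(*args)
        timings.setdefault(name, []).append(time.perf_counter() - start)
        return result
    return wrapper


def mean_latency(timings: Dict[str, List[float]]) -> Dict[str, float]:
    """Average duration in seconds per stage."""
    return {name: sum(values) / len(values) for name, values in timings.items()}


# Stand-in for a translation stage; a real model call would go here.
def fake_translate(text: str) -> str:
    return text.upper()

timings: Dict[str, List[float]] = {}
translate = timed("nmt", fake_translate, timings)
outputs = [translate(s) for s in ("hello", "thank you")]
report = mean_latency(timings)  # e.g. {"nmt": <seconds per call>}
```

Comparing these per-stage averages across model variants (for example, a small and a large STT model) shows which stage dominates end-to-end delay on the device.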