Gnani.ai, a Bengaluru-based enterprise voice AI startup, launched Prisma v2.5, a speech-to-text model supporting 12 Indian languages, on June 17. The model was trained on 14 million hours of proprietary Indic speech data and is available to enterprise customers through APIs, according to medianama.com.
Prisma v2.5 is designed to transcribe Indian-language speech while accounting for dialect variation, background noise, and mid-sentence code-switching. Gnani.ai co-founder and CEO Ganesh Gopalan said that most automatic speech recognition (ASR) models are built for studio-quality audio, whereas Prisma is optimized for Indian phone calls, which often involve compressed network audio and multiple languages in a single sentence. According to the company, the model also improves accuracy on short utterances, numbers, alphanumeric strings, and named entities, which are critical in sectors such as banking, insurance, and healthcare.
Gnani.ai benchmarked Prisma v2.5 on word error rate (WER) and character error rate (CER). The company says internal and third-party evaluations show the model outperforming competitors including ElevenLabs, Sarvam AI, and Microsoft, but it did not disclose specific benchmark scores or methodology. It emphasized that the model's training data incorporates common Indian speech challenges rather than treating them as exceptions, which it says improves transcription accuracy in real-world scenarios.
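For readers unfamiliar with the two metrics, both are conventionally computed as an edit distance (substitutions, insertions, and deletions) between a reference transcript and the model's output, divided by the length of the reference in words or characters. The Python sketch below shows that standard definition only; it is not Gnani.ai's evaluation code, since the company's methodology was not disclosed. The function names and sample sentences are illustrative assumptions, and this version counts spaces as characters for CER, a convention that varies between evaluations.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (lists or strings)."""
    prev = list(range(len(hyp) + 1))  # distance from empty ref prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to empty hyp prefix
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edits divided by reference length.

    Spaces are counted as characters here (an assumption; conventions differ).
    """
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Hypothetical banking-style utterance with one mis-transcribed account ID.
    reference = "transfer 5000 rupees to account AB1234"
    hypothesis = "transfer 5000 rupees to account AB1243"
    print(f"WER: {wer(reference, hypothesis):.3f}")
    print(f"CER: {cer(reference, hypothesis):.3f}")
```

In the sample, two swapped digits in an account identifier make one whole word wrong, so the word-level score is penalized more heavily than the character-level one. That gap is one reason both metrics are typically reported together for numbers and alphanumeric strings.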
With the API release, Gnani.ai is targeting enterprise voice AI applications across multiple sectors. The company is positioning the launch as a step forward for Indic speech recognition, aimed at challenges specific to Indian languages and accents.