Call me Dr. Kavya 🤩

I was awarded a doctoral degree by APJ Abdul Kalam Technological University, Kerala, India.

You can read my thesis ‘Linguistic challenges in Malayalam speech recognition: Analysis and solutions’ here.

Pics from the graduation ceremony hosted by College of Engineering Trivandrum and APJ Abdul Kalam Technological University.

/img/phd/cet-convocation.JPG
/img/phd/convocation.png

Live Dictation: Malayalam speech to text using subword tokens

The research carried out as part of my PhD was centred around the linguistic challenges in Malayalam speech recognition. One of the biggest challenges in recognizing speech in morphologically complex languages is deciding how granular the text tokens should be. Classical ASR with word tokens: In the classical architecture of Automatic Speech Recognition (ASR) with word tokens, the acoustic model identifies fundamental sound units, the pronunciation lexicon maps sounds to words, and the language model predicts word sequences to convert speech to text. [Read More]
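To illustrate the token granularity question, here is a minimal sketch of splitting a long inflected word into smaller subword tokens. It uses greedy longest-match segmentation, which is only one simple way to do this and is not necessarily the method used in the thesis. The function `segment`, the `SUBWORDS` vocabulary and the romanized example word are all hypothetical and chosen for illustration.

```python
def segment(word, vocab):
    """Greedy longest-match split of `word` into tokens found in `vocab`.

    Characters that match no vocabulary entry are emitted one by one.
    """
    tokens = []
    i = 0
    while i < len(word):
        # Try the longest remaining substring first, then shorter ones.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                tokens.append(word[i:j])
                i = j
                break
        else:
            # No vocabulary entry starts here: fall back to one character.
            tokens.append(word[i])
            i += 1
    return tokens


# Hypothetical subword vocabulary using romanized toy strings.
SUBWORDS = {"mara", "ngal", "il", "ude"}

print(segment("marangalil", SUBWORDS))  # ['mara', 'ngal', 'il']
```

With word tokens, every inflected form would need its own lexicon entry. With subword tokens, a small vocabulary can cover many unseen forms, at the cost of having to join the pieces back into words after decoding.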

An Open Framework to Build Malayalam Speech to Text System

It was indeed a pleasure to present my paper, An Open Framework to Develop Malayalam Speech to Text Systems, at the 35th Kerala Science Congress, held from 10 to 14 February 2023 at Kuttikkanam, Kerala, India. The work was presented in the category of Scientific Social Responsibility and received the best oral presentation award in that category. The presentation was all about how I ensured openness and transparency in the development process of the speech recognition system for Malayalam, done as part of my PhD work, Linguistic Challenges in Malayalam Speech Recognition: Analysis and Solutions. [Read More]

How to create a Malayalam Pronunciation Dictionary?

What is a phonetic lexicon? A pronunciation dictionary or a phonetic lexicon is a list of words and their pronunciations, each described as a sequence of phonemes. It is an essential component in the training and decoding of speech to text (STT) and text to speech (TTS) systems. A pronunciation dictionary is slightly different from a simple phonetic transcription: it should contain delimiters between phonemes, and a space is usually the default choice. [Read More]
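As a minimal sketch of what the space-delimited format looks like in practice, the snippet below parses lexicon lines of the form "word phoneme phoneme ..." into a Python dictionary. The function `parse_lexicon`, the sample words and the phoneme symbols are made up for illustration and are not taken from an actual Malayalam lexicon.

```python
def parse_lexicon(text):
    """Parse 'word p1 p2 ...' lines into {word: [[p1, p2, ...], ...]}.

    A word may appear on several lines, one per alternative pronunciation.
    """
    lexicon = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank lines and entries with no phonemes
        word, phonemes = parts[0], parts[1:]
        lexicon.setdefault(word, []).append(phonemes)
    return lexicon


# Hypothetical entries with made-up phoneme symbols.
SAMPLE = """\
kala k a l a
mala m a l a
mala m a l a:
"""

lexicon = parse_lexicon(SAMPLE)
print(lexicon["kala"])  # [['k', 'a', 'l', 'a']]
```

Because the phonemes are separated by spaces, a multi-character symbol such as `a:` stays a single unit, which an undelimited phonetic transcription could not guarantee.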

Mlphon: A Multifunctional Grapheme-Phoneme Conversion Tool Using Finite State Transducers

The Mlphon tool, which I had been working on for the past couple of years, was extensively expanded as part of my research work at College of Engineering Trivandrum. A detailed presentation of the phonemic features of Malayalam, their incorporation as a sequential ruleset in the form of finite state transducers, and a quantitative evaluation of the applications of the tool are now available in the article published by the open access journal IEEE Access. [Read More]

เดฎเดจเตเดทเตเดฏเดญเดพเดท, เดฏเดจเตเดคเตเดฐเดฌเตเดฆเตเดงเดฟ

เดถเดพเดธเตเดคเตเดฐเด—เดคเดฟ (เดœเต‚เตบ 2022) เดชเตเดฐเดธเดฟเดฆเตเดงเต€เด•เดฐเดฟเดšเตเดš เดฒเต‡เด–เดจเด‚ เด†เดฎเตเด–เด‚ เดฎเดจเตเดทเตเดฏเตผเด•เตเด•เต เดจเตˆเดธเตผเด—เตเด—เดฟเด•เดฎเดพเดฏเตเดณ เด’เดฐเต เด•เดดเดฟเดตเดพเดฃเต เดญเดพเดท. เด•เตเดžเตเดžเตเด™เตเด™เตพ เดเดคเต เดญเดพเดทเดฏเตเด‚ เด…เดตเดฐเตเดŸเต† เดชเดฐเดฟเดธเดฐเด™เตเด™เดณเดฟเตฝ เดจเดฟเดจเตเดจเตเด‚ เดธเตเดตเดพเดญเดพเดตเดฟเด•เดฎเดพเดฏเดฟ เดจเต‡เดŸเดฟเดฏเต†เดŸเตเด•เตเด•เตเดจเตเดจเต. เดˆ เดถเต‡เดทเดฟ เด’เดฐเต เด•เดฎเตเดชเตเดฏเต‚เดŸเตเดŸเดฑเดฟเดจเต เด•เตˆเดตเดฐเดฟเด•เตเด•เดพเตป เด…เดคเตเดฐ เดŽเดณเตเดชเตเดชเดฎเดฒเตเดฒ. เดธเดฟเดจเดฟเดฎเดพเดŸเดฟเด•เตเด•เดฑเตเดฑเต เดฌเตเด•เตเด•เต เดšเต†เดฏเตเดฏเดพเดจเตเด‚, เดญเด•เตเดทเดฃเด‚ เด“เตผเดกเตผ เดšเต†เดฏเตเดฏเดพเดจเตเด‚, เดฎเต†เดฏเดฟเดฒเดฏเด•เตเด•เดพเดจเตเด‚, เด…เดฒเดพเดฑเด‚ เดตเต†เดฏเตเด•เตเด•เดพเดจเตเดฎเตŠเด•เตเด•เต† เด‡เด‚เด—เตเดฒเต€เดทเต เดญเดพเดทเดฏเดฟเตฝ เดชเดฑเดžเตเดžเดพเตฝ เดšเต†เดฏเตเดฏเดพเตป เด•เดดเดฟเดฏเตเดจเตเดจ เดกเดฟเดœเดฟเดฑเตเดฑเตฝ เด…เดธเดฟเดธเตเดฑเตเดฑเดจเตเดฑเตเด•เดณเตŠเด•เตเด•เต† เด‡เดจเตเดจเตเดฃเตเดŸเต. เด‡เดคเดฟเดจเตผเดคเตเดฅเด‚ เดฏเดจเตเดคเตเดฐเด™เตเด™เตพ เดญเดพเดทเดพเดถเต‡เดทเดฟ เด•เตˆเดตเดฐเดฟเดšเตเดšเตเดตเต†เดจเตเดจเดพเดฃเต‹? เดฎเดฒเดฏเดพเดณเดฎเตเตพเดชเตเดชเต†เดŸเต†เดฏเตเดณเตเดณ เดฎเดฑเตเดฑเต เดญเดพเดทเด•เดณเตเด‚ เด•เดฎเตเดชเตเดฏเต‚เดŸเตเดŸเดฑเตเด•เตพเด•เตเด•เต เดตเดดเด™เตเด™เตเดฎเต‹? เด…เดคเดฟเดจเต เด•เตƒเดคเตเดฐเดฟเดฎเดฌเตเดฆเตเดงเดฟ เด†เดตเดถเตเดฏเดฎเตเดฃเตเดŸเต‹? เดˆ เดตเดฟเดทเดฏเด™เตเด™เดณเตŠเด•เตเด•เต† เดชเดฐเดฟเดถเต‹เดงเดฟเด•เตเด•เตเด•เดฏเดพเดฃเต เดˆ เดฒเต‡เด–เดจเดคเตเดคเดฟเตฝ. เดฏเดจเตเดคเตเดฐเด™เตเด™เตพเด•เตเด•เต เดธเตเดตเดฏเด‚ เดชเด เดฟเด•เตเด•เดพเดจเดพเด•เตเดฎเต‹? เดšเตเดฑเตเดฑเตเดชเดพเดŸเตเดฎเตเดณเตเดณ เดถเดฌเตเดฆเด™เตเด™เตพ เดชเดฟเดŸเดฟเดšเตเดšเต†เดŸเตเด•เตเด•เดพเดจเตเดณเตเดณ เด‰เดชเด•เดฐเดฃเด‚ เดŽเดฒเตเดฒเดพ เดซเต‹เดฃเตเด•เดณเดฟเดฒเตเดฎเตเดฃเตเดŸเต. 
เด† เดถเดฌเตเดฆเดคเตเดคเดฟเตฝ เดจเดฟเดจเตเดจเตเด‚ เดธเด‚เดธเดพเดฐเด‚ เดตเต‡เตผเดคเดฟเดฐเดฟเดšเตเดšเต, เดชเดฑเดžเตเดžเดคเต†เดจเตเดคเต†เดจเตเดจเต เดคเดฟเดฐเดฟเดšเตเดšเดฑเดฟเดฏเดพเดจเตเดณเตเดณ เดธเด‚เดตเดฟเดงเดพเดจเด‚ เดชเดฒ เดญเดพเดทเด•เดณเดฟเดฒเตเด‚ เด‡เดจเตเดจเต เดธเดพเดงเตเดฏเดฎเดพเดฃเต. [Read More]

Mozhi Malayalam TTS powered by Mlphon and Mlmorph

Krishna Sankar recently developed Mozhi, a Malayalam-English bilingual text to speech system. Check the web demo page, play around with your choice of words, and listen to the natural speech it produces. Krishna has set a future goal of understanding the emotional content of the text and reading it out accordingly. This is expected to make the application suitable for audio books. Generating audio for an arbitrary speaker from very few training samples is another area he plans to work on. [Read More]

Publishing Malayalam Speech Recognition Model

A Malayalam speech recognition model, trained on various openly available speech and text corpora using the Kaldi toolkit, is now released here. It is available for testing on the Vosk-Browser Speech Recognition Demo website. This Malayalam model can be used with the Vosk speech recognition toolkit, which has bindings for Java, JavaScript, C# and Python. A speech recognition architecture that works best in scenarios of limited speech data availability is the pipeline model, which is composed of an acoustic model, a language model and a phonetic lexicon. [Read More]

The language of infinite words

I wrote this article for submission to a popular science writing competition by the Department of Science and Technology. Here is my research story, which did not win the award. The story is based on our research on the morphological richness of the Malayalam language. How many words are there in your language? How many of them do you know? Can you find all those words in a dictionary? [Read More]

Text Speech and Dialogue: TSD 2020

I presented a paper on Quantitative Analysis of the Morphological Complexity of Malayalam Language at the 23rd International Conference on Text, Speech and Dialogue: TSD 2020, Brno, Czech Republic, September 8–11, 2020. The year being 2020, the entire conference happened in remote participation mode. Conference proceedings and pre-recorded presentation videos were made available to the participants, and we discussed them over online Zoom sessions. It was a novel experience and I am super excited about how I got feedback and ideas to work on, even after the live sessions. [Read More]