Posts

Mother Mary Comes to Me

The unbearable heaviness of being

Posted on May 26, 2026 |

I can’t even remember the last time I read a book from cover to cover. For the same reason, I haven’t dared to buy myself a book in a while. My Kindle died a natural death a year back. I never got around to replacing it because I wasn’t sure if reading would ever come back to me. Recently I’ve had discussions with childhood friends who used to be avid readers, lost the habit for a while, and got back to it with intentional effort, and with others who moved out of the country, found it difficult to read in their mother tongue, but caught up with reading again. [Read More]

Reading Reflection

One Year In

Notes from a first work anniversary

Posted on March 28, 2026 |

March. One year since I joined Adalat AI. I have already written about the work itself: the courtrooms, the languages, the gap between benchmarks and real benches. This post is not about that. This is about what happened to life around the job.

Off the Clock

I came in carrying a particular kind of guilt, the kind that early-career researchers accumulate quietly over years. The feeling of always being one paper short, one grant proposal behind, perpetually under-delivering against some invisible standard. [Read More]

personal Adalat AI

Recap 2025

Posted on January 1, 2026 |

My year-end note for 2024 ended like this: “Looking back, there are a lot of unfinished tasks. I want to have a pace up in 2025, personally and professionally. Will update!!” And yes, life was indeed on a literal level-up. My first industry job found me. I got into a place where everyone was enthusiastic when I spoke about languages and AI. And I started to hear about the intricate space of the judiciary. [Read More]

personal

From Benchmarks to Benches

The Journey of Building AI for Indian Courtrooms

Posted on October 29, 2025 |

I haven’t posted much about my job at Adalat AI. The team I’m part of builds dictation systems for Indian courtrooms, and initially, I thought it would be a straightforward extension of my PhD research. I was wrong. Unlike academic work, where success means beating SOTA on benchmark datasets, the reality at Adalat AI demanded something more: a deep understanding of India’s diverse courtroom dynamics, legal workflows, and the intricate linguistic landscape where regional languages and English constantly intertwine. [Read More]

Adalat AI Legal Dictation Legal AI

The Ghost of ASCII Past

In Indian Language Computing

Posted on May 17, 2025 |

Indian language computing has evolved from ASCII-based font encoding to Unicode standardization. This article explains how text was represented in Indian languages before Unicode, the problems with ASCII-based fonts, and why Unicode became necessary. It covers various input methods developed for typing Indian languages and demonstrates how Unicode solved the compatibility issues between different systems. Table of Contents: What is Unicode?; Some Hindi (Devanagari) Unicode Characters; Unicode is more than just Codepoints! [Read More]
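To make the contrast concrete, here is a minimal Python sketch of the difference between an ASCII-based font encoding and Unicode. The mapping table `LEGACY_FONT_MAP` and the helper names are hypothetical, invented for illustration; they do not correspond to any real legacy font or to code from the article. The idea is that a legacy-font document stores ordinary Latin letters and relies on the font to draw Indic glyphs, while Unicode text stores the script's own code points.

```python
import unicodedata

# Hypothetical legacy-font table (illustrative only, not a real font's layout):
# the document stores these ASCII letters and the font draws Devanagari glyphs.
LEGACY_FONT_MAP = {"d": "क", "e": "म", "y": "ल"}


def legacy_to_unicode(ascii_text):
    """Convert text typed for the hypothetical legacy font into Unicode."""
    return "".join(LEGACY_FONT_MAP.get(ch, ch) for ch in ascii_text)


def describe(text):
    """Return 'U+XXXX NAME' for every code point in the text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


stored = "dey"                         # what the legacy-font file really contains
converted = legacy_to_unicode(stored)  # the same word as Unicode text

print(describe(stored))     # Latin letters: meaningless without the right font
print(describe(converted))  # Devanagari letters: portable and searchable
```

Without the matching font, the stored text is just the Latin string `dey`, which is why such documents broke when moved between systems. The converted text carries its script identity in the code points themselves.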

Unicode Input Methods Fonts ASCII

Malayalam: Life and Praxis - Seminar Series at Tirur

Posted on February 20, 2025 |

A three-day National Seminar, “Malayalam: Life and Praxis”, was organized by the Tirur Regional Centre of Sree Sankaracharya University of Sanskrit during February 18-20, 2025 as a tribute to Dr. Sushama L., Professor of Malayalam Linguistics, who is retiring from her teaching career this academic year. Dr. Sushama currently serves as the Vice Chancellor of Thunchath Ezhuthachan Malayalam University. I was invited to deliver a session on “കമ്പ്യൂട്ടർ മനസ്സിലാക്കുന്ന മലയാളഭാഷ” (“The Malayalam Language That Computers Understand”). [Read More]

seminar malayalam

Recap 2024

Posted on January 1, 2025 |

I write this not for the world, but for me. In the first quarter of 2024, I struggled a bit to decide whether I should switch back to a full-time faculty position or continue in my current semi-academic research position, and finally decided to stick with the latter. As the year ends, I gladly realize that the decision saved me from the mundaneness of many academic/administrative chores. [Read More]

personal

EMNLP 2024

Posted on November 13, 2024 |

Empirical Methods in Natural Language Processing (EMNLP) is one of the world-class conference venues for computational linguistics. Representing the Virtual Resource Centre for Language Computing (VRCLC), the language computing centre at Kerala Digital University, I attended the conference and presented a paper.

Regional Language Research at VRCLC

Many multinational companies are competing to build the best artificial intelligence models suited to the English language. Even when they claim that some of these AI models are multilingual, efforts to ensure accuracy in those languages are often missing. Building language computing and speech AI models for non-English languages is difficult for many reasons. [Read More]

unicode multilingual nlp conferences emnlp2024

Tapasam Seminar 2024

Posted on October 3, 2024 |

The Tapasam Seminar, organized by the Association for Comparative Studies (താരതമ്യപഠനസംഘം) on October 1 and 2, was held at Sree Sankaracharya University of Sanskrit. The talk I presented at this seminar, on the topic “Malayalam Arrives in Unicode: Some Linguistic and Cultural Reflections”, is shared here.

seminar malayalam

Wav2Vec2-BERT+LM: Transcribing Speech and Evaluating Models using Huggingface Transformers

Posted on August 20, 2024 |

What is Wav2Vec2-BERT? Wav2Vec2-BERT is a successor to the popular Wav2Vec2 model, a pre-trained model for Automatic Speech Recognition (ASR). Wav2Vec2-BERT is a 580M-parameter audio model that has been pre-trained on 4.5M hours of unlabeled audio data covering more than 143 languages. Following the basic architecture of Wav2Vec2, with more pretraining data and slightly different training objectives, various models (XLSR, XLS-R and MMS) were released with pretrained checkpoints. The Wav2Vec2-BERT pretrained model was introduced in the SeamlessM4T paper by Meta in August 2023. [Read More]

malayalam speech recognition Transformer