Field Methods for Speech Technologies (Multilingual) | Council for Diversity and Innovatio

W2M-204
Field Methods for Speech Technologies (Multilingual)
An intensive one-week immersion course bridging the gap between linguistic fieldwork and AI development. Learn to create high-quality speech datasets for low-resource Dravidian and Indo-Aryan languages whilst working directly with the speakers.
Knowledge Partner, Community Support and Venue: Prism-NB (Uttaran), West Bengal, India
Sponsor: Unreal Tece LLP
First Batch: 18 - 23 June 2026
Application Deadline (Early): 29 March 2026
Application Deadline (Final): 15 April 2026
Focus Languages: Sadri and Kurux
Venue: Prism-NB (Uttaran), Madarihat, Alipurduar, West Bengal, India 
Location Map: https://maps.app.goo.gl/JHKSPuuFTb69Gq6m9

Apply Now!
Course Outline (Provisional)
Meet the Instructors
World-class researchers and practitioners combining decades of fieldwork experience with cutting-edge AI development. Our team bridges traditional linguistics and modern speech technology.
Bornini
Language and Folklore Documentation, Linguistic Typology, Language Description
Dripta

Karthick
Language Archiving, Language Documentation, Language Endangerment
Meiraba
Amalesh
Benu
Ritesh
Language Technologies, Underrepresented Languages
The Focus Languages: Sadri and Kurux
Sadri

Kurux

Critical Dates and Timeline
(Early) Application Deadline
29 March 2026
(Final) Application Deadline
15 April 2026
Acceptance Notification
Up to 05 April 2026 (for early applicants)
Up to 20 April 2026 (for late applicants)
Course Duration
18 - 23 June 2026
Course Fee Structure and Inclusions
₹25K
Professional Rate
For full-time, regular faculty members (including those on tenure track) and industry professionals (including those enrolled in part-time online programmes)
₹12K
Academic Rate
For students, full-time researchers and non-regular/part-time faculty members (contractual, ad hoc, guest, etc.)
100%
Expenses Covered
Accommodation, meals, materials, community payments, field visits included
W2M Merit-cum-Means Fellowship (W2MCM)
Financial assistance will be given to selected participants to partially or fully offset the cost of attending the course. For the inaugural course, all participants are eligible for the fellowship. Applications will be evaluated on how relevant the course is to the applicant and on the applicant's ability to bear the full expenses. The fellowship amount will be reimbursed after the course is completed, depending on the participant's performance in the course. Please indicate your preference for the fellowship in your application.
Comprehensive Course Inclusions
Accommodation
Modest accommodation at the Prism-NB (Uttaran) Campus will be provided for the complete duration of the course, ensuring convenient access to facilities and fostering a collaborative learning environment among participants.
Meals and Materials
All meals throughout the week plus comprehensive course materials including software licenses, annotation guidelines, and reference texts essential for effective learning.
Community Engagement
Fair compensation for community members participating in the research, field visit expenses, and other incidental costs ensuring ethical and respectful collaboration.
Essential Equipment and Desirable Skills
Required Equipment
Laptop (2019 or newer model)
Computer mouse for precise annotation work
Good quality headset for audio analysis
Smartphone for field recording
Desirable Background
Familiarity with Indo-Aryan morpho-phonology
Experience with phonetic transcription in IPA and interlinear glossing*
Basic understanding of computational linguistics
Prior fieldwork or community engagement experience
Course Capacity
Maximum intake limited to 30 participants to ensure personalised attention and effective hands-on learning during field exercises and technical workshops.
*Pre-course remedial classes on IPA and glossing will be arranged for participants who do not know IPA but otherwise have experience of building speech datasets and transcriptions.
Admission Process: Four-Step Journey
Application Submission
Complete the comprehensive Google Form with academic background, technical experience, and research interests clearly documented.
Assessment Phase
Provisionally shortlisted candidates undergo a written examination and an interview to evaluate technical competency and research motivation.
Fee Payment
Successful candidates must pay the full course fee within 2 days of acceptance to confirm enrolment and secure their position.
Course Commencement
Arrive at the venue one day before the start date. 100% attendance is mandatory for successful course completion and certification.
Transform your passion for language datasets into expertise. Join India's premier field methods programme.
Collaborators and Sponsors
Sponsor and Technology Partner
Unreal Tece LLP provides financial assistance as well as cutting-edge tools and infrastructure for speech technology development.
Addressing the Socio-Digital Divide in Indian Language Technologies
The Challenge
India's extraordinary linguistic diversity remains strikingly under-represented in mainstream speech and AI systems. This technological gap creates a socio-digital divide that excludes millions of speakers from benefiting from modern language technologies.
Data Collection Crisis
Recent national data-collection initiatives report an alarming 40% attrition rate in field-collected audio. Poor elicitation design, inadequate metadata, and non-specialist field teams contribute to massive data wastage.
Skills Gap
High-quality, bias-aware datasets require professionals who understand both linguistic fieldwork methodologies and technology requirements—a capability virtually absent in current training pipelines.
From Words to Worlds to Models: Our Comprehensive Approach
Words (Data)
Collect authentic speech samples through carefully designed elicitation protocols
Worlds (Community)
Engage respectfully with speakers, understanding cultural context and ethical considerations
Models (Technology)
Build deployable ASR systems that bridge linguistic insights with computational applications
Target Participants and Prerequisites
Speech-Tech Professionals
Industry professionals with prior experience in transcription, IPA notation, and morphosyntactic analysis seeking to enhance fieldwork capabilities for dataset creation
Post-Graduate Linguistics Students
Advanced students pursuing research in computational linguistics, language documentation, or speech technology applications
Academic Researchers
Faculty and research scholars working on low-resource language technologies, corpus linguistics, or language revitalisation projects
Learning Objectives: From Theory to Practice
Design Task-Specific Elicitation Instruments
Master the art of creating questionnaires and prompts that balance linguistic depth with practical model requirements, ensuring comprehensive data coverage whilst respecting speaker comfort and cultural sensitivities.
Collect and Validate Quality Speech Data
Gather 1-2 hours of high-quality Sadri speech per team, complete with robust metadata standards that ensure reproducibility and enable effective model training and evaluation.
Build Deployable ASR Pipeline
Construct and evaluate a complete end-to-end Automatic Speech Recognition system, documenting transfer-learning decisions and establishing baseline performance metrics for future development.
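The outline does not specify which metadata standard is meant under "Collect and Validate Quality Speech Data". As a minimal sketch only, a per-recording metadata record with a few sanity checks might look like the following. Every field name, the example values, and the 16 kHz threshold are assumptions made for illustration, not part of the course materials.

```python
from dataclasses import dataclass, asdict
import json


@dataclass
class RecordingMetadata:
    # All field names are hypothetical; the course outline defines no schema.
    recording_id: str
    language: str           # e.g. "Sadri" or "Kurux"
    speaker_id: str         # pseudonymous ID, not the speaker's name
    consent_obtained: bool
    device: str
    sample_rate_hz: int
    duration_s: float
    elicitation_task: str


def validate(meta: RecordingMetadata) -> list[str]:
    """Return a list of problems; an empty list means these checks passed."""
    problems = []
    if not meta.consent_obtained:
        problems.append("no recorded consent")
    if meta.sample_rate_hz < 16000:  # assumed minimum, not a course requirement
        problems.append("sample rate below 16 kHz")
    if meta.duration_s <= 0:
        problems.append("non-positive duration")
    return problems


# Illustrative values only; not real course data.
example = RecordingMetadata(
    recording_id="rec-0001",
    language="Sadri",
    speaker_id="spk-01",
    consent_obtained=True,
    device="smartphone",
    sample_rate_hz=44100,
    duration_s=12.5,
    elicitation_task="picture description",
)
print(json.dumps(asdict(example), indent=2))
print(validate(example))
```

Storing one such record alongside each audio file, and rejecting recordings that fail validation at collection time, is one way to make a dataset easier to audit and reuse.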
Technical Skills: Transcription and Annotation Standards
IPA Transcription
Master International Phonetic Alphabet conventions for accurate phonetic representation of Sadri speech sounds, ensuring consistency across annotators and compatibility with speech recognition systems.
UD Framework
Apply Universal Dependencies annotation standards for morphosyntactic analysis, enabling cross-linguistic comparison and integration with existing computational linguistics resources.
Leipzig Conventions
Implement standardised glossing conventions for morphological analysis, ensuring scholarly rigour and enabling effective collaboration within the linguistics community.
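The Leipzig Glossing Rules require word-by-word alignment between the object-language line and the gloss line, and the same number of hyphens in each aligned word. A minimal sketch of an automated check for those two rules follows. The function name is hypothetical, and the check ignores clitic boundaries and other conventions, so it illustrates the idea rather than serving as a complete validator.

```python
def check_gloss_alignment(morphemes: str, glosses: str) -> list[str]:
    """Check word count and per-word hyphen count between two glossing lines.

    Returns a list of problems; an empty list means both checks passed.
    """
    problems = []
    m_words = morphemes.split()
    g_words = glosses.split()
    if len(m_words) != len(g_words):
        problems.append(
            f"word count differs: {len(m_words)} vs {len(g_words)}"
        )
        return problems
    for i, (m, g) in enumerate(zip(m_words, g_words), start=1):
        # Each segmentable morpheme boundary is marked by a hyphen on both lines.
        if m.count("-") != g.count("-"):
            problems.append(f"word {i}: hyphen count differs ({m!r} vs {g!r})")
    return problems


# Schematic placeholder tokens, not real language data.
print(check_gloss_alignment("root-a-b root", "ROOT-X-Y ROOT"))
print(check_gloss_alignment("root-a-b root", "ROOT-X ROOT"))
```

A check of this kind can be run over annotated files before they are shared, so that misaligned glosses are caught while the annotator still remembers the example.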
Ethical Framework: Community Engagement and Bias Mitigation
Community Consent
Establish informed consent protocols that respect speakers' rights and ensure transparent communication about data usage and research outcomes.
Bias Recognition
Identify and address potential biases in data collection, annotation practices, and model development that could perpetuate linguistic inequalities.
Data Protection
Implement robust data security measures and privacy protection protocols throughout the research lifecycle, ensuring speaker confidentiality and cultural sensitivity.
Reciprocal Benefits
Develop sustainable frameworks for ensuring that technological advancements benefit the contributing communities rather than extracting value without reciprocation.
Transfer Learning and ASR Development
Model Evaluation
Performance metrics and validation protocols
Fine-tuning Strategies
Adaptation techniques for low-resource scenarios
Pre-trained Models
Foundation models trained on high-resource languages
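The outline does not name the performance metrics used for model evaluation. Word error rate (WER) is a common choice for ASR, so the sketch below assumes it: WER is the word-level edit distance between a reference transcript and the system output, divided by the number of reference words. The function name and the placeholder strings are illustrative assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)


# Placeholder tokens: one substitution and one deletion over four reference words.
print(word_error_rate("a b c d", "a x c"))  # 0.5
```

Because WER depends on how text is tokenised and normalised, the same normalisation should be applied to reference and hypothesis before comparing scores across systems.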
Join the Revolution in Ethical Language Technology
This intensive course represents a paradigm shift towards ethical, community-centred approaches to speech technology development. By combining rigorous linguistic methodology with cutting-edge computational techniques, participants will emerge as pioneers capable of bridging the digital divide that currently excludes millions of speakers from technological advancement.
The skills acquired during this week-long immersion extend far beyond technical competency. Participants will develop a nuanced understanding of the delicate balance between technological innovation and cultural preservation, ensuring that future AI systems serve all of India's linguistic communities with equal dignity and effectiveness.