Maithili Raw Speech Corpus Vol. II
Dataset Description
109:09:50 (hh:mm:ss) | 206 Audio Segments | 122 Speakers
The LDC-IL Maithili Raw Speech dataset Vol. II comprises audio files in WAV format, accompanied by a corresponding textual layer containing phonetically normalized and orthographically normalized annotations in Devanagari script. The dataset spans a duration of 109:09:50 (hh:mm:ss) and consists of read speech with continuous text and spontaneous speech, along with its transcription in Devanagari. The data is derived from 49 female and 73 male native Maithili speakers, encompassing diverse age groups and regions. A comprehensive explanation of the dataset can be found in the Maithili Raw Speech Documentation.
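The headline figures above can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of the LDC-IL release: the function and variable names are illustrative assumptions, and the averages are simple means derived from the stated totals, since the source does not give the actual length of individual segments or the recording time per speaker.

```python
def hms_to_seconds(hms: str) -> int:
    """Convert an 'hh:mm:ss' string (hours may exceed 24) to total seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# Figures quoted in the dataset description.
TOTAL_DURATION = "109:09:50"
NUM_SEGMENTS = 206
FEMALE_SPEAKERS = 49
MALE_SPEAKERS = 73

total_seconds = hms_to_seconds(TOTAL_DURATION)
num_speakers = FEMALE_SPEAKERS + MALE_SPEAKERS

# Mean values only; real segments and speakers will vary around these.
avg_segment_minutes = total_seconds / NUM_SEGMENTS / 60
avg_speaker_minutes = total_seconds / num_speakers / 60

print(f"Total duration: {total_seconds} s ({total_seconds / 3600:.2f} h)")
print(f"Speakers: {num_speakers}")
print(f"Mean audio per segment: {avg_segment_minutes:.1f} min")
print(f"Mean audio per speaker: {avg_speaker_minutes:.1f} min")
```

The speaker counts sum to the 122 speakers listed in the summary line, and the mean works out to roughly half an hour of audio per segment.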
For research-based citations, please use the following:
- Shantanu Kumar, Ankita Tiwari, Rajesha N., Manasa G., Dr. Narayan Kumar Choudhary, Prof. Shailendra Mohan. 2025. Maithili Raw Speech Corpus Vol. II. Central Institute of Indian Languages, Mysore. ISBN 978-93-48633-37-8.
- Rejitha K. S. and Narayan Kumar Choudhary (eds.). 2025. LDC-IL Corpus Insights. Central Institute of Indian Languages, Mysore. ISBN 978-93-48633-33-0.
Item Specifics
- Authors: Shantanu Kumar, Ankita Tiwari, Rajesha N., Manasa G., Dr. Narayan Kumar Choudhary, Prof. Shailendra Mohan
- Catalogue Number: 1513
- ISBN: 978-93-48633-37-8
- Data Source: On Field
- Duration: 109:09:50 (hh:mm:ss)
- Number of Audio Segments: 206
- Release Date: 20/03/2025
- Terms and Conditions: General instructions for use of the resources provided by LDC-IL.