128:46:59 hours | 81.2 GB | 476 Speakers | 73470 Audio Segments | 48 kHz | 16 bit wav. Bengali is the official language of West Bengal and Tripura. It belongs to the Indo-Aryan language family. Bengali is influenced by Sanskrit. Greater use of Bengali has contributed to the growth of the language in terms of vocabulary and the number of styles ..
176:53:28 hours of 113 Gigabytes speech data | 456 Speakers | 77443 Audio segments | 48 kHz | 16 bit wav. Bodo, one of the scheduled languages of India, is one of the tonal languages of the world. Bodo has two clearly distinguishable tones, known as Low and High. The language belongs to the Tibeto-Burman linguistic ..
118:40:03 Hours | 75.1 GB | 489 Speakers | 73695 Audio Segments | 48 kHz | 16 bit wav. Hindi is a major Indo-Aryan language, a descendant of Sanskrit, spoken in central and northern India, in the states of Bihar, Chhattisgarh, Delhi, Haryana, Himachal Pradesh, Jharkhand, Madhya Pradesh, Rajasthan, Uttarakhand and Uttar Pradesh. The LDC-IL..
179:32:52 hours of 115 Gigabytes speech data | 656 Speakers | 99109 Audio segments | 48 kHz | 16 bit wav. Kannada is one of the ancient Indian languages belonging to the Dravidian family. It has its own script. The language in a region is influenced by other languages of the region, the mother tongue of the speaker, etc. The reading speed, lo..
156:37:51 hours of 100 Gigabytes speech data | 504 Speakers | 72,938 Audio segments | 48 kHz | 16 bit wav. Konkani belongs to the Indo-European family of languages. Konkani is the official language of Goa. However, the language is spoken widely across four states: Maharashtra, Goa, Karnataka and Kerala. Konkani is the only Indian language written..
72:02:12 Hours | 44.8 GB | 300 Speakers | 35109 Audio Segments | 48 kHz | 16 bit wav. Maithili is an Indo-Aryan language, a direct descendant of Sanskrit, spoken in the states of Bihar and Jharkhand and in part of Nepal. It is one of the scheduled languages of India. The LDC-IL speech data is collected from geographic dialects of Sotipur..
164:01:02 Hours | 65.5 GB | 458 Speakers | 43670 Audio Segments | 48 kHz | 16 bit wav. Malayalam is the official language of Kerala and Laccadive Islands. It belongs to the Dravidian language family. According to the formation of Kerala and the language of Travancore, Cochin and Malabar regions are influenced by different internal and extern..
156:28:32 hours of Manipuri Raw Speech Corpus | 100 GB | 620 Speakers | 66,231 Audio segments | 48 kHz | 16 bit wav. Manipuri is the administrative language of Manipur. The development of LDC-IL Speech Data for Manipuri lies in capturing all the distinctive characteristics of speeches shared by different regional dialec..
89:17:25 hours of 58 Gigabytes speech data | 307 Speakers | 58544 Audio segments | 48 kHz | 16 bit wav. Marathi is an Indo-Aryan language that has been prevalent since the 9th century. Standard Marathi (Puneri) is the official language of the State of Maharashtra. Standard Marathi is based on dialects used by academics and the pri..
87:14:44 Hours | 56.5 GB | 350 Speakers | 48975 Audio Segments | 48 kHz | 16 bit wav. Nepali belongs to the Indo-Aryan language family. Nepali is the official language of Nepal and of the Indian states of West Bengal and Sikkim, and is spoken in the states of Uttaranchal, Assam, Arunachal Pradesh, Manipur, Mizoram and Bihar, as well as in other countries like ..
101:09:28 Hours | 65.5 GB | 467 Speakers | 76,240 Audio Segments | 48 kHz | 16 bit wav. Punjabi is one of the Indo-Aryan languages. It is a tonal language with three tones: high-falling, low-rising, and level (neutral). Punjabi is not spoken only in India; it is also a language of Pakistan, where it is called Shahmukhi Punjabi. Here we are tal..
22:43:59 hours | 15 GB | 80 Speakers | 10,510 Audio Segments | 48 kHz | 16 bit wav. Telugu is the official language of Telangana and Andhra Pradesh States. It belongs to the Dravidian language family. Among the Dravidian languages, Telugu is spoken by the largest population. Telugu is agglutinative in nature and its vocabulary is ve..
5161927 Words | 739 Titles | XML format | 5 domains.
99:18:21 hours | 64.2 GB | 499 Speakers | 88,708 Audio Segments | 48 kHz | 16 bit wav. Urdu is one of the Modern Indo-Aryan languages of India. It evolved from Shaurseni Apabhramsha. It uses the Perso-Arabic script. The language in a region is influenced by ot..