

Assamese Raw Speech Corpus


54:21:12 Hours | 32.5 GB | 304 Speakers | 37,570 Audio Segments | 48 kHz | 16 bit wav.

Assamese is the official language of Assam. It is widely spoken in the state of Assam and in some parts of Arunachal Pradesh and Nagaland. According to the 2011 census, Assamese is spoken by 15 million speakers. As a widely spoken language, Assamese has several dialectal variations. The regional dialects can be broadly divided into two groups: the Eastern Group and the Western Group. LDC-IL divided the Assamese-speaking areas into four regions (Xiboxagoria, Central Assam, Kamrupi, and Goalparia) and collected speech data from speakers in each region. The LDC-IL Assamese speech data set consists of different types of data: word lists, sentences, running texts, and date formats.

The available Speech Corpus details:

Total Speakers: 304 (154 Female and 150 Male)

Domains | Audio Segments | Each Domain Duration
Contemporary Text (News) | 304 | 17:23:25
Creative Text | 304 | 11:44:37
Sentence | 7593 | 5:55:29
Date Format | 599 | 0:33:59
Command and Control Words | 9118 | 4:56:49
Person Name | 6081 | 5:38:07
Place Name | 3044 | 1:58:33
Phonetically Balanced-W | 46567 | 3:41:45
Form and Function-Word-W | 53960 | 2:28:28

A detailed explanation of the Assamese Speech Corpus will be available in the Assamese Speech Data Documentation.

For research-based citations, please use the following:

Ramamoorthy L., Narayan Kumar Choudhary, Atreyee Sharma, Jahnobi Kalita, Samhita Bharadwaj, Plabita Bora, Priyanshee Adhyapak, Mustafiza Tamim, Rajesha N., Manasa G. 2021. Assamese Raw Speech Corpus. Central Institute of Indian Languages, Mysore.

Choudhary, Narayan, Rajesha N., Manasa G. & L. Ramamoorthy. 2019. “LDC-IL Raw Speech Corpora: An Overview” in Linguistic Resources for AI/NLP in Indian Languages. Central Institute of Indian Languages, Mysore. pp. 160-174...
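The per-domain durations listed above can be checked against the stated corpus total of 54:21:12 hours. The following is a minimal sketch in Python, not part of the LDC-IL distribution. The helper names `to_seconds` and `to_hms` are illustrative assumptions, and the duration values are copied from the listing.

```python
# Per-domain durations (H:MM:SS) copied from the corpus listing.
DOMAIN_DURATIONS = {
    "Contemporary Text (News)": "17:23:25",
    "Creative Text": "11:44:37",
    "Sentence": "5:55:29",
    "Date Format": "0:33:59",
    "Command and Control Words": "4:56:49",
    "Person Name": "5:38:07",
    "Place Name": "1:58:33",
    "Phonetically Balanced-W": "3:41:45",
    "Form and Function-Word-W": "2:28:28",
}


def to_seconds(hms: str) -> int:
    """Convert an H:MM:SS string to a number of seconds (hypothetical helper)."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def to_hms(total_seconds: int) -> str:
    """Convert a number of seconds back to H:MM:SS (hypothetical helper)."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"


# Sum all domain durations and print the total in H:MM:SS form.
total = sum(to_seconds(d) for d in DOMAIN_DURATIONS.values())
print(to_hms(total))  # prints 54:21:12
```

The sum of the nine domain durations agrees with the total duration given in the corpus summary line.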
