Manipuri Sentence Aligned Speech Corpus (Meetei Mayek)
Overview
116:34:24 hours | 75.9 GB | 60,819 Audio Segments | 589 speakers
Dataset Description
The LDC-Manipuri Sentence Aligned Speech dataset comprises audio files in WAV format, each accompanied by a corresponding textual layer containing orthographically normalized annotation in Meetei Mayek. The dataset spans a duration of 116:34:24 (hh:mm:ss) and consists of read speech covering continuous text, representative sentences, and date formats. The data was collected from 295 female and 294 male native Manipuri speakers across diverse age groups and regions. A comprehensive explanation of the dataset can be found in the Manipuri Sentence Aligned Speech Documentation.
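The headline figures above can be turned into rough per-segment and per-speaker averages. The following is a minimal sketch that uses only the numbers quoted in the description; the helper name hms_to_seconds is hypothetical and is not part of any LDC-IL tooling, and the averages are derived estimates rather than published statistics.

```python
def hms_to_seconds(hms: str) -> int:
    """Convert an 'hh:mm:ss' string (hours may exceed 24) to whole seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# Figures quoted in the dataset description.
total_seconds = hms_to_seconds("116:34:24")
num_segments = 60_819
num_speakers = 295 + 294  # female + male speakers

# Derived averages (estimates only; actual segment lengths vary).
mean_segment_seconds = total_seconds / num_segments
mean_speaker_minutes = total_seconds / num_speakers / 60

print(f"total duration: {total_seconds} s")
print(f"mean segment length: {mean_segment_seconds:.2f} s")
print(f"mean audio per speaker: {mean_speaker_minutes:.1f} min")
```

The same helper can be applied to any other hh:mm:ss value listed for the corpus.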
For research citations, please cite the dataset as follows:
- Amom Nandaraj Meetei, Yumnam Premila Chanu, Rajesha N., Manasa G., Stephen Fernandes, Nithin S., Roopashri M.R., Dr. Narayan Kumar Choudhary, Prof. Shailendra Mohan. 2025. Manipuri Sentence Aligned Speech Corpus. Central Institute of Indian Languages, Mysore. ISBN 978-93-48633-96-5
Item specifics
- Authors: Amom Nandaraj Meetei, Yumnam Premila Chanu, Rajesha N., Manasa G., Nithin S., Dr. Narayan Kumar Choudhary, Prof. Shailendra Mohan
- Corpus Type: Sentence Aligned Speech Corpus
- Catalogue Number: 1504
- ISBN: 978-93-48633-96-5
- Data Source: On Field
- Duration: 123:29:55 (hh:mm:ss)
- Number of Audio Segments: 60,819
- Release Date: 2025-03-20
- Terms and Conditions: General instructions for use of the resources provided by LDC-IL.