Zeliang Parallel Text Corpus: Linguistic Features and Structures
Overview
Total Words: 4,404,845 | Zeliang Words: 33,883 | 5,332 sentences/phrases in each mother tongue
Dataset Description
India has 270 mother tongues as per the 2011 census. Following the requirements of the National Education Policy (NEP) 2020, LDC-IL developed a parallel corpus in Indian mother tongues. The Zeliang parallel text corpus is linked with English and 146 mother tongues of India. It contains 5,332 sentences/phrases organized by 159 grammatical categories. The Zeliang section includes 33,883 words and 160,654 characters. Overall, the corpus comprises 4,404,845 words (over 4.4 million) and 23,374,289 characters (approximately 23.4 million).
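The figures above are published in two digit-grouping styles (for example `44,04,845` in Indian grouping and `4,404,845` in international grouping). The following is a minimal sketch, not part of the corpus distribution, showing one way to normalize such figures and derive a few rough ratios from them. The function name `parse_grouped` and the derived ratios are illustrative assumptions; the page does not say whether the character counts include spaces, so the characters-per-word value is only indicative.

```python
def parse_grouped(number_text: str) -> int:
    """Strip grouping commas (Indian or international style) and return an int."""
    return int(number_text.replace(",", ""))


# Figures copied from the dataset description.
total_words = parse_grouped("44,04,845")    # Indian grouping
total_chars = parse_grouped("2,33,74,289")  # Indian grouping
zeliang_words = parse_grouped("33883")
zeliang_chars = parse_grouped("160654")
sentences = parse_grouped("5,332")

# Rough descriptive ratios (illustrative only, not published statistics).
words_per_sentence = zeliang_words / sentences
chars_per_word = zeliang_chars / zeliang_words
zeliang_share = zeliang_words / total_words

print(f"Total words: {total_words:,}")
print(f"Total characters: {total_chars:,}")
print(f"Zeliang words per sentence/phrase: {words_per_sentence:.2f}")
print(f"Zeliang characters per word: {chars_per_word:.2f}")
print(f"Zeliang share of all words: {zeliang_share:.2%}")
```

Parsing both styles with the same function confirms that the totals in the description agree with the Word Count and Character Count listed under the item specifics.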
The price indicated corresponds to a single language component. The total payment depends on the number of language components requested.
For research-based citations, please cite the following:
1. Dr. Kamaraj S., Dr. Rejitha K. S., Dr. Narayan Choudhary, Prof. Shailendra Mohan. 2026. Zeliang Parallel Text Corpus: Linguistic Features and Structures. Central Institute of Indian Languages, Mysore. ISBN 978-81-69175-74-6.
2. Rejitha K. S. and Narayan Kumar Choudhary (eds.). 2025. LDC-IL Corpus Insights. Central Institute of Indian Languages, Mysore. ISBN 978-93-48633-33-0.
Item specifics
- Corpus Type: Parallel Text Corpus
- Catalogue Number: 1710
- ISBN: 978-81-69175-74-6
- Data Source: Descriptive Grammar
- Character Count: 23,374,289
- Word Count: 4,404,845
- Release Date: 23/3/2026
