Odia Parts of Speech Annotated Corpus


Owner Central Institute of Indian Languages

Catalogue Number: 1695



Tags: Odia Parts of Speech PoS Annotated Text Corpus


Dataset Description

683112 Tags | 587653 Words | 53288 Sentences

The Linguistic Data Consortium for Indian Languages (LDC-IL) has developed Parts-of-Speech annotated corpora for the Scheduled Indian languages, including an annotated text corpus for Odia. The corpus is annotated with Part-of-Speech (PoS) tags based on the Bureau of Indian Standards (BIS) PoS Tagset. The Odia PoS annotated corpus is automatically tagged and then verified by linguistic experts to ensure accuracy and consistency. This data is a significant resource for natural language processing and linguistic research.
The Odia PoS annotated corpus contains 683112 Part-of-Speech tags.

For research use, please cite the following:

1. Subhashree Mohanty, Dr. Narayan Choudhary, Rajesha N., Manasa G. 2026. Odia Parts of Speech Annotated Corpus. Central Institute of Indian Languages, Mysore. 978-81-69175-88-3.

2. Rejitha K. S. and Narayan Kumar Choudhary. (ed.). 2026. LDC-IL Parts of Speech Annotated Corpus Based on BIS Framework. Central Institute of Indian Languages, Mysore. 978-81-69175-60-9.

Item specifics

Authors: Subhashree Mohanty, Dr. Narayan Choudhary, Rajesha N., Manasa G.
Corpus Type: Parts of Speech Annotated Text Corpus
Catalogue Number: 1698
ISBN: 978-81-69175-88-3
Data Source: Annotated
Word Count: 587653
Release Date: 23/3/2026
Terms and Conditions: General instructions for use of the resources provided by LDC-IL.
Tag Count: 683112
