Telugu Sentence Aligned Speech Corpus

Catalogue Number: 1506
Stock: In Stock

Dataset Description

15:38:53 hours | 10.1 GB | 9,548 Audio Segments | 80 Speakers

The LDC-IL Telugu Sentence Aligned Speech dataset comprises audio files in WAV format, each accompanied by a textual layer containing phonetically normalized and orthographically normalized annotations in Telugu script. The dataset has a total duration of 15:38:53 (hh:mm:ss) and consists of read speech covering continuous text, representative sentences, and date formats. The data was recorded from 80 native Telugu speakers (24 female and 56 male) across diverse age groups and regions. A comprehensive description of the dataset can be found in the Telugu Sentence Aligned Speech Documentation.
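
As a rough sanity check on the figures above, the following minimal sketch converts the stated duration to seconds and derives an approximate mean segment length and the share of female speakers. The function and variable names are illustrative and not part of the dataset or its tooling. The mean is only indicative, since it assumes the stated duration is the sum of all 9,548 segments.

```python
def hms_to_seconds(hms: str) -> int:
    """Convert an 'hh:mm:ss' string to a total number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# Figures taken from the dataset description.
total_seconds = hms_to_seconds("15:38:53")
num_segments = 9548
female_speakers, male_speakers = 24, 56

# Assumption: the total duration is spread over all segments.
mean_segment_seconds = total_seconds / num_segments
female_share = female_speakers / (female_speakers + male_speakers)

print(f"Total duration: {total_seconds} s")
print(f"Approximate mean segment length: {mean_segment_seconds:.2f} s")
print(f"Share of female speakers: {female_share:.0%}")
```

Under that assumption, each segment averages roughly six seconds of read speech, which is consistent with sentence-level alignment.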


For research citations, please cite the following:

  1. Dr. Modugu Kasimbabu, Kavitha Lenin, Rajesha N., Manasa G., Stephen Fernandes, Nithin S., Roopashri M. R., Dr. Narayan Kumar Choudhary, Prof. Shailendra Mohan. 2025. Telugu Sentence Aligned Speech Corpus. Central Institute of Indian Languages, Mysore. ISBN 978-93-48633-04-0.
  2. Rejitha K. S. and Narayan Kumar Choudhary (eds.). 2025. LDC-IL Corpus Insights. Central Institute of Indian Languages, Mysore. ISBN 978-93-48633-33-0.

Item specifics

  • Authors: Dr. Modugu Kasimbabu, Kavitha Lenin, Rajesha N., Manasa G., Stephen Fernandes, Nithin S., Roopashri M. R., Dr. Narayan Kumar Choudhary, Prof. Shailendra Mohan
  • Corpus Type: Sentence Aligned Speech Corpus
  • Catalogue Number: 1506
  • ISBN: 978-93-48633-04-0
  • Data Source: On Field
  • Duration: 15:38:53 hours
  • Number of Audio Segments: 9,548
  • Release Date: 20-Mar-25
  • Terms and Conditions: General instructions for use of the resources provided by LDC-IL.
Commercial User
Non-Commercial User
LDC-IL Raw Text Corpora: An Overview
LDC-IL Raw Speech Corpora: An Overview
