How to Get Data from LDC-IL?

Page Navigation

Step-wise Procedures for Commercial Users:
Step 1: Registration on the Portal
Step 2: Requesting Data on the Portal
Step 3: Submitting Hard copies of the Agreement and Other Documents
Step 4: Connecting with the Official of LDC-IL
Step 5: Making payment
Step 6: Approvals
Step 7: Getting the Data (via Media)
Step-wise Procedures for Non-Commercial Users:
Step 1: Checking your eligibility for the Non-Commercial License
Step 2: Registration on the Portal
Step 3: Requesting Data on the Portal
Step 4: Submitting Hard copies of the Agreement and Other Documents
Step 5: Connecting with the Official of LDC-IL
Step 6: Making payment (if any)
Step 7: Approvals
Step 8: Getting the Data (via Media)

Overview: Accessing Data from LDC-IL

Obtaining the datasets requires some manual steps by both the requester and LDC-IL, so you cannot download a dataset directly just by registering on the portal. The procedure may take a week or more, depending on how the dataset is transferred.
The text datasets are small (up to 100 MB per language), so they are distributed online through a confidential download link. The speech datasets are much larger (usually above 50 GB per language) and therefore cannot be downloaded online.
Instead, speech data is transferred on a storage medium. You can either procure the storage device yourself and send it to us, in which case we copy the data onto it and send it back to you, or pay for the USB/storage device, in which case we procure it and send you the speech data on that device.
A step-wise summary of the data procurement procedure is given in the table of contents above.
At present, we provide the data to users in the two categories listed below.

1. Commercial Users

  • MNCs and foreign entities
  • Non-MNC Indian companies
  • MSMEs/entities from SAARC countries
  • Startups/MSMEs with a turnover of less than 5 crores

2. Non-Commercial Users

  • Research students: PhD scholars/research scholars
  • Research/educational institutions of State and Central Governments

    Step-wise Procedures for Commercial Users:

    Step 1: Registration on the Portal
  • Click on Register.
  • Select the Commercial user group and continue.
  • Fill in your details, such as first name, last name, e-mail, telephone/mobile number and password.
  • Complete the CAPTCHA validation and click on the privacy policy to complete the registration.
  • A confirmation e-mail from LDC-IL will be sent to your registered e-mail ID. Your registration is then complete.
  • You can now log in to your account to view the datasets and raise requests for them.

    Step 2: Requesting Data on the Portal

  • On the home page, click on Login to go to your account.
  • Select the Text/Speech corpus for your preferred language and add it to the cart.
  • Go to My Cart and click on View Cart/Checkout.
  • The billing details show the address you entered while registering your account. If you wish to change the address, update it there, then click on Continue.
  • Under Payment Method, read the instructions in the blue box, which state in whose favour the Cheque/DD must be drawn and the address to which the necessary documents must be sent. Enter your Cheque/DD details, or skip this and send the Cheque/DD later along with the documents. Tick and click on Terms and Conditions.
  • Download the commercial undertaking document, then click on Continue to proceed to Confirm Request.
  • In Confirm Request, read the bill amount and the important information carefully before confirming your request, as you cannot make any changes after this step.

    Step 3: Submitting Hard copies of the Agreement and Other Documents

    You need to submit all of the following documents (self-attested); they are mandatory for raising the request.

    ·  Valid ID Card/Citizen Card/Passport/Driving License (for international requests)
    ·  Photocopy of valid Voter's ID/UID/PAN card
    ·  Photocopy of Business/Entity License
    ·  Photocopy of the company's PAN card and GST certificate
    ·  The Cheque/Demand Draft, drawn in favour of “MHRD HIGHER CAS CLG, NEW DELHI”
    ·  A printout of the Undertaking, duly filled in and attested. You need to sign at the bottom of every page of the Undertaking, and the head of your organisation needs to sign at the end of the document and affix the company seal.

    Post the necessary documents to the address below:

        The Director
        LDCIL Data Request
        Central Institute of Indian Languages
        Manasagangotri, Hunsur Road,
        Mysore - 570006
        Karnataka, India

    Step 4: Connecting with the Official of LDC-IL

    For any queries, please call 0821-2345098
    or mail us at: oic hyphen ldcil[at]gov[dot]in

    Step 5: Making payment

    All payments must be made by Cheque or Demand Draft drawn in favour of “MHRD HIGHER CAS CLG, NEW DELHI”. In exceptional cases, we may provide a mechanism for NEFT/RTGS/bank transfer; for this you need to write to us separately.

    Step 6: Approvals

    After all the necessary documents and the payment have been received, the request is put up for internal approval by the competent authority of CIIL. This may take a minimum of 10 working days.

    Step 7: Getting the Data (via Media)

    After the internal approval, the data will be made available to the requester for download. If the data is larger (e.g. above 2 GB, as is the case for all the speech data), delivery may take more time, as the data is copied onto physical media and couriered to the requester's address.

    The physical media charges are as follows:
    Sl No   Data Size           Link/Physical Media   Charges *
    1       Up to 2 GB          Web link              No charges *
    2       2 GB to 110 GB      128 GB pen drive      Rs. 2000/- (within India)
    3       Exceeding 110 GB    External hard disk    Rs. 4500/- (within India)


    ·  Charges include the cost of the physical media as well as shipping charges.
    ·  Shipping charges may be extra for foreign countries; these will be intimated after the data request is received.
    ·  The requester may instead send their own physical media that has a provision for encrypting the data. In that case, no payment for the media is needed.
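
    The size thresholds in the charges table above can be read as a simple lookup. The sketch below is only an illustration: the function name delivery_for_size, the Delivery type and the handling of the exact 2 GB and 110 GB boundaries are assumptions, not part of the LDC-IL portal. The amounts are the within-India charges; extra shipping for foreign countries and the case where the requester sends their own media are not covered.

```python
from typing import NamedTuple


class Delivery(NamedTuple):
    """Hypothetical record: how a dataset is delivered and what it costs."""
    medium: str
    charge_inr: int  # within-India charge in rupees, as listed in the table


def delivery_for_size(size_gb: float) -> Delivery:
    """Map a dataset size in GB to the delivery medium and charge.

    Assumption: exactly 2 GB counts as "up to 2 GB" and exactly 110 GB
    counts as "2 GB to 110 GB"; the table does not state this explicitly.
    """
    if size_gb < 0:
        raise ValueError("size_gb must be non-negative")
    if size_gb <= 2:
        return Delivery("Web link", 0)
    if size_gb <= 110:
        return Delivery("128 GB pen drive", 2000)
    return Delivery("External hard disk", 4500)


if __name__ == "__main__":
    # Example sizes: a small text corpus, a speech corpus, a very large request.
    for size in (0.1, 50, 300):
        d = delivery_for_size(size)
        print(f"{size} GB: {d.medium}, Rs. {d.charge_inr}")
```

    A text corpus of about 100 MB falls in the first row (web link, no charge), while a typical speech corpus above 50 GB falls in the second or third row.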

    Step-wise Procedures for Non-Commercial Users:

    Step 1: Checking your eligibility for the Non-Commercial License

    Individual researchers affiliated to recognized academic and research institutes may use the data only during their tenure at that institute (or subsequently at other research institutes). Once they cease to be affiliated to any research or non-commercial institute, they should destroy the datasets and not use them for any commercial purpose. For the full terms and conditions, please see the Non-Commercial Undertaking.

    Step 2: Registration on the Portal

  • Click on Register.
  • Select the Non-Commercial user group and continue.
  • Fill in your details, such as first name, last name, e-mail, telephone/mobile number and password. Users need to upload a soft copy of a valid ID proof for registration.
  • Complete the CAPTCHA validation and click on the privacy policy to complete the registration.
  • A confirmation e-mail from LDC-IL will be sent to your registered e-mail ID. Your registration is then complete.

    Step 3: Requesting Data on the Portal

  • On the home page, click on Login to go to your account.
  • Select the Text/Speech corpus for your preferred language and add it to the cart.
  • Go to My Cart and click on View Cart/Checkout.
  • The billing details show the address you entered while registering your account. If you wish to change the address, update it there, then click on Continue.
  • Under Payment Method, read the instructions in the blue box, which state in whose favour the Cheque/DD must be drawn and the address to which the necessary documents must be sent. You need not enter Cheque/DD details here, as this is a non-commercial request. Tick and click on Terms and Conditions.
  • Download the non-commercial undertaking document, then click on Continue to proceed to Confirm Request.
  • In Confirm Request, kindly IGNORE the bill amount, as this is a non-commercial request, and read the important information carefully before confirming your request, as you cannot make any changes after this step.

    Step 4: Submitting Hard copies of the Agreement and Other Documents

    You need to submit all of the following documents (self-attested); they are mandatory for raising the request.

    ·  Photocopy of valid Voter's ID/UID/PAN card/Passport/Central or State Government ID
    ·  Photocopy of Student/Employee ID
    ·  An institution must be registered and based in India.
    ·  Research students should send a letter through the HOD of their university.
    ·  A brief report on the research for which you require the LDC-IL data
    ·  A printout of the Undertaking, duly filled in and attested. You need to sign at the bottom of every page of the Undertaking, and your HOD needs to sign at the end of the document and affix the department seal.

    Post the necessary documents to the address below:

        The Director
        LDCIL Data Request
        Central Institute of Indian Languages
        Manasagangotri, Hunsur Road,
        Mysore - 570006
        Karnataka, India

    Step 5: Connecting with the Official of LDC-IL

    For any queries, please call 0821-2345098/2345007
    or mail us at: oic hyphen ldcil[at]gov[dot]in

    Step 6: Making payment (if any)

    All payments must be made either via Cheque or Demand Draft drawn in favour of “MHRD HIGHER CAS CLG, NEW DELHI”.

    Step 7: Approvals

    After all the necessary documents and the payment (if any) have been received, the request is put up for internal approval by the competent authority of CIIL. This may take a minimum of 10 working days.

    Step 8: Getting the Data (via Media)

    After the internal approval, the data will be made available to the requester for download. If the data is larger (e.g. above 2 GB, as is the case for all the speech data), delivery may take more time, as the data is copied onto physical media and couriered to the requester's address.

    The physical media charges are as follows:
