CLE makes these linguistic resources available at no cost to support academic, non-commercial research. The processing fees charged are used to maintain these resources. Please contact CLE directly about discounts (available only to selected public organizations in Pakistan) or about commercial licensing options.

CLE Pakistan District Names Speech Corpus - Urdu Speakers

CLE Catalog #: CLE16S008
Release Date: 12 July 2016
First Language of Speakers: Urdu
Duration: 52 minutes
Number of Utterances: 3424
Distribution: 1 DVD, Web Download
Processing Fee (Pakistan): 30000 PKR
Processing Fee (International): 250 USD
License: Yes

Introduction

This package is a collection of speech recordings of the district names of Pakistan, recorded from Urdu speakers. The corpus comprises 139 single-word vocabulary items. The data was recorded over a mobile channel at a sampling rate of 8 kHz with 16 bits per sample. The gender and district of origin of each speaker are provided with the corpus. The speakers range in age from 18 to 50 years. The data was collected in outdoor and office environments. The corpus has been cleaned and verified by expert linguists. The data is annotated at the word level using CI SAMPA, which is mapped to the Urdu IPA symbols.
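
As a rough illustration of the recording parameters described above, the following sketch checks whether a single audio file matches the stated 8 kHz sampling rate and 16 bits per sample. It assumes the files are uncompressed PCM WAV files that Python's standard `wave` module can read; the function name `check_wav_format` and the shape of the returned dictionary are hypothetical and are not part of the corpus distribution.

```python
import wave

EXPECTED_RATE_HZ = 8000    # 8 kHz, as stated in the corpus description
EXPECTED_SAMPLE_WIDTH = 2  # 16 bits per sample = 2 bytes


def check_wav_format(path):
    """Return (matches, details) for one WAV file.

    Hypothetical helper: assumes an uncompressed PCM WAV file.
    """
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()
        frames = wav.getnframes()

    details = {
        "sample_rate_hz": rate,
        "bits_per_sample": width * 8,
        "duration_s": frames / rate if rate else 0.0,
    }
    matches = rate == EXPECTED_RATE_HZ and width == EXPECTED_SAMPLE_WIDTH
    return matches, details
```

Calling `check_wav_format` on one recording returns a flag plus the measured sampling rate, bit depth, and duration, which makes it easy to spot files that were resampled or re-encoded after download.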

Data Source

Data was collected from students and employees of various universities and research institutes, largely from Lahore, Quetta, Bahawalnagar, Peshawar, Gujranwala, Gujrat, Karachi, Faisalabad and Rawalpindi.

Data

A list of the vocabulary items covered in the corpus is available here. The package contains three folders. The details of each folder are as follows:
- male: This folder contains audio files from male speakers in WAV format.
- female: This folder contains audio files from female speakers in WAV format.
- info: This folder contains information about the corpus.
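
The folder layout above can be inspected with a short script. This is a minimal sketch that counts the WAV files under the `male` and `female` folders. It assumes the folder names are exactly as listed and that the audio files use a `.wav` extension. The internal layout of each folder is not described here, so the search is recursive. The function name `count_wav_files` is hypothetical.

```python
from pathlib import Path


def count_wav_files(corpus_root):
    """Count .wav files under the 'male' and 'female' folders of the package.

    Hypothetical helper: folder names are taken from the package description;
    the extension match is case-insensitive and sub-folders are included.
    """
    root = Path(corpus_root)
    counts = {}
    for folder in ("male", "female"):
        folder_path = root / folder
        if not folder_path.is_dir():
            counts[folder] = 0
            continue
        counts[folder] = sum(
            1
            for p in folder_path.rglob("*")
            if p.is_file() and p.suffix.lower() == ".wav"
        )
    return counts
```

If each utterance is stored as one file (an assumption, not stated in the description), the sum of the two counts can be compared against the 3424 utterances listed in the catalog entry as a quick completeness check after download.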

Sample

Download Sample