Pakistan has a rich multilingual and multicultural heritage, with about 70 spoken languages drawn from the Indo-Aryan, Indo-Iranian, Sino-Tibetan and Dravidian language families. More than half of these languages also have a written form, predominantly in the Perso-Arabic Nastalique and Arabic Naskh writing styles. Gujarati, Gurmukhi and Tibetan scripts are also used by some communities, while others are still in the process of defining their writing systems. These languages exhibit a diverse set of sounds and underlying linguistic structures that are both linguistically and computationally exciting and challenging. Most of these languages are not well studied or well modeled, and they present a vast training ground for researchers in linguistics and computer science.
The Center for Language Engineering (CLE) conducts research and development in the linguistic and computational aspects of languages, specifically those of Pakistan and developing Asia. The center also actively arranges and participates in seminars, workshops and conferences dedicated to promoting language processing nationally and internationally. This work will be instrumental in the development of computing in the relevant languages.
CLE aims to create opportunities for local populations to access information and communicate in their local languages, so that they can make full use of Information and Communication Technology for their socio-economic benefit.
This entails research and development objectives in the following areas:
· Linguistics and writing systems
· Language computing standards
· Language, speech and script processing
· Relevant local language content development and dissemination
· Local language computing adoption and use
· Local language computing policy
CLE has a dedicated team of graduate students and full-time research staff, including linguists, sociologists, computer scientists and engineers, striving to achieve these objectives.