Center for Language Engineering







  Urdu Cleaning Application  

Urdu text corpora commonly suffer from spacing problems, misplaced or missing zero-width non-joiners (ZWNJ), compound-word inconsistencies, typographical errors, normalization issues, and affixation errors. This utility lets the user remove these errors semi-automatically and clean the text corpus. The user navigates a text file line by line using the Next and Previous buttons. Separate buttons add or remove the zero-width non-joiner and insert Urdu special symbols with a single click.
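As an illustration of the kind of cleaning described above, here is a minimal Python sketch. It is not the application's code: the function names (`strip_zwnj`, `insert_zwnj`, `normalize_line`) and the two-character mapping table are assumptions chosen for the example, and a real normalization table for Urdu would cover many more characters.

```python
import re
import unicodedata

ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER

# Illustrative (assumed) mapping from Arabic code points to the forms
# normally used in Urdu text; a real table would be much larger.
ARABIC_TO_URDU = str.maketrans({
    "\u064a": "\u06cc",  # ARABIC LETTER YEH -> ARABIC LETTER FARSI YEH
    "\u0643": "\u06a9",  # ARABIC LETTER KAF -> ARABIC LETTER KEHEH
})


def strip_zwnj(line: str) -> str:
    """Remove every zero-width non-joiner from a line."""
    return line.replace(ZWNJ, "")


def insert_zwnj(line: str, index: int) -> str:
    """Insert a zero-width non-joiner before the character at `index`."""
    return line[:index] + ZWNJ + line[index:]


def normalize_line(line: str) -> str:
    """Apply NFC, map Arabic look-alikes to Urdu letters, and tidy spaces."""
    line = unicodedata.normalize("NFC", line)
    line = line.translate(ARABIC_TO_URDU)
    # Collapse runs of spaces or tabs, then trim both ends of the line.
    return re.sub(r"[ \t]+", " ", line).strip()


if __name__ == "__main__":
    # Process a small in-memory "file" one line at a time.
    for raw in ["  \u0643\u064a   x ", "a\u200cb"]:
        print(repr(normalize_line(strip_zwnj(raw))))
```

Each function handles one class of error, so the steps can be applied one at a time while a human reviews each line, which matches the semi-automatic workflow the utility is built around.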

  Download (This file has been accessed: times, since 24 July 2012)  
  Urdu Cleaning Application v1.0 User Guide License