----------------------------------------------------------------------------------
Unicode Urdu to ASCII Transliteration
----------------------------------------------------------------------------------

The transliteration utility maps Unicode Urdu text to an ASCII encoding. Two
variants are available: Transliterate.bat diacritizes the input text before
transliteration, and TransliterateNoAerab.bat does not. Both utilities take two
parameters, <input file> and <output file>.

Syntax:
----------------------------------------------------------------------------------
Transliterate <input file> <output file>
TransliterateNoAerab <input file> <output file>

File Formats:
----------------------------------------------------------------------------------
The input file should be a plain text file (.txt) in Unicode format, containing
the Urdu words to be transliterated. The output file contains the equivalent
transliterated text, one word per line. These utilities transliterate only the
Urdu text.

The following files are used by these utilities to generate the output.

1) NormalizeNFC.txt
   This file contains the normalization rules for composition (NFC). Each line
   of the file contains one rule. The format of a rule is
   replace:pattern (right to left), where replace may be empty.

2) Rules.txt
   This file lists the Unicode to Urdu Zabta Takhti (UZT) conversion rules.

3) Transliteration.txt
   This file contains the transliteration rules to be used by the Xerox
   Finite-State Tool (XFST).

These utilities also generate some interim files during conversion.

1) Normalized.txt
   The input text is normalized, tokenized, and stored in Normalized.txt. The
   input is tokenized on white space and some punctuation marks.

2) Diacritized.txt
   The normalized text is then diacritized by looking up the Urdu lexicon
   (wordformshashtable.lex) developed at the Center for Research in Urdu
   Language Processing (CRULP). The lexicon contains 80 thousand diacritized
   Urdu words. If multiple options are available, only the first option is
   selected.

3) uztconverted.txt
   This file contains the UZT equivalent of the diacritized text.

4) XFSTOut.txt
   This file contains the output generated by XFST.
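
Illustrative sketch:
----------------------------------------------------------------------------------
The first two interim steps described above (normalization and tokenization into
Normalized.txt, then lexicon lookup into Diacritized.txt) can be pictured with
the short Python sketch below. It is not the code used by the utilities. All
function names are hypothetical, the punctuation set is a guess at "some
punctuation marks", the lexicon is a small in-memory table rather than
wordformshashtable.lex (whose format is not described here), and leaving unknown
words unchanged is an assumption. The rule format is read as "pattern is
replaced by replace".

```python
import re

# Assumed separator set: white space plus a few Latin and Arabic-script
# punctuation marks (U+06D4 full stop, U+060C comma, U+061F question mark).
# The real utilities may tokenize on a different set.
SEPARATORS = re.compile(r"[\s.,!?\u06D4\u060C\u061F]+")


def parse_rule(line):
    """Parse one 'replace:pattern' rule line into (pattern, replace).

    The replace part may be empty, which deletes the pattern.
    """
    replace, sep, pattern = line.rstrip("\r\n").partition(":")
    if not sep or not pattern:
        raise ValueError("expected 'replace:pattern', got %r" % line)
    return pattern, replace


def apply_rules(text, rules):
    """Apply each (pattern, replace) rule in order as a plain substitution."""
    for pattern, replace in rules:
        text = text.replace(pattern, replace)
    return text


def tokenize(text):
    """Split on the assumed separators and drop empty tokens."""
    return [token for token in SEPARATORS.split(text) if token]


def diacritize(tokens, lexicon):
    """Replace each token with the FIRST diacritized option in the lexicon.

    Tokens missing from the lexicon are kept unchanged (an assumption).
    """
    result = []
    for token in tokens:
        options = lexicon.get(token)
        result.append(options[0] if options else token)
    return result


def run_interim_steps(text, rule_lines, lexicon):
    """Normalize, tokenize, then diacritize; returns one word per list item."""
    rules = [parse_rule(line) for line in rule_lines if line.strip()]
    return diacritize(tokenize(apply_rules(text, rules)), lexicon)
```

For example, the rule line "\u0622:\u0627\u0653" composes ALEF followed by
MADDAH ABOVE into ALEF WITH MADDA ABOVE, which matches the standard NFC
composition for that pair. Joining the returned list with newlines gives the
one-word-per-line layout used by the output file.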