Bilingual Lexicography: Some Issues with Modern English Urdu Lexicography – a User's Perspective

The tradition of bilingual lexicography in the Indian subcontinent is more than two centuries old and goes back as far as 1772 (Hadley). This article examines the development of bilingual lexicography in the Indian subcontinent with special reference to English-Hindustani or English-Urdu dictionaries. It first describes the historical background of linguistic work in the subcontinent, then discusses issues specific to English-Urdu bilingual lexicography and suggests some solutions.


Introduction
The author of this article is neither a professional lexicographer nor can he claim to have a background in linguistics. He was born in the Punjab, Pakistan, but now lives in England. He works as a full-time paediatrician in the National Health Service in the UK. Historically the relationship between the medical profession and lexicography is not new, to cite Dr Peter Mark Roget's Thesaurus as one example; and the reader will see more in the following paragraphs.
During his training years in the field of medical genetics, the author of this article was asked to produce some material in Urdu for the purpose of genetic counselling for the ethnic minorities in Leicester (UK). By virtue of his command of Punjabi, Saraiki, Urdu and English and some working knowledge of Persian and Arabic he agreed to undertake this project. A study of previously translated medical materials revealed that the language used was inappropriate and hard to understand for an average (educated) Urdu or Punjabi speaker, let alone a semi-literate person. Most of the terms were rendered into complex Urdu terminology. A side-by-side analysis of the English and Urdu texts exposed critical mistakes, due not only to the translators' lack of understanding of the medical terminology but also to translational inaccuracies. This led the author to survey existing English Urdu dictionaries to find out what had fundamentally gone wrong with these texts. This resulted in his interest in the field of lexicography and later on in EFL lexicography. He has compared several English Urdu bilingual dictionaries and monolingual (including some learner's) dictionaries. The following treatise is the product of three years of close examination of these works. Here the author would like to quote the famous Urdu scholar and linguist Faruqi, "God knows that I am no lexicographer myself. But that is an advantage perhaps. For (as) an outsider, I can see the forest better, and not be trapped by the undergrowth or by the thick growing trees and dead wood which often trip the professional lexicographer".

Historical background

Eighteenth-nineteenth century India
During the Great Mogul rule of India, Persian (Farsi) was the language of the court and the elite. Indigenous languages had the same status as English once had during Norman rule of England; that is, French was the language of the court and the elite, whereas English was the language of the peasantry they ruled. As the Mogul Empire in India waned and the East India Company gained more power, Persian (Farsi) fell from the superior status it had once enjoyed and Hindustani, the ordinary language of the masses or lingua franca, assumed more importance. The author of this article has no intention of becoming bogged down in the polemical discussion of whether the British might have intentionally brought Persian down and elevated the status of local languages, as it would expand the discussion beyond the remit of this paper.
Whatever the reasons may be, Persian was on the wane in the eighteenth century when Europeans started developing their interest in the local languages of India. Before any further discussion, it is imperative to explain the word Hindustani. It is safe to say that initially what Europeans and especially the British referred to as Hindustani was a common vernacular that developed from local Prakrits (dialects), especially Brij bhasha and Kari boli. This common vernacular was variously termed Moors, Indostan, Jargon or Hindustani by foreigners, and Hinduwee, Hindavi, Zaban-e-Hind, Hindi, Zaban-e-Dehli, Rekhta, Gujari, Dakkhani, Zaban-e-Urdu-e-Mualla, Zaban-e-Urdu, or just Urdu by local people. The dichotomy of Hindi and Urdu as two separate languages with different scripts and lexical borrowings, from Persian and Arabic for Urdu and from Sanskrit for Hindi, had more to do with socio-political changes in India in the nineteenth and early twentieth centuries (King 1994, Amrit 1984). An interested reader should consult Faruqi (2001).

Ketelaar's Grammar of Hindustani
The roots of bilingual lexicography in the subcontinent lie in the development of early Hindustani grammars. Joan Josua Ketelaar wrote his Hindustani grammar book (in Dutch) in the late 17th century when he was an envoy of the Dutch East India Company in India. Only one manuscript copy of Ketelaar's work, the first grammar of the Hindustani language ever written, survives. It is preserved in the state archives at The Hague.

Military Grammars
Capt. George Hadley was the first Briton to write a grammar for the officers of the East India Company in 1772. His grammar had a glossary, which contained English and Moor (Hindustani) words. This was the prototype of a bilingual dictionary. He was soon followed by Capt. J. Fergusson who produced his first Dictionary of the Hindoostan language in two parts in 1773. This work, in his own words, "contained a great variety of phrases, to point out the idiom, to facilitate the acquisition of the language" (Hadley 1772, 1801).
The vocabulary in these early military grammars relates to the kind of language an officer commanding an army for the East India Company might need to know. Therefore it contained words that the officers learned from their sepoys (men under their command) (Friedlander 2006). These men came from various parts of the country to serve in the East India army. They spoke different dialects; hence the speech of a military bazaar/camp was highly heterogeneous and rustic due to the fact that these men were largely illiterate. The language of these men would not represent what was called Hindustani. This was a corrupt jargon and an amalgamation of various dialects and local accents. This is clearly revealed in the dialogues contained in these books (Hadley 1772).

J. Borthwick Gilchrist
Gilchrist was born in Scotland and qualified as a doctor in Edinburgh. He then joined the East India Company and was appointed as an assistant surgeon in Calcutta. He became interested in the Hindustani language and learned it from ordinary people in the north of India. His informants were from all walks of life. He travelled far and wide and employed indigenous people who spoke a better language than the rustic tongue of the East India Company's sepoys (from Persian Sipâhi meaning soldier) and orderlies. Therefore his collection of parlance was superior to Hadley's or Fergusson's (Gilchrist 1826). His dictionary was published in Calcutta and London in 1796.
He recorded words in both Perso-Arabic and Devanagari scripts. His roman transliterations indicate that he was more familiar with Hindustani phonemes than his predecessors. Gilchrist was later appointed principal at the famous Fort William College, Calcutta, where he was responsible for the production of literature in local languages, especially in Hindi and Urdu. Some researchers believe that he contributed to the Hindi-Urdu divide by asking pundits (Hindus) and maulvis (Muslim clerics) to produce parallel literature based on religious, cultural and geo-political affiliations.

S. W. Fallon
J. B. Gilchrist was followed by many other eminent orientalists like John Shakespear (1817) and Duncan Forbes (1845). They both produced grammars of Hindustani and Hindustani English dictionaries in the first half of the nineteenth century, but the most elaborate work in this field was undertaken by Dr. Fallon (1879). For the first time he included not only the colloquial terms and ordinary words of day-to-day speech but also the refined language of women of high social status. His treatment of the Hindustani language was thorough and comprehensive. Unfortunately he died in 1880 before completing his work, which was later completed by Rev. J. D. Bate in 1883. He was soon followed by Platts (1884), whose Dictionary of Urdu, Classical Hindi and English is still regarded as a reference work by scholars of today.

Twentieth and Twenty-first century lexicography

Dr. Abdul Haq
Dr. Abdul Haq (Baba-e-Urdu, literally "father of Urdu") of Anjuman Taraqqi Urdu Adab India (Association of Urdu literature development in India) produced the Standard English Urdu Dictionary in 1937. Dr. Haq was truly a Samuel Johnson of Urdu. His work was an extension of that of his predecessors, especially Dr. Fallon's, but he tried to fix the language and expunge the impurities that had once crept into it. He provided many Urdu neologisms for new words that had entered the English language because of the rapid increase in scientific knowledge in certain disciplines. His dictionary was based on the popular Concise Oxford Dictionary.
Dr. Haq's dictionary served the Urdu community very well. A concise version was produced for students in the early years of learning English. It remained unchallenged for the first half of the twentieth century and subsequent English Urdu dictionaries drew heavily on it.

Prof. B. A. Qureshi
Prof. B. A. Qureshi produced Kitabistan's English-English-Urdu dictionary (henceforth called Kitabistan) in 1957; it was published by Kitabistan, an educational publisher in Lahore. This was a medium-sized dictionary, produced specifically for students. The author of this dictionary claimed that it was based on the principles employed in the Concise Oxford Dictionary (COD) by the famous Fowler brothers, and those laid down in "the Interim report on Vocabulary Selection" by Thorndike et al. It contained 35,000 words and approximately 25,000 idiomatic expressions translated into Urdu equivalents. He attempted to explain word significations in simple English but was not successful on many occasions, as we shall see.

Ferozsons
Ferozsons, a reputable publishing house based in Lahore, Pakistan, produced its own English into English and Urdu dictionary in the 1960s. A careful analysis revealed that this dictionary was based on the vocabulary used in the Chambers' Dictionary. Its coverage of English vocabulary was better than that of Qureshi's dictionary but it significantly lacked phrasal verbs and idiomatic expressions.
In the last two decades of the twentieth century, several indigenous English-Urdu dictionaries were produced, mainly in Pakistan. These were largely abridgements of the older works directed at school and college students.

Qaumi English Urdu dictionary
The National Language Authority (Muqtadirah Qaumi Zuban), Islamabad, an Urdu academy in Pakistan similar to the Académie Française, produced its own dictionary under the editorial guidance of an Urdu scholar, Dr. Jameel Jalibi. This tome of more than 2,300 pages (henceforth called Qaumi) was first published in 1992. It was entirely based on the 1986 edition of Webster's encyclopaedic dictionary. It is truly an Americanized dictionary. It has not been revised since. It is ironic that a dictionary named Qaumi, which means national, does not encompass the variety of English peculiar to Pakistan. Pakistani English and Hinglish have now been recognised as South Asian varieties of English and these days are covered in modern monolingual English dictionaries. I would like to quote B. L. K. Henderson about what a national dictionary looks like, "a dictionary is a compendium of a nation's thought, social life, domestic and foreign activities. It is almost possible to lay down a dictum and say: Show me the nation's dictionary and I will build up from it a true picture of the nation itself". Does Qaumi (National) reflect this? A detailed review of the dictionary with its skewed macrostructure and flawed microstructure was presented by Prof. Shahid Hameed in "The Annual of Urdu Studies" (1994).

Jāmi-English Urdu Dictionary
This multi-volume dictionary was produced under the editorial guidance of Prof. Kalimuddin. The six-volume dictionary (henceforth called Jāmi) was completed in the late 1970s, but owing to lack of funds and bureaucratic red tape it was not published until 1994 to 1998. It was already outdated by the time it was published. It was a project of the National Council for Promotion of Urdu Language (NCPUL), Delhi, an Indian government institution under the Ministry of Human Resource Development. This monumental work claimed to give translations for approximately 250,000 words and phrases and to cover over eighty-four disciplines of knowledge. The dictionary's preface does not mention any single English work upon which it was based but careful study reveals that this work also draws heavily on an older edition of Webster's.

The Oxford English Urdu dictionary
This is the latest among the English Urdu dictionaries and was published in 2003 by the OUP Karachi, Pakistan. Its compiler Mr. Shanul Haq Haqqee (1917-2005) was a well-known Urdu poet, scholar and translator. This work (henceforth called Oxford Urdu) was partly based on the eighth and ninth editions of the famous Concise Oxford Dictionary (1990, 1995). It took him nearly 13 years to complete it because it was almost entirely a one-man show. This dictionary is over 2000 pages and covers nearly 125,000 words and idiomatic expressions.

Specific issues with English Urdu lexicography
Despite considerable achievements in the linguistic field in the Indian subcontinent some issues still need to be addressed. Now we will deal with each issue separately.

Different audience and readership
Early English Urdu dictionaries were primarily written by orientalists for English language speakers. Subsequent dictionaries were written for and by the speakers of the Hindustani/Urdu language. The content and structure of a bilingual dictionary change radically, depending on the audience it serves. Thus an English to Urdu dictionary for Urdu speakers is meant for comprehension of texts in the target language (English), whereas an English to Urdu dictionary for English speakers would be used primarily for production in the target language (Urdu). Therefore, the structure of entries in the two types of English to Urdu dictionaries will be different because of the emphasis on different aspects of dictionary making. Take as an example the words suhag and suhagan. These are culture-specific words of Indian origin and need an explanation, as there is no lexical equivalent in English. Therefore the purpose of a dictionary and the audience it serves dictate its structure. A "foreign to native use" dictionary cannot replace a "native to foreign use" dictionary. Unfortunately our English-Urdu lexicographers did not appreciate this crucial difference. The author suggests that future lexicographers pay close attention to this vital aspect of dictionary making.

Lexicography or translation
Bilingual lexicography is not merely the translation of one language's words into another language. A careful study of currently available English Urdu bilingual dictionaries reveals that many of them smack of pure translation. In the author's opinion there are subtle differences between bilingual dictionary-making and pure translation. Although closely related, the two are not quite the same; literal translations may not encompass the different connotations that a word may carry, especially when the same word is used in different contexts. English Urdu lexicographers have sometimes fallen into this trap. Grammatical information about the use of words is particularly sparse in these dictionaries. Here we will quote a specific example to illustrate this caveat and the resultant syntactic pitfall. Take for example the English word buy as defined in the latest edition of the Oxford Dictionary of English (henceforth called ODE).
buy: "obtain in exchange for payment". A typical translation from Qaumi reads as follows: khareedna/ laana/ mool leyna. Now we will discuss some semantic and grammatical issues related to this entry. Let us take examples from ODE: "She bought six first-class stamps" (sub + v + obj). Using the translation from Qaumi, its Urdu translation will be: us ne chhey darja-awwal ke ticket khareedey/khareed kiye. This is a perfect translation. But it does not solve the problem for an Urdu-speaking learner of English when the verb buy is used in a different grammatical structure. Consider another example: "She bought me a present" (sub + v + obj + obj). Following the above-mentioned pattern of translation, an ungrammatical (word-for-word) and syntactically absurd translation will be: *us ne aik tuhfa mujhe khareeda, or even worse, **woh mujhe aik tuhfa mool laya. In both instances the English word "bought" (translated into khareeda or mool laya) refers to "me" and implies "buying someone as a slave". This confusion arises because in Urdu "buy" cannot be used as a verb with both direct and indirect objects, whereas this is a standard construction in English.
Here a grammatically correct translation would be: us ne meray leay aik tuhfa khareed kia/ khareeda. This example illustrates that word-for-word translation does not cover all possible grammatical structures, and in addition that a source language will behave very differently from a target language when the same word is used in different structures. Unless a bilingual dictionary furnishes this vital information on semantics, grammar and pragmatics, a learner of English is bound to make mistakes when he tries to express himself in the target language. Therefore supplemental information is vital to make full use of a bilingual dictionary. Had Qaumi described this in the following manner, all ambiguity would have gone.
As you can see it gives more information on how to use the verb in different grammatical structures and the resultant changes in source language.
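One way to picture an entry of this kind is as a small record that lists each grammatical pattern together with its own Urdu rendering. The Python sketch below is illustrative only: the class and field names (Pattern, Entry, render) are assumptions made for this example and do not describe the layout of Qaumi or of any other dictionary; the definition, equivalents and example sentences are those quoted above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Pattern:
    structure: str  # grammatical pattern, e.g. "sub + v + obj"
    english: str    # English example sentence
    urdu: str       # romanised Urdu rendering of that sentence


@dataclass
class Entry:
    headword: str
    definition: str
    equivalents: List[str]
    patterns: List[Pattern] = field(default_factory=list)

    def render(self, structure: str) -> Optional[str]:
        """Return the Urdu example for a grammatical structure, or None."""
        for pattern in self.patterns:
            if pattern.structure == structure:
                return pattern.urdu
        return None


# Hypothetical entry assembled from the examples discussed in the text.
buy = Entry(
    headword="buy",
    definition="obtain in exchange for payment",
    equivalents=["khareedna", "laana", "mool leyna"],
    patterns=[
        Pattern("sub + v + obj",
                "She bought six first-class stamps",
                "us ne chhey darja-awwal ke ticket khareedey"),
        Pattern("sub + v + obj + obj",
                "She bought me a present",
                "us ne meray leay aik tuhfa khareeda"),
    ],
)

print(buy.render("sub + v + obj + obj"))
```

Keeping the rendering attached to the pattern rather than to the bare headword is the point of the design: a learner who looks up the double-object construction is shown the meray leay phrasing directly instead of having to derive it from a list of equivalents.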
There are problems at the semantic level too. Consider the entry for hotel in Oxford Urdu. It renders it as hotel, a loan word commonly used as such, but in addition it also gives funduq as an alternative. This is an Arabic word with no currency in Urdu at all. The situation is further complicated by the fact that what hotel means to an Urdu speaker is properly called a restaurant in English. Thus it is not a faithful translation or lexical equivalent. This is what lexicographers have usually referred to as false friends/cognates in languages. Therefore a translation like musafir xana or saraey would have been a better alternative to hotel when rendering this word in a bilingual dictionary. In addition, an explanatory note on this distinction of meaning in the source and target languages would have been of immense help to learners of English. In this regard an EFL dictionary is superior to a bilingual dictionary as it gives a clear definition. For example the Cambridge Advanced Learner's Dictionary describes "hotel" as: a building where you pay to have a room to sleep in, and where you can eat meals.
Another fundamental problem with some of these larger works is that their compilers have failed to appreciate the crucial differences between monolingual and bilingual lexicography, i.e. defining the lexical items in the former is more important than giving lexical equivalents, a characteristic of the latter type. Consider a common English word such as water. A monolingual lexicographer has to explain what water is, because there is no satisfactory lexical equivalent. There is no such problem for a bilingual lexicographer because he has lexical equivalents in the target language. Qaumi has defined "water" in no fewer than 114 words, describing its chemical structure and physical properties, in addition to its lexical equivalents in Urdu, Persian, Arabic and Hindi. This is clearly overkill. The author of this paper admits that not all lexical items in the source language have equivalents in the target language and that explanations will be required in such circumstances, but one has to be familiar with the different structure of entries in the two types of dictionaries.
Another problem with current English Urdu dictionaries is that they help, to some extent, in the comprehension of the target (English) language (decoding), but their ability to equip the user to express himself/herself effectively in the target language (production or encoding) has serious deficiencies. Here a monolingual dictionary for foreign learners (EFL dictionary), a bilingualised dictionary or even a Bilingual Dictionary Plus as proposed by Laufer/Levitzky-Aviad (2006) would be more useful for production than a conventional translational-type bilingual dictionary.

Defining Vocabulary
As explained earlier, vocabulary selection in bilingual lexicography depends on the audience and the native language of its users. Earlier works were based on monolingual English dictionaries. As the field of linguistics expanded, research into foreign language acquisition made it clear that foreign language learners needed a different set of vocabulary to understand the target language. Hence monolingual English dictionaries were replaced by EFL and ESL dictionaries. The core elements of these dictionaries were a "controlled defining vocabulary", extensive information on grammar and usage illustrations. Traditional English Urdu lexicographers have never paid attention to these very important issues. There is no English Urdu dictionary available so far exclusively based on an EFL/ESL monolingual dictionary.
The underlined English equivalents given here are as difficult as the original entry words. They do not form part of the controlled defining vocabulary used in any of the Longman, Oxford, Cambridge, Macmillan or Cobuild advanced learner's dictionaries. Thus a learner becomes frustrated with L2-L2 content in these dictionaries and only consults L1 translations, and is prone to make mistakes when he uses a wrong sense of the word in a given context.
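The check implied here, whether every word of an English gloss belongs to a controlled defining vocabulary, can be sketched in a few lines of Python. This is a rough illustration under stated assumptions: the tiny DEFINING_VOCABULARY set is a hypothetical stand-in and not the actual list used by any learner's dictionary, the second gloss is an invented example of a harder wording, and no allowance is made for inflected forms.

```python
import re
from typing import List, Set

# Hypothetical stand-in for a controlled defining vocabulary.
# A real list would contain a few thousand words.
DEFINING_VOCABULARY: Set[str] = {
    "a", "building", "where", "you", "pay", "to", "have", "room",
    "sleep", "in", "and", "can", "eat", "meals",
}


def outside_defining_vocabulary(gloss: str, vocabulary: Set[str]) -> List[str]:
    """Return the words of a gloss that are not in the vocabulary.

    Words are lower-cased, reported once each, in order of first appearance.
    """
    found: List[str] = []
    for word in re.findall(r"[a-z]+", gloss.lower()):
        if word not in vocabulary and word not in found:
            found.append(word)
    return found


# Learner's-dictionary definition of "hotel" quoted earlier in the article.
simple_gloss = ("a building where you pay to have a room to sleep in, "
                "and where you can eat meals")
# Invented example of a gloss that relies on harder words.
hard_gloss = "an establishment providing accommodation"

print(outside_defining_vocabulary(simple_gloss, DEFINING_VOCABULARY))
print(outside_defining_vocabulary(hard_gloss, DEFINING_VOCABULARY))
```

An empty result means the gloss stays within the chosen vocabulary; any words returned are candidates for rewording before the entry reaches a learner.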

Language research and corpus linguistics
The author of this article notes with deep concern that no formal research in Urdu linguistics, especially corpus linguistics (with the exception of a small but very useful work by Becker/Riaz), has been undertaken so far. The author is not aware of any linguistic work that informs us with certainty how Urdu is behaving in different parts of the world today. Is Urdu in Pakistan, where it is the national and official language, the same as the one spoken and written in India, where it may have been affected by its close kin Hindi? To what extent is the language of the Urdu-speaking diaspora in the West, especially in the UK, the USA and Canada, being affected by English? A modern Urdu lexicographer/linguist cannot shy away from these burning issues of the Urdu language. In the author's opinion the National Language Authority in Islamabad (Pakistan) and NCPUL in Delhi (India) should commission research projects in these neglected areas such as Urdu corpus development and computational linguistics.

Neologism
Earlier dictionary makers devised new words for those English terms where no exact lexical equivalents were available in the native language. With the explosion of knowledge in the scientific fields there was a surge in new terms. As new terminology in English and other European languages was based on Classical Latin and Greek roots, Urdu lexicographers followed the analogy and produced terms based on Classical Persian and Arabic roots. The problem is compounded by the fact that nowadays thousands of new words are being added to the English lexicon each year and it is impossible to keep pace with new terms to provide Urdu equivalents from Perso-Arabic roots.
A closer analysis of these new coinages in Urdu has shown that most of these terms are very complex, and they sound more foreign to an average educated Urdu speaker than their counterparts in English. In fact an English term may be more widely known to Urdu speakers than these arcane terms. Take as an example the English word atom (originally meaning indivisible, from Greek via Latin, where a means "not" and tomos "to cut"); in Jāmi this has been rendered into Urdu as zarrah-e-la yatajazzi (literally meaning a particle that cannot be subdivided). This is a conceptual definition and has already become obsolete because the concept of indivisibility no longer holds true in atomic physics. Anyone who searches current Urdu texts, as well as the spoken language, will be astonished to find that this word is always written and spoken as atom or jauhar and not zarrah-e-la yatajazzi. Here is another common word: chairman; most Urdu newspapers and spoken discourse testify that it is an established loanword in Urdu. However purists in dictionaries such as Qaumi translate it as sadar-nasheen, which is a Persian compound word. Other examples include common items such as radio, television, microwave, and a host of internet terms. These are well-established loanwords written in Urdu script and pronounced as in English. Therefore a bilingual dictionary maker should refrain from inventing lemmas based on Perso-Arabic roots or producing conceptual translations when an English word is well known to native speakers of Urdu and has gained sufficient currency in daily Urdu usage.
A dictionary search on such common medical terms as chromosomes, genes and virus would result in one coming across such bizarre terms as launia, janeen and bis respectively. The author of this paper performed a detailed search of written Urdu texts on the internet to check the currency of these unknown neologisms and found no evidence of their use in current general or medical Urdu literature. This suggests strongly that standard Urdu has absorbed these medical words as such.
The author feels strongly that it is about time that purists stopped insisting on outdated coinages and embraced these new English terms with open arms. It is claimed that Urdu is a camp language and has the flexibility to mould itself and borrow words from other languages and yet retain its flavour. If it were not for the constraint of space, the author could cite numerous examples of these complex neologisms only found in Urdu lexicons of scientific terminology (for example in Saency wa takniki istilahaat: Scientific and Technical Terms, published by the National Language Authority, Islamabad) that have no currency in modern Urdu usage.
The author of this paper would urge the English-Urdu lexicographer to cast off the shroud of prescriptivism and become a truly descriptive lexicographer. Secondly, a lexicographer should refrain from jargonising or inventing an artificial language based on classical roots, and let people decide how they want to use the language.

Pronunciation
This is another vexed problem in current English Urdu lexicography. Earlier lexicographers like Fallon did not give English pronunciation a place in their works as these dictionaries were directed at English speakers. Later works by indigenous scholars paid little attention to pronunciation and only showed where it markedly deviated from spelling.
Because Urdu orthography spells words as they are pronounced, ordinary Urdu-speaking learners of English assume the same is true for English words. This results in odd pronunciations by Urdu speakers. An example would be the word "comfortable", which is pronounced by an Urdu speaker with four distinct syllables as cum-for-tay-ble (accent shown by bold face). In addition there is a shift in stress in line with the Urdu sound system. Larger works like Prof. Kalimuddin's Jāmi and Jalibi's Qaumi dictionary give no pronunciation for English words at all. However, the latest Oxford English Urdu does give pronunciations in IPA notation. The problem here is that the guide to pronunciation is in English. Therefore, an average learner of English in India and Pakistan is totally at a loss to understand these notations, as they have been explained with English words. The pronunciation of vowels, diphthongs and some consonants cannot be construed by studying the example words, unless a teacher trained in IPA or a native English speaker pronounces these words to show the correct articulation. It would have been extremely beneficial if English pronunciations had been rendered in Urdu phonetic transcription, just as Verma has done in Devanagari script in the Oxford Progressive English Hindi dictionary (1977). Even then it would not be a substitute for a teacher pronouncing these sounds in front of his/her class in real time.
The author of this paper suggests that modern English Urdu lexicographers should use multimedia technologies, and that Standard English pronunciations (British or American) should be supplied on an accompanying CD-ROM, as many European bilingual dictionaries have recently done. Alternatively there should be a dedicated website where both English and Urdu pronunciations are made available for study and practice, just as many of the modern EFL/ESL dictionaries have accompanying websites.

Future
Finally the author of this article would like to make certain suggestions to those already involved or interested in English Urdu lexicography. The future lies in the new genre of bilingualised dictionaries based on monolingual EFL dictionaries. The incorporation of new multimedia technologies to create true bidirectional dictionaries which can help the user-learner in decoding and encoding should be the ultimate goal. More user-oriented research with sound methodology is needed to fully comprehend the user's needs as well as his problems.
Hence large-scale surveys and classroom-based research should be done before commissioning any new project.