The question of Urdu’s linguistic identity has long been a matter of spirited debate, polarised by politics, ideology, and ignorance of philology. While romanticised narratives have exaggerated its Persian and Arabic affiliations, a closer, evidence-based linguistic analysis reveals an overwhelming substratum of Indic influence. Particularly compelling are the testimonies of eminent lexicographers and linguists, such as Rauf Parekh and Syed Ahmed Dehlavi, who have documented the structural foundations of Urdu in Sanskrit and Prakrit. This article re-evaluates the linguistic composition of Urdu through a scholarly lens, debunking popular myths and shedding light on its etymological ancestry.
Urdu's Morphological Core: Sanskrit and Prakrit Foundations
Linguistic classification hinges on the structural, morphological, and syntactic constituents of a language. By this criterion, Urdu aligns unambiguously with the Indo-Aryan branch of the Indo-Iranian family, itself a subset of the larger Indo-European family of languages. Although Urdu borrows lexically from Persian, Arabic, and Turkic languages, such borrowings are often cosmetic—affecting only the surface vocabulary—while the grammatical framework remains Indic in essence.
According to linguist Dr Rauf Parekh, approximately ninety-nine per cent of Urdu verbs are rooted in Sanskrit and Prakrit. Verbs, being integral to a language’s syntactic architecture, carry more weight in linguistic classification than nouns or adjectives, which are more prone to foreign borrowings due to trade, conquest, and cultural exchange. That Urdu retains such an overwhelmingly Indic verbal structure decisively undermines claims of it being a fundamentally Perso-Arabic language.
Syed Ahmed Dehlavi’s Farhang-e-Asafiya: An Inadvertent Testimony
Perhaps the most rigorous empirical source supporting the Indic origins of Urdu is Farhang-e-Asafiya, the magisterial lexicon compiled by Syed Ahmed Dehlavi in the late nineteenth century. Dehlavi, a member of Delhi’s Persianised Muslim elite, had no discernible ideological bias in favour of native Indian languages. On the contrary, his predispositions lay firmly with Persian, often to the detriment of native vocabulary.
Yet, even within this context, the statistics he compiled speak volumes: out of the 55,000 Urdu words catalogued in Farhang-e-Asafiya, approximately 75 per cent are of Sanskrit and Prakrit origin. Moreover, he notes that the entirety of the base stock of Urdu, “without exception”, derives from these sources. That such an admission comes from a scholar who actively marginalised indigenous words in favour of more obscure Persian alternatives only serves to reinforce the extent of the Indic substratum.
Had the lexicon been compiled by a scholar more attuned to vernacular usage and less beholden to Persian literary prestige, the percentage of Prakritic-origin words would likely have been even higher. Indeed, this aligns with Parekh’s assertion that the language of the average Urdu speaker is far more Indic in vocabulary and syntax than elite-written forms might suggest.
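Taking the figures above at face value, the split can be worked out as a quick back-of-the-envelope calculation. The sketch below is illustrative only: the variable names are hypothetical, and the 55,000 and 75 per cent values are the rounded estimates attributed to Dehlavi, not exact counts.

```python
# Rounded estimates attributed to Dehlavi's Farhang-e-Asafiya (not exact counts).
TOTAL_WORDS = 55_000
INDIC_PERCENT = 75  # share said to derive from Sanskrit and Prakrit

# Integer arithmetic avoids floating-point rounding.
indic_words = TOTAL_WORDS * INDIC_PERCENT // 100
other_words = TOTAL_WORDS - indic_words

print(f"Sanskrit/Prakrit-derived: about {indic_words:,}")
print(f"Other sources: about {other_words:,}")
```

On these estimates, roughly 41,250 entries would be of Sanskrit or Prakrit origin and about 13,750 would come from other sources.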
Urdu Without Indic Roots: A Linguistic Experiment and Its Failure
The artificial severance of Urdu from its Indic roots was once attempted in practice, most notably in the composition of Pakistan’s national anthem by Hafeez Jalandhari in 1952. In an effort to purge Sanskrit and Prakrit influences, Jalandhari employed a highly Persianised register of vocabulary. The result, however, was a text that drifted so far from intelligible Urdu that it verged on becoming a different language altogether—comprehensible only to those well-versed in classical Persian.
Notably, despite these efforts, certain Prakrit-derived grammatical elements still crept into the composition. Words such as "tū" (you) and "kā" (of), both of Prakritic origin, appear in the anthem, attesting to the indispensability of the Indic grammatical framework in expressing even the most formal ideas in Urdu.
The Myth of a Distinct Linguistic Identity
Some scholars and commentators have posited Urdu as an independent linguistic entity, distinct from Hindi. However, this assertion does not withstand rigorous scrutiny. A basic litmus test for linguistic distinctiveness, namely the ability to construct a grammatically correct sentence without drawing on the core vocabulary of the language it is supposedly distinct from, fails in Urdu's case.
To further explore the nature of Urdu, one must consider whether it is possible to create a sentence in Urdu without relying on words derived from Sanskrit and Prakrit. Linguists who assert that Urdu is a distinct language must rise to this challenge by crafting a sentence in Urdu without using any of the following words: मैं (maiṁ, "I"), आप (āp, formal "you"), तू (tū, intimate "you"), का (kā, "of", masculine), की (kī, "of", feminine), यह (yah, "this"), वह (vah, "that"), हम (ham, "we"), तुम (tum, familiar "you"), कहाँ (kahāṁ, "where"), यहाँ (yahāṁ, "here"), and the like, all of which are derived from Sanskrit and Prakrit and are integral to the Hindi lexicon.
The inability to compose a coherent sentence in Urdu without using these words calls into question the very notion of Urdu as a distinct language. By contrast, Hindi, with its closer relationship to Sanskrit, retains a more distinct identity, with far fewer loanwords from Persian and Arabic. This contrast further emphasises the nature of Urdu as a language built on the foundation of Hindi, with Persian and Arabic influences added over time due to historical and sociopolitical factors.
One cannot construct even a simple sentence in Urdu without relying on grammatical particles, pronouns, and syntactic constructions that are entirely inherited from Hindi (and by extension, Sanskrit and Prakrit).
Conversely, one can easily compose elaborate sentences in Hindi without invoking a single word of Persian or Arabic origin. This linguistic asymmetry reveals that while Hindi enjoys ontological and structural independence, Urdu is structurally parasitic—relying fundamentally on the grammatical architecture of Hindi, over which a superficial Persianate veneer has been applied.
Conclusion
The overwhelming preponderance of Sanskrit and Prakrit elements in Urdu—particularly in its verbs and grammatical particles—renders any claim of its being an autonomous language with a non-Indic pedigree untenable. Urdu, in its structural essence, is a daughter of Indo-Aryan linguistic traditions. Its identity as a distinct language has been politically constructed, not linguistically derived.
While the literary embellishments and poetic grandeur conferred upon Urdu through Persian and Arabic borrowings are not to be dismissed, they constitute adornments rather than foundations. A more honest engagement with Urdu's history demands acknowledgement of its Indic roots—not as an affront to its beauty, but as a testament to its deep cultural syncretism. Indeed, it is only through such scholarly integrity that one can hope to rescue language from the clutches of politicised narratives and return it to the realm of philological truth.
Notes
Ahmad, Aijaz (2002). Lineages of the Present: Ideology and Politics in Contemporary South Asia. Verso. p. 113. ISBN 9781859843581. "On this there are far more reliable statistics than those on population. Farhang-e-Asafiya is by general agreement the most reliable Urdu dictionary. It was compiled in the late nineteenth century by an Indian scholar little exposed to British or Orientalist scholarship. The lexicographer in question, Syed Ahmed Dehlavi, had no desire to sunder Urdu's relationship with Farsi, as is evident even from the title of his dictionary. He estimates that roughly 75 per cent of the total stock of 55,000 Urdu words that he compiled in his dictionary are derived from Sanskrit and Prakrit, and that the entire stock of the base words of the language, without exception, are derived from these sources. What distinguishes Urdu from a great many other Indian languages ... is that it draws almost a quarter of its vocabulary from language communities to the west of India, such as Farsi, Turkish, and Tajik. Most of the little it takes from Arabic has not come directly but through Farsi."
References
Parekh, R. (2011, December 17). Urdu’s origin: It’s not a ‘camp language’. Dawn. Retrieved from https://www.dawn.com/news/681263/urdus-origin-its-not-a-camp-language
Ahmad, A. (2002). Lineages of the Present: Ideology and Politics in Contemporary South Asia. Verso.
Farhang-e-Asafiya. (1901). Rifāh-i Ām Press. Retrieved from https://archive.org/details/FarhangAsifiya
Urdu. (2025, May 22). In Wikipedia. Retrieved from https://en.wikipedia.org/wiki/Urdu