On my Own

This conversation is closed.

Let us bridge the great Urdu-Hindi divide (caused by the two registers using mutually illegible scripts)!

Urdu-Hindi is the third or second most widely spoken language in the world - following Mandarin and possibly English.

Urdu and Hindi are the two standardized registers of this language. Unfortunately, they use mutually illegible scripts.
* Urdu uses a right-to-left script derived from the Persian alphabet, and has strong Pakistani and Muslim associations
* Hindi uses the left-to-right Devanagari script, and is strongly associated with India and Hindus

A third script, Roman, is widely used for Internet chatting, SMS, etc. It is the only script known to most Desi diaspora.

Here's an idea towards bridging this divide - through a three-way transliteration engine working under the hood of web portals, social networks, blogging platforms, etexts and readers, etc.

This idea has huge potential to bring about change by enabling greater people-to-people discourse and cross-fertilization of ideas. A presentation here (http://bit.ly/iBridge) lays out the idea in some more detail.

Your comments, views and suggestions are (cordially :)) invited.

  • Jan 5 2012: hey great Ahmer! loved the presentation.. perfectly doable.. in fact, google has already done 50% of the work with their Roman-Devanagari & Roman-Nastaliq Transliteration Engines.. and even greater is that their APIs are open.. their scheme is all phonetic based so we can very easily use google APIs/Data for developing a comprehensive 3-way (Devanagari-Roman-Nastaliq) Engine as envisioned.. however, it still needs very hardcore development and programming, been looking for a while but couldn't find anyone for volunteering.. estimated dev requirement is about $12k..

    the Hindi-Urdu divide is so unfortunate yet so tough to undo cos of the communal associations attached. even the very words Hindi & Urdu invoke the 'Other' image in commoners' imagination.. am working on a Hindi-Urdu Reform & Modernization Initiative aimed at bridging the divide by bringing the whole range of Hindi-Urdu together (as spoken across India, Pakistan, greater South Asia and the 30Million+ Desi Diaspora) using Roman script, and at developing & promoting a neutral writing style as exemplified by Bollywood and popular media. The real value and significance of Hindi-Urdu is in its being the Lingua franca of the South Asian Subcontinent which is home to 400+ language subgroups, using 16+ distinct writing systems. Hamari Boli in Roman is the only practical means to reach all Desis equally!

    in technical lingo, "Hamari Boli" is the new name of Hindi-Urdu (written in Devanagari/Nastaliq/Roman) and a full-scale 'Language Planning Initiative' aimed at Hindi-Urdu a) Script, b) Style, c) Status & d) Lexical Reform & Modernization..

    currently compiling an English-to-Hamari Boli Dictionary.. also directing 'Khan Academy Hamari Boli' (the Hindi-Urdu dubbing of the KhanAcademy.org video library). transliteration is one of several planned future undertakings.. will be very glad and grateful if you'd like to join and contribute..

    Cheerz/Azad

    www.HamariBoli.com
    www.youtube.com/khanacademyhindiurdu
    • Jan 5 2012: Thank you very much, Azad :) Very excited to know about your "Hamari Boli" connection. You're just the kind of person my project needs to attract early on.

      The divide is indeed deep - consider the fact that the language does not have a name both registers can agree on. Hamari Boli is indeed a highly interesting project - which is why I chose to include the link in the presentation.

      However, my idea is to embrace the conventional scripts rather than sidestep them. The hope is that in time, a neutral style will emerge of its own once people from both sides of the divide have adequate opportunities for interaction.

      Lexical variations do not bother me -- all lexicons of both Urdu and Hindi are integral to the language -- they enrich it. The Bollywood lexicon, for instance, is far from identical to the colloquial Pakistani Urdu, but we all understand it. My project looks to achieve the same level of accessibility for all lexicons through transliteration. It is conceivable that some Urdu-Hindi lexicons will be unintelligible to speakers of some other lexicons - but they should not remain mutually illegible!

      Your lament as to the unavailability of programming expertise is spot on -- and one of the reasons why I had to float this conversation. Let us see how this unfolds. Failing all else, I can always brush up programming skills and crank the code myself.

      Please do tell me more about your Hindi-Urdu Reform & Modernization Initiative. I would very much like to work with "Hamari Boli" and contribute what little I can.
      • Jan 5 2012: thanks Ahmer.. i coined the name Hamari Boli as the new name of reunited Hindi-Urdu to relieve it of the communal antagonism.. entirely neutral and carries a very pleasant inclusive air to it. hopefully Hindi & Urdu wallas won't have much trouble digesting :)

        HB is not about sidestepping Devanagari/Nastaliq, perhaps i was lacking somewhere, here's a succinct articulation:

        "Hamari Boli is re-unified Hindi-Urdu written using either Roman (preferably) or Devanagari or Nastaliq, with combined vocabulary drawn from Standard Hindi, Standard Urdu & Dakhini as well as all the naturalized words from regional Desi languages, with generous helpings of English, as exemplified by 'Hamara Cinema', i.e. the world's most prolific film industry, Bollywood."
        • Jan 6 2012: Azad, thank you for the succinct articulation of what HB is.

          Now, we have convergence in that we are both trying to help bridge the Hindi-Urdu gulf, but there are fundamental points of divergence.

          ONE, HB thinks in terms of 're-unified Hindi-Urdu'. My project looks to make each register universally legible.

          TWO, although it embraces Devanagari and Urdu, HB still prefers Roman. My project, not so -- the idea being to catch the people where they are, as they are -- and to deliver them what they need. (I personally have a difficult time processing Urdu in Roman - only manage the shortest Short Messages. More than a phrase of Roman Urdu in a chat session puts me off.) This should allow me to have my Urdu in Urdu -- and you to have the same page in Roman, and others to have it in Devanagari. To look at it another way, the idea of my project is to do away with the need of learning (and unlearning) scripts.

          THREE, HB has a vision as to how its vocabulary should be. (This is not to say I have a problem with the HB vision of vocabulary; in all likelihood, that is the direction a global Urdu-Hindi register would take.) My project puts all its faith in the goodness of people.

          Bollywood movies, for instance, do not use the lexicon they do to please any particular audience; they simply draw on Mumbai's pluralist language tradition. The hope is that once the script divide is bridged, websites and etexts will automatically be as Bollywood movies -- universally accessible to speakers of all registers. This media space will then become a great big Mumbai -- and 'Ooper Wala' will make more sense than God, Bhagwan, Khuda, etc.
  • Jan 8 2012: Hi Ahmer. got it now.. actually the confusion here is of terminology.. i guess a more accurate description of ur project will be "Addressing Devanagari-Nastaliq digraphia/divide -- through web/computer based tools"..

    u see, the Hindi-Urdu divide is not just about mutual illegibility.. it's equally about the lexicon and most importantly about the popular status & perception of Hindi-Urdu - the sociolinguistic dimension.. in common perception:

    a) Hindi and Urdu are two separate languages! we both know the distinction is absolutely rubbish.. whatever the perceived differences, per linguistic criteria, 'Hindi-Urdu' is 'The Language'.. but the thing is that it's not about the facts at all.. naked identityphilia perpetrated by the state and linguistic elite.. may seem harmless, but i know how far-reaching the repercussions are -- from my experience at English, Urdu Wikipedias and Khan Academy Translations Platform.. so frustrating keeping Hindi/Urdu wallas in line.. governments in India and Pakistan have both used divisive and elitist language policies for social control in the name of nation building.. worst victim is our education system, you must be well aware of how deep, biased and destructive the English Medium and Urdu Medium divide is.. here's a very good analysis by Prof. Tariq Rahman:
    http://www.viewpointonline.net/language-policy-multilingualism-and-language-vitality-in-pakistan-tariq-rahman.html

    b) Hindi & Urdu are erroneously imagined as synonyms/abbreviations of 'Standard Hindi' and 'Standard Urdu' . this is not the case. Standard Hindi/Standard Urdu are formalized styles -Standardized Register- of writing Hindi-Urdu (lexicon + script) which is only used in formal applications. nobody speaks that. colloquial vernacular all over India and Pakistan is Hindi-Urdu -the erstwhile 'Hindustani', which was the Official Language and Lingua Franca of British India- , called Hindi when written in Devanagari and Urdu in Nastaliq; now increasingly in Roman.
  • Jan 5 2012: Don't you think these translation engines have to be more intelligent in order to translate from one script to another?
    Even Google Translate is not very efficient. Information in one language is not conveyed completely to another language; some essence of it is lost during the translation process.
    • Jan 5 2012: :) Thank you Khayam - but this one is wide of the mark: we are not talking about translation - the name of the game is transliteration (of course there would be transformation rules for presentation in addition to transliteration).

      And I firmly believe we can manage three-way transliteration that would round-trip any which way.
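
The distinction drawn above between translation and transliteration, and the round-trip claim, can be illustrated with a toy sketch in Python. It assumes a shared phoneme pivot and a tiny hand-made table of five letters; the table, the function names (`to_phonemes`, `from_phonemes`, `transliterate`) and the script labels are hypothetical, not taken from the presentation or from any Google API. A real engine would also need the context-dependent rules the thread alludes to (inherent vowels in Devanagari, unwritten short vowels in Urdu, many-to-one letter mappings), which this sketch ignores.

```python
# Toy pivot table (an assumption for illustration, not a real scheme):
# each row is (phoneme, Devanagari, Perso-Arabic as used for Urdu, Roman).
TABLE = [
    ("k", "क", "ک", "k"),
    ("n", "न", "ن", "n"),
    ("m", "म", "م", "m"),
    ("sh", "श", "ش", "sh"),
    ("aa", "ा", "ا", "aa"),  # Devanagari entry is the dependent vowel sign
]

# Column index of each script in TABLE. "nastaliq" follows the thread's usage.
SCRIPTS = {"devanagari": 1, "nastaliq": 2, "roman": 3}


def to_phonemes(text, script):
    """Split text written in `script` into pivot phonemes (greedy, longest match)."""
    col = SCRIPTS[script]
    # Try longer graphemes first so that Roman "sh" and "aa" are read as units.
    graphemes = sorted(
        ((row[col], row[0]) for row in TABLE), key=lambda pair: -len(pair[0])
    )
    phonemes, i = [], 0
    while i < len(text):
        for grapheme, phoneme in graphemes:
            if text.startswith(grapheme, i):
                phonemes.append(phoneme)
                i += len(grapheme)
                break
        else:
            raise ValueError(f"no mapping for {text[i]!r} in {script}")
    return phonemes


def from_phonemes(phonemes, script):
    """Spell a list of pivot phonemes in `script`."""
    col = SCRIPTS[script]
    lookup = {row[0]: row[col] for row in TABLE}
    return "".join(lookup[p] for p in phonemes)


def transliterate(text, src, dst):
    """Change the script of `text`; the words themselves are never translated."""
    return from_phonemes(to_phonemes(text, src), dst)
```

For example, `transliterate("naam", "roman", "nastaliq")` yields the three letters noon, alif, meem (نام), and feeding that result back with the scripts swapped returns `naam`. The round trip only holds here because every phoneme in the toy table has exactly one spelling per script; preserving that property over the full language is where the real work would lie.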