Sunday, August 03, 2008

Tamil 'sha': Unicode to the rescue

In the previous post, I had argued for including the Grantha letter 'sha' in Tamil. I was very happy to find out that Unicode 4.1.0 has already done so. Its code-point is 0x0BB6.
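If you want to check this for yourself, here is a minimal Python sketch that looks the code-point up in the Unicode database bundled with the interpreter. The variable name `SHA` is my own choice for illustration. The lookup only confirms that the character is defined in the standard; it says nothing about whether your fonts can draw it.

```python
import unicodedata

# The code-point mentioned above; SHA is just an illustrative name.
SHA = "\u0bb6"

# Print the code-point in U+XXXX form along with its official Unicode name.
print(f"U+{ord(SHA):04X}", unicodedata.name(SHA))

# Print its general category (letters, marks, digits and so on).
print(unicodedata.category(SHA))
```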

It looks like this version of Unicode was released in March 2005. This means that Tamil fonts created before then, such as the default one in Windows XP, will not have this character. If you are interested, you can download the "Lohit" Tamil font, which supports the new code-point. It should work on Windows and Linux. (Not sure about Mac.)

As I feared, there were some anti-Sanskrit folks who fought against this character. From a Unicode mailing-list thread on this subject:

Sanskrit is always seen a wanton intrusion [sic] to destroy all Indic languages and cause confusion. Tamil has been defending itself for hundreds of years.... Unicode is not the entity that should decide the demise of the ancient and sophisticated Tamil, like the demise of all other Indic languages.... 0BB6 must be deprecated. 0BB6 was encoded illegally by Unicode.

I am pleasantly surprised that, in spite of this, 0BB6 made it through.

Right now in translipi, I use ச to transliterate 'sha' into Tamil. This is fine for the cases where it is accompanied by a vowel. However, when it appears as a pure consonant or together with other consonants, ச is conventionally pronounced 'cha' in Tamil. This is not very satisfactory; the Grantha 'sha' letter fits the bill perfectly here. However, since not many fonts support it yet, translipi will go along with ச for 'sha' for some more time.
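The trade-off can be sketched in a few lines of Python. This is not translipi's actual code: the function `tamil_sha` and its two flags are hypothetical, made up purely to illustrate the choice between the Grantha letter and the ச stand-in.

```python
GRANTHA_SHA = "\u0bb6"  # TAMIL LETTER SHA, the code-point discussed above
FALLBACK_CA = "\u0b9a"  # TAMIL LETTER CA (ச), the current stand-in
PULLI = "\u0bcd"        # TAMIL SIGN VIRAMA, marks a pure consonant


def tamil_sha(font_supports_sha: bool, pure_consonant: bool = False) -> str:
    """Hypothetical helper: choose the Tamil letter used for 'sha'.

    font_supports_sha is an assumed flag supplied by the caller; this
    sketch does not inspect any font itself.
    """
    base = GRANTHA_SHA if font_supports_sha else FALLBACK_CA
    # A pure consonant carries the pulli; otherwise the bare letter is returned.
    return base + PULLI if pure_consonant else base


# Today's behaviour: fall back to ச because font support is still rare.
print(tamil_sha(font_supports_sha=False))
# What becomes possible once fonts catch up.
print(tamil_sha(font_supports_sha=True, pure_consonant=True))
```

Keeping the decision behind a single flag would make it a one-line change to switch over once enough fonts support the new letter.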

6 comments :

Anonymous said...

why is it "not very satisfactory"?
any language makes modifications when it imports foreign words. are you also asking for there to be a "zha" in other languages?

your argument from venkatesh type names is utterly silly -- if christians can call themselves francis and xavier without begging for extra letters to be introduced into thamizh, venkatesh can just cope.

don't you think it's extremely late in the day to be raising the flag for importing letters into thamizh to spell sanskrit names?

oh wait, it's never too late for your type.

Srikanth said...

I wonder why you use thamizh. It seems to me that you didn't find tamil very satisfactory.

Anonymous said...

Thanks for posting that the Lohit Tamil font has support for 'sha'. The picture you have included shows the maatraas joining correctly; however, when I am using it on XP, I am getting the dashed circle between the letter and maatraas. Is there some other setting to be used to get it to work correctly? I am basically helping a friend by transliterating Sanskrit to Tamil.

Thanks!

ANKITH said...

Why don't you stupids learn Devanagari to learn all mantras? Are you guys trying to attain divinity using tamil neeksh basha.... lazy folks.. learn Devanagari, you can study all mantras with correct sounds.. it's deva bhasha..

kennady said...

woo! Tamil is a Dravidian language that, owing to its geographical expansion, has spread beyond the frontiers of India. Apart from being the language of forty million people in Tamil Nadu, it is the spoken and written language of several million Tamils living in Ceylon, Burma, Singapore, Malaysia, Indonesia, South Africa, the Fiji Islands and Mauritius.

Saketh said...

I do not understand the problem you have against the Grantha alphabets. The Grantha alphabets are a recognized part of Tamil as per the Tamil Nadu state government's Tamil All Character Encoding. https://en.wikipedia.org/wiki/Tamil_All_Character_Encoding

These alphabets have been used in Tamil for centuries, nay millennia, and there is no problem if this is reflected in the digital world. T