Hindi encodings/fonts and conversions among them

Personal notes by Mark Liberman --
      See http://www.ldc.upenn.edu/myl/hindi.html for more extensive notes on Hindi resources...

Information about ISCII: http://tdil.mit.gov.in/standards.htm   http://ldc.upenn.edu/myl/iscii91.pdf

ITRANS:  http://www.aczone.com/itrans/online/ http://www.aczone.com/itrans/#itransencoding
                 http://www.aczone.com/itrans/idoc/idoc.html

Unicode 3.0 chapter 9: http://www.unicode.org/unicode/uni2book/ch09.pdf
         Unicode devanagari code chart: http://www.unicode.org/charts/PDF/U0900.pdf

Jeroen Hellingman's Indian Scripts and Unicode.

Proposal for improving ISCII/Unicode compatibility: http://www.exnet.btinternet.co.uk/uniprop/encoding.htm

List of common Hindi conjunct consonants:  http://www.geocities.com/muurtikaar/conjuncts.html

Microsoft typography stuff:
Converting a Devanagari font to Unicode / OTL: http://www.microsoft.com/typography/developers/volt/devanagari.htm
Creating and supporting OpenType fonts for Indic scripts: http://www.microsoft.com/typography/otfntdev/indicot/default.htm
Visual OpenType Layout Tool: http://www.microsoft.com/typography/developers/volt/default.htm

Conversion software:

ISCII to and from Unicode  http://www.cse.iitk.ac.in/users/isciig/
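
For the simple cases, ISCII-to-Unicode conversion is a byte-to-code-point table lookup. A minimal Python sketch follows, assuming the ISCII-91 Devanagari assignments listed in the table (a small hand-picked subset; check it against iscii91.pdf above before relying on it). The names ISCII_TO_UNICODE and iscii_to_unicode are made up for this illustration. The sketch ignores the ATR/EXT bytes and the special multi-byte sequences (nukta and doubled-halant combinations) that a real converter has to handle.

```python
# Partial ISCII-91 (Devanagari) -> Unicode table.
# ASSUMPTION: subset copied by hand from the ISCII-91 chart; not complete.
ISCII_TO_UNICODE = {
    0xA2: "\u0902",  # anusvara
    0xA4: "\u0905",  # letter a
    0xB3: "\u0915",  # ka
    0xC6: "\u0928",  # na
    0xCC: "\u092E",  # ma
    0xCF: "\u0930",  # ra
    0xD7: "\u0938",  # sa
    0xD8: "\u0939",  # ha
    0xDA: "\u093E",  # vowel sign aa
    0xDB: "\u093F",  # vowel sign i
    0xE8: "\u094D",  # halant (virama)
    0xEA: "\u0964",  # danda
}


def iscii_to_unicode(data: bytes) -> str:
    """Convert ISCII bytes to a Unicode string, one byte at a time."""
    out = []
    for b in data:
        if b < 0x80:
            out.append(chr(b))  # the lower half of ISCII is plain ASCII
        else:
            # Bytes missing from the partial table become U+FFFD.
            out.append(ISCII_TO_UNICODE.get(b, "\ufffd"))
    return "".join(out)


print(iscii_to_unicode(bytes([0xB3, 0xDA, 0xCC])))  # ka + aa + ma
```

ISCII stores text in the same logical (phonetic) order as Unicode, so no reordering is needed here, unlike with the font encodings listed further down.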

http://www.iiit.net/ltrc/FC-1.0/fc.html
    converts to ISCII from:
       devpooja, devpriya, DV-TTYogesh, DVB-TTYogesh,  Sanscrit-98,
       shusha, MITHI,  DVBW-TTYogesh,

http://www.cfilt.iitb.ac.in/resourcepage/index.html
    converts to ISCII from:
       SurekhB, Akruti, Iitk

This program has some useful "map" files: http://members.tripod.com/~sbiswas/IWrite32/IWrite32.html
in particular for Jagran, Shusha, Samachar...

Commercial software that claims near-universal coverage:
FontSuvidha: http://www.cybershoppee.com/software/
http://www.cybershoppee.com/software/fontsuvidha.phtml

ISCII plug-in (claims to convert from various font mappings): http://www.iiit.net/ltrc/iscii/

CMU encoding identifier source code:  http://www-2.cs.cmu.edu/~ref/public/idenc.zip


Documentation and other resources:

Web in Indian languages: http://india-n-indian.com/it/wil.html

Font resources (abbreviated list):
http://kjs.nagaokaut.ac.jp/mikami/SRIG/font-resources.htm
http://www.agfamonotype.com/software/wt_fontsample.asp?lan=devanagari
WorldType Devanagari glyph list: http://www.agfamonotype.com/software/wt_glyph.asp?lan=devanagari

font-name vs. encoding table: http://www.india-n-indian.com/it/ifonts.html
also gives info on which sites use which fonts/encodings, from ...  http://www.cs.colostate.edu/~malaiya/devafonts.htm
another copy/version at http://mlcr.nagaokaut.ac.jp/main1/languages/hindi_frame.htm
the information is no longer valid in some cases.

Info on Indic language fonts: http://cgm.cs.mcgill.ca/~luc/indic.html

Charsets of news and other Hindi text sites:

http://www.bbc.co.uk/hindi/
Unicode

http://www.prabhasakshi.com/    user-defined
font-family: Kruti Dev 010
http://www.prabhasakshi.com/KRUTIDE0.eot

http://www.hindimilap.com/ user-defined
http://www.hindimilap.com/akmilap.eot
supposedly uses devpuja (according to http://www.cs.colostate.edu/%7Emalaiya/devafonts.htm)
and http://www.iiit.net/ltrc/FC-1.0/fc.html can do devpooja (sic) to ISCII!
but home page cites http://www.hindimilap.com/akmilap.pfr (local copy)
and http://www.hindimilap.com/akmilap.eot (local copy)


http://www.naidunia.com/    charset=user-defined
Conversion routine nai2ucs2.pl
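
The routine itself is not reproduced here. A font-to-Unicode conversion of this kind is usually table-driven, with one extra step: these fonts store the short-i matra before the consonant it is drawn next to, while Unicode stores it after. A minimal Python sketch of that shape follows. FONT_MAP and its byte values are invented for illustration and are NOT the Naidunia font's actual layout; a real table has to be read off the font's glyph chart.

```python
# HYPOTHETICAL glyph table: font byte -> Unicode string.
# The byte values are invented; they do not describe any real font.
FONT_MAP = {
    0x6B: "\u0915",  # ka
    0x4D: "\u092E",  # ma
    0x6C: "\u0932",  # la
    0x61: "\u093E",  # vowel sign aa
    0x66: "\u093F",  # vowel sign i (drawn to the left of its consonant)
    0x20: " ",
}
I_MATRA = "\u093F"


def font_to_unicode(data: bytes) -> str:
    """Map font bytes to Unicode and move each i-matra after its consonant."""
    units = [FONT_MAP.get(b, "\ufffd") for b in data]
    out = []
    i = 0
    while i < len(units):
        if units[i] == I_MATRA and i + 1 < len(units):
            # Visual order: matra first. Logical order: consonant first.
            out.append(units[i + 1])
            out.append(I_MATRA)
            i += 2
        else:
            out.append(units[i])
            i += 1
    return "".join(out)


def to_ucs2(text: str) -> bytes:
    """Encode as big-endian UCS-2 (same as UTF-16 for BMP characters)."""
    return text.encode("utf-16-be")


print(font_to_unicode(bytes([0x66, 0x4D, 0x6C])))  # i-matra, ma, la
```

The swap above only handles a matra followed by a single consonant. With conjuncts the matra has to move past the whole consonant cluster, and half-forms and the reph need similar treatment.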

http://www.jagran.com/  user-defined
http://www.Jagran.com/Jagran.pfr
PFR is "portable font resource"
according to devafonts site above, map is "samachar"
perhaps this? http://www.gujarat-samachar.com/Fonts/GSFree.htm
mentioned in "cybercensus" here: http://mlcr.nagaokaut.ac.jp/suzuki/project.htm
check out IWRITE font map
True Type font from jagran.com web site: jagran.ttf
      and its font map as printed by fontographer: jagran.pdf


http://www.navabharat.net/
Summit INDICA font(s):  Yog.ttf  Yogb.ttf   Yogbita.ttf   Yogita.ttf  

http://www.rajasthanpatrika.com/
   http://www.rajasthanpatrika.com/pfrs/Mitra.pfr

http://www.bhaskar.com/
Bhaskar.ttf

http://www2.amarujala.com/today/default.asp
au.ttf
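
The charset and font notes above were gathered by looking at each page's source. That check can be roughly automated: pull out the declared charset, any font-family names, and any .eot/.pfr/.ttf references. A minimal Python sketch, run here on an invented HTML fragment (the markup in `sample` is made up, not copied from any of the sites above; describe_page is a hypothetical helper name):

```python
import re

CHARSET_RE = re.compile(r'charset\s*=\s*["\']?([\w-]+)', re.IGNORECASE)
FONT_FILE_RE = re.compile(r'[\w./:~%-]+\.(?:eot|pfr|ttf)\b', re.IGNORECASE)
FONT_FAMILY_RE = re.compile(r'font-family\s*:\s*([^;"\'}<]+)', re.IGNORECASE)


def describe_page(html: str) -> dict:
    """Report the declared charset and any embedded-font hints in a page."""
    charset = CHARSET_RE.search(html)
    return {
        "charset": charset.group(1).lower() if charset else None,
        "font_files": sorted(set(FONT_FILE_RE.findall(html))),
        "font_families": [f.strip() for f in FONT_FAMILY_RE.findall(html)],
    }


# Invented sample markup, for illustration only.
sample = """<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=x-user-defined">
<style>
@font-face { font-family: Kruti Dev 010; src: url(KRUTIDE0.eot); }
</style></head><body></body></html>"""

info = describe_page(sample)
print(info)
```

A declared charset of "user-defined" says nothing about the actual glyph layout; the font file name is usually the better clue to which conversion table is needed.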

Hindi literary magazines:
http://www.udgam.com/
http://www.bharatdarshan.co.nz/

Parallel sources (more or less):
http://sify.com/news_info/news/    http://sify.com/hindi/
apparently uses jagran font: http://sify.com/eot/JAGRAN0.eot
True Type font from jagran.com web site: jagran.ttf
      and its font map as printed by fontographer: jagran.pdf
People at UMd are supposed to be working on converting this font

http://www.indiatoday.com/itoday/index.html     http://www.indiatodayhindi.com/
[can't reach hindi site to diagnose font...]

 http://www.vigyanprasar.com/dream/index.asp
[text is in .pdf form -- font is included in files but does not seem to match any known ???]

News in English, Hindi, Telugu: http://www.niharonline.com/  http://www.niharonline.com/hindi/news/
pfr/SHREEA096.pfr
pfr/SHREE708E.pfr
pfr/SHREE714E.pfr

English, Hindi, Marathi:     http://www.rediff.com/  http://www.rediff.com/hindi/index.html
       font is Shree708.ttf  fontmap Shree708.pdf
        partial conversion table is here
check out http://www.cirrussoft.com/products.htm for possible converters...

Saileela Magazine -- parallel English/Hindi -- also  in Shree708:
http://www.shrisaibabasansthan.org/saileela/instruct.htm

http://www.boloji.com/Default.asp  http://www.boloji.com/hindi/index.html
boloji Hindi font is shusha.ttf and shusha5.ttf
Note that free converters at http://www.iiit.net/ltrc/FC-1.0/fc.html can apparently handle this font

http://www.nhm.ac.uk/darwincentre/insite/english/index.html
          http://www.nhm.ac.uk/darwincentre/insite/hindi/index.html
(encoded in UTF-8)

Indian government sites:

Government Portal: http://indiaimage.nic.in/
Parliament:
http://goidirectory.nic.in/parliament.html
 English version: http://rajyasabha.nic.in/
 Hindi version: http://rajyasabha.nic.in/hindisite/hindipage.asp
  rajyasabha font download page: http://rajyasabha.nic.in/rsdebfont.htm
  Five fonts available there:
           "New web font" http://rajyasabha.nic.in/downloadfornts/new/Dvbwyg3n.ttf     DVBW-TTYogeshEN Normal
           "Web font old"  http://rajyasabha.nic.in/downloadfornts/old/Dvbwsr3n.ttf        DVBW-TTSurekhEN
           "Billingual font" (sic) http://rajyasabha.nic.in/fonts/DVBSR3NT.TTF                 DVB-TTSurekhEN Normal
           "Hindi font (Yogesh)" http://rajyasabha.nic.in/fonts/DVYG0ntt.ttf                        DV-TTYogesh Normal
           "Hindi font (Surekh)" http://rajyasabha.nic.in/fonts/DVSR0ntt.ttf                        DV-TTSurekh Normal
  Note that  http://www.iiit.net/ltrc/FC-1.0/fc.html
    converts to ISCII from: DV-TTYogesh, DVB-TTYogesh,  DVBW-TTYogesh
  and http://www.cfilt.iitb.ac.in/resourcepage/index.html
    converts to ISCII from: SurekhB
  ...but I have not verified that these match exactly the fonts used on the rajyasabha site.

Constitution:  http://indiacode.nic.in/coiweb/welcome.html

Ministry of Home Affairs: http://rajbhasha.nic.in/
http://rajbhasha.nic.in/dolst_eng.htm http://rajbhasha.nic.in/dolst_hin.htm
(English welcome page above seems to be mistakenly linked to Hindi page)
    uses DVBW-TTYogeshEN  DVW-TTYogeshEN DVBW-TTSurekh  (with .eot files on site)
     And content is encoded in Unicode char references to get proprietary-font code points!!!
     ...e.g. ±É to encode Hindi!!!!
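
To see why this is a problem: decoding such references yields code points in the Latin-1 range (± is U+00B1, É is U+00C9), not in the Devanagari block (U+0900 to U+097F), so the text only looks like Hindi when the proprietary font is installed. A minimal Python sketch, assuming the page wrote the pair as the decimal references &#177;&#201; (the exact form of the references on the site is an assumption; in_devanagari_block is a hypothetical helper):

```python
import html


def in_devanagari_block(ch: str) -> bool:
    """True if ch lies in the Unicode Devanagari block U+0900..U+097F."""
    return "\u0900" <= ch <= "\u097F"


# ASSUMPTION: decimal numeric character references for the two characters.
markup = "&#177;&#201;"
text = html.unescape(markup)

code_points = [ord(c) for c in text]
print(code_points)
print(any(in_devanagari_block(c) for c in text))
```

After this decoding step the code points still have to go through a font-specific table before they mean anything as Hindi text.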


Press information bureau: http://pib.nic.in/ http://pib.nic.in/urdu/hindimain.html
pib.pfr
http://164.100.24.208/

http://planningcommission.nic.in/
    http://planningcommission.nic.in/hindi/welcome.html
DVDIVYA9.eot
DVDIN___.TTF

http://tdil.mit.gov.in/homepage.htm     http://tdil.mit.gov.in/hindi_site/homepage.htm
DVW-TTYogesh.pfr
Note that free converters at http://www.iiit.net/ltrc/FC-1.0/fc.html can handle this font

Gita in Hindi and English: http://www.gitasupersite.iitk.ac.in/

PFR, EOT information:

PFR specification: http://www.bitstream.com/categories/developer/truedoc/pfrspec.html

Microsoft WEFT:
http://www.microsoft.com/typography/web/embedding/weft3/default.htm

EOT font viewers (these are pretty pitiful...)
http://www.ruland-web.de/EOT-VIEWER/
http://www.bearzcave.com/plugins/eot-view.zip
http://lettermanstationery.tripod.com/ezeot.htm

Info on embedding fonts: http://www.netmechanic.com/news/vol3/css_no15.htm
                                          http://hotwired.lycos.com/webmonkey/99/45/index0a_page8.html?tw=design
                                          http://www.okinfoweb.com/moe/format/format_007.htm

http://www.webattack.com/freeware/gmm/fwfont.shtml