The Shree-Dev-0708 space problem: six different code points in the font have glyphs that are essentially defined as a space. Only one of them (0x20) really seems to mean a space. The others appear to have a purely typographical function, e.g. to prevent crowding of characters. I assume that they are linguistically meaningless and should be omitted from the ISCII/Unicode translation.

Hex Code  Width  Frequency (in 2.7MB sample)  Comment
0x20      200    485,587                      Codepoint for normal ascii space; renders as visible space.
0x24      156    155,502                      Used to prevent crowding of characters?
0x3e      37     25,619                       Probably also not a real space?
0x9f      125    1,593                        Probably also not a real space?
0xa0      5      10                           Probably also not a real space?
0xb7      5      1,357                        Probably also not a real space?
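
Frequencies like those in the table can be gathered with a simple byte count. The following is a minimal sketch, not the script actually used for the 2.7MB sample: the names SPACE_LIKE_WIDTHS and count_space_like are illustrative, and the short input here is just the first 16 bytes of test case 1.

```python
from collections import Counter

# Space-like code points from the table (byte value -> glyph width).
SPACE_LIKE_WIDTHS = {0x20: 200, 0x24: 156, 0x3E: 37, 0x9F: 125, 0xA0: 5, 0xB7: 5}


def count_space_like(data: bytes) -> dict:
    """Return how often each space-like byte value occurs in data."""
    counts = Counter(data)  # iterating over bytes yields ints 0..255
    return {code: counts[code] for code in SPACE_LIKE_WIDTHS}


# Hypothetical stand-in for the real sample: first dump line of test case 1.
sample = bytes.fromhex("67 61 48 24 6d 61 20 5a 6f 20 48 24 6f 7e 62 20")
for code, n in count_space_like(sample).items():
    print(f"0x{code:02x}  width={SPACE_LIKE_WIDTHS[code]:>3}  count={n}")
```

For a real corpus, the sample bytes would be read from the font-encoded file in binary mode instead.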

 

Test case 1: a headline.

Hex character sequence is:

0000000 67 61 48 24 6d 61 20 5a 6f 20 48 24 6f 7e 62 20
0000020 51 72 20 64 72 20 48 24 6f 20 7b 5a 60 5f 20 48
0000040 24 6d 60 58 6f 20 56 60 20 7b 48 24 45 20 0d 0a
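
To work with this line programmatically, the dump can be turned back into raw bytes. This is a minimal sketch under one assumption: the leading offsets are octal, as od prints them by default, which fits the 16 bytes per line seen here. The names DUMP and parse_dump are illustrative.

```python
DUMP = """\
0000000 67 61 48 24 6d 61 20 5a 6f 20 48 24 6f 7e 62 20
0000020 51 72 20 64 72 20 48 24 6f 20 7b 5a 60 5f 20 48
0000040 24 6d 60 58 6f 20 56 60 20 7b 48 24 45 20 0d 0a
"""


def parse_dump(text: str) -> bytes:
    """Convert an od-style hex dump (offset, then hex bytes) to raw bytes."""
    out = bytearray()
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        # Assumption: offsets are octal. Check them against the running length.
        offset = int(fields[0], 8)
        if offset != len(out):
            raise ValueError(f"offset {fields[0]} does not match {len(out)} bytes read")
        out.extend(int(field, 16) for field in fields[1:])
    return bytes(out)


raw = parse_dump(DUMP)
print(len(raw), "bytes;", raw.count(0x24), "occurrences of 0x24")
```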

This is rendered by MSWord as:

Interpreted bytewise, the line is as follows (the first column is the input byte in hex; the second column is one or more bytes of ISCII translation; the third column gives the ISCII codepoint name):



67 ->  d7 =  (sa) 
61 ->  cf =  (ra) 
48 ->  b3 =  (ka) 
24 ->  20 =  (ascii space) 
6d ->  da =  (aa matra) 
61 ->  cf =  (ra) 
20 ->  20 =  (ascii space) 
5a ->  c6 =  (soft na) 
6f ->  e1 =  (ey matra) 
20 ->  20 =  (ascii space) 
48 ->  b3 =  (ka) 
24 ->  20 =  (ascii space) 
6f ->  e1 =  (ey matra) 
7e ->  ca =  (ba) 
62 ->  d1 =  (la) 
20 ->  20 =  (ascii space) 
51 ->  bd =  (hard ta) 
72 ->  dc =  (ii matra) 
20 ->  20 =  (ascii space) 
64 ->  d4 =  (va) 
72 ->  dc =  (ii matra) 
20 ->  20 =  (ascii space) 
48 ->  b3 =  (ka) 
24 ->  20 =  (ascii space) 
6f ->  e1 =  (ey matra) 
20 ->  20 =  (ascii space) 
7b ->  db =  (i matra) 
5a ->  c6 =  (soft na) 
60 ->  cd =  (ya) 
5f ->  cc =  (ma) 
20 ->  20 =  (ascii space) 
48 ->  b3 =  (ka) 
24 ->  20 =  (ascii space) 
6d ->  da =  (aa matra) 
60 ->  cd =  (ya) 
58 ->  c4 =  (soft da) 
6f ->  e1 =  (ey matra) 
20 ->  20 =  (ascii space) 
56 ->  c2 =  (soft ta) 
60 ->  cd =  (ya) 
20 ->  20 =  (ascii space) 
7b ->  db =  (i matra) 
48 ->  b3 =  (ka) 
24 ->  20 =  (ascii space) 
45 ->  ac =  (ey) 
20 ->  20 =  (ascii space) 
0d ->  [CarriageReturn]
0a ->  [NewLine]
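
The listing above amounts to a lookup table. A minimal sketch of such a bytewise translation follows. It covers only the byte values that occur in this test case, does no reordering, and passes CR and LF through unchanged. The names FONT_TO_ISCII and translate_bytewise are illustrative, not taken from any existing converter.

```python
# Font byte -> ISCII bytes, copied from the listing for this test case only.
# Values are bytes objects because a translation may be more than one byte.
FONT_TO_ISCII = {
    0x20: b"\x20",  # ascii space
    0x24: b"\x20",  # listed as ascii space in the bytewise reading
    0x45: b"\xac",  # ey
    0x48: b"\xb3",  # ka
    0x51: b"\xbd",  # hard ta
    0x56: b"\xc2",  # soft ta
    0x58: b"\xc4",  # soft da
    0x5A: b"\xc6",  # soft na
    0x5F: b"\xcc",  # ma
    0x60: b"\xcd",  # ya
    0x61: b"\xcf",  # ra
    0x62: b"\xd1",  # la
    0x64: b"\xd4",  # va
    0x67: b"\xd7",  # sa
    0x6D: b"\xda",  # aa matra
    0x6F: b"\xe1",  # ey matra
    0x72: b"\xdc",  # ii matra
    0x7B: b"\xdb",  # i matra
    0x7E: b"\xca",  # ba
}


def translate_bytewise(raw: bytes) -> bytes:
    """Translate font-encoded bytes to ISCII one input byte at a time."""
    out = bytearray()
    for b in raw:
        if b in (0x0D, 0x0A):  # carriage return / newline pass through
            out.append(b)
        elif b in FONT_TO_ISCII:
            out.extend(FONT_TO_ISCII[b])
        else:
            raise ValueError(f"no mapping for font byte 0x{b:02x}")
    return bytes(out)


print(translate_bytewise(bytes.fromhex("67 61 48 24 6d 61 20")).hex(" "))
```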

If we remove the 0x24 characters from this line, MSWord renders it as below -- the result is that some glyphs encroach on the space of their neighbors:

I assume, without checking, that the other four space-like code points (0x3e, 0x9f, 0xa0, 0xb7) should similarly be ignored.
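
If that assumption holds, the typographic spaces can be dropped in a single pass before translation. A minimal sketch, with the caveat that only 0x24 has actually been examined here and the other four byte values are included on assumption. The names TYPOGRAPHIC_SPACES and strip_typographic_spaces are illustrative.

```python
# Byte values assumed to be purely typographic spaces.
# Only 0x24 was checked in test case 1; the rest are unverified.
TYPOGRAPHIC_SPACES = bytes([0x24, 0x3E, 0x9F, 0xA0, 0xB7])


def strip_typographic_spaces(raw: bytes) -> bytes:
    """Delete the typographic-space bytes and keep the real space (0x20)."""
    # bytes.translate with a table of None only deletes the listed bytes.
    return raw.translate(None, TYPOGRAPHIC_SPACES)


before = bytes.fromhex("67 61 48 24 6d 61 20")
after = strip_typographic_spaces(before)
print(before.hex(" "), "->", after.hex(" "))
```

This should be applied only to text known to be in the Shree-Dev-0708 font, since the same byte values mean ordinary characters in other encodings.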