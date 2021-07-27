

Phil H wrote these comments to "Uncommon words of anguish" (7/18/21):

The anguish is very real. My wife had a character in her name that most computers will not reproduce ([石羡]), despite it being relatively common in names in our part of the world, and has been refused bank accounts, credit cards, and a mortgage because of it. In the end she changed her name rather than continue to deal with the hassle. The character is in the standard, but it was too late for us.

…there have always been ways to get the character onto a computer, but any given piece of bank software might not recognise it, and any given bank functionary might be unfamiliar with them. We then had trouble when some organisations used the pinyin XIAN in place of the character, but that then made their documentation inconsistent with her national ID card (which had the right character on it) and so yet further bodies would not accept them… It was the standard "mild computer snafu + large inflexible bureaucracy = major headache" equation.

An anonymous correspondent, a computer scientist, sent in the following remarks:

Phil H is talking about a character which is in a "supplementary plane" in Unicode (and similarly in GB 18030). Unfortunately, an awful lot of software was only ever tested on Basic Multilingual Plane characters.
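To make the distinction concrete, here is a minimal Python sketch. The helper names `plane` and `is_supplementary` are made up for illustration, and the sample character U+20000 is just the first code point of a supplementary CJK block, not the character from Phil H's comment. Code points up to U+FFFF belong to the Basic Multilingual Plane (plane 0); anything above that is in a supplementary plane.

```python
def plane(ch: str) -> int:
    """Return the Unicode plane number (0 to 16) of a single character."""
    return ord(ch) >> 16  # each plane holds 0x10000 code points


def is_supplementary(ch: str) -> bool:
    """True if the character lies outside the Basic Multilingual Plane."""
    return ord(ch) > 0xFFFF


if __name__ == "__main__":
    # U+77F3 is in the BMP; U+1F600 (an emoji) and U+20000 (a CJK
    # ideograph) are both supplementary, but in different planes.
    for ch in ("\u77f3", "\U0001F600", "\U00020000"):
        print(f"U+{ord(ch):04X} plane={plane(ch)} supplementary={is_supplementary(ch)}")
```

Written in hexadecimal, BMP code points need at most four digits and supplementary ones need five or six, which is roughly the situation the telephone analogy below describes.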

The way I explained it to a Chinese friend was: imagine you live in a small town where everybody's telephone number is 4 digits. Then they build a load of new houses on the edge of town, but when they come to connect them up, they find there aren't enough numbers left to give all the new people 4-digit telephone numbers, so they get 5 digits.



But none of the existing 4-digit residents want to change their numbers into 5-digit numbers, so they carry on having 4-digit numbers. And they know each other very well, and only ever call each other, and it's very rare indeed that any one of them wants to call up one of those new 5-digit people living on the edge of town, so some of them still go through their lives thinking that telephone numbers only ever have 4 digits.



Which unfortunately leads to some slightly-careless equipment designers testing their equipment only on 4-digit numbers, not checking if 5-digit numbers also exist. Which means, when you want to use a 5-digit number, you just might find that some piece of equipment which had claimed to be fully compatible with the phone system suddenly doesn't work.



(Oh, and there does exist a way to translate a 5-digit number into TWO of the 4-digit numbers, which is supposed to make it compatible with any old equipment that insists on 4-digit numbers. But some of that equipment doesn't support putting two telephone numbers in the space of one.)
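The "two 4-digit numbers" trick presumably refers to UTF-16 surrogate pairs. As a rough sketch (the function name `to_surrogate_pair` is invented here, and Python itself does not need this because its strings store whole code points), the standard arithmetic looks like this:

```python
from typing import Tuple


def to_surrogate_pair(code_point: int) -> Tuple[int, int]:
    """Split a supplementary code point into UTF-16 high and low surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF need a surrogate pair")
    offset = code_point - 0x10000          # a 20-bit value
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low


if __name__ == "__main__":
    high, low = to_surrogate_pair(0x20000)
    print(f"U+20000 -> {high:04X} {low:04X}")
    # Cross-check against Python's own UTF-16 encoder.
    print("\U00020000".encode("utf-16-be").hex())
```

Software that counts each 16-bit unit as one character will see two "characters" here, which is one way old equipment can end up splitting or truncating the pair.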



If they'd insisted on giving EVERYBODY 5-digit numbers, then all the makers of equipment would have been forced to face the reality of 5 digits. But then everybody would have had to change, and that's too awkward.



At least a lot of Western designers are facing up to 5-digit numbers now. Why? Because a whole bunch of EMOJI characters are in that 5-digit area! So I think the way to get blog tools etc to support the extra characters is simply to say we want to use all the latest emoji: English developers can relate to that more than they do to rare characters in a language they don't understand. (Just hope they don't make some other assumption like "if it's a 5-digit number then the first digit will always be a 1 because that covers the emoji".)
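The closing worry can be turned into a small test habit: try samples from more than one supplementary plane, not only emoji. The sketch below is purely illustrative. `SAMPLES`, `failing_samples`, `bmp_only`, and `emoji_only` are hypothetical names, and the two handlers are deliberately broken stand-ins for careless software, not models of any real product.

```python
# One sample per "kind of phone number": BMP, plane 1 (emoji), plane 2 (CJK).
SAMPLES = {
    "BMP": "\u77f3",
    "plane 1 emoji": "\U0001F600",
    "plane 2 CJK": "\U00020000",
}


def failing_samples(process):
    """Return the labels of samples that `process` does not hand back unchanged."""
    return [label for label, text in SAMPLES.items() if process(text) != text]


def bmp_only(text: str) -> str:
    # Hypothetical legacy handler: silently drops anything above U+FFFF.
    return "".join(ch for ch in text if ord(ch) <= 0xFFFF)


def emoji_only(text: str) -> str:
    # Hypothetical handler that assumes a 5-digit code point always starts with 1.
    return "".join(ch for ch in text if ord(ch) <= 0x1FFFF)


if __name__ == "__main__":
    print(failing_samples(bmp_only))
    print(failing_samples(emoji_only))
```

A handler that passes the emoji sample but fails the plane 2 sample is exactly the "first digit will always be a 1" mistake.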



Meanwhile China is, apparently, building new towns and cities which have the 5-digit characters in their names. That'll make them more common, once people start moving in to those places and they want to write about where they live without having to use some cheap-looking homophone.

Does the anonymous correspondent's analogy work?

