Language Log

The history of characters in computers

January 3, 2025 @ 6:34 pm · Filed by Victor Mair under Language and computers, Writing systems

Sino-Platonic Papers is pleased to announce the publication of its three-hundred-and-sixtieth issue:

“Kanji and the Computer: A Brief History of Japanese Character Set Standards,” by James Breen.

https://www.sino-platonic.org/complete/spp360_kanji_computers_japanese_character_set.pdf

ABSTRACT
This paper describes the development of the character coding systems and standards that enable Japanese text to be recorded and used in computer systems. The Japanese coding systems, which were first developed in the late 1970s, pioneered the approaches to handling the large numbers of kanji characters and established a pathway that was adopted in other standards for Asian languages. The paper covers the development of the major Japanese standards and their evolution into the Unicode character standard, which is now the basis for all language coding.
—–
All issues of Sino-Platonic Papers are available in full for no charge.
To view our catalog, visit http://www.sino-platonic.org/

finis

This paper is also available as a WWW page at:

https://www.edrdg.org/~jwb/paperdir/kanjicomp.html

There are some Chinese associations: the first PRC standard for hanzi, etc. (GB 2312-80) was modelled on the earlier, pioneering JIS C6226-1978. And of course the Taiwanese CCCII development (aka EACC), which was the first attempt at fusing the codings of hanzi, kanji and hanja, was a precursor to the great Han Unification that took place in Unicode and changed the coding world forever.

Selected readings

"Triple review of books on characters and computers" (8/23/24)
"Sinographic inputting: 'it's nothing' — not" (2/22/21) — with lengthy bibliography
Victor H. Mair and Yongquan Liu, eds., Characters and Computers (Amsterdam, Oxford, Washington, Tokyo: IOS, 1991)


1 Comment

Philip Taylor said,

January 6, 2025 @ 8:12 am

I would be very grateful if the following comment could be fed back to the author :

Things began to improve in the 1960s when the computing industry started to move to 8-bit units, for which IBM coined the term “bytes,” a word that has stuck with us. These allowed for up to 256 combinations, so we finally could use lowercase alphabetic characters as well, and non-English languages could potentially add characters such as é, ö, and ç.

Worth noting, perhaps, that full use of the 8-bit space was not made immediately — there was an interim period, possibly required by the infidelity of data transmission over acoustic couplers, during which the 8th bit was used solely to indicate parity, thus reducing the number of available characters to 128 (less, of course, the control characters, of which there were 32).
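
The arithmetic behind this point can be sketched in a few lines of Python. This is only an illustration: it assumes even parity (the comment does not say which parity convention was in use), and the function names are hypothetical.

```python
def with_even_parity(code: int) -> int:
    """Pack a 7-bit character code into 8 bits, using bit 7 as an even-parity bit.

    Assumption: even parity. Odd parity would work the same way with the bit inverted.
    """
    if not 0 <= code < 128:
        raise ValueError("expected a 7-bit code (0-127)")
    # parity is 1 when the seven data bits contain an odd number of ones,
    # so that the full 8-bit value always has an even number of ones
    parity = bin(code).count("1") % 2
    return code | (parity << 7)


def check_even_parity(byte: int) -> bool:
    """Return True when the 8-bit value contains an even number of one bits."""
    return bin(byte & 0xFF).count("1") % 2 == 0


def strip_parity(byte: int) -> int:
    """Discard the parity bit and recover the 7-bit character code."""
    return byte & 0x7F


sent = with_even_parity(ord("C"))   # 'C' is 0x43, which has three one bits
damaged = sent ^ 0x04               # simulate a single bit flipped in transit
print(hex(sent), check_even_parity(sent), check_even_parity(damaged))
```

With one of the eight bits reserved in this way, only the low seven bits carry character data, which is why the usable repertoire stays at 128 codes. A single flipped bit is detected; two flipped bits in the same byte are not.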
