On the otoro blog, there is another amazing article about sinograms:
I say "another amazing article" because, just a week ago, in "Character building is costly and time consuming" (12/22/15), we looked at a fascinating report on the vast amount of labor necessary to build fonts made up of real Chinese characters. Basically, the latter report examined the history of Chinese characters and then explained how typographers create new fonts comprising all the characters necessary for printing books, newspapers, magazines, advertising copy, and so forth.
The article under discussion goes in the opposite direction. Instead of telling us how to produce a font of all currently existing characters, it explains how one could go about creating an unlimited number of hitherto unknown characters. One might object that this is a whimsical and trivial pursuit, but if you read other posts on the same blog, you will see how it fits into the author's overall investigations, which have both theoretical and practical implications for design research. Psychologically and personally, however, there is another set of motivations that prompted the author to undertake this particular project:
As a child growing up in a mostly English speaking country, my parents would force me to attend these dreadful Saturday morning classes where I was to be taught Chinese. There would be these dictation tests where the students have to write out full passages of memorised Chinese text from a textbook, usually indirectly exposing us to Confucian moral values. We would have to spend a lot of time during the weeknights memorising passages to prepare for the test on the following Saturday. A score less than perfection is frowned upon. This would go on for years. I still have nightmares about those dictation tests. I think that’s how most children learn Chinese as well via this rote learning method around the world. Maybe in some sense, Chinese language education resembles how LSTM’s are trained to reproduce sequences from training examples.
[VHM: N.B.: I have added the link on LSTM.]
I have written about these dreaded tīngxiě 听写 / 聽寫 ("dictation") tests before on Language Log. See, for example:
"Spelling bees and character amnesia" (8/7/13)
At Penn, when it comes to language courses, for the first two decades of my career, I taught third- and fourth-year Mandarin, and for the last two decades of my career, I have been teaching Classical Chinese, so I have not had to administer tīngxiě 听写 / 聽寫 ("dictation") tests, which normally are only given during the first two years of study. But whenever I might mention tīngxiě 听写 / 聽寫 ("dictation") tests in passing, my students would groan and gasp, as though the tests were a nightmare from the past. As a matter of fact, if I ever had to teach first- and second-year Chinese classes, I would not subject my students to this kind of mindless, rote memorization. (David Moser and I have described many times on Language Log much more enlightened and benign ways to learn how to read and write Chinese [references available upon request].)
Be that as it may, I can sympathize with the author with regard to the bane of tīngxiě 听写 / 聽寫 ("dictation") tests. There is a certain trauma associated with them that can scar a person for life. In the case of otoro, the trauma has been turned to fruitful use in research on the fundamental nature of the sinograms at a very deep level.
It is interesting that otoro's investigations complement the work of innovative artists such as Xu Bing. See, for example:
Petya Andreeva, "From Xu Bing to Shu Yong: Linguistic Phenomena in Chinese Installation Art," in Victor H. Mair, ed., Language and Ideology in Nationalist and Communist China, being Sino-Platonic Papers, 256 (April, 2015).
The Chinese character system is essentially open-ended. With the available elements (radicals, components, phonophores, strokes), it is possible to create an infinite number of different characters. It is curious that some of the resultant characters in otoro's experiment look very much like possible legitimate characters or alternative characters. For instance, see here and here.
The computer code seems not to position the strokes quite correctly (yet), but other than that, many of the characters generated by the program might well be mistaken for actual characters or variants of characters.
[Thanks to Rachel Kronick]