Classical Chinese computing

Several colleagues called this article to my attention:

"Programming Language for the ancient Chinese"

Here's the introduction:

文言, or wenyan, is an esoteric programming language that closely follows the grammar and tone of classical Chinese literature. Moreover, the alphabet of wenyan contains only traditional Chinese characters and 「」 quotes, so it is guaranteed to be readable by ancient Chinese people. You too can try it out on the online editor, download a compiler, or view the source code.

The home page then goes through "Syntax", "Compilation", and "Get" (Source Code; Online Editor; Reference).

I asked a few colleagues who are highly computer literate and knowledgeable in East Asian languages, "What in the world are they trying to do?"

One who is a professional programmer and learned Sinologist immediately replied:

"Have fun."

Me:

"You're telling me to have fun?  Or they're telling people to have fun with it?

Is it all a big joke?"

Him:

"No — you asked what they are trying to do. They did this because they were trying to have fun."

Me:

"Only fun?

Is it of any use to anyone, including themselves?"

Him:

"It might very well get them a raise or a new job."

Me:

"Gotcha."

Him:

"And writing this compiler or interpreter was probably a 'learning experience' that might improve the programmer's skill."

Me:

"All the more."

To my original question, "What in the world are they trying to do?", J. Marshall Unger replied:

Thinking back to the days in which I first programmed on a Burroughs mainframe using its version of ALGOL, I recall that one could change the default names of commands, procedures, etc., which were English words, to anything one wanted.  I remember someone showing me a large print-out of a program in the same version of ALGOL from the University of Mexico in which every such name was a Spanish word:  they had made all the substitutions at the compiler level, I think.  Evidently, these guys are doing something similar, just plugging in wenyan names for all the names defined in a particular language.  It seems to be little more than a sinocentric exercise in nerdiness.  Who knows?  Maybe the Vatican computers use programs tricked out in Latin.

For those who were wondering what this programming language is all about, I think that sums it up pretty well.

[h.t.: Hiroshi Kumamoto, Tom Houpt]



18 Comments

  1. Marcelo Rinesi said,

    December 19, 2019 @ 1:59 pm

    You'd be surprised how much of computer language development is done to have fun. It's perhaps not as visible now that there's so much money involved, but in the early days playfulness with languages was a very ingrained part of the developer mindset. "Brainfuck" and "INTERCAL" are two of the canonical examples, but there are more, and surely a lot that never achieved their reputation.

    I mean, computer languages are explicitly constructed and highly modifiable languages — the temptation to do things on them, to create new ones that reflect personal philosophies or just whimsy, was and remains strong, even if, I fear, the changed economic and social context of computer programming might have blunted its expression, at least in relative terms.

  2. Milan said,

    December 19, 2019 @ 3:01 pm

    Someone actually developed a Perl module that lets you write code in a Latin-like grammar. This is quite impressive, because it emulates the morphology and free syntax of Latin in the medium of (normally quite analytic) programming languages. It was all done for fun as well, I think.

    http://users.monash.edu/~damian/papers/HTML/Perligata.html

  3. Jerzy Penguinowicz said,

    December 19, 2019 @ 3:18 pm

    One of the subtle jokes of this language is that the code is actually extremely legible as Classical Chinese mathematical prose, mostly through repeating things over and over. To programmers, verbosity and legibility as English (or Classical Chinese) are COBOL-level ridiculous, and on more than one count this is more COBOL-ish than COBOL.

  4. Hyman Rosen said,

    December 19, 2019 @ 3:42 pm

    Program in Latin, you say?
    Lingua::Romana::Perligata by Damian Conway is your ticket! http://users.monash.edu/~damian/papers/HTML/Perligata.html
    I've seen him do a presentation on this in person, and he's hilarious.

    Here's the Sieve of Eratosthenes (for generating lists of prime numbers):

    #! /usr/local/bin/perl -w
    use Lingua::Romana::Perligata;
    maximum inquementum tum biguttam egresso scribe.
    meo maximo vestibulo perlegamentum da.
    da duo tum maximum conscribementa meis listis.
    dum listis decapitamentum damentum nexto
    fac sic
    nextum tum novumversum scribe egresso.
    lista sic hoc recidementum nextum cis vannementa da listis.
    cis.
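
    For readers who do not read Perligata, the loop above can be sketched in Python roughly as follows. This is one reading of the Latin, not a translation taken from the module's documentation, and the function name `sieve` is hypothetical. Note that the loop filters by remainder (each number that survives removes its multiples from the rest of the list), which is the simple textbook variant rather than an optimized sieve.

```python
def sieve(maximum):
    """Return the primes up to `maximum`, following the shape of the Perligata loop."""
    # "da duo tum maximum conscribementa meis listis": the list 2..maximum
    candidates = list(range(2, maximum + 1))
    primes = []
    # "dum listis decapitamentum ...": take the head of the list while any remain
    while candidates:
        nxt = candidates.pop(0)
        primes.append(nxt)
        # keep only the numbers that are not multiples of the one just taken
        candidates = [n for n in candidates if n % nxt != 0]
    return primes


print(sieve(30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

    The Perligata version prints each prime as it is found; the sketch collects them in a list instead so that the result is easy to inspect.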

  5. Christopher Barts said,

    December 19, 2019 @ 4:33 pm

    Esoteric programming languages are great fun for some people, and this isn't even the weirdest.

    Check out Befunge:

    https://en.wikipedia.org/wiki/Befunge

    It's a two-dimensional programming language, and it inspired a whole category of such beasts, called "Funges" after the first.

    There's a whole wiki about esoteric programming languages in general:

    https://esolangs.org/wiki/Main_Page

  6. Topher Cooper said,

    December 19, 2019 @ 5:02 pm

    The committee which defined the Algol 60 language (basically, the language that people mean when they just say Algol) recognized the need to deal with different character sets, different encodings of the character sets (ASCII was first formalized, like Algol, in 1960 and is 7 bits, possibly plus a spare bit to either ignore, to use for parity, or to mark card numbers, and has a 6-bit upper-case-only variant within the standard), and different natural languages for the coders (it was designed to be an "international standard"). Consequently it does not define a single language but three levels of language. A **compiler** is compliant with the standard if it processes something that corresponds to any one of the three levels.

    The first level was the "reference language". This was a version of the language that was the one used to specify the structure. It used English keywords, and a somewhat different set of characters than ASCII. Several additional operator characters, like ∧ (and) and ≤ (less than or equal to) were used.

    The second level language was the "publication language" which was to be used in printed and hand-written communication. It was intended to be a direct lexeme by lexeme substitution for the reference language so as to meet the local writing or typographic conventions.

    The third level consisted of the "hardware languages". A hardware language was similar to a publication language, except that it was intended to adapt to the character set and encoding available on the hardware as well as to the local language. In practice, the character sets and encodings were mostly either ASCII or EBCDIC; the keywords used, at least in Europe and the Americas, were English; and English keywords replaced the operators unavailable in the character set. The one thing that would be unfamiliar today is that a left arrow character ("←") was the assignment operator, where the reference language specified the two-character ":=". At its inception the ASCII character set included a left arrow, but this was replaced in 1965 by the familiar underscore ("_"). Rumors were that this was something that IBM pushed for specifically to kill Algol, since IBM pushed Fortran heavily at that time, but I think it likely it was simply motivated by the development of printing technology that allowed backspace and overstrike, so that the "_" character could be used to underline.
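
    The lexeme-by-lexeme substitution described in this comment can be illustrated with a small Python sketch. The table below is hypothetical: the left arrow for ":=" comes from the comment, but the spellings `AND` and `LEQ` are stand-ins chosen for illustration, not the keywords of any particular Algol implementation. The sketch also assumes that lexemes are separated by spaces, which a real compiler front end would not require.

```python
# Hypothetical mapping from reference-language lexemes to one "hardware" spelling.
HARDWARE_LEXEMES = {
    ":=": "←",    # assignment, as mentioned in the comment
    "∧": "AND",   # assumed keyword replacing an unavailable operator character
    "≤": "LEQ",   # assumed keyword replacing an unavailable operator character
}


def to_hardware(reference_source, table=HARDWARE_LEXEMES):
    """Replace each whitespace-separated lexeme that has an entry in `table`."""
    return " ".join(table.get(token, token) for token in reference_source.split())


print(to_hardware("ok := a ≤ b ∧ b ≤ c"))  # ok ← a LEQ b AND b LEQ c
```

    Because the substitution is one lexeme for one lexeme, the structure of the program is untouched; only its surface spelling changes, which is also what the Spanish-keyword ALGOL print-out in the post suggests.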

  7. cameron said,

    December 19, 2019 @ 8:14 pm

    My favourite jocular programming language is probably the whitespace programming language.

    When I first read about it, I was disappointed that they didn't make good use of the vertical tab character, but I figured they wanted to save paper.

  8. Ken said,

    December 19, 2019 @ 8:29 pm

    Wenyan looks similar to the Shakespeare language (http://shakespearelang.sourceforge.net/report/shakespeare/), in which programs are plays.

    My favorite aspect of Shakespeare is the way of entering numbers as sequences of adjectives that define powers of two. "You lying stupid fatherless big smelly half-witted coward!" is of course -64: -1 for the derogatory noun, times 2 for each adjective. The "you" obviously causes -64 to be assigned to whichever character is being addressed.
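
    The arithmetic described above can be sketched in a few lines of Python. This is a toy model of the rule as stated in this comment, not the Shakespeare implementation: the noun list is a tiny illustrative subset, the function name `phrase_value` is hypothetical, and the sketch does not check that the words before the noun really are adjectives.

```python
# Illustrative subset only; the real language has its own word lists.
NEGATIVE_NOUNS = {"coward"}


def phrase_value(phrase, negative_nouns=NEGATIVE_NOUNS):
    """Value of a noun phrase: -1 or +1 for the noun, doubled once per adjective."""
    words = phrase.strip("!?. ").split()
    # A leading "you" only names the addressee; it carries no numeric value.
    if words and words[0].lower() == "you":
        words = words[1:]
    *adjectives, noun = words
    base = -1 if noun.lower() in negative_nouns else 1
    return base * 2 ** len(adjectives)


insult = "You lying stupid fatherless big smelly half-witted coward!"
print(phrase_value(insult))  # -64
```

    Six adjectives give 2 to the sixth power, and the derogatory noun supplies the sign, which is how the insult comes out as -64.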

  9. unekdoud said,

    December 20, 2019 @ 1:42 am

    I find it hilarious that the rendered sample (Universal Turing Machine) at the top of the Wenyan page contains a string of 云云云云云云 helpfully syntax-highlighted in red. (This is the natural way for a words-only programming language to close three nested blocks.)

    Anyway, my new favorite of this kind of programming language is Rockstar, "a dynamically typed computer programming language, designed for creating programs that are also song lyrics": https://esolangs.org/wiki/Rockstar

    Math is very well hidden compared to Shakespeare, as the basic operations can be spelled "with, without, of, over", and all numbers/decimals can be represented by words of specific lengths.
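
    As I understand the Rockstar specification, the word-length trick works by taking each word's length modulo 10 as one decimal digit. The Python sketch below shows that idea for whole numbers only; the function name `poetic_number` is hypothetical, and the handling of decimals, apostrophes and hyphens is left out, so it should not be read as a faithful implementation of the language.

```python
def poetic_number(text):
    """Decode words into an integer: one digit per word, letter count modulo 10."""
    digits = []
    for word in text.split():
        letters = sum(ch.isalpha() for ch in word)
        if letters:  # skip tokens that contain no letters at all
            digits.append(str(letters % 10))
    return int("".join(digits))


print(poetic_number("a lovestruck ladykiller"))  # 100
```

    Here "a" has one letter and the two ten-letter words each contribute a zero, so the phrase reads as 100 while still looking like a song lyric.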

  10. Robot Therapist said,

    December 20, 2019 @ 4:18 am

    And there was a famous joke programming language called "C++". Several of the smartest computer scientists got together to design a language so complex that nobody would ever be able to use it correctly. The official "competence tests" for certification in this "language" simply consisted of a short fragment of code, and the question "can you work out what this does?"

    A similar earlier satire was called "APL", by Iverson. People would present one another with fragments and say "bet you can't work out what this does".

  11. Athel Cornish-Bowden said,

    December 20, 2019 @ 7:00 am

    Back in the mists of time when I used MS/DOS, I never did descend to the point of using Windows. However, there was a DOS command called WIN that launched Windows. On my computer I fixed it so that in the unlikely event of my wanting to launch Windows I could type LOSE.

  12. Philip Taylor said,

    December 20, 2019 @ 10:39 am

    I don't know whether C++ is any more of a joke than the language on which it is based (C), but there is a far better candidate for "a language so complex that nobody would ever be able to use it correctly", and that is Algol 68. The language is incredibly powerful, and makes C look like the sad joke that it is, but sadly few practising programmers seem to have had the intellectual ability to rise to the challenge that Algol 68 posed, and the language has now virtually disappeared without trace, with the sole exception of Marcel van der Veer's truly remarkable "Algol 68 Genie" — Marcel accomplished, single-handedly, and in his spare time, something that it is believed took the Control Data (CDC) programming team 20 man-years to achieve — a full-language compiler.

  13. Trogluddite said,

    December 20, 2019 @ 12:18 pm

    @Hyman Rosen
    Or for the ultimate in nerd kudos, Conway's other (in)famous Perl variant; tlhInganHol::yIghun ("The Klingon Language: hey you, program in it!"). The ability to write it using "pIqaD" (Klingon orthography) is stymied, however, by the spoil-sports on the Unicode Technical Committee – no doubt much to the disappointment of some Trekkie coders!

  14. Brahma Sara said,

    December 20, 2019 @ 12:43 pm

    Here's a practical use case. All you have to do is load the compiler and start executing malware scripts in this language to your target computer and it will bypass any intrusion detection systems or antivirus that's looking for malware code. Sweet! Thanks for the tip!

  15. Yerushalmi said,

    December 22, 2019 @ 5:15 am

    As others have stated, there are plenty of esoteric programming languages that were developed solely to have fun. Here are a couple developed by the inestimable David Morgan-Mar:

    * Piet, a language whose code looks like works of modern art

    * Chef, a language whose code looks like recipes

    * Ook!, a language written for orangutans.

  16. Yerushalmi said,

    December 22, 2019 @ 5:15 am

    Gah. Something broke in that last link. Ook! can be found at https://www.dangermouse.net/esoteric/ook.html

  17. Christopher J. Henrich said,

    December 22, 2019 @ 2:01 pm

    Appropriate to the season as well as to Philip Taylor's post: I have seen an ALGOL 68 program in which, by judicious definitions, factorials from 1! to 12! were implemented by the lyrics to The Twelve Days of Christmas.

  18. Philip Taylor said,

    December 22, 2019 @ 2:53 pm

    Ah, that would be this one, I assume :

    Author: John P Baker, University of Bristol
    Published in (and retrieved from):
    ALGOL Bulletin archive
    Issue 42, May 1978
    Pages 50–52
