Unicode: The brontosaurus emoji


Today's xkcd:

Mouseover title: "I'm excited about the proposal to add a 'brontosaurus' emoji codepoint because it has the potential to bring together a half-dozen different groups of pedantic people into a single glorious internet argument."

At least things have gotten better, now that browsers and text editors and terminal emulators have sort of started to more or less implement the Unicode standard. Some dispatches from the bad (or anyhow worse) old days:

"Strange scrambling alphabets", 12/14/2003
"Them old diacritical blues again", 3/21/2004
"Convenience for the wealthy, virtue for the poor", 3/31/2004
"Agbègbè ìpàkíyèsí", 12/15/2008
"It's worse than you thought", 11/21/2012

And an antique Unicode-related xkcd:

Mouseover title: "U+FDD0 is actually Unicode for the eye of the basilisk, though for safety reasons no font actually renders it."


  1. Daniel Barkalow said,

    August 29, 2016 @ 2:38 pm

    I want to see a text layout engine that adjusts the direction of side-view emoji based on the LTR characters, so we can argue about whether brontosauruses face towards the start or end of the line of text.

  2. leoboiko said,

    August 29, 2016 @ 3:31 pm

    At least things have gotten better, now that browsers and text editors and terminal emulators have sort of started to more or less implement the Unicode standard.

    Not included: Languagelog's blogging system (WordPress), which will silently eat non-BMP Unicode characters (after happily displaying them on the comment preview) (tip for Languagelog commenters: go back to the 90s and type '&#x' + Unicode hexadecimal code point + ';'.)

    One argument for the inclusion of characters is the completion argument: we have some country flags, so we should add flags for all countries to complete the set and avoid being partial/ethnocentric. In the ongoing brontosaurus discussion at the Unicode mailing list, I sent the following proposal:

    We obviously need an emoji for every species name listed within The Official Registry of Zoological Nomenclature's ZooBank.

    I propose a new set of Basic Latin characters, the Zoological Nomenclature Indicator Symbols [cf.], to be used for spelling scientific names, which are then rendered as cutesy colorful icons used as mood indicators. A ZOOLOGICAL NOMENCLATURE INDICATOR SYMBOL SPACE must be included to separate name components; sequences including one such separator are assumed to be binomens, and two, trinomens. For example, a cat emoji can be encoded with the Zoological Nomenclature Indicator Symbols corresponding to [FELIS␣CATUS] or, following modern practice, [FELIS␣SILVESTRIS␣CATUS] (biological homonyms are to be treated as alternative encodings of the same abstract emoji).

    Notice that the current emoji set includes such characters as CRYING CAT FACE (U+1F63F) and KISSING CAT FACE WITH CLOSED EYES (U+1F63D), in addition to the default human (or, in a certain vendor, disgusting yellow amœbæ) faces; but no such equivalents for, say, dogs or bunnies, which can be a very dangerous political slight towards dog-people and bunny-people. With some adjustment, Zoological Nomenclature Indicator Symbols can solve the issue once and for all, with perfect neutrality. All of the current face expression emoji are to be decomposed as FACE plus abstract combining characters; for example, U+1F642 SLIGHTLY SMILING FACE will be considered a compatibility variant of FACE + COMBINING SMILE + COMBINING SLIGHT FACIAL EXPRESSION. This would allow a dog version of U+1F63D encoded as: [CANIS␣LUPUS␣FAMILIARIS] + COMBINING FACE + COMBINING KISSING FACIAL EXPRESSION + COMBINING CLOSED EYES, and similarly for any species and expression combination, like, say, a ring-tailed lemur rolling on the floor laughing, or an okapi with tears of joy. (Drawing all possible glyphs is of course not Unicode's problem.)
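
The workaround mentioned in the comment above (typing '&#x' + hexadecimal code point + ';' for characters outside the BMP) can be sketched in a few lines of Python. This is only an illustration: the function name `escape_non_bmp` is hypothetical, and the sketch assumes that "non-BMP" means any code point above U+FFFF and that the receiving system passes HTML numeric character references through unchanged.

```python
def escape_non_bmp(text: str) -> str:
    """Replace each character above U+FFFF with an HTML hexadecimal
    numeric character reference; BMP characters are left untouched.

    Hypothetical helper, not part of WordPress or any library.
    """
    return "".join(
        f"&#x{ord(ch):X};" if ord(ch) > 0xFFFF else ch
        for ch in text
    )


# U+1F63F CRYING CAT FACE lies outside the BMP, so it is escaped.
print(escape_non_bmp("cat \U0001F63F"))  # cat &#x1F63F;
```

Characters inside the BMP, such as the accented letters in "Agbègbè", pass through as they are, so only the characters at risk of being eaten are rewritten.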

  3. Jerry Friedman said,

    August 29, 2016 @ 3:42 pm

    leoboiko: Excellent suggestion! Except for the implicit pro-mammalian bias. See for example the happy-face darner, a dragonfly that can be encoded as [AESHNA_PALMATA] (and is known to conventional English speakers as the paddle-tailed darner).

  4. January First-of-May said,

    August 29, 2016 @ 10:28 pm

    Incidentally, in ch*rping 2016, I still get the wrong diacritics (at the linked 2004 post) on Firefox 47.0 under Windows XP.
    I'll check the Windows 10 version in the morning, admittedly – maybe it's an XP thing and the other guys handle it better.

    Can't recall ever actually using Unicode emoji, however. (And I still think the proper smiley is colon, hyphen, bracket – though the Russian tradition these days is to use the bracket only.)

  5. Jakob said,

    August 30, 2016 @ 2:48 am

    Apatosaurus, surely?

  6. Alon Lischinsky said,

    August 30, 2016 @ 4:44 am

    @Jakob: what single glorious internet argument did you think Randall Munroe was alluding to in the mouseover?

  7. Michael Watts said,

    August 30, 2016 @ 7:15 am

    One argument for the inclusion of characters is the completion argument: we have some country flags, so we should add flags for all countries to complete the set and avoid being partial/ethnocentric.

    This isn't a very good argument. No country flag is a part of Unicode, for precisely the reason that countries change all the time and it would be incredibly politically awkward to standardize on a set of countries.

    See https://esham.io/2014/06/unicode-flags.

  8. leoboiko said,

    August 30, 2016 @ 7:49 am

    @Michael Watts: That's exactly the rationale behind the Regional Indicator Symbols as an encoding for emoji – RIJ 🇺 + RIJ 🇸 display as the U.S. flag 🇺🇸 on compatible systems – which resolves the completion argument by allowing all possible flags, ever, while sidestepping taking a political stance regarding Taiwan and so on.

    This is what I was parodying with my absurd proposal of an analogous set of Zoological Nomenclature Indicator Symbols. Just like the Regional Indicators are Latin letters used to spell region codes which are then rendered as flags, allowing a complete representation of all the flags, the Zoological symbols would spell species names, which would then be rendered as cute icons. So a bronto/apatosaurus emoji would be encoded either as [BRONTOSAURUS␣EXCELSUS] or [APATOSAURUS␣AJAX] or any other current or future name, and Unicode avoids the political stance of deciding pro- or against the name "Brontosaurus", while encoding the set of all possible animal emoji, current or future.
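
The Regional Indicator mechanism described in the comment above, where Latin letters spelling a region code are mapped to indicator symbols that a compatible system renders as a flag, can be sketched as follows. The function name `flag_sequence` is hypothetical; the sketch relies on the 26 Regional Indicator Symbol letters occupying the contiguous range U+1F1E6 to U+1F1FF, and it only builds the code point sequence. Whether a flag glyph actually appears is up to the font and platform.

```python
REGIONAL_INDICATOR_A = 0x1F1E6  # U+1F1E6 REGIONAL INDICATOR SYMBOL LETTER A


def flag_sequence(region_code: str) -> str:
    """Map a two-letter region code (e.g. "US") to the corresponding
    pair of Regional Indicator Symbols.

    Hypothetical helper; it does not check that the code is assigned.
    """
    code = region_code.upper()
    if len(code) != 2 or not all("A" <= ch <= "Z" for ch in code):
        raise ValueError("expected two ASCII letters, got %r" % region_code)
    return "".join(
        chr(REGIONAL_INDICATOR_A + ord(ch) - ord("A")) for ch in code
    )


us = flag_sequence("US")
print([f"U+{ord(ch):04X}" for ch in us])  # ['U+1F1FA', 'U+1F1F8']
```

Because the function accepts any two letters, it reflects the point made in the comment: the encoding itself takes no position on which regions exist, and leaves that decision to whoever draws the glyphs.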

  9. mollymooly said,

    August 30, 2016 @ 7:57 am

    @Alon Lischinsky:
    The Apatosaurus lobby is only one of the "half-dozen different groups of pedantic people". Others are listed here.

  10. January First-of-May said,

    August 31, 2016 @ 3:32 am

    @ my previous post:
    Nope, works all right on Firefox 44 under Windows 10. So they must have fixed it in the meantime.
    I guess it might also be an available-fonts thing, but I don't know how to control for that correctly.
