Font making for oracle bone inscription studies


"Jingyuan Digital Platform: Font Making and Database Development for Shang Oracle Bones (Part 1)", Peichao Qin, The Digital Orientalist (9/17/24)

If you're wondering what "Jingyuan" means, it's a fancy, allusive way to say "Mirrored contexts [for thorough investigations]" ([gézhì] jìngyuán [格致]鏡原) (source), just a means for the creator of the platform to give it a proprietary designation.

A goodly proportion of Language Log readers probably have some idea of what oracle bone inscriptions are, but just to refresh our memories and for the benefit of new and recent readers who are not familiar with the history of Sinographic scripts, I'm going to jump right into the third paragraph of Qin's article, which is like a basic primer of oracle bone inscription studies.

Oracle bone inscriptions (OBI), also known as the oracle bone script, can be dated to the later part of the Shang dynasty (ca. 1250 B.C. – 1046 B.C.). It is the inscriptional product of pyromantic divination conducted by Shang elites, a controlled process of systematic drilling of hollows and burning of metal rods to produce cracks on turtle shells or ox scapulae, and the subsequent record-keeping practices of Shang scribes to keep track of the relevant divination events (also see Henry’s post). The oracle bone script, in general, is known to possess highly complex character structures and a huge number of variant forms. Since its first discovery in 1899, over 4,000 characters and 50,000 distinct variants have been identified. The 125-year-long development of the scholarship has produced a lot of useful literature related to the decipherment studies of individual characters and transcriptions for published oracle bone corpora, offering invaluable materials for relevant linguistic and historical examinations of the Shang dynasty. However, the lack of font support and coherent encoding for both archaic and modern forms of the oracle bone characters, and the long absence of efficient database query support have often made the field rather difficult to navigate for both beginners and advanced learners. The input of oracle bone glyphs and database building have been constantly relying on copying and pasting rubbings [of] images which are not so easily indexed and searched.

To return to the beginning of the post, wherein the author gives the rationale for their creation of the platform:

Tired of struggling to find and type out complex oracle bone script characters? You’re not alone. For years, scholars and enthusiasts alike have faced the frustrating challenges of working with these ancient inscriptions—challenges that stem from the lack of a proper font and efficient search tools. An insane number of characters, variants and transcriptions are out there right now thanks to more-than-a-century-long discoveries and research. Imagine spending hours and hours just trying to locate a single glyph or having to manually piece together characters from a mixture of strokes and blot marks using low-resolution rubbings. This not only creates problems for scholars who want to read the texts and search for the relevant literature, but also for enthusiasts who just want to type the character and create non-pixelated artworks. This was the reality for the oracle bone script, until now.

I’m excited to introduce the Jingyuan Digital Platform, a brand-new solution designed to transform how we interact with Shang oracle bone inscriptions. This platform offers two major game-changing tools: the world’s first ultra high-resolution font for oracle bone script (available for free download here) and a comprehensive, user-friendly search engine for these ancient glyphs. Whether you’re typing in Word, designing a poster, or conducting in-depth research, the platform streamlines the entire process, making it faster, easier, and more accurate. Plans and proposals for lexicon, dictionary and thesaurus creation are also in place, which makes the website more worthwhile to watch out for in the future.

The author explains:

At the moment, the site consists of four major modules, with others still under active development:

    1. A high-resolution oracle bone font.
    2. A database including over 52,288 glyphs.
    3. A Multi-purpose text editor for inputting transcriptions.
    4. Geographical and timeline visualizations.

In this two-part article, I will explain the programming technology that made them possible and the academic considerations behind the creation of these modules. In doing so, I hope that some reflections on my attempts regarding font development, database building, and interface design can be helpful for palaeography studies and the general field of DH.

The remainder of this long first part of the two-part post explains in technical detail how the font is constructed and how it is accessed and applied. The author concludes:

In general, this combination of the font as the base glyph compilation and the database interface as the base search tool firmly guarantees the correct input of desired oracle bone graphs and sets the foundation for future development of a genuine glyph database that covers the functions of a dictionary, lexicon, and eventually a transcription corpus. Some effort has to be made in order to become familiar with the functionalities of these modules of course. But compared to the current academic situation where everything is done by copy-pasting images, it is no doubt a worthwhile attempt towards the efficient utilization of the textual resources this field has to offer.

This is indeed a great improvement over the "drawing" and "copy-pasting" of individual glyphs (don't forget that there are 50,000+ of 'em!) that has heretofore constituted the state of the field. It's an ambitious project, but it remains to be perfected and put to use.

A closing note. This post by Peichao Qin appeared in The Digital Orientalist, which has also published scores of other posts on the application of DH and AI to the study of South Asian, Central Asian, East Asian, African, Middle Eastern, and other languages.

 

Selected readings

[Thanks to Geoff Wade]



7 Comments

  1. Chris Button said,

    September 19, 2024 @ 6:09 am

    Wow! What an achievement.

    Until now, I've been using Mojikyo and substituting with the Shirakawa Shizuka font for gaps.

  2. Jonathan Smith said,

    September 19, 2024 @ 5:56 pm

    Yeah, this is awesome. Site bookmarked; font downloaded…

    A caveat is that while free-hand drawing was always a pretty hazardous idea, copied/pasted (photographs of) rubbings etc. will remain essential so that one can understand/evaluate the choices/interpretations being made. After all the "characters" are not really drawn from some discrete well-defined set.

  3. Chas Belov said,

    September 19, 2024 @ 9:04 pm

    While I have no practical use for this font, it is way cool and I have downloaded it. I also copied the Chinese-language download page into a LibreOffice doc and set the font for the text to Oracular. It is fun to see how many/few modern Chinese characters have a counterpart in the oracle bone font.

    Alas, I am getting random errors where characters are jammed together and overprint. While I suspect the problem is with LibreOffice and not the font, since I'm getting two different versions of the overprint at different times, I wonder whether there are any issues with the font itself.

    I intend to file the issue with LibreOffice and will also report it to the font creators.

  4. Chas Belov said,

    September 19, 2024 @ 9:05 pm

    I'll note that I seem to have a tropism for bugs, so your results may vary.

  5. Chas Belov said,

    September 19, 2024 @ 9:08 pm

    The overprinting seems to have cleared up when I switched to my browser to make the above posts then switched back to LibreOffice. I don't know whether there was a delay loading certain information from the font or if it's the age-old computer behavior of clearing up a problem by describing it to somebody else.

  6. Chris Button said,

    September 20, 2024 @ 6:31 am

    And with so many variants to choose from too! What a staggering achievement. I look forward to examining in more detail.

    The subscription-only CHANT database used to have fonts too (I think ICS3, ICS4), which used to be freely downloadable but are not anymore. However, I don't recall them looking all that great. The Mojikyo ones look great, but they are somewhat limited in scope.

    Incidentally, it's nice to see this coming out of the University of Cambridge. When I studied there, no-one was researching this stuff at all.

  7. Karl said,

    September 28, 2024 @ 1:07 am

    With the increase of digitalization and the number of highly-educated people with time on their hands, a fun and doable project would be to create oracle bone characters out of current Chinese characters. The development of another set of Chinese graphs could lead to untold hours of head-scratching among members of the uninitiated. And maybe a few sadistic Chinese teachers would even choose to inflict it upon unsuspecting non-native learners, arguing that "Once you learn the oracle bone form, learning the complex and simplified is easy."
