Yet again the Voynich manuscript
Since perhaps as early as 1640, decipherers have tried practically everything to decode the maddeningly frustrating Voynich manuscript. So far it has resisted all efforts to identify the language in which it was presumably written. About the only way to make further progress in cracking the code is to apply some new technology. As described in the following reports, it seems that a type of digital enhancement has become available and been used to fill in some of the gaps in the manuscript.
The first is the primary document, "Multispectral Imaging and the Voynich Manuscript", which appears on Lisa Fagin Davis' blog, Manuscript Road Trip (9/8/24). She begins with an explanation of what the technology consists of.
Multispectral imaging is a way of capturing a digital image using non-visible wavelengths such as ultraviolet and infrared (click here to learn more). Where medieval manuscripts are concerned, UV imaging in particular can make faded or effaced text legible. This is because most medieval inks (including that used to write the Voynich Manuscript) have a significant iron component. This allows the ink to “bite” into the surface of the parchment rather than sliding off of it. When ink is scraped away or fades, the molecular bond remains, and the faded text may therefore fluoresce when exposed to UV bandwidths. This technology has proven invaluable in helping scholars read palimpsests and damaged manuscripts such as the Archimedes Palimpsest and the Syriac Galen Palimpsest. Could such imaging of the Voynich Manuscript help reveal its secrets?
What follows is a lengthy and highly detailed description of Fagin Davis' analysis of the new data provided by the application of multispectral imaging to the Voynich manuscript.
The second report is Jennifer Ouellette's "New multispectral analysis of Voynich manuscript reveals hidden details: Handwriting suggests Prague doctor named Johannes Marcus Marci tried to decode in 1640" in Ars Technica (9/9/24), which digests and summarizes Fagin Davis' post, while adding amplifications of her own. Here are pertinent portions that highlight the significance of the latest findings.
About 10 years ago, several folios of the mysterious Voynich manuscript were scanned using multispectral imaging. Lisa Fagin Davis, executive director of the Medieval Academy of America, has analyzed those scans and just posted the results, along with a downloadable set of images, to her blog, Manuscript Road Trip. Among the chief findings: Three columns of lettering have been added to the opening folio that could be an early attempt to decode the script. And while questions have long swirled about whether the manuscript is authentic or a clever forgery, Fagin Davis concluded that it's unlikely to be a forgery and is a genuine medieval document.
As we've previously reported, the Voynich manuscript is a 15th century medieval handwritten text dated between 1404 and 1438, purchased in 1912 by a Polish book dealer and antiquarian named Wilfrid Voynich (hence its moniker). Along with the strange handwriting in an unknown language or code, the book is heavily illustrated with bizarre pictures of alien plants, naked women, strange objects, and zodiac symbols. It's currently kept at Yale University's Beinecke Library of rare books and manuscripts. Possible authors include Roger Bacon, Elizabethan astrologer/alchemist John Dee, or even Voynich himself, possibly as a hoax.
There are so many competing theories about what the Voynich manuscript is—most likely a compendium of herbal remedies and astrological readings, based on the bits reliably decoded thus far—and so many claims to have deciphered the text, that it's practically its own subfield of medieval studies. Both professional and amateur cryptographers (including codebreakers in both World Wars) have pored over the text, hoping to crack the puzzle.
So much by way of introduction. Clearly, what the Voynich manuscript (VM [not VHM!]) is and what it means is still very much up in the air, but that doesn't prevent VM enthusiasts from throwing in their precious lot.
Among the most dubious is a 2017 claim by a history researcher and television writer named Nicholas Gibbs, who published a long article in the Times Literary Supplement about how he had cracked the code. Gibbs claimed that he had figured out that the Voynich Manuscript was a women's health manual whose odd script was actually just a bunch of Latin abbreviations describing medicinal recipes. He provided two lines of translation from the text to "prove" his point. Unfortunately, said the experts, his analysis was a mix of stuff we already knew and stuff he couldn't possibly prove.
Fagin Davis was among Gibbs' most vocal critics. She also did not mince words when critiquing the 2019 claims of Gerard Cheshire, an honorary research associate at the University of Bristol, when he announced his own solution. Cheshire claimed the mysterious writing was a "calligraphic proto-Romance" language, and he thought the manuscript was put together by a Dominican nun as a reference source on behalf of Maria of Castile, queen of Aragon. "Sorry, folks, 'proto-Romance language' is not a thing," Fagin Davis tweeted at the time. "This is just more aspirational, circular, self-fulfilling nonsense." Two days after the initial announcement of Cheshire's "breakthrough," the University of Bristol released a statement retracting its original press release.
Now we come to the nitty gritty of what Fagin Davis' blog post achieves. It is a good example of a responsible, resourceful, determined scholar resurrecting valuable data that had been collected a decade earlier but had lain dormant during the interim.
Per Fagin Davis, in 2014, the Beinecke Library granted permission to the imaging team from The Lazarus Project to take multispectral images of ten pages from the Voynich manuscript with the intent of making them publicly available online. For various reasons, the images weren't posted. Fast forward to 2024, when Roger Easton of the Rochester Institute of Technology—who was a member of the original imaging team—noticed an article Fagin Davis had written and emailed asking if she would like to examine the images. She was interested so Easton spent the last three weeks reprocessing the multispectral images to produce the current image set.
…
When the manuscript first came into Wilfrid Voynich's hands in 1912, he noted that the first page had an effaced inscription in the lower margin, applying a chemical reagent to the page around 1914 to make it more visible. He thought he could make out a signature: "Jacobi à Tepenecz," aka an alchemist in Prague named Jacobus Sinapius, who probably owned the manuscript in the late 16th or early 17th century.
Fagin Davis's analysis confirmed Voynich's discovery. She also noted that there was no evidence that the Voynich manuscript is a palimpsest, i.e., parchment that had been reused and thus showed evidence of underwriting. That would have helped refine the manuscript's date of origin. Carbon-14 testing puts the date as around 1425, which Fagin Davis thinks is likely since the illustrations are consistent with that period, but some scholars disagree. Nor is the manuscript likely to be a modern forgery.
…
More recently, Voynich scholars had noted what seems to be a Roman alphabet written in the right-hand margin of that first page. Multispectral imaging clearly reveals the letters a, b, c, d, and e, according to Fagin Davis. In fact, there are actually three columns of lettering, not just one: the Roman alphabet, a series of Voynich characters, and another Roman alphabet, this time offset by one letter. Fagin Davis did her own preliminary transcription of those alphabets and concluded that this is most likely an early attempt to decode the manuscript. But who had made the attempt?
To find out, Fagin Davis combed through several letters written in the so-called "humanistic bookhand" commonly used by Petrarch and Boccaccio in 14th-century Italy, since the two Roman alphabet columns in the Voynich manuscript were also written in that style. She compared those handwriting samples with the columns in the Voynich manuscript.
One was a very close match: a September 12, 1640 letter to Athanasius Kircher written by Johannes Marcus Marci, a doctor in Prague who inherited the manuscript from his friend Georg Baresch when the alchemist died in 1662. Marci sent the manuscript to Kircher in Rome in 1665, hoping that the Jesuit scholar and polymath would be able to decipher it.
Fagin Davis identified several "strong markers" between the two handwriting samples that she thinks identify Marci as the would-be decoder. For instance, at this time in the 17th century, many people used prominent loops on the letters b, d, f, h, p, q, s, and y, but Marci did not. Nor did the person who wrote the two Roman alphabet columns on that page of the Voynich manuscript. Marci also sometimes used an "open bowl" g, an m with a taller first stroke than the last, and a distinctive shape to his z's—all of which are consistent with the handwriting sample in the Voynich manuscript.
That said, anyone hoping this multispectral analysis of the scans will finally solve the mystery of the Voynich manuscript once and for all is bound to be disappointed, although any new textual evidence is significant for scholars.
"These alphabets will likely not help us actually decipher the manuscript," Fagin Davis wrote on her blog. "This is because linguists… and other researchers have established that the manuscript is almost certainly not encrypted using a simple substitution cipher, and the substitutions in these columns result in nonsense anyway. Even so, they do add an interesting and new chapter to the early history of the manuscript. I look forward to hearing from other researchers about this new evidence, especially from experts in cryptography who may have ideas about why Marci or any other early-modern decrypter would need three columns of alphabets to do their work."
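To make the three-column layout and the notion of a "simple substitution cipher" concrete, here is a minimal Python sketch. It is an illustration only, not a reconstruction of the marginal columns: the glyph names (`g1`, `g2`, ...) are hypothetical placeholders for Voynich characters, and the modern 26-letter alphabet is an assumption.

```python
import string

# Assumption: a modern 26-letter alphabet. The marginal columns in the
# manuscript are only partly legible and need not match this inventory.
ALPHABET = string.ascii_lowercase


def offset_alphabet(shift: int = 1) -> str:
    """Return the alphabet rotated left by `shift` letters."""
    shift %= len(ALPHABET)
    return ALPHABET[shift:] + ALPHABET[:shift]


def three_columns(glyphs, shift=1):
    """Build rows of (plain letter, placeholder glyph, offset letter)."""
    shifted = offset_alphabet(shift)
    return [(ALPHABET[i], g, shifted[i]) for i, g in enumerate(glyphs)]


def substitute(glyph_text, key):
    """Simple substitution: one glyph maps to one letter, '?' if unknown."""
    return "".join(key.get(g, "?") for g in glyph_text)


# Hypothetical glyph names standing in for Voynich characters.
glyphs = ["g1", "g2", "g3", "g4", "g5"]
rows = three_columns(glyphs)

# Use the first letter column as the substitution key.
key = {glyph: plain for plain, glyph, _ in rows}
print(rows[:2])
print(substitute(["g3", "g1", "g2", "g9"], key))
```

A key of this kind maps each glyph to exactly one letter, which is the sort of scheme that, according to the passage quoted above, the manuscript almost certainly does not use.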
Both articles have copious photographs demonstrating how the multispectral imaging brings out details that are not visible to the naked eye.
Despite the hard-won, hitherto unknown data about much earlier attempts to decode the VM provided by multispectral analysis, which tilts the balance in favor of the conclusion that this most vexing cultural artifact is not a forgery or a hoax, we still don't know what this elaborate, illustrated text is communicating. VM case not closed.
Selected readings
- "Voynich and midfix" (7/3/04)
- "Voynich code cracked?" (5/16/19)
- "The indecipherability of the Voynich manuscript" (9/11/19)
- "The Voynich Manuscript in the undergraduate curriculum" (10/10/19)
- "ChatGPT: Theme and Variations" (2/21/23) — CHAT 2
- "Once again the Voynich manuscript" (4/21/24)
- "Latin, Hebrew … proto-Romance? New theory on Voynich manuscript: Researcher claims to have solved mystery of 15th-century text but others are sceptical", Esther Addley, The Guardian (5/15/19)
- "Inscription decipherment with digital image enhancement" (12/1/20)
[Thanks to Hiroshi Kumamoto]
Em said,
September 11, 2024 @ 12:29 pm
I find it remarkable that the line of explanation defended (convincingly in my view) by Torsten Timm* has never been featured on here afaict, and more generally received very little attention, while clearly bogus claims of decipherment regularly make the rounds. I myself just heard about it from twitter.
The claim is that the manuscript could just be procedurally generated nonsense, some sort of stochastic lorem ipsum. This explains certain properties of the text (very repetitive word forms in particular) and of course also the fact that all attempts at decipherment have been failures. It's less sexy than an exotic language written in an otherwise unattested alphabet or an elaborate encryption scheme, of course.
* See for instance https://www.tandfonline.com/doi/abs/10.1080/01611194.2019.1596999
JMGN said,
September 11, 2024 @ 5:18 pm
@Em But can we prove it is undoubtedly a sham language?
Andrew McCarthy said,
September 11, 2024 @ 5:48 pm
To me it seems most likely that the Voynich Manuscript is a forgery, but an old forgery: glyphs written intentionally without meaning in a bogus script meant to look impressive, made by Early Modern con-artist alchemists in the hope of selling it to a rich patron – and then perhaps selling them bogus "translations" of the text, for an additional fee, no doubt. But clearly, before the 17th century ended, it passed into the hands of people who thought the text encoded some sort of meaning, and tried to decipher it accordingly.
The analysis by Torsten Timm mentioned above would fit very well with the idea that Voynich is an extremely well-made con job from several centuries ago.
Paul Garrett said,
September 11, 2024 @ 5:51 pm
Obvious, knee-jerk reaction: let's see what a generic AI can do if asked to create something analogous… :)
Peter Grubtal said,
September 12, 2024 @ 2:13 am
The last time this came up I was more of the opinion too, that it's a 16th/17th C. hoax.
But this has solid reasoning pointing the finger at Voynich himself.
Benjamin E. Orsatti said,
September 12, 2024 @ 7:43 am
https://web.archive.org/web/20130430123254/http://www.tabletmag.com/jewish-arts-and-culture/books/129131/cracking-the-voynich-code?all=1
Well..?
Stephen Goranson said,
September 12, 2024 @ 10:33 am
The proposal that Voynich himself forged it seems excluded because the old annotation predates him and because it is not a palimpsest, meaning a modern forger would have needed to have obtained a large block of old but unused writing surface.
On the other hand, it still could be an old fake.
David Marjanović said,
September 12, 2024 @ 10:42 am
I've never heard of any alphabet made up from scratch "to record a foreign language" around that time. Other centuries, yes, occasionally, but not then.
Also, if you read on, you find:
David Marjanović said,
September 12, 2024 @ 10:43 am
That is actually possible, says the Jewish Arts and Culture article, and moreover the parchment of different pages isn't the same age.
Too bad there's apparently no carbon in the ink.
Benjamin E. Orsatti said,
September 12, 2024 @ 11:11 am
David M.,
Yeah, the rest of the article spoils the magic. The part you cited, where the syllable distribution plot more or less proves that someone just took a grid and 90'd it to generate new syllables seems like the finishing blow.
If it's gibberish, do we care if it's 15th-century or 20th-century gibberish?
Stephen Goranson said,
September 12, 2024 @ 11:17 am
" Hodgins estimates with 95 percent certainty that the animal died between 1404 and 1438." If so, this is not a later pieced-together miscellaneous batch of skins.
The post-2002 fake "Dead Sea Scrolls" were small snippets.
The VM is not like those.
cameron said,
September 12, 2024 @ 11:34 am
it's interesting that in this context "artwork" and "hoax" are equivalent
to be a salable "exotic" manuscript the document has to be interesting, so through the hoaxer's efforts it ends up exhibiting what Kant calls Zweckmässigkeit ohne Zweck
Scott P. said,
September 12, 2024 @ 2:41 pm
If it's gibberish, do we care if it's 15th-century or 20th-century gibberish?
Certainly, in the same way we'd care if an Egyptian papyrus bore a 3rd century CE gibberish magical incantation or if it were a 20th-century forgery.
Benjamin E. Orsatti said,
September 12, 2024 @ 2:51 pm
Scott P. reveals an interesting feature of English ambiguity. I meant, "should _we_ care more about a 600- than a 100-year-old forgery" — intending "we" as "people-on-language-log", who would only be interested in the thing _as_ a linguistic artifact — but Scott P. (quite legitimately) interpreted it as "we" (i.e. your average citizen on the internet with a passing interest in archaeology).
Vulcan with a Mullet said,
September 12, 2024 @ 4:37 pm
I am with the camp that it's a creation of pure imagination by a weird artist who either made up his own language and alphabet (people did this way before Tolkien and modern conlangs) or it is truly gibberish, and either way it's fascinating. I would never dislodge it from its artistic position even if it did prove to be 20th century, which seems unlikely but definitely in the realm of possibility. I just think it's not acknowledged by most people who know about it, and they assume it MUST have some "secret message".
Philip Anderson said,
September 13, 2024 @ 8:14 am
@Benjamin E. Orsatti
But are "we" as "people-on-language-log", really only interested in linguistic artefacts? Most of us do have other interests, and for me the riddle is not yet solved until the creator is, if not identified, at least located in time and space (even if the linguistic mystery has been dealt with).
It seems odd that if such an impressive manuscript has been around since the 15th century, it was not recorded anywhere by an earlier owner. Maybe that will change. Maybe more information can be extracted about the animals who contributed their skins, or the ink.
Benjamin E. Orsatti said,
September 13, 2024 @ 10:06 am
I wouldn't be surprised either way — imagine how many "priceless artifacts" have been "uncovered" at estate sales or flea markets just because some "old [book/scepter/ark]" had been shuffled from one dark attic to the next over the centuries.
Philip Andersaid said,
You and Marty Heidegger, buddy.
Benjamin E. Orsatti (& family) said,
September 13, 2024 @ 10:09 am
For the linguists out there, quick — which word in the above post had I been typing when my attention was diverted by anxiety as to whether I would correctly enter the html code for the immanent block quote?
David Marjanović said,
September 13, 2024 @ 10:10 am
Oh yes. If it's from the 15th, it tells us about the history of gibberish, of what people at the time thought a mysterious artefact ought to look like or whatever. It would be like the history of science fiction, complete with zeerust and everything.
Stephen Goranson said,
September 13, 2024 @ 10:59 am
"Nonsense is nonsense, but the history of nonsense is scholarship."
Saul Lieberman introducing Gershom Scholem.
Benjamin E. Orsatti said,
September 13, 2024 @ 11:04 am
DM: Oddly enough, it was the "zeerust" that put your comment together for me. My "go-to" is the Jetsons' flying cars and how everything "future" in the '60's had to have these weird transmitter-type rings around them. I get it now, though. Alchemy is still cool even if it doesn't work, for example.
SG: I'll bet that was "zippier" in the original Yiddish?
Stephen Goranson said,
September 13, 2024 @ 11:39 am
BEO, It was not originally Yiddish, but an introduction to a lecture in English.
Claire said,
September 13, 2024 @ 8:11 pm
Readers might be interested in this overview that Knowable magazine published https://knowablemagazine.org/content/article/society/2021/can-statistics-help-crack-mysterious-voynich-manuscript, as well as the overview in the Annual Review of Linguistics that Luke Lindemann and I wrote: https://www.annualreviews.org/content/journals/10.1146/annurev-linguistics-011619-030613
Both papers summarize the current state of knowledge of the language underlying the manuscript
Torsten Timm said,
September 14, 2024 @ 6:58 am
Andreas Schinner and I present a detailed analysis of the Voynich text. Based on this analysis we present a concrete text generator algorithm (the "self-citation" process), easily executable without additional tools even by a medieval scribe. Therefore in our eyes the "self-citation" is the most compact way to describe the features of the Voynich text. (see https://doi.org/10.1080/01611194.2019.1596999)
We assume that the author of the VMS had developed a method to manually generate some sort of lorem ipsum text. We assume that the medieval scribe generated the algorithm intuitively, rather than on a conceptual basis, with growing experience amplifying the peculiarities of the copying process. An experiment by Gaskell and Bowern (2022) strengthens our viewpoint. Gaskell and Bowern recruited volunteers to write short "gibberish" documents, as a basis for a statistical comparison with the VMS and linguistically meaningful texts. Gaskell and Bowern write: "Informal interviews and class discussions confirmed that many participants did indeed adopt this type of approach to create their texts" while referring to our "self-citation" algorithm. (see https://ceurws.org/Vol-3313/paper4.pdf). In this way the "self-citation" method is an intuitive and easy way to produce asemantic writing.
The "self-citation" method is also known as the natural way to generate meaningless filler text to conceal encoded messages. The Voynich manuscript expert Mary D’Imperio of the NSA writes: "The scribe, faced with the task of thinking up a large number of dummy sequences, would naturally tend to repeat parts of neighboring strings with various small changes and additions … ". D’Imperio, M. E. 1978. The Voynich Manuscript – An Elegant Enigma. Laguna Hills, CA: Aegean Park Press.
Therefore we argue that any scribe creating language mimicking gibberish will sooner or later replace the tedious task of inventing more and more words by the much easier reduplication of existing text (and stick with this strategy). (see Timm & Schinner 2024, p. 318, https://doi.org/10.1080/01611194.2023.2225716)
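For readers who want a feel for the idea, the copy-and-modify process described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the algorithm published by Timm and Schinner: the glyph inventory `GLYPHS`, the seed words, the `window` of recent words, and the change probability `p_change` are all hypothetical choices made for the example.

```python
import random

# Hypothetical glyph inventory, chosen only for illustration.
GLYPHS = "acdehiklnoqrsty"


def mutate(word, rng):
    """Apply one small change: substitute, insert, or delete a glyph."""
    op = rng.choice(["substitute", "insert", "delete"])
    pos = rng.randrange(len(word))
    if op == "substitute":
        return word[:pos] + rng.choice(GLYPHS) + word[pos + 1:]
    if op == "insert":
        return word[:pos] + rng.choice(GLYPHS) + word[pos:]
    # Delete a glyph, but leave words of two glyphs or fewer unchanged.
    return word[:pos] + word[pos + 1:] if len(word) > 2 else word


def self_citation(seed, n_words, window=20, p_change=0.7, rng_seed=0):
    """Grow a text by copying recently written words, usually with a tweak.

    `seed` must be a non-empty list of non-empty words.
    """
    rng = random.Random(rng_seed)
    text = list(seed)
    while len(text) < n_words:
        # Pick a source word from the last `window` words already written.
        source = rng.choice(text[-window:])
        if rng.random() < p_change:
            text.append(mutate(source, rng))
        else:
            text.append(source)  # exact copy, i.e. a plain repetition
    return text[:n_words]


# Illustrative seed words in a Latin-letter transcription style.
sample = self_citation(["daiin", "chedy", "qokeey"], n_words=50)
print(" ".join(sample))
```

Because every new word derives from a nearby earlier one, the output tends to contain families of similar-looking words and occasional exact repeats, which is the qualitative behaviour the comment attributes to the Voynich text.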
The overview paper of Bowern and Lindemann published in the Annual Review of Linguistics is based on unfounded assumptions. For instance, full reduplication, in which an entire word is repeated, is also common in Voynich text. However, it is not within the realm of plausibility for natural language texts as assumed by Bowern and Lindemann. Bowern and Lindemann argue that the range among the samples in our language corpus is between 0.02-4.8%. A value of 4.8% for full reduplications is far too high for natural languages. Bowern and Lindemann most likely used an inadequate parsing method for Wikipedia pages and therefore achieved questionable statistical results (see https://doi.org/10.1080/01611194.2021.1911875, p. 15).
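A percentage like the ones quoted above depends on how "full reduplication" is counted. The sketch below uses one plausible definition, the share of adjacent token pairs in which both tokens are identical; this definition is an assumption for illustration and may not match the one used in either paper.

```python
def full_reduplication_rate(tokens):
    """Share of adjacent token pairs in which a word is immediately repeated.

    Assumed definition for illustration; published figures may be computed
    differently (for example, per token or after other preprocessing).
    """
    if len(tokens) < 2:
        return 0.0
    repeats = sum(1 for a, b in zip(tokens, tokens[1:]) if a == b)
    return repeats / (len(tokens) - 1)


# Toy token stream: 3 of the 6 adjacent pairs are exact repeats.
tokens = "a a b c c c d".split()
print(f"{full_reduplication_rate(tokens):.1%}")
```

Tokenization matters a great deal here: markup, tables, or list residue left in a corpus can create artificial repeats, which is the kind of parsing problem the comment suspects.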
Another idea of Bowern and Lindemann is that of a relation between illustration and text. This idea goes back to a paper of Montemurro et al. from 2013 (https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0066344). The research of Montemurro et al. is based on the idea that "uninformative words tend to have an approximately homogeneous (Poissonian) distribution" and "the most relevant words are scattered more irregularly, and their occurrences are typically clustered". However, they did not verify whether words with a homogeneous (Poissonian) distribution are present in the Voynich text. The paper of Bowern and Lindemann from 2021 assumes that "Montemurro et al. (2013) use techniques from information theory to identify which words are most likely to contribute to topics in texts. That is, they identify words that are more uniformly distributed throughout the Voynich Manuscript and compare them with those that tend to cluster." (see Bowern and Lindemann 2021). But Bowern and Lindemann also didn't verify if uniformly distributed words exist in the Voynich text. In fact, uniformly distributed words do not exist (see Timm & Schinner 2021, p. 6 https://www.tandfonline.com/doi/full/10.1080/01611194.2021.1911875). Therefore, Montemurro et al., along with Claire Bowern, incorrectly assume that it is possible to differentiate between uniformly distributed words and topic-specific words within the Voynich manuscript. This assumption leads them to a flawed conclusion: they erroneously infer that there is a meaningful correlation between the topics suggested by the manuscript's illustrations and the distribution of words in the text. See also the review by Chris Chrisomalis, "Is the Voynich Manuscript structured like written language?" (https://glossographia.com/2013/06/24/is-the-voynich-manuscript-structured-like-written-language/), and that of René Zandbergen: This does not demonstrate "that the text variations are caused by different subject matter (as suggested in by Montemurro and Zanette). If that were the case, the difference between herbal A and herbal B should not exist. The cause of the (statistical) language variation is still unexplained." (see René Zandbergen: https://www.voynich.nu/extra/curabcd.html).
In the VMS, frequently used tokens differ from page to page. Moreover, when we look at the three most frequent words on each page, for more than half of the pages two of the three will differ in only one detail. No obvious rule can be deduced which words form the top-frequency tokens at a specific location, since a token dominating one page might be rare or missing on the next one. (see Timm & Schinner 2020, p. 3). However, function words (like conjunctions, articles etc.) do not appear contextual in natural languages, but rather serve to implement grammatical structures, and they normally do not have co-occurring similar words of comparable frequency (see https://arxiv.org/abs/1601.07435).
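The per-page observation above can be checked mechanically on any transcription. The following Python sketch is illustrative only: it treats "differ in only one detail" as an edit distance of exactly 1, which is an assumption (the papers may define similarity differently), and the example page is made up.

```python
from collections import Counter
from itertools import combinations


def edit_distance(a, b):
    """Levenshtein distance between two words (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


def top_tokens(page_tokens, k=3):
    """The k most frequent tokens on one page."""
    return [word for word, _ in Counter(page_tokens).most_common(k)]


def has_near_pair(words):
    """True if any two of the words are exactly one edit apart (assumed criterion)."""
    return any(edit_distance(a, b) == 1 for a, b in combinations(words, 2))


# Made-up page in a Latin-letter transcription style.
page = "daiin daiin dain dain chol daiin qoky".split()
top = top_tokens(page)
print(top, has_near_pair(top))
```

Running `has_near_pair(top_tokens(page))` over every page of a transcription and counting the `True` results would give the share of pages on which two of the top three words are near-duplicates under this assumed criterion.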
Peter Grubtal said,
September 14, 2024 @ 9:44 am
That the Voynich is genuinely from the 17C. or earlier seems widely accepted.
Wikipedia tells us that the first confirmed owner was Georg Baresch, then Jan Marek Marci (also known as Johannes Marcus Marci), who sent it to Athanasius Kircher (all three alive in the 17th C).
The author of the Jewish Arts and Culture article seems to think that the evidence is not conclusive that the book/MS with which these three were involved was in fact what we now know as the Voynich.
I find it surprising that Wikipedia doesn't express the history of the MS more tentatively.
Stephen Goranson said,
September 14, 2024 @ 10:07 am
That the ms is older than 20th-century appears safe to say, given the C14 of the skins and the provenance annotation made visible by multi-spectral imaging, unless one thinks Voynich capable of all that.
I wonder how reliable is the assertion that more than one scribe was involved.
The project "The Hands that Wrote the Bible" is investigating some Qumran mss, using paleography, C14 and AI.
Perhaps the two projects could be usefully compared.
Torsten Timm said,
September 14, 2024 @ 10:46 am
The most prominent advocate for the multiple scribe hypothesis is Lisa Fagin Davis (see https://muse.jhu.edu/pub/56/article/754633). For a review of Lisa Davis's attempt to distinguish between different scribes see: https://www.academia.edu/105542019/Discussion_of_Voynich_Paleography
In short, from our point of view, four aspects appear especially problematic: insufficient documentation, missing discussion of a competing paleographical analysis, mixing of paleographical and statistical argumentation, and misinterpretation of statistics. Davis mainly focuses on two glyphs only. Moreover, the details used to distinguish between different scribes are observable both across different scribes and also within a single folio. (see also Timm & Schinner, p. 7 https://www.academia.edu/104054256/The_Voynich_manuscript_discussion_of_text_creation_hypotheses).