The computational linguistics of COVID-19 vaccine design

He Zhang, Liang Zhang, Ziyu Li, Kaibo Liu, Boxiang Liu, David H. Mathews, and Liang Huang, "LinearDesign: Efficient Algorithms for Optimized mRNA Sequence Design", arXiv.org 4/21/2020:

A messenger RNA (mRNA) vaccine has emerged as a promising direction to combat the current COVID-19 pandemic. This requires an mRNA sequence that is stable and highly productive in protein expression, features which have been shown to benefit from greater mRNA secondary structure folding stability and optimal codon usage. However, sequence design remains a hard problem due to the exponentially many synonymous mRNA sequences that encode the same protein. We show that this design problem can be reduced to a classical problem in formal language theory and computational linguistics that can be solved in O(n^3) time, where n is the mRNA sequence length. This algorithm could still be too slow for large n (e.g., n = 3,822 nucleotides for the spike protein of SARS-CoV-2), so we further developed a linear-time approximate version, LinearDesign, inspired by our recent work, LinearFold. This algorithm, LinearDesign, can compute the approximate minimum free energy mRNA sequence for this spike protein in just 11 minutes using beam size b = 1,000, with only 0.6% loss in free energy change compared to exact search (i.e., b = +infinity, which costs 1 hour). We also develop two algorithms for incorporating the codon optimality into the design, one based on k-best parsing to find alternative sequences and one directly incorporating codon optimality into the dynamic programming. Our work provides efficient computational tools to speed up and improve mRNA vaccine development.
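To get a feel for the "exponentially many synonymous mRNA sequences" the abstract mentions, here is a minimal Python sketch that counts them for a given protein. It is not part of LinearDesign; the function name `count_synonymous` and the `include_stop` flag are made up for illustration. The only input it relies on is the standard genetic code, in which each amino acid is encoded by between one and six codons and there are three stop codons.

```python
from math import prod

# Number of codons encoding each amino acid in the standard genetic code
# (one-letter amino acid codes; the 20 values sum to 61 sense codons).
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2,
    "Q": 2, "E": 2, "G": 4, "H": 2, "I": 3,
    "L": 6, "K": 2, "M": 1, "F": 2, "P": 4,
    "S": 6, "T": 4, "W": 1, "Y": 2, "V": 4,
}
STOP_CODONS = 3  # UAA, UAG, UGA


def count_synonymous(protein: str, include_stop: bool = True) -> int:
    """Count the mRNA coding sequences that translate to `protein`.

    Hypothetical helper for illustration only: the choices at each
    position are independent, so the total is a simple product.
    """
    total = prod(CODON_COUNTS[aa] for aa in protein)
    return total * STOP_CODONS if include_stop else total


# Four residues already allow 1 * 6 * 6 * 6 * 3 = 648 encodings.
print(count_synonymous("MLSR"))  # 648
```

Because the count is a product over positions, it grows exponentially with protein length, which is why enumerating candidates is hopeless for a sequence of 3,822 nucleotides and why the authors turn to dynamic programming and beam search instead.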



"Gold" as element and "gold" as substance — as conceived by Mendeleev

[This is a guest post by Conal Boyce]

Your wonderful arabesque on the world of 'kedi'* (and the disappearance of cats for a time — perhaps to a different planet, because they had grown weary of trying to school us humans?) reminded me that you are a connoisseur of languages plural, not just Chinese. In that connection, you might find my 2019 article** on Mendeleev interesting.

 
[**"Mendeleev’s Elemental Ontology and Its Philosophical Renditions in German and English", HYLE – International Journal for Philosophy of Chemistry, Vol. 25 (2019), No. 1, 49-70.]



A Ghanaian-Taiwanese in the military service



@Everybody

From Randy Alexander, a photo taken in the courtyard of an apartment complex in Huaying, Guang'an, Sichuan (广安华蓥):



Kanji amnesia of the week

Tokyo crime beat:

"Arrest for fraud follows man’s failure to fulfill writing request", by Tokyo Reporter Staff (7/24/20)

TOKYO (TR) – With personal computers, smartphones and tablets now more common than ever, many may consider the actual writing of kanji characters to be of diminished importance.

But for one man, now in custody for fraud, he learned that is not the case, as TBS News (July 23) reports.

On July 7, Hayato Tsuboi, of no known occupation, posed [as] a police officer upon his arrival at the residence of a man in his 90s in Fuchu City.

After collecting five bank cards from the man, Tsuboi withdrew 2 million yen in cash in defrauding him.



Turkish "kedi" and English "cat"

In reacting to the fierce denunciation of Xi Jinping by Cai Xia (see bibliographical note at the bottom of this post), Conal Boyce mused:

Mind-boggling material. I had to do a double-take on the passage you show that contains both chǔn and jiāhuo (蠢家伙 ["stupid guy / fellow"]).  And sure enough, in the video, she actually uses the term zhèngzhì jiāngshī (政治僵尸 ["political zombies"]) more than once!

These are shocking terms, with a peculiar color all their own. They reminded me that, in a sense, there are no words that are actually 'equivalents' between two languages. For instance, the Turkish for 'cat' is 'kedi', which has a comfortable look of familiarity at first, because of English 'kitty', yet we suspect that the semantic range of 'kedi' in Turkish versus the semantic ranges for 'cat' and 'kitty' in English probably overlap in some unexpected Venn diagram style, with much of 'kedi' not immediately accessible to a speaker of English.



Text corrections

Today's xkcd:

Mouseover title: "I like trying to make it as hard as possible. 'I'd love to meet up, maybe in a few days? Next week is looking pretty empty. *witchcraft'"



Never get stuck in a bunker again

Today this ad popped up for me on a newspaper site:

So I tried to figure out how the mysterious object on the left could be used to tunnel out through concrete and earth, and how it could safely be combined with C-3 plastic explosive, and what all this has to do with the woman on the right brandishing a stick amid the explosion. I also wondered why exploding their way out of bomb shelters should be on people's minds these days.

Then I realized that it was about golf.



Transmutation of species: the three c(r)ows

Stephen H writes:

I was shuffling French Impressionists in Art Authority on my iPad and came across this:



Google, the wannabe Egyptologist

Sensational article by Hagar Hosny in Al-Monitor (7/23/20):

"Google presents new tool to decode hieroglyphics:  Google has created a new tool to translate hieroglyphics into English and Arabic at the stroke of a key."

It starts like this:

In a July 15 press release, Google announced the launch of a new tool that uses artificial intelligence to decipher Egyptian hieroglyphs and translate them into Arabic and English.

Google said that the tool, dubbed Fabricius, provides an interactive experience for people from all over the world to learn about hieroglyphics, in addition to supporting and facilitating the efforts of Egyptologists and raising awareness about the history and heritage of ancient Egyptian civilization.

“We are very excited to be launching this new tool that can make it easier to access and learn about the rich culture of ancient Egypt. For over a decade, Google has been capturing imagery of cultural and historical landmarks across the region,” Chance Coughenour, program manager at Google Arts and Culture, said in the statement.



Another Northeastern topolectal term without specified characters to write it

Yesterday Diana Shuheng Zhang and I went to a Trader Joe's and saw some pretty, gleaming yellow berries for sale. Diana was delighted because they reminded her of the berries she used to eat back home in the Northeast of China.

I asked her what they were called in Northeast topolect (Dōngběi huà 东北话).  Her answer both intrigued and amused me:

They are called gu1niao3 or gu1niang3; either way is fine and either way is used by many people interchangeably. Even for myself, I sometimes say the first one, sometimes the second one, depends on… well, randomly. Haha!
 
Then the inevitable question:  how do you write gu1niao3 and gu1niang3 in characters?



What she said?

There's a bit of fuss on Twitter about what reporter Kimberly Halkett said when the press secretary Kayleigh McEnany cut off her follow-up question at yesterday's White House briefing (overall video here, official White House transcript here).



The importance of being and speaking Taiwanese

Meet Hsiao Bi-khim, Taiwan's de facto ambassador to the United States:
