Archive for Artificial intelligence

Mirabile scriptu: fake kanji created by AI

Read the rest of this entry »

Comments (1)

ChatGPT writes Haiku

[This is a guest post by Bill Benzon]

I’ve been spending a LOT of time with ChatGPT. So naturally, I decided to have it create some haiku.  [VHM:  See the link to Bill's blogpost after the page break.]  This post is about that, but also about a most remarkable woman, Margaret Masterman (1910-1986). She’d studied with Wittgenstein in the 1930s and then went on to create the Cambridge Research Unit in Linguistics in the 1950s. There she became one of the founders of computational linguistics and had a computer generate haiku in 1969. As far as I know, it’s the first time that’s been done.
 
Take a look at the very end. I’ve taken to closing my dialogs by thanking ChatGPT. I know it’s not conscious, nor sentient, but why not? It’s fun. This time I decided to thank it in Japanese. Except that I neither speak nor read Japanese. But I can use Google Translate. I thought ChatGPT would have no trouble, but I do think its reply was rather clever.
 
Best of the season to you, and the rest of the Log.

Read the rest of this entry »

Comments (15)

Speech to speech translation of unwritten languages: Hokkien

Everybody's talking about it.

"Meta has developed an AI translator for a primarily-spoken language

It only translates between Hokkien and English for now, but offers potential for thousands of languages without official written systems."

By Amanda Yeo, Mashable (October 20, 2022)

If true, this technology could be an enormous boon for illiterates everywhere.  It also has important theoretical and linguistic implications.

Read the rest of this entry »

Comments (3)

"Collapsed" calligraphy, part 2

New article by Nyri Bakkalian in Unseen Japan (9/17/22):

"New App Promises Greater Convenience in Reading Old Japanese Cursive:

Kuzushiji, the 'crushed letters' found in historical Japanese documents, have long been the bane of scholars. A new app may change all that."

The author bemoans:

During my graduate education in Japanese history, interpreting handwritten primary source material from the 19th century and earlier was one of my greatest challenges. Typeset historic documents exist, especially in my period of focus during the Bakumatsu-Meiji transition. But the further back in time one’s research focus is situated, the rarer these documents become. There is a plethora of handwritten documents, written in historic cursive, but learning how to read them is a significant investment of time and resources beyond the means of most people who might otherwise have the inclination to learn.

Read the rest of this entry »

Comments (1)

Google Translate is even better now, part 2

"Google Translate learns 24 new languages"
Isaac Caswell, Google blog (5/11/22)

==========

For years, Google Translate has helped break down language barriers and connect communities all over the world. And we want to make this possible for even more people — especially those whose languages aren’t represented in most technology. So today we’ve added 24 languages to Translate, now supporting a total of 133 used around the globe.

Over 300 million people speak these newly added languages — like Mizo, used by around 800,000 people in the far northeast of India, and Lingala, used by over 45 million people across Central Africa. As part of this update, Indigenous languages of the Americas (Quechua, Guarani and Aymara) and an English dialect (Sierra Leonean Krio) have also been added to Translate for the first time.

Read the rest of this entry »

Comments (24)

Why is Facebook's Chinese translation still so terrible?

[This is a guest post by Jenny Chu]

Has Language Log been following up on the great sorrow that is Facebook's (Chinese) translation feature? The last reference I found was this one.

It came up today when I was reading this somewhat viral post on Facebook.

I switched on the auto-translate option to help me understand. The results were not just astonishingly bad, but had a surprisingly medical bent.

 
今天這個主權政府作承諾的時候大辭炎炎,七情上面,結果又是如何?–> "Today, when the private government is working, the weather is colon inflammation, above the sentiment, what is the result?"

Read the rest of this entry »

Comments (11)

AI cat and mouse robot censorship war

Now it's getting interesting:

"China’s internet police losing man-versus-machine duel on social media"

Stephen Chen, SCMP (11/14/21)

    Hordes of bot accounts using clever dodging tactics are causing burnout among human censors, police investigative paper finds
    Authorities may respond by raising a counter-army of automated accounts or even an AI-driven public opinion leader

Read the rest of this entry »

Comments (3)

Difficult languages and easy languages, part 3

There may well be a dogma out there stating that all languages are equally complex, but I don't believe it, especially not if it has to be "drummed" into our minds.  I have learned many languages.  Some of them are exceedingly hard (because of their complexity) and some of them are relatively easy (because they are comparatively simple).  I have often said that Mandarin is the easiest language I ever learned to speak, but the hardest to read and write in characters (though very easy in Romanization).  And remember these posts:

"Difficult languages and easy languages" (3/4/17)

"Difficult languages and easy languages, part 2" (5/28/19)

Read the rest of this entry »

Comments (33)

The implications of Chinese for AI development, part 2

With this post, we are already acquainted with Inspur's Yuan 1.0, "one of the most advanced deep learning language models that can generate coherent Chinese texts."  Now, with the present article, we will delve more deeply into the potentials and pitfalls of Inspur's deep learning language model:

"Inspur unveils GPT-3 equivalent for Chinese language", by Wei Sheng, TechNode (10/26/21)

The model is trained with 245.7 billion parameters—the number of weights in an artificial neural network, according to the company. This is more than the Elon Musk-backed GPT-3 language model for English, which has 175 billion parameters. Inspur said the Yuan model was trained with 5 terabytes of datasets.

Read the rest of this entry »

Comments (4)

The implications of Chinese for AI development

New article in EnterpriseAI (October 21, 2021):

"Language Model Training Gets Another Player: Inspur AI Research Unveils Yuan 1.0",  by Todd R. Weiss

From Pranav Mulgund:

This article introduces an interesting new advance in an artificial intelligence (AI) model for Chinese. As you probably know, Chinese has long been held to be one of the hardest languages for AI to crack. Baidu and Google have both been trying for a long time, but have had a lot of difficulty given the complexity of the language. But the company Inspur just came out with a model called Yuan 1.0 that shows significant advances over previous companies' AIs.

Read the rest of this entry »

Comments (5)

Domo arigato, Mr. Roboto

Given this:

"Measure words for robots" (9/4/21)

and this:

"Arigatō" (9/3/21),

I could not help but think of this:

Read the rest of this entry »

Comments (14)

Measure words for robots

Christian Horn was reading an article in Japanese Engadget (8/11/21) about the introduction of a new kind of robot called a "Cyberdog".

Says Christian:

You don't need to know Japanese to understand the fascinating part:  in Japanese, when counting things, the type of "thing" you are counting is relevant.  So you count "flat things" differently than "long shaped" things.  Or machines, fish, or animals.

The article states that Cyberdog is aimed at developers, and is limited to "1000台(匹?)", showing hesitation over which measure word to use, dai 台 (counter for machines, including vehicles) or hiki 匹 (counter for small animals; counter for rolls of cloth; counter for horses).  If you use dai 台 as a measure word for counting Cyberdogs, it would indicate that you think of them as machines.  If you use hiki 匹 for counting them, it would indicate that you regard Cyberdogs as animals.

Read the rest of this entry »

Comments (23)

Tortured phrases

Article by Holly Else in Nature (8/5/21):

"‘Tortured phrases’ give away fabricated research papers

Analysis reveals that strange turns of phrase may indicate foul play in science"

Here are the beginning and a few other selected portions of the article:

In April 2021, a series of strange phrases in journal articles piqued the interest of a group of computer scientists. The researchers could not understand why researchers would use the terms ‘counterfeit consciousness’, ‘profound neural organization’ and ‘colossal information’ in place of the more widely recognized terms ‘artificial intelligence’, ‘deep neural network’ and ‘big data’.

Further investigation revealed that these strange terms — which they dub “tortured phrases” — are probably the result of automated translation or software that attempts to disguise plagiarism. And they seem to be rife in computer-science papers.

Research-integrity sleuths say that Cabanac* and his colleagues have uncovered a new type of fabricated research paper, and that their work, posted in a preprint on arXiv on 12 July, might expose only the tip of the iceberg when it comes to the literature affected.

[*VHM:  Guillaume Cabanac, a computer scientist at the University of Toulouse, France]

Read the rest of this entry »

Comments (28)