Archive for Artificial intelligence

Google Translate is even better now, part 2

"Google Translate learns 24 new languages"
Isaac Caswell, Google blog (5/11/22)

==========

Illustrated green globe with the word "hello" translated into different languages.

For years, Google Translate has helped break down language barriers and connect communities all over the world. And we want to make this possible for even more people — especially those whose languages aren’t represented in most technology. So today we’ve added 24 languages to Translate, now supporting a total of 133 used around the globe.

Over 300 million people speak these newly added languages — like Mizo, used by around 800,000 people in the far northeast of India, and Lingala, used by over 45 million people across Central Africa. As part of this update, Indigenous languages of the Americas (Quechua, Guarani and Aymara) and an English dialect (Sierra Leonean Krio) have also been added to Translate for the first time.

Read the rest of this entry »

Why is Facebook's Chinese translation still so terrible?

[This is a guest post by Jenny Chu]

Has Language Log been following up on the great sorrow that is Facebook's (Chinese) translation feature? The last reference I found was this one.

It came up today when I was reading this somewhat viral post on Facebook.

I switched on the auto-translate option to help me understand. The results were not just astonishingly bad, but had a surprisingly medical bent.

今天這個主權政府作承諾的時候大辭炎炎,七情上面,結果又是如何? –> "Today, when the private government is working, the weather is colon inflammation, above the sentiment, what is the result?"

Read the rest of this entry »

AI cat and mouse robot censorship war

Now it's getting interesting:

"China’s internet police losing man-versus-machine duel on social media"

Stephen Chen, SCMP (11/14/21)

    Hordes of bot accounts using clever dodging tactics are causing burnout among human censors, police investigative paper finds
    Authorities may respond by raising a counter-army of automated accounts or even an AI-driven public opinion leader

Read the rest of this entry »

Difficult languages and easy languages, part 3

There may well be a dogma out there stating that all languages are equally complex, but I don't believe it, especially not if it has to be "drummed" into our minds.  I have learned many languages.  Some of them are exceedingly hard (because of their complexity) and some of them are relatively easy (because they are comparatively simple).  I have often said that Mandarin is the easiest language I ever learned to speak, but the hardest to read and write in characters (though very easy in Romanization).  And remember these posts:

"Difficult languages and easy languages" (3/4/17)

"Difficult languages and easy languages, part 2" (5/28/19)

Read the rest of this entry »

The implications of Chinese for AI development, part 2

With this post, we are already acquainted with Inspur's Yuan 1.0, "one of the most advanced deep learning language models that can generate coherent Chinese texts."  Now, with the present article, we will delve more deeply into the potentials and pitfalls of Inspur's deep learning language model:

"Inspur unveils GPT-3 equivalent for Chinese language", by Wei Sheng, TechNode (10/26/21)

The model is trained with 245.7 billion parameters—the number of weights in an artificial neural network, according to the company. This is more than the Elon Musk-backed GPT-3 language model for English, which has 175 billion parameters. Inspur said the Yuan model was trained with 5 terabytes of datasets.

Read the rest of this entry »

The implications of Chinese for AI development

New article in EnterpriseAI (October 21, 2021):

"Language Model Training Gets Another Player: Inspur AI Research Unveils Yuan 1.0",  by Todd R. Weiss

From Pranav Mulgund:

This article introduces an interesting new advance in an artificial intelligence (AI) model for Chinese. As you probably know, Chinese has been long held as one of the hardest languages for AI to crack. Baidu and Google have both been trying for a long time, but have had a lot of difficulty given the complexity of the language. But the company Inspur just came out with a model called Yuan 1.0 that shows significant advances from previous companies' AIs.

Read the rest of this entry »

Domo arigato, Mr. Roboto

Given this:

"Measure words for robots" (9/4/21)

and this:

"Arigatō" (9/3/21),

I could not help but think of this:

Read the rest of this entry »

Measure words for robots

Christian Horn was reading an article in Japanese Engadget (8/11/21) about the introduction of a new kind of robot called a "Cyberdog".

Says Christian:

You don't need to know Japanese to understand the fascinating part:  in Japanese, when counting things, the type of "thing" you are counting is relevant.  So you count "flat things" differently than "long shaped" things.  Or machines, fish, or animals.

The article states that Cyberdog is aimed at developers, and is limited to "1000台(匹?)", showing hesitation over which measure word to use, dai 台 (counter for machines, including vehicles) or hiki 匹 (counter for small animals; counter for rolls of cloth; counter for horses).  If you use dai 台 as a measure word for counting Cyberdogs, it would indicate that you think of them as machines.  If you use hiki 匹 for counting them, it would indicate that you regard Cyberdogs as animals.
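
For readers who think in code, the choice described here can be sketched as a lookup from the category you assign a thing to the counter you attach to the numeral. This is only a toy illustration: the category labels and the function name `count_phrase` are made up for this sketch, and the table holds just the two counters mentioned above.

```python
# Toy lookup: the category the speaker assigns decides the counter.
# Only the two counters discussed above are included; the category
# labels and the function name are hypothetical, chosen for illustration.
COUNTERS = {
    "machine": "台",  # dai: machines, including vehicles
    "animal": "匹",   # hiki: small animals
}


def count_phrase(n: int, category: str) -> str:
    """Return the numeral followed by the counter for the given category."""
    return f"{n}{COUNTERS[category]}"


print(count_phrase(1000, "machine"))  # Cyberdog counted as a machine
print(count_phrase(1000, "animal"))   # Cyberdog counted as an animal
```

The number stays the same in both calls; only the speaker's classification of the Cyberdog changes the output, which is exactly the hesitation recorded in "1000台(匹?)".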

Read the rest of this entry »

Tortured phrases

Article by Holly Else in Nature (8/5/21):

"‘Tortured phrases’ give away fabricated research papers

Analysis reveals that strange turns of phrase may indicate foul play in science"

Here are the beginning and a few other selected portions of the article:

In April 2021, a series of strange phrases in journal articles piqued the interest of a group of computer scientists. The researchers could not understand why researchers would use the terms ‘counterfeit consciousness’, ‘profound neural organization’ and ‘colossal information’ in place of the more widely recognized terms ‘artificial intelligence’, ‘deep neural network’ and ‘big data’.

Further investigation revealed that these strange terms — which they dub “tortured phrases” — are probably the result of automated translation or software that attempts to disguise plagiarism. And they seem to be rife in computer-science papers.

Research-integrity sleuths say that Cabanac* and his colleagues have uncovered a new type of fabricated research paper, and that their work, posted in a preprint on arXiv on 12 July, might expose only the tip of the iceberg when it comes to the literature affected.

[*VHM:  Guillaume Cabanac, a computer scientist at the University of Toulouse, France]
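
To make the idea concrete, here is a minimal sketch of how one might flag such phrases in a text. It is not the method used by Cabanac and his colleagues, which the excerpt does not describe; it is a plain dictionary lookup, and the only phrase pairs in it are the three quoted above. The names `TORTURED_PHRASES` and `find_tortured_phrases` are hypothetical.

```python
import re

# Pairs quoted in the excerpt above: tortured phrase -> widely recognized term.
# A real screening list would be far longer; this one is illustrative only.
TORTURED_PHRASES = {
    "counterfeit consciousness": "artificial intelligence",
    "profound neural organization": "deep neural network",
    "colossal information": "big data",
}


def find_tortured_phrases(text):
    """Return (phrase, expected term, offset) tuples, ordered by position in text."""
    hits = []
    for phrase, expected in TORTURED_PHRASES.items():
        # Whole-phrase, case-insensitive match.
        pattern = r"\b" + re.escape(phrase) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            hits.append((phrase, expected, match.start()))
    return sorted(hits, key=lambda hit: hit[2])


sample = "We apply counterfeit consciousness to colossal information."
for phrase, expected, offset in find_tortured_phrases(sample):
    print(f"offset {offset}: '{phrase}' (expected '{expected}')")
```

A lookup like this only catches phrases that are already on the list, so it can flag known substitutions but says nothing about whether a paper is actually fabricated.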

Read the rest of this entry »

Delete / elite button

I've written several posts about unpredictable typing mistakes that are not the result of auto-correct or sloppiness, but are produced through phonological confusion in my own neuro-muscular hardware and software (see "Selected readings").  This morning I experienced another funny occurrence of such a mistake.

I had lost over 7,000 of the recent e-mails in my inbox, so I wrote to the excellent IT guys in Williams Hall:

Crisis

I'm making good progress moving things from inbox to archives, but I just had a disaster.  Everything in my inbox between these two e-mails is missing:

Margaret ********   today (6/18/21) 11:53 a.m.

MISSING

Jing ***  (11/18/20)  11:06 p.m.

There are thousands of important e-mails to me with all sorts of information, attachments, and so forth that I need to take care of, some of them very soon.

Can you somehow restore the missing items?

Read the rest of this entry »

Ingredients of Chinese rice crackers translated by GT phone camera device

Have you tried the Google Translate app on your phone? It has a camera tool that automatically translates text that you point it to, but it looks like it needs some work for Mandarin…

Read the rest of this entry »

I'm milk

This has been making the rounds:

1. Go to Google Translate.
2. Set the input language to Spanish.
3. Paste in "soy milk"
4. Set the output language to English or X language.
5. Hilarity ensues.

The obligatory screen shot:

Read the rest of this entry »

Google, the wannabe Egyptologist

Sensational article by Hagar Hosny in Al-Monitor (7/23/20):

"Google presents new tool to decode hieroglyphics:  Google has created a new tool to translate hieroglyphics into English and Arabic at the stroke of a key."

It starts like this:

In a July 15 press release, Google announced the launch of a new tool that uses artificial intelligence to decipher Egyptian hieroglyphs and translate them into Arabic and English.

Google said that the tool, dubbed Fabricius, provides an interactive experience for people from all over the world to learn about hieroglyphics, in addition to supporting and facilitating the efforts of Egyptologists and raising awareness about the history and heritage of ancient Egyptian civilization.

“We are very excited to be launching this new tool that can make it easier to access and learn about the rich culture of ancient Egypt. For over a decade, Google has been capturing imagery of cultural and historical landmarks across the region,” Chance Coughenour, program manager at Google Arts and Culture, said in the statement.

Read the rest of this entry »
