## Electric sheep

A couple of recent LLOG posts ("What a tangled web they weave", "A long short-term memory of Gertrude Stein") have illustrated the strange and amusing results that Google's current machine translation system can produce when fed variable numbers of repetitions of meaningless letter sequences in non-Latin orthographic systems. [Update: And see posts in the elephant semifics category for many other examples.] Geoff Pullum has urged me to explain how and why this sort of thing happens:

I think Language Log readers deserve a more careful account, preferably from your pen, of how this sort of craziness can arise from deep neural-net machine translation systems. […]

Ordinary people imagine (wrongly) that Google Translate is approximating the process we call translation. They think that the errors it makes are comparable to a human translator getting the wrong word (or the wrong sense) from a dictionary, or mistaking one syntactic construction for another, or missing an idiom, and thus making a well-intentioned but erroneous translation. The phenomena you have discussed reveal that something wildly, disastrously different is going on.

Something nonlinear: 18 consecutive repetitions of a two-character Thai sequence produce "This is how it is supposed to be", and so do 19, 20, 21, 22, 23, and 24, and then 25 repetitions produces something different, and 26 something different again, and so on. What will come out in response to a given input seems informally to be unpredictable (and I'll bet it is recursively unsolvable, too; it's highly reminiscent of Emil Post's famous tag system where 0..X is replaced by X00 and 1..X is replaced by X1101, iteratively).

Type "La plume de ma tante est sur la table" into Google Translate and ask for an English translation, and you get something that might incline you, if asked whether you would agree to ride in a self-driving car programmed by the same people, to say yes. But look at the weird shit that comes from inputting Asian language repeated syllable sequences and you not only wouldn't get in the car, you wouldn't want to be in a parking lot where it was driving around on a test run. It's the difference between what might look like a technology nearly ready for prime time and the chaotic behavior of an engineering abortion that should strike fear into the hearts of any rational human.

Language Log needs at least a sketch of a proper serious account of what's going on here.

A sketch is all that I have time for today, but here goes…

According to Yonghui Wu et al., "Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation", 9/26/2016:

Our model consists of a deep LSTM network with 8 encoder and 8 decoder layers using attention and residual connections. […] To improve handling of rare words, we divide words into a limited set of common sub-word units ("wordpieces") for both input and output. This method provides a good balance between the flexibility of "character"-delimited models and the efficiency of "word"-delimited models, naturally handles translation of rare words, and ultimately improves the overall accuracy of the system.
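
Wu et al. learn their wordpiece inventory from data (following Schuster and Nakajima's algorithm), so the real segmenter is more subtle than this, but the basic idea of sub-word units can be sketched with a toy greedy longest-match segmenter over an invented vocabulary:

```python
def segment(word, vocab):
    """Greedily split a word into the longest sub-word units found in vocab.

    A toy illustration of sub-word ("wordpiece") segmentation, not Google's
    actual algorithm, which learns its piece inventory from training data.
    """
    pieces = []
    i = 0
    while i < len(word):
        # Find the longest vocabulary entry that prefixes the remainder.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                pieces.append(word[i:j])
                i = j
                break
        else:
            # No known piece matches: emit the character as a singleton.
            pieces.append(word[i])
            i += 1
    return pieces

vocab = {"un", "predict", "able", "translat", "ion", "mis"}
print(segment("unpredictable", vocab))   # ['un', 'predict', 'able']
print(segment("mistranslation", vocab))  # ['mis', 'translat', 'ion']
```

A real wordpiece vocabulary is chosen to fit the training corpus, so unknown inputs (like strings of repeated Thai characters) simply get chopped into whatever pieces happen to fit.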

"LSTM" is an acronym for "Long Short-Term Memory". As Wikipedia explains,

Long short-term memory (LSTM) is a recurrent neural network (RNN) architecture (an artificial neural network) proposed in 1997 by Sepp Hochreiter and Jürgen Schmidhuber. Like most RNNs, an LSTM network is universal in the sense that given enough network units it can compute anything a conventional computer can compute, provided it has the proper weight matrix, which may be viewed as its program. Unlike traditional RNNs, an LSTM network is well-suited to learn from experience to classify, process and predict time series when there are time lags of unknown size and bound between important events.
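
The "proper weight matrix" does all the work. For concreteness, here is one step of the textbook LSTM cell in numpy, with random weights standing in for a trained "program". This is a minimal sketch of the standard cell equations (with the now-conventional forget gate), not Google's eight-layer production system:

```python
import numpy as np

def lstm_step(x, h, c, params):
    """One step of a textbook LSTM cell. Weights would normally be learned."""
    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))
    Wi, Wf, Wo, Wg, Ui, Uf, Uo, Ug, bi, bf, bo, bg = params
    i = sigmoid(Wi @ x + Ui @ h + bi)   # input gate: admit new information
    f = sigmoid(Wf @ x + Uf @ h + bf)   # forget gate: decay old memory
    o = sigmoid(Wo @ x + Uo @ h + bo)   # output gate: expose memory
    g = np.tanh(Wg @ x + Ug @ h + bg)   # candidate memory content
    c = f * c + i * g                   # memory cell update
    h = o * np.tanh(c)                  # hidden state ("short-term memory")
    return h, c

rng = np.random.default_rng(0)
n, d = 4, 3                             # hidden size, input size
params = ([rng.standard_normal((n, d)) for _ in range(4)] +
          [rng.standard_normal((n, n)) for _ in range(4)] +
          [np.zeros(n) for _ in range(4)])
h, c = np.zeros(n), np.zeros(n)
for t in range(5):                      # run the recurrence on random input
    h, c = lstm_step(rng.standard_normal(d), h, c, params)
print(h.shape)  # (4,)
```

Because the output `h` feeds back in as input to the next step, the whole thing is a recursive nonlinear map, which is the property that matters for the rest of this post.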

Andrej Karpathy, in "The Unreasonable Effectiveness of Recurrent Neural Networks", notes that an LSTM network trained on various kinds of text can be used to generate arbitrary amounts of new material in the general style of its inputs. Thus training on the works of Shakespeare in modern spelling yields stuff like

PANDARUS:
Alas, I think he shall be come approached and the day
When little srain would be attain'd into being never fed,
And who is but a chain and subjects of his death,
I should not sleep.

Second Senator:
They are away this miseries, produced upon my soul,
Breaking and strongly should be buried, when I perish
The earth and thoughts of many states.

DUKE VINCENTIO:
Well, your wit is in the care of side and that.

Second Lord:
They would be ruled after this chamber, and
my fair nues begun out of the fact, to be conveyed,
Whose noble souls I'll have the heart of the wars.

Clown:
Come, sir, I will make did behold your worship.

VIOLA:
I'll drink it.


In contrast, training on 96 MB of English-language Wikipedia yields hallucinations like this one:

Naturalism and decision for the majority of Arab countries' capitalide was grounded
by the Irish language by [[John Clair]], [[An Imperial Japanese Revolt]], associated
with Guangzham's sovereignty. His generals were the powerful ruler of the Portugal
in the [[Protestant Immineners]], which could be said to be directly in Cantonese
Communication, which followed a ceremony and set inspired prison, training. The
emperor travelled back to [[Antioch, Perth, October 25|21]] to note, the Kingdom
of Costa Rica, unsuccessful fashioned the [[Thrales]], [[Cynth's Dajoard]], known
in western [[Scotland]], near Italy to the conquest of India with the conflict.
Copyright was the succession of independence in the slop of Syrian influence that
was a famous German movement based on a more popular servicious, non-doctrinal
and sexual power post. Many governments recognize the military housing of the
[[Civil Liberalization and Infantry Resolution 265 National Party in Hungary]],
that is sympathetic to be to the [[Punjab Resolution]]
(PJS)[http://www.humah.yahoo.com/guardian.
cfm/7754800786d17551963s89.htm Official economics Adjoint for the Nazism, Montgomery
was swear to advance to the resources for those Socialism's rule,
was starting to signing a major tripad of aid exile.]]

Google's NMT system differs from Karpathy's experiments in several key ways, including the fact that it deals with "wordpieces" as units rather than letters, and the fact that it was trained on trillions of words rather than hundreds of thousands or millions. But like Karpathy's system, its recursive character means that it's capable of turning meaningless input into complex and seemingly unpredictable hallucinations that nevertheless evoke aspects of its experience.

More from Geoff Pullum on the dangers of recursion:

While preparing a talk on Emil Post for a seminar group at the University of Lille early this month I wrote a script that implements Post's problematic tag system on the alphabet {a,b}, and its behavior is truly extraordinary. Give it a string of anything up to 13 b's, and it loops forever (I conjecture!) on strings of varying lengths. Then at 14 b's it suddenly decides to terminate in 410 steps. On 15 b's, it terminates one step sooner, 409 steps; on 16, even quicker, 408 steps; but then at 17 b's it starts complicated looping behaviors again. And so on, chaotically, with no sign at all of any sensible or predictable behavior. Given the input "abbabbbbab-bbabbbabb-abbabb-abbba-bbbbbbb" it goes bananas for a while, building hundreds of strings up to 139 characters long, but slowly they begin to shrink, and eventually you get termination after 1,355 steps.

Post was trying to work out what systems of this sort would do, and whether derivability of a given string was decidable, using just pencil and paper (!), in 1921 when he did his postdoc year at Princeton. It almost literally drove him mad: my Lille friend Liesbeth De Mol has discovered evidence that it was working on tag systems that drove Post to his first manic episode, which was followed by thirty years of mental illness and occasional hospitalization for what we now call bipolar disorder.

I mention this only because the core of my script is tiny (only 7 lines of code, 38 words, 173 characters). As Post discovered, you can get batshit-crazy behavior out of extraordinarily simple systems. What these games with Google Translate are showing is that by adding enormous complexity and gigundo amounts of training data a system doesn't necessarily get any less batshit crazy.
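
Pullum's own 7-line script isn't reproduced here, but Post's tag system is easy to reconstruct from his description: look at the first symbol, delete the first three symbols, then append 00 (here "aa") if the noted symbol was 0, or 1101 (here "bbab") if it was 1. The following is my own sketch, with a simple seen-set for detecting (periodic) loops:

```python
def post_tag(word, max_steps=100000):
    """Post's 3-tag system on the alphabet {a, b}, with a = 0 and b = 1.
    Halts when fewer than three symbols remain; reports a loop if a
    string recurs. (A reconstruction, not Pullum's actual script.)"""
    append = {"a": "aa", "b": "bbab"}
    seen = set()
    for step in range(max_steps):
        if len(word) < 3:
            return "halts", step, word
        if word in seen:
            return "loops", step, word
        seen.add(word)
        word = word[3:] + append[word[0]]
    return "undecided", max_steps, word

print(post_tag("aaa"))      # halts after one step, leaving 'aa'
print(post_tag("b" * 3))    # falls into a loop
print(post_tag("b" * 14))   # Pullum reports this one terminates, in 410 steps
```

Note that loop detection by remembering past states only catches exact repetitions; in general, as Post found, there is no procedure that decides the fate of an arbitrary input.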

For the relevant background, see Wikipedia on Tag systems, and Liesbeth De Mol, "Generating, solving and the mathematics of Homo Sapiens. Emil Post’s views on computation", 2013.

From what I understand of current self-driving car technology, manic episodes are unlikely. But I agree with Geoff that this aspect of current AI algorithms is something to wonder about, as it comes to be deployed in circumstances with more serious real-world consequences than amusing translations of variable-length gibberish.

1. ### Keith said,

April 18, 2017 @ 5:01 am

After reading the posts mentioned at the beginning of this one, I was curious as to what would be the result of typing repeated phrases of a European language into Google. Then in this post I read the following.

Type "La plume de ma tante est sur la table" into Google Translate and ask for an English translation, and you get something that might incline you, if asked whether you would agree to ride in a self-driving car programmed by the same people, to say yes.

So I went to Google translate and started typing that French phrase…
When I got as far as "La plume de ma", the translation into English was quite acceptably "The pen of my".
However, adding a word messed things up a bit: "La plume de ma tante" becomes "My aunt's feather".
Completing the phrase "La plume de ma tante est sur le bureau de mon oncle" gets us back to the established translation "My aunt's pen is on my uncle's desk".
The word "plume" repeated just gets "feather" repeated, as long as you separate the word with spaces… remove the spaces and things get hairy.
"plumeplumeplumeplume" becomes "Zones Zones allowed me Zones filoplumes", and the "detected language" has switched from French to Xhosa.

[Let's face it, once again I was too generous to the machines. I took "La plume de ma tante est sur la table" to be stereotypically perfectly ordinary French that nobody, not even a machine, could possibly get wrong. Thank you for putting my overly naive belief to the test and showing that I was wrong.
—GKP]

2. ### leoboiko said,

April 18, 2017 @ 6:42 am

> From what I understand of current self-driving car technology, manic episodes are unlikely.

I wonder what steps they're taking against malicious misleading inputs to machine-learning AIs, including self-driving cars (Coyote murals, anyone?).

3. ### AntC said,

April 18, 2017 @ 7:14 am

As Post discovered, you can get batshit-crazy behavior out of extraordinarily simple systems.

You don't even have to venture into string handling. Pondering the rising and falling and lengthening chains of integers for the Collatz Conjecture is similarly hypnotic. https://en.m.wikipedia.org/wiki/Collatz_conjecture
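
For readers who want to watch those chains rise and fall themselves, the Collatz recursion (halve an even number, otherwise triple it and add one) fits in a few lines:

```python
def collatz_steps(n):
    """Number of Collatz steps (n -> n/2 if even, 3n+1 if odd) to reach 1."""
    steps = 0
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps

# Nearby starting points can behave wildly differently:
print(collatz_steps(26))  # 10
print(collatz_steps(27))  # 111
```

27 climbs as high as 9232 before collapsing; whether every starting point eventually reaches 1 is, of course, the open conjecture.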

4. ### dfan said,

April 18, 2017 @ 7:47 am

Google Translate is basically a big black box (the RNN) that has been trained, given a sentence in language A, to produce the sentence in language B that is most likely to be a translation of it. The main problem here is that in all of its training data (sentence pairs in languages A and B) it has never seen perverse inputs like 18 repetitions of the same two-character sequence. It's an area of the space of all input "sentences" that it has never been told about, and so it doesn't have the faintest clue what to do with it. It just plugs the characters in and cranks through the machinery and out comes the "most probable translation". If it could tell you how likely it thinks it is that this most probable translation actually is a correct translation, it would tell you that it's very unlikely indeed; but it's the best idea it's got.

[(myl) In this case, "probability" is not really an applicable concept.

In some earlier machine translation models, as in earlier speech recognition models, there was an explicitly bayesian estimate of the "most probable" hidden state sequence given the observable sequence. The probability estimates were massively wrong — dependent on obviously false independence assumptions, and purely pragmatic fudge factors of various sorts — but it was often the case that a right answer was more "probable" than most wrong answers.

Modern pseudo-neural multi-acronym machinery no longer pretends to be calculating and maximizing probabilities. It's just mapping inputs to outputs through a complex network of inner products and point nonlinearities. When such networks are recursive, they're intrinsically likely to exhibit the instability characteristic of non-linear recursive systems. It's clear in this case that hallucinatory behavior is most likely to result from inputs far outside the realm of training — and most of the fun could be halted by introducing a front end trained to recognize sufficiently unlikely stuff — but it would be unwise to bet that such things could never happen with more normal-seeming inputs. We saw one example in the case of the lists of country names, and no doubt there are infinitely many others lurking in the depths of the networks.]
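
The "instability characteristic of non-linear recursive systems" that myl invokes is easy to demonstrate in miniature. The logistic map is the textbook example; it has nothing to do with Google's actual network, but it shows how iterating even one simple nonlinear step can amplify a microscopic change of input into a completely different trajectory:

```python
def logistic_orbit(x, steps, r=4.0):
    """Iterate the logistic map x -> r*x*(1-x), the classic example of a
    nonlinear recursion with sensitive dependence on initial conditions.
    (An illustration of the general phenomenon, not of GT's dynamics.)"""
    orbit = []
    for _ in range(steps):
        x = r * x * (1.0 - x)
        orbit.append(x)
    return orbit

a = logistic_orbit(0.3, 60)
b = logistic_orbit(0.3 + 1e-9, 60)   # perturb the input by one part in a billion
gap = [abs(p - q) for p, q in zip(a, b)]
print(max(gap) > 0.1)                # True: the trajectories end up unrelated
```

An RNN composes many such nonlinear steps per input symbol, so it is unsurprising that adding one more repetition of a meaningless syllable can flip the output from "This is how it is supposed to be" to something else entirely.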

5. ### KevinM said,

April 18, 2017 @ 8:42 am

Given nonsense or unknown input, it makes mistakes for the same reason many of us do: inability to say "I don't know."

6. ### Laura Morland said,

April 18, 2017 @ 9:24 am

"It's the difference between what might look like a technology nearly ready for prime time and the chaotic behavior of an engineering abortion that should strike fear into the hearts of any rational human."

Hats off to the author!

7. ### MattF said,

April 18, 2017 @ 9:52 am

I've always been a little leery of neural nets. People in the field seem to take some pride in their ignorance of exactly what any given neural net is doing. Their argument, I guess, is that we don't really know what the brain is doing, so why should we know what these simulations of the brain are doing?

And to the unwashed masses who may think, e.g., that language translation is a relatively deterministic process, they just roll their eyes.

8. ### Jerry Friedman said,

April 18, 2017 @ 10:19 am

AntC: That Wikipedia article points out that the Collatz sequence can be written as one of the "tag sequences" being talked about here.

9. ### Joe said,

April 18, 2017 @ 10:32 am

@leboiko: No need for a fancy coyote mural – all you need is "a large white 18-wheel truck" in bright sunlight.

10. ### Gwen Katz said,

April 18, 2017 @ 12:22 pm

Probability may not be an applicable concept, but Google Translate does rate translations as common, uncommon, or rare, and oh boy, are its choices fraught.

Neural networks make me wary because their black-box nature can cause laypeople to blindly trust that whatever they're doing must be right; meanwhile, biases and errors are harder to detect and much harder to correct. A traditional algorithm could simply include a manual correction indicating that "his," "her," and "their" are equally acceptable translations of "свой," but that's hard to do with a neural network.

11. ### KeithB said,

April 18, 2017 @ 12:39 pm

I have always wondered how a self driving car will distinguish between a person standing in an intersection waving his arms (something to be simply avoided) and a traffic cop giving hand signals (something to be obeyed).

12. ### Ethan said,

April 18, 2017 @ 1:06 pm

I wonder if a contributory factor in the behavior we are seeing for Japanese kana->English is that the algorithm has to deal with normal Japanese input that contains no space characters. Longer and longer input character strings are still, I would suppose, within the range of input it was trained on. A string of non-whitespace roman characters in English (or French or German…) is still one "word" which may or may not be recognized as a unit. But a very long string of CJK characters may be a well-formed sentence or phrase despite the lack of spacing.

However Cyrillic "жо"^N -> ever stranger English whimsy as N increases would seem to argue against this, since as far as I know word length in Russian is constrained much as word length in English.

13. ### Jarek Weckwerth said,

April 18, 2017 @ 1:12 pm

@dfan: So it's a poverty-of-the-stimulus problem, is it? Is there a literature that juxtaposes this kind of neural net behaviour with the traditional Chomskyan argument?

[(myl) No. As Andrej Karpathy's post explains, the more training material you have, the more vivid the hallucinations you can get from it.]

14. ### Chris C. said,

April 18, 2017 @ 3:07 pm

That LSTM-produced text is glorious. You swear it ought to mean something syntactically, but…

15. ### jick said,

April 18, 2017 @ 5:41 pm

> But look at the weird shit that comes from inputting Asian language repeated syllable sequences and you not only wouldn't get in the car, you wouldn't want to be in a parking lot where it was driving around on a test run.

I.e., I don't understand how X works, therefore it is dangerous. I don't understand how Y works either, so they must be equivalent.

Seems like a common trap that even highly knowledgeable people readily walk into, from physicists to software developers to, yes, linguists.

16. ### Keith M Ellis said,

April 18, 2017 @ 7:15 pm

I have a slightly different take on this, which is that it's an example of why the recent assertions that strong AI is imminent are so absurd and ill-informed. What's really happening is that we're entering a sort of uncanny valley of weak AI, using things like neural nets and extremely large data sets, that is powerful enough to accomplish things we couldn't manage before and seems surprisingly and mysteriously "intelligent" but which is, in truth, restricted to a very limited problem domain as well as being (often) unacceptably fragile. We are nowhere close to genuine strong AI and all the claims to the contrary reveal a vast ignorance of the topic.

That said, I resolutely disagree with the Chinese Room argument here — which is to say, I think that human intelligence is not qualitatively different, only many orders of magnitude more complex, more layered, and trained on data within a problem domain spanning evolutionary time (that is to say, vastly greater). This is why it is, for our usual purposes, so much more reliable and robust. It's also why it, too, fails spectacularly and unpredictably at the margins. Furthermore, we've not even included what I think is a genuine cognitive layer of functional culture.

That contemporary machine language translation simultaneously is surprisingly successful and yet evidently fragile is not really an indictment of the underlying paradigm or ambition. It's an indictment, rather, of how poorly we humans ourselves understand the problem domain of language, translation, and cognition that we have trouble recognizing properly how these systems are both like our cognition and yet in relative terms, extraordinarily simplistic.

With regard to self-driving cars, that is a problem domain that is far more limited and regular than language and also one in which typical human faculties are themselves, comparable to something like language, very limited and prone to catastrophic error. Self-driving vehicles at the current stage of development would, if adopted universally, undoubtedly result in a drastic decrease of collisions and casualties — but mostly because the roads are far more unsafe than we like to admit. But the ways in which self-driving cars will fail, though less frequently than human failure, will almost certainly include extremely non-human failure modes that will greatly unsettle us. That, I think, will be the barrier self-driving cars will have to clear, if you'll pardon the unfortunate nature of my metaphor.

17. ### Robert Ayers said,

April 18, 2017 @ 9:56 pm

I wondered if the "sequence of two characters" had to be in Thai or a similar non-European language. Nope. Set GTranslate to English->Spanish and enter a repeating sequence of the English word "he". At first you see a sequence of "el". Then things get interesting:
10x he => Él él él él él él él (Yeah, seven of them)
16 x he => Él él él él mismo él él él él él él él él
20x he => El él el el el el el el el el he el he (Yeah, only one accented)
26x he => El hee el heeeeeeeeeeeee heehe
Maybe I'll give the next car a wide berth …

18. ### Roger Lustig said,

April 18, 2017 @ 11:57 pm

I don't know so many languages, so I gave GT something simple: "Buffalo buffalo buffalo buffalo buffalo buffalo buffalo."

The machine was generally smart enough not to do much with this obvious tease, "much" being defined as "trying to make some sort of sense of." On the other hand, why does my 7-word sentence produce just 5 repetitions (unchanged) in Norwegian and Italian and 4 in Danish and Swedish? (Also Czech and Slovak.)

German: first word untranslated, then 6x "Büffel." Similarly in Greek & Amharic, as far as I can tell. Corsican: 6x "Buffalo" followed by "pecore."

Bulgarian has a possessive thing going on, I think: Buffalo биволско биволско биволско биволско биволско бивол.

Russian breaks through to 8 words: Буйвол буйвола буйвола буйвола буйвола буйвола буйвола буйвола. Not so Belarusian: Бафала буйвала буйвала буйвала буйвала буйвала буйвала.

Macedonian: Бафало бивол Бафало бивол Бафало бивол Бафало.

Serbian does the 4x-no-translation thing like Czech. Croatian won't be confused with Serbian, though: Buffalo bivol buffalo bivol buffalo buffalo bizona. Dramatic! In nearby Slovenia, a slow start: Buffalo buffalo buffalo bivol bivoli bivol bivoli.

OK–enough fun. Clearly these translation algorithms, none of which can be faulted for not making sense of the sentence, have some wildly varying approaches to repetition.

19. ### Richard said,

April 19, 2017 @ 12:48 am

surely the input should have been 8 "buffalo"s?

20. ### Gwen Katz said,

April 19, 2017 @ 2:55 am

Self-driving vehicles at the current stage of development would, if adopted universally, undoubtedly result in a drastic decrease of collisions and casualties — but mostly because the roads are far more unsafe than we like to admit. But the ways in which self-driving cars will fail, though less frequently than human failure, will almost certainly include extremely non-human failure modes that will greatly unsettle us. That, I think, will be the barrier self-driving cars will have to clear, if you'll pardon the unfortunate nature of my metaphor.

This is so obviously irrational, and yet it's my instinctive reaction. There are millions of car accidents every year, yet if self-driving cars' record is "perfect, except one single incident where it plowed straight into a semi," that seems like an unacceptable risk and I want to keep self-driving cars off the roads–and allow millions of car accidents to continue happening–to prevent the possibility of that one isolated incident happening again.

Brains are weird, man.

21. ### Sean M said,

April 19, 2017 @ 6:30 am

jick: No, it is a "ceteris paribus" argument: "If their work in a field I can give a professional opinion on looks solid on the surface but fails when I start looking for failure modes, then their work in a field that I can't judge professionally is probably about as flawed." Ceteris paribus is a very useful heuristic, although like any heuristic it does not guarantee success.

22. ### Jonathan Smith said,

April 19, 2017 @ 11:35 am

Nguyen et al.'s interesting 2015 paper showed that "it is easy to produce images that are completely unrecognizable to humans, but that state-of-the-art DNNs believe to be recognizable objects with 99.99% confidence (e.g. labeling with certainty that white noise static is a lion)." Many humorous and alarming examples. So what Keith M Ellis said. Not getting in the self-driving car either…

23. ### Roger Lustig said,

April 19, 2017 @ 12:53 pm

@Richard: Wiki-canonically, yes. But, as https://en.wikipedia.org/wiki/Buffalo_buffalo_Buffalo_buffalo_buffalo_buffalo_Buffalo_buffalo#Usage notes, any number of repetitions will do, and can be understood even without reference to the town supposedly originally called Beau Fleuve.

24. ### Jonathan Ginzburg said,

April 19, 2017 @ 5:45 pm

I was looking for the Hebrew translation of the word « spoke » (in the bicycle related sense).
To ensure GT gave me the right sense I tried first English to French, with English input « bicycle spoke ». Somewhat to my surprise the proposed translation was « Bicyclette a parlé ». I then tried « wheel spoke » which yielded « La roue a parlé. ». It was only when I tried « wheel spokes » that « rayons de roue » emerged. So far so fairly mundane (though I was surprised that « bicycle spoke » was biassed to be translated as a sentence—-the first two results in a regular google search for « the spoke broke » relate to bicycles.). But then I tried « wheel spokes » into Hebrew with resulting output:
« these בח ח » (neither Hebrew letter sequence is a word). OTOH « wheel spoke » yields
«  גלגל דיבר. » which is the `sensible’ sentence corresponding to the French. However, « spokes » yields simply « ח ». (the eighth hebrew letter which means nothing…).

25. ### Kaleberg said,

April 19, 2017 @ 10:53 pm

Congratulations on making Google Translate fun again. For a long time dumping in text found on the web produced the most amazing, and often entertaining, garbage. Then, GT seemed to either get better or lose its sense of humor. Thanks to repeated sequences we can all laugh again.

Deep learning has a similar problem with vision. Apparently introducing noise into an image can produce one that looks almost identical to the human eye but is interpreted completely differently by a machine. Interestingly, there is a whole literature on the patterns of noise, on mechanisms for making deep learning more resistant to this kind of perturbation and, of course, on how to get around such defensive mechanisms.

https://arxiv.org/pdf/1610.08401.pdf

26. ### Andrew (not the same one) said,

April 20, 2017 @ 7:12 am

'Buffalo buffalo buffalo….' in Latin: gignat gignat gignat gignat gignat gignat gignat bubalus.

27. ### DaveL said,

April 20, 2017 @ 6:07 pm

Has anyone asked Google to play the "game" from Philip K. Dick's "Galactic Pot-Healer," where people competed to find the most interesting/amusing English->Japanese->English phrases? It sounds like LSTM networks could compete there.

28. ### ajay said,

April 21, 2017 @ 4:50 am

But the ways in which self-driving cars will fail, though less frequently than human failure, will almost certainly include extremely non-human failure modes that will greatly unsettle us

I am writing this just down the road from the place where, the other day, a human-driven car was driven at speed into a crowd of pedestrians because, as far as I can tell, the driver became convinced that an invisible being who lives in the sky had ordered him to do so. I find that extremely human failure mode fairly unsettling as well.

29. ### ajay said,

April 21, 2017 @ 4:57 am

English input « bicycle spoke ». Somewhat to my surprise the proposed translation was « Bicyclette a parlé ».

Such a phenomenon, continued the policeman, can certainly be viewed as compelling if not incontrovertible evidence for the theory of atomic intermixture which I have just expounded. A talking bicycle is by percentage of its atomic origin no less than fifty five percent man, and concomitantly and concurrently the man to whom the bicycle belongs must be a minimum of fifty five percent bicycle and so will no doubt have lost the power of speech himself and be reduced to communicating by ringing his bell.

30. ### AntC said,

April 21, 2017 @ 5:37 am

@ajay ;-)

You left out the bit about the crumbs of soda bread on the mat where the bicycle rests at night. More evidence bicycles have mouths.

31. ### Tim said,

April 21, 2017 @ 12:52 pm

Just want to share a little Google Translate poetry resulting from drumming my fingers on the keyboard while set to Thai:

There are six sparks in the sky, each with six spheres. The sphere of the sphere is the sphere of the sphere.

32. ### Keith M Ellis said,

April 21, 2017 @ 3:13 pm

"I find that extremely human failure mode fairly unsettling as well."

Yes, very. But my sense is that the expectations about people and about technology are quite different. We sort of expect people to fail catastrophically in frightening ways. We try to reduce it, but we accept that it will always happen. With technology, however, there's a sort of expectation that it can be made 100% reliable. So there's (almost) "nothing" we can do about a severely mentally ill homicidal driver, but the first time such an accident happens with a self-driving car, you can be sure that the reaction will be … extreme.

Meanwhile, there would be a 99% or more reduction in casualties caused by vehicles wandering across the dividing line into oncoming traffic (because presently human drivers cause such collisions because they're not paying attention, they're looking at their phones, or they're falling asleep).

33. ### Keith M Ellis said,

April 21, 2017 @ 3:15 pm

Oh, to be clear: I'm not so much making a polemical argument defending self-driving cars so much as I'm somewhat fascinated by this discrepancy.

34. ### Yuval said,

April 22, 2017 @ 9:03 am

It did a pretty decent job of translating my own description of itself from Hebrew.

35. ### Rodger C said,

April 22, 2017 @ 10:50 am

There are six sparks in the sky, each with six spheres. The sphere of the sphere is the sphere of the sphere.

I'd have believed you if you told me this was one of those passages from the Nag Hammadi codices that are very incompetently rendered into Coptic from Greek.

[(myl) To preserve this gem for posterity (i.e. past the next GT update), the specific keyboard banging sequence was

่ดฟหกวาวฟไำาพ่้ฟเวฟ่ากวหด่วสฟหกดา่ไ้วำพี้วหกอืวฟหก่้ดสำไอฟ่าส้่หำพะืฟทหอั้ฟส่าาิำืไพ

and here is a screenshot of the result:

]

36. ### James Wimberley said,

April 24, 2017 @ 5:19 am

You wake up in a locked cellar with a guy who spends his time typing random sequences of Thai characters into his laptop. What is your appropriate output in response to this input? Probably some combination of "Yes, you are absolutely right, couldn't have put it better myself" and "There, there, time for a nice rest".

37. ### BZ said,

April 24, 2017 @ 8:23 am

Re: self-driving cars: One of the supposed advantages is that they can be packed together at high speeds because they don't suffer from human response times. Then, a single glitch of a single car causes a 500 car pileup with mass casualties. Sure it will happen much more rarely than human-involved crashes, but how much more rarely does it have to be before the death rate is comparable?