DeepL Translator


I have often sung the praises of Google Translate (see "Selected readings" below for a few sample posts), but now I've learned about an online translator that, for many languages, may be even better.  Since we've been discussing phenomenal developments in AI quite a bit lately (see also under "Selected readings" below), now seems as good a time as any to introduce DeepL to the collective Language Log readership.

In truth, we've barely mentioned DeepL before (see comments here, here, here, and here), so I really didn't notice it until this past week when my students and auditors from East Asia told me about it.  Seeing what DeepL could do, I was simply overwhelmed.  Let me explain how that happened.

Most of the participants in my Middle Vernacular Sinitic (MVS) seminar (all attendees are from China, Japan, and Korea) said that they've been using it regularly for years.  They also mentioned that they use OCR apps on their phones.  The scanned texts they use can then be fed into various applications for translation.  Many of them also use Grammarly to improve the quality of their writing.  Lately I myself have noticed that when I write papers, essays, and letters in word processing programs (e.g., Microsoft Word), the processor gives me mostly good suggestions for getting rid of superfluous, redundant, or awkward wording.

Specifically, what impressed me so much about DeepL in this instance is that we were faced with a Dutch translation of a rare, medieval Chinese text with a lot of esoteric vocabulary.  The Dutch translator had done a commendable job of getting from the difficult Chinese to Dutch, but then we had to use OCR on his limited-circulation Dutch publication to produce a document to feed into DeepL.  When I read the resulting English translation, I was amazed at how faithfully the English conveyed the sense and the feeling of the extremely recondite medieval Chinese text.  Of course, the English wasn't perfect, but it made a tremendous contribution toward getting a handle on what was happening in the medieval Chinese text that had seldom been read by anyone (it was lost for more than a thousand years) and had never been translated into any other language besides Dutch.

By way of introduction

DeepL Translator

DeepL Translator is a neural machine translation service launched in August 2017 and owned by Cologne-based DeepL SE. The translating system was first developed within Linguee and launched as entity DeepL. It initially offered translations between seven European languages and was gradually expanded to support 31 languages.

Its algorithm uses convolutional neural networks and an English pivot. It offers a paid subscription for additional features and access to its translation application programming interface.

(source)

Try it out.


Grammarly

Grammarly is an American cloud-based typing assistant. It reviews spelling, grammar, punctuation, clarity, engagement, and delivery mistakes in English texts, detects plagiarism, and suggests replacements for the identified errors. It also allows users to customize their style, tone, and context-specific language.

Grammarly was launched in 2009 by Alex Shevchenko, Max Lytvyn, and Dmytro Lider. It's available as a standalone application for use with desktop programs, a browser extension optimized for Google Docs, and a smartphone keyboard.

(source)

From Silas S. Brown:

It's extremely difficult to write a proper introduction
to DeepL because so many of their techniques are secret.  They've
spoken of using a different kind of neural network, but that clearly
isn't enough to reproduce their work, so we can't really analyse it
from a technical standpoint.  All we can do is test it.

I happen to be in the habit of trying to speak with every Chinese
student I meet, and when I get a conversation I offer to exchange
contacts on WeChat.  That means I have quite a few Chinese students
on my WeChat, which makes my Moments (朋友圈, WeChat's timeline) a
constant source of up-to-the-minute "hip" Chinese language that
can be difficult to translate with traditional tools.
So here's a random sample of phrases from this morning:

week 5 之前花先开了,希望week 5结束它还没谢

My translation: The flowers opened before the start of Week 5; I
hope they won't have wilted by the end of Week 5.

Relevant Wenlin entry: 谢[謝] ¹xiè {E} v.  ①thank ②wither (of
flowers/leaves/etc.) | Huā xiè le.  花谢了。 The flowers have withered.

This is going to be particularly awkward for automatic translators,
since 谢 means "thank" probably 99%+ of the time and we're in the
exception.  A well-trained Yarowsky algorithm might just be able to
guess that it means wither if the whole sentence is about flowers.
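
To make the idea concrete, here is a minimal sketch of a decision list of
the kind Yarowsky's method produces.  It is only an illustration: the cue
strings and their order are hand-picked assumptions, whereas the real
algorithm learns and ranks them from a corpus by bootstrapping, and the
function name is hypothetical.

```python
# Toy decision list for choosing a sense of 谢 (xiè): "thank" or "wither".
# ASSUMPTION: the cues and their ranking are hand-picked for illustration;
# a real Yarowsky-style system would learn and rank them from corpus data.
DECISION_LIST = [
    ("谢谢", "thank"),   # reduplicated form: strong evidence for "thank"
    ("感谢", "thank"),   # "grateful": strong evidence for "thank"
    ("花", "wither"),    # flowers nearby: evidence for "wither"
    ("叶", "wither"),    # leaves nearby: evidence for "wither"
]


def guess_sense_of_xie(sentence: str, default: str = "thank") -> str:
    """Return the sense given by the first (strongest) cue found in the sentence."""
    for cue, sense in DECISION_LIST:
        if cue in sentence:
            return sense
    # No cue matched: fall back to the far more frequent sense.
    return default


if __name__ == "__main__":
    print(guess_sense_of_xie("week 5 之前花先开了,希望week 5结束它还没谢"))
```

The first matching rule wins, so a sentence such as 谢谢你的花 ("thanks
for your flowers") is still read as "thank" even though it mentions
flowers.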

Google Translate: The flowers bloomed before week 5, I hope the end
of week 5 has not yet thanked.
(Fell into the trap as I expected.)

DeepL: The flowers are blooming before week 5 and I hope it's not
over by the end of week 5.
(Better: it didn't fall into the trap of using "thank".  But it
didn't quite get "wither" or "wilt" either.)

WeChat's own translator: The flowers bloomed before Week 5.  I hope
it's still there by the end of the week 5.

🇭🇰香港大学🇭🇰 环山而建的学校:助力学生每天都有健康的竞走生活

My translation: The University of Hong Kong---an educational
institution built on a hill: keeps students fit by having them
climb it every day.  (I'm paraphrasing here, because 竞走 is
glossed in the ABC as "heel-and-toe walking race" and I think
it's being used metaphorically.)

Google Translate: 🇭🇰Hong Kong University🇭🇰 A school built around
mountains: Helping students have a healthy life of race walking
every day.
("Hong Kong University" is the literal translation of 香港大学 but I
used "The University of Hong Kong" because I happened to know that's
the English name they use.  山 can be mountain or hill: I went for
hill, but the automatic translators went for mountain, and in this
case Google Translate made it plural.  The Chinese of course does
not indicate whether it's singular or plural, so the translator has
to fill this in with their own educated guess.  Without looking it
up, I'm guessing HKU is built around only one hill.  If I were doing
this professionally with plenty of time, I would fact-check that.)

DeepL: 🇭🇰 Hong Kong University 🇭🇰 A school built around a mountain:
helping students have a healthy race every day
(DeepL agrees with me about it being a singular elevation, but in
this case made a worse choice than Google on "healthy race".)

WeChat Translate: 🇭🇰 The University of Hong Kong 🇭🇰 Schools Built
Around Mountains: Helping Students Have a Healthy Walking Life Every
Day
(I'm not sure why we went into Title Case here)

带上你们的故事/ item来玩 虽然没有酒 但我有🌹

My translation: Bring your stories and items and join the fun.
There's no wine but I have roses.

Google: Bring your stories/items to play
 Although there is no wine, but I have 🌹

DeepL: Bring your stories / item to play
No wine but I have 🌹
(In this case Google picked up on the fact that Chinese sentences
borrowing one-off English words sometimes use them in singular form
whereas they would be plural in English, but DeepL just copied it.)

WeChat Translate: Bring your story / item to play
I don't have wine, but I do. 🌹

到今天很不容易

My translation: It hasn't been easy to get as far as today.

Google: It's not easy until today

DeepL: It's not easy to get to where you are today.
But it offered alternatives:
It has not been easy to get to this point,
It has not been easy to get to where we are today.
(I clicked on this latter alternative.  I don't know if
my click counted as a "vote" so that DeepL will be more
likely to choose it by default next time.)

WeChat Translate: It hasn't been easy until today.

555开心又感动的快乐周末(破大防版

My translation: Oh, oh, oh, such a happy and touching cheerful
weekend.  Picked out some pictures to tantalise you all.

(The Mandarin reading of 5, wǔ, is being used phonetically.  At the
end of the message, this student is playing on the neologism 破防
meaning to break down the psychological barriers that restrain one's
emotions, in other words to be provocative or tantalising.  版 means
edition, which I've translated as "picked out" in this context.
I wouldn't expect an automatic translator to know she was talking
about pictures next to the message, but any human translator would
have that information available, and should know that a more literal
translation wouldn't make as much sense in English.)

Google Translate: 555 Happy and Touching Happy Weekend (Broken
Version)
(None of the automatic translators picked up on 555 being used
phonetically.  Translating two different Chinese words to "happy"
is not what a human translator would do.  Completely missed the
neologism at the end.)

DeepL: 555 Happy and Touching Weekend (Breaking Big Defense Edition)
(Slightly better handling of the neologism, and it knew not to use
the word "happy" twice even if it had to just delete the second one)

WeChat Translate: 555 Happy and Moving Happy Weekend (Breaking Big
Defense Edition)

是谁家的猫猫这么好看

My translation: Whose cat is it that's so pretty?
(I phrased it like this because the form 是谁家 at the beginning of
the sentence indicates that the emphasis is on the word "whose".
Translating as just "whose cat is so pretty" would lose this.)

Google: Whose cat is so beautiful

DeepL: Who's cat is so pretty?
(At least it added the missing question mark, but "Who's" is short
for "who is" which is not correct English grammar in this context)

WeChat Translate: Whose cat looks so good?

是见一次笑一次的程度

My translation: It's good enough to make you laugh every time you
see it.

Google: It's the level of seeing and laughing once

DeepL: It is the level of laughing once you see it
OR: It's a laugh every time you see it
(I clicked on the second one; that might make it more likely
to be the default next time)

WeChat Translate: Is to see the extent of a smile once

Overall I'm glad DeepL is on the scene but it's not a silver bullet.
Perhaps you could post the above (I suppose the source sentences are
in this case small and generic enough to qualify as "fair use"
without my having to go asking all the students for permission) and
let readers see what they think.  But this little experiment says
nothing about the quality of DeepL (or other machine translation)
when applied to more traditional academic texts rather than what
students are saying on their social media.  At least it's something
to start the conversation I suppose.

——

From Silas S. Brown's home page: 

Welcome. I am a partially-sighted computer scientist at Cambridge University in the UK. I come from rural West Dorset on the South-West peninsula of England.

From Ning Wu:

I have used DeepL for about 4 months, but I use it very occasionally. When I came to UPenn last September, I found that both DeepL Translator and Grammarly are widely used in the world and among the Chinese students here. So far I've never used Grammarly. Like most translation systems, DeepL translates texts by using AI networks. I guess these networks are trained on millions of translated texts. In the DeepL interface, if you enter the paragraphs you want to translate in the left box and select the target language in the right box, they will be translated and displayed in the right box after a while.
 
Besides, I also have used OCR with my iPhone since 2020.  On iPhone there is an app called "note", which not only can be used to take notes when you need, but also can scan texts and pictures. I do not use WeChat to scan texts, but more often use "note" on iPhone.  Actually I use OCR very occasionally, too.

From Neil Kubler:

I now use DeepL Translate exclusively; it really is better than Google Translate.

I learned of it from German colleagues last year.

From Hiroshi Kumamoto:

I know (since around 2020) that many people (scholars and students) here in Japan use it as a superior substitute for Google translation. They say that it produces more natural, less funny translations.

I personally have never used it. This is essentially a tool to translate something you wrote in your original language (say Japanese) to another (like English). I don't go through that process. I just write in English.

Here is the free version:

And the pro version (with fixed fees) with more advanced features.

For the time being, I (VHM) will probably continue to use GT instead of DeepL, simply because it is good enough for my present purposes and because getting used to DeepL would require some time.  However, having seen what it can accomplish in the hands of those who already know how to use DeepL, I recommend it highly for demanding tasks and fine tuning.

 

Selected readings

[Thanks to Qianheng Jiang]



7 Comments

  1. Christian Horn said,

    February 16, 2023 @ 4:51 pm

    I interact much with Japanese, English and German – and use deepl a lot these days. Another fascinating functionality I discovered these days: google meet has added subtitle generation/transcription in Japanese now, while not perfect it's quite impressive already.

    For all of these translation technologies, I hope we eventually come up with OpenSource tech with capabilities at least at the same level as deepl/google-translate.

  2. Josh R. said,

    February 16, 2023 @ 7:31 pm

    I first came across DeepL when I was hired to do cross-checking of a translation (Japanese to English). I checked a particular piece of technical writing, finding it clear and idiomatic, and thus needing no revision. But someone had forgotten to remove the "Translated by DeepL" notice that DeepL automatically attaches to long stretches of text.

    As a professional freelance translator, my reaction was, "Oh…f**k."

    DeepL is not perfect, and it works best on legal and technical text, as that seems to be the main source of its knowledge, but it is light years ahead of Google Translate for any Japanese text of significant length.

  3. Chester Draws said,

    February 16, 2023 @ 10:05 pm

    I have used DeepL for Russian to English and Polish to English in recent years. It is amazing to see how close it gets. It is notably better than any competitor.

    It works very well with military terminology, a notable weakness with most (polk being "shelf" is a common error).

    The biggest issue, unsurprisingly, is idiom.

  4. Chas Belov said,

    February 18, 2023 @ 3:04 am

    Mastodon, or at least the instance my account inhabits, uses DeepL for translation provided the toot has been correctly marked for language. Alas, as a lover of Cantonese, Cantonese is not one of the languages offered for markup.

    It didn't seem to like the following:



    The first character, meaning "to not have," is translated as "no."

    The second character, meaning "to have," is translated as "yes."

    The third character, being a sentence suffix to elicit sympathy, is translated as "hello."

    Oh, well. So I tried sentences. The results were better, although still off (or at least I think they are; my Hong Kong is very powderful).

    有冇波霸奶茶?
    冇囉!

    Gets translated by DeepL as:

    Do you have Boba milk tea?
    No!

    Where I would translate it as something like:

    Did you have Boba milk tea?
    No, and I'm miserable!

  5. Robert Hutchinson said,

    February 18, 2023 @ 3:36 am

    This is second-hand information, but I have heard from folks fluent in both Japanese and English that, at least when doing J->E translation, DeepL's biggest flaw/danger is that it sometimes likes to confidently translate certain things incorrectly, where Google Translate is more likely to output gibberish (which is of course easier to spot as something not to trust).

  6. Chester Draws said,

    February 18, 2023 @ 10:32 pm

    Robert, that is my experience from Russian and Polish with DeepL.

    It would be nice to be able to tweak it to indicate levels of "certainty" or to indicate that two alternatives are possible.

    For example, when translating military ranks it always gives an answer, even when it should "know" that the word is effectively untranslatable, because the ranks don't line up. And it still gives Divizion as "Division" when it should be battalion.

  7. Heddwen said,

    February 22, 2023 @ 7:13 am

    As a translator from Dutch to English, I have been using DeepL for years now and have seen it get progressively better.

    Perhaps because English and Dutch are so similar, it does struggle with idioms, often giving literal translations instead of idiomatic ones. It also (understandably) struggles when human authors mix their metaphors. Dutch people have a strong tendency to mix metaphors, I find, and I think this is the only reason I might still have a job in a few years.

    Almost all translation agencies (LSPs as they are called in the biz, language service providers) now no longer hire freelance translators to translate. Virtually all translation work has become PEMT, post-editing machine translation. The company sends you a machine translation, and you are required to go through and check that it is not all gibberish. Regularly, no changes are required.
