Unknown language #10


ProZ.com is a membership-based website that targets freelance translators.  They have currently posted a job for which they are seeking a qualified translator, but they are uncertain what language the source text is in.  At first sight, the sample text (see below) looks vaguely Turkic to me.  The person who posted the job notes:

We are trying to figure out this language. It was thought to be Turkish of which it is not familiar to native Turkish translators. It is thought to possibly be Turkish Tartar, Bulgarian, Georgian, Uzbek.

Here's the sample text:

Ukhant karapet qulkt kirlerek
Iqat ighun chapuq sireleq,
Poghtu Payghytei Pielereq
Azlayn qoghular eliut karapet.

Google Translate detects the sample text as Bulgarian, and even gives this Cyrillic version:

Ухант карапет яулкт кирлерек Ияат игхун чапуя сирелея, Погхту Паъгитей Пиелерея Азлайн яогхълар елиът карапет.

But the best "English" translation GT can provide is this:

Uhant carapet julkt Kirlerek
Iiaat ighun chuya sireleia,
Pogghei Pogghei Peleleria
Azina joghular elite carapet.

Note that the second word and the last word are the same.  The first word of the second line, "iqat", reminds me of Uyghur "ikat", a dyeing technique for textiles.  And there are other words in the sample passage that make me think of textile terminology and Turkic language.  But I'll stop here for now.

Any ideas?

[h.t. Thorin Engeseth]



32 Comments

  1. Kate Gladstone said,

    December 1, 2017 @ 11:20 pm

    Re "Uighur 'ikat', a dyeing technique for textiles" — No, the word 'ikat' is of Indonesian origin, not Uighur: see https://en.m.wikipedia.org/wiki/Ikat and https://www.bing.com/search?q=etymology+of+ikat&form=APIPA1&PC=APPD

  2. Kate Gladstone said,

    December 1, 2017 @ 11:25 pm

    I notice, too, that this "Unknown language #9" challenge is the second challenge to be numbered #9 on Language Log. Shouldn't this second #9 have been numbered #10?

  3. Kate Gladstone said,

    December 1, 2017 @ 11:32 pm

    I see you've corrected the number; thanks!

  4. Victor Mair said,

    December 1, 2017 @ 11:49 pm

    Thanks for catching me, both on the numbering problem in the title and about the origin of the word "ikat". I actually did a search to determine the previous "Unknown language" post, and the last one that turned up was #8. Anyway, I'm glad that's fixed now.

    As for "ikat", I should have known better, because I studied this very problem in some depth several years ago, but it slipped my mind this evening when I was writing the post. The problem is that, although I've been to Xinjiang dozens of times and visited workshops where ikat is produced, and was quite aware that the Uyghurs refer to it as "atlas", seeing the "iqat" in this Turkic-looking text caused some wires to get crossed in my brain.

    Here's the Wikipedia article on "ikat" discussing the Uyghur word:

    =====

    Uyghurs call it atlas (in IPA [ɛtlɛs]) and use it only for woman's clothing. The historical record indicates that there were 27 types of atlas during Qing occupation. Now there are only four types of Uyghur atlas remaining: Qara-atlas (Darayi, black ikat used for older women's clothing), Khoja'e-atlas (yellow, blue, purple ikat used for married women), Qizil-atlas (red ikat used for girls) and Yarkant-atlas (Khan-atlas). Yarkant-atlas has more diverse styles; during Yarkant Khanate (16th century), there ten different styles of Yarkant-atlas.

    =====

  5. Kate Gladstone said,

    December 1, 2017 @ 11:56 pm

    I believe the language to be either Georgian or Armenian. When I use Google or Bing to search the text's individual words, such as "sireleq" for instance, many of them come up ONLY in tweets written by Armenians or Georgians or in other Roman-alphabet renditions of Armenian or Georgian: for instance, "Karapet" is a masculine personal name in some varieties of Armenian (other varieties render it "Garabed") and there is a "Saint Karapet Church" in Tbilisi, Georgia. Please find someone who knows Armenian and/or Georgian.

  6. Ben said,

    December 2, 2017 @ 1:34 am

    This might be Armeno-Turkish (Ottoman Turkish written in the Armenian script) that has then been transcribed into the Latin alphabet. Karapet is, as noted, a common Armenian name and the -erek, -lar and -leq (i.e. -lik) are widely used suffixes in Turkish. "ighun" could be "içün" (modern "için"), meaning "because" or "for".

    One would need to get the mapping from the Armenian to Arabic alphabets used to see if this theory makes sense.

  7. Laura Morland said,

    December 2, 2017 @ 4:08 am

    I see that nobody on ProZ.com has yet volunteered to take on the job (which has "expired"): https://www.proz.com/job/1383293?print=1

    The request does note that it is "poetry." I bring it up simply because no one in the discussion (including the O.P.) has chosen to describe the text in that way — maybe because it's so obvious?

  8. Richard Burnham said,

    December 2, 2017 @ 6:15 pm

    Applying Google Translate with the options Uzbek, Tajik, Kazakh and Azerbaijani provides supposed translations of some of the words. Three of them are consistent but none of them make much sense.
    Uhant carpets are empty
    Iight igun is a sleek coat,
    Poghtu Payghytei Pielereq
    Lightweight eyelashes eliut karapet.

  9. Ian said,

    December 2, 2017 @ 7:02 pm

    The phonotactics are all wrong for Georgian, so I'm fairly certain it's not that. It could be something else Kartvelian like Mingrelian or Svan though.

  10. Victor Mair said,

    December 2, 2017 @ 7:21 pm

    From John Colarusso:

    It is an Oghuzic form of Turkic. (Oghuzic plural -lar, Kipchak plural -la)

    I have passed it on to Uli Schamiloglu for an exact ID.

  11. David Marjanović said,

    December 2, 2017 @ 7:35 pm

    Not Bulgarian; not remotely Slavic at all.

    It would be good to see the original script…

  12. Marcel Erdal said,

    December 3, 2017 @ 1:27 am

    Not remotely Turkic or Semitic. No European language. Probably a joke.

  13. Anonymous Coward said,

    December 3, 2017 @ 4:42 am

    qoγu is "swan" in Old Turkic and Karaim. Good enough for me.

  14. Victor Mair said,

    December 3, 2017 @ 8:13 am

    Karaim is a very interesting language:

    =====

    The Karaim language (Crimean dialect: къарай тили, Trakai dialect: karaj tili, Turkish dialect: karay dili, traditional Hebrew name lashon kedar לשון קדר "language of the nomads")[6] is a Turkic language with Hebrew influences, in a similar manner to Yiddish or Judaeo-Spanish. It is spoken by only a few dozen Crimean Karaites (Qrimqaraylar) in Lithuania, Poland and Crimea and Galicia in Ukraine. The three main dialects are those of Crimea, Trakai-Vilnius and Lutsk-Halych all of which are critically endangered. The Lithuanian dialect of Karaim is spoken mainly in the town of Trakai (also known as Troki) by a small community living there since the 14th century.

    There is a chance the language will survive in Trakai as a result of official support and because of its appeal to tourists coming to the Trakai Island Castle, where Crimean Karaites are presented as the castle's ancient defenders.

    =====
    https://en.wikipedia.org/wiki/Karaim_language

  15. languagehat said,

    December 3, 2017 @ 9:33 am

    Certainly neither Georgian nor Armenian; I would guess it's Turkic, pace Marcel Erdal (and I can't imagine what would lead anyone to say "Not remotely Turkic").

  16. Forrest said,

    December 3, 2017 @ 12:33 pm

    It's a stab in the dark, but I wonder if it might not be some variety of Hemshin (Homshetsi), spoken in northeastern Turkey. Per my understanding, it's an archaic form of Armenian with a good deal of influence from Turkish.

    "Karapet" certainly looks like the Armenian male name, while "qoghular" looks Turkish. "Chapuq" looks like Turkish "çabuk" (quickly). And "sireleq", as noted above, does turn up in some online Armenian postings.

    Bert Vaux, a linguist specializing in Armenian, has written on Hemshin.

  17. Victor Mair said,

    December 3, 2017 @ 6:21 pm

    From John Colarusso:

    If Uli cannot identify it, then there is something odd about it.

    From Uli Schamiloglu:

    Wow, what a tough nut to crack!

    It turns out that so many of these words (even when you search in various kinds of Cyrillic spellings) are unique in Google and point to this post.

    Some thoughts:

    karapet (carpet? inspired guess)

    kirlerek

    Iqat (Ar. ijad 'creation')

    ighun (Tu. ichün 'for [postposition]')

    kirlerek (I've found it used once in Turkish as 'pollution, waste')

    chapuq (Tu. chabuk? 'quick, quickly')

    Poghtu (paxta in some Central Asian languages is 'cotton')

    qoghular (Tu. 'is placed')
    [If you want to be consistent about it, though, then is -gh- like ch or like gh?]

    In Turkish dialects and Azerbaijani the 1pl imperative is in -k/-q (gedäk 'let us go', from get- 'to go, leave') so I could imagine the last word in lines 1-3 as a verb, but the stems don't make sense.

    Let me know if somebody figures it out.

  18. Erika H Gilson said,

    December 3, 2017 @ 6:59 pm

    Where did this appear, and in what context?

  19. Victor Mair said,

    December 3, 2017 @ 7:07 pm

    @Erika H Gilson

    That information is in the o.p. (with link to all that is known about the text sample).

  20. R. Fenwick said,

    December 3, 2017 @ 11:22 pm

    @Ian:

    The phonotactics are all wrong for Georgian, so I'm fairly certain it's not that. It could be something else Kartvelian like Mingrelian or Svan though.

    Nope. The cluster phonotactics aren't Kartvelian at all – there's nothing at all like the famed "harmonic clusters" so typical of all Kartvelian languages, and Svan tends to be even more exuberant with syllable-final clusters than Georgian is (wisgw "apple", for instance).

    I'm with John Colarusso on this one (and against Marcel Erdal). It's almost certainly Turkic, but that covers a lot of ground; the many instances of q and the relatively small but clear amount of vowel harmony make me think of some Caucasian Turkic language like Karachay-Balkar or Kumyk (I doubt it's either Tatar or Noghay), but I'll have to fish through my dictionaries to find out.

  21. Keith said,

    December 4, 2017 @ 3:03 am

    I think that this is more difficult than it needs to be…
    The page at ProZ.com mentions that this is a transcription of a handwritten text. We can't see the original document, so have to accept that it was in Latin script, that it was perfectly legible so the transcription is flawless including that the use of capital letters in this transcription is faithful to the original. The page also mentions "Subject field: Poetry letter"… I haven't decided whether that is a help or a hindrance, so I'll just ignore it, even though "-ek" "-eq" look like rhymes.
    That leads me to a couple of thoughts.
    1. That "karapet" is not the Armenian given name Karapet / Garabed, since it is not written with an initial capital letter.
    2. That "Payghytei Pielereq" is possibly a proper noun, either a person or a place, because of the initial capital letters.
    From there, picking out words here and there:
    kara is Azeri and Turkish for black,
    Google Translate tells me that "qulkt" is "cunt" and "kir" is "dirt" in Azeri.
    Azlayn looks Turkic, and is Azeri for "offline", and there is a training centre in Baku whose name contains this word ("Azlayn" tədris mərkəzi).
    But I don't know much about Turkic languages, certainly not enough to be able to properly identify roots within words, or to identify the functions of words from their endings.

  22. Andy said,

    December 4, 2017 @ 10:48 am

    I'm so far in agreement with those who have suggested some sort of mixture of Armenian and Turkic (or other languages). The word 'sireleq' is legit (Eastern) Armenian, so unless that turns out to be a coincidence, there surely is an Armenian element to the text: սիրել (sirel) is the verb 'to like/love' (in the infinitive); սիրել եք (sirel ekʿ) is the 2pl present perfect; although the periphrastic tenses are properly written as two words, a perusal of the internet makes it quite clear that they're often written as one; and while the consonant is often transliterated by , it seems just as usual to use or (at least informally) even plain -again, plenty of examples on the internet.

    Aside from that, a couple of random thoughts:
    1. The Turkish 'çabuk' (quick) is apparently from Persian. Wiktionary has 'Middle Persian čābuk, Old Persian (čapuka). Compare Old Armenian ճապուկ (čapuk), from Iranian'. If the latter is true, then the 'chapuq' of the text is a naturalized Armenian word and doesn't necessarily indicate Turkic influence. (Not quite sure about the for here.)
    2. I think that 'karapet' designates a commander of some sort; պետ (pet) is 'chief' or 'master', and comes from PIE *pótis; if I understand correctly, the first bit is related to գործել (gorcel), 'to act/work'. I found it here: https://goo.gl/zU2vDD
    I haven't properly checked in this dictionary for the other words or similar things, but it might well be worthwhile (and easier) for someone who is actually proficient in Armenian to do so.
    3. 'Poghtu' is reminiscent of the Armenian name Պողոս (Poghos), which is 'Paul'.
    4. 'Chapuq' is also the agentive form of the Quechua verb chapuy 'to submerge', so this text being in a hitherto lost and unconvincingly related South American language is my fallback position.

  23. Andy said,

    December 4, 2017 @ 11:06 am

    Pleasing that Language Log doesn't seem to like pointy brackets and their contents :( Here are those lines in full:
    Lines 8-9: and while the consonant ք is often transliterated by k', it seems just as usual to use q or (at least informally) even plain k…
    Line 16: (Not quite sure about the q for k here.)

  24. Victor Mair said,

    December 4, 2017 @ 12:52 pm

    From Alexander Vovin:

    Nothing that I can recognize immediately. But if these folks really want help with identification (unless it is a hoax, of course), then considerably more information is needed than 4 lines. First, more data. Second, more info re the actual script, where this/these texts were found, etc., etc. There is some rudimentary rhyming, so that should point to Eurasia (since rhyming spread out of China), but anything that could possibly narrow the geographical area will be of great help.

  25. Victor Mair said,

    December 4, 2017 @ 4:56 pm

    From Peter Golden:

    I agree with Marcel – there is nothing Turkic there—unless you make some really wild stretches – nor Semitic. It's not Georgian either, nor does it have the look of any other Kartvelian language. It is a puzzle. I have heard – years ago – some really interesting North Caucasian languages spoken by various groups that fled the Russian conquest and settled in Anatolia in the 19th century. They are often lumped together as Çerkes (Circassian), but quite a few are not. Had it been Circassian, John Colarusso would have spotted it right away and probably other N. Caucasian languages (e.g. Chechen) as well. Is it a prank?

  26. Victor Mair said,

    December 4, 2017 @ 4:58 pm

    From Stefan Georg:

    I had a look at this mystery text, and I think that this is (*may* be) garbled Armenian (with the emphasis on *garbled*) – if it's love poetry, it has at least an Armenian verb form of the verb ‚love' in it (sirelik', it would be though, a future gerund, or, then sirel ek' ‚you-pl. have loved'), and the word karapet makes sense in Armenian – first, it is a figure from pre-Christian Armenian mythology, but also a standing epithet of St. John the Baptist (lit. the „forerunner", i.e. of Christ). The phonotactics seems to be vaguely compatible with such an interpretation, *but* as such it is *not* Armenian, neither Classical nor modern Western or Eastern. I would assume that somebody with half-baked knowledge of the script tried to transcribe Armenian into Latin letters, but failed spectacularly. This is my best guess, and of course it might be as wrong as all the others – which are, of course, leading nowhere. This is, as Prof. Erdal said, not even remotely Turkic, anything European, and – sure – also nothing (West, East or South) Caucasian. If it should have been originally anything like this, the transcriber must have done a much more thorough job garbling it beyond recognition (in which case it might be anything or nothing, including everything ruled out so far). But Armenian may be, at last, a trace.
    Maybe it is from the poetry of Sayat-Nova, a very popular (love) poet and musician from the 18th century (his father was named Karapet, but this may not mean anything), but this is neither here nor there – his poetry is also popular in Georgia and Azerbaijan, both in the local languages, and in Armenian – moreover, ethnic Armenians in Georgia often do not know the Armenian script, so this might be a poem/song (vaguely) remembered and then put on paper without any coherent idea of how to do it.

    Be that as it may, if the translation agency insists on not parting with the original document (maybe they „transcribed" it themselves?) and as much context as they may find, this riddle will remain unsolved …

    And, of course, despite all my chatter about this, the likelihood that this is really (an attempt to render something in) Armenian is, of course, only slightly above zero :-)

  27. Victor Mair said,

    December 4, 2017 @ 5:00 pm

    From Peter Golden:

    It's not Qarachay or Qumuq – I don't know how one of the posters came to that conclusion. I worked on Qarachay texts 50 years ago at the Dil ve Tarih-Coğrafya Fakültesi with Saadet Çağatay (who knew every Turkic language)

    qoghular is the only thing that looks vaguely Turkic (Old Turk. qughu "swan" = quw in Qumuq (already found as such in Middle Qıpchaq texts), but Qangqaz in Qarachay).

    Of course, it may be a botched transcription of…whatever. In which case, one can read whatever one wants into it.

  28. Chris C. said,

    December 4, 2017 @ 5:17 pm

    @Andy — It's because you can include HTML tags in your posts. Angle brackets are interpreted as enclosing tags, which browsers simply ignore if invalid. You should be able to make them display using HTML entities &lt; and &gt; <like so>.

    (And I'm hoping this comes out right. I liked the live preview they used to have here.)

  29. Andy said,

    December 6, 2017 @ 2:37 am

    @Chris C.: Cheers, that's most helpful :)

  30. Andy said,

    December 6, 2017 @ 5:08 am

    Has everyone given up on the mystery text? I managed to find 'kirlerek' a couple more times on the internet. It's definitely proper Turkish, some sort of gerund form of the verb 'kirlemek' (to make dirty, pollute); it's used for actions happening at the same time as the main verb (which might be 'sireleq'). I also noticed that 'qoghular' looks quite like the Azeri 'qoxular' (smells, the noun). They never seem to use gh for x though, so it's very uncertain. I'm coming round to the idea that it's either a hoax or otherwise garbled and incomplete.

  31. /df said,

    December 6, 2017 @ 12:02 pm

    Could it be that ProZ (or its customer) has unwittingly come upon some sort of Caucasian 'lorem ipsum' text?

  32. languagehat said,

    December 6, 2017 @ 12:09 pm

    Could it be that ProZ (or its customer) has unwittingly come upon some sort of Caucasian 'lorem ipsum' text?

    Ha! Best suggestion yet.
