Erdogan's phone conversations


Recep Tayyip Erdoğan has been the prime minister of Turkey for 11 years. On Monday, someone posted on YouTube what purports to be recordings of a series of phone conversations between Erdoğan and his son, discussing how to hide a billion dollars or so in cash: "Başçalan Erdoğan'ın Yalanlarının ve Yolsuzluklarının Kaydı" = "Recording of Erdogan's lying and corruption". Here's an acted version of an English translation, from "Full transcript of voice recording purportedly of Erdoğan and his son", Today's Zaman 2/26/2014:

Some more coverage: "Turkish Prime Minister Erdogan's phone talks with his son Bilal, about where to hide the money (english translation)", LiveLeak 2/24/2014; Glen Johnson, "Turkish Prime Minister Erdogan denounces 'vile attack' against him", LA Times 2/25/2014; Tim Arango, "Turks Are Glued to a Sensational Drama, This One Political", NYT 2/25/2014; Roy Gutman, "Erdogan recordings appear real, analyst says, as Turkey scandal grows", Miami Herald 2/26/2014; Humeyra Pamuk, "Turkish Prime Minister targeted in second audio tape", Reuters 2/26/2014; "New leaked recording reveals Erdoğan allegedly unhappy about $10 mln bribe", Today's Zaman 2/26/2014; Tim Arango, "Turkish Leader Disowns Trials That Helped Him Tame Military", NYT 2/26/2014.

Since Erdoğan and his spokesmen have insisted that the recordings are faked, there have already been some expert attempts to explore the question. "PM Erdoğan's tapes not doctored, specialists agree", Today's Zaman 2/26/2014, gives an unusually detailed account of one such investigation:

Audio engineer Kıvanç Kitapçı wrote on his self-named WordPress blog that the recordings must be real as per analyses he has conducted in his acoustics laboratory.

This refers to a post in Turkish, "Tayyip Erdogan – Bilal Erdogan Telefon Gorusmesi Analizi", 2/24/2014, which I have not yet tried to understand. But Zaman's English summary doesn't make much sense to me:

He said there are two methods of imitating a conversation. One of these methods is known as “copy and paste,” which means that an amalgamation of different words are cropped from previous speeches. For this to work, the pitch levels of the words need to be normalized, their frequencies should be synchronized and modulated, etc.; the final product, however, is patchy.

Kitapçı argues that generating fake audio recordings through this method is nearly impossible, and that the human ear for the most part can easily discern authenticity, without even needing a professional analyst.  

It's true that it's very difficult to create a fluent phrase by simple time-domain splicing of audio fragments from diverse sources. But there are well-known techniques for making local modifications, e.g. PSOLA and other methods commonly used in concatenative speech synthesis and other applications, which can produce quite natural-sounding results.

The other method is to record the conversation of two different people who are good at imitating voices and manipulating the recording to make it sound authentic. He decided to check if the leaked recordings were a product of this type of fabrication.  

Kitapçı, who calls himself a devoted specialist on speech acoustics and language processing technology, said he arbitrarily selected a sample of 20 easy words that are pronounced clearly in the voice recording. For the next step, he said, he spent almost five hours searching YouTube to find Erdoğan saying exactly the same words in other speeches that he has delivered before, to compare and contrast them and see whether they differ in what is called the fundamental frequency level, the raw sound that is created as air vibrates the vocal cords. Even a professional voice imitation needs to be done at the harmonic phase by modulating the sound after it passes the vocal cords. But the fundamental frequency is like a fingerprint, inimitable. So, using the Praat linguistic software, he made an F0 Contour measurement of the words he chose.  

With a five percent margin of error, the result was that the voice in the recording in fact does belong to the prime minister.

As written, this is nonsense — fundamental frequency contours are not "like a fingerprint". An individual's F0 contours in different recordings of the same word or phrase in different contexts will generally be very different. And in particular, there will be large overall F0 differences between a public-speaking voice and a private telephone-conversation voice.

Nor is there any reason to think that a particular F0 contour, much less a characteristic pattern of F0 contours, is "inimitable", either by human mimicry or by speech modification techniques.
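
To make the point concrete, here is a minimal sketch of how a per-frame F0 value can be estimated by autocorrelation. Everything in it is an assumption for illustration: the function names `estimate_f0` and `synthetic_voice`, the 8 kHz sample rate, and the two synthetic "utterances" at 100 Hz and 160 Hz, which stand in for one speaker in a quieter and a more animated register. It is not Kitapçı's procedure, and Praat's own pitch tracker is considerably more refined.

```python
import numpy as np

def estimate_f0(frame, sample_rate, fmin=60.0, fmax=400.0):
    """Estimate the fundamental frequency (Hz) of one frame by autocorrelation."""
    frame = frame - np.mean(frame)
    # Autocorrelation at non-negative lags only.
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    # Restrict the search to lags corresponding to plausible voice pitch.
    lag_min = int(sample_rate / fmax)
    lag_max = int(sample_rate / fmin)
    best_lag = lag_min + np.argmax(ac[lag_min:lag_max + 1])
    return sample_rate / best_lag

def synthetic_voice(f0, sample_rate=8000, duration=0.05, n_harmonics=5):
    """Hypothetical stand-in for voiced speech: a few harmonics of f0."""
    t = np.arange(int(sample_rate * duration)) / sample_rate
    return sum(np.sin(2 * np.pi * f0 * k * t) / k
               for k in range(1, n_harmonics + 1))

# Same harmonic recipe ("same speaker"), two different pitch levels.
quiet = estimate_f0(synthetic_voice(100.0), 8000)
animated = estimate_f0(synthetic_voice(160.0), 8000)
print(quiet, animated)
```

The two estimates come out far apart even though the synthetic "speaker" is identical, which is why matching or mismatching F0 values say little about identity by themselves.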

The Zaman story continues:

Kitapçı was not alone. Reuters correspondent Ece Toksabay reported on Twitter that the owners of two İstanbul recording studios, Babajim Records and STD, separately found out through spectrogram analysis that the recordings had not been doctored.  

This is more like a normal journalistic report, i.e. vague enough that it's impossible to tell what it means. "Spectrogram analysis" could mean anything or nothing.

Zaman continues:

Elsewhere, Tacidar Seyhan, a former Republican People's Party (CHP) deputy and specialist on information technology, said the conversations don't include any digitally derived sounds. “This is [their] original speech,” he said. He also criticized Erdoğan for using the term “dubbing” because that word refers to the addition of voice to an image. If he speaks of editing, it doesn't represent a denial of the conversation, Seyhan noted. He said merging different pieces of Erdoğan's previous speeches is definitely not present in the leaked tapes. “I meticulously analyzed them, listened to them over and over again,” he said.

Listening to the recordings, looking at the audio waveform and spectrograms and pitch tracks, I likewise don't see any evidence that the recordings are fake. But here as elsewhere, absence of evidence is not evidence of absence. And the sound quality is not very good, which would make it easier to hide any signs of fakery.
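
As a rough illustration of what "looking at the waveform" can and cannot show, the sketch below flags sample-to-sample jumps that are much larger than is typical for the signal, which is the kind of trace a crude time-domain splice can leave. The function name `find_jumps`, the threshold of 8 times the median jump, and the synthetic 100 Hz test tone are all assumptions made for the example, not a description of any analysis actually performed on the leaked recordings.

```python
import numpy as np

def find_jumps(signal, threshold=8.0):
    """Return sample indices where the jump from the previous sample is
    more than `threshold` times the median jump in the signal."""
    diff = np.abs(np.diff(signal))
    typical = np.median(diff) + 1e-12  # guard against an all-constant signal
    return np.flatnonzero(diff > threshold * typical) + 1

sample_rate = 8000
t = np.arange(800) / sample_rate
clean = np.sin(2 * np.pi * 100 * t)

# Crude splice: cut out 40 samples, leaving a phase jump at the join.
spliced = np.concatenate([clean[:410], clean[450:]])

print(find_jumps(clean))    # indices flagged in the untouched tone
print(find_jumps(spliced))  # indices flagged in the spliced tone
```

A check like this only catches the clumsiest edits. Real speech contains legitimately abrupt events such as stop bursts, a splice made at a zero crossing or smoothed with a crossfade leaves no such jump, and noise or lossy coding can bury whatever trace remains.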

On the other hand:

In the meantime, Science, Technology and Industry Minister Fikri Işık fired five officials said to be in charge of cryptographic phones in the Scientific and Technological Research Council of Turkey's (TÜBİTAK) Research Center for Advanced Technologies on Informatics and Information Security (BİLGEM). Işık told reporters on Wednesday in İstanbul that the officials were sacked after Erdoğan's complaint that even his cryptographic phone was tapped, which was perceived as a confirmation that the controversial conversation actually took place and that the recordings were authentic. Işık said an administrative and a technical inspection has begun to look into it.

This strikes me as more persuasive evidence that the basic recordings are genuine. And the general response seems to be based on the high prior plausibility of billions of dollars/euros/liras being hidden in the Erdogan family's houses.

Apparently modern politicians have not yet been persuaded of the alleged value of bitcoins and similar vehicles for untraceable private financial transactions.

10 Comments

  1. Victor Mair said,

    February 27, 2014 @ 4:18 pm

    I asked several Turkish colleagues and friends whether they thought the recordings were genuine. Here are some of the replies I received:

    1.
    I can't get over this – the funny thing is the PM or his son did not deny that it is their voices? A montage like that – is it possible? Probably but then, the snippets they have put together are not so innocent either??

    2.
    Sounds like it!

    More TK

  2. Jerry Friedman said,

    February 27, 2014 @ 7:12 pm

    MYL: Out of curiosity, could you recognize some fakes from "listening to the recordings, looking at the audio waveform and spectrograms and pitch tracks", or by any other means, even if people who knew the speakers couldn't tell? And could you ever be so sure a recording was genuine that you could testify to it?

    [(myl) It would be easy to detect time-domain splices that result in certain kinds of waveform, F0, or spectral discontinuities. And depending on the nature of the recording and its digital history, it might be easy to detect certain kinds of analysis/resynthesis processing.

    The evidence of such interventions might be inaudible or at least not salient to the ear, but obvious to close digital scrutiny or some sort of statistical signal analysis.

    However, I'm skeptical that it would ever be possible to state that no "audio photoshopping" could have been done to a given recording — only that techniques X, Y, or Z were not applied to passage Q. So if the claim to be evaluated is that the fakery was done by such-and-such a person, who knows how to use a DAW but doesn't have mad signal-processing skillz, I might feel comfortable asserting (say) that the continuous phrases without internal silences must have been original. But if the source of the possible fakery is unknown and perhaps very sophisticated, then I think it gets to be pretty hard to be sure about what did or didn't happen.]

  3. Michael P said,

    February 28, 2014 @ 7:44 am

    Bitcoin would probably be hard to use for hiding money on this scale. Thanks to the global transaction ledger, Bitcoin trades are more traceable than cash (but usually less attributable); it is essentially a barter network that runs on individually smallish transactions, and it is not very liquid; and many of the exchanges that improve liquidity were (and probably still are) vulnerable to third-party attacks on transactions. A corrupt official trying to hide hundreds of millions of dollars/euros/etc is probably better served by mattresses full of cash.

  4. tuncay said,

    February 28, 2014 @ 8:14 am

    An audio forensics lab (who did the original analysis for Miami Herald) has also released a preliminary report claiming that the recordings seem genuine.

    I agree with myl's former comment that it is practically impossible to claim with 100% accuracy that the recordings are real. However, the PM has very easy tools at his disposal if indeed the conversations were not real. All of the conversations are timestamped in the leaked video, and there are cues to where various people are at those points in time in each dialogue. If he were to show a counterexample to any of the claims in the dialogues (which he claims are doctored), then the case would be closed.

    But of course, he knows and I know and you know that they are as real as the hazy sky in Istanbul as I type these words..

  5. J. W. Brewer said,

    February 28, 2014 @ 12:48 pm

    Re the source of potential fakery being unknown, from what little I know about Turkish politics it seems simultaneously to be the case that: a) the allegations may well be true and are at least plausible (although of course false allegations are often made against exactly the sort of people who might well have done the thing alleged but happen in the particular instance not to have – that's behind the whole "round up the usual suspects" notion) and perhaps the failure to specifically rebut them in ways that ought to be available to the accused if the audio was faked is itself a highly significant fact; but b) Erdogan has significant, well-connected, and well-funded political enemies, including quite a substantial percentage of the old political establishment which used to run things and major factions of the military/security/intelligence apparatus (who of course have long-standing connections with non-Turkish intelligence services who might be unhappy, for all I know justifiably so, with Erdogan). So this seems like a situation in which it would be prudent to assume that those with an incentive to fake would have access to however high-end/sophisticated/expensive a mode of fakery as might be available.

    To try a U.S. analogy, assume an alternative universe in which the kookiest/fringiest "birther" opponents of the current President were in league with the leadership of the CIA/NSA/etc. or a significant faction thereof. Consider how impressively authentic-seeming the newly-"discovered" evidence purporting to establish the President's actual birth in Kenya would be likely to be under those circumstances. Although of course the same sort of shadowy conspirators could also just tap the President's supposedly ultra-top-secret phone and selectively leak embarrassing but authentic audio of things he thought he'd said in confidence . . .

  6. Victor Mair said,

    March 1, 2014 @ 12:27 am

    From a friend in Turkey:

    Yes, most probable… There has been severe discussions about
    the authenticity of the tappings, but the things that Erdogan had done
    after 17th December, the day Reza Zerrap, a 29 year-old Iranian and
    35 other construction barons had been arrested to wash Iranian
    gold transaction through Halkbank (People's Bank), and on the
    25th of December, another corruption arrest wave was stopped by
    Erdogan in which allegedly his son, Bilal, was one of them, indicate
    the telephone conversations are real… In fact, Erdogan accept the
    earlier tappings in which he was giving orders to the press men to
    ban some news in Haberturk Television.

    Actually, Erdogan is not denying the content of the mobile telephone
    tappings which were done by encrypted devices. From the Government
    Scientific Office which wrote the algorithms of encryptions,
    five experts were expelled because they had done
    the tappings, after Erdogan had accepted that he was tapped even in his
    encrypted devices…

    The funny thing is that, there will be a local election on the 30th of March,
    and the votes would not been affected by the corruption events…

    Erdogan will won the elections…

  7. Peter Taylor said,

    March 1, 2014 @ 4:28 am

    ENF analysis might serve to authenticate the recordings if there's a suitable database of mains hum available, but even if it does exist I would expect it to be under Erdogan's control.

  8. Michelle K. Gross said,

    March 4, 2014 @ 12:24 am

    voice synthesis using mimic techniques?
    http://www.isca-speech.org/archive/interspeech_2010/i10_2154.html
    Evaluation of Speaker Mimic Technology for Personalizing SGD Voices (September 26-30. 2010)

    http://www.modeltalker.com/comparison.html
    http://www.modeltalker.com/mtvoices/katesynth.wav was recorded and hand-corrected at the Speech Research Laboratory,

  9. Ellie said,

    March 5, 2014 @ 2:42 pm

    Michelle, the voice on the audio is not nearly half as mechanic as the one in your example

  10. saki said,

    April 21, 2014 @ 6:03 am

    It is 1million% real, if you know his "character" and his akp party's dna. I still do not understand people who have doubt on this. The funny and tragic thing is that he is not denying the content but he is furious and is complaining about illegal recordings!!!! After all these scandals, he still plans to run for presidency, without any shame!!! He will go down, he must go down otherwise darkness will be all over Turkey.
