Recep Tayyip Erdoğan has been the prime minister of Turkey for 11 years. On Monday, someone posted on YouTube what purports to be recordings of a series of phone conversations between Erdoğan and his son, discussing how to hide a billion dollars or so in cash: "Başçalan Erdoğan'ın Yalanlarının ve Yolsuzluklarının Kaydı" = "Recording of Erdogan's lying and corruption". Here's an acted version of an English translation, from "Full transcript of voice recording purportedly of Erdoğan and his son", Today's Zaman 2/26/2014:
Some more coverage — "Turkish Prime Minister Erdogan's phone talks with his son Bilal, about where to hide the money (english translation)", LiveLeak 2/24/2014; Glen Johnson, "Turkish Prime Minister Erdogan denounces 'vile attack' against him", LA Times 2/25/2014; Tim Arango, "Turks Are Glued to a Sensational Drama, This One Political", NYT 2/25/2014; Roy Gutman, "Erdogan recordings appear real, analyst says, as Turkey scandal grows", Miami Herald 2/26/2014; Humeyra Pamuk, "Turkish Prime Minister targeted in second audio tape", Reuters 2/26/2014; "New leaked recording reveals Erdoğan allegedly unhappy about $10 mln bribe", Today's Zaman 2/26/2014; Tim Arango, "Turkish Leader Disowns Trials That Helped Him Tame Military", 2/26/2014.
Since Erdoğan and his spokesmen have insisted that the recordings are faked, there have already been some expert attempts to explore the question. "PM Erdoğan's tapes not doctored, specialists agree", Today's Zaman 2/26/2014, gives an unusually detailed account of one such investigation:
Audio engineer Kıvanç Kitapçı wrote on his self-named WordPress blog that the recordings must be real as per analyses he has conducted in his acoustics laboratory.
This refers to a post in Turkish, "Tayyip Erdogan – Bilal Erdogan Telefon Gorusmesi Analizi", 2/24/2014, which I have not yet tried to understand. But Zaman's English summary doesn't make much sense to me:
He said there are two methods of imitating a conversation. One of these methods is known as “copy and paste,” which means that an amalgamation of different words are cropped from previous speeches. For this to work, the pitch levels of the words need to be normalized, their frequencies should be synchronized and modulated, etc.; the final product, however, is patchy.
Kitapçı argues that generating fake audio recordings through this method is nearly impossible, and that the human ear for the most part can easily discern authenticity, without even needing a professional analyst.
It's true that it's very difficult to create a fluent phrase by simple time-domain splicing of audio fragments from diverse sources. But there are well-known techniques for making local modifications, e.g. PSOLA and other methods commonly used in concatenative speech synthesis and other applications, which can produce quite natural-sounding results.
The other method is to record the conversation of two different people who are good at imitating voices and manipulating the recording to make it sound authentic. He decided to check if the leaked recordings were a product of this type of fabrication.
Kitapçı, who calls himself a devoted specialist on speech acoustics and language processing technology, said he arbitrarily selected a sample of 20 easy words that are pronounced clearly in the voice recording. For the next step, he said, he spent almost five hours searching YouTube to find Erdoğan saying exactly the same words in other speeches that he has delivered before, to compare and contrast them and see whether they differ in what is called the fundamental frequency level, the raw sound that is created as air vibrates the vocal cords. Even a professional voice imitation needs to be done at the harmonic phase by modulating the sound after it passes the vocal cords. But the fundamental frequency is like a fingerprint, inimitable. So, using the Praat linguistic software, he made an F0 Contour measurement of the words he chose.
With a five percent margin of error, the result was that the voice in the recording in fact does belong to the prime minister.
As written, this is nonsense — fundamental frequency contours are not "like a fingerprint". An individual's F0 contours in different recordings of the same word or phrase in different contexts will generally be very different. And in particular, there will be large overall F0 differences between a public-speaking voice and a private telephone-conversation voice.
Nor is there any reason to think that a particular F0 contour, much less a characteristic pattern of F0 contours, is "inimitable", either by human mimicry or by speech modification techniques.
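For readers who want a concrete sense of what an F0 measurement is, here is a minimal sketch in Python. It is not Praat's pitch tracker and not Kitapçı's procedure: it is a crude autocorrelation estimate applied to synthetic harmonic tones, not to real speech, and the function names (`synth_voiced`, `estimate_f0`), the 8000 Hz sampling rate, and the 60-400 Hz search range are all assumptions made for illustration.

```python
import math

def synth_voiced(f0_hz, sr=8000, dur=0.05, n_harmonics=5):
    """Crude stand-in for a voiced sound: a sum of harmonics of f0_hz.
    This is a synthetic tone, not real speech."""
    n = int(sr * dur)
    return [
        sum(math.sin(2 * math.pi * f0_hz * k * t / sr) / k
            for k in range(1, n_harmonics + 1))
        for t in range(n)
    ]

def estimate_f0(frame, sr=8000, fmin=60.0, fmax=400.0):
    """Estimate the F0 of one frame as sr / (lag of the largest
    autocorrelation value) within the assumed fmin..fmax search range."""
    lag_min = int(sr / fmax)   # shortest period considered
    lag_max = int(sr / fmin)   # longest period considered
    best_lag, best_val = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        # Unnormalized autocorrelation of the frame at this lag
        val = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return sr / best_lag

# The same harmonic "recipe" produced at two different pitches
low = estimate_f0(synth_voiced(120.0))
high = estimate_f0(synth_voiced(200.0))
print(round(low, 1), round(high, 1))
```

The two tones share the same harmonic recipe and differ only in the pitch at which they were produced, and the estimate simply tracks that pitch. That is the sense in which an F0 value describes how something was said on a particular occasion, rather than identifying who said it.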
The Zaman story continues:
Kitapçı was not alone. Reuters correspondent Ece Toksabay reported on Twitter that the owners of two İstanbul recording studios, Babajim Records and STD, separately found out through spectrogram analysis that the recordings had not been doctored.
This is more like a normal journalistic report, i.e. vague enough that it's impossible to tell what it means. "Spectrogram analysis" could mean anything or nothing.
Elsewhere, Tacidar Seyhan, a former Republican People's Party (CHP) deputy and specialist on information technology, said the conversations don't include any digitally derived sounds. “This is [their] original speech,” he said. He also criticized Erdoğan for using the term “dubbing” because that word refers to the addition of voice to an image. If he speaks of editing, it doesn't represent a denial of the conversation, Seyhan noted. He said merging different pieces of Erdoğan's previous speeches is definitely not present in the leaked tapes. “I meticulously analyzed them, listened to them over and over again,” he said.
Listening to the recordings, looking at the audio waveform and spectrograms and pitch tracks, I likewise don't see any evidence that the recordings are fake. But here as elsewhere, absence of evidence is not evidence of absence. And the sound quality is not very good, which would make it easier to hide any signs of fakery.
On the other hand:
In the meantime, Science, Technology and Industry Minister Fikri Işık fired five officials said to be in charge of cryptographic phones in the Scientific and Technological Research Council of Turkey's (TÜBİTAK) Research Center for Advanced Technologies on Informatics and Information Security (BİLGEM). Işık told reporters on Wednesday in İstanbul that the officials were sacked after Erdoğan's complaint that even his cryptographic phone was tapped, which was perceived as a confirmation that the controversial conversation actually took place and that the recordings were authentic. Işık said an administrative and a technical inspection has begun to look into it.
This strikes me as more persuasive evidence that the basic recordings are genuine. And the general response seems to be based on the high prior plausibility of billions of dollars/euros/liras being hidden in the Erdoğan family's houses.
Apparently modern politicians have not yet been persuaded of the alleged value of bitcoins and similar vehicles for untraceable private financial transactions.