The 17th annual Blizzard Challenge


In today's email, an announcement for the 17th annual Blizzard Challenge:

We are delighted to call for participation in the Blizzard Challenge 2021. This is an open evaluation of corpus-based speech synthesis systems using common datasets and a large listening test.

This year, the challenge will provide a European Spanish speech dataset from one native speaker. The dataset was offered by iFLYTEK Co. Ltd. and is now available for downloading after registration and completing the license.

The two tasks involve building voices from this data to synthesise texts containing only Spanish words and to synthesise Spanish texts containing a small number of English words in each sentence.

Please read the full announcement and the rules at:

Please register by following the instructions on the web page, then wait for your registration to be accepted before completing the data license.

Important: please send all communications about Blizzard to the official address and not to our personal addresses.

Please feel free to distribute this announcement to other relevant mailing lists.

Zhenhua Ling & Simon King

steering committee: Alan Black, Keiichi Tokuda, Simon King

Why "Blizzard"?

Because about 20 years ago, back in the technological Neolithic era, Alan Black and others created the "CMU_ARCTIC speech synthesis databases", a set of single-speaker audio datasets for use in creating speech synthesis systems. As they wrote:

Since it is very important to us that use of the Arctic corpus be unrestricted, we needed to start from a source of written material that does not impose any copyright restrictions incompatible with our aims. Although there are legally defined “fair use” rights in the US that specifically allow for the extraction of short quotes from a larger body of work, such rulings principally consider the needs of scholarship and of review (for which specific attribution is apparent). Our use does not exactly fit this category, causing us to be cautious. We don't want there to be any residual questions about the availability of this release. […]

[Therefore] we chose to use out-of-copyright books from the Gutenberg Project. With most of these texts being at least 70 years old, we face the issue of language drift. The English language has changed considerably over the past centuries and we did not want to infuse in our prompt set archaic English sentences. Thus we have hand selected a set of short stories whose style is recognizably modern, if not completely contemporary. Partly for consistency and partly from personal preference, we selected stories largely from the early 20th century author Jack London. Many of these stories – famously “To Build a Fire” – depict the difficult living conditions of the Yukon. Other selected books also describe the far Canadian north, hence our moniker Arctic.

And so in 2005, when they set up a speech-synthesis challenge (where the task was to create a synthesis system from an appropriately selected and annotated set of read sentences), they decided to call it the "Blizzard Challenge."

Some past LLOG Blizzard-related posts:

"A synthetic singing president?", 8/13/2010
"Blizzard Challenge 2012", 5/16/2012
"2013 Blizzard Challenge", 5/20/2013
"The 2016 Blizzard Challenge", 5/20/2016
"Blizzard Challenge: Listeners wanted", 6/5/2017
"Blizzard Challenge: Appeal for volunteer judges", 7/7/2019



  1. Gregory Kusnick said,

    March 29, 2021 @ 2:01 pm

    I gather that "synthesise texts" in this context means "synthesise speech from text".

  2. Rick Rubenstein said,

    March 29, 2021 @ 4:00 pm

    See, I thought it was going to be a corpus of World of Warcraft in-game chat.

    [(myl) Good idea though.]

  3. stephen said,

    March 31, 2021 @ 9:12 pm

    "A synthetic singing president?", 8/13/2010

  4. stephen said,

    March 31, 2021 @ 9:20 pm

    Oops, I hit the return button too soon. I checked out the 2010 article.
    "…The Beatles Complete on Ukulele project, and introduced me to its creator, David Barratt…"

    Somebody wanted to have Obama contribute to a Beatles collection, by sampling Obama's public speeches. I went to the website and they do have Obama with "Let it Be." I'm sure he found out at some point. When did he find out, and what was his response? Doesn't he sing well in real life?
