A diarization corpus from Amazon
About a month ago, Zaid Ahmed and others in Amazon's speech research group released DiPCo ("Dinner Party Corpus"), "a new data set that will help speech scientists address the difficult problem of separating speech signals in reverberant rooms with multiple speakers".
The past decade has seen striking progress in Human Language Technology, brought about by new methods, more training data, and (especially) cheaper and faster computers. But this rapid progress highlights the fact that "All problems are not solved", as I wrote last year. In particular, the central problem of "diarization", or determining who spoke when, has turned out to be a surprisingly difficult one. And diarization is not just hard for conversations at dinner parties.

