What's hot at ICASSP
This week I'm at IEEE ICASSP 2017 in New Orleans — that's the "Institute of Electrical and Electronics Engineers International Conference on Acoustics, Speech and Signal Processing", pronounced /aɪ 'trɪ.pl i 'aɪ.kæsp/. I've had joint papers at all the ICASSP conferences since 2010, though I'm not sure that I've attended all of them.
This year the conference distributed its proceedings on a nifty little guitar-shaped USB key, which I promptly copied to my laptop for easier access. I seem to have deleted my local copies of most of the previous proceedings, but ICASSP 2014 escaped the reaper, so I decided to while away the time during one of the many parallel sessions here by running all the .pdfs (1703 in 2014, 1316 this year) through pdftotext, removing the REFERENCE sections, tokenizing the result, removing (some of the) unwordlike strings, and creating overall lexical histograms for comparison. The result is about 5 million words for 2014 and about 3.9 million words this year.
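For concreteness, here's a minimal Python sketch of that kind of extraction-and-counting pipeline. It assumes pdftotext is installed; the directory names, the crude cut at the REFERENCES heading, and the simple tokenization rule are illustrative stand-ins rather than the exact commands used here:

    import collections, re, subprocess, pathlib

    def lexical_histogram(pdf_dir):
        """Build a word-frequency histogram for all PDFs in a directory."""
        counts = collections.Counter()
        for pdf in pathlib.Path(pdf_dir).glob("*.pdf"):
            # pdftotext writes to stdout when the output file is "-"
            text = subprocess.run(["pdftotext", str(pdf), "-"],
                                  capture_output=True, text=True).stdout
            # crude cut at the REFERENCES heading, if one is found
            text = re.split(r"\n\s*REFERENCES\s*\n", text, maxsplit=1)[0]
            # keep alphabetic tokens only, dropping "unwordlike" strings
            counts.update(w.lower() for w in re.findall(r"[A-Za-z]+", text))
        return counts

    hist2014 = lexical_histogram("icassp2014_pdfs")   # hypothetical paths
    hist2017 = lexical_histogram("icassp2017_pdfs")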
And to compare the lists, I used the usual "weighted log-odds-ratio, informative Dirichlet prior" method, as described for example in "The most Trumpish (and Bushish) words", 9/5/2015.
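For anyone who wants to try the same comparison, here is a minimal sketch of that statistic in the usual Monroe/Colaresi/Quinn formulation, assuming the pooled counts serve as the informative prior; the prior_scale parameter is an illustrative choice, not a record of the exact settings:

    import math
    from collections import Counter

    def weighted_log_odds(counts_a, counts_b, prior_scale=0.01):
        """Weighted log-odds-ratio z-scores with an informative Dirichlet prior.
        Positive scores favor corpus A, negative scores favor corpus B."""
        pooled = Counter(counts_a) + Counter(counts_b)   # prior from the pooled corpora
        n_a, n_b = sum(counts_a.values()), sum(counts_b.values())
        alpha0 = prior_scale * sum(pooled.values())      # total prior pseudo-count mass
        scores = {}
        for w, pooled_w in pooled.items():
            alpha_w = prior_scale * pooled_w             # this word's share of the prior
            y_a, y_b = counts_a.get(w, 0), counts_b.get(w, 0)
            delta = (math.log((y_a + alpha_w) / (n_a + alpha0 - y_a - alpha_w))
                     - math.log((y_b + alpha_w) / (n_b + alpha0 - y_b - alpha_w)))
            var = 1.0 / (y_a + alpha_w) + 1.0 / (y_b + alpha_w)
            scores[w] = delta / math.sqrt(var)
        return scores

    # Most 2017-ish words, given the two histograms from the previous sketch:
    # top_2017 = sorted(weighted_log_odds(hist2017, hist2014).items(),
    #                   key=lambda kv: -kv[1])[:20]

Feeding the two histograms into a function like this and sorting by score is what produces lists like the one below.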
The first thing to say about this process is that it's become miraculously easy. On my 3-year-old laptop, the whole thing took about 15 seconds of computer time, responding to about six lines of code, which I was actually able to type at the command line with only one or two typographical errors. So it didn't take my mind off the lecture for very long.
The second thing to say is that there are a few artefacts in the results. For example, six of the ten most 2014-ish words were IEEE, Speech, Processing, Signal, Conference, International — because the paper template for the 2014 conference put a header containing those words on the first page of every paper, but the 2017 template had no such header.
Turning our attention to the other end of the list, the 20 most 2017-ish words are (where "#17" means "word count in the 2017 papers", "perM17" means "frequency per million words in the 2017 papers", "#14" means "word count in the 2014 papers", "perM14" means "frequency per million words in the 2014 papers", and "LogOdds" means "the weighted log of the odds ratio"):
WORD            #17  perM17    #14  perM14  LogOdds
---------------------------------------------------
lstm           1405     359     84      17   22.169
cnn            1386     355    132      26   21.087
convolutional  1067     273    167      33   17.142
graph          3297     843   1988     394   15.706
layer          3349     857   2236     443   14.069
rnn            1104     282    392      78   13.385
learning       4258    1089   3215     637   13.306
layers         1719     440    896     178   13.007
dataset        2641     676   1714     340   12.936
deep           1762     451    947     188   12.826
ctc             418     107      7       1   12.730
blstm           531     136     71      14   12.443
neural         2228     570   1581     313   10.572
methods        5308    1358   4842     960   10.068
cnns            356      91     54      11    9.961
student         410     105     90      18    9.800
emotion         849     217    415      82    9.623
network        4749    1215   4437     879    8.916
fmri            350      90     82      16    8.882
recurrent       600     153    264      52    8.719
As you can see, most of these are associated with "deep learning" neural-net algorithms: LSTM is "Long Short-Term Memory"; CNN is "Convolutional Neural Network" (not "Cable News Network"); RNN is "Recurrent Neural Network"; BLSTM is "Bidirectional LSTM"; CTC is "Connectionist Temporal Classification"; etc.
So now you know what's hot at ICASSP this year.
Yuval said,
March 9, 2017 @ 5:55 pm
The fact that neural methods brought "graph" up so high is impressive.
[(myl) There's a special session on "Graph Topology Inference", which actually seems to be one of the topics that isn't really about pseudo-neural algorithms.]
(Also: fMRI! Srsly?)
[(myl) fMRI produces signals in need of processing, right? The abstract for one of the relevant papers:
Functional magnetic resonance imaging (fMRI) has provided a window into the brain with wide adoption in research and even clinical settings. Data-driven methods such as those based on latent variable models and matrix/tensor factorizations are being increasingly used for fMRI data analysis. There is increasing availability of large-scale multi-subject repositories involving 1,000+ individuals. Studies with large numbers of data sets promise effective comparisons across different conditions, groups, and time points, further increasing the utility of fMRI in human brain research. In this context, there is a pressing need for innovative ideas to develop flexible analysis methods that can scale to handle large-volume fMRI data, process the data in a distributed and policy-compliant manner, and capture diverse global and local patterns leveraging the big pool of fMRI data. This paper is a survey of some of the recent research in this direction.
]
Ben Zimmer said,
March 10, 2017 @ 9:56 am
Mark is too modest to mention that he is receiving the IEEE James L. Flanagan Speech and Audio Processing Award while he's at ICASSP. From the IEEE site:
Congratulations on the richly deserved honor!