Today's xkcd illustrates why topic modeling can be tricky, for people as well as for machines:

The mouseover title: "As the 'exotic animals in homemade aprons hosting baking shows' YouTube craze reached its peak in March 2020, Andrew Cuomo announced he was replacing the Statue of Liberty with a bronze pangolin in a chef's hat."



The strip is about trends in Google searches rather than in document content, but the point is similar: it's one thing to detect a new cluster of words and phrases, and something else to assign an interpretation.
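To make the detection half of that contrast concrete, here is a minimal sketch in Python. It assumes weekly search-interest counts for a handful of terms and groups the terms whose counts rise and fall together. The numbers are invented for illustration (they are not Google Trends data), "umbrella" is a made-up control term, and the helper names `pearson` and `co_trending_clusters` are hypothetical. Correlation-based grouping stands in here for a real topic model, which would work differently.

```python
from itertools import combinations
from math import sqrt

# Invented weekly search-interest values, for illustration only.
series = {
    "sewing machine": [10, 11, 10, 30, 60, 80],
    "webcam":         [20, 19, 21, 45, 85, 95],
    "flour":          [15, 15, 16, 40, 70, 90],
    "umbrella":       [50, 20, 45, 15, 40, 25],  # hypothetical control term
}

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def co_trending_clusters(series, threshold=0.9):
    """Group terms linked by at least one correlation above the threshold."""
    parent = {term: term for term in series}

    def find(term):
        # Follow parent links up to the representative of the group.
        while parent[term] != term:
            term = parent[term]
        return term

    for a, b in combinations(series, 2):
        if pearson(series[a], series[b]) >= threshold:
            parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for term in series:
        groups.setdefault(find(term), []).append(term)
    return sorted(sorted(group) for group in groups.values())

clusters = co_trending_clusters(series)
print(clusters)
```

On this toy input the three rising terms land in one group and the control term stays on its own. What the code cannot supply is a name for the group: nothing in the numbers says why sewing machines, webcams, and flour belong together.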

In some cases, the discovery is just a new instance of a familiar type. And here, of course, the familiar type is epidemic or pandemic — but there are a few socio-cultural steps from that to sewing machines, webcams, and flour.
