Kinds of science


Today's xkcd — "The Three Kinds of Scientific Research":

Mouseover title: "The secret fourth kind is 'we applied a standard theory to their map of every tree and got some suspicious results.'"

A not-so-secret fifth (or zeroth?) kind, which even has a three-letter initialism: exploratory data analysis (EDA).

The original idea of EDA, as Wikipedia tells us, was "analyzing data sets to summarize their main characteristics, often using statistical graphics and other data visualization methods". John Tukey, who invented the term and promoted the idea, was especially concerned with finding and summarizing patterns in data, while avoiding the potentially misleading consequences of fitting "standard" models that assume normal distributions and ignore outliers, multimodality, uncontrolled covariates, and so on. But another way of framing the goals of EDA focuses less on summarizing data and more on identifying hypotheses to pursue further.
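To make the contrast concrete, here is a minimal sketch of the kind of check an exploratory pass might start with: comparing the mean and standard deviation against the median and the median absolute deviation (MAD), and flagging points by a modified z-score. The function names, the 3.5 threshold, and the 0.6745 scaling constant are illustrative choices of mine, not something taken from Tukey or from the Wikipedia article.

```python
import statistics


def robust_summary(values):
    """Return mean/stdev next to median/MAD so the two views can be compared."""
    med = statistics.median(values)
    # Median absolute deviation: a spread measure that a few outliers barely move.
    mad = statistics.median([abs(v - med) for v in values])
    return {
        "mean": statistics.fmean(values),
        "stdev": statistics.stdev(values),
        "median": med,
        "mad": mad,
    }


def flag_outliers(values, threshold=3.5):
    """Flag points whose modified z-score exceeds `threshold`.

    The 0.6745 factor and the 3.5 cutoff are conventional but assumed here;
    any real analysis should choose them deliberately.
    """
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        return []  # no spread around the median, so the score is undefined
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]


# Hypothetical measurements with one wild value.
data = [10, 11, 12, 13, 14, 95]
summary = robust_summary(data)
suspects = flag_outliers(data)
```

With this made-up sample, the single extreme value drags the mean well above the median, while the median and MAD hardly register it. That gap is exactly the sort of thing a plot or a robust summary exposes before a "standard" model gets a chance to hide it.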

These issues are especially important in speech and language research, where most distributions are not at all "normal", where outliers and uncontrolled covariates are ubiquitous, and where there's almost never enough data. Modeling of "large numbers of rare events" is a common aspect of the problem in dealing with text, and multimodality and non-normality are to be expected in acoustic analysis.
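For the text case, a rough sketch of what "large numbers of rare events" looks like in practice is to count how many word types occur exactly once. The helper names and the toy sentence below are hypothetical, and whitespace splitting stands in for whatever tokenization a real study would use.

```python
from collections import Counter


def frequency_spectrum(tokens):
    """Map each frequency to the number of word types occurring that often."""
    counts = Counter(tokens)
    return Counter(counts.values())


def hapax_share(tokens):
    """Fraction of word types that occur exactly once in `tokens`."""
    counts = Counter(tokens)
    if not counts:
        return 0.0
    hapaxes = sum(1 for c in counts.values() if c == 1)
    return hapaxes / len(counts)


# Toy example; real corpora are far larger, but the shape is similar.
tokens = "the cat sat on the mat and the dog sat too".split()
spectrum = frequency_spectrum(tokens)
share = hapax_share(tokens)
```

Even in this tiny sample, most of the distinct words appear only once, which is why estimates for individual words tend to rest on very little evidence.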

These things matter for hypothesis-testing (and for classification and prediction efforts), but responsible researchers spend a lot of their time deciding what hypotheses to test, based on EDA as well as on subjective insight and common sense.



2 Comments

  1. KeithB said,

    August 28, 2024 @ 8:38 am

    I can't tell, is Munroe insulting biology with a variation of the "stamp-collecting" jibe?

  2. Mark Liberman said,

    August 28, 2024 @ 10:59 am

    @KeithB: "is Munroe insulting biology with a variation of the "stamp-collecting" jibe?"

    I guess, though to be fair, there are insults latent in the other "kinds of science" as well…

    And for varied analogical resonances with the "map of every tree" idea, see "Trees spring eternal" (11/23/2003), and "-ome is where the art is" (10/27/2004).
