Cluttered writing: adjectives and adverbs in academia

Scientific writing is about communicating ideas. Clutter doesn’t help; texts should be as simple as possible. Today, simplicity is more important than ever. Scientists are overwhelmed with new information. The overall growth rate of scientific publication over the last few decades has been at least 4.7% per year, which means publication volume doubles about every 15 years.

How do we keep up with the literature? We can use computers to extract meaning from texts (Hopkins and King 2007). Better yet, I propose here, we should write research in a machine-readable format, say, Extensible Markup Language (XML). I think it is the only way for scientists to cope with the volume of research in the future. But the first step is to write as simply as possible, to minimize the volume and maximize the meaning.

Readability of scientific writing matters not only for scientists. Readable scientific writing could reach a wider audience and have a bigger impact outside academia.

So how do we produce readable and clean scientific writing? One good element of style is to avoid adverbs and adjectives. They sprinkle a paper with unnecessary clutter. This clutter does not convey information; it distracts, and it has no place in academic writing in particular, as opposed to, say, literary prose or poetry. William Zinsser, one of the writing experts, advises the same.

Why measure readability by counting adjectives and adverbs?

There are many readability measures, for instance: Gunning Fog Index, Automated Readability Index, Coleman–Liau Index, Flesch Reading Ease, Flesch–Kincaid Grade Level, SMOG Index, FORCAST Readability Formula. They are based on counts of words, difficult words (those with many syllables), and sentences, and the result is usually the grade level required to understand the text. I did not use these measures for two reasons.
First, to calculate these indices I would need the full texts of published research, and as of 2012 I cannot bulk download enough full texts to have a representative sample of a discipline. Second, counting syllables is not trivial: there are many ways to do it, and the software is not very mature.

Adjective and adverb counts, by contrast, are a useful measure that can be calculated with the mature Natural Language Toolkit (NLTK). NLTK is a module for the Python programming language for analyzing human language, for instance, to calculate the proportion of adjectives and adverbs in a text. Both NLTK and Python are free and run on Linux, Mac, and Windows. They can be downloaded from python.org. You will also find extensive documentation and tutorials there. NLTK comes with a number of dictionaries that can be used to identify parts of speech, such as adjectives and adverbs.

I use data from JSTOR Data For Research. The sample is about 1,000 articles randomly selected from all articles published in each of seven academic fields between 2000 and 2010. I made the following selection from JSTOR:

1. Content type: Journal (to analyze research, not the other option, Pamphlets)
2. Page count: (5–100) (to avoid short letters, notes, and overly long essays; fewer than five pages may not offer enough text to evaluate, and more than 100 may have a style quite different from the typical one for a given field)
3. Article type: Research article (other types, such as book reviews, may contain lengthy quotes, etc.)
4. Language: English
5. Year of Publication: (2000–2010) (only recent research; I did not select 2011 or 2012 because for some fields JSTOR does not offer the most recent publications: the number of available articles drops dramatically in the most recent years, according to a JSTOR graph shown at the selection step)

I identify parts of speech using the Penn Treebank tag set in the Python NLTK module.
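As a rough illustration of the counting step (a sketch under stated assumptions, not the exact script used for this analysis), the proportion can be computed from NLTK's part-of-speech tags. The helper names `tag_text` and `adj_adv_proportion` are made up for this example, and leaving punctuation-only tokens out of the denominator is my assumption. The Penn Treebank tags counted here are JJ, JJR, JJS (adjectives) and RB, RBR, RBS (adverbs).

```python
# Penn Treebank tags for adjectives (JJ*) and adverbs (RB*).
ADJ_ADV_TAGS = {"JJ", "JJR", "JJS", "RB", "RBR", "RBS"}


def tag_text(text):
    """Tokenize and POS-tag text with NLTK (requires NLTK and its data)."""
    import nltk  # imported lazily so the counting helper works without NLTK

    # The tokenizer and tagger models must be fetched once with
    # nltk.download(...); the resource names differ between NLTK versions.
    return nltk.pos_tag(nltk.word_tokenize(text))


def adj_adv_proportion(tagged_tokens):
    """Share of word tokens tagged as adjectives or adverbs."""
    # Assumption: punctuation-only tokens are excluded from the denominator.
    words = [(w, t) for w, t in tagged_tokens if any(c.isalnum() for c in w)]
    if not words:
        return 0.0
    hits = sum(1 for _, t in words if t in ADJ_ADV_TAGS)
    return hits / len(words)


# Hand-tagged example, so the arithmetic can be checked without NLTK data.
sample = [("This", "DT"), ("very", "RB"), ("long", "JJ"), ("paper", "NN"),
          ("is", "VBZ"), ("unclear", "JJ"), (".", ".")]
print(adj_adv_proportion(sample))  # 3 of 6 words -> 0.5
```

With NLTK and its models installed, `adj_adv_proportion(tag_text(article_text))` would give the proportion for one article, where `article_text` stands for whatever text is available.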
I calculate the proportion of adjectives and adverbs for each academic discipline and divide it by the smallest proportion, so the results show the proportional increase over the discipline with the least adjective–adverb clutter. Figure 1 shows that natural science uses the fewest adjectives and adverbs, while social science uses the most (about 15% more than natural science).

Figure 1. Proportion of adjectives and adverbs in published research by academic discipline group, relative to the field with the smallest proportion. 95% confidence intervals shown.

Is there a reason that a social scientist cannot write as clearly as a natural scientist? Again, adjectives and adverbs are often meaningless and sometimes misleading. And there is software to check the proportions of parts of speech: Python’s NLTK. Following Mark Twain, scientists should kill most of their adjectives and adverbs to make academic prose readable and spare us an unnecessary increase in the volume of research output.
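For readers who want to reproduce the normalization behind Figure 1, here is a minimal sketch of dividing each group's proportion by the smallest one. The function name `relative_to_smallest` and the numbers are hypothetical and chosen only for illustration; they are not the data from this analysis, and the confidence intervals are not computed here.

```python
def relative_to_smallest(proportions):
    """Divide each group's proportion by the smallest proportion."""
    baseline = min(proportions.values())
    if baseline <= 0:
        raise ValueError("the smallest proportion must be positive")
    return {group: p / baseline for group, p in proportions.items()}


# Hypothetical proportions, for illustration only; not the article's data.
example = {"group A": 0.20, "group B": 0.23, "group C": 0.21}
for group, ratio in relative_to_smallest(example).items():
    print(f"{group}: {ratio:.2f}")
```

The group with the smallest proportion gets the value 1, and a value such as 1.15 reads as "about 15% more than the baseline group".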