Modals, idiolects, garden-path sentences, and English translations of a ninth-century Chinese poem

Here I present a digest of four scientific papers in linguistics from late January 2024, to show that our field is very much alive across diverse subfields at the start of the new year.

"The Semantics, Sociolinguistics, and Origins of Double Modals in American English: New Insights from Social Media." Morin, Cameron et al. PLOS ONE 19, no. 1 (January 24, 2024): e0295799.

Abstract: In this paper, we analyze double modal use in American English based on a multi-billion-word corpus of geolocated posts from the social media platform Twitter. We identify and map 76 distinct double modals totaling 5,349 examples, many more types and tokens of double modals than have ever been observed. These descriptive results show that double modal structure and use in American English is far more complex than has generally been assumed. We then consider the relevance of these results to three current theoretical debates. First, we demonstrate that although there are various semantic tendencies in the types of modals that most often combine, there are no absolute constraints on double modal formation in American English. Most surprisingly, our results suggest that double modals are used productively across the US. Second, we argue that there is considerable dialect variation in double modal use in the southern US, with double modals generally being most strongly associated with African American Language, especially in the Deep South. This result challenges previous sociolinguistic research, which has often highlighted double modal use in White Southern English, especially in Appalachia. Third, we consider how these results can help us better understand the origins of double modals in American English: although it has generally been assumed that double modals were introduced by Scots-Irish settlers, we believe our results are more consistent with the hypothesis that double modals are an innovation of African American Language.

"Is the Individual Idiolect Substantially a Genetic Inheritance?" Murphy, Terence Patrick. Preprint. Research Square, January 22, 2024. 


Although stylometric studies tend to situate themselves within the field of forensic analysis, most stylometricians appear averse to considering genetic explanations for their findings. Instead, they try to work with a range of what they construe as environmental factors in attempting to understand the clustering of individual authorial idiolects. However, researchers in behavioral genetics have demonstrated that the traits for cognitive abilities, including language ability, are among the most heritable. In this paper, I set out the major postulate and eight corollaries for the genetic hypothesis and the major postulate and five corollaries for the environmental hypothesis for explaining the clustering of individual idiolects in dendrogram analysis, using stylo in R. Using a corpus of Anglo-American modernist poetry, I then demonstrate that the individual idiolects of the Sitwell siblings (Edith, Osbert and Sacheverell) cluster together. In this way, I aim to help researchers decide which of the two hypotheses is the more likely explanation for the attested idiolectal similarities among the members of a number of important British and French literary families.

"Causality and Signalling of Garden-Path Sentences." Wang, Daphne et al. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences 382, no. 2268 (January 29, 2024): 20230013. 


Sheaves are mathematical objects that describe the globally compatible data associated with open sets of a topological space. Original examples of sheaves were continuous functions; later they also became powerful tools in algebraic geometry, as well as logic and set theory. More recently, sheaves have been applied to the theory of contextuality in quantum mechanics. Whenever the local data are not necessarily compatible, sheaves are replaced by the simpler setting of presheaves. In previous work, we used presheaves to model lexically ambiguous phrases in natural language and identified the order of their disambiguation. In the work presented here, we model syntactic ambiguities and study a phenomenon in human parsing called garden-pathing. It has been shown that the information-theoretic quantity known as ‘surprisal’ correlates with human reading times in natural language but fails to do so in garden-path sentences. We compute the degree of signalling in our presheaves using probabilities from the large language model BERT and evaluate predictions on two psycholinguistic datasets. Our degree of signalling outperforms surprisal in two ways: (i) it distinguishes between hard and easy garden-path sentences (with p-value < 10⁻⁵), whereas existing work could not, (ii) its garden-path effect is larger in one of the datasets (32 ms versus 8.75 ms per word), leading to better prediction accuracies.

This article is part of the theme issue ‘Quantum contextuality, causality and freedom of choice’.
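
For readers unfamiliar with the term, the "surprisal" mentioned in the Wang et al. abstract above is standardly defined as the negative log probability of a word given its context. The following is a minimal Python sketch of that definition only. It is not the authors' method (their degree of signalling is computed from presheaves and BERT probabilities), and the function names and the toy probabilities are invented purely for illustration.

```python
import math


def surprisal_bits(probability: float) -> float:
    """Return the surprisal, in bits, of an event with the given probability.

    Surprisal is -log2(p): the less probable a word is in its context,
    the higher its surprisal.
    """
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in the interval (0, 1]")
    return -math.log2(probability)


def sentence_surprisals(word_probs):
    """Map (word, probability) pairs to (word, surprisal in bits) pairs."""
    return [(word, surprisal_bits(p)) for word, p in word_probs]


# Hypothetical per-word probabilities for a classic garden-path sentence.
# These numbers are made up for illustration; in practice they would come
# from a language model conditioned on the preceding context.
toy_sentence = [
    ("The", 0.5),
    ("horse", 0.125),
    ("raced", 0.25),
    ("past", 0.25),
    ("the", 0.5),
    ("barn", 0.125),
    ("fell", 2 ** -10),  # the disambiguating word is highly unexpected
]

if __name__ == "__main__":
    for word, bits in sentence_surprisals(toy_sentence):
        print(f"{word:>6}  {bits:5.1f} bits")
```

With these toy numbers, the final word "fell" receives by far the highest surprisal, which is the kind of spike that surprisal-based accounts associate with reading difficulty.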

"Text Complexity and Translation Styles from the Perspective of Individuation: A Case Study of the English Translations of Pipa Xing." Yu, Yingchen et al. Humanities and Social Sciences Communications 11, no. 1 (January 23, 2024): 1-17.


This study aims to investigate the translation styles from the perspective of text complexity and to elucidate underlying factors contributing to the formation of stylistic differences based on individuation. Using SysFan and SPSS22.0 as tools, a combined quantitative and qualitative approach is employed to comparatively analyze the lexical density and grammatical intricacy across the renowned ancient Chinese poem Pipa Xing 琵琶行* and its nine English translations. The findings reveal that while the translations exhibit comparable levels of lexical density, disparities in complexity primarily manifest in terms of grammatical intricacy, reflecting distinct text features: spoken, written, and mixed spoken and written, as well as varying degrees of hierarchical and narrative features. The variations in translation styles are intricately linked to the individuation process undergone by translators. The translator’s individuation process is modeled to show how a translator mobilizes the meaning resources in the repertoire, which is constrained by the allocation of the cultural reservoir, to re-instantiate the source text in the translated text, while constructing affiliation with the target-reader community. Different translators’ allocated repertoires ultimately shape their conscious or unconscious choices in terms of lexicogrammar, thereby generating translated texts characterized by diverse styles.

*Brief introduction to the poem and its title:

"Pipa xing" (Chinese: 琵琶行), variously translated as "Song of the Pipa" or "Ballad of the Lute", is a Tang dynasty poem composed in 816 by the Chinese poet Bai Juyi, one of the greatest poets in Chinese history. The poem contains a description of a pipa performance during a chance encounter with a performer near the Yangtze River.


Of the four subfields represented in this digest, I know the last best. The authors' analysis is based on translations from publications dating to the following years: 1884, 1915, 1919, 1929, 1960, 1984 (2), 1988, 1994, 2019. Unfortunately, they were unable to include one from 2020 that is probably the best and most interesting: Michael Fuller, "Ballad of the Pipa", in his An Introduction to Chinese Poetry (Leiden: Brill, 2020), pp. 283-289. Aside from his explanatory notes, from our perspective as linguists, Fuller's version is valuable for providing literal translations for each of the morphosyllables in addition to literary translations of the lines plus Middle Sinitic reconstructions of the rhymes.


