The gap between spoken and written Sinitic is enormous. In my estimation, it is greater than for any other language I know. The following are some notes by Ľuboš Gajdoš about why this is so.
"The Discrepancy Between Spoken and Written Chinese — Methodological Notes on Linguistics", Comenius University in Bratislava, Department of East Asian Studies
The issue of choosing language data on which synchronous linguistic research is being done appears in many ways not only to be relevant to the goal of the research, but also to the validity of the research results. The problem which particularly concerns us here is the discrepancy between speech on the one hand and written language on the other. In this context, we have often encountered in the past a situation where the result of the research conducted on a variety of the Chinese language has been generalized to the entire synchronous state of the language, i.e. to all other varieties of the language, while ignoring the mentioned discrepancy between the spoken and written forms. The discrepancy between the spoken and written forms is likely to be present in any natural language with a written tradition, but the degree of difference between languages is uneven: e.g. compared to the Slovak language, it may be stated that the situation in Chinese is in this respect extraordinary. Nevertheless, it is surprising that the quantitative (qualitative) research on discrepancies between different varieties of the language has not yet aroused the attention of Chinese linguistics to such an extent as would have been adequate for the unique situation of this natural language.