Mark Metcalf writes:
Since Language Log addresses lots of interesting language-related issues, I was wondering if you'd ever encountered a problem with Chinese PDFs being incorrectly displayed on an iPad. I searched the LL website and didn't find it previously addressed. I also unsuccessfully searched the Web for solutions.
Here's the issue: Last week I downloaded several articles from CNKI and they all display correctly on my Windows machine. However, when I transferred them to an iPad the Chinese text was garbled. Since I haven't had iPad problems with Chinese PDFs from other sources, one thought is that CNKI uses a modified PDF file format that can't be properly handled by the iPad OS.
Has anyone previously addressed this problem? If so, could you point me to a solution? If not, would you be interested in addressing this on 'Language Log'? Below I've attached before/after versions of the displays.
I asked several colleagues and students whom I've often observed reading Chinese PDFs on their iPads what their experience with CNKI has been. Here are a few of the replies that I received.
Adam D. Smith:
Both iPad and CNKI are imperfect (CNKI is annoying in so many other ways), and yes I think I have noticed this with certain files previously (most CNKI files work pretty well on both machines, though). For some reason the iPad app your correspondent is using doesn’t correctly recognize the encoding of the Chinese text. One fix for this might be to use an alternative PDF reader on iPad. Your correspondent might be using the usual Adobe reader. There are quite a few others. I suspect that switching from one to another might be a workaround. But annoying all the same.
I have indeed encountered this — it seems to be specific to PDFs from CNKI, and often seems to be limited to page numbers and/or publication date and issue data, rather than article text. I'm not sure what the cause is, but the problem appears to be fixed, at least temporarily, by opening the PDFs in Adobe Acrobat (or maybe Acrobat Reader) rather than in the default PDF apps on the Mac. Saving the PDF there and then reopening should fix the problem — but it's obnoxiously inconsistent.
I had never loaded a Chinese PDF file onto my iPad, but tried today after reading your email. This is what I found:
When I tried to open a Chinese PDF file in iBooks after syncing it through iTunes, many (but not all) characters went wrong, just as your friend described. My guess is that iBooks does a poor job of handling some of the punctuation marks in the PDF file, thus causing problems in decoding entire paragraphs.
However, the Chinese PDF displayed very well in the Adobe Reader app, which I downloaded for free from Apple's App Store. There are several ways to get PDF files onto the iPad (through email attachment, Dropbox, Adobe Cloud service, etc.). Apparently, the Adobe Reader app comes with all Chinese fonts and does a good job of decoding punctuation marks.
I haven't encountered this particular problem (though I don't use a tablet myself), and no one has brought it to my attention before. I'm not sure that the phenomenon is the same, but I have run into trouble with CNKI PDFs not rendering correctly in Preview, the standard Mac PDF viewer. That program is probably similar to or the same as what the iPad uses, so the trouble may have the same cause as what your colleague is encountering. On a Mac computer, though, I don't have any trouble at all reading the PDFs using other PDF viewers. I don't know if the iPad permits other PDF reading software, but if it does, that's the first thing I would try.
In the bad old days, electronically stored and transmitted Chinese texts were often badly mangled (luànmǎ 亂碼 ["garbled", lit., "chaotic code"]), but after Unicode became well-nigh universal, such problems have radically diminished. As late as two years ago, however, we would still encounter issues like the following:
To show you what happened to Mark Metcalf, here are "before" (Windows) and "after" (iPad) screen shots of the opening page of the same article. The second version looks like it is a Chinese text, and it is composed of Chinese characters, but it is complete and utter gobbledygook. (As usual, click to embiggen.)
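For readers curious how text can end up composed of real Chinese characters and still be nonsense, here is a minimal Python sketch of one classic cause of luànmǎ: bytes written in one legacy encoding and read back in another. This is only an illustration under stated assumptions. Nothing above establishes that this is what goes wrong with the CNKI files on the iPad (the cause there may well lie in how the PDF's fonts or character maps are handled), and the function name misdecode and the choice of GB2312 and Big5 are mine, picked purely for the demonstration.

```python
def misdecode(text, written_as="gb2312", read_as="big5"):
    """Encode text in one legacy encoding, then decode the same bytes
    with a different one, as a misconfigured reader might.

    Hypothetical helper for illustration only; the encodings are assumptions.
    """
    raw = text.encode(written_as)
    # errors="replace" keeps the demo from raising on byte pairs
    # that have no meaning in the second encoding.
    return raw.decode(read_as, errors="replace")


original = "中文"
garbled = misdecode(original)

# The result is made of genuine Chinese characters, but not the ones intended.
print(garbled)

# Because no bytes were lost in this particular case, reversing the two
# steps recovers the original text.
recovered = garbled.encode("big5").decode("gb2312")
print(recovered)
```

The bytes never change in this sketch; only the table used to interpret them does, which is why the output still looks like Chinese while meaning nothing.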