Archive for Language on the internets

Text Message Language Is Everywhere

Those who hate text message abbreviations will be dismayed to learn how far they have spread. Here is the sign at the gas station on the Gitksan reservation in Hazelton, British Columbia.
The gas station on the reservation in Hazelton, BC.

Comments (33)

Dear [Epithet] spamference organizer [Name]

The most unsuccessful piece of pseudo-personal spam I received this week must surely be the falsely flattering invitation that began as follows:

Dear Professor [Name][Name1],

We would like to invite you as Invited Speaker on the area of Social Sciences, Law, Finances and Humanities in the Conferences

Vouliagmeni Beach, Athens, Greece, December 29-31, 2010

Organized by the European Society for Environmental Research and Sustainable Development / EUROPMENT, www.europment.org in collaboration with the WSEAS...

Dear Professor [Name][Name1]? Come on, spamsters! Can't you even do a standard mail merge? Isn't that the core of your goddamn lousy trade?

Would it be OK with you if I gave an invited talk entitled "[Title][Subtitle]"?

Comments (16)

Facebook Absolutely Must Die

The official name of Facebook in China, as it appears on the Chinese version of its Website, is simply "Facebook."  It is unofficially, but commonly, referred to as Liǎnshū 臉書 (lit., "face book").

Lately, however, Fēisǐbùkě 非死不可 has become a popular way of transcribing the name "Facebook."

Comments (15)

وزارة-الأتصالات.مصر leads the non-Latin charge

The first Internet domain names using non-Latin characters are being rolled out, a plan put into motion after approval from the Internet Corporation for Assigned Names and Numbers (ICANN). Arabic-speaking nations are the first to reap the orthographic benefits, with new country codes available for Egypt (مصر), Saudi Arabia (السعودية), and the United Arab Emirates (امارات). The Egyptian Ministry of Communications and Information Technology, previously online at <http://www.mcit.gov.eg/>, is blazing the trail with its new URL:

<وزارة-الأتصالات.مصر>

Not everything is fully worked out with the new system, though. Browsers that aren't up to speed on non-Latin domain names will render the addresses as Latinized gobbledygook. The Egyptian Communication Ministry's Arabic-script URL, for instance, currently resolves to <http://xn----rmckbbajlc6dj7bxne2c.xn--wgbh1c/>. That's not very communicative.

[Update: See the very helpful comments below for an explanation of the Latinized encoding.]
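For the curious, the Latinized form is not random. It is the ASCII-compatible encoding (Punycode plus an "xn--" prefix) that the domain name system uses behind the scenes for each non-ASCII label. Here is a minimal sketch in Python, assuming its built-in "idna" codec, which follows the older IDNA 2003 rules and may not match what every current browser or registry does. The helper names to_ascii and to_unicode are mine, purely for illustration.

```python
def to_ascii(domain: str) -> str:
    """Encode a Unicode domain name, label by label, into its ASCII 'xn--' form."""
    # Assumption: Python's built-in "idna" codec (IDNA 2003) is close enough
    # for illustration; newer registries follow later revisions of the rules.
    return domain.encode("idna").decode("ascii")


def to_unicode(ascii_domain: str) -> str:
    """Decode an ASCII 'xn--' domain name back into Unicode."""
    return ascii_domain.encode("ascii").decode("idna")


egypt_tld = "مصر"
ministry = "وزارة-الأتصالات.مصر"

# The raw Punycode of a label, without the "xn--" prefix.
raw_punycode = egypt_tld.encode("punycode").decode("ascii")

print(raw_punycode)                     # Punycode body of the country code
print(to_ascii(egypt_tld))              # the same body with "xn--" in front
print(to_ascii(ministry))               # the full Latinized address
print(to_unicode(to_ascii(ministry)))   # and back to Arabic script
```

Each dot-separated label is encoded on its own, which is why the Latinized address has two separate "xn--" chunks, one for the ministry's name and one for the country code.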

Comments (20)

Translate at your own risk

Last month I posted a link to a Schott's Vocab Q&A with Claude Hagège on endangered languages. Some commenters immediately picked up on one of Hagège's statements about translation:

However, there exists an important activity which clearly shows that even though the ways languages grasp the world may vary widely from one language to another, they all build, in fact, the same contents, and equivalent conceptions of the world. This activity is translation. Any text in any language can be translated into a text in another language. These two texts express the same meaning. We can therefore conclude that despite the differences between the ways languages grasp the world, all languages are easily convertible into one another, because humans interpret the world along the same, or comparable, semantic lines.

Barbara Partee contributed this comment:

Emmon Bach has put it nicely: The best argument in favor of the universality of natural language expressive power is the possibility of translation. The best argument against universality is the impossibility of translation (i.e. that we often can't really translate exactly). [link added–EB]

Translation ain't easy, even for skilled humans, and (especially) for machines. Google Translate appears to be among the better tools out there, but as the comments section of what (I believe) was Language Log's first reference to Google's translation tool shows, you can have quite a bit of fun breaking it. Moreover, breaking it is easy and can happen completely inadvertently, a lesson that (from what I hear, anyway) is quite often learned too late by desperate students trying to take shortcuts while doing their homework for beginning language classes.

Comments (32)

Spamalot

In my recent go rogue posting, I reported a comment on an earlier posting from Daniel Gustav Anderson on go rogue as a sexual euphemism, saying that at first I suspected the comment of being spam, but decided it was legit. Then Jake Townhead commented on my posting, questioning my use of the word spam and suggesting that Anderson's comment was merely "bespoke mischief". So now some words on spam.

Comments (22)

Google Scholar: another metadata muddle?

Following on the critiques of the faulty metadata in Google Books that I offered here and in the Chronicle of Higher Education, Peter Jacso of the University of Hawaii writes in the Library Journal that Google Scholar is laced with millions of metadata errors of its own. These include wildly inflated publication and citation counts (which Jacso compares to Bernie Madoff's profit reports), numerous missing author names, and phantom authors assigned by the parser that Google elected to use to extract metadata, rather than using the metadata offered them by scholarly publishers and indexing/abstracting services:

In its stupor, the parser fancies as author names (parts of) section titles, article titles, journal names, company names, and addresses, such as Methods (42,700 records), Evaluation (43,900), Population (23,300), Contents (25,200), Technique(s) (30,000), Results (17,900), Background (10,500), or—in a whopping number of records— Limited (234,000) and Ltd (452,000). 

What makes this a serious problem is that many people regard the Google Scholar metadata as a reliable index of scholarly influence and reputation, particularly now that there are tools like the Google Scholar Citation Count gadget by Jan Feyereisl and the Publish or Perish software produced by Tarma Software, both of which take Google Scholar's metadata at face value. True, the data provided by traditional abstracting and indexing services are far from perfect, but their errors are dwarfed by those of Google Scholar, Jacso says.

Of course you could argue that Google's responsibilities with Google Scholar aren't quite analogous to those with Google Books, where the settlement has to pass federal scrutiny and where Google has obligations to the research libraries that provided the scans. Still, you have to feel sorry for any academic whose tenure or promotion case rests in part on the accuracy of one of Google's algorithms.

Comments (9)

The colleagues down the hall

This is a long-overdue follow-up to my post (from April 26), announcing the availability of the film The Linguists on Babelgum.com. A couple things that I failed to point out in that post: first, the version of the film on Babelgum is the DVD version, not the slightly shorter cut that has aired on PBS; second, there are several additional clips that you can watch separately on Babelgum that are on the DVD. Search for "the linguists" on Babelgum and you'll find links to the trailer, the film, and the additional clips. These are all available for some unspecified limited period, so watch 'em now if you're interested.

What I'm really following up on here, though, is this comment by Jesse Tseng.

I was struck by this sentence [in the film, spoken by David Harrison–eb]:

I don't see how you can justify devoting your research career to the syntax of French (a language with millions of speakers), when the skills that you possess could help document a language that is going to go extinct within your lifetime.

Hmm. The fieldwork skills I possess would make me go extinct long before any tribal language I helped to document. And good luck doing any syntax at all with your 15 sentences of Kallawaya…

Seriously, I was disappointed to hear this gratuitous swipe at the colleagues down the hall. I would like to believe that we are all engaged in a common endeavor, with the same justifications. And when linguistics departments get cut, all the sub-fields of linguistics go down together. Or are they hoping that the money then gets funneled into Anthropology?

Comments (29)

New pronoun issues on Facebook

In the beginning, Facebook used singular they to refer to members who hadn't specified their gender. Then, people complained, and Facebook listened. (Feel free to google {facebook pronoun} for a taste of the back-and-forth.)

Comments (37)

Facebook phases out singular "they"

As Eric Bakovic described here last year, Facebook uses they as a singular pronoun when the gender of the user is not known, leading to news feed items like: "Pat Jones added Prince to their favorite music." That's never been the most elegant use of singular they, since readers of these items tend to know the gender of Pat Jones, even if s/he hasn't told Facebook about it. Even more awkwardly, Facebook also uses themself when a reflexive pronoun is needed, as in: "Pat Jones has tagged themself in a photo." Well, now after some cross-linguistic difficulties, Facebook is trying to stamp out singular they by being more demanding about gender specification.

Comments (27)