Archive for Language on the internets

Google Scholar: another metadata muddle?

Following on the critiques of the faulty metadata in Google Books that I offered here and in the Chronicle of Higher Education, Peter Jacso of the University of Hawaii writes in the Library Journal that Google Scholar is laced with millions of metadata errors of its own. These include wildly inflated publication and citation counts (which Jacso compares to Bernie Madoff's profit reports), numerous missing author names, and phantom authors assigned by the parser that Google elected to use to extract metadata, rather than using the metadata offered them by scholarly publishers and indexing/abstracting services:

In its stupor, the parser fancies as author names (parts of) section titles, article titles, journal names, company names, and addresses, such as Methods (42,700 records), Evaluation (43,900), Population (23,300), Contents (25,200), Technique(s) (30,000), Results (17,900), Background (10,500), or—in a whopping number of records— Limited (234,000) and Ltd (452,000). 

What makes this a serious problem is that many people regard the Google Scholar metadata as a reliable index of scholarly influence and reputation, particularly now that there are tools like the Google Scholar Citation Count gadget by Jan Feyereisl and the Publish or Perish software produced by Tarma Software, both of which take Google Scholar's metadata at face value. True, the data provided by traditional abstracting and indexing services are far from perfect, but their errors are dwarfed by those of Google Scholar, Jacso says.

Of course you could argue that Google's responsibilities with Google Scholar aren't quite analogous to those with Google Books, where the settlement has to pass federal scrutiny and where Google has obligations to the research libraries that provided the scans. Still, you have to feel sorry for any academic whose tenure or promotion case rests in part on the accuracy of one of Google's algorithms.

The colleagues down the hall

This is a long-overdue follow-up to my April 26 post announcing the availability of the film The Linguists on Babelgum.com. A couple of things that I failed to point out in that post: first, the version of the film on Babelgum is the DVD version, not the slightly shorter cut that has aired on PBS; second, several additional clips from the DVD can be watched separately on Babelgum. Search for "the linguists" on Babelgum and you'll find links to the trailer, the film, and the additional clips. These are all available for some unspecified limited period, so watch 'em now if you're interested.

What I'm really following up on here, though, is this comment by Jesse Tseng.

I was struck by this sentence [in the film, spoken by David Harrison–eb]:

I don't see how you can justify devoting your research career to the syntax of French (a language with millions of speakers), when the skills that you possess could help document a language that is going to go extinct within your lifetime.

Hmm. The fieldwork skills I possess would make me go extinct long before any tribal language I helped to document. And good luck doing any syntax at all with your 15 sentences of Kallawaya…

Seriously, I was disappointed to hear this gratuitous swipe at the colleagues down the hall. I would like to believe that we are all engaged in a common endeavor, with the same justifications. And when linguistics departments get cut, all the sub-fields of linguistics go down together. Or are they hoping that the money then gets funneled into Anthropology?

New pronoun issues on Facebook

In the beginning, Facebook used singular they to refer to members who hadn't specified their gender. Then, people complained, and Facebook listened. (Feel free to google {facebook pronoun} for a taste of the back-and-forth.)

Facebook phases out singular "they"

As Eric Bakovic described here last year, Facebook uses they as a singular pronoun when the gender of the user is not known, leading to news feed items like: "Pat Jones added Prince to their favorite music." That's never been the most elegant use of singular they, since readers of these items tend to know the gender of Pat Jones, even if s/he hasn't told Facebook about it. Even more awkwardly, Facebook also uses themself when a reflexive pronoun is needed, as in: "Pat Jones has tagged themself in a photo." Well, now, after some cross-linguistic difficulties, Facebook is trying to stamp out singular they by being more demanding about gender specification.
