Donkeys in Cyberspace!


Almost a year ago, I posted here (well, at the old LL site) about a new peer-reviewed, open access journal affiliated with the Linguistic Society of America. The journal is called Semantics and Pragmatics (S&P), and I'm co-editor, together with Kai von Fintel. The big news today is that we have published our first article, and it's a doozy – Donkey anaphora is in-scope binding, by Chris Barker and Chung-chieh Shan. To give Language Log readers a picture of some of what interests formal semanticists, I'll fill you in on a little background on the paper – abstruse stuff, but it has applications. Then (and I hope you'll excuse the awkwardness of me slapping my own back, but who else is gonna do it?) I'll give you an update on how S&P is doing.

The first S&P article is not for the faint-hearted: it uses techniques from logic and computer science to attack one of the oldest problems in semantics. The problem involves the interpretation of pronouns that act like they're connected to an antecedent, and yet aren't in the right place to be connected: these pronouns are what we call "donkey anaphora", made famous by examples in the philosopher Peter Geach's 1962 book Reference and Generality:

1) Every farmer who owns a donkey beats it

The problem is to derive the meaning of sentences like (1) from basic principles. The problem can be seen if you play the Philosophy 101 game of translating sentences to ordinary First Order Predicate Logic, since the most natural translation doesn't work. (For those with a basic knowledge of logic, the translation would be something like: for every x [[farmer x & there exists y [donkey y & x owns y]] → [x beats y]], but this is no good because the last y is unbound.) This has led scholars to propose various more or less unnatural translations: the trick, of course, is to solve the problem in a natural and general way that respects all the things that semanticists like to respect.
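
To see the failure concretely, here is a rough sketch in Python (the toy model, the predicate names, and the mini-evaluation are all my own illustration, nothing from Barker and Shan's paper). In First Order Predicate Logic a variable that no quantifier binds gets its value from an arbitrary assignment, so the naive translation's verdict flips depending on what we happen to assign to the stray y, while the reading we actually want does not:

domain = {"pedro", "chiquita", "daisy"}
farmer = {"pedro"}
donkey = {"chiquita", "daisy"}
owns   = {("pedro", "chiquita"), ("pedro", "daisy")}
beats  = {("pedro", "chiquita")}      # Pedro beats only one of his two donkeys

def naive(y_value):
    # for every x [[farmer x & there exists y [donkey y & x owns y]] -> [x beats y]]
    # The final y falls outside the scope of "there exists y", so its value
    # has to come from an assignment; y_value plays that role here.
    return all(
        not (x in farmer and any(y in donkey and (x, y) in owns for y in domain))
        or (x, y_value) in beats
        for x in domain
    )

print(naive("chiquita"))   # True  -- under one arbitrary assignment to the stray y
print(naive("daisy"))      # False -- under another; the formula has no settled verdict

def intended():
    # The reading we are after: every farmer beats every donkey he owns.
    return all(
        not (x in farmer and y in donkey and (x, y) in owns) or (x, y) in beats
        for x in domain for y in domain
    )

print(intended())          # False in this model, as it should be: daisy goes unbeaten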

And what do semanticists like myself most especially like to respect? Our sacred cow, the thing we like to respect the most, is Compositionality, a principle usually, but controversially, attributed to Gottlob Frege. Compositionality says that the meaning of a sentence is computable from the meaning of its parts and the way they are put together: the principle effectively says you can plug meanings together like lego bricks. And that's a good thing. Suppose you were building a machine to understand natural language, something many people are in fact trying to do. If you knew what bricks corresponded to each English word, all you'd have to do would be to program the machine to put them together. But the devil is in the details. It turns out to be really tricky to give meanings to all the parts of (1) in a way that allows them to be put together like lego bricks.
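
To make the brick-plugging idea concrete, here is a minimal sketch (again a toy fragment of my own, not anything from the paper): each word denotes a little function, and the meaning of a sentence is computed just by applying those functions to one another.

domain = {"pedro", "chiquita", "daisy"}
farmer = {"pedro"}
donkey = {"chiquita", "daisy"}
sleeps = {"chiquita"}

# Determiner bricks: a determiner takes a restrictor and then a scope predicate.
def every(restrictor):
    return lambda scope: all(scope(x) for x in domain if restrictor(x))

def a(restrictor):
    return lambda scope: any(scope(x) for x in domain if restrictor(x))

# Word bricks:
FARMER = lambda x: x in farmer
DONKEY = lambda x: x in donkey
SLEEPS = lambda x: x in sleeps

# "Every donkey sleeps" and "A donkey sleeps", built by plugging bricks together:
print(every(DONKEY)(SLEEPS))   # False: daisy is a donkey that doesn't sleep
print(a(DONKEY)(SLEEPS))       # True: chiquita does

# The trouble with (1) is that there is no equally simple brick for "it" that
# lets the pronoun connect up with "a donkey" sitting inside the relative clause.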

By the way, Frege himself saw the problem that such sentences posed for compositionality, offering in his famous 1892 article On Sense and Reference the following variant (but in German, natürlich):

2) If a number is less than 1 and greater than 0, its square is less than 1 and greater than 0

The problem here is to interpret the "its", a pronoun that is apparently bound by "a number", even though "a number" is in the antecedent clause of the conditional, and "its" is in the consequent clause. This is not possible in First Order Predicate Logic. And Frege, not entirely coincidentally, is the guy who invented First Order Predicate Logic, so you can imagine why he might have been concerned.
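
To spell out what goes wrong (my gloss, not Frege's own formalization): translating "a number" with an existential quantifier inside the if-clause gives something like there exists x [number x & 0 < x & x < 1] → [0 < x·x & x·x < 1], where the x in the consequent dangles outside the quantifier's scope, just as the y did in (1). The reading we actually want puts a single universal quantifier over the whole conditional: for every x [[number x & 0 < x & x < 1] → [0 < x·x & x·x < 1]].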

Returning from the historical excursus, I can now say what Barker and Shan's article does: it provides a general method of deriving the meanings of donkey sentences, and a host of other problematic examples, using a classical logic (i.e. a direct descendant of Frege's logic, though more expressive) in a way that respects compositionality. 

And now that I've said what Barker and Shan's article does, I can say what the journal S&P is supposed to do. It's supposed to publish the highest quality work in semantics and pragmatics, like the Barker-Shan article, edited and typeset to at least the level of quality found in any commercial journal in the field, but available (i) more widely than commercial journals, (ii) more rapidly than commercial journals, and (iii) more cheaply than commercial journals. We aim to publish our articles electronically as soon as they've been accepted, copy-edited, and properly typeset, at which point they're available to everyone who might be interested, for free.

In our first few months, we've had 13 submissions, as well as a number of other inquiries. All submissions have been reviewed by two or three members of our editorial board, comprising 150 professionals in the field, primarily tenure-line faculty. So far, we've published one paper, with another accepted subject to revision, and we've rejected another 10 (though many of these had high potential, and we're hoping the authors will resubmit). Our average time from submission to decision is well under 2 months. As any linguist will tell you, this is much, much quicker than is normal in our field. But we're only just starting. So the question is whether we can maintain this rapid turnaround in the face of what we hope will be an avalanche of new submissions, as people get used to the idea of our field having an electronic journal.

For more on the journal, see the latest post on the blog Semantics Etc., by my S&P co-editor. Kai is especially excited by some of our wonderful geeky add-ons. In his words: "enjoy the extensive hyper-features: active links from the text to the bibliography, clickable DOIs in the bibliography for most of the cited literature, links from example numbers in the text to the examples, etc." To see what he means, try clicking around in the PDF of the article – good to look at, and packed with crunchy PDF goodness. Though it's a lot to shoot for, we both hope that today's publication marks the beginning of the future in our little field. As Kai says: "a big day for semantics, for open access, open science publishing in our field."

 



13 Comments

  1. Peter said,

    June 10, 2008 @ 8:41 am

    Please excuse a question from an ignorant computer scientist, but my first attempt at a logical representation of statement (1) would be:

    for every x [ farmer x & there exists y
    [ [donkey y & x owns y] → [x beats y] ] ]

    In other words, ensure the final "y" is within the scope of the quantifier "there exists y". Why is this not adequate as a representation?

  2. David Beaver said,

    June 10, 2008 @ 8:57 am

    > Please excuse a question from an ignorant computer scientist, but my first
    > attempt at a logical representation of statement (1) would be:
    >
    > for every x [ farmer x & there exists y
    > [ [donkey y & x owns y] → [x beats y] ] ]

    Peter, the problem with the representation you suggest can be seen if you consider a model where every farmer owns lots of donkeys but beats none of them, and where there is at least one object which is not a donkey. Your representation comes out True in this model (because there is a y which makes the antecedent of the implication False, and hence the implication True). But of course the original sentence is False in such a situation.
    Cheers, David
    P.S. After seeing Jorge's comment below, I looked back at Peter's representation and saw that it does indeed have a more obvious flaw than the one I'd described, i.e. it says everything is a farmer.
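
    To see the original point concretely, here is a rough sketch in Python (the toy model and predicate names are made up, and I've patched the stray "farmer x &" conjunct to an implication so that only the scope issue is in play):

    domain = {"pedro", "chiquita", "daisy", "tractor"}   # at least one non-donkey
    farmer = {"pedro"}
    donkey = {"chiquita", "daisy"}
    owns   = {("pedro", "chiquita"), ("pedro", "daisy")}
    beats  = set()                                       # Pedro beats none of his donkeys

    def peters_reading():
        # for every x [ farmer x -> there exists y [ [donkey y & x owns y] -> [x beats y] ] ]
        return all(
            x not in farmer
            or any(
                not (y in donkey and (x, y) in owns) or (x, y) in beats
                for y in domain
            )
            for x in domain
        )

    print(peters_reading())   # True: choosing y = "tractor" falsifies the antecedent,
                              # so the existential is satisfied even though no donkey is beaten.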

  3. David said,

    June 10, 2008 @ 9:04 am

    What Peter said. My mother-in-law once described me as a person who, if you ask him what time it is, will tell you how to build a watch. David Beaver's post seems to be from the sort of person who, if you ask him what time it is, would explain that quantum mechanics proves that you cannot know both where the cesium atom is and how fast it's vibrating.

  4. Jorge said,

    June 10, 2008 @ 9:21 am

    Peter: that says that everyone is a farmer. That part is fixable, but it also says that for everyone there is a donkey such that if they own it then they beat it. But that's true as long as for everyone there is some donkey they don't own, whether or not they beat the ones they do own.

  5. Peter said,

    June 10, 2008 @ 11:06 am

    Ok, thanks, David. The old false-antecedent-making-the-implication-true assumption that distinguishes logicians from ordinary people!

  6. Robert said,

    June 10, 2008 @ 11:26 am

    The donkey sentence is logically equivalent to 'If a donkey is owned by a farmer, that donkey is beaten by that farmer', i.e.

    For All x ( x ∈ Donkey => ((Owner(x) ∈ Farmer) => (Owner(x) Beats x)))

    which has everything properly bound. Owner(x) maps the universe to elements of the power set of the set of all legal persons, quite often the null set, and x Beats y is a definable predicate. Farmer is a subset of Homo Sapiens, while Donkey is the set of all members of that species.

    English has various ways of rearranging sentences for information-packaging purposes, which the transformation from this formulation to the original could fall under. At least, that's the way I'd initially attempt to resolve the donkey problem, but my degree was in maths, not linguistics.

  7. N. N. said,

    June 10, 2008 @ 2:01 pm

    And what do semanticists like myself most especially like to respect? Our sacred cow, the thing we like to respect the most, is Compositionality, a principle usually, but controversially, attributed to Gottlob Frege. Compositionality says that the meaning of a sentence is computable from the meaning of its parts and the way they are put together: the principle effectively says you can plug meanings together like lego bricks.

    Are you familiar with the work of Cora Diamond and James Conant on this question, and if so, what's your opinion of it?

  8. David Beaver said,

    June 10, 2008 @ 5:15 pm

    N. N.: no, I'm not familiar with the work of Cora Diamond and James Conant, which, so far as I can gather, fits somewhere into the interpretation-of-Wittgenstein industry. But I'd be happy to be enlightened.

  9. N. N. said,

    June 10, 2008 @ 5:56 pm

    David,

    Frege's context principle (the view that a word has meaning only in the context of a proposition) follows from his doctrine of the primacy of judgment (the view that the basic unit of meaning is the significant sentence). Frege claims in the Foundations of Arithmetic that the meaning of a word is conferred on it by the sense of the sentence in which it occurs. Thus, the strong version of compositionality that you respect puts the cart before the horse (according to Frege).

    Diamond is interested in the consequences that Frege's position has for nonsense. Is it possible for a sentence to be nonsense because of the incompatible meanings of its parts? Her answer is 'No.' Ian Proops sums up her objection as follows: "Diamond suggests that in rejecting intrinsically illegitimate symbols Wittgenstein – and in her view also Frege – implicitly reason as follows: 'If nonsense is explained as the result of an illegitimate combination of meaningful elements, then we shall be committed to an incoherent view of a nonsense sentence as one that is nonsense because of what it says'." If the strong view of compositionality were correct, then it would be possible to construct nonsensical combinations of meaningful signs (what Conant calls "substantial" nonsense). This, Diamond holds, is absurd. Therefore, the strong version of compositionality must be false.

    I keep speaking of the "strong" version of compositionality because Diamond and Conant don't want to deny any version of compositionality (though I must confess that their weaker versions are mysterious to me). Here's what Conant has to say about the interplay of compositionality and contextuality:

    Frege does, of course, speak of a thought's having "parts" out of which it is "built up" (see, e.g., Frege, 1979), and of how we can "distinguish parts in the thought corresponding to parts of a sentence, so that the structure of the sentence can serve as a picture of the structure of the thought" (Frege, 1984, p. 390). But Frege immediately follows this latter remark with the observation: "To be sure, we really talk figuratively when we transfer the relation of whole and part to thoughts; yet the analogy is so ready to hand and so generally appropriate that we are hardly bothered by the hitches that occur from time to time" (Frege, 1984, p. 390). What kinds of hitches? Hitches, for example, of the sort that Kerry fails to notice when he imagines that he can get hold of a concept merely by employing an expression that elsewhere, in its usual employment, is able to symbolize a concept. Frege thus worries that the all but unavoidable (and in itself potentially innocent) locution of a thought's having "parts" or "components" will mislead one into attributing a false independence to the parts of a thought—so that we can imagine that the parts could retain their identity apart from their participation in a whole of the appropriate structure: "But the words 'made up of,' 'consist of,' 'component,' 'part,' may lead to our looking at it the wrong way. If we choose to speak of parts in this connection, all the same these parts are not mutually independent in the way that we are elsewhere used to find when we have parts of a whole" (Frege, 1984, p. 386). Frege's context principle—and the correlative doctrine of the primacy of judgment (which refuses to allow that the parts of the whole are "mutually independent in the way that we are elsewhere used to find when we have parts of a whole")—in thus insisting upon the unity of a thought or a proposition, in no way denies the compositionality of either thought or language. It insists only upon the mutual interdependence of compositionality and contextuality (Diego Marconi [unpublished] nicely summarizes the position in the slogan "Understanding without contextuality is blind; understanding without compositionality is empty.") Frege’s view of natural language—upon which the Tractatus builds its "understanding of the logic of language"—affirms both (1) that it is in virtue of their contributions to the senses of the whole that we identify the logical "parts" of propositions, and (2) that it is in virtue of an identification of each "part" as that which occurs in other propositional wholes that we segment the whole into its constituent parts (see note 37). [p. "The Method of the Tractatus," p. 432, n. 34]

    Diamond's article dealing with all of this is titled "What Nonsense Might Be." (Sorry for this rambling response, but I'm pressed for time.)

  10. Steve Harris said,

    June 10, 2008 @ 7:17 pm

    The Donkey sentence

    (1) Every farmer who owns a donkey beats it.

    is easily rendered in first-order logic:

    (1')
    for all x, { Farmer(x) ==> [ for all y, ( (Donkey(y) & Own(x,y)) ==> Beat(x,y) ) ] }

    or, more simply,

    (1")
    for all x, for all y, [ (Farmer (x) & Donkey(y) & Own(x,y)) ==> Beat(x,y) ]

    The hard part is figuring out the substrate by which (1) gets turned into that proposition. Of course, I should read the paper, to see. Eh, I will; but not before penning this missive.

    The first difficulty is seeing how to interpret "a donkey". We start out with an obvious universal quantifier, "Every farmer"; where does it come about that "a donkey" is also a universal quantifier? But this is really a quite standard feature of English, using "a" to mean "any", hence, "every". Example:

    (2) If I pick up a book, that's because I want to read it.

    That means, quite plainly,

    (2')
    for every x, [ (Book(x) & PickUp(I,x)) ==> Want-to-Read(I,x) ]

    But, of course, not every instance of "a" means "any":

    (3) I met a man yesterday.

    That means, just as plainly,

    (3')
    there exists x [ Man(x) & Yesterday-Meet(I,x) ]

    So the actual difficulty comes in describing when "a" means "any" and when it means "this specific one".

    Perhaps it is the occurrence, in (1), of "a" within an introductory "Every" phrase that gives the universal (as opposed to existential) nature to "a". Does this apply to (2) as well? If we write (2) with a generalized third-person instead of first-person subject, I think it becomes clearer:

    (4) If a person picks up a book, that's because he wants to read it.

    means

    (4')
    for all x, for all y, [ (Book(y) & PickUp(x,y)) ==> Want-to-Read(x,y) ]

    Here it's the "If" that is signalling both instances of "a" to mean "every".

    I'm going to leave the issue of when "a" means "any/every" for now and get on to the more difficult aspect: the "it" of (1).

    The truly annoying part of this is that there is *also* a way to interpret (1) using an existential quantifier. This gets into Russell's explication of the word "the" to indicate a unique individual. In Principia Mathematica, Russell and Whitehead introduced a single symbol–apostrophe (')–to indicate this usage. It works like this, if I recall correctly:

    If we're going to refer to "the x", we need a defining property for x. Let's say we'll define x by the propositional function P, i.e., we're going to be referring to "the x which satisfies P(x)". We'll symbolize this with

    'xP(x) (this is not a proposition, so I'm not giving it a number)

    Then if we want to say,

    (5) The x which satisfies P(x), also satisfies Q(x).

    we can write that symbolically as

    (5')
    ('xP(x))Q(x)

    (It is the abutting of Q(x) right up against the grouping ('xP(x)) which is important here, as meaning that the Q(x) is taking part in the "the" meaning.)

    And what (5') actually means, in terms of the usual logical symbols, is this enormous thing:

    (5")
    {there exists x [ P(x) & for all y (P(y) ==> y = x) ] } &
    for all x [ P(x) ==> Q(x) ]

    It's important to note the implied parenthesization in (5'): the x inside the Q is just as bound as the one inside the P. The apostrophe symbol is a rather complex one; but it's extremely common in our everyday speech to use this kind of complex quantification.

    Thus, ' contains both existential and universal quantification. To say (5)–to make use of the definite article "the"–is to make both an existential and a universal claim: There is something which satisfies P; it is unique; and, what is more, anything satisfying P also satisfies Q.
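
    (In programming terms, a rough sketch of (5") might look like this; the names are made up, nothing canonical:)

    # "The P is Q": there is exactly one P, and every P is Q, mirroring (5").
    def the(P, Q, domain):
        exactly_one_P = any(
            P(x) and all(not P(y) or y == x for y in domain)
            for x in domain
        )
        every_P_is_Q = all(not P(x) or Q(x) for x in domain)
        return exactly_one_P and every_P_is_Q

    domain = {1, 2, 3}
    print(the(lambda x: x == 2, lambda x: x % 2 == 0, domain))   # True: exactly one P, and it is Q
    print(the(lambda x: x > 1,  lambda x: x % 2 == 0, domain))   # False: uniqueness fails (2 and 3)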

    We can interpret (1) in this Russell/Whitehead manner:

    (1"')
    for all x { Farmer(x) ==> ['y(Donkey(y) & Own(x,y))] Beat(x,y) }

    Note that (1"') is *not* the same as (1"); it has a different meaning, as (1"') makes a statement only about farmers who own a unique donkey, while (1") makes a statement about farmers who own any number of donkeys (and says that all the donkeys owned by a given farmer are beaten by that farmer). But I believe (1) is ambiguous between these two meanings; it's not clear whether (1) includes within its comprehension farmers who own multiple donkeys. (1"') is, I maintain, an acceptable interpretation of (1), and it is likely closer to the intent of the speaker than is (1") (i.e., the likely picture the speaker has in mind is of a farmer with a single donkey, beating it, not of a farmer beating multiple donkeys).

    Let us stipulate, then, that (1"') is a good rendering of (1). How are we to have come to this? After all, (1"') is explicitly about uniquely defined objects, and (1) says "a donkey", not "the donkey". What triggers the usage of ' for the interpretation of (1) is the "it". Once a donkey has been introduced into (1), usage of "it" (with "a donkey" being the clear referent) means that the speaker is now concentrating upon what has become a uniquely defined object–no longer, really, "a donkey", it is, instead, *the* donkey referred to in the first part of the sentence.

    We do this all the time in conversation: We introduce an element with "a" and then proceed to talk about "the" element, assuming that it is now uniquely identified:

    (6) Joe gave Jane a flower, and she sniffed it.

    (6')
    {there exist x [Flower(x) & Gave(Joe,Jane,x)]} Sniff(Jane,x)

    However, the first conjunct in (6') is contained in the second; so we can render (6) more simply as

    (6")
    {'x[Flower(x) & Gave(Joe,Jane,x)]}Sniff(Jane,x)

    though that is more naturally the rendering of

    (6"')
    Jane sniffed the flower that Joe gave her.

    ———————————————————————–

    Frege's sentence:

    (7) If a number is between 1 and 0, then its square is between 1 and 0.

    I can't see this as anything other than a simple universal quantifier (i.e., not using the Russell/Whitehead '):

    (7')
    for all x (0 < x 0 < Square(x) Q(x) ]

  11. Steve Harris said,

    June 10, 2008 @ 7:26 pm

    Hmm, I got cut off. Apparently, I'm too long-winded for these comments.

    First, a correction on (6'):

    (6')
    {there exist x [Flower(x) & Gave(Joe,Jane,x)]} Sniff(Jane,x) }

    Now what I said about (6") should make sense.

    Starting over with Frege:

    (7) If a number is between 1 and 0, then its square is between 1 and 0.

    I can't see this as anything other than a simple universal quantifier (i.e., not using the Russell/Whitehead '):

    (7')
    for all x (0 < x 0 < Square(x) Q(x) ]

  12. Steve Harris said,

    June 10, 2008 @ 7:30 pm

    Trying again. (Apparently, something I'm doing is making unfortunate contact with the text coding, so I'll change my notation.)

    Frege's sentence:

    (7) If a number is between 1 and 0, then its square is between 1 and 0.

    I can't see this as anything other than a simple universal quantifier (i.e., not using the Russell/Whitehead '):

    (7')
    for all x ( Between(0,1,x) ==> Between(0,1,Sq(x)) )

    The hard part is that the overall structure of (7) is that of an if-then; but (7') has a universal quantifier as the outside logical operator, not an implication. The signal for placing a universal quantifier outside of everything is the occurrence of "a" in the if-phrase: as in (2), that means we need to universally quantify. Then (again as in (2)), any occurrence of "it" in the then-phrase is a reference to that same variable we're quantifying over from the if-phrase.

    Thus, I suggest this basic pattern: Let p and q be sentential propositional functions (i.e., ones that can occur in natural-language sentences, not just in logical propositions); p and q can each take a noun phrase as an argument. Let 0np stand for any noun phrase that is incomplete unless an article is placed before it. Let P, NP, and Q be the corresponding logical propositional functions, i.e., P(x) means "x satisfies p", NP(x) means "x is a 0np", and Q(x) means "x satisfies q". Then the sentence

    (8) If p(a 0np), then q(it).

    is rendered

    (8')
    for all x [ (NP(x) & P(x)) ==> Q(x) ]

    (And that's the end.)

  13. Chung-chieh Shan said,

    June 26, 2008 @ 6:37 pm

    A quick comment: Much of our paper is about the perennial problem of how to systematically render (1) as (1')/(1"), (8) as (8'), and so on.
