Nurbling


From today's SMBC:

A little internet magic:

And hey presto:

Except if we compare the actual 2012 SOTU to the nurbled version, we find that a substantial amount of over-nurbling has occurred — e.g. in the first quoted example

nurble nurble nurble nurble TESTAMENT nurble nurble COURAGE, SELFLESSNESS, nurble TEAMWORK nurble nurble's nurble nurble.

which is a nurbled version of

These achievements are a testament to the courage, selflessness, and teamwork of America's Armed Forces.

But according to the proposed rule "Change everything but the nouns to 'nurble'", we should have gotten something like

nurble ACHIEVEMENTS nurble nurble TESTAMENT nurble nurble COURAGE, SELFLESSNESS, nurble TEAMWORK nurble AMERICA 'nurble nurble FORCES

Apparently Zach's reader interpreted "noun" as "singular common noun" – and decided to leave 's in as an un-nurbled token as well. So, here's a little home project for those LLOG readers who have electrical power and internet service this post-Frankenstorm morning: Write a proper nurbler in NLTK (or some other framework of your choice).

You could try some variations on the theme, like replacing just the nouns, rather than everything except the nouns.
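As a rough illustration of the rule and of the variation just mentioned, here is a minimal Python sketch that works on tokens that have already been tagged. The Penn Treebank tags in `EXAMPLE` are assigned by hand for illustration only (a real tagger may well disagree, e.g. about "Armed"), and the names `nurble`, `join_tokens` and `EXAMPLE` are made up for this sketch.

```python
# Hand-tagged example sentence (tags are an assumption, not tagger output).
EXAMPLE = [
    ("These", "DT"), ("achievements", "NNS"), ("are", "VBP"), ("a", "DT"),
    ("testament", "NN"), ("to", "TO"), ("the", "DT"), ("courage", "NN"),
    (",", ","), ("selflessness", "NN"), (",", ","), ("and", "CC"),
    ("teamwork", "NN"), ("of", "IN"), ("America", "NNP"), ("'s", "POS"),
    ("Armed", "JJ"), ("Forces", "NNPS"), (".", "."),
]

PUNCT = {",", ".", "!", "?", ";", ":"}


def nurble(tagged, keep_nouns=True):
    """Return output tokens for a list of (word, tag) pairs.

    keep_nouns=True:  everything but the nouns becomes 'nurble';
                      nouns are upper-cased, as in the comic.
    keep_nouns=False: only the nouns become 'nurble'.
    """
    out = []
    for word, tag in tagged:
        is_noun = tag.startswith("NN")  # NN, NNS, NNP, NNPS
        if word in PUNCT:
            out.append(word)            # punctuation is always kept
        elif keep_nouns:
            out.append(word.upper() if is_noun else "nurble")
        else:
            out.append("nurble" if is_noun else word)
    return out


def join_tokens(tokens):
    """Rejoin tokens, attaching punctuation and clitics like 's to the left."""
    text = ""
    for tok in tokens:
        if text and tok not in PUNCT and not tok.startswith("'"):
            text += " "
        text += tok
    return text


print(join_tokens(nurble(EXAMPLE)))
print(join_tokens(nurble(EXAMPLE, keep_nouns=False)))
```

Treating every tag that starts with `NN` as a noun is what brings the plural and proper nouns back; whether a separated 's should be nurbled is a separate decision, and this sketch simply nurbles it in the first mode.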

Update — some autonurblers from the comments include the original program by Jeff Lee, which uses context-independent table lookup for POS assignment, and John Radke's NLTK version with L3viathan's variation on the theme, which use NLTK's POS tagger:

# nurble.py by http://twitter.com/johnradke
import nltk

ok = ['NN', 'NNP', 'NNS']    # noun tags whose words are kept
punc = [',', '.', '!', '?']  # punctuation tokens that are kept

def nurbword(taggedWord):
    # taggedWord is a (word, tag) pair as returned by nltk.pos_tag
    if taggedWord[1] in ok + punc:
        return taggedWord[0].upper()
    return 'nurble'

def untok(words):
    # rejoin tokens, attaching punctuation to the preceding word
    return "".join(words[0:1] + [w if w in punc else " " + w for w in words[1:]])

#############################
with open('sotu.txt') as f:
    sotu = f.read()
sentences = nltk.tokenize.sent_tokenize(sotu)
taggedSentences = [nltk.pos_tag(nltk.word_tokenize(s)) for s in sentences]
with open('nurbled.txt', 'w') as nurbled:
    nurbled.write(" ".join(untok([nurbword(w) for w in s]) for s in taggedSentences))
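For contrast with the tagger-based script, the context-independent table-lookup approach can be sketched roughly as follows. This is not Jeff Lee's PHP code: the tiny `POS_TABLE`, the function names, and the naive plural fallback are all hypothetical, loosely modelled on his description in the comments of a word list with few plural entries and a later version that tries to recognize plurals of known words.

```python
# Hypothetical miniature word -> part-of-speech table; a real database
# would have hundreds of thousands of entries.
POS_TABLE = {
    "achievement": "noun",
    "testament": "noun",
    "courage": "noun",
    "these": "determiner",
    "are": "verb",
    "a": "article",
}


def lookup_pos(word):
    """Look a word up without any context; fall back to a crude plural check."""
    w = word.lower()
    if w in POS_TABLE:
        return POS_TABLE[w]
    # Naive de-pluralization: try to map the word onto a known singular noun.
    for suffix, replacement in (("ies", "y"), ("es", ""), ("s", "")):
        if w.endswith(suffix):
            stem = w[: -len(suffix)] + replacement
            if POS_TABLE.get(stem) == "noun":
                return "noun"
    return None  # unknown word


def nurble_words(words):
    """Upper-case words the table calls nouns; nurble everything else."""
    return [w.upper() if lookup_pos(w) == "noun" else "nurble" for w in words]


print(nurble_words(["These", "achievements", "are", "a", "testament"]))
```

Because the lookup ignores context, a word that can be either a noun or a verb always gets the same answer, which is the main thing a POS tagger improves on.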

And Adam B's perl one-liner, as modified by Tom, which uses the tagger in the Lingua module:

perl -MLingua::EN::Tagger -ple'$_=Lingua::EN::Tagger->new->get_readable($_);s#/N\w+##g;s#/P\w+##g;s#[A-Z]\S*/\w+#Nurble#g;s#\S+/\w+#nurble#g'
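To make the four substitutions in the one-liner easier to follow, here is a Python sketch that applies the same regular expressions to text that is already in `word/TAG` form. It does no tagging itself; the `word/TAG` layout and the tag names in the sample string are assumptions about what the tagger's readable output looks like, and `nurble_tagged` is a name invented here.

```python
import re


def nurble_tagged(tagged_text):
    """Apply the one-liner's substitutions, in order, to 'word/TAG' text."""
    # 1. Strip tags beginning with N, leaving those words bare (and so un-nurbled).
    text = re.sub(r"/N\w+", "", tagged_text)
    # 2. Strip tags beginning with P, so those tokens survive as well.
    text = re.sub(r"/P\w+", "", text)
    # 3. Any remaining tagged token that starts with a capital becomes 'Nurble'.
    text = re.sub(r"[A-Z]\S*/\w+", "Nurble", text)
    # 4. Every other remaining tagged token becomes 'nurble'.
    text = re.sub(r"\S+/\w+", "nurble", text)
    return text


# Hand-written sample in the assumed format.
sample = "These/DET achievements/NNS are/VBP a/DET testament/NN ./PP"
print(nurble_tagged(sample))
```

The order matters: the first two steps remove the tags from the tokens to be kept, so the last two patterns, which require a `/TAG` suffix, can no longer match them.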

And a version by Rob Speer that uses FreeLing for the POS tagging, and relies on ConceptNet to emphasize the most "political" words. Rob's version has its own cute web app, which automatically seeds itself with randomly selected (U.S. Presidential) Inaugural Addresses…



30 Comments

  1. Zach said,

    October 30, 2012 @ 9:11 am

    Yes, there are a few people working on advanced nurblers. Hopefully there'll be something soon to nurble this pressing need.

    <3
    Zach

  2. L3viathan said,

    October 30, 2012 @ 9:18 am

    Shouldn't it be rather easy if you POS-tagged the text with e.g. http://nlp.stanford.edu/software/tagger.shtml ?

  3. johnradke said,

    October 30, 2012 @ 9:19 am

    I whipped up an NLTK-based version here:

    http://pastebin.com/eN2vw58d

    with the following result on President Obama's last SOTU:

    http://pastebin.com/cyNa5uuZ

  4. Wes said,

    October 30, 2012 @ 9:24 am

    What is the essence of nurble? Why do we find nurble to be such a profound statement on our meaningless existence?

  5. Adam B said,

    October 30, 2012 @ 9:25 am

    pipe the text to
    perl -MLingua::EN::Tagger -nE'$_=Lingua::EN::Tagger->new()->get_readable($_);s#/N\w+##g;s#\S+/\w+#nurble#g;say'

  6. johnradke said,

    October 30, 2012 @ 9:26 am

    Couple bugs I'm aware of with my version: 1) it strips all paragraph formatting in the process of parsing and re-forming the sentences, and 2) it does not capitalize "nurble" where appropriate.

    Anybody's free to do whatever they want with the code.

  7. GAC said,

    October 30, 2012 @ 9:28 am

    Leaving in 's could be an unintended result of the programming — apparently 're is also left alone later in the text.

  8. Adam B said,

    October 30, 2012 @ 9:29 am

    Er,
    perl -MLingua::EN::Tagger -ple'$_=Lingua::EN::Tagger->new->get_readable($_);s#/N\w+##g;s#\S+/\w+#nurble#g'

    is tidier. The clever stuff is all in Lingua::EN::Tagger

  9. L3viathan said,

    October 30, 2012 @ 9:36 am

    @johnradke

    I took the liberty of adding ".upper()" to make it look more like in the image.

    http://pastebin.com/skNcPwYL

  10. Rod Johnson said,

    October 30, 2012 @ 9:37 am

    Mark, I'm surprised to see that "to the" was impervious to your nurbling.

    [(myl) Scribal error. Fixed now.]

  11. Toma said,

    October 30, 2012 @ 1:19 pm

    I wonder if using "smurf" would be better than using "nurble"? Well, now that I think about it, "smurf" works better for replacing adjectives: "Take this smurfing election and smurfing shove it" for example.

  12. sam said,

    October 30, 2012 @ 1:27 pm

    RE: nurbling w/NLTK: AFAIK, all the NLTK tokenizers separate "'s" automatically. It wouldn't be too difficult to fix, but it'd be irritating.

    [(myl) But it would be easy to nurble-proof each separated 's, if we decided on that principle.]

  13. Sili said,

    October 30, 2012 @ 2:31 pm

    Better and more readily parseable: Smurf this smurfing smurf and smurfing smurf it.

  14. Mr Fnortner said,

    October 30, 2012 @ 2:36 pm

    The satisfaction of saying nurble notwithstanding, we already have blah, blah, blah, and yada, yada, yada that are easy to process and aren't distracting.

  15. Bril said,

    October 30, 2012 @ 3:22 pm

    There's still room for refinement of the effect. One might replace all words by nurble or something similar. Just like Herbert Fritsch does in this Youtube fragment:

    http://www.youtube.com/watch?v=rI3R2606tvM&feature=related

  16. benjamin said,

    October 30, 2012 @ 3:42 pm

    a somewhat related marklar has been made in a marklar marklar that features a marklar marklar who speak a marklar that is identical to marklar except for the fact that every marklar is replaced with the marklar "marklar".

    http://www.youtube.com/watch?v=Bzq4YDt7V-o

  17. Jeff Lee said,

    October 30, 2012 @ 3:54 pm

    'Apparently Zach's reader interpreted "noun" as "singular common noun" – and decided to leave 's in as a un un-nurbled token as well.'

    Actually, neither assertion is correct. The part of speech database I used has very few plural nouns in it (only about two hundred — out of 294,950 entries), and the letter 's', for some reason, is present in the database as a noun in its own right. If you can read the PHP code, you'll notice I did no processing of special cases. I "decided" nothing, merely spit out a capitalized version of the word if it was tagged as a noun in the database, or "nurble" if it was not; it was, as is clearly expressed in the comments, an extremely quick and dirty job.

    I've since put up an expanded version which does attempt to figure out if unknown words are pluralized forms of words that do exist in the database, but it's still far from perfect (though it never crossed my mind that perfection would be demanded of a task of such trivial import, suggested and undertaken for mere amusement value).

  18. Rubrick said,

    October 30, 2012 @ 4:52 pm

    I think "nurble" should be extended to mean not just the act of converting text via such a script, but producing speech that consists mostly of buzzwords and connective tissue in the first place. As in, "Romney nurbled on for another 3 minutes without making any effort to address the question."

  19. Steve said,

    October 30, 2012 @ 4:58 pm

    Technical issues aside, the joke is undercut by the fact that it is not true that you can replace everything in a political statement but nouns with "nurble" and still have a broadly understandable comment. I think the joke would be funnier if it at least "sort of" worked.

    My suggestion: tweak the rule so that you just replace all verbs with "nurble". This creates a similarly absurdist effect while also producing statements one can actually follow. If this produces too few nurbles, you could also nurblize nominalizations, particles, and other, similar, words which are not technically verbs but that more or less directly reference or contemplate an action being performed. Nurble could also be modified to make them match tense, etc. A rule along these lines would probably be too complex to lend itself to a simple algorithm but I think it makes for more amusing nurblizations.

    I support Obama because he is an insightful man who understands the difficulties our country faces and he'll do what he needs to do to make our country great.

    Becomes

    I nurble Obama because he nurble an in-nurbleful man who nurbles the nurbles our country nurbles and he'll nurble what he nurbles to nurble to nurble our country great.

    While

    Romney is the top choice for any true American who understands that the entitlement-culture is turning us into whiney and effete shadows of what we should be.

    Becomes

    Romney nurble the top nurble for any true American who nurbles that the en-nurblement-culture is nurbling us into nurbly and effete nurbles of what we should nurble.

  20. Nathan said,

    October 30, 2012 @ 5:11 pm

    @Steve: It's funny, but your POS tagging is buggy.

  21. Steve said,

    October 30, 2012 @ 5:29 pm

    Agreed. I cheated outrageously with "in-nurbleful" and "en-nurblement culture." And probably made other "errors" as well. But, to rationalize my errors away, my variant on the nurble rule is deliberately a flexible one that gives the nurbler a fair degree of creative license. My goal is to produce sentences that are chock-full of absurd nurble goodness, yet strangely comprehensible. I like "en-nurblement culture" because it is both sillier and easier to follow than, say, "nurblement culture".

    And, to rationalize further, my buggily-executed variant on the nurble rule is analogous to the execution of the "rules" governing real languages, which are often not adhered to rigorously.

  22. rpsms said,

    October 31, 2012 @ 4:09 pm

    Interestingly, my less-than-kindergarten-level grasp of Spanish gives rise to the same experience when viewing Spanish-language advertising.

    I am sure there is already a post for this…

  23. Rob Speer said,

    October 31, 2012 @ 9:48 pm

    Here's an automatic nurbler using FreeLing for the POS tagging. It also uses ConceptNet to emphasize the most "political" words.

    http://nurble.luminoso.com/

  24. Vasha said,

    November 1, 2012 @ 3:02 am

    Thanks for that, Rob. It seems that inaugural addresses have not increased in "politicalness" since the beginning: the twelve that I sampled, beginning with James Madison, all rated between 67% and 76% on ConceptNet's Politic-o-meter; you might expect the art of political speechwriting to converge on a maximal density of buzzwords over time, but apparently not. For comparison I checked a couple other genres of speeches: a sermon rated 58%, Randy Pausch's Last Lecture 43%.

    [(myl) There's a bit of circularity here, I think, since "politicalness" here was defined by the word usage in the set of inaugural addresses. Presumably if the training set were political speeches from the 1780s, then the measure would tend to decline over time when tested on speeches from other eras; and if the training set were speeches from the 2000s, then the measure would tend to increase.]

  25. Tom said,

    November 1, 2012 @ 7:05 am

    Minor modification to Adam B's Perl code to preserve punctuation, rather than nurblifying commas, colons, etc, and to capitalise where appropriate:

    perl -MLingua::EN::Tagger -ple'$_=Lingua::EN::Tagger->new->get_readable($_);s#/N\w+##g;s#/P\w+##g;s#[A-Z]\S*/\w+#Nurble#g;s#\S+/\w+#nurble#g'

  26. Zeigfreid said,

    November 3, 2012 @ 5:27 pm

    Hi

    I thought I should just say, I took one of the scripts on this page that uses nltk, made it slightly more Unix like (infile and outfile are optionally supplied but otherwise it uses stdin and stdout). Then I wrote a quick script with just the following:

    xclip -o | nurble | xclip

    This reads from my clipboard, nurbles, and then writes to my clipboard. I've bound this to alt-ctrl-shift+n so that I can nurble my clipboard directly. That way, when I disagree with something, I can nurble it in-place without having to visit some website or paste it into a temp file.

    I do agree that the nurbling doesn't preserve meaning as well as it could, but I think that Steve's suggestion is going too far. Part of the joke is that "nurble nurble nurble" sounds like "blah blah blah". That way a political speech becomes "nurble nurble nurble AMERICANS nurble nurble HEALTHCARE" or some such.

    PS

    nurble ISSUES nurble, nurble JOKE nurble nurble nurble nurble FACT nurble nurble nurble nurble nurble nurble nurble nurble nurble EVERYTHING nurble nurble nurble STATEMENT nurble nurble nurble nurble nurble nurble nurble nurble nurble nurble nurble COMMENT.

    [(myl) Nice! I'm tempted to start ennurbling annoying comments, along the lines of Teresa Nielsen Hayden's use of disemvoweling.]

  27. Adam B said,

    November 4, 2012 @ 2:21 pm

    I've put a reverse-nurbler (it nurbles only the nouns and verbs) here – it uses a stemmer to (poorly) inflect the nurbles for comic effect.

    Sample output:
    These nurblements nurble a nurblent to the nurble, nurbleness, and nurble of Nurble's Nurbled Nurbles.

  28. Adam B said,

    November 4, 2012 @ 2:54 pm

    (It does classic ennurbling as well now)

    [(myl) Lovely. On reflection, Reverse Nurbling is a better anti-troll treatment.]

  29. Adam B said,

    November 4, 2012 @ 3:15 pm

    I thought the thing with disemvowelment was that the troll's comment was still decipherable to people who wanted to make the effort?

  30. Sili said,

    November 4, 2012 @ 5:02 pm

    I have to admit that Parts Of Speech still makes me stop when abbreviated.
