Admin update

I've recently done a bit of site-administration hacking — more details are below, but the bottom line is that URLs like http://languagelog.org now do the right thing. Despite this progress, a few problems remain. There's one problem in particular where suggestions from readers with expertise in website administration would be appreciated.

On April 6, 2008, the old Language Log server failed, and had to be replaced by a new machine, in a new location with a new IP address. I restored the old archives in comparable places on the new machine, and arranged for the old domain itre.cis.upenn.edu to be resolved to the new IP address, so that old links like http://itre.cis.upenn.edu/~myl/languagelog/archives/003572.html continue to work. Rather than try to get the old 2003-era Movable Type installation working on the new machine, I decided to switch to WordPress 2.5, which I set up in /var/www/nll (that's "nll" for "new language log"), so the basic URL for the new site became http://languagelog.ldc.upenn.edu/nll.

Yesterday, I succeeded in persuading the domain registration company to direct languagelog.com, languagelog.net, and languagelog.org to the IP address of the new site. (Actually, I think it's the company that bought the company that bought the company that I registered those domains with back in 2003, which is probably why it took me a while to make the right connections, but anyhow, it's done.)

After a few flourishes of RedirectMatch in the appropriate Apache configuration files, nice simple URLs like http://languagelog.org will now take you to the LL home page, and individual posts can be accessed via URLs like http://languagelog.org/?p=172.
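
The exact directives are not reproduced here. Purely as an illustration, a rule of this kind might look like the sketch below. The *:80 address, the ServerName and ServerAlias layout, and the choice to redirect only the bare root path are all assumptions, not the configuration actually in use.

```apache
# Illustrative only: a hypothetical stanza, not the live configuration.
<VirtualHost *:80>
    ServerName languagelog.org
    ServerAlias languagelog.com languagelog.net

    # Redirect a request for the bare root path to the WordPress
    # installation under /nll/ on the new host.
    RedirectMatch ^/$ http://languagelog.ldc.upenn.edu/nll/
</VirtualHost>
```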

One remaining problem is the old RSS feed at URL http://itre.cis.upenn.edu/~myl/languagelog/index.rdf. I understand that some people — maybe quite a few people — still have their RSS readers linked there. So I'd like to reconnect that feed to the new one at http://languagelog.ldc.upenn.edu/nll/?feed=rss2. I tried adding

RedirectMatch ^/~myl/languagelog/index.rdf$ /nll/?feed=rss2

in the appropriate place in /etc/apache2/sites-available/default, but that doesn't seem to work.

There's probably a way to get WordPress to put the right xml in the right place, but a quick search of the docs didn't turn it up. I could hack up a program to get the xml from the new URL and put it in the old file, but that seems inelegant at best.

So if you know how to fix this, please let me know.



22 Comments

  1. Greg said,

    May 21, 2008 @ 9:17 am

    This might help – http://radio.userland.com/userGuide/reference/howToRedirectRss

  2. Jon said,

    May 21, 2008 @ 9:59 am

    I'm sure you're already aware of this, and I assume it's a problem with the noninstallation of movable type. But I can't get the search function to work on the old site.

  3. Peter Hollo said,

    May 21, 2008 @ 10:20 am

    I think the problem is that in a redirect or redirectmatch, the URL you're redirecting to can't be relative – so you need to put the entire URL of your current feed. The line would be:

    RedirectMatch ^/~myl/languagelog/index.rdf$ http://languagelog.ldc.upenn.edu/nll/?feed=rss2

    Hopefully… :)

  4. Tldz said,

    May 21, 2008 @ 10:35 am

    You want to make sure that your Redirect happens on the right virtual host. Either that means that you create a <VirtualHost> section and put the Redirect there, or you need to make sure that an unwrapped Redirect will be matched against the old hostname. (Note that HTTP pays a lot of attention to hostnames, not just to IP addresses.)

  5. Erik Hetzner said,

    May 21, 2008 @ 10:57 am

    redirects require an absolute url:

    Redirectmatch ^/~myl/languagelog/index.rdf$ http://languagelog.ldc.upenn.edu/nll/?feed=rss2

    You don't really need a regex match, though:

    Redirect permanent /~myl/languagelog/index.rdf http://languagelog.ldc.upenn.edu/nll/?feed=rss2

    should do the trick.

    The permanent means return a 301 redirect which lets well behaved clients know that in the future they should check the new url.

  6. Ruud said,

    May 21, 2008 @ 11:26 am

    Would it not be easier to put a notice in the old RSS feed that people should update to the new feed?

  7. Rick S said,

    May 21, 2008 @ 11:45 am

    I don't really know anything about it, but I notice your Redirectmatch line lacks a question mark before "feed=rss2"; could that be the problem?

  8. Rick S said,

    May 21, 2008 @ 11:57 am

    By the way, I notice that the URL http://languagelog.org/nll/?feed=rss2 returns only 15 entries, rather than the 151 of http://languagelog.ldc.upenn.edu/nll/?feed=rss2.

  9. Mark Liberman said,

    May 21, 2008 @ 12:08 pm

    @Rick S: Sorry, that was a typo in the post — the line in the configuration file had the question mark in the appropriate place.

    Someone emailed me to suggest that the second argument for RedirectMatch needs to be a full URL starting with "http://". I don't think that's true, but trying it didn't help.

  10. Mark Liberman said,

    May 21, 2008 @ 12:20 pm

    @greg: I tried the newLocation thing recommended at that radioUserland site, but it's not clear to me that it works. When I subscribe via Google Reader, I don't see the new stuff; when I open the URL in firefox, I get a syntax error because the directive is not recognized.

  11. John P said,

    May 21, 2008 @ 12:23 pm

    The "Urban Giraffe" is hosting a wordpress plugin to do RSS redirects:

    http://urbangiraffe.com/plugins/redirection/

    I found this by googling "how to redirect RSS feed".

  12. John Laviolette said,

    May 21, 2008 @ 12:28 pm

    The URL:

    http://itre.cis.upenn.edu/~myl/languagelog/index.rdf

    and the URL:

    http://languagelog.ldc.upenn.edu/nll/?feed=rss2

    are different at the beginning, too. Unless itre.cis.upenn.edu and languagelog.ldc.upenn.edu point to the same place, I don't see how changing just ~myl/languagelog/index.rdf to /nll/?feed=rss2 is going to work.

    Also, I believe the period and the question mark have to be escaped, because they are pattern-match characters. See: http://www.apacheref.com/ref/mod_alias/RedirectMatch.html

    Maybe this will work?

    Redirectmatch ^/~myl/languagelog/index\.rdf$ http://languagelog.ldc.upenn.edu/nll/?feed=rss2

  13. Mark Liberman said,

    May 21, 2008 @ 1:15 pm

    @John P: the redirection plugin doesn't work for this task. I installed it, and it accepts the proposed redirection without complaint, but the URL still goes to the same old file.

    @John Laviolette: itre.cis.upenn.edu and languagelog.ldc.upenn.edu now connect to the same IP address, so any HTTP requests to the former will end up being handled by the latter.

  14. John Laviolette said,

    May 21, 2008 @ 2:06 pm

    It may be the same IP address, but is it the same location? In other words, the same directory or folder on the server? /var/www/ contains two folders, nll/ and myl/ ?

    Erik Hetzner's second example should be correct. I don't see why it wouldn't work, unless it has something to do with propagation. I'm not familiar with the way WordPress works, but does it create a file that ?feed=rss2 references, or is that always generated dynamically? If there's a static file, you can try making ~myl/languagelog/index.rdf a symbolic link to that file.

  15. Steven said,

    May 21, 2008 @ 2:58 pm

    One possibility is that the server may be using a separate redirect rule to direct requests to /~myl/ to another folder. If that's the case, then you may have to change your rewrite rule to make sure that it either accounts for that rule or takes precedence over it.

  16. Greg said,

    May 21, 2008 @ 5:20 pm

    @Mark

    Open a new text file, and enter these 4 lines:

    http://languagelog.ldc.upenn.edu/nll/?feed=rss2

    Save the file as "index.rdf" and place it in your old server in the http://itre.cis.upenn.edu/~myl/languagelog/ directory to replace the old index.rdf (you might want to rename the original so you don't lose it). I believe that that is all there is to it.

  17. Greg said,

    May 21, 2008 @ 5:22 pm

    and of course the code got lost because of the brackets:

    http://itre.cis.upenn.edu/~myl/languagelog/index.rdf

    just drop the space after the <'s

  18. Greg said,

    May 21, 2008 @ 5:23 pm

    [?xml version="1.0"?]
    [redirect]
    [newLocation]http://languagelog.ldc.upenn.edu/nll/?feed=rss2[/newLocation]
    [/redirect]

    maybe this will work. replace the [ and ] with < and >, respectively, of course

    [myl: I tried this (see comments above). When I access that URL with firefox, it complains about an invalid xml directive rather than showing me the feed and offering to subscribe to it.]

  19. Mark Mills said,

    May 21, 2008 @ 6:14 pm

    The URL you redirect TO has to be a complete URL with the exact site you want any and all of the other matches to go to. (From the Apache documentation: "The old URL-path is a case-sensitive (%-decoded) path beginning with a slash. A relative path is not allowed. The new URL should be an absolute URL beginning with a scheme and hostname.")

    You seem to have multiple vhosts configured to support the old and new sites so you may need to place this in more than one place. At the very least it needs to be within the VirtualHost segment for the old site's files OR in the Directory segment that hosts "~myl" or languagelog. You may wish to liberally apply it to just about every one of them since it seems pretty harmless.

    Also, you might as well be a bit more liberal about the path you accept, just in case:

    RedirectMatch permanent .*/languagelog/index.rdf$ http://languagelog.ldc.upenn.edu/nll/?feed=rss2

  20. Garrett Wollman said,

    May 21, 2008 @ 10:53 pm

    I almost always get frustrated and end up doing this stuff using mod_rewrite instead. That also has the advantage that it works in an understandable way when it's not in the main config file (which is pretty important to me in my day job, where we don't let all 800 users fiddle with our httpd.conf files). That would look something like this:

    ----snip----
    RewriteEngine on
    RewriteBase /~myl/languagelog/
    RewriteRule index.rdf http://languagelog.ldc.upenn.edu/nll/?feed=rss2 [R=permanent]
    ----snip----

    (That would go in a .htaccess file, and of course the site configuration has to allow this sort of stuff in .htaccess files. We usually use "AllowOverride All".)

  21. Charles Belov said,

    May 22, 2008 @ 3:14 am

    That would be my take. You need to put the redirect statement in one of the following files:

    /www/conf/httpd.conf
    /www/htdocs/.htaccess

    assuming /www/htdocs is an alias of the path to your website

  22. Darren Gilroy said,

    May 22, 2008 @ 11:12 pm

    Often the tilde is caught and handled by the "UserDir" module (google mod_userdir) before RedirectMatch gets its chance. Internally apache organizes all of the modules in a chain — the details are a bit fuzzy for me.

    In any case, try turning off UserDir/mod_userdir in your apache config file by prefixing relevant lines with a #. I am grasping at straws here (too?), but hope this helps.
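
Two suggestions recur in the comments above: give the redirect an absolute target URL, and place it in the virtual host that answers for the old hostname. They can be combined in a sketch like the following. It is a hypothetical stanza, not a confirmed fix: the *:80 address is an assumption, as is the premise that name-based virtual hosting is already enabled on the server. Whether mod_userdir interferes with /~myl/ requests, as the last comment conjectures, is not established in this thread, so the sketch leaves UserDir untouched.

```apache
# Hypothetical stanza for the old hostname; the address, port, and file
# placement are assumptions, not taken from the actual server.
<VirtualHost *:80>
    ServerName itre.cis.upenn.edu

    # Exact-path redirect with an absolute target URL.
    # "permanent" makes Apache answer with a 301 status.
    Redirect permanent /~myl/languagelog/index.rdf http://languagelog.ldc.upenn.edu/nll/?feed=rss2
</VirtualHost>
```

After reloading Apache, a header-only request such as curl -I http://itre.cis.upenn.edu/~myl/languagelog/index.rdf should show a 301 status and a Location header if the rule is being applied. If it still returns the old file, the request is probably being handled by a different virtual host or an earlier rule.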
