
Changed blog software from Serendipity (s9y) to WordPress

This blog has been neglected far too long and far too often, partly because things are always so busy and partly because the company and my personal Twitter accounts seem to have taken over. However, there’s another reason: I think I’d outgrown the Serendipity (s9y) software, having had more exposure to WordPress these days. I also noticed recently that there’s a WordPress app for the iPhone, which was the final thing that convinced me to switch software. I’ve always got my iPhone with me, so I’m far more likely to bash out a draft on there before post-editing at the PC (in fact that’s what I did for this post!).

So, having decided to switch, I set about investigating how to do it and came across two useful resources:
Technosailor.com has the actual download for the WordPress import script for Serendipity, and e-mats.org shows a little more of the detail of what to do with it.

The importer worked well in the main, first importing categories, then users and then posts. However, it failed to put the posts in any categories (other than the WordPress default of Uncategorized) and didn’t bring any of the post tags across, so those elements ended up being a manual cut-‘n’-paste job, which took about half an hour or so. Another gotcha with the import is that if you’ve deleted posts, categories, authors etc. from your Serendipity blog, there will be gaps in the database Id numbers. The importer just inserts the next counted value rather than the original Id of the post, category or author, so the Ids in your old URLs no longer line up with the new ones. This obviously has a major impact on the search engine profile of your site and on the user experience, resulting in the dreaded 404 errors if a user clicks through from a search engine. For posts, I manually sorted the Id numbers, and for the rest I used the .htaccess method discussed next.
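To make the Id gap problem concrete, here is a minimal Python sketch of the renumbering described above. It is not the importer's code; the function name and the sample Ids are made up for illustration, and it simply assumes the importer hands out 1, 2, 3... in the order of the old Ids.

```python
def sequential_id_map(old_ids):
    """Map each surviving old Id to the next counted value (1, 2, 3, ...)."""
    return {old_id: new_id for new_id, old_id in enumerate(sorted(old_ids), start=1)}


# Hypothetical Serendipity post Ids; 3, 4 and 6 were deleted at some point.
old_post_ids = [1, 2, 5, 7]
id_map = sequential_id_map(old_post_ids)

# Only the Ids whose value changed need fixing by hand or redirecting.
changed = {old: new for old, new in id_map.items() if old != new}
print(changed)  # {5: 3, 7: 4}
```

Everything after the first gap shifts, which is why a single deleted post early on can break most of the old URLs.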

Then I moved on to the RewriteRules in the Apache .htaccess file, as discussed in the e-mats.org post (see above). However, the RewriteRules they suggest are somewhat oversimplistic and at least one of them simply didn’t work. So I hand-crafted a few of my own. Whilst more complete, they are not a total resolution, but they will cover most scenarios when moving from a standard Serendipity set-up to a standard WordPress set-up:

RewriteEngine On
RewriteBase /
# Individual posts (numeric Id followed by the title)
RewriteRule ^archives/([0-9]+)-.*\.html$ /index.php?p=$1 [L,R=301]
# Monthly archives (four-digit year, two-digit month)
RewriteRule ^archives/([0-9]{4})/([0-9]{2}).*\.html$ /index.php?m=$1$2 [L,R=301]
RewriteRule ^archive$ / [L,R=301]
# Feeds
RewriteRule ^feeds/index\.rss2$ /index.php?feed=rss2 [L,R=301]
RewriteRule ^feeds/index\.rss1$ /index.php?feed=rss [L,R=301]
RewriteRule ^feeds/index\.rss$ /index.php?feed=rss [L,R=301]
RewriteRule ^feeds/index\.atom$ /index.php?feed=atom [L,R=301]
# Tags
RewriteRule ^plugin/tag/(.*)$ /index.php?tag=$1 [L,R=301]
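Before uploading rules like these, it can help to try the patterns against a few old URLs. The sketch below mirrors some of the rules in Python’s re module, with the literal dots escaped. It assumes Python’s regex behaviour matches Apache’s for patterns this simple, and the sample paths are invented. Note that in a .htaccess file the pattern is matched against the path without its leading slash, so the samples omit it too.

```python
import re

# (pattern, target) pairs mirroring the .htaccess rules; \g<1> plays the role of $1.
RULES = [
    (r"^archives/([0-9]+)-.*\.html$", r"/index.php?p=\g<1>"),
    (r"^archives/([0-9]{4})/([0-9]{2}).*\.html$", r"/index.php?m=\g<1>\g<2>"),
    (r"^archive$", "/"),
    (r"^feeds/index\.rss2$", "/index.php?feed=rss2"),
    (r"^plugin/tag/(.*)$", r"/index.php?tag=\g<1>"),
]


def redirect_target(path):
    """Return the target of the first matching rule, or None if nothing matches."""
    for pattern, target in RULES:
        match = re.search(pattern, path)
        if match:
            return match.expand(target)
    return None


# Hypothetical old Serendipity paths.
for path in ["archives/42-my-post.html", "archives/2009/05.html", "plugin/tag/apache"]:
    print(path, "->", redirect_target(path))
```

Stopping at the first match is the same idea as the ‘L’ flag on each rule.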

That covers archives, RSS feeds and article tags. However, there are still categories and authors to deal with, which, if you remember, need their database Ids mapping. For categories you probably need to add 2 to the Id found in the Serendipity set-up, because of the standard WordPress ‘Uncategorized’ and ‘Blogroll’ entries:

RewriteRule ^categories/(1)-.*$ /index.php?cat=3 [L,R=301]

Simply repeat that line in your .htaccess file for each category, replacing the Id in the brackets with the one from Serendipity and the number after cat= with the corresponding Id from WordPress.

And similarly for Authors:

RewriteRule ^authors/(1)-.*$ /index.php?author=2 [L,R=301]

where again the bracketed number is the Serendipity Id and the WordPress Id is added after the equals sign.
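If you have more than a handful of categories or authors, typing those lines by hand gets tedious. Here is a small Python sketch that prints them from an old-to-new Id mapping. The function name and the mappings are hypothetical, so substitute the Ids from your own two databases.

```python
def rewrite_rules(prefix, query_key, id_map):
    """Build one RewriteRule line per old Id, in the same shape as the rules above."""
    return [
        f"RewriteRule ^{prefix}/({old_id})-.*$ /index.php?{query_key}={new_id} [L,R=301]"
        for old_id, new_id in sorted(id_map.items())
    ]


# Hypothetical mappings: Serendipity Id -> WordPress Id.
category_ids = {1: 3, 2: 4, 4: 5}
author_ids = {1: 2}

lines = rewrite_rules("categories", "cat", category_ids)
lines += rewrite_rules("authors", "author", author_ids)
print("\n".join(lines))
```

The output can be pasted into the .htaccess file beneath the other rules.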

All in all, it’s a little more work, but it’s a much more complete mapping that should catch the long-tail entries and keep a lot more of the SEO benefits from your old blog software. Talking of SEO benefits, it’s perhaps worth pointing out what the elements in the square brackets at the end of each rule mean. The ‘L’ tells Apache that this is the last rule to apply now that we’ve found a match. Most importantly for SEO, the ‘R=301’ tells Apache to issue a 301 (Moved Permanently) redirect, which is the way to tell the search engines that an item has permanently moved. Over time Google et al. will pick up on the fact that things have moved around, update their cached records for your site and update their results. Win 🙂

If there’s anything you can add, or if you have gained something from this, feel free to post a comment for others to benefit from. Thanks.