Migrating dasBlog to WordPress

Over the last couple of years, I run my blog using the dasBlog engine. As I started hosting the blog in 2004 on my own server, I choose dasBlog as it did not need any database on the backend, saved everything in XML and did a great job on the full text search over the XML content. Beside that, a blog engine running on ASP.NET seemed the right choice being familiar with the technology. Eventually, I did several fixes and hacks on my installation over the last few years. Unfortunately, there was no new release since March 2009. As I like playing with alternative technologies from time to time and WordPress comes with a rich set of features I miss at dasBlog, I decided to migrate to WordPress. In this article I will describe the steps moving forward to WordPress hosted on a Windows Server 2008.

Overview

Moving forward to the new platform includes several steps. First of all the server has to be prepared to host the new platform. After the new blog engine is set up, the content needs to be migrated. Finally, the old engine needs to be shut down and the server needs to be set up to forward requests to the old engine to the new one.

Installing WordPress

Installing WordPress should be relatively easy as it is available through the Microsoft Web Platform Installer 2.0. However, you might encounter issues during the process on machines running IIS 7 as the required Windows Update KB980363 causes the installation process to hang. The update process only hangs when started from within the Web Platform Installer, so pick it from the Microsoft Download Page and install the hotfix beforehand. Before installing WordPress you need to install PHP on the server. In addition to the instructions how to configure PHP on IIS 7, Ruslan Yakushev provides a very good tutorial how to set up FastCGI on Windows Server 2008.

Migrating from dasBlog to WordPress

Originally, I planned to use BlogML to migrate the content from dasBlog to Worpress. Instead I found dasBLogML which is a simple GUI wrapper around the original BLogML. First you download the content of the old blog to your local machine.

dasBlogML

To import the BlogML data, you might want to follow Edgardo Vega’s article. In order to avoid potential problems during the import, also have a look at Daniel Kirstenpfad’s tip about replacing all   occurrences in the XML file. Using the BLogML Importer plug-in you can finally import the previously exported XML file.

Import BlogML

Redirecting dasBlog

In the final step I had to redirect the requests from the old blog to the new one. There are several issues to think about: First of all, all binaries are still referred from the old blog. Consequently it is not possible to just shut it down. Furthermore, there are many entries that are linked from several places all over the web.

My solution is to create a IIS module using managed code, and the ASP.NET server extensibility APIs. First of all I had a look at the schemes of the permalinks or URIs I have chosen for the old blog

http://www.blog.old/yyyy/mm/dd/articletitle.aspx

and the new one

http://www.blog.new/yyyy/mm/dd/article-title/

Consequently the HTTP module has to perform several steps: Replace the domains, remove the technology specific information in form of the .aspx file extension (technology specific information isn’t good practice anyway based on Tim Berner-Lee’s article about cool URIs) and finally add some hyphens. While the later is an somehow impossible task, there is an easy workaround. The scheme for permalinks I have chosen in WordPress will list all articles on a given day if you omit the article title in the URI. Consequently, the requested URI will be rewritten by the module to

http://www.blog.new/yyyy/mm/dd/

and sent back in the response with HTTP status code 301 (moved permanently) base on RFC 2616:

“The requested resource has been assigned a new permanent URI and any future references to this resource SHOULD use one of the returned URIs. Clients with link editing capabilities ought to automatically re-link references to the Request-URI to one or more of the new references returned by the server, where possible. This response is cacheable unless indicated otherwise.

The new permanent URI SHOULD be given by the Location field in the response. Unless the request method was HEAD, the entity of the response SHOULD contain a short hypertext note with a hyperlink to the new URI(s).“

Additional URIs that need to be processed are in the form of

http://www.blog.old/CategoryView,category,categoryname.aspx

Also this one is relatively easy as WordPress expects the category in form of

http://www.blog.new/category/categoryname/

Finally, the selection from the calendar in dasBlog looks like

http://www.blog.old/default,date,yyyy-mm-dd.aspx

and needs to be transformed into

HttpApplication application = (HttpApplication)sender;
HttpContext context = application.Context;

context.Response.StatusCode = 301;
context.Response.RedirectLocation = GetRedirectLocation(context.Request);
context.Response.Cache.SetCacheability(HttpCacheability.Public);

if (!context.Request.Equals("HEAD"))
{
    ...
}

To create the redirect locations I use a set of Regex objects that cover the most important URI types.

Regex singleUriPattern
  = new Regex("http://" + OLD_DOMAIN
  + "/[0-9]{4}/[0-9]{2}/[0-9]{2}/([\w-_\%+]+)*.aspx");
Regex categoryUriPattern
  = new Regex("http://" + OLD_DOMAIN
  + "/CategoryView,category,([\w-_\%+]+)*.aspx");
Regex dateUriPattern
  = new Regex("http://" + OLD_DOMAIN
  + "/default,date,[0-9]{4}-[0-9]{2}-[0-9]{2}.aspx");

Now everything beside the content can be deleted from the old dasBlog installation. In order to avoid any requests not covered by the previously deployed module, the custom error page for status code 404 is set to the corresponding URI on the news blog.

After deploying the module (into the bin folder of the dasBlog installation) it needs to be added to the web.config. Therefore you just have to add it to the httpModules section.

<httpModules>
  <add name="UriRedirector" type="RedirectModule" />
</httpModules>

Edit Custom Error Page Dialog

If the application pool is running in Classic mode, the custom error pages do not cover any ASP.NET content. Therefore add the customError section into to web.config file. Now all requests that do not request any content from the old blog or which a are not redirected by your module are covered by the new WordPress blog.

<customErrors mode="On">
     <error statusCode="404" redirect="http://www.blog.new/404/" />
</customErrors>

Conclusion

Now the content from the old dasBlog instance are displayed on the new WordPress blog, the most important links to your old dasBlog pages are covered by the URI redirection to the new blog and all the rest is caught by the WordPress blog as well. You might want to extend the redirect module with further regular expressions (e.g. to cover CommentView.aspx or other dasBlog pages).

Leave a Reply