So You Want To Move Your Comments From Haloscan To Blogger…

Warning to regular readers of this blog: SEVERE Nerd Alert.

A lot of folks I know who started their blogs out on Blogger have used HaloScan for commenting since before Blogger implemented comments. Since HaloScan is shutting down in the next few days, you’d think you might want to move all your old comments to Blogger.

Good luck.

There’s really no practical reason why someone at Blogger can’t write some sort of comments parser to handle the XML files that HaloScan spits out, but so far, they haven’t. If you want to get it done right now, the only way I found to make it work is a ridiculously cumbersome process.

Basically, that process is to import everything into a WordPress blog where it can all be properly combined, then re-export it, run it through a Python script, and upload it back into Blogger.

I’ve decided to write up the entire procedure I went through both as an exercise in writing documentation and in order to help anyone else who’s crazy enough to want to try this. If you think you have the patience for this (or would just like to see exactly how insane I am), hit the “read the rest” link that follows.

I will warn you that there’s a pretty decent degree of difficulty on this: There is at least some command line usage involved. There is setting up of a local host on your computer (albeit a dead-easy to use one). There is a LOT of trial and error in this process, and you have to be comfortable with recognizing when things just didn’t work and you need to start over, or at least take several steps back.

This is also very time consuming. The main reason I had time to futz with all this is that I am currently unemployed.

Also, be sure to read all the way through these instructions and check out the known issues before you get started. There might be a dealbreaker in there, and I’d really hate for anyone to get halfway through this ridiculous process and realize that they wasted half a day on something they can’t use.

I’d like to say up front that this would absolutely not be possible without the work of Justin Watt (you’ll see why in Steps 5 and 7), and if this works, you should totally donate to his beer fund.

If you can accept all those caveats, here are the instructions I’ve put together based on how I got this to (finally) work. I’ve tried to make it as clear as possible, but some of this stuff gets pretty complicated.

.

Step 1: Install XAMPP on your computer.

XAMPP is a free local server with PHP and MySQL tools built right into it, and which works on Windows, OS X, and Linux (and Solaris if you REALLY want to get out there).

You can also run it as a webserver, but for the purposes of this set of instructions, I’m actually keeping it off-line so that the blog I’m doing this on for a friend of mine remains unpublished (since he only allows selected readers on Blogger).

.

Step 2: Install WordPress on your XAMPP Local Host.

Great set of instructions here for Windows XP. The main difference for setting it up for OS X is actually in the installation of XAMPP, which the XAMPP website covers pretty simply. Note that when you’re in Applications > XAMPP folder, you’ll see a shortcut to the “htdocs” folder that you’ll want to dump all the WordPress stuff into.

One thing I did notice when I did a Get Info on it is that the “htdocs” folder is marked read-only for “everyone”, and you’ll want to make sure it’s marked read/write so that your XAMPP server can access it.* On the Mac, hit command-I to Get Info on the folder, then at the bottom of the window that opens up you’ll see dropdown menus that will allow you to change the permissions easily.

*- Again, I’m assuming you will NOT be putting the WordPress workaround on the Web, because there are huge security issues with marking a file as read/write for everyone on a live server, and I would STRONGLY recommend against doing this if you’re working with a live server.

.

Step 3: Import your Blogger blog into the WordPress install on your Local Host.

On the sidebar of WordPress’s admin page, there’s a Tools > Import feature, and one of the types of blogs you can choose is Blogger. You’ll have to sign in with your Google Account to authorize the import, but once you’ve done that, the rest of the process is automated.

I encountered two minor issues with the importer. The first was that there were about 15 or so posts that didn’t come over, which had to be manually re-added. Out of 2300, I was pretty much okay with that, but going through and figuring out which posts were missed was kind of a pain in the ass.

The second issue was that for some reason the WordPress tool to import from Blogger pulled an extra “>” in at the beginning of every. single. post. from a BlogSpot blog. It’s a little annoying, but it’s also kind of good as an indicator of what posts have been imported and/or reimported.

I will note that about a year ago I imported the blog you’re reading now to WordPress from a Blogger blog I’d been publishing via FTP for years, and didn’t have the “>” issue. Don’t know if it’s a new bug in the import tool or if it’s something to do with BlogSpot, but the issue was there.
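
If the stray “>” bothers you, one cautious way to strip it is to only ever look at the very first character of each post, so that any “>” that belongs in the post is left alone. Here is a minimal sketch in Python. The helper name `strip_leading_gt` is made up for illustration, it is not part of the WordPress importer, and it assumes you already have each post’s text as a string:

```python
def strip_leading_gt(text: str) -> str:
    """Remove one stray '>' from the very start of a post, if present."""
    # Only the first character is examined, so any '>' later in the
    # post (closing HTML tags, quoted text) is left untouched.
    if text.startswith(">"):
        return text[1:]
    return text


print(strip_leading_gt("><p>Hello</p>"))
```

The obvious limitation: a post that legitimately starts with “>” would lose it too, so spot-check a few posts before trusting the result.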

.

Step 4: Make sure all your Blogger posts have the Post Number somewhere in them.

The easiest way to do that is to go into your Blogger template and add the following bit of code right after the <$BlogItemBody$> string:

<font color="[your blog's background color]">postID=<$BlogItemNumber$></font>

Making it the same color as your background will make it visible to the script that needs to pull the post ID number, but invisible to anyone actually looking at your site (unless they happen to highlight it). I tried doing this as an anchor, but the script wasn’t able to pull it; it’s got to be right in the actual post.

If you don’t mind the postID for every post being visible to your readers while you perform all this nonsense, you can just put in postID=<$BlogItemNumber$> .
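
To see why the marker has to be plain text in the post body, it helps to picture how a script might look for it. The sketch below is a Python illustration only: the real script used in Step 5 is PHP and may work differently, and the names `POST_ID_PATTERN` and `find_post_id` are invented here. It assumes the marker is the literal text “postID=” followed by digits:

```python
import re

# Assumed marker format: the literal text "postID=" followed by digits,
# as produced by the template snippet above.
POST_ID_PATTERN = re.compile(r"postID=(\d+)")


def find_post_id(post_html: str):
    """Return the Blogger post number found in the post body, or None."""
    match = POST_ID_PATTERN.search(post_html)
    return match.group(1) if match else None


sample = '<p>Some post text.</p><font color="#ffffff">postID=1234567890</font>'
print(find_post_id(sample))
```

Running it against a few of your own posts is a quick way to confirm the marker made it into the published page before you kick off the real script.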

.

Step 5: Install and run the WP-Get-Blogger-Post-IDs script.

This page has two totally invaluable PHP scripts for this process written by Justin Watt: “wp-get-blogger-post-IDs” and “import-haloscan.” Run step 2 in that page’s instructions to download, install, and run the “wp-get-blogger-post-IDs” script. This will pull in all your post IDs that you set up a minute ago so that the comments can be matched to the appropriate post.

Here’s the bad news if you have a protected blog: You WILL have to make your Blogger blog temporarily available to anyone if it’s not already through the Settings > Permissions tab on Blogger.

You don’t have to allow any search engine indexing or anything, it just takes the “this blog is open to invited readers only” wall down temporarily, so unless someone knows your URL and specifically goes to look at it in the short bit where the wall is down, there’s nothing to be concerned about.

The good news is that Blogger automatically preserves your readers list so that the second you’re done getting all the info you need, you can immediately turn that protection back on.

NOTE: I did notice when running the PHP scripts on the XAMPP local server that they can be a little slow, so your wall of protection may need to be down for up to an hour or more, depending on how many posts you have.

.

Step 6: Put your HaloScan Export files in the “htdocs/wordpress” folder and number them sequentially.

If you have more than one HaloScan export file, go ahead and number them sequentially so they can be imported as “export1.xml”, “export2.xml”. Then place those files in your main htdocs/wordpress folder.

I will note that when I imported 1400 comments across 2 export files into this blog via WordPress, it worked just fine, but it choked on trying to import all 8,000+ comments at once on the blog I’m working with on this giant mess.

For my friend’s blog, I wound up just importing each export file one at a time, throwing all the others in a folder I marked “exports” so I knew where they were, but the script would ignore them. You can keep them numbered sequentially so you can keep track of which ones you’ve already imported, just only have one at a time in the main “wordpress” folder.

.

Step 7: Run import-haloscan.php.

Remember this page where you got the “wp-get-blogger-post-IDs” script? Well, the second script on that page, import-haloscan.php, is the second piece of this, located in that page’s Step 3. Follow that page’s instructions on how to download, install, and run that script.

The “import-haloscan” script takes all the post IDs you brought in and matches them up with the comments in your HaloScan Export file(s).

.

Step 8: Check and make sure your comments imported correctly.

Make sure the number of comments you imported for each post matches up. You may have a very few missing comments – When I did it for this blog, I lost 7 out of around 1400 comments, and frankly, I’d rather have 99.5% than none.

However, if you’re missing a ton of comments, then you might want to try deleting all the comments (which can be done in bulk from the “comments” tab on the sidebar) and reimporting each export file one at a time.

.

Step 9: Export from WordPress.

Now that you’ve gotten all your posts and comments linked up and in one place, it’s time to start getting them back over to Blogger.

Go to Tools > Export, and hit the “Download Export File” button. All your posts and their comments will export as a big XML file to your default download directory.
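
Before moving on, the export file can double as a second check on the comment counts from Step 8. The following is a rough Python sketch, not part of any of the tools mentioned here. It assumes the export is the usual WordPress RSS-style XML, with one `<item>` per post and each comment stored as a `comment` element directly inside it; the function names and the namespace URL in the sample are illustrative guesses, and the code ignores namespaces so the exact URL does not matter:

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter


def local_name(tag: str) -> str:
    """Strip any '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def comments_per_post(source) -> Counter:
    """Count comment elements under each <item>; source is a path or file object."""
    counts = Counter()
    root = ET.parse(source).getroot()
    for item in root.iter("item"):
        title = item.findtext("title") or "(untitled)"
        # Posts sharing a title are added together under one key.
        counts[title] += sum(
            1 for child in item if local_name(child.tag) == "comment"
        )
    return counts


# Tiny made-up sample standing in for a real export file.
sample = b"""<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:wp="http://wordpress.org/export/1.0/">
  <channel>
    <item>
      <title>First post</title>
      <wp:comment><wp:comment_id>1</wp:comment_id></wp:comment>
      <wp:comment><wp:comment_id>2</wp:comment_id></wp:comment>
    </item>
    <item>
      <title>Second post</title>
    </item>
  </channel>
</rss>"""

for title, count in comments_per_post(io.BytesIO(sample)).items():
    print(count, title)
```

For a real file, pass the path of the downloaded export instead of the sample, then compare a handful of posts against what the WordPress admin shows.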

.

Step 10: Download the Google Blog Converters App Engine.

The Data Liberation Front has put together a series of Python scripts that will translate the XML WordPress puts out into something that Blogger can understand. You can download a big old folder of scripts from their Google Code page.

Note that you do need to have a recent version of Python installed for it to work, but most recent OS’s come with a version that will work pre-installed. If you don’t have Python installed, here’s a link to the Python site which will give you more info on how to make that happen.

.

Step 11: Fire up your command line.

On OS X, Terminal works fantastically for this because you can just drag and drop the files you need.

Once Terminal is up and running, drag the “wordpress2blogger.sh” script from the “bin” folder in the big downloaded folder o’scripts into the terminal window. You’ll see a plus sign to let you know that the script is able to be added, and then the script’s name will just show up in the window.

Then, drag in the XML document that exported from WordPress into the terminal window using the same procedure. Once both are added, hit enter. The script will think for a minute, then spit out an enormous amount of text into the terminal window.

.

Step 12: Create the document to upload to Blogger.

Edited to add 02.18.10: Excellent tip from Kevin in the comments that will allow you to skip part of this step:

When executing the command line version, you can automatically capture the terminal output instead of letting it scroll by and then re-selecting/editing. On any *nix system like OSX or linux you just redirect the output into a file with “>”.

sh wordpress2blogger.sh [your exported WordPress file].xml > mynewfile.txt

Back to our regular programming….

In Terminal, go to Shell > Export Text As. This will export everything in the Terminal window as a .txt file. However, you’ll need to go in and do a couple things before it’s ready for upload. If you’re using another command line interface, you can also just do a select all on all text and paste it into a blank document.

Open this .txt document in your favorite text editor – I prefer TextWrangler because it’s got an option to soft-wrap text so you don’t have to scroll sideways for miles.

At the top of the document, select everything before the “<?xml version=’1.0’…” and delete it, since that’s just stuff that was only relevant to the terminal.

Go down to the very bottom of the document, and make sure you delete the “[your username]’s-Computer:~ [your username]$”. This is also only something that is useful to the Terminal.

Once you have deleted both of these items, do a Save As… and make sure to save it as a .xml file, and use a name that will allow you to distinguish it as the file that needs to be uploaded to Blogger, like “WordPress Export For Upload To Blogger.xml”.
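
The two deletions above can also be scripted. This is a minimal Python sketch rather than a tested part of the process: `trim_terminal_output` is a hypothetical name, and it assumes the XML ends with a closing tag and that the trailing prompt line contains no “>” character. The sample transcript is made up:

```python
def trim_terminal_output(transcript: str) -> str:
    """Keep only the XML document inside a saved Terminal transcript."""
    # Everything before the XML declaration is Terminal chatter.
    start = transcript.find("<?xml")
    # The XML ends at the last '>' (assumes the prompt has no '>').
    end = transcript.rfind(">")
    if start == -1 or end < start:
        raise ValueError("no XML document found in the transcript")
    return transcript[start:end + 1] + "\n"


sample = (
    "Last login: (some date)\n"
    "someones-Computer:~ someone$ sh wordpress2blogger.sh export.xml\n"
    "<?xml version='1.0' encoding='UTF-8'?>\n"
    "<feed><entry/></feed>\n"
    "someones-Computer:~ someone$ "
)
print(trim_terminal_output(sample))
```

Either way, open the result and eyeball the first and last lines before uploading anything.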

.

Step 13: Upload the file to Blogger…in a test blog.

I would strongly, strongly recommend setting up a test blog before you re-upload everything to your main blog, since in order to do so, you’re basically going to have to nuke your main blog.

I set up a test blog on BlogSpot that I restricted so that only my friend and I could see it, then uploaded the XML file generated in Step 12. This allowed me to check that all the posts and comments had made it over – Which was good because the first time I tried it, I realized I’d screwed something up and managed to only import comments prior to 2004, and had to go back several steps.

If your upload succeeds and everything looks good in your test blog…

.

Step 14: Backup, then nuke the content on your main Blogger Blog.

Again, I cannot emphasize enough: Back up, back up, back up. Things go sideways. You want a backup of everything. To back up your Blogger blog, go to Settings > Basic and at the top there’s a link to Export Blog. Click that, and then click the big old button that says “Download Blog.”

Make sure you note where that file is and possibly rename it something like “Backed up main blog” so you can find it if things go wrong.

Once you have that file completely downloaded, you will need to delete all your existing posts so that you don’t wind up with either a) duplicates or b) posts with comments which won’t import because they were marked as duplicates.

To do this go to Posting > Edit Posts. Click on Select All, and you’ll be told that you’ve selected all the visible posts on the screen, and asked if you’d like to select all [however many] posts you have. Click to select all [however many] of your posts, then scroll down to the bottom of the screen and click “Delete Selected.”

Your template will be unaffected; this will just get rid of all your content. (Don’t panic, we’re bringing it back in with…)

.

Step 15: Upload the file to your main Blogger blog.

If you got it working for your test blog, this should work for your main blog. You may have to remove some residual HaloScan commenting code (and add some Blogger code back in) from your template to get all the comments to show up properly, but you should be good to go, except for the Known Issues listed below.

.

KNOWN ISSUES

1. This only works with comments that have actually been exported by HaloScan – Once you’ve upgraded to Echo, it spits out a totally different type of XML file that cannot be read by the “import-haloscan” script and unfortunately I’m not enough of a code monkey yet to remedy this myself.

2. The Python parser to go from the exported WordPress to your Blogger re-upload seems to only parse the GMT dates/times that WP spits out, not the actual times stuff was posted (the WP-generated export file contains both pieces of data). Depending on where you live, you can wind up with all your posts up to 12 hours off. For me and my friend, this wasn’t a big issue, but for some people whose blogs are more timestamp-sensitive, this may be a dealbreaker.

3. If you have a blog with a restricted readership, be sure to note that in Step 5 you will need to make it temporarily available to everyone.

4. Two minor issues with WordPress’s Blogger Import tool (failing to import a very few old posts; randomly adding a “>” to every single imported post from BlogSpot) are detailed in Step 3.
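
To get a feel for how far off Known Issue 2 can push things, here is a small Python illustration of the gap between a GMT timestamp and local time. It is a sketch under assumptions: the “YYYY-MM-DD HH:MM:SS” format is assumed for the dates in the WordPress export, `gmt_to_local` is a made-up helper, and a single fixed offset ignores daylight saving changes:

```python
from datetime import datetime, timedelta

STAMP_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed timestamp layout


def gmt_to_local(gmt_stamp: str, utc_offset_hours: float) -> str:
    """Shift a GMT timestamp by a fixed UTC offset (no daylight saving)."""
    gmt = datetime.strptime(gmt_stamp, STAMP_FORMAT)
    local = gmt + timedelta(hours=utc_offset_hours)
    return local.strftime(STAMP_FORMAT)


# A post stamped 4:30 am GMT was written at 10:30 pm the previous
# evening for someone six hours behind GMT.
print(gmt_to_local("2010-02-12 04:30:00", -6))  # → 2010-02-11 22:30:00
```

Note that the shift can change the date as well as the hour, which is what matters most if your archives are organized by day.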

—–

Phew.

I am absolutely open to suggestions of how I could have done this more easily, but I did quite a bit of digging around and couldn’t even find instructions for a process this ridiculous and cumbersome, let alone anything simpler.

Hope this helps a few people out, or at least inspires the folks at Blogger to finally put together a HaloScan comment importer. Because this method is completely insane.

23 thoughts on “So You Want To Move Your Comments From Haloscan To Blogger…”

  1. Kelly Feb 12, 2010 6:55 am

    I’m pretty nerdy, and I typically like these kind of tech projects, but in this case, I’m definitely okay with just losing my comments from the blog. Thanks for writing this up though!

  2. Ellen Feb 12, 2010 9:29 am

    I don’t blame you. Like I said, this process is absurdly complicated. I’ll post an update if Blogger does manage to come up with an easier importer, though.

  3. Laz Feb 12, 2010 11:07 am

    Ho. Lee. Shit.

    And this, friends, is why I write about basketball for a living. The box-and-one defense is about as complicated as I get.

  4. KM Feb 13, 2010 4:23 am

    I don’t think there has ever been anything interesting on my blog that is worth saving. But thanks for working through this!

  5. Kevin Feb 18, 2010 7:13 pm

    Great tutorial (!!!), I just did the whole thing and it worked like a charm. Two things to note:

    1. The wordpress2blogger conversion script you mention has a web version, it may have limitations for some users:

    http://wordpress2blogger.appspot.com/

    2. When executing the command line version, you can automatically capture the terminal output instead of letting it scroll by and then re-selecting/editing. On any *nix system like OSX or linux you just redirect the output into a file with “>”.

    sh wordpress2blogger.sh > mynewfile.txt

    Also, the extra “>” before the title and body of each post can be easily removed with a search&replace in any text editor. You can do this to any of the XML files at any stage in this process.

    Many thanks again, you saved my life. There has to be a better way….we just need a haloscan2blogger perl script (maybe I’ll play with it) and we’re all set.

  6. Ellen Feb 18, 2010 7:21 pm

    Hey Kevin, I’m so glad it was clear enough that it worked for someone else!

    I didn’t recommend the appspot version of wordpress2blogger because it can only output a 1 MB file, and most people who have been using Blogger enough to have had Haloscan comments they want to save will exceed that limit.

    I’m adding your CLI tip to the post, that’s really helpful, and I’m still getting the full hang of the CLI.

    The only issue with doing a search and replace on the >’s is that doing an automated replace might also catch them in places where they’re intended to be instead of just places where they were inserted. And doing it manually would take forever on a 2300+ post blog, which is what I was working with.

  7. Kevin Feb 18, 2010 7:28 pm

    One final caveat. You may see the following error at the last Blogger import:

    Sorry, the import failed due to a server error. The error code is bX-qm5h6h

    This is solved by making sure you have turned on Comments! Be sure to change your comment status to “Show” before importing.

    Thanks again.

  8. Justin Watt Feb 22, 2010 10:43 pm

    Wait! You did a Blogger to WordPress round trip? That is some SERIOUS passion for Blogger. 99% of folks I know were happy to migrate off of Blogger for the features that WordPress offered. I thought you were doing the same…

  9. Ellen Feb 22, 2010 10:48 pm

    I personally just moved from Blogger to WordPress (and am very glad I did, I personally like the WP platform much more), but my friend whose blog I did this for…fears change. 🙂

    I also needed a challenge – I’d been unemployed for some time when I offered to do this project for him. And sure enough, it was a real fun mountain to summit. It just seems totally absurd to me that there isn’t a single tool in existence to do this.

    And again, my thanks to you, Justin, without you this would absolutely NOT have been possible.

  10. Permafrost Feb 23, 2010 7:57 am

    Hi! Thank you for this tutorial! Everything has been running smoothly, but I’m stuck at step 4, about the item number. I can’t find the $BlogItemBody$ script in my template. I’m using a “new” template, as opposed to a “classic” one. Can the item number tag be shown by using a widget (or Blogger’s “gadget”)?

  11. Ellen Feb 23, 2010 8:33 am

    That’s an excellent question – The fact that you’re using a “new” template is definitely the issue, as it gets rid of all the $Blogger$ sorts of code.

    The one thing I can think of off the top of my head that might work is making sure you copy your current HTML, temporarily switch to a “classic” template, then re-paste your old HTML back after the whole process is complete.

    It’d probably look oogly as hell for a while, but without doing a ton of research (which unfortunately I don’t have time for today as I’m working), I don’t know that there’s any Blogger gadget to allow you to inject “classic” template code into a “new” template.

    If you do hear of a way to do it, please let me know and I’ll make sure to update the post accordingly. Thanks!

  12. Permafrost Feb 23, 2010 10:12 am

    I had thought of trying that, so that’s what I’ll do. Thank you!

  13. eddy Feb 24, 2010 7:53 pm

    Very nice description.

  14. Erlend Mar 4, 2010 5:28 am

    I had a quite different approach:

    First I exported the Haloscan comments to an XML file.

    Then I wrote a Python script to parse the comments file (using Python’s standard XML interface). I then combined this script with the Python interface to the Blogger API to upload comments to the correct posts.

    The only drawback was that dates and user information were lost – I cannot post comments as somebody else, nor post comments that seem to come from the past.

    But it was easy to put the user names and the original date into the comment text itself, which seemed a reasonable compromise.

    The upload got stuck in the middle a couple of times and had to be manually restarted, but I managed to work around that so that I did not get duplicate comments.

  15. Ellen Mar 4, 2010 7:14 am

    Erlend – if you have a link to your script, please feel free to post it. Unfortunately, I don’t have any python scripting experience yet, so I was not able to try something like what you managed to do.

  16. Erlend Mar 5, 2010 8:34 am

    The script can be downloaded from here. As I say in the comments, it will have to be modified to suit your particular setup.

  17. Nicole Mar 13, 2010 2:23 pm

    Hi there,

    I was wondering if you might be able to tell me how to use the WordPress2blogger script (or the included .at file) in Windows Vista. As far as I know, I’ve done everything correctly up till this point but I’m not exactly sure what to put into the command line (it’s not drag and drop like Terminal). The readme file on the Google code page is totally unhelpful for people who don’t already know the basic procedures. Any help would be so appreciated!
    Thanks,
    Nicole

  18. Nicole Mar 13, 2010 2:23 pm

    Sorry… .at was meant to be .bat

  19. Ellen Mar 13, 2010 2:41 pm

    It’s been quite a while since I’ve used the command line in Windows and my Parallels install is hopelessly borked for the moment to try testing it, but I *think* the way to do it is to just open up the command line and type in “run C:[path to file]wordpress2blogger.bat C:[path to exported xml file]”.

    The .sh files are for *nix based stuff, so they won’t really work on Vista without installing something like cygwin, which is just going to make your life MUCH more complicated.

    Let me know if that works/helps, or if someone with more recent Windows experience than I’ve got wants to chime in with a more correct set of instructions, do feel free.

  20. Nicole Mar 13, 2010 10:45 pm

    Thanks so much for the reply. It turned out I was able to use the online version of wordpress2blogger. My blog is larger than the limit but…I don’t know, it seemed to work. Anyway, I may give the command line version a try just to test it out. I’ll comment if I’m successful! Thanks so much for this wonderful tutorial–it saved 8 years of my blog comments (and also made me consider a move to wordpress for good…).

  21. Ellen Mar 14, 2010 8:41 am

    Oh good good. Personally, I prefer WordPress quite a bit. The setup sometimes requires some help from your webhost but otherwise I’ve found it to be a much better and more flexible platform than Blogger.

  22. Pingback: deonandia » Rectum? Damn Near Keeled Him!

  23. Pingback: Special Super-Secret Bonus Link! ← Designated Nerd
