Find the right balance among using existing automation tools, putting your programming skills to work and time management

A clowder of cats learning together

If your blog has hundreds or even thousands of entries, you may want to try a static site generator (SSG) that fits your needs. This way you can focus more of your attention on creating content.

That’s the conclusion I came to recently, and it led me to migrate my blog from WordPress to Grav.

Well, it is fair to say that Grav is not exactly an SSG but a flat-file CMS that stores all of your content as files and folders rather than in a database. The important thing, though, is the switch to Markdown, a portable format that is easier to work with than HTML produced by a WYSIWYG editor.

It is worth mentioning that my website had been running on WordPress for a few years, with a number of different plugins over time, and as a result the content ended up containing unwanted shortcodes that needed some cleanup.

On top of that, I am still not comfortable building pages with Gutenberg blocks, so it was high time to give a flat-file CMS like Grav a try.

Some First Steps

One thing to be aware of is that a WordPress to Grav migration is not a matter of minutes; it boils down to finding the right balance among using existing automation tools, putting your programming skills to work, and time management.

In short, take it easy and be patient along the way.

On the one hand, I didn’t want to reinvent the wheel and end up writing a new software package for this sole purpose, so I did some research and found automation tools that worked for me. On the other hand, I was mentally prepared to do some manual fixes and, if necessary, write my own bash scripts to fix minor things.

At an early stage of the process I found myself thinking in terms of time management and decided that I’d focus on fixing the Markdown files first; later I would update the uploaded images accordingly.

One of the first lessons learned is that the wp-content/uploads folder can simply be copied into the root directory of a fresh Grav install, so that the uploaded WordPress images can be referenced as follows.

![Figure 1](https://programarivm.com/wp-content/uploads/2016/06/figure-01.png)
##### **Figure 1**. Hello there, how are things going?
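
The copy itself can be sketched as below. The function name `copy_uploads` and both paths in the usage comment are hypothetical placeholders, not the ones used for this site; adjust them to your own WordPress and Grav locations.

```shell
#!/bin/bash
# Copy the WordPress uploads into the root of a Grav install so that
# existing /wp-content/uploads/... image URLs keep working.
# Usage (hypothetical paths): copy_uploads /var/www/wordpress /var/www/grav
copy_uploads() {
  local wp_root="$1" grav_root="$2"
  # Make sure the parent folder exists in the Grav root
  mkdir -p "$grav_root/wp-content"
  # -R copies the uploads folder recursively, year/month subfolders included
  cp -R "$wp_root/wp-content/uploads" "$grav_root/wp-content/"
}
```

Only the uploads folder is copied here; plugins and themes under wp-content are deliberately left behind.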

Get a Backup of Your WordPress Posts

The next thing to do is to get a backup copy of the WordPress posts to be imported into Grav, which can be easily done on the Export panel as shown in the image below.


Figure 1. Export the WordPress content you need to WXR format

Convert WordPress To Markdown

Once my WordPress export XML file was successfully downloaded I proceeded with its conversion to Markdown format. For this I relied on wordpress-export-to-markdown, a handy script that helps you convert WordPress content into Markdown as per the documentation.

$ npx wordpress-export-to-markdown
npx: installed 135 in 7.547s

Starting wizard...
? Path to WordPress export file? (export.xml) (node:8573) ExperimentalWarning: The fs.promises API is experimental
? Path to WordPress export file? export.xml
? Path to output folder? output
? Create year folders? No
? Create month folders? No
? Create a folder for each post? Yes
? Prefix post folders/files with date? Yes
? Save images attached to posts? No
? Save images scraped from post body content? No

Parsing...
306 posts found.

Saving posts...

Finally, with all posts saved, I copied the output folder into my Grav site as 01.blog.

cp -r output /home/standard/projects/programarivm/user/pages/01.blog
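
Before going any further, it may be worth checking that the number of Markdown files matches the 306 posts reported by the wizard. A minimal sketch, assuming the .md files sit directly inside the copied folder; `count_md` is a helper name made up for this example.

```shell
#!/bin/bash
# Count the Markdown files directly inside a folder (no recursion).
# Example: count_md user/pages/01.blog
count_md() {
  # tr strips the padding some wc implementations add around the number
  find "$1" -maxdepth 1 -type f -name '*.md' | wc -l | tr -d ' '
}
```

If the number printed differs from the wizard's count, some posts were probably skipped or copied to the wrong place.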

Bash Scripting to the Rescue!

But it’s not all peaches and cream; there’s still more work to be done because Will Boyd’s WordPress export to Markdown script is not completely automatic and you have to adapt to the situation.

Have a look at the 01.blog folder and pay close attention to how it’s been built.

$ ls -la user/pages/01.blog/
total 4516
drwxrwxr-x 2 standard standard 86016 Dec  7 15:26 .
drwxrwxr-x 3 standard standard  4096 Dec  7 15:26 ..
-rw-rw-r-- 1 standard standard  3756 Dec  7 15:26 2011-12-28-mi-primera-web-en-html5un-poco-de-historia.md
-rw-rw-r-- 1 standard standard  5244 Dec  7 15:26 2011-12-31-mi-primera-web-en-html5-iielementos-estructurale.md
-rw-rw-r-- 1 standard standard  3609 Dec  7 15:26 2012-01-02-concurrencia-de-procesos-con-php-el-problema-de-los-dados-implementado-con-tuberias-pipes.md
...

It’s all Markdown files, which is not a great fit for Grav if you want your blog set up as per the Blog Site Skeleton: Grav expects 01.blog to contain one subfolder per post, each named with a valid slug.

The bash script below makes the previous structure Grav-friendly.

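#!/bin/bash
# Run from inside 01.blog: move each post into its own folder as item.md
for f in *.md
do
  slug="${f%.*}"   # file name without the .md extension
  mkdir -- "$slug"
  mv -- "$f" "$slug/item.md"
done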

If you run the script and ls the 01.blog folder again, you’ll now find as many subfolders as there are posts on the blog, each containing one item.md.

$ ls -la user/pages/01.blog/
total 1336
drwxrwxr-x 307 standard standard 98304 Dec  9 12:35 .
drwxrwxr-x   3 standard standard  4096 Dec  9 12:34 ..
drwxrwxr-x   2 standard standard  4096 Dec  9 12:35 2011-12-28-mi-primera-web-en-html5un-poco-de-historia
drwxrwxr-x   2 standard standard  4096 Dec  9 12:35 2011-12-31-mi-primera-web-en-html5-iielementos-estructurale
drwxrwxr-x   2 standard standard  4096 Dec  9 12:35 2012-01-02-concurrencia-de-procesos-con-php-el-problema-de-los-dados-implementado-con-tuberias-pipes
...
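
A quick way to confirm the restructuring worked is to look for post folders that ended up without an item.md. This is only a sketch; `missing_items` is a helper name invented for this example and is not part of Grav.

```shell
#!/bin/bash
# Print every subfolder of the given directory that lacks an item.md.
# Example: missing_items user/pages/01.blog
missing_items() {
  local dir
  for dir in "$1"/*/; do
    [ -d "$dir" ] || continue                  # the glob matched nothing
    [ -f "${dir}item.md" ] || echo "${dir%/}"  # report the folder without its trailing slash
  done
}
```

No output means every post folder has its item.md.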

The wordpress-export-to-markdown tool automatically creates a frontmatter block with two entries in each post: title and date.

---
title: "From PHP to Python Through Simple CLI Examples"
date: "2018-01-10"
---

However, you may also need to add a slug entry to every item.md if you want to customize the slug.

---
title: "From PHP to Python Through Simple CLI Examples"
date: "2018-01-10"
slug: from-php-to-python-through-simple-cli-examples
---

So if the date needs to be removed from the default slugs as in the example above, you might find the following bash script helpful.

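#!/bin/bash
# Run from user/pages. Needs bash 4.2+ (negative substring length) and GNU sed.
for f in $(find 01.blog -name 'item.md')
do
  # Drop the leading "01.blog/YYYY-MM-DD-" (19 chars) and the trailing "/item.md" (8 chars)
  slug="${f:19:-8}"
  # Replace the closing --- on line 4 with the slug entry
  sed -i "4s/^---/slug: $slug/" "$f"
done

for f in $(find 01.blog -name 'item.md')
do
  # Put the closing --- back in front of line 5
  sed -i '5s/^/---\n/' "$f"
done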
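
The script above depends on the frontmatter closing on line 4 and on fixed character offsets in the path. As an alternative sketch that avoids both, the hypothetical helper `add_slug` below finds the closing --- with awk instead. It assumes the first line of the file is --- and that the parent folder name starts with a YYYY-MM-DD- prefix.

```shell
#!/bin/bash
# Insert "slug: <slug>" just before the closing --- of the frontmatter.
# The slug is the parent folder name minus its YYYY-MM-DD- prefix.
# Example: find 01.blog -name 'item.md' | while read -r f; do add_slug "$f"; done
add_slug() {
  local file="$1" folder slug tmp
  folder="$(basename "$(dirname "$file")")"
  slug="${folder#[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]-}"
  tmp="$(mktemp)"
  # n counts the --- delimiter lines; the second one closes the frontmatter
  awk -v slug="$slug" '
    /^---$/ { n++; if (n == 2) print "slug: " slug }
    { print }
  ' "$file" > "$tmp" && cat "$tmp" > "$file"   # cat keeps the original file permissions
  rm -f "$tmp"
}
```

Running it twice on the same file would add a second slug line, so it is meant as a one-off step.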

Shell scripts really do seem to be your best friend when it comes to updating a bunch of Markdown files at once; to some extent they are the flat-file equivalent of the SQL UPDATE statements you would run on a database-driven website.

Markdown is a well-known portable format that puts content ahead of syntax, so it doesn’t make sense to write the exact same thing over and over again across all documents. In that sense, when linking to your own site it’s probably better to write something like this:

 [Lessons Learned From a WordPress to Grav Migration](/lessons-learned-from-a-wordpress-to-grav-migration)

Rather than using the absolute URL counterpart:

 [Lessons Learned From a WordPress to Grav Migration](https://programarivm.com/lessons-learned-from-a-wordpress-to-grav-migration)

The reason is that the former is more readable, less wordy, or more content-centered, if you like. So once again, a bash script relying on regular expressions comes to the rescue, this time to replace absolute URLs with relative ones in all documents.

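 #!/bin/bash
 for f in $(find 01.blog -name 'item.md')
 do
   # [^]]* and [^)]* keep each match inside a single link, so several links on one line are all rewritten
   sed -i 's|\[\([^]]*\)](https://programarivm\.com/\([^)]*\))|[\1](/\2)|g' "$f"
 done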
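
Since sed -i edits files in place, it can help to try the substitution on a sample line first. The sketch below wraps a similar expression in a hypothetical filter, `relativize_links`, that reads Markdown on stdin. It uses bracket expressions rather than `.*` so that each link on a line is matched on its own, and it takes the host name as an argument, which is an assumption made for this example.

```shell
#!/bin/bash
# Rewrite absolute links to the given host as root-relative links.
# Reads Markdown on stdin and writes the result to stdout.
# Note: dots in the host are treated as regex wildcards, which is fine for a quick check.
relativize_links() {
  local host="$1"
  # [^]]* stops at the first ] and [^)]* at the first ), one link per match
  sed "s|\[\([^]]*\)](https://$host/\([^)]*\))|[\1](/\2)|g"
}

# Example:
#   echo '[Home](https://programarivm.com/home)' | relativize_links programarivm.com
```

Links to other hosts pass through unchanged, so the filter can be pointed at a whole post to preview the result.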

Conclusion

It took me a few days to migrate my WordPress site into Grav. At the end of the day it was all about finding the right balance between time and effort.

WordPress export to Markdown was really helpful for quickly converting my WordPress posts into Markdown, but I had to do some manual fixes too. I refreshed my bash scripting skills in order to update all .md files in a time-efficient way, reviewed how the sed command works, and brushed up on regular expressions.

Finally, following a consistent convention is worth it: it keeps your Markdown documents tidy and well suited to automation.
