Lessons Learned From a WordPress to Grav Migration
The important thing was the switch to Markdown
If you have a blog with hundreds or even thousands of articles, you may want to try a static site generator (SSG) so you can focus more of your attention on content creation. That's the conclusion I came to recently, and it led me to migrate my blog from WordPress to Grav.
To be fair, Grav is not exactly an SSG but a flat-file CMS that stores all of your content as files and folders rather than in a database. The important thing, though, is the switch to Markdown, a portable format that's easier to work with than HTML content created in a WYSIWYG editor.
It is worth mentioning that my website had been running on WordPress for a few years, using a number of different plugins over time, and as a result the content ended up containing unwanted shortcodes that needed some cleanup.
Add to that the fact that I am still not too comfortable building pages with Gutenberg blocks, and it was high time for me to give a flat-file CMS like Grav a try.
Some First Steps
One thing to be aware of is that a WordPress to Grav migration is not a task of a few minutes; it boils down to finding the right balance between using existing automation tools, putting your programming skills to work, and managing your time.
In a word, take it easy and be patient along the way.
On the one hand, I didn't want to reinvent the wheel and end up writing a new software package for this sole purpose, so I did some research and found automation tools that worked for me. On the other hand, I was mentally prepared to do some manual fixes and, if necessary, write my own custom bash scripts to fix minor things.
At an early stage of the process I found myself thinking in terms of time management and decided that I'd focus on fixing the Markdown files first; later I would update the uploaded images accordingly.
One of the first lessons learned is that the wp-content/uploads folder can simply be copied into the root directory of a fresh Grav install so that the uploaded WordPress images can be referenced as follows.

##### **Figure 1**. Hello there, how are things going?
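The copy step can be sketched as a small shell function. This is only a sketch: copy_uploads is a hypothetical helper name, both paths are placeholders, and it assumes the wp-content/uploads path should be preserved under the Grav root so that existing image paths keep resolving.

```shell
#!/bin/sh
# Hypothetical helper: copy the WordPress uploads into a Grav root,
# keeping the wp-content/uploads path intact (an assumption).
copy_uploads() {
    wp_root="$1"    # placeholder: directory of the old WordPress install
    grav_root="$2"  # placeholder: root directory of the fresh Grav install
    mkdir -p "$grav_root/wp-content" &&
        cp -R "$wp_root/wp-content/uploads" "$grav_root/wp-content/"
}

# Example call with placeholder paths:
# copy_uploads /path/to/wordpress /path/to/grav
```

Keeping the original path means the image references inside the converted posts do not need to change at this stage.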
Get a Backup of Your WordPress Posts
The next thing to do is to get a backup copy of the WordPress posts to be imported into Grav, which can be easily done on the Export panel as shown in the image below.
Figure 1. Export the WordPress content you need to WXR format.
Convert WordPress To Markdown
Once my WordPress export XML file was successfully downloaded I proceeded with its conversion to Markdown format. For this I relied on wordpress-export-to-markdown, a handy script that helps you convert WordPress content into Markdown as per the documentation.
$ npx wordpress-export-to-markdown
npx: installed 135 in 7.547s
Starting wizard...
? Path to WordPress export file? (export.xml) (node:8573) ExperimentalWarning: The fs.promises API is experimental
? Path to WordPress export file? export.xml
? Path to output folder? output
? Create year folders? No
? Create month folders? No
? Create a folder for each post? Yes
? Prefix post folders/files with date? Yes
? Save images attached to posts? No
? Save images scraped from post body content? No
Parsing...
306 posts found.
Saving posts...
Finally, once all posts had been saved, I copied the output folder into my Grav site as 01.blog.
cp -r output /home/standard/projects/programarivm/user/pages/01.blog
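A quick sanity check after the copy is to count the Markdown files and compare the number with the "306 posts found." line printed by the wizard. A minimal sketch, where count_markdown is a hypothetical helper name:

```shell
#!/bin/sh
# Count the Markdown files found under a directory (searched recursively).
count_markdown() {
    find "$1" -type f -name '*.md' | wc -l | tr -d ' '
}

# Example call, run from the Grav root:
# count_markdown user/pages/01.blog
```

If the count differs from the number reported by the converter, some posts were probably skipped or copied to the wrong place.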
Bash Scripting to the Rescue!
But it's not all peaches and cream; there's still more work to be done because Will Boyd's WordPress export to Markdown script is not completely automatic, so you have to adapt its output to your own situation.
Have a look at the 01.blog folder and pay close attention to how it's been built.
$ ls -la user/pages/01.blog/
total 4516
drwxrwxr-x 2 standard standard 86016 Dec 7 15:26 .
drwxrwxr-x 3 standard standard 4096 Dec 7 15:26 ..
-rw-rw-r-- 1 standard standard 3756 Dec 7 15:26 2011-12-28-mi-primera-web-en-html5un-poco-de-historia.md
-rw-rw-r-- 1 standard standard 5244 Dec 7 15:26 2011-12-31-mi-primera-web-en-html5-iielementos-estructurale.md
-rw-rw-r-- 1 standard standard 3609 Dec 7 15:26 2012-01-02-concurrencia-de-procesos-con-php-el-problema-de-los-dados-implementado-con-tuberias-pipes.md
...
It's all Markdown files in a single folder, which is not a great fit for Grav if you want your blog set up as per the Blog Site Skeleton: there, the 01.blog folder should contain one subfolder per post, each named with a valid slug.
The bash script below makes the previous structure Grav-friendly.
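#!/bin/bash
# Run inside 01.blog: move every post into its own folder as item.md.
for f in *.md
do
    [ -e "$f" ] || continue     # nothing to do if no .md file matches
    slug="${f%.*}"              # file name without the .md extension
    mkdir -- "$slug"
    mv -- "$f" "$slug/item.md"
done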
If you run the script and ls the 01.blog folder again, you'll now find as many subfolders as there are posts on the blog, each containing one item.md.
$ ls -la user/pages/01.blog/
total 1336
drwxrwxr-x 307 standard standard 98304 Dec 9 12:35 .
drwxrwxr-x 3 standard standard 4096 Dec 9 12:34 ..
drwxrwxr-x 2 standard standard 4096 Dec 9 12:35 2011-12-28-mi-primera-web-en-html5un-poco-de-historia
drwxrwxr-x 2 standard standard 4096 Dec 9 12:35 2011-12-31-mi-primera-web-en-html5-iielementos-estructurale
drwxrwxr-x 2 standard standard 4096 Dec 9 12:35 2012-01-02-concurrencia-de-procesos-con-php-el-problema-de-los-dados-implementado-con-tuberias-pipes
...
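With a few hundred folders, checking the result by eye is not practical. The following sketch lists every post folder that has no item.md; missing_items is a hypothetical helper name, and no output means every folder is complete.

```shell
#!/bin/sh
# Print every direct subfolder of the given directory that lacks an item.md.
missing_items() {
    for d in "$1"/*/
    do
        [ -d "$d" ] || continue                        # skip if the glob matched nothing
        [ -f "${d}item.md" ] || printf '%s\n' "${d%/}" # report folders without item.md
    done
}

# Example call, run from the Grav root:
# missing_items user/pages/01.blog
```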
The wordpress-export-to-markdown tool automatically creates a frontmatter block with two entries in each of the posts: title and date.
---
title: "From PHP to Python Through Simple CLI Examples"
date: "2018-01-10"
---
However, you may also need to add a slug entry to every item.md if you want to customize the slug.
---
title: "From PHP to Python Through Simple CLI Examples"
date: "2018-01-10"
slug: from-php-to-python-through-simple-cli-examples
---
So if the date needs to be removed from the default slugs as in the example above, you might find the following bash script helpful.
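#!/bin/bash
# Run from user/pages so that every path looks like 01.blog/YYYY-MM-DD-slug/item.md.
for f in $(find 01.blog -name 'item.md')
do
    # Drop the 19-character prefix "01.blog/YYYY-MM-DD-" and the 8-character suffix "/item.md".
    slug="${f:19:-8}"
    # Line 4 is the closing "---" of the frontmatter: replace it with the slug entry.
    sed -i "4s/^---/slug: $slug/" "$f"
    # Then put the closing "---" back in front of what is now line 5 (GNU sed).
    sed -i '5s/^/---\n/' "$f"
done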
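The fixed offsets in ${f:19:-8} assume that every path starts with 01.blog/ followed by a ten-character date and a hyphen. As a hedged alternative, the same slug can be derived with pattern-based parameter expansion, which does not depend on counting characters. The name slug_from_path is hypothetical and not part of the original script.

```shell
#!/bin/sh
# Derive a date-less slug from a path such as 01.blog/YYYY-MM-DD-some-title/item.md.
slug_from_path() {
    dir="${1%/item.md}"                  # drop the trailing "/item.md"
    name="${dir##*/}"                    # keep only the post folder name
    printf '%s\n' "${name#????-??-??-}"  # strip the leading date, if present
}

slug_from_path "01.blog/2018-01-10-from-php-to-python-through-simple-cli-examples/item.md"
# prints: from-php-to-python-through-simple-cli-examples
```

A folder name without a date prefix is returned unchanged, since the pattern simply does not match.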
Shell scripts are your best friend when it comes to updating a bunch of Markdown files at once; to some extent they are a powerful equivalent of SQL UPDATE statements in database-driven websites.
Markdown is a well-known portable language that puts content creation ahead of syntax, so it really doesn't make sense to write the exact same thing over and over again across all documents. In that sense, when linking to your own site it's probably better to write something like this:
[Lessons Learned From a WordPress to Grav Migration](/lessons-learned-from-a-wordpress-to-grav-migration)
Rather than using the absolute URL counterpart:
[Lessons Learned From a WordPress to Grav Migration](https://programarivm.com/lessons-learned-from-a-wordpress-to-grav-migration)
The reason is that the former is more readable, less wordy, or more content-centered, if you like. So once again, a bash script relying on regular expressions comes to the rescue, this time to replace absolute URLs with relative ones in all documents.
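#!/bin/bash
for f in $(find 01.blog -name 'item.md')
do
    # [^]]* and [^)]* keep each match inside a single link, so several links on one line are all rewritten.
    sed -i 's|\[\([^]]*\)](https://programarivm\.com/\([^)]*\))|[\1](/\2)|g' "$f"
done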
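Before touching hundreds of files in place, it can help to try the substitution on a single line. The sketch below wraps a variant of it in a filter that reads from standard input; to_root_relative is a hypothetical name, and the bracket expressions are there so that each match stays within one link even when a line contains several.

```shell
#!/bin/sh
# Rewrite absolute Markdown links to programarivm.com as root-relative ones.
# Reads Markdown on standard input and writes the result to standard output.
to_root_relative() {
    sed 's|\[\([^]]*\)](https://programarivm\.com/\([^)]*\))|[\1](/\2)|g'
}

printf '%s\n' '[A](https://programarivm.com/a) and [B](https://programarivm.com/b)' | to_root_relative
# prints: [A](/a) and [B](/b)
```

Links to other domains pass through untouched, because the pattern only matches URLs that start with https://programarivm.com/.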
Conclusion
It took me a few days to migrate my WordPress site into Grav. At the end of the day it was all about finding the right balance between time and effort.
WordPress export to Markdown was really helpful for quickly converting my WordPress posts into Markdown format, but I had to do some manual fixes too. I refreshed my bash scripting skills in order to update all .md files in a time-efficient way, reviewed how the sed command works, and looked at regular expressions.
Remember, it is a good thing to make your Markdown documents look awesome and well-suited to automation by following a consistent convention.