Ramblings of General Geekery

Multi-core PieCrust 2

PieCrust news – and this blog – have been pretty quiet for the past couple months, and that’s because I’ve been busy working on PieCrust 2 performance.

"pasticcetti con crema e amarene" - mini-pies with custard and sour cherries

TL;DR: PieCrust 2 now runs on multiple cores, which speeds up the baking process quite a bit. Update your repositories, or grab the latest version from PyPI!


Remember the performance graph from the PieCrust 2 announcement post? Back then, PieCrust 2 was taking around 6.5 seconds to bake my blog. That time went up a bit in the following months as I was fixing some edge cases and bugs… but now, with the latest optimizations, bake time for my blog is down to 4.5 seconds on the same computer (a late 2008 MacBook Pro).

Now for some prettier graphs, and to get an idea of where PieCrust stands compared to other static web generators, I wrote a simple benchmark website generator. And yes, before you mention it, it does indeed generate websites for generators to generate.

These are the results for a few well-known engines: Middleman, Octopress (which is based on Jekyll), and Hugo. The first graph shows the time each engine takes to generate a sample website of 100, 500, and 1000 posts on a dual-core CPU (my old MacBook Pro).

Performance on dual-core CPUs

The second graph shows the same on a quad-core CPU (my desktop PC).

Performance on quad-core CPUs

Benchmark methodology

Here’s some background about what’s going on in those graphs.

The script is basically built upon Steve “spf13” Francia’s benchmark generator, which he uses to demonstrate how crazy fast Hugo is (see the “How fast is Hugo?” video on the Hugo documentation). The code is available in PieCrust’s repo.

It’s really just creating a certain number of blog posts by generating paragraphs made up of random letters and dates. It assigns one similarly random tag to each blog post, with only 20 tags in total for each benchmark website. So each engine has to generate the blog posts (100, 500, and 1000), the index page, and the tags pages.
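To make that description concrete, here is a minimal sketch of that kind of generator. It is not the actual script from PieCrust's repo: the function names, word and paragraph lengths, date range, and the dictionary layout are all assumptions made up for illustration. Only the overall idea (random-letter paragraphs, random dates, one tag per post out of 20) comes from the description above.

```python
import random
import string
from datetime import date, timedelta

TAG_COUNT = 20  # the benchmark sites use 20 tags in total


def random_word(rng, min_len=2, max_len=10):
    # A "word" is just a run of random lowercase letters (lengths are assumed).
    length = rng.randint(min_len, max_len)
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def random_paragraph(rng, words=60):
    return " ".join(random_word(rng) for _ in range(words))


def make_post(rng, tags, paragraphs=5):
    # Random date within an arbitrary five-year window (assumed range).
    day = date(2010, 1, 1) + timedelta(days=rng.randrange(365 * 5))
    return {
        "title": random_word(rng).capitalize(),
        "date": day,
        "tag": rng.choice(tags),  # exactly one tag per post
        "body": "\n\n".join(random_paragraph(rng) for _ in range(paragraphs)),
    }


def make_site(post_count, seed=0):
    # Seeding keeps the generated site identical from one run to the next.
    rng = random.Random(seed)
    tags = set()
    while len(tags) < TAG_COUNT:  # keep drawing until we have 20 distinct tags
        tags.add(random_word(rng))
    tags = sorted(tags)
    return [make_post(rng, tags) for _ in range(post_count)]


posts = make_site(100)
print(len(posts), "posts,", len({p["tag"] for p in posts}), "distinct tags used")
```

A real generator would then write each post out in the file layout and front matter format each engine expects, which is the part that differs from one engine to the next.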

  • For Middleman, I installed the middleman-blog gem and removed the archive (calendar) page. I also removed all the images, JavaScript files, and stylesheets to get a barebones website.
  • For Octopress I did the same, although apparently you can’t really remove the stylesheets, so I just made an empty one. So in theory we can probably remove a couple hundred milliseconds from the timings, to account for running Sass on an empty file.
  • For PieCrust I just made a barebones website.
  • For Hugo I used a stripped down version of the Hyde theme.

The first batch of tests was run on OS X, and timed using the UNIX time command.

The second batch of tests was run on Windows, and timed using PowerShell’s measure command.
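For anyone wanting to repeat this kind of measurement with the same tool on both platforms, here is a rough sketch of a wall-clock timer in Python. It is not what was used for the graphs above. The time_command helper and the choice of keeping the best of three runs are assumptions for illustration, and the command being timed here is a do-nothing placeholder.

```python
import subprocess
import sys
import time


def time_command(argv, runs=3):
    # Run the command several times and keep the fastest wall-clock time,
    # which filters out some noise from whatever else the machine is doing.
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(
            argv,
            check=True,  # fail loudly if the command exits with an error
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        timings.append(time.perf_counter() - start)
    return min(timings)


# Placeholder command: start a Python interpreter that does nothing.
elapsed = time_command([sys.executable, "-c", "pass"])
print("%.3f s" % elapsed)
```

To time a generator, replace the placeholder argument list with that engine's own build command, run from the benchmark website's directory.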

Caveats

Obviously those benchmark websites are not very representative of a real website. A real website will have more varied formatting and content in its blog posts. It will have more templating work to be done, like blog archives, a gallery, or a portfolio section – much more free-form stuff. And a real website will have a few seconds of overhead to compile, process, minify, and/or compress asset files like JavaScript files, stylesheets, and images.

Also, the bulk of the time is usually spent templating and formatting text, so each engine is pretty much a slave to the performance of its templaters and formatters… and Markdown libraries are not created equal across programming languages.

As a result, you can drastically change the generation times by using a different templater or formatter, for those engines that let you pick. For instance, in PieCrust, using Mustache is much faster than Jinja (although a lot more limited, of course).

Still, those tests are a good indication of how those engines scale.

Conclusions

As you can see, Hugo is insanely crazy fast – probably because it’s written in Go, a compiled, statically-typed language with proper multi-threading support… but also probably because Steve is a very skilled programmer :) It can handle several thousand blog posts before passing the 10-second mark.

PieCrust passes that bar between 500 and 1000 posts, depending on your hardware. If you have more than 4 cores, it could handle more posts, of course, since it also scales moderately wide, like Hugo, and unlike most other generators.

Overall, I’m happy with the results so far. I’m sure there are still a few optimizations to be done here and there – and I’ll gladly take some suggestions or contributions! – but, well, I’ve got other projects I’ve abandoned for too long :)