Ramblings of General Geekery

New features in PieCrust 2

Now that you know about PieCrust 2 and you’ve upgraded your website, it’s time to look at the really new features. Today we’ll talk about the two that I think are most important: the new content model, and the new pagination model.

[Image: And yet another apple pie]

(this post is going to be a bit long so here’s something to keep you hungry)

Sources, routes, and taxonomies

In PieCrust 1, like in most other static website generators, the way content was defined was quite rigid: you could have pages, and you could have blog posts. PieCrust did a few extra things, like letting you have multiple blogs, each with its collection of posts, but that was it.

The only way to have a specific set of pages, different from other pages, was to use page metadata and filtering (say, filter pages where type is recipe to get all recipes), but that didn’t translate to a good file-system organization, required remembering to tag things correctly, and required using the inverse filter to get the other pages. You also couldn’t have a different URL format for all recipe pages, as compared to normal pages.

Enter PieCrust 2, where all the content is, under the hood, defined with sources, routes, and taxonomies. If you don’t define them, PieCrust generates defaults equivalent to the PieCrust 1 content model, but you can override them for totally custom content.

Sources

Sources are where pages come from. Two source types you already know (if you used PieCrust 1 before) are:

  • the “simple” page source, where pages are found recursively under a given directory, and their relative path translates to their relative URL.
  • one of the “blog” page sources, where pages are found in a closely structured directory, and both the date and “slug” of the post are defined by the filename. So in the case of, say, the flat post source, all posts are files named YYYY-MM-DD_foo-bar.md directly under the posts/ directory.
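To make the flat post source more concrete, here is a minimal Python sketch of how a date and slug could be pulled out of a YYYY-MM-DD_foo-bar.md filename. The regular expression and the parse_post_filename name are made up for illustration; this is not PieCrust's actual code.

```python
import re
from datetime import date

# Hypothetical pattern for a flat post source: YYYY-MM-DD_slug.ext
POST_FILENAME = re.compile(r"^(\d{4})-(\d{2})-(\d{2})_(?P<slug>.+)\.\w+$")


def parse_post_filename(filename):
    """Return (date, slug) for a post filename, or None if it doesn't match.

    Note: an impossible date such as 2014-13-40 raises ValueError.
    """
    m = POST_FILENAME.match(filename)
    if m is None:
        return None
    year, month, day = (int(g) for g in m.group(1, 2, 3))
    return date(year, month, day), m.group("slug")


print(parse_post_filename("2014-08-21_foo-bar.md"))
```

A file that doesn't follow the naming scheme (say, about.md) simply yields None here, which is one way a source could decide to skip it.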

Because a site can have as many sources as you want, it already means you could create a “recipes” source, and put all the recipes in a different directory than the other pages, so that’s already nice.

But in the future there will also be more advanced sources. See, the “simple” page source just gives one piece of information about a page: its relative URL. But the blog post sources give more information, like the date of the article (“simple” pages need to specify it as part of their config header).

You could therefore imagine, say, a page source where each folder applies a tag to pages inside it. So if you created a page like recipes/pies/fruits/apple-pie.md, it would automatically have tags pies and fruits applied to it, as if you wrote tags: [pies, fruits] in its configuration header. Another useful source would be one that applies a hierarchical order to its pages, based on a filename prefix – this would be well suited to things like documentation.
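Such a folder-tagging source doesn't exist yet, but the idea is simple enough to sketch in a few lines of Python. The tags_from_path function is hypothetical, and it assumes the first path component is the source directory itself.

```python
from pathlib import PurePosixPath


def tags_from_path(rel_path):
    """Derive tags from the directories between the source root and the file."""
    parts = PurePosixPath(rel_path).parts
    # Drop the first part (the source directory) and the last (the file name).
    return list(parts[1:-1])


print(tags_from_path("recipes/pies/fruits/apple-pie.md"))
```

A page sitting directly under recipes/ would get an empty tag list with this approach.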

Routes

Now that PieCrust knows where to find your content, routes define how it’s exposed – or parsed, if you were to run PieCrust as a lightweight CMS, or when running chef serve.

A route defines the shape of the URL of a page. If you’ve used PieCrust 1, you can think of it as a generalization of the post_url/tag_url/category_url settings.

At the moment, a route can only use the information provided by the source (e.g. the year, month, day, and slug of a post for a blog post source), but in the future you’ll get to use all the other page metadata too (so that you can generate URLs that include categories or tags if you want).

Taxonomies

Taxonomies are another generalization from PieCrust 1. Before, only categories and tags would have automatically generated listing pages. Now you can have whatever you want. You just need to specify whether a taxonomy can have several terms applied to a page (like tags) or not, the name of the term listing page (like _tag.html and _category.html), and a few other optional things.

Putting it all together

Let’s say we want to have a section in our website where visitors can browse our favorite recipes. We want to put all recipe pages in a recipes/ directory (next to pages/ and posts/), be able to tag them by ingredients, with listings of recipes by ingredient being created automatically, and be able to tweak the URLs for all of this.

We’ll specify appropriate sources, routes, and taxonomies in the site configuration. Let’s start with just getting the recipes going:

site:
    sources:
        recipes: {}
    routes:
        - url: /recipe/%path%
          source: recipes

This will make PieCrust look for pages in the recipes/ directory, using the default page source (since we didn’t specify anything), i.e. the same as the one used for pages/. URLs that look like /recipe/foo/bar will match our new route, and a file named foo/bar.md will be loaded from the recipes/ directory in that case.
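If you're curious how a URL pattern like /recipe/%path% can be matched against an incoming URL, here is a rough Python sketch. It is only an illustration: compile_route and match_route are hypothetical names, and the rule that %path% may span several URL segments while other tokens stop at a slash is my assumption for this example, not a description of PieCrust's internals.

```python
import re


def compile_route(url_pattern):
    """Turn a pattern like '/recipe/%path%' into a compiled regex."""
    regex = ""
    # re.split with a capture group alternates literal text and token names.
    for i, chunk in enumerate(re.split(r"%(\w+)%", url_pattern)):
        if i % 2 == 0:
            regex += re.escape(chunk)
        elif chunk == "path":
            # Assumption: 'path' may contain slashes.
            regex += f"(?P<{chunk}>.+)"
        else:
            # Assumption: any other token matches a single URL segment.
            regex += f"(?P<{chunk}>[^/]+)"
    return re.compile("^" + regex + "$")


def match_route(url_pattern, url):
    """Return the matched tokens as a dict, or None if the URL doesn't match."""
    m = compile_route(url_pattern).match(url)
    return m.groupdict() if m else None


print(match_route("/recipe/%path%", "/recipe/foo/bar"))
```

With the matched path value in hand, finding the page is just a matter of looking for recipes/foo/bar.md (or another supported extension) on disk.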

Now’s the time to add the “ingredients” taxonomy. This gets more complicated because we have several things to specify:

site:
    sources:
        recipes: {}
    taxonomies:
        ingredients:
            multiple: true
    routes:
        - url: /recipe/%path%
          source: recipes
        - url: /recipes/with/%ingredients%
          source: recipes
          taxonomy: ingredients

This does the following:

  • Add a new taxonomy named ingredients. It’s a multiple taxonomy, meaning that pages can have more than one ingredient assigned to them (this tells PieCrust it potentially has to generate listing pages for combinations of ingredients).
  • Add a new route for listing recipes by a given ingredient. Here, the %ingredients% token, along with the taxonomy: ingredients setting, let PieCrust know how to properly find and match content for this route.
  • When a listing page needs to be generated, PieCrust will look for a recipes/_ingredients.md page, passing whatever value was matched by %ingredients% to an ingredients template variable. This is analogous to how tag and category listing pages work in PieCrust 1.
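Conceptually, generating listing pages boils down to building an index from each term to the pages that carry it. Here is a small Python sketch of that idea. The page dictionaries and the build_term_index function are invented for the example and don't reflect how PieCrust stores pages.

```python
from collections import defaultdict


def build_term_index(pages, taxonomy, multiple=True):
    """Map each taxonomy term to the titles of the pages that carry it."""
    index = defaultdict(list)
    for page in pages:
        value = page.get(taxonomy)
        if value is None:
            continue  # This page doesn't use the taxonomy at all.
        # A "multiple" taxonomy stores a list of terms; otherwise a single term.
        terms = value if multiple else [value]
        for term in terms:
            index[term].append(page["title"])
    return dict(index)


pages = [
    {"title": "Apple pie", "ingredients": ["apple", "butter"]},
    {"title": "Toast", "ingredients": ["bread", "butter"]},
    {"title": "About"},
]
print(build_term_index(pages, "ingredients"))
```

Each key of the resulting dictionary would correspond to one generated listing page, such as /recipes/with/butter.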

Some other interesting facts:

  • You can list all recipes by starting with {% for recipe in recipes %}.... Sources have a page iterator exposed by default to a template variable of the same name as themselves.
  • You can create a new recipe page easily with chef prepare recipes foo-bar.

There are many advanced settings to change the behaviour of PieCrust, but they’re outside the scope of this already quite long blog post.

Pagination

Another big change in PieCrust 2 is how pagination is handled. In PieCrust 1, you could only paginate blog posts, but in PieCrust 2 you can paginate any list of items – pages or otherwise.

The new paginate filter lets you do that, by returning a Paginator instance, exactly like the existing pagination object. But where the pagination object returns something that lists the posts in the default blog source, the paginate filter will return something that lists whatever it was passed.

This is especially useful for galleries, as shown with my Meeting Notes doodles. This was done more or less like so:

{% set thumbs = assets|paginate(9) %}

{% for thumb in thumbs.items %}
<img src="{{thumb}}" alt="Note" />
{% endfor %}

[Older entries]({{ thumbs.prev_page }})
[Newer entries]({{ thumbs.next_page }})

Of course in reality there’s more fluff (CSS classes, etc.) and tests around using prev_page and next_page, but you get the idea. Just like in PieCrust 1, assets returns the list of page asset URLs (in this case, a whole bunch of pictures), and the paginate filter makes sure only 9 of them will be shown on a given page. It also tells PieCrust to generate sub-pages.
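For the curious, the core of what a paginator does can be sketched in plain Python. This SimplePaginator class is a hypothetical stand-in, not PieCrust's Paginator: in particular, its prev_page and next_page return page numbers (or None) to keep the example short, whereas the template above uses them as links.

```python
import math


class SimplePaginator:
    """Expose one page's worth of items out of any list."""

    def __init__(self, items, per_page, page_num=1):
        self._all = list(items)
        self.per_page = per_page
        self.page_num = page_num  # 1-based index of the current sub-page

    @property
    def total_pages(self):
        # Always at least one page, even for an empty list.
        return max(1, math.ceil(len(self._all) / self.per_page))

    @property
    def items(self):
        start = (self.page_num - 1) * self.per_page
        return self._all[start:start + self.per_page]

    @property
    def prev_page(self):
        return self.page_num - 1 if self.page_num > 1 else None

    @property
    def next_page(self):
        return self.page_num + 1 if self.page_num < self.total_pages else None


thumbs = SimplePaginator([f"thumb{i}.jpg" for i in range(20)], 9)
print(thumbs.items, thumbs.next_page)
```

The None values are what the "tests around using prev_page and next_page" would check before rendering a link.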

Obviously, you can’t use 2 pagination sources on the same page – PieCrust wouldn’t know how to generate sub-pages that go in 2 different directions, so you’ll get an error if you try that.

Call for feedback

Please get in touch with me, or post comments here, if you have some constructive feedback about this new content model. PieCrust 2 is still in alpha, so there’s time to change the design without messing up every other PieCrust user.


Upgrading to PieCrust 2

The recently announced PieCrust 2 is all fine and dandy if you were to create a new website – the command line interface and user experience are essentially the same out of the box – but you will find that it can’t handle an existing PieCrust 1 website. This is because a few things have changed… luckily, the chef import command and this blog post will get you going in no time!

Installing PieCrust 2

First, get PieCrust 2 on your system:

  • Install Python 3.
  • Run pip install piecrust in a console.

If you want to install the bleeding edge (read “unstable”) version directly from Bitbucket or GitHub, instead of the latest posted release from the Python package manager, you can do one of:

  • pip install hg+https://bitbucket.org/ludovicchabant/piecrust2#egg=PieCrust
  • pip install git+https://github.com/ludovicchabant/PieCrust2.git#egg=PieCrust

There are many other options if you want to install PieCrust for advanced scenarios. See the pip documentation.

Check that everything’s OK by running chef --version. At the time of writing, you should get something like 2.0.0-alpha2.

Note: If you still get a 1.x version, you probably have the PieCrust 1 directory showing up first in your PATH environment variable, so go change that (or delete PieCrust 1 altogether!).

Upgrading a PieCrust 1 website

The chef import command was previously used for importing content from other CMSes like WordPress. Now it can also import content from a PieCrust 1 website, and even upgrade it in place. Go to your website’s root directory and try:

chef import piecrust1 --upgrade

You’ll notice that a lot of things got moved. The previous layout for a website looked like this:

+ root
|- _cache/
|- _content/
|     |- config.yml
|     |- pages/
|     |- posts/
|     |- templates/
|- css/
|- images/
|- lots/
|- of/
|- other/
|- crap/
|- whatever
|- .gitstuff

Now it looks like this:

+ root
|- _cache/
|- assets/
|     |- css/
|     |- images/
|     |- other-crap/
|- pages/
|- posts/
|- templates/
|- config.yml
|- whatever
|- .gitstuff

Basically, the _content directory is gone… everything got moved up one level, while all the asset files (CSS, images, fonts, etc.) got moved into an assets folder [1].

The benefits of this change are:

  • It looks better! There’s no more mix of different “things” at the root level (magic folders, assets, source-control files, miscellaneous things…). Instead, “things” are arranged in different folders that are almost self-explanatory (mostly because you can choose them!).
  • All the assets that should be processed and copied as part of the bake are in assets. Anything that’s only for development purposes (source control files, miscellaneous stuff) can be in the root directory, or in a different folder than assets, and they won’t be picked up by the bake. This prevents a lot of “oh shit” moments where you forgot to add something to the baker/skip_patterns config and a whole bunch of files are baked but you didn’t mean to.
  • It’s going to be easier to manage interoperability with other tools such as Grunt, by having such external tools operate on other top-level directories.

A few other things also changed, mainly because of the move from Twig to Jinja as the default templating engine. Although they’re very similar, they do have some differences in terms of built-in functions and filters.

  • An example is that Twig’s slice filter maps to an array notation in Jinja (so {{items.slice(3, 6)}} becomes {{items[3:6]}}), and Jinja’s slice does something completely different (although very useful)!
  • Another example is date formatting, which is different between PHP and Python.
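To illustrate the date formatting difference, here is a small Python sketch that translates a handful of PHP date() format characters into strftime directives. It is not what the importer does: the php_to_strftime function and its deliberately partial mapping are assumptions for this example, and it rejects any letter it doesn't know rather than guessing.

```python
from datetime import date

# Partial mapping from PHP date() characters to strftime directives.
PHP_TO_STRFTIME = {
    "Y": "%Y",  # 4-digit year
    "y": "%y",  # 2-digit year
    "m": "%m",  # month, zero-padded
    "d": "%d",  # day of month, zero-padded
    "H": "%H",  # hour, 24-hour clock
    "i": "%M",  # minutes
    "s": "%S",  # seconds
    "M": "%b",  # abbreviated month name
    "F": "%B",  # full month name
    "D": "%a",  # abbreviated weekday name
    "l": "%A",  # full weekday name
}


def php_to_strftime(fmt):
    """Translate a simple PHP date format; backslash escapes are not handled."""
    out = []
    for ch in fmt:
        if ch in PHP_TO_STRFTIME:
            out.append(PHP_TO_STRFTIME[ch])
        elif ch == "%":
            out.append("%%")  # A literal percent sign must be doubled.
        elif ch.isalpha():
            raise ValueError(f"unsupported PHP format character: {ch!r}")
        else:
            out.append(ch)  # Punctuation and spaces pass through unchanged.
    return "".join(out)


print(date(2014, 8, 21).strftime(php_to_strftime("Y-m-d")))
```

Characters like j (day without leading zero) have no portable strftime equivalent, which is part of why this kind of conversion can't be fully automatic.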

The importer will try to fix those things automatically for you (or at least warn you about them and provide guidance), but it’s probably going to miss a few, since I only know about those I ran into while upgrading my own websites. Please report any such problems, thanks.

Unsupported features

PieCrust 2 is not quite feature complete compared to PieCrust 1 – I can’t reasonably wait until 100% of the feature set is implemented before getting it out there for feedback.

Here are things I know are missing:

  • Running as a CMS: there’s not much code needed for that, but there’s no WSGI application class yet.
  • A plugin API: not much code needed there either, but at the moment anything you drop in the plugins folder won’t get loaded.
  • Slugification of taxonomies: tags and categories containing non-ASCII characters will keep them for now. PieCrust 1 had options for transliterating them into their unaccented ASCII counterparts.
  • Support for RSS/Atom feed scaffolding (chef prepare feed).
  • Mustache as an alternative template engine.

There are also probably things I don’t know are missing, so make sure you ping me if something you care about is not on this list.

In the next blog post, we’ll finally take a look at the new features in PieCrust 2, including the completely new underlying system for specifying pages, taxonomies, and URLs.


  1. Don’t worry, it’s configurable. You can put your asset files in a
    different folder, or even in multiple folders. ↩︎


Announcing PieCrust 2

I’ve been busy on it for longer than I expected – neglecting the freshly announced Wikked along with several pull requests on PieCrust – but I believe it’s at last ready for a public alpha release: PieCrust 2 is here!

[Image: 073/365 - Pi Day Pies, 2012]

WARNING: before you go clone the new repository, be aware that, at the time of writing this, it has been tested on a glorious total of 2 machines (both my own), and 2 websites (both my own as well). So don’t use it in production, but please do give it a try and post bug reports, thanks!

This post is a short overview of the reason for going a full major version number up, and of the new things you can expect to find. There will be other posts in the following days about breaking changes and upgrade paths, and a more in-depth look at the new features.

Bye bye, PHP

To say that this is a major rewrite of PieCrust would be an understatement: I moved the project over to Python, which means it’s a 100% rewrite. This may upset some users who only know PHP (or at least don’t know and/or like Python)… but this is for the best, I assure you.

First, one of the design principles of PieCrust was always to look language-agnostic. Unlike many other static website generators out there, there’s no “leak” between the underlying implementation of PieCrust and the user experience, i.e. you’re not exposed to PHP-isms at any time while using it [1]. This makes it easy to change the platform on which it runs without you being affected much.

Second, the reason I picked PHP for the first implementation of PieCrust, more than 3 and a half years ago, is that it felt to me it was the lowest barrier to entry for potential users. Other static website generators embrace their hacker roots, but I wanted something simple enough that any WordPress user would be able to pick it up and try it. Nowadays, people are a lot more used to installing various things to tinker with – Git, Node, Ruby, whatever. The barrier to entry doesn’t seem to be so much at the platform level.

Those two reasons meant I could look at other platforms and figure out which one has what it takes for what I have in mind for the future of PieCrust.

Packaging and distribution

One thing that quickly became annoying in PieCrust 1 was package management (both PieCrust itself and its dependencies) and distribution (how people get PieCrust on their machines). Composer has been an incredible improvement over the venerable PEAR, but both are still a few extra steps away and more complicated than they should be.

Comparatively, gem, npm, and pip are a lot simpler and, better yet, come by default with Ruby, Node, and Python 3 respectively. Getting rid of my custom installer was an appealing thought. The PieCrust 2 install instructions would basically amount to:

  1. Install Python 3
  2. Run pip install piecrust

That’s much better, especially when you think that the upgrade path and hosting are all taken care of for me.

Performance

But it was performance that was the major reason I switched development platforms.

The problem was not that PHP itself was not fast enough – it’s actually doing OK in the overall category of interpreted languages. The problem is that, because it’s got so much usage as a web programming language, it’s lacking a lot of features as a scripting language. One of those features is an API for multi-threading [2].

To be honest, I should have realized back when I started PieCrust that I would eventually need parallel processing… but it’s never too late to change direction, which is what I’m doing with PieCrust 2.

To give you an idea of how much this impacts performance, here’s a little graph. It shows the time it takes to bake my blog (the one you’re reading now!) using Octopress (the most popular static website generator around), PieCrust 1, and the newly written PieCrust 2. Obviously, shorter is better.

Octopress takes around 21 seconds [3], PieCrust 1 takes around 11 seconds [4], and PieCrust 2 takes around 6.5 seconds! And that’s even before I’ve made any optimization pass specific to this new codebase [5]!

So yes, there is quite a substantial gain after the move to Python and parallel baking already. And that’s even before I can get into other improvements… for example, chef serve will be able to start a background thread to monitor changes to static assets on the file-system instead of checking for them when HTTP requests come in. This should make the preview server much snappier when you’re refreshing a page that has several images or CSS sheets.
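To give a feel for what parallel baking means, here is a toy Python sketch that renders a list of pages across a pool of worker threads. It is in no way PieCrust's baker: render_page is a made-up stand-in for the real work, and the page dictionaries are invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def render_page(page):
    """Stand-in for real rendering work; returns (output path, HTML)."""
    return page["path"], f"<h1>{page['title']}</h1>"


def bake(pages, workers=4):
    """Render all pages using a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in the same order as the input pages.
        return dict(pool.map(render_page, pages))


pages = [
    {"path": "index.html", "title": "Home"},
    {"path": "about.html", "title": "About"},
]
print(bake(pages))
```

Keep in mind that in CPython, threads mostly help when the work spends its time waiting on I/O (reading sources, writing output files); the standard library's ProcessPoolExecutor has the same interface for CPU-heavy work.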

Next post we’ll look at how you can upgrade your existing PieCrust 1 website to version 2, since I took the opportunity of a major version bump to clean up a few things I didn’t like anymore.


  1. Except for date formats. Sadly, date formats are very much tied to the
    underlying framework – unless you implement your own wrapper syntax – and this
    is one annoying breaking change when upgrading to 2.0. I’m open to ideas to fix
    that of course! ↩︎

  2. There are a couple extensions available to fill the gap, but they’re just
    terrible. ↩︎

  3. And that’s only for the posts on this blog, along with the tag pages, with
    simplified markup… my crude interop script strips out code highlighting blocks
    and other expressions (e.g. {{foo}} expressions). I’m
    expecting the real thing would take an additional second or two. ↩︎

  4. See how little it matters whether PHP sucks more or less than Ruby? Design
    and implementation are a lot more important for the big gains. ↩︎

  5. Most optimizations from PieCrust 1 were ported over to the Python codebase
    already. ↩︎


DRM-free backup on Comixology

Me, a few months ago after the “scandal” of Comixology removing the ability to buy comics directly from inside their iOS app:

I would hope ComiXology manages to revert the change, but frankly I’d rather put my hopes in more DRM-free comics available directly from the creators and publishers instead.

Well my hopes have been answered in a way: Comixology announced last week that you would be able to download DRM-free versions of your Comixology books for publishers who are OK with that:

The first wave of participating publishers making their books available as DRM-free backups include Image Comics, Dynamite Entertainment, Zenescope Entertainment, MonkeyBrain Comics, Thrillbent, and Top Shelf Productions. In addition, creators and publishers that are self-publishing through comiXology Submit are now able to choose to make their books available with a DRM-free backup.

No surprises here about the publishers who are indeed “OK with that”, since they’re the ones who were already offering DRM-free comics on their own website… but this is excellent news. I can’t stress enough how huge this is.

I’m not sure whose idea it was – whether publishers like Image pressured Comixology to do this, or whether Comixology came to this logical conclusion on their own – but I’m very happy either way. As I said before, I had completely stopped buying Image comics from Comixology, preferring instead their own DRM-free website… but that website was slow as hell and barely usable. Ideally I’d rather give 100% of my money to Image, instead of – probably – 70% through Comixology, but the usability is night and day between the two, and uploading independently acquired files to an iPad is still a huge pain in the ass [1].

“For those out there who have not joined the comic reading community because of DRM – you have no excuse now,” said co-founder and Director of ComiXology Submit John D. Roberts

Indeed.

The only problem I’ve found so far is that those backups are extremely bare: just a ZIP file with the pages as JPEG images. They’re the “retina” hi-res versions, so that’s good, but the archive is missing any kind of metadata. The only way to know what it is, short of having a human open it and read the cover, is to parse the file name.


  1. Something I’m hoping will be greatly improved in iOS 8. ↩︎