The Stochastic Game

Ramblings of General Geekery

Data First: The File Server (part 2)

In part 1, we had a look at how to buy and set up a file server. Now you may be asking yourself a few questions about how to actually use that thing. The first question we’ll answer is “what should I put on there?”.

Network

Q: What should I put on my file server?

The answer is two-fold: anything you want to access from more than one device, and anything you can’t easily back up automatically otherwise.

In my case, it means everything:

  • I surely do want to access my music and movies from a variety of devices. I even access them from work.
  • I’m accessing my pictures only from my laptop, but my wife also wants to access them from her laptop (both laptops have Lightroom installed), so pictures go on the server. It also means they will be backed up automatically – if they were on one of the laptops, it would be difficult to do that since the machine would most likely be asleep when the backup job kicks in, and in this age of SSD-only laptops you’d run out of space pretty quickly anyway.
  • My code repositories are on the file server too. I check out the code locally and commit/push changes back to the file server.
  • Documents, porn, whatever, it’s on there.

Of course there are some caveats. Things may be too slow for you. For instance, if I work in Lightroom, I’ll turn off Wi-Fi and plug the laptop into my home Gigabit network. And even then, it will be noticeably slower than if the pictures were stored locally (but it’s not too bad as far as I’m concerned, since raw picture editing is still the performance bottleneck on my machine). If you’re doing stuff like video editing, that’s not even an option.

When a particular piece of data can’t be efficiently accessed remotely, you can use your file server as the backup device – data would be stored locally on one machine, and backed up automatically to the file server. That’s fine, as long as the backup process is, again, automatic. This generally means the source machine is a desktop computer, so that it’s available during the night, when most backup jobs execute.

I would advise against storing data anywhere other than the file server or an automatically backed-up desktop machine (or otherwise always-on storage unit). Choosing where to put a given piece of data is always a balancing act between where it makes sense, where it’s convenient, and where it’s safe, but remember: when in doubt, always prefer safety.


Data First: The File Server (part 1)

The central piece to a data-first methodology is, in my opinion, having a file server.

The reason for this is that you’re going to want to access your data anytime, anywhere: streaming your music to your work PC, your movies to your iPad, or accessing your documents from your phone. You need a secure and reliable way to store and serve that data, and this is best done with a file server.

File Server

(if you’re already about to ask why I don’t just use iCloud or something, you may need to read my introduction post again)

Let’s look at the basics for getting a file server after the break.

The hardware

Requirements

The absolute minimum requirements are a computer that’s connected to your home network, and that is always on. Most desktop computers would fit the job description, unless you’re one of those weird people who actually turn their computers off instead of letting them go to sleep. Ideally, however, a file server should meet a few other requirements:

  • Low power: it’s going to be always on, so it might as well consume as little energy as possible, not just for the cute baby polar bears but for your electricity bill too.

  • Quiet: that’s important unless you intend to store it away in your basement or something.

  • Data redundancy: hard drives fail – it’s not a matter of “if” but a matter of “when”. And when it happens, you don’t want to have to recover your data from a backup. It’s tedious at best, and you may lose the last hour of work even if you’re using things like Time Machine on a Mac. Only with “copy-on-write” snapshot systems would you not lose anything, except maybe the file that was being saved when the disk failed. Shadow Copy, a little-known feature of Windows that has been available for almost 10 years, does exactly that, but nobody really uses it as far as I can tell because Microsoft never did any fancy UIs and marketing like Apple did with Time Machine. And it would mean running Windows on your file server, which would probably defeat the previous 2 bullet points, as it would require desktop-grade hardware instead of a smaller dedicated box.

    Anyway, I highly recommend you get a file server that handles some kind of data redundancy. RAID-1 mirroring is the minimum (2 disks), variants of RAID-5 with 4 disks are the most common, and anything above that is a nice bonus. With both RAID-1 and RAID-5 you get one “free” disk failure – i.e. you can have one disk die on you, and if no other disk dies while you replace it and the system rebuilds itself, then you’re all good. Of course, even if it’s quite rare, it does happen that a second disk dies soon after (especially if all your disks are the same model, bought at the same time), so make sure you always have a backup ready (we’ll talk about that later in this article). If you go the DIY route, see the mdadm sketch after this list.

  • Connected with wires: ideally, the file server should be connected via Gigabit Ethernet. You can always hook it up to a Wi-Fi router down the line, but at least you know the bandwidth bottleneck is not at the source. Sure, the latest forms of Wi-Fi can stream HD videos, but how do you think that’s going to scale when you end up having 2 people streaming from 2 different devices while you’re transferring files and there’s a backup job going on? Yeah. If you can, use wires.
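
If you end up building your own box on Linux instead, software RAID is only a couple of commands away with mdadm. Here’s a minimal sketch – the device names are made up, so adapt them to your actual disks:

> mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb /dev/sdc
> mkfs.ext4 /dev/md0
> mount /dev/md0 /mnt/data

The first command builds the 2-disk RAID-1 mirror; the other two format and mount it.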

Buying a NAS box

Given these requirements, the easiest way is really to go for a dedicated NAS. Unless you want to go through the trouble of refurbishing an old computer of yours, figuring out how to get the RAID controller to work correctly, and finding a place for a box that’s probably 5 times bigger than it needs to be, that is. You could also build a custom PC, but if you’re considering it, then you’re probably able to figure it out on your own (although I may post a guide in the distant future).

A few years ago, ReadyNAS was all the rage for NASes, with Synology not far behind, but since being bought by Netgear (which was around the time I bought my NV+), ReadyNAS seems to have fallen behind. Now the top of the line seems to be QNAP, Thecus, and Synology, if those performance benchmarks are to be believed.

Have a look at those benchmarks and pick the best models that fit your budget, but do keep an eye out for which configuration is used in each test. For instance, RAID-0 with many bays can outperform everything else on some tasks, even though in practice it’s probably not a configuration you’d use.

Synology is known to have the most user-friendly administration UIs, so you may want to bias your choice accordingly. You may also be distracted for a while by all the stuff these boxes can do on top of simple file sharing, like BitTorrent clients and web servers and photo galleries and whatnot. Stay focused on their main role: serving files.

The Software

There’s not much to say about the software. If you bought a pre-built NAS as recommended, it will usually come with its own custom Linux-based OS with a fancy administration UI. You will just boot it, set up the shared folders and permissions, and maybe configure some additional sharing services you may need.

If, however, you built your own NAS, or are using an existing desktop computer, install whatever you know best – macOS, Windows, or, ideally, a lean Linux distro. Whatever you end up with, make sure your server is easy to remote into. You’re unlikely to hook up a screen and keyboard to it, so you’ll have to use Remote Desktop (on Windows) or SSH (on Mac/Linux) to do any kind of maintenance work. Note that some dedicated NAS operating systems, like the appropriately named (FreeBSD-based) FreeNAS, have a web administration panel.

Backups

Once you have your file server running, the first thing you need to do is set up some automated backups.

Let me write this again: YOU NEED TO SET UP SOME AUTOMATED BACKUPS.

I’m not kidding. Don’t say “oh I’m bored, I’ll do it next weekend, I just want to watch Netflix already”. You know perfectly well that next weekend you’ll be browsing Facebook and cleaning your house because you’ve been putting that off for 3 weeks already. So do it now.

Again, if you got a pre-built NAS, this is probably as easy as plugging an external drive into the box and going through the administration panel to find the backup settings. Just do an incremental backup of the whole thing every night to that drive. Bonus points if your backup drive is itself a RAID array.

Your NAS may also have some kind of continuous backup system (sometimes called “snapshots”), so you can enable that too.

If you have a custom box, you’re probably smart enough to set up a scheduled robocopy task (on Windows) or a cron job running rsync (on Linux/Mac) to back up all your data to a secondary drive. If not, look it up online.
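
For the rsync case, a nightly crontab entry could look like this – a sketch with made-up paths that you’d replace with your own mount points:

# Mirror the data drive to the backup drive every night at 3am
0 3 * * * rsync -a --delete /mnt/data/ /mnt/backup/data/ >> /var/log/backup.log 2>&1

The -a flag preserves permissions and timestamps, and --delete keeps the backup an exact mirror (drop it if you’d rather keep deleted files around).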

What next?

In the next parts, we’ll discuss a couple things, like what should actually go on that new fancy file server of yours.


PieCrust 1.0

PieCrust reached the big milestone of version 1.0 without much fanfare – and this post won’t be any different from the other release announcements. After a few release candidates I figured I would never be quite satisfied, so why not just keep going with the releases and not pay too much attention to the first digit.

Strawberry Rhubarb pie #gluten-free

You’ll see releases 1.1.0 and up coming soon, with the usual bunch of fixes, changes, and new features. The only difference is that the version number will now better reflect what’s going on, since I’ll be loosely following the semantic versioning specification. In a nutshell, the digit being incremented reflects whether a release is a bug fix, a non-breaking change, or a major and/or breaking change.

The one big new thing that comes with version 1.0 is an installer script, along with a .phar binary, to make it easier for people to use PieCrust if they don’t want or need the source code. Head over to the PieCrust documentation for more information.

For the rest of the changes, keep reading.

Auto-formats

One popular request has always been to make it possible for users to write pages and posts using extensions other than .html – most specifically .md or .markdown. This is now possible with the auto-format feature, which maps file extensions to formats. As of 1.0, no auto-format is declared by default, so you have to specify the ones you want in your config.yml:

site:
    auto_formats:
        md: markdown
        markdown: markdown
        textile: textile

The example above maps extensions .md and .markdown to the Markdown format (same as if you specified format: markdown in the page’s config header), and extension .textile to the Textile format.

As of version 1.1, .md and .textile will be defined by default.

Template data changes

Some page template variables have been changed:

  • asset is now assets.
  • link is now siblings, and returns the page’s sibling pages (i.e. in the same folder).
  • There’s a new family variable that returns a recursive version of siblings (i.e. sibling pages and all the children pages in sub-directories).

The old names are still available, but will trigger warnings when you bake.

Feed preparation

The chef prepare command can now create more than pages and posts: you can run chef prepare feed and it will create a boilerplate RSS feed page for you.

You can specify --atom to create a boilerplate Atom feed instead.
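
For example, assuming you want the feed at myfeed.xml:

> chef prepare feed myfeed.xml
> chef prepare feed --atom myfeed.xml

The first command creates an RSS feed page, and the second an Atom one.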

Plugin update

If your website has some plugins, you can update them easily with the new chef plugins update command. Right now it will just stupidly re-download the plugins from their source, so it may re-install the same version, but that’s enough for now 🙂 It’s especially handy if you have some custom plugin that’s used by several websites.
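
Usage is as simple as:

> chef plugins update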

Sass, Compass and YUICompressor

Speaking of plugins, the previously plugin-implemented Sass, Compass and YUICompressor processors are now part of the core PieCrust code.

They have also been improved in the process. Most importantly, Compass support is a lot better.

Miscellaneous changes

  • The monthly blog archives (blog.months) were incorrectly ordered chronologically, instead of reverse-chronologically. This is now fixed.
  • Anything that returns a list of pages or posts should now have consistent behaviour and features, e.g. filtering template functions.
  • You can get access to Twig’s debug functions by setting the twig/debug site configuration variable to true.
  • If you want PieCrust to use the JavaScript lessc compiler to process LessCSS stylesheets, set the less/use_lessc site configuration variable to true (see the config sketch after this list for both of these settings).
  • Pretty colors for chef commands on Mac/Linux! (this is important)
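
For the two configuration settings above, the corresponding config.yml entries would look like this – a sketch, using the same nested YAML form as the other slash-separated settings in this post:

twig:
    debug: true
less:
    use_lessc: true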

For the complete list of changes, see the CHANGELOG.


Blog Archives in PieCrust

The question has come up a couple of times already via email or Twitter, so here’s a quick recipe for writing a nice-looking archive page for your PieCrust blog.

There are 2 main types of blog archives: monthly archives and yearly archives. We’ll look at them one at a time after the break.

Yearly archives

This is the simplest one. Because PieCrust exposes your posts sorted by year in the blog.years template variable, you just need to loop over it with a for loop. Each object returned in the loop contains the following attributes:

  • posts: the list of posts for that year.
  • name: the name of the year (although you can also use the object itself, which is what we’ll do).

You end up with something like this:

---
layout: blog
title: Blog Archives
format: none
---
{% for y in blog.years %}
<h2>{{ y }}</h2>
<ul class="archive-list">
    {% for p in y.posts %}
    <li>
        <p>
            <a href="{{ p.url }}">{{ p.title }}</a>
            <time datetime="{{ p.date|atomdate }}">{{ p.timestamp|date('M d') }}</time>
        </p>
    </li>
    {% endfor %}
</ul>
{% endfor %}

Note how we render the date of each post with a custom format using the date filter (here using only the day and month). For more information about the date filter, check out the Twig documentation. Also, to provide a little bit of metadata, we use a time tag along with the atomdate filter, which is a handy shortcut for using the date filter specifically with an XML date format. Of course, you don’t have to keep that same markup – you can reuse the Twig logic but completely change the rest.

Monthly archives

This one is a bit more complicated. Although PieCrust also exposes your posts sorted by month in blog.months, you still need to spot changes in years so you can print a nice title or separator. To do this, we keep a curYear variable up to date with the current month’s year. If a month has a different year than the month before it, we print the new year in an h2 tag.

Each object in blog.months has the following attributes:

  • posts: the list of posts in that month.
  • timestamp: the timestamp of the month, so you can render it to text with the |date filter.

So to get the year of each month, we use month.timestamp|date("Y"). To print the name of each month we use month.timestamp|date("F").

In the end, it goes something like this:

---
layout: blog
title: Blog Archives
format: none
---
{% set curYear = 0 %}
{% for month in blog.months %}
    {% set tempYear = month.timestamp|date("Y") %}
    {% if tempYear != curYear %}
        {% set curYear = tempYear %}
        <h2>{{curYear}}</h2>
    {% endif %}
    
    <h3>{{ month.timestamp|date("F") }}</h3>
    <ul>
    {% for post in month.posts %}
        <li>
            <a href="{{ post.url }}">{{ post.title }}</a>
            <time datetime="{{ post.date|atomdate }}">{{ post.date }}</time>
        </li>
    {% endfor %}
    </ul>
{% endfor %}

Again, feel free to keep the logic and change the markup to your liking – that’s the whole point of PieCrust!


Data First

I like to think I’m being careful and responsible with my data, especially when I look at what most people do with theirs, so I thought I’d start a new series of posts on the subject.

Hard Disk

“Data-first” is about choosing applications, services and devices based, first and foremost, on the data that you will get out of them, or the data they accept as input. It’s important because, at the end of the day, once you’ve quit your apps and turned off your devices, your data is the only thing that’s left, and the only thing from which you’ll start again tomorrow. It’s also the only thing you’ll have when you decide to switch to different applications, different OSes, or different devices.

Who is this for?

Some companies – Apple, Google, Amazon, or Microsoft – want you to trust them with your data. Trust that they will keep it available to you as the technological landscape around us changes. Trust that they will keep it stored for the next 50 years or so. And that they’ll always be there to unlock the files for you. And that they’ll pass it all on to your kids when you die.

If you can trust at least one of them, “data-first” is probably not for you. Instead you’ll choose the path of least resistance where all you have to do is tap a button on your iPad or Kindle Fire, watch that movie or read that book, and forget about it. Did you just rent or purchase? Do you own or merely lease? Does it have DRM? What about maybe switching to another eco-system in the future? Who cares! I applaud your ability to not worry about such things. Be on your way, you blissfully lucky person, I wish you well.

If you’re like me, however, there’s no way you can think that way. Being French means, at best, having a… let’s say: a “healthy” distrust of governments and corporations. Even if I trusted a company right now (which I don’t), I have no guarantee that the next CEO or board of directors are not going to screw their customers over. And this is important when you want to keep consuming your data for a long time. Am I ever going to stop re-watching “Who Framed Roger Rabbit” or “The Shining”? Or stop re-reading any Alan Moore comic? Probably not. And how long is “Game Of Thrones” going to last? Another 6 years, maybe? Remember how things were 6 years ago? Yeah, that was when people were eagerly waiting for the first iPhone to be released, and Netflix was still about mailing DVDs to people.

So no, I will not trust anybody but myself to manage my data for the next 50 years, let alone the next 10.

What is it for?

Keep in mind that the “data-first” approach has nothing to do with services and applications where you’re not supposed to keep any data. This includes iTunes rentals and subscription-based services like Spotify or Netflix. I have absolutely no problem with those, and I use them extensively.

What it’s for is any data you’ve chosen to purchase (videos, music, books, whatever), or that you have created or shared (emails, IMs, or other social media bullshit). That’s what we’ll be talking about.

“Data-first” posts will be tagged with the eponymous tag, so keep an eye on it for case-by-case studies.


Tough Time for Honest Comics Readers

JManga, a digital manga service created less than 2 years ago by 39 of the biggest publishers in Japan, is shutting down in a couple of months. Most cloud-service-related fears became a reality when it became clear that no refunds or backups would be offered. Check out their “Urgent Message” for more details, but believe me when I say it can’t get any worse:

It is not possible to download manga from My Page. All digital manga content will no longer be viewable after May 30th 2013 at 11:59pm (US Pacific Time)

Everybody then wondered what would happen if ComiXology went down. And funnily enough, just the day before, ComiXology had experienced a massive blackout which left people unable to read any issues they didn’t have in their cache.

Rich Johnston from Bleeding Cool concludes:

This is the moment when the real winners are comic stores… and pirates.

As I said before, and as many others said before me: own your data. Cloud services are fine by me as long as there’s a way to easily back up my stuff on my file server, thank you very much.


The Death of Google Reader

After the infamous announcement that Google was shutting down Google Reader, there was a lot of debate around the use of online services, especially free ones, and whether we can trust a company to keep such services up indefinitely.

Of course, nothing can last “indefinitely”, and probably nothing will last until you die. You have to expect that Gmail, Facebook, iTunes, Amazon Kindle and any other service you’re currently using won’t last for more than, say, 20 years (and that’s being generous). You need to plan accordingly.

Marco Arment sums this up on his blog:

Always have one foot out the door. Be ready to go.

This isn’t cynical or pessimistic: it’s realistic, pragmatic, and responsible.

That’s what I’ve always tried to do. I choose programs, services and products that don’t take my data away from me. It makes it easier to switch to something else if I need/want to, and it future-proofs what I spend money on.

But that’s where it gets interesting, because Google Reader was pretty open to begin with. Google may have become this data hoarding and privacy raping monster over the years, but one thing they always had going for them was the Data Liberation initiative. With it, you effectively always had one foot out the door. You could, at any moment, download a list of all your subscriptions in an open, standardized format, along with a collection of all your stars, comments, and shares. You may be disrupted for a while because you would need to adapt to a new feed reader, but you could switch, just like you can switch text editors or operating systems.

What’s wrong with the Google Reader situation has nothing to do with your data, or with using a free service (although that is an important subject too). What happened is that Google Reader became a lot more than a free online feed reader. It became a single choke point for virtually every feed reader or news aggregator in the world. Google is of course to blame for making it a collateral damage of their social-wannabe delusions, but we are equally to blame for letting all those programs like Flipboard, Pulse or NetNewsWire rely on a single service that was never really intended to be used that way. It’s understandable it ended up this way, because relying on Google Reader meant easier adoption for new users and not having to worry about complex problems like data storage, metadata syncing, and interoperability… but it doesn’t make it the right decision either. We are to blame because we were constantly asking for Google Reader support. It became a feature you couldn’t ship without.

The death of Google Reader is not about losing a product we love – it’s about breaking an entire eco-system of products and workflows.

Hopefully, we’ll recover and things will be better, but it does bring up another debate: the one about how we rely so much on other single choke points like Twitter and Facebook. Ideally, everything should be federated, like email, but I tried looking at distributed alternatives, and they’re just not working well enough. If the links you share and photos you post are of any value to you, I’d suggest you start looking at data harvesting solutions. I know I am.


The Problem with iOS

These past few months I’ve seen a fair number of articles about people who switched from iOS to Android. Most of those articles talk about the differences between the 2 operating systems, and how some of those differences proved significant enough for whoever was switching: multitasking, notifications, the so-called “open vs. closed”, etc. That’s fine, but these bullet-point-list vs. bullet-point-list comparisons seem to miss the higher-level view of what’s really going on: iOS is just not working the way it should anymore.

When iOS was released in 2007, it made a lot of things easier to do on a phone, compared to the existing competitors. A lot of tasks just felt “right”, or at least “way better”. But it has basically stayed there since then. The only real additions, like the notification center or iCloud, are either incredibly badly designed, or only usable if you have an almost exclusively Apple-based household (which you shouldn’t). When I look at my iPad now, I feel like I’m looking at a computer that hasn’t been updated in years.

If I’m reading something in Safari, I should be able to quickly send it to Pocket, Buffer, Dropbox, or whatever other service I choose to base my workflow on. But on iOS, the only “sharing” services I have access to are those that Apple thinks I should use. For everything else, I need to copy the URL, switch to another app, and find where to paste it. Why?

If I added a few articles to read on Pocket late yesterday evening, I should have them ready on my phone or tablet the next morning so I can read them during the commute. But on iOS, I have to remember to open the Pocket app so it can sync. Why? (note that other apps like Instapaper try to use the Location Notification API but frankly, by the time I leave the house, and therefore lose any WiFi connectivity, it’s already too late).

If I just bought a bunch of PDFs from DriveThruRPG, e23, Arc Dream or Pelgrane, or some comic books from Panel Syndicate or Thrillbent, or whatever else I want from whoever I feel like giving money to, I should be able to just download those files onto my device and open them with any application I like. But on iOS, unless it’s coming from iTunes (directly or via in-app purchases, all of which also means 30% of my money doesn’t even go to the people I want to give my money to), I have to jump through many hoops, going first through Dropbox or FileBrowser and then having to re-transfer every. single. file. to each application’s sandbox storage. Why? Oh why?

You may not be surprised to learn that all of those things “just work” on Android. And those are just examples of the frustrations I’ve had with my iPad in the past few weeks. You don’t just go “BOOM!” anymore – quite the contrary: you actually have to work harder to make things happen.

Some people might be tempted to boil this down to a simple list of features, like “iOS needs a background service API, shared storage, and more extensibility points for 3rd party apps”. This sounds to me like going back to 2007 and saying “Windows Mobile needs to make the stylus optional, have a grid of icons on the home page, and remove copy/paste”. Really, it’s not just about a bullet point list of features. It’s about the whole philosophy of the system, and of the company behind it. And although Apple seems to be occasionally bowing down to the pressure of the market, like when it released the 7″ iPad Mini, I don’t expect it to change this radically on the design of iOS.

Would I rather have an iPad Mini instead of my Nexus 7 to bring with me everywhere I go? Of course I would. The iPad Mini is thinner, lighter, and has a better screen. But as someone famous once said, “[design is] not just what it looks like and feels like. Design is how it works”. And for me, iOS is just not working well enough. And it never did, really – it’s just that until the competition caught up with the basics, only a minority of users noticed it.


PieCrust 1.0 RC

The past month has been pretty busy, between my next secret project, my day job, and of course fixing PieCrust bugs. But somehow, out of this chaos, a release candidate for PieCrust 1.0 has emerged. And it’s only fitting that I announce this on Pi Day!

P365x52-73: Pi(e)

As always, for a complete list of changes, I’ll redirect you to the changelog. But for the highlights, please read on.

Big thanks go to the few people who contributed patches to the PieCrust code, and to the many who reported bugs and had the patience to help me fix them.

Breaking changes

First, the breaking changes. There are a few more than I’d like, but most of them should not be a problem for 99% of users:

  • Chef’s command line interface has changed: global options now need to be passed first, before the command name. So for example, if you want debug output when baking, you need to type chef --debug bake.
  • The pagination.posts iterator can’t be modified anymore (i.e. calls to skip or limit or filter will fail). You can use the blog.posts iterator instead to do anything custom.
  • The xmldate Twig filter has been renamed to atomdate.
  • There was a bug with the monthly blog archives (accessed with blog.months), where they would be incorrectly ordered chronologically. They are now ordered reverse-chronologically, like every other list of posts.
  • The baker/trailing_slash setting is now site/trailing_slash, since PieCrust will also generate links with a trailing slash in the preview server, and not just during the bake, when that setting is enabled. The old setting is still available, though.
  • The asset template variable has been renamed to assets. The old name is still available.
  • Specifying a link to a multi-tag listing page is now done with the array syntax: {{pctagurl(['tag1', 'tag2'])}}. The previous syntax quickly broke down as soon as somebody decided to have tags with slashes in their name 🙂

All those changes should give you an error message that’s easy to understand, or have backwards compatibility in place with a warning telling you about the change. Look out for those.

Sass, Compass and YUI Compressor

Previously available as plugins, the Sass, Compass and YUI Compressor file processors are now part of the core. There were enough people mentioning those tools, especially Compass, that it made sense to include them by default.

The Sass processor is very similar to the one previously available in the plugin. In the site configuration, you can specify include paths with sass/load_paths, output style with sass/style, or any custom option to pass to the Sass tool with sass/options.
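
In the site configuration, that could look like this – the values here are only illustrative assumptions:

sass:
    load_paths:
        - _sass
    style: compressed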

Compass support, however, has changed quite a bit, and should now be a lot better:

  • You enable it by setting compass/use_compass to true. This will prevent the default Sass processor from running on your .scss files.
  • If .sass or .scss files are found in the website, the compass tool will be run at the end of the bake. By default, it will use any config.rb found at the root of the site. You can otherwise specify where your Compass config is with compass/config_path, or ask PieCrust to auto-generate it for you by setting compass/auto_config to true.
  • It may be a good idea to add your config file to the baker/skip_patterns list, so that it’s not copied to the output directory (see the config sketch after this list).
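
Putting it together, a Compass-enabled configuration could look like this – the paths and patterns are examples:

compass:
    use_compass: true
    config_path: config.rb
baker:
    skip_patterns:
        - config.rb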

To enable the YUI Compressor to run on anything that outputs CSS, specify the path to the .jar file with yui/compressor/jar.
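
That is, with a placeholder path:

yui:
    compressor:
        jar: /path/to/yuicompressor.jar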

Linking feature now official

For a while, there was a link template variable that let you access other pages in the content tree. It was however never really official since I was still iterating on the design.

It’s now official, and available through the siblings template variable. It will return the pages and directories next to the current page.

To return the whole family tree starting from the current page, you can use family. It’s like a subset of site.pages.
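
Here’s a quick sketch of what using it in a template could look like – the markup is mine, and I’m assuming the returned items expose the same url and title attributes as the other page lists in this post:

<ul>
{% for p in siblings %}
    <li><a href="{{ p.url }}">{{ p.title }}</a></li>
{% endfor %}
</ul>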

Auto-format extensions

Another popular request is the ability to use different file extensions for pages and posts, like .md for Markdown content or .textile for Textile content.

This is now possible with site/auto_formats. This is a list that maps an extension to a format name:

site:
    auto_formats:
        md: markdown
        mdown: markdown

Here I’m mapping *.md and *.mdown to the Markdown format. Files found with those extensions will be treated as if they were .html files, but will also have their format set to markdown.

Feed preparation

If you write a blog, you most probably want to have an RSS feed. You can have one prepared for you with: chef prepare feed myfeed.xml. It will create a new page that has most of what you want by default. You can then go and tweak it if you want, of course.

Miscellaneous

A few other important changes:

  • All libraries (including Twig, Markdown or Textile) have been upgraded to their latest versions.
  • It is now possible to specify posts_filters on a tag or category page (_tag.html or _category.html).


PieCrust on Heroku

When I first decided to work on PieCrust, I settled on PHP as the language – even though it mostly sucks – in an attempt to make it broadly available. Anybody who runs a blog on WordPress should be able to switch and enjoy the perks of plain-text data without needing to install and learn a whole new environment.

That doesn’t mean PieCrust can’t also be used in the nerdiest ways possible. A while ago we looked at how cool it is to update your website with Git or Mercurial, and today we’ll look at how you can host it on Heroku, which incidentally also supports Git-based deployment.

Today's latte, heroku.

If you already know how Heroku works, then the only thing you need is to make your app use the custom PieCrust buildpack. Skip to the end for a few details about it.

For the rest, here’s a detailed guide for setting up your PieCrust blog on Heroku, after the break.

1. Sign up and setup

This is pretty obvious but it’s still a step you’ll have to go through: sign up for a Heroku account and install their tools. Follow the first step to log in via the command line, but don’t create any app just yet.

2. Create your PieCrust website

For the sake of this tutorial, let’s start with a fresh new site. You can of course use an existing one; the steps would be very similar.

Let’s create one called mypiecrustblog:

> chef init mypiecrustblog
PieCrust website created in: mypiecrustblog/

Run 'chef serve' on this directory to preview it.
Run 'chef bake' on this directory to generate the static files.

Let’s also add a post, just to be fancy:

> chef prepare post hello-heroku
Creating new post: _content/posts/2012-12-03_hello-heroku.html

Lastly, turn the site into a Git repository, make Git ignore the _cache directory, and commit all your files:

> git init .
Initialized empty Git repository in /your/path/to/mypiecrustblog/.git/
> echo _cache > .gitignore
> git add .
> git commit -a -m "Initial commit."

By the way, you can quickly check what the site looks like locally with chef serve. We should be able to see the exact same thing online in a few minutes when it’s running on Heroku.

3. Create your Heroku app

Now we’ll turn our site into a Heroku app. The only difference from the documentation on the Heroku website is that we’ll add an extra command-line parameter to tell it that it’s a PieCrust application:

> heroku create mypiecrustblog --buildpack https://github.com/ludovicchabant/heroku-buildpack-piecrust.git
Creating mypiecrustblog... done, stack is cedar
BUILDPACK_URL=https://github.com/ludovicchabant/heroku-buildpack-piecrust.git
http://mypiecrustblog.herokuapp.com/ | git@heroku.com:mypiecrustblog.git
Git remote heroku added

What’s happening here is that, in theory, Heroku doesn’t know about any programming language or development environment – instead, it relies on “buildpacks” to tell it what to do to set up and run each application. It has a bunch of default buildpacks for the most common technologies, but it wouldn’t know what to do with a PieCrust website so we need to provide our own buildpack, with that --buildpack parameter.

If you already created your app previously, you can also make it a PieCrust application by editing your app’s configuration like this:

heroku config:add BUILDPACK_URL=https://github.com/ludovicchabant/heroku-buildpack-piecrust

We can now push our website’s contents to Heroku:

> git push heroku master
Counting objects: 3, done.
Writing objects: 100% (1/1), 185 bytes, done.
Total 1 (delta 0), reused 0 (delta 0)

-----> Heroku receiving push
-----> Fetching custom git buildpack... done
-----> PieCrust app detected
-----> Bundling Apache version 2.2.22
-----> Bundling PHP version 5.3.10
-----> Bundling PieCrust version default
-----> Reading PieCrust Heroku settings
-----> Baking the site
[   171.7 ms] cleaned cache (reason: not valid anymore)
[    46.4 ms] 2012/12/03/hello-heroku
[    21.3 ms] [main page]
[     2.2 ms] css/simple.css
-------------------------
[   247.3 ms] done baking
-----> Discovering process types
       Procfile declares types    -> (none)
       Default types for PieCrust -> web
-----> Compiled slug size: 9.5MB
-----> Launching... done, v7
       http://mypiecrustblog.herokuapp.com deployed to Heroku

To git@heroku.com:mypiecrustblog.git
   1180f39..e70c271  master -> master

At this point, you should be able to browse your website on Heroku (http://mypiecrustblog.herokuapp.com in our case here).

You now just need to keep adding content, and git push to make it available online.

Appendix: The PieCrust buildpack

The PieCrust buildpack we’re using in this tutorial will, by default, bake your website and put all the generated static files in the www folder for the world to enjoy.

If, however, you set the heroku/build_type site configuration setting to dynamic, it will copy the PieCrust binary (a .phar archive) to your app’s folder and create a small bootstrap PHP script that will run PieCrust on each request. This makes deployments very fast, as you won’t have to wait for the website to re-bake, but it’s highly recommended that you use a good cache or reverse proxy for anything other than test websites.
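
That is, in your site configuration (using the same nested YAML form as the other settings):

heroku:
    build_type: dynamic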

Note that the version of PieCrust that’s used by the buildpack is, by default, the latest one from the development branch (default in Mercurial, master in Git). You can change that with the PIECRUST_VERSION environment variable. For example, to use the stable branch instead, you can do:

> heroku config:add PIECRUST_VERSION=stable

For more information about the buildpack, you can simply go check out the source code on GitHub.