The Stochastic Game

Ramblings of General Geekery

Actual Play: The Murderer Of Thomas Fell

To celebrate 2015 (happy new year!) I’m starting a new section on this website: actual play write-ups of RPG games. I got back into gaming a few months ago, GM’ing a few Call/Trail of Cthulhu games.

The first one is “The Murderer of Thomas Fell”, a simple, one-shot adventure that acted as an introductory adventure to my new group of players. If you’re the kind of person that reads actual-plays, or if you plan on running that adventure, you can head over here.


Gutentags for Vim

Autotags is my second “official” Vim plugin (after Lawrencium). It confirms a trend of having a terrible name (although this time for different reasons), but I’m open to changing it since it’s still early. And as that terrible name implies, this new plugin is all about automatically managing your tags.

Edit: thanks to Reddit, it was renamed to Gutentags! I edited this post after this point to use the updated name and links.

Classic Airline Baggage Tags

One of the biggest problems you face when using Vim with a large codebase – and one of the reasons most users still go back to an IDE for their day job – is that tags files in Vim suck. No wait, it’s not even that they suck, it’s that there’s nothing out of the box to help you with it. Which, well, sucks.

In case you don’t know, tags files are basically a reverse index of the symbols defined in a given codebase, as generated by an external tool like Ctags. This is what lets you put the cursor on a function call and jump to the definition of that function. It’s basic stuff that “just works” in an IDE[1], but in Vim you need to create, update, and otherwise manage that thing yourself. It’s insanely archaic even by Vim’s standards.

But there’s no reason it shouldn’t “just work” in Vim, and that’s why I wrote Gutentags. Head over to the official website to get started in less than a minute.

In case you’re wondering how this plugin is different from the many other similar plugins out there, or from just doing it the retarded way (i.e. run !ctags -R . every now and then), here it is:

  • No dependencies on anything other than Vim and Ctags: no Python, Ruby, or whatever.
  • Cross-platform: should work at least on Mac and Windows at the moment, and Linux should be fine too[2].
  • Automatically index new projects: when you open a file in a new project, Gutentags will start indexing it right away. You don’t need to manually run it if you don’t want to.
  • Incremental tags generation: when you edit and save a file, Gutentags will properly and automatically update the index, but only for that file. Re-generating the whole index obviously doesn’t scale for large codebases, yet that’s what most tutorials tell you to do! This is madness and it has to stop.
  • Background update: you shouldn’t have to wait while the index is (re)generated (which is what !ctags -R . does! Again, madness).
  • Keep tags files away: don’t like to see lots of tags files polluting your projects everywhere? Tired of adding tags to every .gitignore or .hgignore file ever? Me too. Gutentags lets you keep them in a hidden place of your own choosing.
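
To make the “reverse index” and “incremental update” ideas concrete, here is a minimal Python sketch. It is not how Gutentags is implemented (the plugin is Vimscript driving the ctags executable); it only assumes the usual tab-separated tags line layout of name, file, and address, and the function names are made up for illustration.

```python
from collections import defaultdict


def parse_tags(lines):
    """Build a symbol -> [(file, address)] index from ctags-style lines."""
    index = defaultdict(list)
    for line in lines:
        if line.startswith("!_TAG_"):  # pseudo-tags hold metadata, not symbols
            continue
        name, path, address = line.rstrip("\n").split("\t", 2)
        index[name].append((path, address))
    return dict(index)


def update_file(lines, path, new_lines):
    """Drop every entry pointing at `path`, then merge in its fresh entries."""
    header = [l for l in lines if l.startswith("!_TAG_")]
    kept = [
        l for l in lines
        if not l.startswith("!_TAG_") and l.split("\t", 2)[1] != path
    ]
    # Keep the body sorted by line so a sorted tags file stays sorted.
    return header + sorted(kept + list(new_lines))
```

Only the entries of the saved file are touched, which is why this kind of update stays cheap even when the whole index is large.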

At the time of writing this post, Gutentags has been tested on a glorious total of 3 machines (all my own with the same Vim configuration), so watch out for bugs, and please report them on GitHub or Bitbucket.

The usual disclaimers are in effect (this is a random piece of code you found on someone’s blog!), but I just want to warn you that since this plugin kicks off background ctags processes, there could be bugs that will generate humongous tags files while saturating your laptop’s CPU and ending up burning your balls and/or snatch. Again, report them via Github/Bitbucket after calling your local emergency dispatch centre.


  1. Except when it doesn’t. See also: Visual Studio’s Intellisense. ↩︎

  2. Yeah, I’m expecting many bug reports. ↩︎


New features in PieCrust 2

Now that you know about PieCrust 2 and you’ve upgraded your website, it’s time to look at the really new features. Today we’ll talk about the two that I think are most important: the new content model, and the new pagination model.

And yet another apple pie

(this post is going to be a bit long so here’s something to keep you hungry)

Sources, routes, and taxonomies

In PieCrust 1, like in most other static website generators, the way content was defined was quite rigid: you could have pages, and you could have blog posts. PieCrust did a few extra things, like letting you have multiple blogs, each with its collection of posts, but that was it.

The only way to have a specific set of pages, different from other pages, was to use page metadata and filtering (say, filter pages where type is recipe to get all recipes), but that didn’t translate to a good file-system organization, required remembering to tag things correctly, and required using the inverse filter to get the other pages. You also couldn’t have a different URL format for all recipe pages, as compared to normal pages.

Enter PieCrust 2, where all the content is, under the hood, defined with sources, routes, and taxonomies. Those are generated for you to something equivalent to PieCrust 1 content if you don’t define them, but you can override that for totally custom content.

Sources

Sources are where pages come from. Two source types you already know (if you used PieCrust 1 before) are:

  • the “simple” page source, where pages are found recursively under a given directory, and their relative path translates to their relative URL.
  • one of the “blog” page sources, where pages are found in a closely structured directory, and both the date and “slug” of the post are defined by the filename. So in the case of, say, the flat post source, all posts are files named YYYY-MM-DD_foo-bar.md directly under the posts/ directory.
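
As an illustration of how a flat post source can get both the date and the slug out of a filename, here is a small Python sketch. The regular expression and the function name are assumptions made for this example, not PieCrust’s actual code.

```python
import re
from datetime import date

# YYYY-MM-DD_slug.ext, as in 2014-11-23_foo-bar.md
POST_FILENAME = re.compile(r"^(\d{4})-(\d{2})-(\d{2})_(.+)\.(\w+)$")


def parse_post_filename(filename):
    """Return (date, slug) for a flat post filename, or None if it doesn't match."""
    match = POST_FILENAME.match(filename)
    if match is None:
        return None
    year, month, day, slug, _ext = match.groups()
    return date(int(year), int(month), int(day)), slug
```

A file that doesn’t follow the naming scheme simply isn’t recognized as a post here; a real source would probably want to warn about it instead.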

Because a site can have as many sources as you want, it already means you could create a “recipes” source, and put all the recipes in a different directory than the other pages, so that’s already nice.

But in the future there will also be more advanced sources. See, the “simple” page source just gives one piece of information about a page: its relative URL. But the blog post sources give more information, like the date of the article (“simple” pages need to specify it as part of their config header).

You could therefore imagine, say, a page source where each folder applies a tag to pages inside it. So if you created a page like recipes/pies/fruits/apple-pie.md, it would automatically have tags pies and fruits applied to it, as if you wrote tags: [pies, fruits] in its configuration header. Another useful source would be one that applies a hierarchical order to its pages, based on a filename prefix – this would be well suited to things like documentation.
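
The folder-to-tags idea could look something like the following Python sketch. Such a source doesn’t exist yet, so everything here (the function name, the source_root parameter) is hypothetical.

```python
from pathlib import PurePosixPath


def tags_from_path(page_path, source_root="recipes"):
    """Treat every folder between the source root and the file as a tag."""
    relative = PurePosixPath(page_path).relative_to(source_root)
    # Everything but the last part (the file itself) is a folder name.
    return list(relative.parts[:-1])
```

With this, recipes/pies/fruits/apple-pie.md yields the tags pies and fruits, and a page sitting directly in recipes/ gets no tags at all.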

Routes

Now that PieCrust knows where to find your content, routes define how it’s exposed – or parsed, if you were to run PieCrust as a lightweight CMS, or when running chef serve.

A route defines the shape of the URL of a page. If you’ve used PieCrust 1, you can think of it as a generalization of the post_url/tag_url/category_url settings.

At the moment, it can only use the same information as the one provided by the source (e.g. the year, month, day, and slug of a post for a blog post source), but in the future you’ll get to use all the other page metadata too (so that you can generate URLs that include categories or tags if you want).

Taxonomies

Another generalization from PieCrust 1 is taxonomies. Before, only categories and tags would have automatically generated listing pages. Now you can have whatever you want. You just need to specify if a taxonomy can have several terms applied to a page (like tags) or not, the name of the term listing page (like _tag.html and _category.html), and a few other optional things.

Putting it all together

Let’s say we want to have a section in our website where visitors can browse our favorite recipes. We want to put all recipe pages in a recipes/ directory (next to pages/ and posts/), be able to tag them by ingredients, with listings of recipes by ingredient being created automatically, and be able to tweak the URLs for all of this.

We’ll specify appropriate sources, routes, and taxonomies in the site configuration. Let’s start with just getting the recipes going:

site:
    sources:
        recipes: {}
    routes:
        - url: /recipe/%path%
          source: recipes

This will make PieCrust look for pages in the recipes/ directory, using the default page source (since we didn’t specify anything), i.e. the same as the one used for pages/. URLs that look like /recipe/foo/bar will match our new route, and a file named foo/bar.md will be loaded from the recipes/ directory in that case.
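
Here is a rough Python sketch of what matching a URL against such a route could involve. It is only an illustration under simplifying assumptions (tokens are written %name%, pages always end in .md); compile_route and resolve are made-up names, not PieCrust’s API.

```python
import re


def compile_route(url_pattern):
    """Turn a pattern such as /recipe/%path% into a regex with named groups."""
    regex = ""
    for i, part in enumerate(url_pattern.split("%")):
        if i % 2 == 0:
            regex += re.escape(part)      # literal text between tokens
        else:
            regex += "(?P<%s>.+)" % part  # %token% becomes a named group
    return re.compile("^" + regex + "$")


def resolve(url, route, source_dir, extension=".md"):
    """Map a URL to a file path inside the source directory, or None."""
    match = route.match(url)
    if match is None:
        return None
    return "%s/%s%s" % (source_dir, match.group("path"), extension)
```

So /recipe/foo/bar resolves to recipes/foo/bar.md, while a URL that doesn’t fit the pattern falls through to whatever other routes are defined.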

Now’s the time to add the “ingredients” taxonomy. This gets more complicated because we have several things to specify:

site:
    sources:
        recipes: {}
    taxonomies:
        ingredients:
            multiple: true
    routes:
        - url: /recipe/%path%
          source: recipes
        - url: /recipes/with/%ingredients%
          source: recipes
          taxonomy: ingredients

This does the following:

  • Add a new taxonomy named ingredients. It’s a multiple taxonomy, meaning that pages can have more than one ingredient assigned to them (this tells PieCrust it potentially has to generate listing pages for combinations of ingredients).
  • Add a new route for listing recipes by a given ingredient. Here, the %ingredients% token, along with the taxonomy: ingredients setting, let PieCrust know how to properly find and match content for this route.
  • When a listing page needs to be generated, PieCrust will look for a recipes/_ingredients.md page, passing whatever value was matched by %ingredients% to an ingredients template variable. This is analogous to how tag and category listing pages work in PieCrust 1.
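
Conceptually, generating those listings means grouping pages by each term they carry. The Python sketch below shows that grouping for single terms only (combinations are left out); the page dictionaries and function names are assumptions made for the example.

```python
from collections import defaultdict


def build_listings(pages, taxonomy="ingredients"):
    """Group page slugs by each term they carry for the given taxonomy."""
    listings = defaultdict(list)
    for page in pages:
        for term in page.get(taxonomy, []):
            listings[term].append(page["slug"])
    return dict(listings)


def listing_url(term, pattern="/recipes/with/%ingredients%"):
    """Fill the route token with a term to get the listing page URL."""
    return pattern.replace("%ingredients%", term)
```

Each key of the result corresponds to one listing page to generate, with the term passed to the template.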

Some other interesting facts:

  • You can list all recipes by starting with {% for recipe in recipes %}.... Sources have a page iterator exposed by default to a template variable of the same name as themselves.
  • You can create a new recipe page easily with chef prepare recipes foo-bar.

There are many advanced settings to change the behaviour of PieCrust, but they’re outside the scope of this already quite long blog post.

Pagination

Another big change in PieCrust 2 is how pagination is handled. In PieCrust 1, you could only paginate blog posts, but in PieCrust 2 you can paginate any list of items – pages or otherwise.

The new paginate filter lets you do that, by returning a Paginator instance, exactly like the existing pagination object. But where the pagination object returns something that lists the posts in the default blog source, the paginate filter will return something that lists whatever it was passed.

This is especially useful for galleries, as shown with my Meeting Notes doodles. This was done more or less like so:

{% set thumbs = assets|paginate(9) %}

{% for thumb in thumbs.items %}
<img src="{{thumb}}" alt="Note" />
{% endfor %}

[Older entries]({{ thumbs.prev_page }})
[Newer entries]({{ thumbs.next_page }})

Of course in reality there’s more fluff (CSS classes, etc.) and tests around using prev_page and next_page, but you get the idea. Just like in PieCrust 1, assets returns the list of page assets URLs (in this case, a whole bunch of pictures), and the paginate filter makes sure only 9 of them will be shown on a given page. It also tells PieCrust to generate sub-pages.
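
If you’re curious what a paginator has to keep track of, here is a simplified Python sketch. It is not PieCrust’s Paginator class: the constructor arguments and the sub-page URL scheme (/notes, /notes/2, /notes/3…) are assumptions for illustration.

```python
import math


class Paginator:
    """Expose one page worth of items plus links to neighbouring sub-pages."""

    def __init__(self, items, per_page, page_num=1, base_url="/notes"):
        self._all = list(items)
        self.per_page = per_page
        self.page_num = page_num
        self.base_url = base_url

    @property
    def total_pages(self):
        return max(1, math.ceil(len(self._all) / self.per_page))

    @property
    def items(self):
        start = (self.page_num - 1) * self.per_page
        return self._all[start:start + self.per_page]

    def _url(self, num):
        # Assumed URL scheme: the first sub-page lives at the base URL itself.
        return self.base_url if num == 1 else "%s/%d" % (self.base_url, num)

    @property
    def prev_page(self):
        return self._url(self.page_num - 1) if self.page_num > 1 else None

    @property
    def next_page(self):
        if self.page_num < self.total_pages:
            return self._url(self.page_num + 1)
        return None
```

The None values on the first and last sub-pages are what the template tests around prev_page and next_page would check for.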

Obviously, you can’t use 2 pagination sources on the same page – PieCrust wouldn’t know how to generate sub-pages that go in 2 different directions, so you’ll get an error if you try that.

Call for feedback

Please get in touch with me, or post comments here, if you have some constructive feedback about this new content model. PieCrust 2 is still in alpha, so there’s time to change the design without messing up every other PieCrust user.


Upgrading to PieCrust 2

The recently announced PieCrust 2 is all fine and dandy if you were to create a new website – the command line interface and user experience are essentially the same out of the box – but you will find that it can’t handle an existing PieCrust 1 website. This is because a few things have changed… luckily, the chef import command and this blog post will get you going in no time!

Installing PieCrust 2

First, get PieCrust 2 on your system:

  • Install Python 3.
  • Run pip install piecrust in a console.

If you want to install the bleeding edge (read “unstable”) version directly from BitBucket or GitHub, instead of the latest posted release from the Python package manager, you can do one of:

  • pip install hg+https://bitbucket.org/ludovicchabant/piecrust2#egg=PieCrust
  • pip install git+https://github.com/ludovicchabant/PieCrust2.git#egg=PieCrust

There are many other options if you want to install PieCrust for advanced scenarios. See the pip documentation.

Check that everything’s OK by running chef --version. At the time of writing, you should get something like 2.0.0-alpha2.

Note: If you still get a 1.x version, you probably have the PieCrust 1 directory showing up first in your PATH environment variable, so go change that (or delete PieCrust 1 altogether!).

Upgrading a PieCrust 1 website

The chef import command was previously used for importing content from other CMSes like WordPress. Now it can also import content from a PieCrust 1 website, and even upgrade it in place. Get in your website and try:

chef import piecrust1 --upgrade

You’ll notice that a lot of things got moved. The previous layout for a website looked like this:

+ root
|- _cache/
|- _content/
|     |- config.yml
|     |- pages/
|     |- posts/
|     |- templates/
|- css/
|- images/
|- lots/
|- of/
|- other/
|- crap/
|- whatever
|- .gitstuff

Now it looks like this:

+ root
|- _cache/
|- assets/
|     |- css/
|     |- images/
|     |- other-crap/
|- pages/
|- posts/
|- templates/
|- config.yml
|- whatever
|- .gitstuff

Basically, the _content directory is gone… everything got moved up one level, while all the asset files (CSS, images, fonts, etc.) got moved into an assets folder[1].

The benefits of this change are:

  • It looks better! There’s no more mix of different “things” at the root level (magic folders, assets, source-control files, miscellaneous things…). Instead, “things” are arranged in different folders that are almost self-explanatory (mostly because you can choose them!).
  • All the assets that should be processed and copied as part of the bake are in assets. Anything that’s only for development purposes (source control files, miscellaneous stuff) can be in the root directory, or in a different folder than assets, and they won’t be picked up by the bake. This prevents a lot of “oh shit” moments where you forgot to add something to the baker/skip_patterns config and a whole bunch of files are baked but you didn’t mean to.
  • It’s going to be easier to manage interoperability with other tools such as Grunt, by having such external tools operate on other top-level directories.

A few other things also changed, mainly because of the move from Twig to Jinja as the default templating engine. Although they’re very similar, they do have some differences in terms of built-in functions and filters.

  • An example is that Twig’s slice filter maps to an array notation in Jinja (so {{items|slice(3, 6)}}, which takes a start and a length, becomes {{items[3:9]}}), and Jinja’s slice does something completely different (although very useful)!
  • Another example is date formatting, which is different between PHP and Python.
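
To give an idea of the date formatting difference, here is a small Python sketch that translates a handful of PHP date() format characters to their strftime() equivalents. The mapping is deliberately partial, backslash escapes are not handled, and this is not the importer’s actual code.

```python
from datetime import datetime

# Partial mapping; PHP characters with no portable strftime equivalent
# (for example "j", the day without a leading zero) are left out on purpose.
PHP_TO_STRFTIME = {
    "Y": "%Y", "y": "%y", "m": "%m", "d": "%d",
    "H": "%H", "i": "%M", "s": "%S",
    "F": "%B", "M": "%b", "l": "%A", "D": "%a",
}


def convert_php_date_format(php_format):
    """Translate a PHP date() format string into a strftime() one."""
    out = []
    for char in php_format:
        if char in PHP_TO_STRFTIME:
            out.append(PHP_TO_STRFTIME[char])
        elif char.isalpha():
            # Unknown letter: better to fail loudly than to guess.
            raise ValueError("no known equivalent for %r" % char)
        else:
            # Punctuation passes through; a literal % must be doubled.
            out.append("%%" if char == "%" else char)
    return "".join(out)
```

Raising on unknown characters mirrors the “warn and provide guidance” approach: a silent wrong guess would be harder to spot than an explicit error.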

The importer will try to fix those things automatically for you (or at least warn you about it and provide guidance), but it’s probably going to miss a few ones since I only know about those I ran into while upgrading my own websites. Please report any such problems, thanks.

Unsupported features

PieCrust 2 is not quite feature complete compared to PieCrust 1 – I can’t reasonably wait until 100% of the feature set is implemented before getting it out there for feedback.

Here are things I know are missing:

  • Running as a CMS: there’s not much code needed for that, but there’s no WSGI application class yet.
  • A plugin API: not much code needed yet either, but yeah, you can’t at the moment drop anything in the plugins folder, it won’t get loaded.
  • Slugification of taxonomies: tags and categories containing non-ASCII characters will keep them for now. PieCrust 1 had options for transliterating them into their unaccented/ASCII counterparts.
  • Support for RSS/Atom feed scaffolding (chef prepare feed).
  • Mustache as an alternative template engine.
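
For the slugification item in the list above, this is roughly what transliterating a term to ASCII can look like in Python. It is a sketch of one common approach (Unicode decomposition, then dropping what isn’t ASCII), not what PieCrust 1 did or what PieCrust 2 will do; slugify_term is a made-up name.

```python
import re
import unicodedata


def slugify_term(term):
    """Strip accents, lowercase, and turn runs of other characters into dashes."""
    # NFKD splits "é" into "e" plus a combining accent, which is then dropped.
    decomposed = unicodedata.normalize("NFKD", term)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", "-", ascii_only.lower()).strip("-")
```

Characters that have no decomposition (like “ø”) are simply removed by this approach, so a real implementation would likely need a proper transliteration table.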

There are also probably things I don’t know are missing, so make sure you ping me if something you care about is not on this list.

In the next blog post, we’ll finally take a look at the new features in PieCrust 2, including the completely new underlying system for specifying pages, taxonomies, and URLs.


  1. Don’t worry, it’s configurable. You can put your asset files in a
    different folder, or even in multiple folders. ↩︎


Announcing PieCrust 2

I’ve been busy on it for longer than I expected – neglecting the freshly announced Wikked along with several pull requests on PieCrust – but I believe it’s at last ready for a public alpha release: PieCrust 2 is here!

073/365 - Pi Day Pies, 2012

WARNING: before you go clone the new repository, be aware that, at the time of writing this, it has been tested on a glorious total of 2 machines (both my own), and 2 websites (both my own as well). So don’t use it in production, but please do give it a try and post bug reports, thanks!

This post is a short overview of the reason for going a full major version number up, and of the new things you can expect to find. There will be other posts in the following days about breaking changes and upgrade paths, and a more in-depth look at the new features.

Bye bye, PHP

To say that this is a major rewrite of PieCrust would be an understatement: I moved the project over to Python, which means it’s a 100% rewrite. This may upset some users who only know PHP (or at least don’t know and/or like Python)… but this is for the best, I assure you.

First, one of the design principles of PieCrust was always to look language-agnostic. Unlike many other static website generators out there, there’s no “leak” between the underlying implementation of PieCrust and the user experience, i.e. you’re not exposed to PHP-isms at any time while using it[1]. This makes it easy to change the platform on which it runs without you being affected much.

Second, the reason I picked PHP for the first implementation of PieCrust, more than 3 and a half years ago, is that it felt to me it was the lowest barrier of entry for potential users. Other static website generators embrace their hacker roots, but I wanted something simple enough that any WordPress user would be able to pick it up and try it. Nowadays, people are a lot more used to installing various things to tinker with – Git, Node, Ruby, whatever. The barrier of entry doesn’t seem to be so much at the platform level.

Those two reasons meant I could look at other platforms and figure out which one has what it takes for what I have in mind for the future of PieCrust.

Packaging and distribution

One thing that quickly became annoying in PieCrust 1 was package management (both PieCrust itself and its dependencies) and distribution (how people get PieCrust on their machines). Composer has been an incredible improvement over the venerable PEAR, but both are still a few extra steps away and more complicated than they should be.

Comparatively, gem, npm, and pip are a lot simpler and, better yet, come by default with Ruby, Node, and Python 3 respectively. Getting rid of my custom installer was an appealing thought. The PieCrust 2 install instructions would basically amount to:

  1. Install Python 3
  2. Run pip install piecrust

That’s much better, especially when you think that the upgrade path and hosting are all taken care of for me.

Performance

But it was performance that was the major reason I switched development platforms.

The problem was not that PHP itself was not fast enough – it’s actually doing OK in the overall category of interpreted languages. The problem is that, because it’s got so much usage as a web programming language, it’s lacking a lot of features as a scripting language. One of those features is an API for multi-threading[2].

To be honest, I should have thought, back when I started PieCrust, that I would eventually need parallel processing… but it’s never too late to change direction, which is what I’m doing with PieCrust 2.

To give you an idea of how much this impacts performance, here’s a little graph. It shows the time it takes to bake my blog (the one you’re reading now!) using Octopress (the most popular static website generator around), PieCrust 1, and the newly written PieCrust 2. Obviously, shorter is better.

Octopress takes around 21 seconds[3], PieCrust 1 takes around 11 seconds[4], and PieCrust 2 takes around 6.5 seconds! And that’s even before I’ve made any optimization pass specific to this new codebase[5]!

So yes, there is quite a substantial gain after the move to Python and parallel baking already. And that’s even before I can get into other improvements… for example, chef serve will be able to start a background thread to monitor changes to static assets on the file-system instead of checking for them when HTTP requests come in. This should make the preview server much snappier when you’re refreshing a page that has several images or CSS sheets.
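
For readers wondering what “parallel baking” means in practice, here is a bare-bones Python sketch using the standard library’s thread pool. It is not PieCrust 2’s baker: bake_page is a stand-in for real rendering, and the page dictionaries are invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def bake_page(page):
    """Stand-in for rendering one page; returns (output path, HTML)."""
    html = "<h1>%s</h1>" % page["title"]
    return page["slug"] + "/index.html", html


def bake_site(pages, workers=4):
    """Render pages concurrently; results come back in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(bake_page, pages))
```

Threads in CPython mostly help with I/O-bound work such as reading sources and writing output files; for CPU-heavy rendering, swapping in ProcessPoolExecutor (same interface) would be the thing to try.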

Next post we’ll look at how you can upgrade your existing PieCrust 1 website to version 2, since I took the opportunity of a major version bump to clean up a few things I didn’t like anymore.


  1. Except for date formats. Sadly, date formats are very much tied to the
    underlying framework – unless you implement your own wrapper syntax – and this
    is one annoying breaking change when upgrading to 2.0. I’m open to ideas to fix
    that of course! ↩︎

  2. There are a couple extensions available to fill the gap, but they’re just
    terrible. ↩︎

  3. And that’s only for the posts on this blog, along with the tag pages, with
    simplified markup… my crude interop script strips out code highlighting blocks
    and other expressions (e.g. {{foo}} expressions). I’m
    expecting the real thing would take an additional second or two. ↩︎

  4. See how little it matters whether PHP sucks more or less than Ruby? Design
    and implementation are a lot more important for the big gains. ↩︎

  5. Most optimizations from PieCrust 1 were ported over to the Python codebase
    already. ↩︎


DRM-free backup on Comixology

Me, a few months ago after the “scandal” of Comixology removing the ability to buy comics directly from inside their iOS app:

I would hope ComiXology manages to revert the change, but frankly I’d rather put my hopes in more DRM-free comics available directly from the creators and publishers instead.

Well my hopes have been answered in a way: Comixology announced last week that you would be able to download DRM-free versions of your Comixology books for publishers who are OK with that:

The first wave of participating publishers making their books available as DRM-free backups include Image Comics, Dynamite Entertainment, Zenescope Entertainment, MonkeyBrain Comics, Thrillbent, and Top Shelf Productions. In addition, creators and publishers that are self-publishing through comiXology Submit are now able to choose to make their books available with a DRM-free backup.

No surprises here about the publishers who are indeed “OK with that”, since they’re the ones who were already offering DRM-free comics on their own website… but this is excellent news. I can’t stress enough how huge this is.

I’m not sure whose idea it was – whether publishers like Image pressured Comixology to do this, or whether Comixology came to this logical conclusion on their own – but I’m very happy either way. As I said before, I had completely stopped buying Image comics from Comixology, preferring instead their own DRM-free website… but that website was slow as hell and barely usable. Ideally I’d rather give 100% of my money to Image, instead of – probably – 70% through Comixology, but the usability is night and day between the two, and uploading independently acquired files to an iPad is still a huge pain in the ass[1].

“For those out there who have not joined the comic reading community because of DRM – you have no excuse now,” said co-founder and Director of ComiXology Submit John D. Roberts

Indeed.

The only problem I’ve found so far is that those backups are extremely bare: just a ZIP file with the pages as JPEG images. They’re the “retina” hi-res versions, so that’s good, but the archive is missing any kind of metadata. The only way to know what it is, short of having a human open it and read the cover, is to parse the file name.


  1. Something I’m hoping will be greatly improved in iOS 8. ↩︎


ComiXology Scandal

Unless you’ve been living under a rock, you can’t have missed the news that ComiXology released a new version of their mobile app that drastically changes how comics are purchased. It was reported on technology, gadget, Apple-related, and of course comicbook-related websites. It was even discussed heavily on RPG forums.

A summary of the situation is that:

  • The iPad/iPhone app doesn’t have in-app purchases anymore – you’re forced to buy directly from the web by switching to Safari.
  • The Android app still has in-app purchases, but as I understand it they don’t go through Google Play anymore and, instead, directly hit ComiXology’s servers.

Of course, the internet being what it is, a lot of people are pissed off and are voicing their rage on social networks. I’m not happy with the change either but I’m going to try and articulate my more moderate opinion in a few points here.

It’s probably too soon

The comics industry was in very, very bad shape until recently. Digital comics revived a moribund market in a completely unprecedented way, largely thanks to ComiXology on the iPad. Digital comics let new readers discover series at their own pace without having to enter an intimidating comicbook shop and browse through stacks of TPBs to find the first story arc. When everything is just a tap away, especially single $2 issues instead of $15 collected volumes, it’s much easier to try things and, eventually, start following one of them. Impulse buying was a big part of ComiXology’s success and the market’s recovery.

But I’m not sure the market has recovered enough at this point. Adding several extra steps between the reader and a purchase may discourage a big percentage of users who are still casual readers and not “fans” yet, and effectively put a stop to the momentum accumulated over the past couple years.

Profit trumps user experience

It is clear now that this change was made to align with Amazon’s strategy after they were acquired. Amazon is a company that has always walked an extremely fine line of near-zero margins in almost all aspects of their business.

So it’s not surprising that they’re first doing to ComiXology what they did with the Kindle app: avoid the 30% tax that Apple and Google have on their in-app purchasing systems. And it makes sense to do so when you already have your own micro-transaction infrastructure in place, which is the case with Amazon.

The problem is that Apple completely forbids developers from using their own system… and in this case, Amazon chooses their margin over their users’ experience.

That’s extremely disappointing but, again, not surprising coming from Amazon.

Glossing over details

Another disappointing aspect of this whole affair was how unclear the announcement was. This is the email I received:

Dear Comics Enthusiast,

We have introduced a new comiXology iPhone and iPad Comics app, and we are retiring the old one. All your purchased books will be readable in the new app once you’ve downloaded it and taken the following steps:

  • In the original Comics app, log into your comiXology account.
  • Sync your in-app purchases to your comiXology account by tapping the Restore button on the Purchases tab.
  • Download the new comiXology app. This will be your new home for downloading and reading comics.
  • Start shopping on comixology.com. New purchases will appear in the “In Cloud” tab in our new app.

Read this a couple times and tell me if you would have understood what it was all about. It says there’s a new app, but it never says why. Why are they switching to a new app instead of just updating the same one? And where does it say you won’t be able to purchase directly from the app anymore?

This is unacceptably bad communication.

It’s unclear where the money goes

And what happens with that 30% that ComiXology is going to save on each transaction? It’s totally unclear whether this will be redistributed in any way to the creators and publishers.

It would have been extremely easy for ComiXology to mention that more money will go to creators in order to get all fans behind the change. Instead, we’re left to assume this all goes into Jeff Bezos’ pockets… probably because that’s exactly what will happen.

Not really a change for me

That said, since the beginning of the platform, I’ve been buying comics directly on comixology.com in the hope that this meant more money for the creators… and if not, at least I was giving more money to a small but growing company that was making the industry better. So the new iPad app is effectively not changing anything as far as I’m concerned… except that now I’m not sure where this extra money goes anymore.

Pricing and delayed releases

Some people have mentioned that moving to a true web store will let ComiXology and publishers set a finer pricing scale, i.e. comics sold at, say, $1.50 (Apple enforces price points of $0.99, $1.99, $2.99, and so on). This may prove beneficial, but given that it’s Amazon we’re talking about, it may prove to be another opportunity to put pressure on publishers’ margins.

It will also remove the occasional hassle of issues being delayed, or even blocked, by Apple’s crazy stupid approval process because – shocking! – some of them contain adult material. But then again, it was easy enough to switch to the web store for only those rare issues.

ComiXology is becoming obsolete anyway

Another reason I’m less annoyed by this change is that ComiXology was having a decreasing presence in my reading habits anyway. Image Comics has been offering DRM-free comics for a while now, so I effectively stopped buying anything from Image in ComiXology. Most Marvel titles I don’t really need to own so I’m reading them through Marvel Unlimited. This leaves DC/Vertigo titles and indie comics, and those are increasingly purchasable directly from the author…

Conclusion

So all in all, I don’t care that much about the change from a personal user experience point of view, but it does make me worried about the future of the industry. It also doesn’t shine a good light on Amazon – although I guess that’s the least of their worries.

I would hope ComiXology manages to revert the change, but frankly I’d rather put my hopes in more DRM-free comics available directly from the creators and publishers instead.


Meeting Notes

These past couple years my free time has been consumed by work on PieCrust, Wikked, and, oh, yeah, having 2 kids and 2 cats (what was I thinking, I don’t know). As a result, I haven’t been playing music or drawing much, which I miss a lot.

So I started doing it at work. Well, not playing music, because a drumset in the middle of the open-space would probably be frowned upon, but drawing and doodling.

The result is a whole bunch of post-it notes with some pretty decent art, which I’ve collected over on a “Meeting Notes” page. Check it out!


Wikked Performance

Since I announced Wikked here, I’ve been mostly working on fixing bugs, editing the documentation[1], and evaluating its performance – which is what we’ll look at here today.

The big question I wanted to answer was how far you can go with just the default configuration, which is based on SQLite and requires no setup from the user. The reason for this was twofold:

  • I needed to write some advice in the documentation about when you should start looking into more sophisticated setups.
  • I plan to set up a public test wiki where people can try Wikked directly, and I needed to know if it would go down after I post the link on Reddit or Hacker News.

Initial assessment

The first thing I did was to figure out the current status of the code. For this, I took the first stress-test service I could find (which was Load Impact), and got my own private wiki tested.

  • This private wiki runs on the same server as this blog, which is a fairly under-powered server since almost all of my public websites are just static files, thanks to PieCrust: it’s a Linode VPS with only 512MB of RAM.
  • The test requests a dozen different pages from the website, continually for around 10 seconds, with only a fraction of a second between each request. It increases the number of “users” running that test over time.
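
For the curious, here is a rough sketch of what a ramp-up test like that does, written with nothing but the Python standard library. It is not how Load Impact works internally; the function names, the pause between requests, and the user counts are all made up for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import mean
from urllib.request import urlopen


def fetch(url):
    """Load one page and return how long it took, in seconds."""
    start = time.perf_counter()
    with urlopen(url) as response:
        response.read()
    return time.perf_counter() - start


def run_user(urls, duration, fetch_page=fetch, pause=0.1):
    """Simulate one user cycling through `urls` for about `duration` seconds."""
    timings = []
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        for url in urls:
            timings.append(fetch_page(url))
            time.sleep(pause)  # a fraction of a second between requests
    return timings


def ramp_up(urls, user_counts, duration, fetch_page=fetch, pause=0.1):
    """Return {number of users: average load time} for each step of the ramp."""
    results = {}
    for users in user_counts:
        # One thread per simulated user, all hitting the site at once.
        with ThreadPoolExecutor(max_workers=users) as pool:
            futures = [pool.submit(run_user, urls, duration, fetch_page, pause)
                       for _ in range(users)]
            timings = [t for future in futures for t in future.result()]
        results[users] = mean(timings)
    return results
```

Something like `ramp_up(["http://localhost:5000/"], [5, 10, 20], duration=10)` would then give one average per step (the URL is a placeholder, and this should only ever be pointed at a server you own). The `fetch_page` parameter is there so the timing logic can be tried with a fake fetcher before hitting a real site.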

Here are some of the results:

As you can see, as the number of concurrent users increases, page loads stay under a second, at 800ms on average. Then, at around 20 concurrent users, things break down horribly and it can take between 3 and 10 seconds to load a page.

For a website running with SQLite on a server so small that Linode doesn’t even offer it anymore[2], and designed mainly for private use, I think it’s pretty good. I mean, I initially didn’t plan for Wikked to run for groups larger than 10 or 15 people, let alone 20 people at the same time!

Still, I can obviously do better.

Request profiling

Werkzeug supports easy profiling of requests, so I added an option for that and looked at the output in QCacheGrind[3]. As I suspected, pretty much all the time is spent running the SQL query to get the cached page, so there’s little opportunity to optimize the overall application’s Python code.
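
If you have never seen request profiling, the idea fits in a few lines. The sketch below is not Wikked's code and not Werkzeug's either (Werkzeug ships a real `ProfilerMiddleware` for this); it is a stdlib-only stand-in that wraps any WSGI app with `cProfile`. The `profiled` and `hello_app` names are hypothetical.

```python
import cProfile
import io
import pstats


def profiled(app, stream, limit=10):
    """Wrap a WSGI app so every request writes its slowest calls to `stream`."""
    def wrapper(environ, start_response):
        profiler = cProfile.Profile()
        # Consume the response inside the profiler so lazy work is measured too.
        body = profiler.runcall(lambda: list(app(environ, start_response)))
        stats = pstats.Stats(profiler, stream=stream)
        stats.sort_stats("cumulative").print_stats(limit)
        return body
    return wrapper


def hello_app(environ, start_response):
    """Hypothetical stand-in for the real application."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


# Try it on one fake request, capturing the report in memory.
report = io.StringIO()
app = profiled(hello_app, report)
body = app({"PATH_INFO": "/"}, lambda status, headers: None)
```

After that, `report.getvalue()` holds the usual `pstats` table, sorted by cumulative time, which is where a slow SQL call would float to the top.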

In Wikked, SQL queries are done through SQLAlchemy. This is because even though those queries are simple enough that even I could write them by hand, there are subtle differences in SQL dialects depending on the database implementation, especially when it comes to schema creation. I figured I could always bypass the ORM layer later if I needed to.

SQLAlchemy can be forced to log all SQL queries it generates, and that highlighted many simple problems. I won’t go into details but it boiled down to:

  • A couple of unnecessary extra queries, which came from my object model lazily loading stuff from the database when it didn’t need to.
  • Loading more columns than needed for the most common use-case of reading a page. Some of them would generate JOIN statements, too.
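
To make those two problems concrete, here is a small stand-in using the stdlib `sqlite3` module rather than SQLAlchemy (where the equivalent log comes from passing `echo=True` to `create_engine`). The `pages` and `links` tables and both functions are invented for the example and are not Wikked's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pages (id INTEGER PRIMARY KEY, url TEXT, title TEXT, html TEXT);
    CREATE TABLE links (source_id INTEGER, target_url TEXT);
    INSERT INTO pages (url, title, html) VALUES ('/home', 'Home', '<p>Hi</p>');
    INSERT INTO links VALUES (1, '/about');
""")

# Record every statement from here on, the way an ORM's query log would.
queries = []
conn.set_trace_callback(queries.append)


def read_page_naive(url):
    """Loads every column, then fetches links that nobody asked for."""
    page = conn.execute("SELECT * FROM pages WHERE url = ?", (url,)).fetchone()
    conn.execute("SELECT target_url FROM links WHERE source_id = ?",
                 (page[0],)).fetchall()
    return page[3]


def read_page_lean(url):
    """Loads only the column needed to display the page."""
    row = conn.execute("SELECT html FROM pages WHERE url = ?", (url,)).fetchone()
    return row[0]
```

Both functions return the same HTML, but the naive one leaves two statements in `queries` and the lean one leaves a single, narrower one. Reading the log is how that kind of waste becomes visible.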

I also realized I was doing my main query against an un-indexed column, so I changed the schema accordingly… derp duh derp (I’m a n00b at this stuff).
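
Here is a quick way to see what an index changes, again with the stdlib `sqlite3` module and a made-up `pages` table (the column and index names are assumptions, not Wikked's schema). SQLite's `EXPLAIN QUERY PLAN` tells you whether a query scans the whole table or uses an index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (id INTEGER PRIMARY KEY, url TEXT, html TEXT)")
conn.executemany("INSERT INTO pages (url, html) VALUES (?, ?)",
                 [(f"/page-{i}", "<p>some text</p>") for i in range(1000)])


def query_plan(sql, params=()):
    """Return SQLite's description of how it would run `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of each row.
    return " / ".join(row[-1] for row in rows)


lookup = "SELECT html FROM pages WHERE url = ?"
plan_before = query_plan(lookup, ("/page-500",))  # full scan of the table

conn.execute("CREATE INDEX idx_pages_url ON pages (url)")
plan_after = query_plan(lookup, ("/page-500",))   # search through the index
```

The exact wording of the plan varies between SQLite versions, but `plan_before` mentions a scan and `plan_after` names `idx_pages_url`.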

FunkLoad

Now I was ready to run some more stress tests and see if those optimizations made a difference. But although Load Impact is a very cool service, it’s also a commercial service and I was running out of free tests. I didn’t want to spend money on this, since this is all just hobby stuff, so I looked for an alternative I could set up myself.

I found a pretty neat library called FunkLoad, which does functional and load testing. Perfect!

I started 4 Amazon EC2 instances, wrote an equivalent test script, and ran the test. To make it work, I had to install FunkLoad from source (as opposed to from pip), and troubleshoot some problems, but it worked OK in the end.

Without my optimizations, I got slightly better average page loads than before – probably coming from the fact that both my EC2 instances and my Linode server were on the west coast, whereas Load Impact was running from the east coast.

With the optimizations, however, it looked a lot better:

As you can see, Wikked on my small server can now serve 40 concurrent users without breaking a sweat: 300ms on average, and always less than 1s. And it could probably handle up to 50 or 60 concurrent users if you extrapolate the data a bit.

Moar hardware!

Next, I figured I would try to see if it made any difference to run the same setup (Wikked on SQLite) on a beefier server. I launched an EC2 instance that’s way better than my Linode VPS, with 3GB of RAM and 2 vCPUs.

Well: yes, it does make a difference. This bigger server can serve 80 concurrent users while staying under the 1 second mark most of the time. Yay!

Conclusion

Those numbers may not seem like much but this is as good a time as any to remind you that:

  • I’m sticking to sub-1s times as the limit, because I like fast websites. But I could easily move the limit up to 1.5 seconds and still be within a generally acceptable range (e.g. from my home laptop, Wikipedia serves its pages in around 1.3 seconds).
  • This is about testing the simplest Wikked setup, based on SQLite, because that means the easiest install experience ever compared to other wikis that need a proper SQL server. And SQLite is notoriously limited in terms of concurrent access.
  • Serving even just 40 concurrent users is actually quite a lot. If you consider, say, 10 minutes per visit on average, that’s around 240 visitors per hour, or 1,920 visitors per day if they mostly come from the same time zone and so show up within the same 8 hours or so. That’s more than 50,000 visitors a month[4].
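
The back-of-the-envelope math in that last point can be spelled out as follows. The 8 busy hours a day and the 30-day month are assumptions made here; they are what turns 240 visitors an hour into the daily and monthly figures above.

```python
def visitors_per_hour(concurrent_users, minutes_per_visit):
    """Visitors an hour if `concurrent_users` slots are always occupied."""
    return concurrent_users * 60 / minutes_per_visit


per_hour = visitors_per_hour(40, 10)  # 240.0
per_day = per_hour * 8                # 1920.0, assuming 8 busy hours a day
per_month = per_day * 30              # 57600.0, assuming a 30-day month
```

This treats the server as fully booked during those hours, so it is an upper bound on steady traffic rather than a prediction.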

Still, this is my first real web application, so there’s probably even more room for improvement. I’m always open to suggestions and constructive criticism, so check out the code and see if you can spot anything stupid!

In the meantime, I’ve got some documentation to update, and a public test wiki to set up!


  1. It’s still missing a custom theme and a fancy logo, by the way. That will be coming as soon as I have any actual idea of what to do there! ↩︎

  2. That’s a referral link, by the way. ↩︎

  3. It’s not a typo. QCacheGrind is a Qt version of KCacheGrind, so that you don’t need to install KDE libraries, and it looks slightly less terrible. ↩︎

  4. The real issue is however how your site will behave if all of a sudden a lot of those visitors arrive at the same time. This is probably not uncommon if you have the kind of wiki where there can be announcements posted to a mailing list or a Facebook group, which can in turn get a lot of members to click the same link. ↩︎


Announcing Wikked

There haven’t been any updates on this blog for a few months, and there was a good reason for that: I was working on something new.

The problem is that I was trying to get this new project to a “good enough” state to launch publicly… but somehow I ended up in a seemingly infinite loop of improvements, refactorings, and bug fixing.

Eventually I snapped out of it: fuck it, let’s launch it as is, and see if anybody cares enough to complain that it’s not good enough. I wrote some basic documentation, fought with setuptools for packaging, and uploaded it to the Python package server.

Wikked

So lo and behold, here is Wikked, a wiki engine entirely managed with text files sitting in a revision control system.

I think it’s pretty cool, so come read more about it after the break!

Quickstart

You’re too lazy to follow the link to the documentation? Here’s your quick start:

pip install wikked
wk init mywiki
cd mywiki
wk runserver

Text files again?

Yes, this is “Part 2” of my personal crusade to both learn about web technologies and have all my data in text files inside Mercurial or Git. I find it so much easier to manage and back up than some piece of data trapped in an SQL database or something.

It’s obviously not a magic bullet – for one, it doesn’t scale well – but for personal websites I find that it’s perfect.

What’s next

The plan for Wikked is to stabilize it, of course: fix any bugs, make it easier to deploy, make it more configurable. I also expect to have to add proper support for Git, as right now only Mercurial is fully supported for storing page revisions.

Then, it needs a demo website. There’s one already, actually, but I need to make it a bit more solid, for instance with a cron job that resets it to its original state every night.

Last, I want to get some proper feedback about the Wiki Syntax. It was mostly thrown together as I found I needed something for my own wiki, but I’m still not 100% happy about it.

Fly away, monkeys!

That’s it for now. Be sure to send me some feedback, and to report bugs. Especially the part about reporting bugs, because this thing has never seen any other computer than my laptop and my VPS, so it’s pretty much the mother of all “works on my machine”.

Enjoy! 🙂