Wikked Performance

posted on 13 April 2014 00:00 in Wikked

Since I announced Wikked here, I’ve been mostly working on fixing bugs, editing the documentation¹, and evaluating its performance – which is what we’ll look at here today.

The big question I wanted to answer was how far you can go with just the default configuration, which is based on SQLite and requires no setup from the user. The reason for this was twofold:

I needed to write some advice in the documentation about when you should start looking into more sophisticated setups.
I plan to set up a public test wiki where people can try Wikked directly, and I needed to know if it would go down after I post the link on Reddit or Hacker News.


Initial assessment

The first thing I did was to figure out the current status of the code. For this, I took the first stress-test service I could find (which was Load Impact), and got my own private wiki tested.

This private wiki runs on the same server as this blog, which is a fairly under-powered server since almost all of my public websites are just static files, thanks to PieCrust: it’s a Linode VPS with only 512 MB of RAM.
The test requests a dozen different pages from the website, continually for around 10 seconds, with only a fraction of a second between each request. It increases the number of “users” running that test over time.
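To make the shape of that test concrete, here is a minimal sketch of the same idea using only Python's standard library. It is not how Load Impact works internally; the names `run_user`, `ramp_up`, and `fake_fetch` are made up for illustration, and the fake fetch function just sleeps instead of hitting a real wiki.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def run_user(fetch, pages, duration, pause):
    """Simulate one user: request pages in a loop until `duration` elapses."""
    timings = []
    deadline = time.monotonic() + duration
    i = 0
    while time.monotonic() < deadline:
        start = time.monotonic()
        fetch(pages[i % len(pages)])
        timings.append(time.monotonic() - start)
        i += 1
        time.sleep(pause)  # the "fraction of a second" between requests
    return timings


def ramp_up(fetch, pages, user_counts, duration=10.0, pause=0.2):
    """Run the scenario at each concurrency level; return {users: avg seconds}."""
    results = {}
    for users in user_counts:
        with ThreadPoolExecutor(max_workers=users) as pool:
            futures = [pool.submit(run_user, fetch, pages, duration, pause)
                       for _ in range(users)]
            timings = [t for f in futures for t in f.result()]
        results[users] = mean(timings)
    return results


def fake_fetch(url):
    """Hypothetical stand-in for an HTTP GET; replace with a real request."""
    time.sleep(0.01)


if __name__ == "__main__":
    pages = ["/page-%d" % i for i in range(12)]  # "a dozen different pages"
    for users, avg in ramp_up(fake_fetch, pages, [1, 5, 10], duration=0.2).items():
        print(users, "users:", round(avg * 1000), "ms on average")
```

Swapping `fake_fetch` for a function that actually requests a URL turns this into a crude load test, although a real tool also reports percentiles and errors, not just the average.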

Here are some of the results:

As you can see, as the number of concurrent users increases, the average page load time stays under a second, at around 800ms. Then, at around 20 concurrent users, things break down horribly and it can take between 3 and 10 seconds to load a page.

For a website running with SQLite on a server so small that Linode doesn’t even offer it anymore², and designed mainly for private use, I think it’s pretty good. I mean, I initially didn’t plan for Wikked to run for groups larger than 10 or 15 people, let alone 20 people at the same time!

Still, I can obviously do better.

Request profiling

Werkzeug supports easy profiling of requests, so I added an option for that and looked at the output in QCacheGrind³. As I thought, pretty much all the time is spent running the SQL query to get the cached page, so there’s little opportunity to optimize the overall application’s Python code.
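For readers who have never profiled a web request, the sketch below shows the general technique with only the standard library: a WSGI middleware that runs each request under `cProfile`. Werkzeug ships its own `ProfilerMiddleware` that does this job properly; the `ProfilingMiddleware` class and `hello_app` here are hypothetical stand-ins, not Wikked's or Werkzeug's code.

```python
import cProfile
import io
import pstats
from wsgiref.util import setup_testing_defaults


def hello_app(environ, start_response):
    """Hypothetical stand-in for the wiki's WSGI application."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


class ProfilingMiddleware:
    """Profile each request and write the most expensive calls to a stream."""

    def __init__(self, app, stream, limit=10):
        self.app = app
        self.stream = stream
        self.limit = limit

    def __call__(self, environ, start_response):
        profiler = cProfile.Profile()
        # Consume the response inside the profiled call so that lazily
        # generated bodies are measured too.
        body = profiler.runcall(
            lambda: list(self.app(environ, start_response)))
        stats = pstats.Stats(profiler, stream=self.stream)
        stats.sort_stats("cumulative").print_stats(self.limit)
        return body


# Fake a single request without starting a server.
report = io.StringIO()
app = ProfilingMiddleware(hello_app, report)
environ = {}
setup_testing_defaults(environ)
body = app(environ, lambda status, headers: None)
print(report.getvalue())
```

Sorting by cumulative time is what makes a dominant call, such as a slow SQL query, float to the top of the report.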

In Wikked, SQL queries are done through SQLAlchemy. This is because even though those queries are simple enough that even I could write them by hand, there are subtle differences in SQL dialects depending on the database implementation, especially when it comes to schema creation. I figured I would bypass the ORM layer if I need to in the future.

SQLAlchemy can be forced to log all SQL queries it generates, and that highlighted many simple problems. I won’t go into details but it boiled down to:

A couple of unnecessary extra queries, which came from my object model lazily loading stuff from the database when it didn’t need to.
Loading more columns than needed for the most common use-case of reading a page. Some of them would generate JOIN statements, too.
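Both problems are easy to illustrate outside of SQLAlchemy. With SQLAlchemy itself, passing `echo=True` to `create_engine` is one way to get the query log. The sketch below uses the standard `sqlite3` module instead, with a trace callback playing the role of the log. The schema and the `read_page_wasteful` / `read_page_lean` functions are invented for the example and are not Wikked's actual model.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pages (id INTEGER PRIMARY KEY, url TEXT, title TEXT,
                        raw_text TEXT, formatted_text TEXT);
    CREATE TABLE links (page_id INTEGER, target_url TEXT);
    INSERT INTO pages VALUES (1, '/main', 'Main', '# raw', '<h1>html</h1>');
    INSERT INTO links VALUES (1, '/other');
""")

queries = []
conn.set_trace_callback(queries.append)  # record every statement SQLite runs


def read_page_wasteful(url):
    """Fetch every column, plus related rows the reader never uses."""
    row = conn.execute("SELECT * FROM pages WHERE url = ?", (url,)).fetchone()
    conn.execute("SELECT target_url FROM links WHERE page_id = ?",
                 (row[0],)).fetchall()
    return row[2], row[4]


def read_page_lean(url):
    """Fetch only what is needed to display the page, in one query."""
    return conn.execute(
        "SELECT title, formatted_text FROM pages WHERE url = ?",
        (url,)).fetchone()


read_page_wasteful("/main")
wasteful_count = len(queries)
queries.clear()

read_page_lean("/main")
lean_count = len(queries)

print("wasteful:", wasteful_count, "queries; lean:", lean_count, "queries")
```

Reading the log is usually enough to spot this kind of waste: the same page view should not need more statements, or more columns, than the page actually displays.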

I also realized I was doing my main query against an un-indexed column, so I changed the schema accordingly… derp duh derp (I’m a n00b at this stuff).
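If you want to check whether you are making the same mistake, SQLite can tell you how it plans to run a query. Here is a small sketch with the standard `sqlite3` module; the `pages` table, its `url` column, and the `idx_pages_url` index name are assumptions for the example, not Wikked's real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (id INTEGER PRIMARY KEY, url TEXT, text TEXT)")
conn.executemany("INSERT INTO pages (url, text) VALUES (?, ?)",
                 [("/page-%d" % i, "...") for i in range(1000)])


def query_plan(sql, params=()):
    """Return SQLite's description of how it would execute `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column is the detail text


lookup = "SELECT text FROM pages WHERE url = ?"

plan_before = query_plan(lookup, ("/page-42",))
conn.execute("CREATE INDEX idx_pages_url ON pages (url)")
plan_after = query_plan(lookup, ("/page-42",))

print("before:", plan_before)
print("after: ", plan_after)
```

Before the index exists, the plan describes a scan of the whole table; afterwards it mentions the index by name, which means SQLite can jump straight to the matching row.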

FunkLoad

Now I was ready to run some more stress tests and see if those optimizations made a difference. But although Load Impact is a very cool service, it’s also a commercial service and I was running out of free tests. I didn’t want to spend money on this, since this is all just hobby stuff, so I looked for an alternative I could set up myself.

I found a pretty neat library called FunkLoad, which does functional and load testing. Perfect!

I started 4 Amazon EC2 instances, wrote an equivalent test script, and ran the test. To make it work, I had to install FunkLoad from source (as opposed to from pip), and troubleshoot some problems, but it worked OK in the end.

Without my optimizations, I got slightly better average page loads than before – probably coming from the fact that both my EC2 instances and my Linode server were on the west coast, whereas Load Impact was running from the east coast.

With the optimizations, however, it looked a lot better:

As you can see, Wikked on my small server can now serve 40 concurrent users without breaking a sweat: 300ms on average, and always less than 1s. And it could probably handle up to 50 or 60 concurrent users if you extrapolate the data a bit.

Moar hardware!

Next, I figured I would try to see if it made any difference to run the same setup (Wikked on SQLite) on a beefier server. I launched an EC2 instance that’s way better than my Linode VPS, with 3 GB of RAM and 2 vCPUs.

Well: yes, it does make a difference. This bigger server can serve 80 concurrent users while staying under the 1 second mark most of the time. Yay!

Conclusion

Those numbers may not seem like much but this is as good a time as any to remind you that:

I’m sticking to sub-1s times as the limit, because I like fast websites. But I could easily move the limit up to 1.5 seconds and still be within a generally acceptable range (e.g. from my home laptop, Wikipedia serves its pages in around 1.3 seconds).
This is about testing the simplest Wikked setup, based on SQLite, because that means the easiest install experience ever compared to other wikis that need a proper SQL server. And SQLite is notoriously limited in terms of concurrent access.
Serving even just 40 concurrent users is actually quite a lot. If you consider, say, 10 minutes per visit on average, that’s around 240 visitors per hour, or 1920 visitors per day if they mostly come from the same time zone (which works out to about 8 busy hours a day). That’s more than 50,000 visitors a month⁴.
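The back-of-the-envelope math in that last point can be spelled out as follows. The 8 active hours per day are inferred from the numbers above (1920 / 240), and the 30-day month is an assumption made for the example.

```python
concurrent_users = 40
visit_minutes = 10
active_hours_per_day = 8   # assumption, inferred from 1920 / 240
days_per_month = 30        # assumption

# Each "slot" of concurrency serves 60 / 10 = 6 visitors per hour.
visitors_per_hour = concurrent_users * 60 // visit_minutes
visitors_per_day = visitors_per_hour * active_hours_per_day
visitors_per_month = visitors_per_day * days_per_month

print(visitors_per_hour, visitors_per_day, visitors_per_month)
# prints: 240 1920 57600
```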

Still, this is my first real web application, so there’s probably even more room for improvement. I’m always open to suggestions and constructive criticism, so check out the code and see if you can spot anything stupid!

In the meantime, I’ve got some documentation to update, and a public test wiki to set up!

¹ It’s still missing a custom theme and a fancy logo, by the way. That will be coming as soon as I have any actual idea of what to do there!
² That’s a referral link, by the way.
³ It’s not a typo. QCacheGrind is a Qt version of KCacheGrind, so that you don’t need to install KDE libraries, and it looks slightly less terrible.
⁴ The real issue, however, is how your site will behave if all of a sudden a lot of those visitors arrive at the same time. This is probably not uncommon if you have the kind of wiki where there can be announcements posted to a mailing list or a Facebook group, which can in turn get a lot of members to click the same link.