-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
I've been working on Angerwhale some more, and now it's about 17 times faster than it was before!
I achieved this speed increase by changing the way caching works. Before, I cached pages based on the md5sums of all the visible articles on the page (along with a list of all tags and categories, since those show up on the sidebar of every page). Computing that cache key was pretty slow because every article and comment had to be enumerated, sorted by date (to determine which were recent enough to show up on the page), and then finally md5'd. The cache itself still worked pretty well, because a lot of the time had been spent HTML-izing the data (stash) in the TT template. However, determining which cache key to use still took a significant amount of time.
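To make the cost of the old scheme concrete, here is a rough sketch of a content-hash cache key. It is written in Python rather than Angerwhale's Perl, and the names (`Article`, `old_cache_key`, `per_page`) are invented for illustration; the real code may differ in every detail. The point is only that every article has to be visited and sorted before the key is known.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Article:
    title: str
    body: str
    modified: float  # Unix timestamp of the last change


def old_cache_key(articles, tags, categories, per_page=5):
    """Hypothetical content-hash key: needs a full walk over all articles."""
    # Sort everything by date just to learn which articles are visible.
    visible = sorted(articles, key=lambda a: a.modified, reverse=True)[:per_page]
    digest = hashlib.md5()
    for article in visible:
        text = article.title + "\0" + article.body
        digest.update(hashlib.md5(text.encode("utf-8")).digest())
    # Tags and categories appear in the sidebar of every page.
    for name in sorted(tags) + sorted(categories):
        digest.update(name.encode("utf-8") + b"\0")
    return digest.hexdigest()
```

The work grows with the size of the blog even when nothing has changed, which is the overhead described above.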
Now, I'm doing things differently. In the auto method, before most of the application executes, I determine the "revision number" of the blog. The revision number is increased (not necessarily by one) every time the blog is modified -- an article is tagged, a category is added, a new post appears, etc. If I have a cached version of the requested URL for the current revision, then I immediately give that to the user, along with any saved headers. This happens in about 20 milliseconds on my machine. No request logic is executed; if there's a cached copy, it's echoed to the user and the request ends inside auto.
If I don't have a cached copy, I save the revision number and URL in the stash (as the "cache key"), run the request as normal, and then in the end action, I save the body and headers to the cache.
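The flow can be sketched roughly like this. Again, this is Python, not the actual Catalyst code; `RevisionCache`, `touch`, `handle`, and the `render` callback are hypothetical names, and the in-memory dict stands in for whatever cache backend is really used.

```python
class RevisionCache:
    """Hypothetical page cache keyed by (revision number, URL)."""

    def __init__(self):
        self.revision = 0
        self.pages = {}  # (revision, url) -> (headers, body)

    def touch(self, amount=1):
        # Any modification bumps the revision (not necessarily by one).
        # Entries for older revisions simply stop matching.
        self.revision += amount

    def handle(self, url, render):
        key = (self.revision, url)  # the "cache key"
        cached = self.pages.get(key)
        if cached is not None:
            # Corresponds to "auto": answer from cache, run no request logic.
            return cached
        # Cache miss: run the request as normal.
        headers, body = render(url)
        # Corresponds to "end": save the headers and body for next time.
        self.pages[key] = (headers, body)
        return headers, body
```

Finding the key is now a single lookup of the current revision, no matter how many articles exist. This sketch never evicts entries for old revisions; a real cache would want to expire them.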
That's pretty much all there is to it. I'm also saving the last-modified time and an ETag for every entry so that browsers and RSS readers can issue conditional GET requests. This saves bandwidth in addition to CPU time. If the RSS reader already has the latest feed, I return a "304 Not Modified" response and don't even have to serve the cached feed to the RSS reader. (This works with browsers, too. Try firing up Angerwhale and clicking reload constantly. On every request after the first, your browser will get "304 Not Modified" and load the document from its own cache instead of from the web server. Fast!)
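For readers who haven't seen conditional GET before, here is a minimal sketch of the server-side decision, in Python. The function names are made up, the ETag is assumed to be an md5 of the body (the post doesn't say how Angerwhale derives it), and weak validators (`W/"..."`) are not handled.

```python
import hashlib
from email.utils import formatdate, parsedate_to_datetime


def make_validators(body, modified):
    """Build an (ETag, Last-Modified) pair for a cached entry."""
    etag = '"' + hashlib.md5(body).hexdigest() + '"'  # assumed ETag scheme
    last_modified = formatdate(modified, usegmt=True)  # HTTP date format
    return etag, last_modified


def respond(request_headers, etag, last_modified, body):
    """Return (status, body); 304 responses carry no body."""
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        # If-None-Match takes precedence over If-Modified-Since.
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        if etag in candidates or "*" in candidates:
            return 304, b""
        return 200, body
    since = request_headers.get("If-Modified-Since")
    if since is not None:
        try:
            unchanged = (parsedate_to_datetime(last_modified)
                         <= parsedate_to_datetime(since))
        except (TypeError, ValueError):
            unchanged = False  # unparseable date: send the full response
        if unchanged:
            return 304, b""
    return 200, body
```

A client that sends back the ETag it was given gets `(304, b"")`; a client with no validators, or a stale one, gets the full body.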
(There is one disadvantage -- I have to maintain a cached copy of every page for every logged-in user, because the user's name shows up in the sidebar when he's logged in. I plan to fix this by using JavaScript to render the username on the client side. If you don't have JavaScript, you'll just have to remember your name.)
In case you're wondering, here are the ab results for the main page.
Before:
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:   835  844   6.7    847     855
Waiting:      832  841   7.0    844     853
Total:        835  844   6.7    847     855
After:
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:    45   48   3.0     48      56
Waiting:       42   45   3.4     46      54
Total:         45   48   3.0     48      56
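The "about 17 times faster" figure follows from the mean Total times in the two runs above. As a quick check (Python, with variable names of my own choosing):

```python
# Mean "Total" times from the two ab runs, in milliseconds.
before_ms = 844
after_ms = 48

speedup = before_ms / after_ms
print(round(speedup, 1))  # prints 17.6
```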
Anyway, if you're using Angerwhale, I hope you enjoy the speed increase. Let me know if you notice any issues!
-----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) iQCVAwUBRb3Jd9AZeFPdJeQvAQLu7AQAlByIFzhoU+0ILxwuTCzUdZ40FC1CjKj3 eqLP+4GpWBWh9q7nWn92ihpFW1/t+iZGaK/NPCJzlKUgcV2I3BFcIx0HDNfixOAb nd36AKKdVWGHgG+cX+oS3dzby4SZEhBid3XflxilKsRvkJDFCowGbon7PS7dC4hg wn8AXpLSWrQ= =rg++ -----END PGP SIGNATURE-----