pystatsd & 5,000 Lists!

We’re working hard to improve Open Library’s general stability and performance after a few harrowing weeks of moving our hardware infrastructure around. We’re beginning to measure more stuff across the site, from general activity levels (about 40,000 catalog edits every month!) to quite specific actions (for example, 1-3 people open up our BookReader every second).

We’ve begun using a super awesome real-time stats processing package called pystatsd, a Python implementation of Etsy’s statsd server. My favourite bit is graphite, a program that sits on top of it, takes all the stats we collect with pystatsd, and renders them as graphs in a browser. Suddenly, we can see the system in a new and useful way.

We’re also looking hard at improving our memcached configuration, and recently added another 4 memcached machines to our pool. Now that we can measure memcached hits and misses using pystatsd and graphite, we’ll be able to tell whether our caching is actually improving. Yay!

Memcached hits & misses
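
For the curious, here’s a rough sketch of how hit and miss counting like this can work. It is not our actual code: the `CountingCache` wrapper, the `ol.memcache` metric prefix and the plain dict standing in for memcached are all made up for illustration. It skips the pystatsd client and writes statsd’s plain-text counter format (`name:value|c`) straight to a UDP socket, assuming a statsd server is listening on its usual port, 8125.

```python
import socket

# Assumption: a statsd server listening on the conventional UDP port.
STATSD_ADDR = ("127.0.0.1", 8125)


def counter_packet(name, value=1):
    """Format a statsd counter increment, e.g. b'ol.memcache.hits:1|c'."""
    return f"{name}:{value}|c".encode("ascii")


class CountingCache:
    """Hypothetical wrapper that counts hits and misses on a cache backend.

    `backend` only needs a .get(key) method that returns None on a miss,
    so a plain dict works for experimenting.
    """

    def __init__(self, backend, addr=STATSD_ADDR, prefix="ol.memcache"):
        self.backend = backend
        self.addr = addr
        self.prefix = prefix  # hypothetical metric namespace
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    def get(self, key):
        value = self.backend.get(key)
        outcome = "hits" if value is not None else "misses"
        # UDP is fire-and-forget: we never wait for the stats server.
        self.sock.sendto(counter_packet(f"{self.prefix}.{outcome}"), self.addr)
        return value


if __name__ == "__main__":
    cache = CountingCache({"/books/OL1M": "cached page"})
    cache.get("/books/OL1M")   # counted as a hit
    cache.get("/books/OL2M")   # counted as a miss
```

Because the counters go out over UDP, a slow or missing stats server shouldn’t hold up a page request, which is a big part of why this style of measurement is cheap enough to sprinkle everywhere.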

Another tweak you might find interesting… it used to be that lists would only show up on the main Lists page if they contained at least 3 seeds. The other day, Raj and I upped that to at least 5 seeds, and that immediately produced a selection of arguably more interesting lists, most of which settle around a subject area. Here’s a small selection:

Have you made a great list, or found someone else’s? Let us know in the comments!