Monthly Archives: April 2010

Open Library Ore

Ben Gimpert is a friend of the Open Library. He and I got together over lunch a few months ago to talk about big data, statistical natural language processing, and extracting meaning from Open Library programmatically. His efforts are beginning to bear some really interesting fruit, and while we work out how we might be able to present it online, we thought you might be interested to hear what he’s been up to…


Thumbnail View in BookReader!

We’re pleased to introduce a new thumbnail view for the Internet Archive BookReader. The thumbnail view gives you a quick visual impression of a book by showing thumbnails of many pages at once. It’s a great way to scan through a book quickly.

Here’s how it looks for a book about the painter Goya:

The thumbnail view also makes it easy to pick out particular pages of interest, for example if you are trying to find the Burrowing Owl in Bird life in an Arctic Spring. Hint: here’s what he looks like:

You might also try looking at Old English colour prints or some of the other books about color prints.

This feature was submitted by Stephanie Collett of the California Digital Library via our BookReader GitHub account. It’s great to have this feature come in from the open source community building around the BookReader!

We're Hiring!

The Open Library team is seeking an experienced Python developer to join our small, experienced team. Born in 2007, Open Library is a large, wiki-editable library catalog, and all our data and software are open. We want to enhance the way data moves in and out of Open Library by building features that make it simple for people to contribute records to the library as well as to extract them. We want to connect our records to as many online resources as possible and to be the locus for information about books online.

You will be responsible for core application development (running a system called Infogami) as well as development of new website features. You will review and enhance Open Library’s current API offering, and look out onto the broader web to find and develop useful API integrations back into Open Library. Learn more about the Open Library system at

Must haves:

  • Software engineering experience, 3-5 years
  • Mad Python skillz
  • Applied use of PostgreSQL, Ubuntu/Linux, JavaScript/AJAX
  • Demonstrable working code online
  • Experience with triplestore database architecture; RDF/XML formats
  • Experience with open-source development projects and practice
  • Ability to work under your own supervision towards a shared outcome
  • Excellent communication skills, both written and verbal

Nice to haves:


  • Wikipedia hacks
  • Experience using GitHub or similar
  • Demonstrable, creative API integration projects, preferably with mashes from more than one system
  • A presence in the Python community
  • An interest in excellent user interface design
  • Experience working with SOLR/Lucene
  • Experience with data processing (we have millions of records)!
  • Experience working in teams dispersed around the world
  • Interest in data visualisation
  • Located in, or prepared to relocate to San Francisco

We’re working towards big goals at Open Library. The online presence of books is a very interesting space at the moment, ripe for an innovative outlook and wide integration with all sorts of other systems. If you enjoy breaking new ground, iterative development and huge datasets, please let us know!

How To Apply
Please send your resume and cover letter to with the subject line “Open Library Engineer”. We thank all applicants for their interest, but advise that only those selected for an interview will be contacted. No phone calls please.

About the Internet Archive
The Internet Archive is a non-profit digital library committed to preserving the world’s digital cultural artifacts. Used by over 6 million people, this resource is becoming part of how the Internet works. Our job is to put the best humanity has to offer within reach of students, educators and the general public. Find out more about our organization and web archive at

The Internet Archive is an equal opportunity employer.