Easy permanent links to book page images

We just launched a new image permalinks feature for downloading and linking to page images of books hosted on the Internet Archive. Using a page image permalink makes it easier to reference the contents of a book hosted on the Archive without having to know the details of how or where the book is stored. Since a book’s data could be moved around within the multiple petabytes of data in the Archive at any time, the permalinks provide a consistent and stable way to access the page images.

Here are a few quick examples. For each of these URLs you would add http://www.archive.org/download/{item identifier} to the beginning (hover over an image to see its full URL).

Referencing the cover image for a book at thumbnail size:
/page/cover_thumb.jpg
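
As a rough illustration of how the prefix and the path fit together, here is a minimal Python sketch that builds the full permalink. The helper name `page_image_url` and the identifier `example-item-id` are made up for this example; only the `http://www.archive.org/download/` prefix and the `/page/cover_thumb.jpg` path come from the text above.

```python
from urllib.parse import quote

# Prefix given in the post; the item identifier is appended to it.
BASE_URL = "http://www.archive.org/download"


def page_image_url(item_identifier: str, page_path: str = "page/cover_thumb.jpg") -> str:
    """Join the download prefix, an item identifier, and a page image path.

    `page_image_url` is a hypothetical helper for illustration only.
    """
    # Percent-encode the identifier in case it contains unusual characters,
    # and drop any leading slash on the path so the result has no "//".
    return f"{BASE_URL}/{quote(item_identifier, safe='')}/{page_path.lstrip('/')}"


# "example-item-id" is a placeholder, not a real Archive item.
print(page_image_url("example-item-id"))
```

Swapping in a real item identifier should give the same kind of URL you see when hovering over one of the example images.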


Improved Set Up for Developers

Over the last few months, a handful of developers at the Internet Archive have begun working more closely with the Open Library code. Previously, the project was more isolated and had really only been worked on by the core team of, well, two: Anand & Edward. Besides making it more fun to collaborate with colleagues across the Archive, this increased exposure of the Open Library code base has been profoundly useful for the project. Along with fresh perspectives and questions, it’s also led to an improved toolset for getting a developer’s instance of Open Library up and running on a local machine, which is so important when you’re trying to find your way around a new system.

The cherry on top is an install script for Linux, written by Raj Kumar, on top of the awesome work done by Michael Ang (Mang) to prepare for our recent Lending launch. The updated docs are here:

http://openlibrary.org/dev/docs/setup

This is a bit of a milestone for us – making the codebase more accessible and easier to work with is something we’ve wanted for ages, so it’s nice to see it well on its way.

Open Library Ore: A MySQL data dump is available

A while back, Ben Gimpert wrote a guest post for us called Open Library Ore, explaining how he had begun to hack on the massive full text corpus on the Internet Archive, practising various Natural Language Processing techniques to begin to teach machines to glean the topics of books by sheer letter crunching. Turns out the elements in the ore are beginning to emerge, particularly in the form of a dataset available for download under an Attribution-Noncommercial-Share Alike 3.0 CC license… Please, if you know SQL, why not download the dataset and see what you can find out? We’d love to hear about any discoveries you make, perhaps in the comments?


Today's Downtime (is over)

We’re going to take OpenLibrary.org offline at around 3PM Pacific Time to upgrade our database server. If all goes according to plan, this upgrade should take approximately three hours. We’ll be replacing our old spinning platter disks with modern solid state drives as part of a broader effort to improve the overall performance and reliability of the Open Library system.

We’ll post updates if they come to hand.

6PM UPDATE – Looks like it’s going to take a wee bit longer than expected to get everything copied over. And by “wee bit,” we mean another hour or so from now. If the database decides it wants to move slower, we’ll update again. Thanks so much for your patience!

7.45PM UPDATE – And, we’re back!