A Celebration of Banned Books

It’s Banned Books Week this week, and we’re celebrating!

As Joan E. Bertin suggests over on Huffington Post, “for a country that venerates its First Amendment guarantee of freedom of speech, the United States tries to ban books with alarming frequency.” Sadly, it isn’t only the USA that has banned books in the past, as this list of banned books on Wikipedia shows. As Cara noted over on the shiny new Internet Archive news blog, we have scans of quite a few previously banned books for you to gaze upon, including but not limited to The Wonderful Wizard of Oz:

Search performance!

We’re doing some work on improving our search engine at the moment. As we release the new code, search performance may be intermittent. Apologies for the interruption, but search will be much faster when everything settles down. We’ll drop a note in here when it’s back online.

Update, 6PM PST: We didn’t quite get as much done today as we’d planned, but search should be stable. More tomorrow!

Update, 9AM PST, 8/28: Holy search, Batman!! Before… searching on Open Library was a slog. But now! It’s a breeze! Our search guy, Paul, has been tightening knobs and flipping switches (aka making good use of SOLR stored fields), and our chief data munger, Edward, helped push out the new code this morning. Just see how fast our 24,781 bacon records show up! Then, there’s the “collection” of digitized books about cheese… Please let us know if you come across anything untoward.

Snowflakes

I just stumbled on a beautiful, recently scanned book about snowflakes, published in 1863. Its illustrations are gorgeous, and the author’s opinions about snowflakes are fascinating too.

By the way, the other day we added a little link on any Internet Archive page that is echoed in Open Library. It sends you straight to our open, editable record for that item. So if you’re surfing around the Archive’s Texts collection and you find a book we have a record for, you can just jump across and, if you’re so inclined, help to flesh out the information we have about it.

API with RDF/XML output available

It is now possible to access Open Library book metadata in RDF/XML format. Access is through the RESTful API. For an example, view:

http://openlibrary.org/b/OL6807502M.rdf

The returned RDF/XML relies heavily on Dublin Core metadata terms, and uses some elements from bibliontology and the registered RDA schemas. Although soundly based on RDF, the output can be used like any XML and presents (most of) the Open Library metadata in the easily understood Dublin Core terms.
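As a rough illustration of treating the output "like any XML", here is a minimal Python sketch that pulls Dublin Core fields out of an RDF/XML string. Several things in it are assumptions rather than facts about the API: that the terms live in the standard `http://purl.org/dc/terms/` namespace, that other edition URLs follow the pattern of the example above, and the helper names `rdf_url` and `dublin_core_fields`. The `SAMPLE` document is invented for the demonstration and is not real Open Library output.

```python
import xml.etree.ElementTree as ET

# Standard Dublin Core terms namespace; assumed (not confirmed) to be the
# one used in the Open Library output.
DCTERMS = "http://purl.org/dc/terms/"


def rdf_url(edition_id):
    """Build an RDF URL following the pattern of the example above (assumed)."""
    return f"http://openlibrary.org/b/{edition_id}.rdf"


def dublin_core_fields(rdf_xml):
    """Collect the text of every Dublin Core terms element, keyed by term name."""
    root = ET.fromstring(rdf_xml)
    prefix = "{" + DCTERMS + "}"
    fields = {}
    for element in root.iter():
        if element.tag.startswith(prefix):
            name = element.tag[len(prefix):]
            text = (element.text or "").strip()
            if text:
                fields.setdefault(name, []).append(text)
    return fields


# Invented sample, shaped like generic RDF/XML with Dublin Core terms.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dcterms="http://purl.org/dc/terms/">
  <rdf:Description rdf:about="http://example.org/book/1">
    <dcterms:title>An Invented Title</dcterms:title>
    <dcterms:publisher>Example Press</dcterms:publisher>
  </rdf:Description>
</rdf:RDF>"""

if __name__ == "__main__":
    print(rdf_url("OL6807502M"))
    print(dublin_core_fields(SAMPLE))
```

To try it against the live service you would fetch the text at `rdf_url(...)` yourself and pass it to `dublin_core_fields`; the sketch leaves the network call out on purpose.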

It has been suggested that this format include links to cover images, where available. It is also on our list to add tables of contents to the output. Other suggestions are very welcome — add them here, or send them to the ol-tech discussion list.

We’d love to hear about the uses you make of this API, and anything we can do to help you get more out of the Open Library.

ISBN publisher codes

There can be more than one way to say the same thing, for example gramophone record, phonograph record and vinyl record. When libraries write catalog records they pick one of these terms and stick to it: they use what is known as a ‘controlled vocabulary’. This makes it easier to browse library catalogs.

Traditionally it has been thought that patrons want to browse by author and subject headings, so these fields have been controlled. The data in these fields can be used in other ways too: Ross Singer has been experimenting with geographic subject headings.

Publisher is an uncontrolled field. Penguin and Penguin Books are the same publisher, but the name has been entered differently in different catalog records, making it difficult to browse by publisher.

A workaround is to use the ISBN field in the catalog record. Almost every book published since 1970 has an ISBN. ISBNs for English-language books start with a 0 or 1, followed by a variable-length publisher code, an item number and finally a checksum digit.

For example: 0-14-043531-X
0 = English language
14 = Publisher code
043531 = Item number
X = checksum
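
The breakdown above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this post, and the split relies on the hyphens already being present (because the publisher code is variable-length, an unhyphenated ISBN cannot be split without a table of code ranges). The check digit uses the standard ISBN-10 rule: weight the first nine digits 10 down to 2, and pick the digit that makes the total a multiple of 11, writing 10 as X.

```python
def isbn10_check_digit(first_nine):
    """Return the ISBN-10 check character for the first nine digits."""
    # Weights run 10, 9, ..., 2 across the nine digits.
    total = sum(int(digit) * weight
                for digit, weight in zip(first_nine, range(10, 1, -1)))
    value = (11 - total % 11) % 11
    return "X" if value == 10 else str(value)


def split_isbn10(hyphenated):
    """Split a hyphenated ISBN-10 into its four parts."""
    group, publisher, item, check = hyphenated.split("-")
    return {"group": group, "publisher": publisher, "item": item, "check": check}


if __name__ == "__main__":
    parts = split_isbn10("0-14-043531-X")
    print(parts)
    digits = parts["group"] + parts["publisher"] + parts["item"]
    print(isbn10_check_digit(digits) == parts["check"])
```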

We are able to build a list of ISBN publisher codes by picking the most popular publisher name, as it appears in library records, for each code. Using ISBN we can start the process of making publisher a controlled field.
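
The "most popular name wins" idea can be sketched roughly like this. It is not the code we actually ran, just a minimal Python illustration; the input format (pairs of publisher code and publisher name as found in a record) and the function name `canonical_publishers` are assumptions made for the example, and the sample records are invented.

```python
from collections import Counter, defaultdict


def canonical_publishers(records):
    """Map each publisher code to the name that appears most often with it.

    `records` is an iterable of (publisher_code, publisher_name) pairs,
    one pair per catalog record (a hypothetical input format).
    """
    counts = defaultdict(Counter)
    for code, name in records:
        counts[code][name.strip()] += 1
    # most_common(1) returns a one-item list holding the (name, count) winner.
    return {code: names.most_common(1)[0][0] for code, names in counts.items()}


if __name__ == "__main__":
    sample = [
        ("0-14", "Penguin Books"),
        ("0-14", "Penguin"),
        ("0-14", "Penguin Books"),
    ]
    print(canonical_publishers(sample))
```

A simple count like this ignores near-duplicates such as differences in case or punctuation; normalising names before counting would be a natural next step.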

The results: