
Snowflakes

I just stumbled on a beautiful, recently-scanned book about snowflakes, published in 1863. Apart from its gorgeous illustrations, the author’s opinions about snowflakes are also fascinating.

By the way, the other day we added a little link to any Internet Archive page that is echoed in Open Library. It sends you straight to our open, editable record for that item. So, if you’re surfing around the Archive’s Texts collection and you find a book we have a record for, you can just jump across and, if you’re so inclined, help to flesh out the information we have about it.


ISBN publisher codes

There can be more than one way to say the same thing: for example, gramophone record, phonograph record and vinyl record. When libraries write catalog records they pick one of these terms and stick to it; this is known as using a ‘controlled vocabulary’. This makes it easier to browse library catalogs.

Traditionally it has been thought that patrons want to browse by author and subject headings, so these fields have been controlled. The data in these fields can be used in other ways too; for example, Ross Singer has been experimenting with geographic subject headings.

Publisher is an uncontrolled field. Penguin and Penguin Books are the same publisher, but the name has been entered differently from one catalog record to the next, making it difficult to browse by publisher.

A workaround is to use the ISBN field in the catalog record. Almost every book published since 1970 has an ISBN. The ISBN of an English-language book starts with a 0 or 1, followed by a variable-length publisher code, an item number and finally a checksum digit.

For example: 0-14-043531-X
0 = English language
14 = Publisher code
043531 = Item number
X = checksum
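As a rough sketch of that breakdown, here is how the example above could be split and its checksum verified in Python. The function names are made up for illustration, and the code assumes the ISBN arrives already hyphenated. An unhyphenated ISBN can't be split this way, because the publisher code is variable-length.

```python
def isbn10_check_digit(first_nine: str) -> str:
    """Compute the ISBN-10 check character for the first nine digits."""
    if len(first_nine) != 9 or any(c not in "0123456789" for c in first_nine):
        raise ValueError("expected exactly nine digits")
    # Weights run from 10 down to 2 across the nine digits.
    total = sum(int(d) * w for d, w in zip(first_nine, range(10, 1, -1)))
    # The check value brings the weighted sum up to a multiple of 11.
    check = (11 - total % 11) % 11
    # A check value of 10 is written as the letter X.
    return "X" if check == 10 else str(check)


def split_isbn10(isbn: str) -> dict:
    """Split a hyphenated ISBN-10 into its four parts (hypothetical helper)."""
    parts = isbn.strip().split("-")
    if len(parts) != 4:
        raise ValueError("expected four hyphen-separated parts")
    group, publisher, item, check = parts
    if isbn10_check_digit(group + publisher + item) != check.upper():
        raise ValueError("check digit does not match")
    return {
        "group": group,
        "publisher": publisher,
        "item": item,
        "check": check.upper(),
    }


print(split_isbn10("0-14-043531-X"))
# {'group': '0', 'publisher': '14', 'item': '043531', 'check': 'X'}
```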

We can build a list of ISBN publisher codes by picking, for each code, the most popular publisher name as it appears in library records. Using the ISBN, we can start the process of making publisher a controlled field.
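To give a feel for the "most popular name per code" idea, here is a minimal Python sketch. It is not the code we actually ran: the function names and the sample rows are invented for illustration, and it assumes each record carries a hyphenated ISBN-10. It keys on the language digit plus the publisher code, since a publisher code only means something within its language group.

```python
from collections import Counter, defaultdict


def publisher_prefix(isbn):
    """Return 'group-publisher' from a hyphenated ISBN-10, or None if malformed."""
    parts = isbn.strip().split("-")
    if len(parts) != 4:
        return None
    return parts[0] + "-" + parts[1]


def canonical_publishers(records):
    """Map each publisher prefix to the name that occurs most often for it.

    `records` is an iterable of (isbn, publisher_name) pairs.
    """
    counts = defaultdict(Counter)
    for isbn, name in records:
        prefix = publisher_prefix(isbn)
        name = name.strip()
        if prefix is not None and name:
            counts[prefix][name] += 1
    return {prefix: names.most_common(1)[0][0] for prefix, names in counts.items()}


# Invented sample rows: only the first ISBN comes from the example above,
# and the check characters of the others are placeholders.
sample = [
    ("0-14-043531-X", "Penguin Books"),
    ("0-14-000001-X", "Penguin"),
    ("0-14-000002-X", "Penguin Books"),
]
print(canonical_publishers(sample))
# {'0-14': 'Penguin Books'}
```

Real catalog records usually store the ISBN without hyphens, so a fuller version would first need the published ISBN range tables to work out where the publisher code ends.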

The results:

Small pieces, loosely joined

We’re very excited about a little integration that the smartypants team over at Flickr have done with Open Library. If you happen to use Flickr and just happen to photograph the covers or insides of books you read, there’s an easy way to connect them to the catalog records we have on Open Library. You just have to add a thingy called a “machine tag,” the same way you would add any other tag, making use of a special format for the tag.

For example, here is a photo of a book that Heather read, posted on Flickr:

books i've read

This book also has a record on Open Library, at this URL:

http://openlibrary.org/b/OL23086206M/Tethered

Now, to link them together, all we need to do is add a specific machine tag to the Flickr photo that references the Open Library ID, like this:

openlibrary:id=OL23086206M

And hey presto, now you see a link to Open Library under Additional Information on Heather’s photo page:

Machine tags!
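If you'd rather script the tagging than type it by hand, the steps above can be sketched in a few lines of Python. This is only an illustration: the function names are made up, the URL is assumed to have the same /b/ID/title shape as the example above, and the pattern for what counts as a machine tag (namespace:predicate=value) is a simplified guess rather than Flickr's exact rule.

```python
import re
from urllib.parse import urlparse

# Simplified guess at the namespace:predicate=value shape of a machine tag.
MACHINE_TAG = re.compile(r"^([A-Za-z][A-Za-z0-9_]*):([A-Za-z][A-Za-z0-9_]*)=(.+)$")


def parse_machine_tag(tag):
    """Return (namespace, predicate, value), or None for an ordinary tag."""
    match = MACHINE_TAG.match(tag.strip())
    return match.groups() if match else None


def machine_tag_for_url(url):
    """Build the openlibrary:id= tag from an Open Library book URL.

    Assumes the path looks like /b/<ID>/<title>, as in the example above.
    """
    segments = urlparse(url).path.split("/")
    if len(segments) < 3 or segments[1] != "b" or not segments[2]:
        raise ValueError("not an Open Library book URL of the expected shape")
    return "openlibrary:id=" + segments[2]


tag = machine_tag_for_url("http://openlibrary.org/b/OL23086206M/Tethered")
print(tag)
# openlibrary:id=OL23086206M
print(parse_machine_tag(tag))
# ('openlibrary', 'id', 'OL23086206M')
```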

You might find yourself asking, why is this good? Well, it’s good because it creates another channel for content to come into Open Library. We’ve been thinking about how much the rest of the web knows about all the books in our catalog, and we’ve begun the process of actively seeking out this content and piling it onto our catalog records. So, each photograph or cover that we now have access to via Flickr is like another node in the network that surrounds our book records. Rather than treat these records as isolated, we want to connect them to as many things as we can find, which, in turn, will begin to make Open Library richer, with more points of entry than a search on Open Library itself.

There’s a curious example of this already, from STML on Flickr. He added the same machine tag to several photographs he’d taken of his copy of the “Progressive Atlas” book:

Progressive Atlas

Nice to be able to see inside the book too!

We were also thrilled to discover this morning that one of the largest independent publishers in the USA, W.W. Norton, has added these Open Library machine tags to 100 or so of their beautiful covers, archived on Flickr! Awesome!

There are a few fiddly bits emerging as people try this out. Like, when you do a search for a book, Open Library displays all the editions it knows about in the search results, and you might even see two records that have the same publisher name and publish date… That sometimes makes it a bit tricky to work out which particular Open Library record to link to, but our advice at this stage is just to pick one and run with it. (Later, we’d like to provide the option to merge two records into one, but we’re not there yet.)

Another interesting question being asked internally here, and also on our Flickr group, is “what conventions should we be using for machine tags?” Our attitude here is that this integration is only very new, so it’s not the time to be imposing standards or conventions. We’d much rather just step back and see what people come up with on their own. There was a funny example yesterday, where dumbledad asked whether it was OK to tag this “action shot” of himself reading a book sitting outside with the openlibrary:id= machine tag. The response is, yes! Go for it! Or, create your own machine tag that seems to work for you, perhaps openlibrary:actionshot= or openlibrary:inside=, and we’ll just see what happens.

The next step, of course, is to have all these lovely bits and pieces show up on Open Library itself. Stay tuned!

(Disclaimer – I thought it might be important to say that I used to work at Flickr, but I had absolutely no say in the development of this new feature. There are also several other services that are “connectible” using this method. You might like to read the Flickr Code blog for more details on that.)

New Bits!

A few hours ago we released a couple of new bits and pieces that we thought were worth mentioning.

First, we’ve re-arranged the way search results display: our search facets are more obvious, there’s a new cover view, and the pagination is tidier.

You’ve always been able to see facets on the search page, but we were trying to find a way to make them more exploratory and interactive – hopefully, this redesign is a start. So, you can click on a facet to narrow your search, then another, and another. It starts to get interesting when you remove previously selected facets from the search, and begin to move sideways through the catalogue. (The team has wasted some hours playing with this!)

As I was bouncing around, I found a few gems, including 6 digitized books about the Masai, written between 1857 and 1905, including the fascinating Vocabulary of the Enguduk Iloigob and Through Masai land: a journey of exploration among the snowclad volcanic mountains and strange tribes of eastern equatorial Africa.

There’s also Cookery recipes by St. Mary’s Guild, Mill Valley, California – just around the corner from us here in San Francisco – published in 1902 and available to read online. Pickles, Marmalades, Jellies, Preserves is “swooning in sweetness” on Page 71, and the scan is full of hand-written notes, as any good cookbook should be!

And, as NASA celebrates the 40th Anniversary of the Apollo mission, here’s a bit of Mars-related science fiction to whet your appetite. If you like space stuff, you’ll love the collection of fantastic 16mm videos shot on board Apollo, hosted over at nasaimages.org, another project of the Internet Archive.

The other cool thing that we released is integration with the new, improved book reader available on archive.org. Improvements include a one-page view, access to the full resolution of the original scan (in that one page view), and the ability to link into a specific page in a scanned book, just by grabbing the URL in the navigation bar whenever you’re looking at a certain page, like I did above to link to Page 71 of the cookery book. (The URL updates on the fly as you turn the pages – super cool!) There’s more information over at the Open Content Alliance blog.

We’d love to hear what you think of the new search results page, so please leave us a comment!