Joyeux anniversaire, Monsieur Verne!

February 8th, 2010 — 10:35pm

Happy Birthday, Jules Verne!

“‘The expander of horizons’ is what a noted critic called Jules Verne. He was the prophet, the foreseer and foreteller of our great mechanical age.” [1]

Jules Verne, one of the “fathers of science fiction,” was born today, back in 1828. He wrote several hundred tales about travels to exotic locales in incredible machines. Science fiction’s other father is Englishman H. G. Wells, born about 40 years later. Apparently, Verne is also one of the most translated authors in history, second only to Agatha Christie in his global reach [2].

The opening paragraphs of one of his most famous stories, 20,000 Leagues Under the Seas, will give you a hint:

The year 1866 was signalized by a remarkable incident, a mysterious and inexplicable phenomenon, which doubtless no one has yet forgotten. Not to mention rumors which agitated the maritime population, and excited the public mind, even in the interior of continents, seafaring men were particularly excited. Merchants, common sailors, captains of vessels, skippers, both of Europe and America, naval officers of all countries, and the governments of several states on the two continents, were deeply interested in the matter.

For some time past, vessels had been met by “an enormous thing,” a long object, spindle-shaped, occasionally phosphorescent, and infinitely larger and more rapid in its movements than a whale. Read on?

After poking around the library a little, I found the 1900 Hetzel edition of Voyage au Centre de la Terre full of gorgeous illustrations. I also uncovered a few scanned volumes of the complete Works of Jules Verne, in particular Volume 1, Volume 2 and Volume 4. (There may be more available online… In the 1911 edition, I see that 15 volumes were published originally, in 600 numbered copies.)

Such wonderful stories!

Lepidoptera

January 26th, 2010 — 8:55pm

European Butterflies and Moths

Gawd, this redesign is getting so exciting we can hardly bear it! Metamorphosis, eat your heart out! We’re very close to calling a lot of the new pages finished, although we expect to continue improving things after we do our soft launch.

I found these beautiful plates in a book called European Butterflies and Moths. There’s a huge collection of gorgeous old things in the Smithsonian Libraries collection on archive.org.

This post was somewhat inspired by a juicy thread over on the NGC4LIB mailing list. What is a butterfly?

Happy Australia Day!

January 26th, 2010 — 6:19pm

Please excuse my national pride leaking out onto the Open Library blog, but there are some wonderful classic Australian books in the catalog that I’d like to share with you.

While The Billy Boils by Henry Lawson

Woman and her Possibilities, a lecture delivered in 1913 in my home town of Adelaide, South Australia, by W. Ramsay Smith.

That’s enough about Australia, lovely as it is. We also wanted to let you know that a friend of the Internet Archive, David Rumsey, has been busy digitizing his wonderful collection of maps and geographical books. One example is The California Water Atlas, which is full to the brim with gorgeous illustrations, and timely given all the rain we’ve had in San Francisco this month.

Sneak Peek

January 13th, 2010 — 9:15pm

As I mentioned a few weeks ago, we’ve been working hard on reconstructing Open Library, and it’s getting to that exciting stage when the redesign is starting to feel alive, and full of real data.

One thing we’re producing is a new page about a certain subject that shows a list of all the Works about that subject, authors that write about it, publishers that publish books in that area, and a graph that shows the publishing history of that subject - all generated from bibliographic data and our shiny new search!

Here’s my favourite so far: fondue.

This is getting exciting!

It probably goes without saying, but there was a bump in fondue books around 1970. I’ll sprinkle a few more sneak peeks of the new design from now until we do our soft launch. Coming soon!
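
For the curious, a publishing-history graph like this one boils down to counting editions per year. Here’s a minimal sketch in Python. It is not our production code: the `publish_year` field name and the sample records are made up for illustration.

```python
from collections import Counter


def publishing_history(editions):
    """Return (year, edition_count) pairs, sorted by year.

    `editions` is a list of dicts. `publish_year` is a hypothetical field
    name; records without a year are simply skipped.
    """
    years = [e["publish_year"] for e in editions
             if e.get("publish_year") is not None]
    return sorted(Counter(years).items())


# Invented sample records, purely for illustration.
sample = [
    {"title": "Fondue book A", "publish_year": 1969},
    {"title": "Fondue book B", "publish_year": 1970},
    {"title": "Fondue book C", "publish_year": 1970},
    {"title": "Fondue book D", "publish_year": None},
]

for year, count in publishing_history(sample):
    print(year, "#" * count)
```

Each pair becomes one bar in the graph, so a bump like the one around 1970 shows up as a taller bar.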

Update: I couldn’t resist sharing a few more…

internet

marilyn

And this final comparison is really interesting. For all the order of library cataloging, it’s funny how much difference your native tongue can make. I love how the birth of the Industrial Age is represented in literature this way…

terms

Season’s Greetings

December 24th, 2009 — 4:29am

If you’re looking for something to read during the holidays, do consider fossicking about the treasures we have in Open Library.

I’ve had a quick look and found some lovely old things…

Megilat hanukah by A. Hayman et al., an abridged edition of the Maimonides Mishneh Torah.

A Christmas Carol by Charles Dickens, a ghost story about Christmas, featuring the curmudgeonly Ebenezer Scrooge.

Chanukah Sketch by Ruth E. Levi, a short play about Father Time and his inquisitive visitor.

And a slightly random find: the tale of that portrait that we all like to think is Shakespeare, but we’re really not certain. The story of the “Grafton” portrait of William Shakespeare “Aetatis suae 24, 1588” with an account of the sack and destruction of the manor house of Grafton Regis by the parliamentary forces on Christmas Eve, 1643 by Thomas Kay.

Wishing you and yours all the very best over this holiday season!

An update on Open Library

December 4th, 2009 — 1:19am

It’s been some months since we’ve updated you about what the Open Library is up to. Sorry about that. Thought it might be nice to produce a novella/brain dump to let you know where we’re at.

The short answer is: all sorts of things! I’ve been leading the project now for about 6 months, and have finally settled down enough to tell you what we’re up to. We’d love to hear what you think of our ideas, perhaps in the comments of this post or on our general discussion mailing list.

The Open Library project began in February of 2007, and launched in November that year, so it’s approaching 3 years old. During that time, we’ve amassed one of the biggest virtual library catalogs online, at some 23 million edition entries and some 6 million or so author records. We also have a ton of book covers. Our catalog is entirely open and free to use. You can download everything if you wish, or use our API to either link to our records, or to display Open Library data on your website.

When I started, we didn’t have much insight into what was happening on Open Library. We knew people were using it, but didn’t really know how much, who they were, or what they were doing. The majority of edits to the catalog are made by our bots, running updates across the system and creating new stub records. While this work is essential, I couldn’t see any of the humans using the catalog. So, we started trying to get some insight into the site usage using tools like Alexa, amongst others.

Here’s what we’ve uncovered so far, with more to come:

  • We have an average of about 400 concurrent visitors at any time, peaking at up to 900 people
  • We’ve increased daily unique IPs from about 100,000 back in June to somewhere around 250,000 today
  • Our site uptime has steadied to a very healthy 100% on a good week
  • Our bounce rate is high. Too high. It’s a concern for us that people lob into Open Library thanks to our high search engine ranking, but bounce straight out again
  • There are over 3,000 sites that link directly into Open Library. Wonderful! We’re working on understanding what those links are, and from where.
  • Our membership is fairly small, but growing every day. You can edit and use Open Library without creating an account, which probably accounts for the modest membership.

I think the story that numbers like those tell is that we have an excellent foundation for growth. This is precisely what we’re banking on as we announce that we’ll be releasing a redesign of Open Library in the next few months. We’ll stay in touch about the actual dates, and it’s very likely we’ll do a soft release before we make the final transition live. Please, watch the blog for updates on timing.

There are a number of enhancements to Open Library that we’re planning to make in the upcoming redesign (or, “realignment” as Cameron Moll has written about). That’s not to say we won’t be taking the opportunity to update the site’s look and overall usability, but, the core of the release will be about the catalog and how you see it.

Having researched a lot of the historical documentation surrounding the project, I saw tons of ideas that sounded great and which it’s time to create, like the ability to tag records or authors, to provide tools to upload small collections from special interest or rural libraries, and to push what bibliographic data means on the web. We’re looking forward to beginning to make some of these ideas a reality in 2010.

Key Components of the Redesign

Works
Open Library deals with books at the edition level. This makes finding “War and Peace” really tricky, because all we currently display are the hundreds of editions in a big unordered list. Tricky to find what you’re searching for… Luckily, the cataloging standards initiative called Functional Requirements for Bibliographic Records (FRBR) describes a “super-level” of book called “the Work,” which represents the abstract idea of a book rather than its constituent editions, probably making it easier to get started with research and the like.

We’ve been toiling for the past several months to roll up all our editions into logical Works. This is incredibly tricky for all sorts of reasons, and as much as we would like it to be bulletproof on the first go, it’s likely people will see an edition that should be in a certain Work, or Work records that are really the same book. Providing tools for fixing dupes like that is next on the list. That said, we’ve been testing our brand new Work search lately, and it’s given me (at least) an entirely different and exciting view on the Open Library. We can suddenly see things like the books in our catalog with the most editions, or all the Works by Mark Twain (instead of a massive list of all the editions he’s supposed to have written) and more. Truly, it’s invigorating after being stuck in the edition “mud” for so long. Not that edition data is bad, of course, just that the aggregate is extremely useful.
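
To give a feel for the roll-up idea, here’s a deliberately naive sketch in Python. It is not the approach we actually use: it assumes each edition is a dict with hypothetical `title` and `author` fields, and it treats two editions as the same Work whenever their normalized title and author match, which is exactly the kind of rule that produces the dupes and misses mentioned above.

```python
import re
from collections import defaultdict


def work_key(edition):
    """Build a crude grouping key from an edition's title and author."""
    title = re.sub(r"[^a-z0-9 ]", "", edition["title"].lower())
    title = " ".join(title.split())  # collapse runs of whitespace
    author = edition["author"].lower().strip()
    return (title, author)


def group_into_works(editions):
    """Map each (title, author) key to the list of editions sharing it."""
    works = defaultdict(list)
    for edition in editions:
        works[work_key(edition)].append(edition)
    return dict(works)


# Illustrative records only; the field names are assumptions.
editions = [
    {"title": "War and Peace", "author": "Leo Tolstoy"},
    {"title": "War and peace.", "author": "Leo Tolstoy"},
    {"title": "Anna Karenina", "author": "Leo Tolstoy"},
]

works = group_into_works(editions)
# The Work with the most editions is the key with the longest list.
biggest = max(works, key=lambda key: len(works[key]))
print(biggest, len(works[biggest]))
```

Once editions are grouped like this, questions such as “which Work has the most editions?” become a one-liner over the aggregate.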

Subjects
As a non-librarian, I have been both shocked and awed by the degree of classification that’s possible using library practices. Catalogers have worked hard to put books into very specific descriptive boxes and hierarchies. Being a fan of messy data and classification, I have stumbled upon lots of classifications for books whose “order” seems quite nonsensical.

For example, many of the science fiction books listed on Open Library have several very similar, convoluted subject classifications, separated by all manner of different characters. To the human eye, it seems like duplication of effort. One book might have the following subjects assigned to it:

Science Fiction - General, Fiction / Science Fiction / General, Fiction, Fiction - Science Fiction, Science Fiction

We could just show a list of concepts, like:

Science Fiction, General and Fiction

…instead, and turn each of those terms into links, which take you through to a page that can show all books with the same subjects.

Similarly…

Probability & statistics, Probabilities, Mathematics, Science/Mathematics, Probability & Statistics - General, Mathematics / Statistics

… could be consolidated into Probability, Statistics, Probabilities, Mathematics, Science and that pesky “General” subject. People are good at reading collections of words in a list and understanding the concepts of the list, we think. It’s almost more difficult to parse the variants as you see above, with all their repeats and the use of characters to indicate some sort of hierarchy.

So, we’re going to try that (but not delete the LCSHs, of course).
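
Here’s a rough sketch in Python of what that consolidation could look like. It’s only an illustration under a few assumptions: the function name is invented, the separators handled are just the ones in the examples above (a spaced hyphen, a slash and an ampersand), and duplicates are matched case-insensitively while keeping the first spelling seen.

```python
import re


def consolidate_subjects(raw_subjects):
    """Split subject strings into single terms and drop the repeats."""
    seen = set()
    terms = []
    for subject in raw_subjects:
        # Treat " - ", "/" and "&" as separators between concepts.
        for part in re.split(r"\s+-\s+|/|&", subject):
            term = part.strip()
            key = term.lower()
            if term and key not in seen:
                seen.add(key)
                terms.append(term)
    return terms


subjects = [
    "Science Fiction - General",
    "Fiction / Science Fiction / General",
    "Fiction",
    "Fiction - Science Fiction",
    "Science Fiction",
]
print(consolidate_subjects(subjects))
```

The original subject strings are left untouched; the consolidated list is just another view on top of them.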

Links, links, links
The key interface into the current catalog is a search box: essential if you know what you’re looking for, but useless for browsing. We’re going to introduce new navigation elements into the site that will help people dive into the catalog and bounce around. Certainly, we’ll still have search (much improved, upgraded to SOLR 1.4), but, as we think about that high bounce rate, we want to help people hop around the catalog instead of coming and going so quickly. To borrow a phrase from Tom Coates, we are constructing a new view to the catalog to represent it as a web of data instead of discrete records. The more connections we can create between records, the richer that browsing experience can be.

From a linked data perspective, we also want to introduce the ability for people to connect our records with many more systems online. Right now, you can assign up to 6 identifiers to Open Library edition records: ISBN (10 & 13), Library of Congress Control Number (LCCN), Library of Congress Classification System (LC), Internet Archive and OCLC. These IDs are certainly valuable, and in deep circulation in library catalogs around the world, buuuut… there are loads of other bookish sites out there on the web that also have wonderful, rich information about books that we’d like to connect to. Examples include Goodreads, LibraryThing and Zotero, amongst others; really any resources that people think are useful! The idea is to stop worrying about a canonical identifier and simply to try to accumulate as many identifiers as we can. This idea will take a while to bear fruit, but it works on the premise that we have a new opportunity in cataloging now: to place books in a network instead of on a shelf.
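
One way to picture “accumulate, don’t canonicalize” is an open-ended map from identifier scheme to a list of values. The Python sketch below is only that, a picture: the `identifiers` field, the scheme names and the values are all placeholders invented for this example, not our real record format.

```python
def add_identifier(record, scheme, value):
    """Attach an identifier to a record without replacing existing ones."""
    values = record.setdefault("identifiers", {}).setdefault(scheme, [])
    if value not in values:  # avoid storing the same identifier twice
        values.append(value)
    return record


# Placeholder record and identifier values, for illustration only.
edition = {"title": "An example edition"}
add_identifier(edition, "isbn_10", "0000000000")
add_identifier(edition, "goodreads", "example-id-1")
add_identifier(edition, "librarything", "example-id-2")
add_identifier(edition, "goodreads", "example-id-1")  # duplicate, ignored

for scheme, values in edition["identifiers"].items():
    print(scheme, values)
```

Because no scheme is privileged, adding support for a new site is just a new key in the map rather than a change to the record’s shape.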

Similarly, we would like to collect links to other sites that are relevant to a certain book or author. Did you know Alain de Botton has a Twitter account? Sites like The Guardian, Flashlight Worthy or the New York Review of Books have incredibly rich information about books and authors that would be wonderful to connect with from the Open Library catalog.

We’re also excited about the role Open Library can play in the new Book Server initiative that was launched by the Internet Archive in October this year:

The BookServer is a growing open architecture for vending and lending digital books over the Internet. Built on open catalog and open book formats, the BookServer model allows a wide network of publishers, booksellers, libraries, and even authors to make their catalogs of books available directly to readers through their laptops, phones, netbooks, or dedicated reading devices.

The basic idea is that publishers can publish a list of any/all epubs in their catalog to be aggregated by other services online. Open Library could be one of those aggregators. We hope to show a real time representation of whatever we can aggregate, so when you look for individual books, you see a live list of where you can get your hands on the document, whether for purchase or download. After all, isn’t the job of a library to get people to books?

Librarianship as the Foundation of Open Library
Open Library’s mission has always been to build a page on the web for every book ever published. We have only been able to start achieving that mission on the shoulders of the work of librarians. While it’s possible (and encouraged) for people to add new records for books we don’t know about yet, the vast majority of our records come directly from library catalogs.

The opportunity we have now is to help interested contributors to enrich these records. People who love a particular book, or who have some knowledge in a particular subject area, or who enjoy correcting typos, or who like to make sure all the boxes are filled in, or who have a photo of a book they’ve read, or who found a great review of a book on another site can all contribute information to the Open Library. As Tim Spalding, founder of LibraryThing, noted in his Social Cataloging talk presented in New Zealand this October, nobody’s quite sure where this “social cataloging” might go, or when it might become useful to librarians in a cataloging sense. What we do know is that there’s a lot of knowledge out there on the web about books, and we want to make Open Library a place where people can contribute any amount, no matter how small, to make the catalog more useful.

The Open Library is an amazing resource, and now it’s time to take it to the next level. Yeah!

By the way, we’re looking for at least one senior web developer to join the team too, so if you’d like to join a small team doing interesting things with library catalogs, APIs and SOLR on an extensible wiki-editable platform built in Python, and you live close to San Francisco or would move here, please drop a line to info@.

Note: I did a small copy edit Dec 4. Pretty sure I didn’t remove anything of substance.

Choose Your Own Adventure

November 12th, 2009 — 9:26pm

There is some really lovely stuff happening around the internet about books at the moment. Here’s just one thing I stumbled on that’s absolutely gorgeous.

One Book, Many Readings by Christian Swinehart

It’s a series of visualizations and commentary on what it means to move through a Choose Your Own Adventure book, and what sort of book you end up with if you just construct various scaffolding around places and events instead of providing a directed beginning-to-end narrative.

A graphic from Christian's site about choosing your own adventure books

This paragraph stuck out at me:

Just as looking at a film only in terms of its individual frames would be missing the point, considering the pages of a CYOA book in isolation ignores what makes the structure of these books special. As in all hypertext systems, pages make up the body of the organism, but it is the nervous system of connections between them that allows for emergent properties to develop.

It’s precisely that “nervous system” that we’ve been thinking about behind the scenes at Open Library. Admittedly, the catalog we have right now is pretty dry. Many of our records are all but empty and certainly don’t tell visitors much about the actual books they catalog. We’ve been trying to eke out the landscape of the catalog, because there’s certainly lots of data in it. I mean, in lieu of rich individual records, what can the aggregate tell us? Can we build pages around these aggregate views? What could we show on, say, a page that shows all the books we have about cheese? How would Open Library operate if there was no search box? Would it be navigable?

Many questions, as you can see. I wonder if we can make a Choose Your Own Catalog.

Just because it’s the weekend

November 8th, 2009 — 11:11pm

You may have to be of a certain age to appreciate this, but… here we go.

Internet Archive’s BookReader out in the wild

October 28th, 2009 — 10:23pm

Or, not so wild actually, it’s the Library of Congress!

We were thrilled to see our BookReader on the read.gov site today. The Library is using it to showcase some gorgeous books from their Rare Book Collection, like “A Wonder-Book for Girls & Boys,” “The Baby’s Own Aesop,” and “A Christmas Carol.”

You might also be interested to follow along with a “book in progress” called The Exquisite Corpse Adventure, “an episodic progressive story game” with more than 20 contributors.

There’s information about the BookReader software on the Open Library site if you’re code-y too. We love it when the BookReader gets used!

Comment » | Uncategorized

Scheduled Downtime Complete

October 27th, 2009 — 5:32pm

We’re planning a scheduled downtime on Wednesday, October 28 for a hardware upgrade of the Open Library servers. Open Library will be unavailable for 2 hours, from 7:00 AM to 9:00 AM PST.

We’ll post here when the site’s back online.

Update at 8:36am: And we’re back!
