Archive for March, 2010

jQuery Magic Goes Upstream

By Lance Arthur

When we started to re-design Open Library, we knew that we wanted to make it not only useful, but fun to use as well. Obviously, its usefulness as a resource is dependent on its data, the data integrity, and the participation of interested people like you who want to help to build and maintain an online library for everyone to use now and in the future.

A big part of our goal is, therefore, to make it both easy and (hopefully) fun to edit the records, add new books and authors, and enhance the information with lots of interesting bits and pieces, like web sites that talk about those books and authors you’re interested in, excerpts from the works, and just about every piece of info available that anyone might want to know.

The “fun” part is where I came in. George already had a set of designs to use and a very strong sense of how the library should function. Taking it from the flat plane of JPEGs into the interactive form of a web site – and attempting to balance her palette and layout with some special effects made possible through the open source JavaScript library jQuery and its companion, jQuery UI – meant finding some simple solutions to complicated questions.

Here’s a list of the jQuery apps we’ve integrated, a little bit about how we altered them for our site when necessary, and how we’re using them.

Colorbox
Thickbox used to be the be-all, end-all of pop-up lightbox apps on the jQuery platform, but it’s no longer maintained. I’ve found Colorbox to be the most versatile and easy-to-use alternative. Anywhere you encounter a pop-up (for example, when adding a cover to a book or a photo to an author’s page), we’re using Colorbox. In some cases, we’re combining Colorbox with iframes and jQuery UI tabs so you can manage multiple covers or photos attached to a single record.

Flot
When you want to talk about a powerful use of jQuery to bring HTML5-type elements to all of today’s browsers, a good example is Flot. Flot creates charts and graphs on the fly from any data you feed to it, and allows you to manipulate them in a fairly easy and highly scalable manner. Flot must be used in conjunction with excanvas (which is included in the download) so that older browsers like IE7 that don’t support the new <canvas> tag will still be able to display all the charty magic you’ll be creating with it.

We’re using Flot mainly on the Subject pages to illustrate published dates of related books, providing a quick visual readout of the subject’s history in books, as well as an interactive method of focusing in on particular periods to get a closer look at that history. We also use Flot on our Administrative pages to see how the site is performing and watch the number of edits, active IP addresses, new memberships and other data without resorting to Flash.

Subject page capture

jCarousel
This is a case where we’ve adapted something with a particular goal in mind and twisted it around a bit to serve our purposes. jCarousel is generally used as a way to look at a series of images in a dynamic fashion, so you can see a subset of them by flipping around a carousel housed in a window on screen. We’re using jCarousel on our Subject pages to display book covers. Luckily for us, jCarousel comes complete with Ajax support to pull in images only when they’re called for. Sometimes the list of covers is thousands of titles long, and you may never get to the 234th panel, right? So we don’t have to pre-populate our carousel with everything up front, and once the covers have been called in, the carousel works as usual. You can flip back and forth between a dozen covers (or cover placeholders) and then click a cover to go to that work’s page.

Perhaps more interesting than that (because in a sense, book covers are just images, right?) we tied Flot and jCarousel together so that the things you do in a Flot chart affect the covers shown in the carousel. When you zoom in on the chart to see only the number of books about flowers published between 1910 and 1970, the carousel updates itself to correspond.
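The link between the chart and the carousel boils down to filtering the list of covers by whatever year range is selected in the chart. Here’s a minimal sketch of that idea in TypeScript. The `Work` shape, the `worksInRange` function and the sample data are all made up for illustration and are not the code running on Open Library; in the browser, the range would come from the chart’s selection and the filtered list would be handed to the carousel.

```typescript
// Hypothetical record shape; Open Library's real data model is not shown in this post.
interface Work {
  title: string;
  coverUrl: string | null; // null means "show a placeholder cover"
  firstPublished: number;  // year of first publication
}

interface YearRange {
  from: number;
  to: number;
}

// Return only the works published inside the selected range (inclusive).
// The bounds are normalized so a selection dragged right-to-left also works.
function worksInRange(works: Work[], range: YearRange): Work[] {
  const earliest = Math.min(range.from, range.to);
  const latest = Math.max(range.from, range.to);
  return works.filter(
    (w) => w.firstPublished >= earliest && w.firstPublished <= latest
  );
}

// Sample data, invented for this example.
const flowerBooks: Work[] = [
  { title: "Wild Flowers", coverUrl: null, firstPublished: 1905 },
  { title: "Garden Flowers", coverUrl: "/covers/2.jpg", firstPublished: 1932 },
  { title: "Alpine Flowers", coverUrl: "/covers/3.jpg", firstPublished: 1968 },
  { title: "Flowers Today", coverUrl: null, firstPublished: 1984 },
];

// Zooming the chart to 1910-1970 leaves only the covers from that period.
const visibleCovers = worksInRange(flowerBooks, { from: 1910, to: 1970 });
console.log(visibleCovers.map((w) => w.title));
```

Keeping the filter as a plain function, separate from both plug-ins, means the chart and the carousel only need to agree on a year range.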

dataTables
As we developed the design, it quickly became apparent that what a lot of people wanted to know about a given book, more than anything else, was “is this available to read online right now?” Luckily, we usually know the answer to that question because The Internet Archive is busily scanning in books for everyone to read – probably we’re scanning more books as you’re reading this – so we have millions of titles online. We also monitor the availability of ebooks from other sources, so we have fairly decent knowledge of that too. But how do we let you know about that when you’re looking at a work and its related editions?

We decided to poll the database about every edition whenever you look at a work page (with more than one edition) and present that in a big ol’ hairy table, and then allow you to sort the results on that page and see which editions are available as ebooks. dataTables fit the bill perfectly as a way to take what could be an unwieldy gob of info and parse it so that the table’s contents could be sorted and re-sorted based on your whims, and would also paginate the results so you’re not faced with hundreds of rows to scan. Additionally, dataTables provides in-table searches, so you can further filter the list of editions down to just those published in 1972.
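To make the filter, sort and paginate steps concrete, here’s a rough sketch in plain TypeScript of what happens to the rows. This is not dataTables’ API or its internals; `EditionRow`, `TableQuery` and `queryTable` are hypothetical names, and the sample editions are invented.

```typescript
// Hypothetical edition row; field names are illustrative only.
interface EditionRow {
  title: string;
  year: number;
  hasEbook: boolean;
}

interface TableQuery {
  search: string; // in-table search text ("" matches every row)
  sortBy: "title" | "year" | "hasEbook";
  descending: boolean;
  pageIndex: number; // zero-based
  pageSize: number;
}

function compareRows(a: EditionRow, b: EditionRow, sortBy: TableQuery["sortBy"]): number {
  switch (sortBy) {
    case "year":
      return a.year - b.year;
    case "hasEbook":
      return Number(b.hasEbook) - Number(a.hasEbook); // ebooks first
    default:
      return a.title.localeCompare(b.title);
  }
}

function queryTable(rows: EditionRow[], q: TableQuery): EditionRow[] {
  const needle = q.search.trim().toLowerCase();

  // 1. Filter: keep rows whose title or year contains the search text.
  const matched = rows.filter(
    (r) =>
      needle === "" ||
      r.title.toLowerCase().indexOf(needle) !== -1 ||
      String(r.year).indexOf(needle) !== -1
  );

  // 2. Sort a copy so the caller's array is left untouched.
  const sorted = matched.slice().sort((a, b) => {
    const order = compareRows(a, b, q.sortBy);
    return q.descending ? -order : order;
  });

  // 3. Paginate: return just the requested page.
  const start = q.pageIndex * q.pageSize;
  return sorted.slice(start, start + q.pageSize);
}

// Sample data, invented for this example.
const editionRows: EditionRow[] = [
  { title: "Jane Eyre", year: 1972, hasEbook: true },
  { title: "Jane Eyre", year: 1847, hasEbook: true },
  { title: "Jane Eyre: An Autobiography", year: 1972, hasEbook: false },
  { title: "Jane Eyre", year: 2006, hasEbook: false },
];

// Narrow the list down to editions published in 1972.
const from1972 = queryTable(editionRows, {
  search: "1972",
  sortBy: "title",
  descending: false,
  pageIndex: 0,
  pageSize: 10,
});
console.log(from1972.length);
```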

And as long as we’re polling the database, we’re also showing you whether you can buy the book (though at present we’re not grabbing the book’s availability at the various online stores – but we plan to!), borrow the book or download your own copy. Allan Jardine, who developed dataTables, also provided a couple of enhancements just for us because he’s a really nice guy.

Assorted and Sundry
Those are the Big Four plug-ins we’re using, but we’re also taking advantage of jQuery in a number of other ways.

  • When you add a new work to Open Library and start typing in an author’s name, a modified version of jQuery Autocomplete performs a look-up of existing authors to see if we already know about the one you’re typing.
  • When you’re filling out a form, jQuery Validation is doing some pre-submission checks on your inputs against what we require.
  • When you’re typing in a text box, WMD, the WYSIWYG Markdown editor, is providing the toolbar you see at the top and the preview you see at the bottom so you don’t have to figure out if what you’re typing is going to look like what you think it will. Our developer, Anand, has improved the source and we’ve made that available on GitHub.
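The author look-up in the first item above is, at its heart, a prefix match against the names we already have. Here’s a small TypeScript sketch of that matching step, with no claim that it mirrors jQuery Autocomplete or our modified version of it; `AuthorRecord`, `suggestAuthors` and the keys are hypothetical.

```typescript
// Hypothetical author record; not Open Library's actual schema.
interface AuthorRecord {
  key: string;
  fullName: string;
}

// Return up to `limit` known authors whose full name, or any single word
// of the name, starts with the typed text (case-insensitive).
function suggestAuthors(
  known: AuthorRecord[],
  typed: string,
  limit: number
): AuthorRecord[] {
  const prefix = typed.trim().toLowerCase();
  if (prefix.length < 2) {
    return []; // wait for a couple of characters before suggesting anything
  }
  const hits = known.filter((a) => {
    const lowered = a.fullName.toLowerCase();
    if (lowered.indexOf(prefix) === 0) {
      return true; // also covers multi-word input such as "charlotte b"
    }
    return lowered.split(/\s+/).some((word) => word.indexOf(prefix) === 0);
  });
  return hits.slice(0, limit);
}

// Sample data, invented for this example.
const knownAuthors: AuthorRecord[] = [
  { key: "a1", fullName: "Charlotte Bronte" },
  { key: "a2", fullName: "Emily Bronte" },
  { key: "a3", fullName: "Charles Dickens" },
];

const forChar = suggestAuthors(knownAuthors, "char", 5);
const forBro = suggestAuthors(knownAuthors, "Bro", 5);
console.log(forChar.map((a) => a.fullName), forBro.map((a) => a.fullName));
```

On the live site the list of known authors comes from a server look-up rather than an in-memory array, but the idea is the same: show the existing matches before a new author record gets created.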

Perhaps the nicest thing about using jQuery, as illustrated above, is that there’s likely to be a solution to your problem already available, and that the people who made it are amenable to helping you figure out any problems you encounter with it when things start to collide. The library itself is robust, easy to implement and maintained by a very large contingent of very smart people. Luckily for me, they’re always a lot smarter than I am, so they already figured everything out.

My thanks to everyone involved with jQuery and all the plug-ins we’re using – and all the plug-ins we’ll be using as we move forward and continually improve Open Library.

Announcing the Open Library redesign

By George Oates

Announcing the Open Library redesign!
Screenshot on Flickr – CC Attribution

Hooray! And yay! We’re very excited to announce the “soft launch” of our brand new Open Library site! This is version 1 of a reconstructed Open Library, and we’re going to keep it “soft” at a special URL until we’re sure it’s stable enough to make the final transition to openlibrary.org. We’re hoping that will happen soon.

As we mentioned in two previous blog posts [1][2], the main features of the new design are:

1. Works
The previous version of Open Library was only aware of editions of books, or “manifestations” in FRBR-speak. We’re excited to release Works, which helps catch all editions of the same book and collect them under one umbrella. Each work also has its own URI – we’re hoping these propagate.

Note that our representation of Works is imperfect. We’re the first to acknowledge that there are lots of duplicate edition records in Open Library, and these dupes clog up our ability to derive or create works from editions. That means that we might have 25 Jane Eyres for a while, and that the next logical feature to release is a way for people to help merge things.
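To give a feel for why deriving works from editions is messy, here’s a deliberately naive TypeScript sketch that groups editions by a normalized author and title. It is an assumption made for illustration, not the method Open Library actually uses; `EditionRecord`, `workKey` and `groupIntoWorks` are hypothetical names and the records are invented.

```typescript
// Hypothetical edition record; not Open Library's actual schema.
interface EditionRecord {
  title: string;
  author: string;
  year: number;
}

// Lowercase and collapse anything other than a-z and 0-9 into single spaces.
function normalize(text: string): string {
  return text.toLowerCase().replace(/[^a-z0-9]+/g, " ").trim();
}

// Naive assumption: same normalized author and title means same work.
function workKey(e: EditionRecord): string {
  return normalize(e.author) + "|" + normalize(e.title);
}

function groupIntoWorks(
  editions: EditionRecord[]
): { [key: string]: EditionRecord[] } {
  const works: { [key: string]: EditionRecord[] } = {};
  for (const e of editions) {
    const key = workKey(e);
    const bucket = works[key] || [];
    bucket.push(e);
    works[key] = bucket;
  }
  return works;
}

// Sample data, invented for this example.
const sampleEditions: EditionRecord[] = [
  { title: "Jane Eyre", author: "Charlotte Bronte", year: 1847 },
  { title: "JANE EYRE.", author: "Charlotte Bronte", year: 1972 },
  { title: "Jane  Eyre", author: "charlotte bronte", year: 2006 },
  { title: "Jane Eyre: An Autobiography", author: "Charlotte Bronte", year: 1848 },
];

const groupedWorks = groupIntoWorks(sampleEditions);
console.log(Object.keys(groupedWorks));
```

The first three records land under one key, but the subtitle on the fourth is enough to split it off into a second “work”. That’s the kind of near-duplicate that needs a person to merge.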

2. Subject pages
We wanted to find a way to help people browse the catalog rather than having to know what they’re looking for before they start. So, we’ve gone through a process of breaking down and reconstructing the subject headings on our records, giving each heading a URL, and displaying a whole bunch of data about each heading: works about that subject, publishing history, related subjects, authors who write about it, and publishers who publish in that subject area.

3. Revamped search
We’ve rewritten search from scratch and upgraded to Solr 1.4. Our ranking is very basic for now, so “relevance” doesn’t mean a lot yet. We can’t wait to improve on it, and in the meantime, you can also sort your searches by the number of editions or by when things were published, or filter using facets.

4. UI Improvements
The whole site’s had an overhaul in terms of the user interface. All the major operations (editing, searching, adding covers, etc.) have been redesigned. Even changing the size and position of the Edit button will hopefully make it clearer that these records are open to correction. We’ll be blogging over the coming weeks with specifics about the user interface enhancements.

5. Links, links, links
Another major component of the redesign is to begin the process of connecting our records to other references out there on the interwebs. If you get to an Edit Edition page, you’ll notice that you can add different identifiers from a variety of systems to the Edition record, and even add a new type of identifier to the system. The more IDs we can collect, the more connections there’ll be into and out of Open Library.

Caveats!
The redesign is just out of the oven, so it’s important to be clear that there are still things missing, unclear, coming soon, or potentially even broken:

1. The API

A lot of the revisions we’ve made to the API are undocumented. We’re looking forward to changing that, and will update you as we do. We’d also like to expand the range of ways you can write to Open Library via the API.

2. The Data
Now that we’ve improved on the ways to browse the Open Library catalog, we’ve exposed a lot of the corners and content in there that may never have seen the light of day, or are just plain wrong.

It might be odd to say, but we sympathize with Google’s recent position on metadata quality[3]. Trying to merge records from lots of different catalogs means there will be duplicates, and that any errors in those different catalogs are imported as well. That’s not to say we’re not happy with what we’ve got at this first stage. Edward has done a fantastic job to get this far, and we’re looking forward to continual improvement of the dataset.

The fun thing — the best thing? — about Open Library is that you can correct any errors you come across, and those corrections can be propagated.

3. Under construction
This is a “soft launch,” our very first release of a new take on the Open Library system. There will be things that seem a bit weird, particularly if you’ve used the previous version.

We’re fairly sure that all the major operations work though, so if you find something that’s broken, or would like to suggest an improvement or discuss something, we’re all ears!

So, please go and explore the new Open Library. This is just the beginning!

http://www.openlibrary.org

Enjoy!

Tim Berners-Lee: The year open data went worldwide

By George Oates

Comparing two classification systems

By George Oates

Carina Nebula Details: The Caterpillar
From the Goddard Space Flight Center, CC Attribution 2.0 Generic

As you may have seen in our recent Sneak Peek post, we’ve been working on new ways to allow you to browse subject headings in Open Library. Edward’s just built a new subject search index too, so you’ll be able to do a keyword search for any/all subject headings that mention that word.

Testing things and looking around, I wondered about a comparison between two different systems of classification: Library of Congress Subject Headings (or Authorities) and Flickr tags.

I searched for “space flight” subjects on Open Library and found 57 results. Here are the first twenty:

  • Space flight
  • Space flight to the moon
  • Manned space flight
  • Space flight in fiction
  • Space flight to Mars
  • Orbital transfer (Space flight)
  • Space flight to the moon in fiction
  • Psychological aspects of Space flight
  • Space Flight
  • Space flight training
  • Extravehicular activity (Manned space flight)
  • Orbital rendezvous (Space flight)
  • Goddard Space Flight Center
  • Space flight in literature
  • Space flight to Jupiter
  • George C. Marshall Space Flight Center
  • Effect of space flight on
  • Physiological aspects of Space flight
  • Space flight to Venus
  • Manned space flight in fiction

There’s a new page for every subject:

Space Flight subject page preview

And here are 20 things Open Library and Flickr think are related:

Open Library:
  1. Space flight to the moon 255
  2. Juvenile literature 203
  3. Manned space flight 197
  4. Astronautics 178
  5. Physiological effect 144
  6. Exploration 137
  7. Congresses 118
  8. Space vehicles 107
  9. History 90
  10. Space flight to Mars 83
  11. Space shuttles 70
  12. Astronauts 67
  13. Orbital transfer (Space flight) 49
  14. Space medicine 46
  15. Rockets (Aeronautics) 40
  16. Space stations 39
  17. Psychological aspects 38
  18. Space transportation system flights 38
  19. Interplanetary voyages 29
  20. Artificial satellites 28
  21. …and more…

Flickr:
  1. space
  2. nasa
  3. iss
  4. apollo
  5. moon
  6. spaceshuttle
  7. station
  8. rocket
  9. kennedyspacecenter
  10. astronaut
  11. international
  12. ksc
  13. discovery
  14. human
  15. flight
  16. lunar
  17. museum
  18. internationalspacestation
  19. orbit
  20. secondlife

It’s interesting that one system works with plurals, the other with singulars. That makes sense, perhaps, because a library classification is designed to collect things together, whereas tagging could be about describing the thing that’s in front of you. Collections under a certain tag on Flickr are most often emergent, instead of predetermined (although they can be that as well). Curious, too, is the contrast between literary and visual classifications, and the distinct lack of overlap.

The other thing is librarians’ marvellous use of (context), like Orbital transfer (Space flight). It’s incredibly useful when all at once you have to describe or comprehend something in brief. Apart from our need to classify ever more deeply, has the literal size of analog catalogs helped evolve such specificity?

Wonderful stuff.