Lists are coming

This is a guest post from friend-of-Open-Library, Megan Amaral. She’s currently hurtling towards the end of a Master’s Degree in Library and Information Science at San Jose State University and has a self-proclaimed penchant for metadata systems. She’s joined the Open Library team as an intern for a month or two to observe and comment on what we’re up to. Welcome, Megan! She also tweets as @bookfinch.

The upcoming Lists feature will present a really exciting opportunity for Open Library users to create and share their handmade collections of Open Library records.  Lists will be flexible enough to include pretty much everything: works, editions, authors, subjects, other lists, and potentially even specific pages of works held in the Internet Archive’s collection of scanned books.

I’m sure you already know that the Open Library catalog compiles data from different sources for its records, which can then be updated by users.  This results in a truly dynamic collection of information.  With Lists, there will be another way for users to organize and interact with these records.  To provide an overview, here are some of my thoughts on what Lists are in the context of the Open Library catalog.

  • Lists are basic, unrestricted categorizations of things.  The sky’s the limit.
  • You don’t have to be an expert to create or understand them.
  • They are tools for sharing and re-locating records.
  • They are meaningful.  If someone goes to the trouble to create a list, then there is at least one person that the list is important to.

I’m a personal supporter of user-created lists because they provide all of us with a new way of browsing and locating resources.  As list creators, people can gather records that represent a personal idea in a sharable and retrievable format.  As information seekers, people can benefit from the opinions of experts, fanatics, and your Aunt Lulu, because when anyone can make a list, anyone can become the curator of collections of their choosing.

Once released, the Lists feature will allow people to assign tags to their lists.  The lists that a book (or author) appears in will also be displayed directly on that book’s own page, which means that Open Library records and lists will intersect.  For example, an initial search for Hemingway’s book The Sun Also Rises could take you to someone’s list of “Books by American Authors who Lived in Paris”…or a list of those actual authors…or a hybrid of both.  You are bound to find something new and interesting!

Last week, I had the opportunity to participate in an early brainstorm for the Lists feature.  During the brainstorm, George identified two activities that should definitely be included in the initial release of this feature.  First, people should be notified in some way when the record of an item in their list is edited.  Second, it must be possible to export lists.  This second requirement launched a flurry of ideas about what, exactly, should be exported.  The exciting photo below illustrates some of our thoughts.

[Photo: Lists!]

The thinking is that in addition to the immediate items on the list, perhaps the export should go a little deeper into the catalog records.  For example, if an author were included in a list (let’s call the list “Authors of Books About X-Rays”), would it be useful if the export of that list included the books that the author wrote?  Or if a Work was listed, perhaps the different Editions of that work would be useful in the list export.
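To make the “how deep should the export go?” question a bit more concrete, here is a minimal Python sketch. It is not how Open Library implements (or will implement) list export; the `Record` class, the `export_list` function, the `depth` parameter, and the example keys and titles are all hypothetical, invented for illustration. The idea is simply that a depth of 0 exports only the items on the list, while larger depths also pull in an author’s works and then those works’ editions.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """Hypothetical stand-in for a catalog record (not the real Open Library schema)."""
    key: str
    kind: str  # "author", "work", or "edition"
    title: str
    # Works of an author, or editions of a work.
    children: list = field(default_factory=list)


def export_list(items, depth=0):
    """Flatten list items into rows, following children up to `depth` levels down."""
    rows, seen = [], set()

    def visit(record, remaining):
        # Emit each record once, even if it is reachable by several paths.
        if record.key not in seen:
            seen.add(record.key)
            rows.append({"key": record.key, "kind": record.kind, "title": record.title})
        if remaining > 0:
            for child in record.children:
                visit(child, remaining - 1)

    for item in items:
        visit(item, depth)
    return rows


# Made-up data for a list called "Authors of Books About X-Rays".
first_edition = Record("/books/E1", "edition", "X-Rays (1st ed.)")
second_edition = Record("/books/E2", "edition", "X-Rays (2nd ed.)")
work = Record("/works/W1", "work", "X-Rays", [first_edition, second_edition])
author = Record("/authors/A1", "author", "Example Author", [work])

just_the_list = export_list([author])            # the author only
with_works = export_list([author], depth=1)      # author + works
with_editions = export_list([author], depth=2)   # author + works + editions
```

Making depth an explicit choice would let the person exporting decide how much of the catalog comes along, rather than the feature guessing on their behalf.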

The discussion also included some thinking about how the Lists feature will actually fit into the Open Library website.  What will the main Lists page have on it?  Should it display lists that contain the most actively edited records?  Or the newest lists?  And what will be on the pages of each individual list?  (So far, these pages will include a history of edits made to the list, the Subjects of the items in the list, and an option to view changes made to the records of items in the list.)

I think that Lists will be an engaging feature for sharing and discovery.  I’ll keep you updated about this feature as the Open Library team moves closer to its release!

Reading in the Sun!

Pixel Qi screen vs MacBook in direct sunlight
Photo by Raj Kumar

We’ve just had a visit from Mary Lou and John from Pixel Qi, showing off their amazing new screens that can operate in two modes: with the backlight, like a normal LCD, or with the backlight off, like a highly reflective “epaper” display that uses an incredible 80% less power than a “standard” display.

Mary Lou is one of the founders of One Laptop Per Child (which Pixel Qi collaborates with these days), and the OLPC is one of the biggest distribution channels for Internet Archive books.

It's a Merge-Fest!

Thank you to everyone who’s merged an author or two since we launched the feature on Monday! The response has been excitingly wonderful – there have been about 200 merges run, with a record 31,246 edits for the last 7 days! And, not just by staff!

You can see all the merges as they happen from the new sub-section of Recent Changes:

http://openlibrary.org/recentchanges/merge-authors

I must say I was quite pleased to find Somerset Maugham in need of so much merge love. Check out all his alternate names! It’s so satisfying when you find a juicy one like that.

Onward!

Duplicate Authors? Wave your Magic Wand!

[note: the author merge feature is no longer patron-facing]

In your wanderings around Open Library, you may occasionally have seen two records for a person you know to be a single author, like Brooks, Terry & Terry Brooks.

Look for the Magic Wand around the site to start merging!

Today, we’re releasing a new feature to help you merge those two separate Terry entries into one. This, in turn, will update all the Works listed under each Terry and try to reconcile each Work by each Author to produce a tighter list of Works for the newly merged Terry. Magic!

Try a search for your favourite author now, browse recent author merges, or read on…

A few things bear explaining:

  • The merge feature works on the idea of a Master author and its Duplicates. As you do the merge, it will be up to you to elect the most suitable Master. We select the author record with the most Works as the default, but you can change that.
  • Only people with an Open Library account can merge authors.
  • Updating the search engine after a merge takes a little while at the moment, up to about 10 minutes, so you won’t see the list of the new Master’s Works updated immediately. We’re looking to speed this up, but are very happy to release this as a “minimum viable product.” As I mentioned, merging an author with either lots of works, lots of editions, or both, takes a long time to update, so please be patient.
  • Duplicate authors’ names will be saved as an alternate on the Master record. For example, the (new) Master record for H. P. Lovecraft now lists alternates like Howard Philips Lovecraft, H. P Lovecraft, Howard P. Lovecraft and H.P Lovecraft. These alternates are often just subtle differences in spacing or capitalization, and we’re hoping they might prove useful later if we begin to stockpile them now.
  • If you’re in any doubt about whether or not to merge an author, don’t. It’s possible you might come across an odd-looking author name like August (re: H. P. Lovecraft) Derleth or H. P. (introduction by Lin Carter) (with Harry Houdini on Pharoahs) Lovecraft in a search for H. P. Lovecraft… these are trickier, because they’re noting contributors in the author name. Ideally, those contributors would be siphoned out into the contributors field per edition, and not merged into the H. P. Lovecraft Master. That would be a loss of information. So, it’s probably easier to just leave those long, odd “authors” alone for now.
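For the curious, the Master/Duplicate idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the actual Open Library merge code: the dictionary layout and the `pick_default_master` and `merge_authors` functions are hypothetical, and matching Works by exact title is a deliberate simplification of whatever reconciliation the real feature performs.

```python
def pick_default_master(authors):
    """Default Master: the author record with the most Works (first one wins a tie)."""
    return max(authors, key=lambda author: len(author["works"]))


def merge_authors(master, duplicates):
    """Return a new Master record that absorbs the duplicates' names and Works."""
    merged = {
        "name": master["name"],
        "alternate_names": list(master.get("alternate_names", [])),
        "works": list(master["works"]),
    }
    for duplicate in duplicates:
        # Keep every duplicate name (and its own alternates) as an alternate.
        for name in [duplicate["name"], *duplicate.get("alternate_names", [])]:
            if name != merged["name"] and name not in merged["alternate_names"]:
                merged["alternate_names"].append(name)
        # Simplification: two Works are treated as the same if titles match exactly.
        for work in duplicate["works"]:
            if work not in merged["works"]:
                merged["works"].append(work)
    return merged


authors = [
    {"name": "H. P Lovecraft", "works": ["The Call of Cthulhu"]},
    {"name": "H. P. Lovecraft", "works": ["The Call of Cthulhu", "The Dunwich Horror"]},
    {"name": "Howard P. Lovecraft", "works": ["The Colour Out of Space"]},
]

master = pick_default_master(authors)  # the record with two Works
duplicates = [author for author in authors if author is not master]
merged = merge_authors(master, duplicates)
```

Note that nothing here tries to handle the odd “contributor in the author name” records; in this sketch, as on the site, those are best left alone.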

I’ve actually found it really fun to test this new feature. I found a useful directory listing of a ton of authors on Yahoo, and began merging them in Open Library. By referring to an external list like this, I could just move from one to the next, rather than trying to come up with authors to search for.

We’ve also bundled another enhancement into this release: Recent Changes V2. There’s a new little bit of navigation on the recent changes page, so you can see things like all the authors merged on 8/16/2010, or all the bot edits made in June 2010. We’re looking forward to adding other bits and pieces to these new filtered views, for example, all the new ebooks made available on a certain day, or all the new covers uploaded in a certain month. Perhaps these could have feeds available too, so you could subscribe to a feed of changes to keep your version of the Open Library dataset up-to-date.

As well as Recent Changes V2, we’ve introduced the concept of “save_many” for transactions that contain lots of little updates. This is a performance improvement, and each such transaction is entered as a single line in Recent Changes; look for the little “expand” link to open up the contents of the save_many transaction.
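As a rough illustration of the “many little updates, one line in Recent Changes” idea, here is a toy Python sketch. It is an assumption-laden model, not the real Open Library code or API: the `ChangeLog` class, its in-memory `store`, and the `summary` helper are all hypothetical, and only the name `save_many` comes from the feature described above.

```python
from datetime import datetime, timezone


class ChangeLog:
    """Toy in-memory store with a recent-changes log (hypothetical, for illustration)."""

    def __init__(self):
        self.store = {}    # key -> latest version of each document
        self.entries = []  # one entry per transaction, oldest first

    def save_many(self, docs, comment=""):
        """Write several documents, but record them as a single log entry."""
        for doc in docs:
            self.store[doc["key"]] = doc
        self.entries.append({
            "timestamp": datetime.now(timezone.utc),
            "comment": comment,
            # The "expand" link would reveal these individual keys.
            "keys": [doc["key"] for doc in docs],
        })

    def summary(self, entry):
        """The collapsed, one-line view of a transaction."""
        count = len(entry["keys"])
        return f"{entry['comment']} ({count} record{'s' if count != 1 else ''})"


log = ChangeLog()
log.save_many(
    [{"key": "/authors/A1"}, {"key": "/works/W1"}, {"key": "/works/W2"}],
    comment="merge authors",
)
one_line = log.summary(log.entries[0])  # "merge authors (3 records)"
```

An author merge touches the Master, the Duplicates, and every affected Work, so grouping those writes keeps the log readable as well as saving round trips.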

So, why not have a shot at merging two duplicate authors? The best place to start is the Author search page.

Anyhoo, we’re excited to show you the first major feature we’ve rolled out since the launch of the redesign back in May, and we’re excited to see what you make of it. Go forth and merge!