To celebrate the Internet Archive’s 20th anniversary, the Open Library team has added pages for 200,000 new modern works and rolled out a brigade of fixes and features to improve our user experience.
Over the past year, Open Library’s digital librarian Jessamyn West and lead engineer Brenton Cheng have worked tirelessly with the engineering team and volunteer community to outline a roadmap for revitalizing Open Library and address the issues most affecting our users. We’re proud to announce progress on several fronts, including social sharing, improved book lending, a mobile-optimized book reader, full-text search, a new developer tool, and the addition of thousands of new modern works.
- Thanks to the efforts of Giovanni Damiola, full-text search through all books hosted on the Internet Archive is back online and is faster than ever. You can try the new feature, for example, to see over 115,000 places where works reference Benjamin Franklin’s maxim: “Little strokes fell great oaks”.
- Thanks to Richard Caceres, we have a beautiful new Book Reader, which looks great on mobile devices and provides a much clearer and simpler book borrowing experience. Try out the new Book Reader and see for yourself!
- In the process of adding hundreds of thousands of new modern works to the Open Library catalog, Mek Karpeles released our new openlibrary-client, a command line developer tool for programmatically fetching and creating new works on Open Library.
There are a few small changes in the BookReader that we think you’ll especially like. EPUB and PDF loans can now be initiated from within an existing BookReader loan. For Open Library users, this means two pretty cool things you’ve long requested:
- Users who start loans from the BookReader can borrow either EPUB or PDF formats, and switch formats during the loan period.
- Users who start loans from the BookReader can return loans early, even EPUBs and PDFs.
We hope these changes will delight our readers, empower our developers, and help our community to make even more quality contributions. The path ahead looks even more promising. With clear direction and exciting redesign concepts in the works, the Open Library team is eager to bring you an Open Library at the cutting edge of the 21st century while giving you access to five centuries of texts.
You may have read about our recent downtime. We thought it might be a good opportunity to let you know about some of the other behind-the-scenes things going on here. We continue to answer email, keep the FAQ updated and improve our metadata. Many of you have written about the quality of some of our EPUBs. As you may know, all of our OCR (optical character recognition) is done automatically without manual corrections and while it’s pretty good, it could be better. Specifically, we had a pernicious bug where some books’ formatting caused the first page of each chapter to be left out of the OCRed EPUB. I personally had this happen to me with a series of books I was reading on Open Library and I know it’s beyond frustrating.
To address this and other scanning quality issues, we’re changing the way EPUBs work. We’ve improved our OCR algorithm and we’re shifting from stored EPUB files to on-the-fly generation. This means that further developments and improvements in our OCR capabilities will be available immediately. This is good news and has the side benefit of radically decreasing our EPUB storage needs. It also means that we have to
- remove all of our old EPUBs (approximately eight million items for EPUBs generated by the Archive)
- put the new on-the-fly EPUB generation in place (now active)
- do some testing to make sure it’s working as expected (in process)
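For the technically curious, here is a minimal sketch of why on-the-fly generation makes OCR improvements show up immediately. It is an illustration only, not our actual pipeline: the function and class names, the sample pages, and the "rn" read as "m" error are all made up for the example.

```python
from typing import Callable, Dict, List

Ocr = Callable[[str], str]


def old_ocr(scan: str) -> str:
    # Hypothetical stand-in for an older OCR pass that misreads "rn" as "m".
    return scan.replace("rn", "m")


def improved_ocr(scan: str) -> str:
    # Hypothetical stand-in for an improved OCR pass that reads pages correctly.
    return scan


def render_epub(pages: List[str], ocr: Ocr) -> str:
    # A real EPUB is a zipped package; a joined string keeps the sketch short.
    return "\n".join(ocr(page) for page in pages)


class StoredEpubs:
    """Generates each EPUB once and keeps the resulting file."""

    def __init__(self) -> None:
        self.files: Dict[str, str] = {}

    def build(self, book_id: str, pages: List[str], ocr: Ocr) -> None:
        self.files[book_id] = render_epub(pages, ocr)

    def get(self, book_id: str) -> str:
        # Returns whatever was generated at build time, errors included.
        return self.files[book_id]


class OnTheFlyEpubs:
    """Keeps only the scans and renders an EPUB on every request."""

    def __init__(self, scans: Dict[str, List[str]], ocr: Ocr) -> None:
        self.scans = scans
        self.ocr = ocr

    def get(self, book_id: str) -> str:
        # Always uses the OCR engine that is current at request time.
        return render_epub(self.scans[book_id], self.ocr)


scans = {"cape-cod": ["A modern windmill", "The barn door"]}

stored = StoredEpubs()
stored.build("cape-cod", scans["cape-cod"], old_ocr)
on_the_fly = OnTheFlyEpubs(scans, old_ocr)

# The OCR engine is upgraded after the stored file was already built.
on_the_fly.ocr = improved_ocr

print(stored.get("cape-cod"))      # still carries the old OCR errors
print(on_the_fly.get("cape-cod"))  # reflects the improved OCR right away
```

In the stored approach, every old file has to be regenerated or removed before readers see the fix, which is why the old EPUBs need to go. In the on-the-fly approach there is nothing to regenerate and no EPUB files to store.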
We hope that this addresses some of the EPUB errors people have been finding. Please continue to give us feedback on how this is working for you. Coming soon: improvements to Open Library’s search features!
It makes no odds what it is you carry, so long as you carry the truth along with you. – intro to 1893 edition
There are many good responses to “Why do we still have libraries when everything is online?” My favorite one has to do with the importance of finding people to curate and sort and sift through the enormous bulk of online material to create knowledge and wisdom from what is merely data. Small projects which do not scale. Henry David Thoreau went to Cape Cod in the mid 1800s and wrote about the experience. His writings on Cape Cod were published in 1865 and reprinted many times after that. The text can be found in any number of places, but actually flipping through the books reveals a lot more about the cultural history of this book and the text it contains. The covers alone are lovely to look at.
Cover featuring the Eastham Windmill
Cover featuring cranberry motif
Looking through the many copies Open Library has, there’s a lot of marginalia and other interesting things to peek at. One version appears to have been purchased for a dollar while another may have cost upwards of thirty.
The book was frequently given to libraries as a gift. Sometimes by people you may have heard of.
Some of these versions have beautiful and unusual illustrations and some have photographs.
Some have illustrations nearly obliterated by low quality scanning (not ours).
And some have little mysteries. What does “By transfer The White House” mean? What did the War Department think of this book?
All of these are aspects of the book–one work, many editions–that surface through close inspection, with human eyes.
The Concord MA library has scanned, assembled and annotated a set of images of Thoreau’s surveys, another wonderfully curated set of digitized ephemera that helps us understand our world.
A slightly more personal note here… it’s been a little over three years since I started working at Open Library and just this past week we hit a milestone of 25,000 emails sent. That’s slightly lower than the number of emails we get because some are just saying “Thank you!” and some we forward to other departments and yes, a few are spam. But the rest–the tech support, the early book returns, the reference questions, the merge requests–have been answered by me and Michelle and Laurel.
It’s been very gratifying to help keep Open Library’s ebook lending library open and thriving and very interesting to watch the ebook environment changing around us since we first opened in a much more limited fashion in 2005. Here’s to ten more years of free ebook lending and a continually improving ebook reader experience!