On Bookstores, Libraries & Archives in the Digital Age

The following was a guest post by Brewster Kahle on Against The Grain (ATG) – Linking Publishers, Vendors, & Librarians

See the original article here on ATG’s website

By: Brewster Kahle, Founder & Digital Librarian, Internet Archive

Back in 2006, I was honored to give a keynote at the meeting of the Society of American Archivists, when the president of the Society presented me with a framed blown-up letter “S.” This was an inside joke about the Internet Archive being named in the singular, Archive, rather than the plural Archives. Of course, he was right, as I should have known all along. The Internet Archive had long since grown out of being an “archive of the Internet”—a singular collection, say of web pages—to being “archives on the Internet,” plural. My evolving understanding of these different names might help focus a discussion that has become blurry in our digital times: the difference between the roles of publishers, bookstores, libraries, archives, and museums. These organizations and institutions have evolved with different success criteria, not just because of the shifting physical manifestation of knowledge over time, but because of the different roles each group plays in a functioning society. For the moment, let’s take the concepts of Library and Archive.

The traditional definition of a library is that it is made up of published materials, while an archive is made up of unpublished materials. Archives play an important function that must be maintained—we give frightfully little attention to collections of unpublished works in the digital age. Think of all the drafts of books that have disappeared once we started to write with word processors and kept the files on fragile computer floppies and disks. Think of all the videotapes of lectures that are thrown out or were never recorded in the first place. 

Bookstores: The Thrill of the Hunt

Let’s try another approach to understanding distinctions between bookstores, libraries and archives. When I was in my 20s living in Boston—before Amazon.com and before the World Wide Web (but during the early Internet)—new and used bookstores were everywhere. I thought of them as catering to the specialized interests of their customers: small, selective, and only offering books that might sell and be taken away, with enough profit margin to keep the store in business. I loved them. I especially liked the used bookstore owners—they could peer into my soul (and into my wallet!) to find the right book for me. The most enjoyable aspect of the bookstore was the hunt—I arrived with a tiny sheet of paper in my wallet with a list of the books I wanted, would bring it out and ask the used bookstore owners if I might go home with a bargain. I rarely had the money to buy new books for myself, but I would give new books as gifts. While I knew it was okay to stay for a while in the bookstore just reading, I always knew the game.

Libraries: Offering Conversations not Answers

The libraries that I used in Boston—MIT Libraries, Harvard Libraries, the Boston Public Library—were very different. I knew of the private Boston Athenæum but I was not a member, so I could not enter. Libraries for me seemed infinite, but still tailored to individual interests. They had what was needed for you to explore and if they did not have it, the reference librarian would proudly proclaim: “We can get it for you!” I loved interlibrary loans—not so much in practice, because it was slow, but because they gave you a glimpse of a network of institutions sharing what they treasured with anyone curious enough to want to know more. It was a dream straight out of Borges’ imagination (if you have not read Borges’ short stories, they are not to be missed, and they are short. I recommend you write them on the little slip of paper you keep in your wallet.) I couldn’t afford to own many of the books I wanted, so it turned off that acquisitive impulse in me. But the libraries allowed me to read anything, old and new. I found I consumed library books very differently. I rarely even brought a book from the shelf to a table; I would stand, browse, read, learn and search in the aisles. Dipping in here and there. The card catalog got me to the right section and from there I learned as I explored.

Libraries were there to spark my own ideas. The library did not set out to tell a story as a museum would. It was for me to find stories, to create connections, have my own ideas by putting things together. I would come to the library with a question and end up with ideas.  Rarely were these facts or statistics—but rather new points of view. Old books, historical newspapers, even the collection of reference books all illustrated points of view that were important to the times and subject matter. I was able to learn from others who may have been far away or long deceased. Libraries presented me with a conversation, not an answer. Good libraries cause conversations in your head with many writers. These writers, those librarians, challenged me to be different, to be better. 

Staying for hours in a library was not an annoyance for the librarians—it was the point. Yes, you could check books out of the library, and I would, but mostly I did my work in the library—a few pages here, a few pages there—a stack of books in a carrel with index cards tucked into them and with lots of handwritten notes (uh, no laptops yet).

But libraries were still specialized. To learn about draft resisters during the Vietnam War, I needed access to a law library. MIT did not have a law collection and this was before Lexis/Nexis and Westlaw. I needed to get to the volumes of case law of the United States.  Harvard, up the road, had one of the great law libraries, but as an MIT student, I could not get in. My MIT professor lent me his ID that fortunately did not include a photo, so I could sneak in with that. I spent hours in the basement of Harvard’s Law Library reading about the cases of conscientious objectors and others. 

But why was this library of law books not available to everyone? It stung me. It did not seem right. 

A few years later I would apply to library school at Simmons College to figure out how to build a digital library system that would be closer to the carved words over the Boston Public Library’s door in Copley Square:  “Free to All.”  

Archives: A Wonderful Place for Singular Obsessions

When I quizzed the archivist at MIT, she explained what she did and how the MIT Archives worked. I loved the idea, but did not spend any time there—it was not organized for the busy undergraduate. The MIT Library was organized for easy access; the MIT Archives included complete collections of papers, notes, ephemera from others, often professors. It struck me that the archives were collections of collections. Each collection faithfully preserved and annotated.  I think of them as having advertisements on them, beckoning the researcher who wants to dive into the materials in the archive and the mindset of the collector.

So in this formulation, an archive is a collection, archives are collections of collections.  Archivists are presented with collections, usually donations, but sometimes there is some money involved to preserve and catalog another’s life work. Personally, I appreciate almost any evidence of obsession—it can drive toward singular accomplishments. Archives often reveal such singular obsessions. But not all collections are archived, as it is an expensive process.

The cost of archiving collections is changing, especially with digital materials, as is cataloging and searching those collections. But it is still expensive. When the Internet Archive takes on a physical collection, say of records, or old repair manuals, or materials from an art group, we have to weigh the costs and the potential benefits to researchers in the future. 

Archives take the long view. One hundred years from now is not an endpoint, it may be the first time a collection really comes back to light.

Digital Libraries: A Memex Dream, a Global Brain

So when I helped start the Internet Archive, we wanted to build a digital library—a “complete enough” collection, and “organized enough” that everything would be there and findable. A Universal Library. A Library of Alexandria for the digital age. Fulfilling the memex dream of Vannevar Bush (do read “As We May Think”), of Ted Nelson’s Xanadu, of Tim Berners-Lee’s World Wide Web, of Danny Hillis’ Thinking Machine, Raj Reddy’s Universal Access to All Knowledge, and Peter Russell’s Global Brain.

Could we be smarter by having people, the library, networks, and computers all work together?  That is the dream I signed on to.  I dreamed of starting with a collection—an Archive, an Internet Archive. This grew to be  a collection of collections: Archives. Then a critical mass of knowledge complete enough to inform citizens worldwide: a Digital Library. A library accessible by anyone connected to the Internet, “Free to All.”

About the Author: Brewster Kahle, Founder & Digital Librarian, Internet Archive

A passionate advocate for public Internet access and a successful entrepreneur, Brewster Kahle has spent his career intent on a singular focus: providing Universal Access to All Knowledge. He is the founder and Digital Librarian of the Internet Archive, one of the largest digital libraries in the world, which serves more than a million patrons each day. Creator of the Wayback Machine and lending millions of digitized books, the Internet Archive works with more than 800 library and university partners to create a free digital library, accessible to all.

Soon after graduating from the Massachusetts Institute of Technology where he studied artificial intelligence, Kahle helped found the company Thinking Machines, a parallel supercomputer maker. He is an Internet pioneer, creating the Internet’s first publishing system called Wide Area Information Server (WAIS). In 1996, Kahle co-founded Alexa Internet, with technology that helps catalog the Web, selling it to Amazon.com in 1999.  Elected to the Internet Hall of Fame, Kahle is also a Fellow of the American Academy of Arts and Sciences, a member of the National Academy of Engineering, and holds honorary library doctorates from Simmons College and University of Alberta.

Posted in Discussion, Librarianship, Uncategorized

Amplifying the voices behind books

Exploring how Open Library uses author data to help readers move from imagination to impact

By Nick Norman, Edited by Mek & Drini

Image Source: Pexels / Pixabay from popsugar

According to René Descartes, a creative mathematician, “The reading of all good books is like a conversation with the finest [people] of past centuries.” If that’s true, then who are some of the people you’re talking to?

If you’re not sure how to answer that question, you’ll definitely appreciate the ‘Author Stats’ feature developed by Open Library.

A deep dive into author stats

Author stats give readers clear insights about their favorite authors that go much deeper than the front cover: such as birthplace, gender, works by time, ethnicity, and country of citizenship. These bits and pieces of knowledge about authors can empower readers in some dynamic ways. But how exactly?

To answer that question, consider a reader who’s passionate about the topic of cultural diversity. However, after the reader examines their personalized author stats, they realize that their reading history lacks diversity. This doesn’t mean the reader isn’t passionate about cultural diversity; rather, author stats empower the reader to pinpoint specific stats that can be diversified.
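As a rough illustration of how a reading history might be summarized this way, the sketch below tallies one attribute across the authors a reader has logged. The record layout, the field names (`gender`, `country_of_citizenship`) and the placeholder authors are all assumptions made for the example; this is not Open Library's actual schema or code.

```python
from collections import Counter

# Hypothetical reading log: one record per author a reader has logged.
# Field names and values are placeholders, not Open Library's data model.
reading_log = [
    {"author": "Author A", "gender": "female", "country_of_citizenship": "Nigeria"},
    {"author": "Author B", "gender": "male", "country_of_citizenship": "United States"},
    {"author": "Author C", "gender": "male", "country_of_citizenship": "United States"},
    {"author": "Author D", "gender": None, "country_of_citizenship": "Japan"},
]


def tally(records, field):
    """Count how often each value of `field` appears; missing values count as 'unknown'."""
    return Counter(record.get(field) or "unknown" for record in records)


def share(records, field):
    """Return each value's share of the reading log as a fraction between 0 and 1."""
    counts = tally(records, field)
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}


if __name__ == "__main__":
    print(tally(reading_log, "country_of_citizenship"))
    print(share(reading_log, "gender"))
```

A reader looking at the country tally would see at a glance which places dominate their shelf and which are missing.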

Take a moment … or a day, and think about all the books you’ve read — just in the last year or as far back as you can. What if you could align the pages of each of those books with something meaningful … something that matters? What if each time you cracked open a book, the voices inside could point you to places filled with hope and opportunity?

According to Drini Cami, Open Library’s lead developer behind Author Stats, “These stats let readers determine where the voices they read are coming from.” Drini continues, “A book can be both like a conversation as well as a journey.” He also says, “Statistics related to the authors might help provide readers with feedback as to where the voices they are listening to are coming from, and hopefully encourage the reading of books from a wider variety of perspectives.” Take a moment to let that sink in.

Data with the power to change

While Open Library’s author stats can show author-related demographics, those same stats can do a lot more than that. Drini Cami went on to say that, “Author stats can help readers intelligently alter their behavior (if they wish to).” A profound statement that Mark Twain, one of the best writers in American history, might even shout from the rooftops.

Broad, wholesome, charitable views of [people] … cannot be acquired by vegetating in one little corner of the earth all one’s lifetime. — Mark Twain

In the eyes of Drini Cami and Mark Twain, books are like miniature time machines that have the power to launch readers into new spaces while changing their behaviors at the same time. For it is only when a reader steps out of their corner of the earth that they can step forward towards becoming a better person — for the entire world.

Connecting two worlds of data

Open Library has gone the extra mile to surface author demographic data that some readers may not realize exists. It started with Open Library’s commitment to providing its readers with what Drini Cami describes as “clean, organized, structured, queryable data.” Simply put, readers can trust that Open Library’s data can be used to provide its audiences with maximum value. Which raises the question: where is all that ‘value’ coming from?

Drini Cami calls it “linked data”. In not so complex terms, you may think of linked data as being two or more storage sheds packed with data. When these storage sheds are connected, well… that’s when the magic happens. For Open Library, that magic starts at the link between Wikidata and Open Library knowledge bases.

Wikidata, a non-profit community-powered project run by Wikimedia, the same team which brought us Wikipedia, is a “free and open knowledge base that can be read and edited by both humans and machines”. It’s like Wikipedia except for storing bite-sized encyclopedic data and facts instead of articles. If you look closely, you may even find some of Wikidata’s data being leveraged within Wikipedia articles.

Wikidata is where Open Library gets its author demographic data. This is possible because the entries on Wikidata often include links to source material such as books, authors, learning materials, e-journals, and even to other knowledge bases like Open Library’s. Because of these links, Open Library is able to share its data with Wikidata and oftentimes get back detailed information and structured data, such as author demographics, in return.
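One way to picture the two connected “storage sheds” is as a join on a shared identifier. Wikidata stores Open Library identifiers on its items (property P648), so an author record on one side can be matched to an item on the other. The sketch below uses made-up placeholder records and hypothetical field names; it only illustrates the join and is not the code Open Library runs.

```python
# Shed one: Open Library author records keyed by (placeholder) Open Library IDs.
open_library_authors = {
    "OL0000001A": {"name": "Author A", "work_count": 12},
    "OL0000002A": {"name": "Author B", "work_count": 3},
}

# Shed two: Wikidata-style items keyed by (placeholder) item IDs. Each item may
# carry the Open Library ID it is linked to, plus some demographic statements.
wikidata_items = {
    "Q0000001": {
        "open_library_id": "OL0000001A",
        "place_of_birth": "Lagos",
        "country_of_citizenship": "Nigeria",
    },
    "Q0000003": {
        "open_library_id": "OL0000009A",
        "place_of_birth": "Osaka",
        "country_of_citizenship": "Japan",
    },
}


def link_authors(ol_authors, wd_items):
    """Attach Wikidata demographics to each Open Library author that has a linked item."""
    # Index the Wikidata items by the Open Library ID they point at.
    by_ol_id = {
        item["open_library_id"]: (item_id, item)
        for item_id, item in wd_items.items()
        if item.get("open_library_id")
    }
    linked = {}
    for ol_id, author in ol_authors.items():
        match = by_ol_id.get(ol_id)
        if match is None:
            # No link yet: keep the author record, flag the missing connection.
            linked[ol_id] = {**author, "wikidata_id": None}
            continue
        item_id, item = match
        demographics = {k: v for k, v in item.items() if k != "open_library_id"}
        linked[ol_id] = {**author, "wikidata_id": item_id, **demographics}
    return linked


if __name__ == "__main__":
    for ol_id, record in link_authors(open_library_authors, wikidata_items).items():
        print(ol_id, record)
```

Authors without a link simply come back with `wikidata_id` set to `None`, which is the gap that new connections between the two projects would fill.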

Wrangling in the Data

Linking up services like Wikidata and Open Library doesn’t happen automatically. It requires the hard work of “Metadata Wranglers”. That’s where Charles Horn, the lead Data Engineer at Open Library, comes in: without his work, author stats would not be possible.

Charles Horn works closely with Drini Cami and also the team at Wikidata to connect book and author resources on Open Library with the data kept inside Wikidata. By writing clever bots and scripts, Charles and Drini are able to make tens of thousands of connections at scale. To put it simply, as both Open Library and Wikidata grow, their resources and data will become better connected and more accurate. 

Thanks to the help of “Metadata Wranglers”, Open Library users will always have the smartest results — right at their fingertips. 

It’s in a book …

Once upon a time, ten-time Grammy Award winner Chaka Khan greeted television viewers with her bright voice on the once-popular book reading program, Reading Rainbow. In her words, she sang … “Friends to know, and ways to grow, a Reading Rainbow. I can be anything. Take a look, it’s in a book …”

Thanks to Open Library’s author stats, not only do readers have the power to “take a look” into books, they can see further, and truly change what they see.

Try browsing your author stats and consider following Open Library on Twitter.

The “My Reading Stats” option may be found under the “My Books” drop down menu within the main site’s top navigation.

What did you learn about your favorite authors? Please share in the comments below.

Posted in Community, Cultural Resources, Data

Giacomo Cignoni: My Internship at the Internet Archive

This summer, Open Library and the Internet Archive took part in Google Summer of Code (GSoC), a Google initiative to help students gain coding experience by contributing to open source projects. I was lucky enough to mentor Giacomo while he worked on improving our BookReader experience and infrastructure. We have invited Giacomo to write a blog post to share some of the wonderful work he has done and his learnings. It was a pleasure working with you Giacomo, and we all wish you the best of luck with the rest of your studies! – Drini

Hi, I am Giacomo Cignoni, a 2nd year computer science student from Italy. I submitted my 2020 Google Summer of Code (GSoC) project to work with the Internet Archive and I was selected for it. In this blogpost, I want to tell you about my experience and my accomplishments working this summer on BookReader, Internet Archive’s open source book reading web application.

The BookReader features I enjoyed working on the most are page filters (which include “dark mode”) and the text selection layer for certain public domain books. They were both challenging, but above all they had a great impact on the user experience of BookReader. The former lets readers turn white-background, black-text pages into black-background, white-text ones, and the latter allows text to be selected and copied directly from the page images (currently in internal testing).

Short summary of implemented features:

  • End-to-end testing (search, autoplay, right-to-left books)
  • Generic book from Internet Archive demo
  • Mobile BookReader table of contents
  • Checkbox for filters on book pages (including dark mode)
  • Text selection layer plugin for public domain books
  • Bug fixes for page flipping
  • Using high resolution book images bug fix

First approach to GSoC experience

Once I received the news that I had been selected for GSoC with Internet Archive for my BookReader project, I was really excited, as it was the beginning of a new experience for me. For the same reason, I will not hide that I was a little bit nervous because it was my first internship-like experience. Fortunately, even from the start, my mentor Drini and also Mek were supportive and also ready to offer help. Moreover, the fact that I was already familiar with BookReader was helpful, as I had already used it (and even modified it a little bit) for a personal project.

For most of May, starting on the 6th (the day of the GSoC selection), I mainly focused on getting to know the other members of the UX team at Internet Archive, whom I would be working with for the rest of the summer, and on defining a more precise roadmap of my future work with my mentor, as my proposed project was open to any improvements for BookReader.

End to end testing

The first tasks I worked on, as stated in the project, were about end-to-end testing for BookReader. I learned about the Testcafe tool that was to be used, and my first real task was to explore and remove some old QUnit tests (#308). Then I started to write end-to-end tests for the search feature in BookReader, both for desktop (#314) and mobile (#322). Lastly, I fixed the existing autoplay end-to-end test (#344) that was causing problems. I also prepared end-to-end tests for right-to-left books (#350), but they weren’t merged immediately because they needed a feature that I would implement later: a system for choosing different books from the IA servers to display by specifying the book id in the URL.

This work on testing (which lasted until about the 20th of June) was really helpful at the beginning, as it allowed me to gain confidence with the codebase without immediately attempting harder tasks, and also to gain confidence with JavaScript ES6. The frequent meetings with my mentor and other members of the team made me really feel part of the workplace.

Working on the source code

The table of contents panel in BookReader mobile

My first experience working on core BookReader source code was during the Internet Archive hackathon on May 30th when, with the help of my mentor, I created the first draft of the table of contents panel for mobile BookReader. I resumed work on this feature in July, refining it until it was released (#351). I then worked on a checkbox to apply different filters to the book page images, still on mobile BookReader (#342), which includes a sort of “dark mode”. This feature was probably the one I enjoyed working on the most: it was challenging but not too difficult, it involved some planning rather than being purely technical, and it received great appreciation from users.

Page filters for BookReader mobile let you read in a “dark mode”
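The idea behind a “dark mode” page filter can be sketched as a simple pixel inversion: white paper becomes black and black ink becomes white. In a browser this kind of effect is usually applied through image filters rather than by touching pixels by hand, so the Python below only illustrates the arithmetic on a tiny made-up grayscale page. It is an assumption-laden sketch, not BookReader's implementation.

```python
def invert_page(pixels, max_value=255):
    """Invert a grayscale page given as rows of 0..max_value intensities."""
    return [[max_value - value for value in row] for row in pixels]


# A tiny made-up "page": 255 is white paper, 0 is black ink.
page = [
    [255, 255, 255, 255],
    [255, 0, 0, 255],
    [255, 255, 255, 255],
]

dark_page = invert_page(page)

if __name__ == "__main__":
    for row in dark_page:
        print(row)
```

Applying the same filter twice gives back the original page, which is why a single checkbox is enough to switch it on and off.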

Then I worked on the generic demo feature: a particular demo for BookReader which allows you to choose a book from the Internet Archive servers to be displayed, by simply adding the book id in the URL as a parameter (#356). This allowed the right-to-left end-to-end test to be merged and proved to be useful for manually testing the text selection plugin. In this period I also fixed two page flipping issues: one more critical (when flipping pages in quick succession the pages started turning back and forth randomly) (#386), and the other less urgent, though a user had specifically pointed it out (in an old BookReader demo it was impossible to turn pages at all) (#383). Another issue I solved was BookReader not correctly displaying high resolution images on high resolution displays (#378).
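The generic demo idea, picking the book from a URL parameter, can be sketched in a few lines. The parameter name `ocaid`, the fallback id and the example URLs below are assumptions for illustration only; the real demo is written in JavaScript and may differ in its details.

```python
from urllib.parse import parse_qs, urlparse

# Hypothetical fallback used when the URL does not name a book.
DEFAULT_BOOK_ID = "example-default-book"


def book_id_from_url(url, param="ocaid", default=DEFAULT_BOOK_ID):
    """Return the book id given in the URL's query string, or a default."""
    query = parse_qs(urlparse(url).query)
    values = query.get(param)
    # parse_qs returns a list per parameter; take the first occurrence.
    return values[0] if values else default


if __name__ == "__main__":
    print(book_id_from_url("https://example.org/demo.html?ocaid=some-book-id"))
    print(book_id_from_url("https://example.org/demo.html"))
```

Because the book is chosen by the URL alone, a test can point the same demo page at a right-to-left book or at any other title without changing code.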

Open source project experience

One aspect I really enjoyed of my GSoC is the all-around experience of working on an open source project. This includes leaving more approachable tasks for the occasional member of the community to take on and helping them out. Also, I found it interesting working with other members of the team aside from my mentor, both for more technical reasons and for help in UI designing and feedback about the user experience: I always liked having more points of view about my work. Moreover, direct feedback from users, who showed appreciation for the newly implemented features (such as BookReader “dark mode”), was very motivating and pushed me to do better in the following tasks.

Text selection layer

The normally invisible text layer shown red here for debugging

The biggest feature of my GSoC was implementing the ability to select text directly on the page image from BookReader for public domain books, in order to copy and paste it elsewhere (#367). This was made possible because Internet Archive books have information about each word and its placement in the page, which is collected by doing OCR. To implement this feature we decided to use an invisible text layer placed on top of the page image, with words being correctly positioned and scaled. This made it possible to use the browser’s text selection system instead of creating a new one. The text layer on top of the page was implemented using an SVG element, with subelements for each paragraph and word in the page. The use of the SVG instead of normal HTML text elements made it a lot easier to overcome most of the problems we expected to find regarding the correct placement and scaling of words in the layer.
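To give a feel for the approach, here is a minimal sketch that turns OCR word boxes into an SVG layer with one group per paragraph and one text element per word. The input format (tuples of text, left, top, right, bottom in page-image pixels), the function name and the styling choices are assumptions made for this example; the actual BookReader plugin is written in JavaScript and is not reproduced here.

```python
import xml.etree.ElementTree as ET


def build_text_layer(paragraphs, page_width, page_height):
    """Build an SVG text layer from OCR word boxes.

    `paragraphs` is a list of paragraphs, each a list of
    (text, left, top, right, bottom) tuples in page-image pixel coordinates.
    This input format is hypothetical.
    """
    svg = ET.Element("svg", {
        "xmlns": "http://www.w3.org/2000/svg",
        # The viewBox uses page-image coordinates, so the layer scales
        # together with the image when it is stretched over it.
        "viewBox": f"0 0 {page_width} {page_height}",
        "width": "100%",
        "height": "100%",
    })
    for words in paragraphs:
        group = ET.SubElement(svg, "g", {"class": "paragraph"})
        for text, left, top, right, bottom in words:
            word = ET.SubElement(group, "text", {
                "x": str(left),
                "y": str(bottom),                     # baseline at the box bottom
                "font-size": str(bottom - top),       # box height sets the text size
                "textLength": str(right - left),      # stretch the word to the box width
                "lengthAdjust": "spacingAndGlyphs",
                "fill-opacity": "0",                  # invisible, but still selectable
            })
            word.text = text
    return ET.tostring(svg, encoding="unicode")


if __name__ == "__main__":
    sample = [[("Hello", 10, 20, 60, 40), ("world", 70, 20, 120, 40)]]
    print(build_text_layer(sample, page_width=200, page_height=100))
```

Keeping every coordinate in the page image's own units and letting the `viewBox` handle scaling is one reason an SVG layer can be simpler to align than absolutely positioned HTML text.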

I worked sporadically on this feature from the start of July, which led to a workable demo by the first day of August. The rest of August was spent refining this feature to make it production-ready. This included refining word placement in the layer, adding unit tests, adding support for more browsers, refactoring some functions, making the experience more fluid, and making the copied text accurate with respect to newlines and spaces. The most challenging part was probably integrating the text selection actions into the two-page view of BookReader without disrupting click-to-flip-page and other functionality related to mouse-click events.

This feature is currently in internal testing, and scheduled for release in the next few weeks.

The text selection experience


Overall, I was extremely satisfied with my GSoC at the Internet Archive. It was a great opportunity to learn new things for me. I got much more fluent in JavaScript and CSS, thanks to both my mentor and using these languages in practice while coding. I learnt a lot about working on an open source project, but a part that I probably found really interesting was attending and participating in the decision making processes, even about projects I was not involved in. It was also interesting for me to apply concepts I had studied on a more theoretical level at university in a real workplace environment.

To sum things up, the ability to work on something I liked that had an impact on users and the ability to learn useful things for my personal development really made this experience worthwhile for me. I would 100% recommend doing a GSoC at the Internet Archive!

Posted in BookReader, Community, Google Summer of Code (GSoC), Open Source

Google Summer of Code 2020: Adoption by Book Lovers

by Tabish Shaikh & Mek

OpenLibrary.org, the world’s best-kept library secret: Let’s make it easier for book lovers to discover and get started with Open Library.

Hi, my name is Tabish Shaikh and this summer I participated in the Google Summer of Code program with Open Library to develop improvements which will help book lovers discover and use OpenLibrary.org.

My Journey into Open Source

When I got to college, I could tell classes would not be enough to help me get the hands-on experience I would need to gain confidence in my programming abilities. I heard from friends and professors within my university that open source projects presented a great opportunity to work with established engineers in the field to gain hands-on experience.

In the past, I tried contributing to a few well known open source projects, like Wikipedia. I selected Wikipedia because the community is large, active, and well established, there’s a lot of documentation, and the project is in a programming language I know well.

I quickly became overwhelmed. Wikipedia may be well established, but a project of that size felt difficult to navigate without a mentor to guide me. I was able to successfully set up my environment, but then I had trouble finding an appropriate first issue to work on and hit a dead end as I tried to familiarize myself with the code. I found myself wishing for a chance to work more closely with the community.

One evening in March of 2018, I was searching for a free algorithms book on Google and discovered Open Library. I had trouble finding the exact book I was looking for, but I could tell Open Library was an important library resource for accessing free books online and I noted their dated design as a big opportunity for improvement. So I bookmarked the page in my browser and was surprised to discover a “Help Us” button. I clicked the button and landed on a GitHub issue which mentions their community calls. This gave me confidence there was a community which could help me get started and answer my questions, so I decided to give it a shot.

The community calls gave me a guided path for positively improving the experience of patrons using the service. During the community calls, members present what they’ve completed, what they’re working on, and what they may be stuck with. In reality, this is a way to be seen for your achievements, update others, and receive help. Having this type of structure helped me discover which opportunities existed and how to approach and plan a solution to each problem.

This experience was really special to me because it was the first time I had been part of an international community and all of the members were aligned toward a common goal of universal access to knowledge.

In the first few months of volunteering I redesigned the website footer and made several pull requests. I also noticed Salman was participating in Google Summer of Code (GSoC) in 2018. I applied to work with Open Library for GSoC in 2019 and was disappointed to learn the Internet Archive didn’t have enough slots for Open Library to participate. Fortunately, I worked with Mek, Open Library’s program lead, who recognized my contributions and arranged an “Internet Archive Summer of Code” (IASoC) internship program where we accomplished a major victory of releasing the sponsorship program which empowers the community to make meaningful, diverse books more available to borrow. You can read the blog post here which was picked up by BoingBoing and Gizmodo.

Noticing a Problem

During my years volunteering, we recognized several indicators that Open Library could be better serving its mission by distributing to a larger audience. Open Library, which has millions of free books to borrow, has an international Alexa rank of #11,079, compared to Goodreads, which is a top #300 website without having books to borrow. The data also showed many patrons would drop off at the registration page because it didn’t offer immediate field validation and the fields would be cleared upon submit if, for example, an email was already registered. The book pages, our most frequently viewed pages, were also very slow to load, causing patrons to drop off. The experience of the book pages was also confusing because there were separate views for Works and Editions. Because of all these factors, only around 6% of the Internet Archive’s books were checked out, meaning 94% of the catalog remained underutilized.

I applied to GSoC 2020 with a plan, “Adoption By Book Lovers”, to resolve some of these key issues, help more people like myself discover and derive value from the Open Library, and hopefully improve their first experience in the process.

Placing our bets

In the service of helping more patrons discover Open Library, increasing our utilization and engagement, and decreasing confusion and bad experiences, we made 5 key bets:

  1. Improving Sign Up
  2. Book Page Redesign
  3. Shareable Profiles & Public Reading Log
  4. Imports & Exports
  5. Twitter Bot

There’s a common saying, “the first impression is the last impression”. This has certainly been true for many patrons attempting to sign up for an Open Library account. The easiest, surest way to help more patrons derive value from the Open Library platform is by Improving Sign Up; reducing the friction and early negative first impressions during account creation.

Open Library’s mission for 2019 was “Reducing bad experiences, confusion, & dead-ends”. By combining our Works and Editions pages into a single more performant Book Page Redesign we believed we’d reduce the confusion of users searching for their favourite books and in turn, also increase distribution. The DoubleClick study by Google shows that 53% of patrons drop off if page load exceeds 3 seconds and this carries significant SEO penalties. While redesigning our Book Page, a key consideration was page-load performance because we knew this would increase our rank in search engine results and increase retention through the lending and registration funnels.

Finally, by betting on social features, like shareable profiles and public-by-default reading logs, the ability to import books from Goodreads, and a Twitter bot, @borrowbot, to help patrons discover which books are available to read and borrow on Open Library, we felt confident we could increase the number of patrons that may discover and adopt OpenLibrary.org.

Improving Signup

In 2018, we coincidentally hit a regression (#1431) on our account creation page, which presented itself as a server error for patrons trying to register a new account when their username or email was already taken.

Because of this bug, our daily registered users dropped from ~2300 to ~1700 (-500). Through this, we discovered that nearly 1/5 of patrons (i.e. 500 a day) who attempted registration would hit some validation issue when creating their account (e.g. email or username invalid or taken, reCAPTCHA broken). Even after solving the #1431 regression, we hypothesized that many of these 500 patrons were hitting error cases which refreshed the page and cleared their form inputs, causing patrons to bounce. An easy solution was adding real-time validation to ensure emails, usernames, passwords, and reCAPTCHA are valid before the form is submitted.

In order to implement real-time validation, we planned Epic #1433, which included two pieces:

  • #2053 – Update backend API endpoints.
  • #2055 – Add real-time field validation for email, username, password to show errors before submission.
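As a rough illustration of the per-field checks described above, here is a minimal Python sketch of a validator that a backend endpoint could call and whose result a form could display before submission. The function name, regular expressions, length limits, and error messages are all assumptions made for this example and are not taken from the Open Library codebase.

```python
import re

# Hypothetical rules for illustration; Open Library's real checks may differ.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
USERNAME_RE = re.compile(r"^[A-Za-z0-9_-]{3,20}$")
MIN_PASSWORD_LENGTH = 8


def validate_signup(fields, taken_emails=frozenset(), taken_usernames=frozenset()):
    """Return a dict mapping each invalid field name to an error message.

    An empty dict means every field passed, so the form can be submitted.
    """
    errors = {}
    email = fields.get("email", "").strip()
    username = fields.get("username", "").strip()
    password = fields.get("password", "")

    if not EMAIL_RE.match(email):
        errors["email"] = "Please enter a valid email address."
    elif email.lower() in taken_emails:
        errors["email"] = "This email is already registered."

    if not USERNAME_RE.match(username):
        errors["username"] = "Use 3-20 letters, digits, hyphens, or underscores."
    elif username.lower() in taken_usernames:
        errors["username"] = "This username is already taken."

    if len(password) < MIN_PASSWORD_LENGTH:
        errors["password"] = f"Use at least {MIN_PASSWORD_LENGTH} characters."

    return errors
```

Because the function reports every failing field at once and never touches the submitted values, the page can show all errors inline without clearing what the patron has already typed.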

While we do not have great analytics on how conversion increased, we do know from our support channels that these changes have anecdotally resulted in a significant decrease in support emails around patron signup.

Book Page Redesign

User interviews and surveys have taught us that most patrons who visit Open Library are trying to find a “Book”. Many patrons report that the terms Work and Edition confuse them. This confusion is compounded because a patron can unpredictably be dropped into either a Work page or an Edition page, which have different designs.

Our goal in redesigning the Book Page was to make the experience clearer and improve page loading times. To improve clarity and simplify the experience, we merged the work and edition pages into a single book page where patrons may find all the information about a work and learn about the availability of various editions without having to navigate multiple pages.

When redesigning the Book Page, we made the following changes:

  • Editions table. We made the editions table front-and-center to enable readers to quickly switch between the different editions. We also feature editions by availability and language, and allow patrons to change how many results are shown at a time. We added a new search box to enable patrons to find relevant editions without reloading. 
  • Navigation tabs. We have bucketed the work’s information into an “Overview” tab and the current Edition’s information in the “This Edition” tab. The tab bar always sticks to the top of the page for easy access to different sections of the page.
  • Expandable descriptions. In previous designs, long text descriptions made it difficult to see all important book information at a glance. There are now “Read more” links to expand and collapse long descriptions.
  • Clearer buttons. All the favorite actions of readers such as borrowing, searching inside, adding books to one’s reading log, and book star ratings have been grouped together and moved right below the book cover. It’s hopefully more clear now that the “Want to Read”
  • Load times. We know page speed is a priority for readers. The new Books Page should be significantly faster thanks to lazy loading of the Related Works carousel.

Considerations. We tried to change as little as possible and were careful not to remove existing functionality:

  • URLs: Developers and partners will be happy to hear that /works and /books URLs and APIs will continue to work as expected without change. Both the work and edition pages will simply appear to use the same consistent design.
  • Lists: While admittedly slightly less convenient, you can still add Works to Lists by clicking the “Use this Work” checkbox. By default, Lists will use Editions.

I had always worked in small teams with few stakeholders and no clash of ideas. The Books Page Redesign was different: the issue had been open for 3 years and was stalled by a clash of interests over how we should display our pages. Completing this issue was a major milestone in my GSoC program, where I learned to cooperate and compromise on some aspects of our design so that all stakeholders were happy.

The feedback we received was that ~65% of patrons found the new Books Page a step forward, ~17% did not have any preference, and ~22% found the change a step backward. We therefore think our hypothesis was correct and that this feature improves user experience and reduces user confusion.

Read more about the Book Page Redesign: https://blog.openlibrary.org/2020/07/08/re-thinking-open-librarys-book-pages/

Additional Book Page improvements

After completing the Book Page redesign, we made two major improvements to help our Librarian community and to improve performance and load times: a better book /edit experience and Lazy Loading of expensive book page components (e.g. related works carousels).

Book Page Editor. We redesigned the Books Page Editor to enable our librarians to edit book metadata with ease.

Lazy Loading of Related Carousels. To improve the page loading time, we first created a list of components and their timings and noticed that the Related Works and Author Works carousels took the most time to load, slowing down the page by up to 10%. Our hypothesis was therefore that lazy loading the related works carousels would enable our newly designed books pages to load faster.
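The "list of components and their timings" step can be sketched in a few lines of Python. This is only an illustration of the approach: the helper names are hypothetical, and the sample numbers are made up rather than measured on Open Library.

```python
import time


def time_component(render):
    """Return how long one call to `render` takes, in milliseconds."""
    start = time.perf_counter()
    render()
    return (time.perf_counter() - start) * 1000


def rank_by_cost(timings_ms):
    """Sort components from slowest to fastest, with each one's share of the total."""
    total = sum(timings_ms.values())
    if total == 0:
        return []
    ranked = sorted(timings_ms.items(), key=lambda item: item[1], reverse=True)
    return [(name, ms, ms / total) for name, ms in ranked]


# Made-up timings for illustration only; these are not measured values.
sample = {
    "cover": 40.0,
    "editions_table": 110.0,
    "related_works": 250.0,
    "author_works": 100.0,
}
for name, ms, share in rank_by_cost(sample):
    print(f"{name:15} {ms:6.1f} ms  {share:.0%}")
```

Components that sit at the top of such a ranking but are not needed for the first view of the page are the natural candidates for lazy loading.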

The impact of this change was that pages now load up to 10% faster.

Shareable Profiles & Public Reading Logs 

We noticed that very few patrons share their reading logs or even know they can be shared. However, we know patrons on Goodreads share their reading logs frequently. And also, lists on Open Library are shared all the time. Why is this?

In 2017, when Open Library announced the new Reading Log feature, it was set to be private by default. We expected many patrons would change their reading logs to be public, but because the setting wasn’t public by default and was difficult to discover, patrons didn’t know the feature existed and had no reason to make their logs public.

In the spirit of being an open platform, we wanted patrons to have the opportunity to make their reading logs public to patrons with similar interests. As a result, we decided to make Reading Logs public by default for new accounts created after 2020-05, with the option for any patron to set their reading logs to private. Even after making this change, we noticed patrons trying to share their generic /account/books page. However, this page always reflects the content of the currently logged-in user.

By always redirecting /account/books to the publicly shareable /people/username url, we are able to move in a direction which enables patrons to freely share their reading logs and paves the way for other features like “following”, which we’re interested in exploring next year.
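The redirect described above can be sketched as a small path-mapping function. This is a hedged illustration, not Open Library's actual handler: the function name, the /people/username/books target layout, and the login path used for anonymous visitors are assumptions.

```python
def resolve_reading_log_path(path, logged_in_username=None):
    """Map the generic /account/books path to a shareable per-patron path."""
    prefix = "/account/books"
    if path != prefix and not path.startswith(prefix + "/"):
        return path  # not a reading-log path; leave it untouched
    if logged_in_username is None:
        # Hypothetical login path: anonymous visitors have no reading log yet.
        return "/account/login"
    suffix = path[len(prefix):]  # keeps a trailing shelf name, if any
    return f"/people/{logged_in_username}/books{suffix}"


print(resolve_reading_log_path("/account/books/want-to-read", "mek"))
```

Since the resulting URL carries the username rather than depending on who is logged in, anyone opening a shared link sees the same reading log.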

Enabling these changes required:

This change simplified how users share their reading log and profile pages publicly, paving a path for more social additions to Open Library.

Imports & Exports

Goodreads provides a way to download/export a list of books from one’s bookshelves. Our import feature allows a patron to take such an exported dump of their reading log from Goodreads and add each of these books to their Open Library account.

The Goodreads import feature from https://openlibrary.org/account/import

The export option enables patrons to download a list of Open Library book identifiers from their reading log.

The download export option from https://openlibrary.org/account/import
A picture of a CSV file created by the exporter
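To make the import and export flow more concrete, here is a minimal Python sketch using only the standard csv module. It assumes the Goodreads export has "ISBN", "ISBN13", and "Exclusive Shelf" columns and may wrap ISBNs as ="..."; the output column names ("Work Id", "Bookshelf") and all sample values are made up for illustration and may not match the real files.

```python
import csv
import io


def clean_isbn(raw):
    """Strip the ="..." wrapper and hyphens that spreadsheet-style exports may add."""
    return raw.strip().lstrip("=").strip('"').replace("-", "")


def read_goodreads_export(csv_text):
    """Yield (isbn, shelf) pairs for every row that carries an ISBN."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        isbn = clean_isbn(row.get("ISBN13") or "") or clean_isbn(row.get("ISBN") or "")
        if isbn:
            yield isbn, row.get("Exclusive Shelf") or ""


def write_reading_log_export(entries):
    """Serialize (work_id, shelf) pairs to CSV text with a header row."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Work Id", "Bookshelf"])  # hypothetical column names
    writer.writerows(entries)
    return out.getvalue()


# Made-up sample rows; the second one has no ISBN and is skipped on import.
SAMPLE_EXPORT = '''Title,ISBN,ISBN13,Exclusive Shelf
Example Book,"=""123456789X""","=""9781234567897""",read
Untitled Draft,,,to-read
'''

for isbn, shelf in read_goodreads_export(SAMPLE_EXPORT):
    print(isbn, shelf)
print(write_reading_log_export([("OL1W", "Want to Read")]))
```

Rows without any ISBN are skipped here because an ISBN is the only identifier this sketch uses to match a book.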

Twitter Bot

Our objective for this task was to reach more patrons/readers and help them discover more books on openlibrary.org. According to a hashtag analytics audit of #books and #amazon, run on tweetbinder.com using the free version, the number of original tweets (excluding retweets) in a 7-day period was approx. 140, with an impact of 11M. This is therefore a great opportunity for making our bookshelves discoverable.

Whenever a user tweets out a book with an Amazon link or an ISBN, the Twitter @borrowbot retweets the book with its link from Open Library if it is available. Each book is tweeted only once.
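The matching logic behind such a bot can be sketched without any Twitter API calls. The Python below is an illustration under stated assumptions: it treats the identifier in an Amazon /dp/ link as an ISBN-10, it interprets "only once" as once per ISBN, and the `available` set stands in for a real availability lookup. All function names are hypothetical.

```python
import re

# Runs of digits and hyphens ending in a digit or X, not glued to other word characters.
CANDIDATE_RE = re.compile(r"(?<![\w-])[\d-]{9,16}[\dXx](?![\w-])")


def is_valid_isbn(isbn):
    """Check the ISBN-10 or ISBN-13 checksum of a hyphen-free string."""
    if len(isbn) == 10:
        if not (isbn[:9].isdigit() and (isbn[9].isdigit() or isbn[9] == "X")):
            return False
        values = [int(c) for c in isbn[:9]] + [10 if isbn[9] == "X" else int(isbn[9])]
        return sum(w * v for w, v in zip(range(10, 0, -1), values)) % 11 == 0
    if len(isbn) == 13 and isbn.isdigit():
        return sum(int(c) * (3 if i % 2 else 1) for i, c in enumerate(isbn)) % 10 == 0
    return False


def find_isbns(text):
    """Return the distinct valid ISBNs mentioned in a tweet, in order of appearance."""
    found = []
    for match in CANDIDATE_RE.finditer(text):
        isbn = match.group().replace("-", "").upper()
        if is_valid_isbn(isbn) and isbn not in found:
            found.append(isbn)
    return found


def plan_replies(tweets, available, already_shared):
    """Return (tweet_id, isbn) pairs to share, sharing each ISBN at most once."""
    replies = []
    for tweet_id, text in tweets:
        for isbn in find_isbns(text):
            if isbn in available and isbn not in already_shared:
                already_shared.add(isbn)  # remember it so it is never shared twice
                replies.append((tweet_id, isbn))
    return replies
```

The checksum test filters out most phone numbers and order numbers that merely look like ISBNs, which matters when scanning free-form tweets.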


  • In no small part because of the bets we made, our international Alexa rank improved by 10% from #11,079 to #9,893.
  • Our Book Page load times improved on average by ~10%.
  • 2 out of 3 of our patrons approved of our Book Page redesign, with 11% celebrating it as a game changer.
  • More than 5,000 books have already been imported through the Goodreads import tool.
  • Our support team reports a significant decrease in account creation support emails.

What I learned

I always looked for ways to improve my work and have always loved constructive feedback from my mentor Mek, who helped me learn how to estimate time for tasks, how to effectively identify stakeholders and include them in the process (reaching consensus on decisions was a lot harder than I anticipated), and how to communicate problems and achievements in a way which everyone may understand. I also learned that writing takes a long time and that it’s easy to want to code until the deadline. As our founder Brewster Kahle says, “work backwards from the blog post”.

I also had the privilege of applying what I’ve learned to be a mentor for both Sachin Naik (#3627, #3622) and Fatima (#3454) within our community, helping them submit some of their first pull requests for Open Library.

Posted in Bulk Access, Community, Google Summer of Code (GSoC), Open Source | Comments closed

Open Library for Language Learners

By Guyrandy Jean-Gilles 2020-07-21

A quick browse through the App Store and aspiring language learners will find themselves swimming in useful programs.

But for experienced linguaphiles, the never-ending challenge is finding enough raw content and media to consume in their adopted tongue. Open Library can help.

Earlier this year, Open Library added reading levels to its catalog for more than three thousand books. The ability to search by reading level, combined with filtering by language, provides the savvy patron a convenient way to find, read, and listen to handfuls of elementary books in their desired language.

Getting the most out of Open Library’s BookReader

One of the most valuable settings of Internet Archive’s BookReader for language learners is Read Aloud. I highly recommend using this feature while reading to ensure your pronunciation is perfect. Just about any modern browser supports Read Aloud out of the box.

Tip: If you want the most natural sounding voices, believe it or not, Microsoft Edge is your best choice. If Microsoft Edge isn’t available on your platform, there are likely ways to install more natural voices via plugins or other methods.

Finding Books at your Literacy Level

From the main menu, click on the “Browse” drop-down and select K-12 Student Library.

You’ll next be presented with a Student Library where you may choose books by “reading level” or “grade”. In my experience, selecting by reading level offers more non-English options. Also in my experience, the higher the reading level, the more non-English options are available. Your mileage may vary.

Let’s pick “Grade 12” for now. We should see books tagged at a 12th grade reading level, predominantly in English. Let’s change that by scrolling and adding a filter for only Spanish books on the right sidebar.
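For readers who prefer scripting to clicking, the same combination of reading level and language can be expressed as a search query. The sketch below only builds a URL for Open Library's search endpoint. The query field names (`subject`, `language`), the reading-level label, and the three-letter language code are assumptions for illustration and may not match how the catalog tags these books.

```python
from urllib.parse import urlencode


def build_search_url(reading_level, language_code):
    """Build a search URL combining a reading-level subject and a language filter."""
    # Both query terms are assumed field names, not confirmed catalog conventions.
    terms = [f'subject:"{reading_level}"', f"language:{language_code}"]
    return "https://openlibrary.org/search.json?" + urlencode({"q": " ".join(terms)})


print(build_search_url("Grade 12", "spa"))
```

If the results come back empty, browsing the K-12 Student Library as described here remains the reliable route.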

Now our results show Spanish editions at a 12th grade reading level! You may notice many of these available books are out-of-copyright translations of the “literary canon.” This is largely what’s available in Open Library’s non-English K-12 catalog at the time of writing this post. Let’s select Hamlet by William Shakespeare.

To select the Spanish edition of Hamlet, scroll to the editions table and type “Spanish” into the edition search bar. Then, click “Read” to open the BookReader.

Because this is an unrestricted book, you may click the Read button to begin reading. If you want to take advantage of the Read Aloud feature, hover over the headphones icon on the right side of the Read button and click Listen.

BookReader should automatically narrate the book in the text’s native language. The passages will be highlighted as they are read aloud. If the voice appears to be incorrect, this may mean your browser does not have access to a suitable digital voice to read aloud the book’s language. We’ve found Microsoft Edge and Google Chrome to be reliable options.


Now you have all the tools you need to find and read books in other languages. I cannot stress the Read Aloud feature enough as it allows me to hear new words spoken as I’m introduced to them. No matter where you are in your language learning journey, reading and listening to books in your target language can only accelerate your progress. Let Open Library help you along the way. ¡Disfrutar!

Did you enjoy this article? Please let us know on Twitter!
