Early on Tuesday, April 28th, around 7:30am Pacific, high database load was detected on OpenLibrary.org. Investigation revealed a set of at least 38,703 residential IP addresses performing a coordinated SQL injection attack on a vulnerable openlibrary.org endpoint, resulting in the exfiltration of emails and encrypted passwords for 175,080 legacy accounts registered before March 2011. This table has not been used for authentication since 2016; however, we advise affected patrons to change their passwords on any relevant platforms.
The attack was identified and mitigated within a four-hour window. Impact was limited because the attack could only process a single account query per malicious request. The affected table is old and no longer in use; it was formerly used for Open Library sign-in prior to switching to Archive.org login credentials in 2016.
Details
Prior to 2016, Open Library maintained its own login system distinct from Archive.org, which used a legacy account database table. In 2016, both for improved security and patron convenience, the Open Library website switched to a unified model where authentication is performed using archive.org credentials and not legacy Open Library credentials. Since this date no new Open Library patron account passwords have been stored within Open Library’s legacy account database.
Today’s incident only affects a subset of legacy accounts whose credentials are no longer in active use. Furthermore, no plaintext passwords were compromised – all passwords in this table were both salted and encrypted.
Remediation & Impact
Upon discovery, the exploited path was blocked at the nginx level, and a security patch was then deployed to our servers. All accounts in the no-longer-in-use legacy `account` table have had their encrypted password fields cleared. We are releasing a tool to check whether your email was affected by the breach.
Check If Your Account is Affected
If your email is on this list, out of an abundance of caution, we recommend changing your password on any service that shares the password you used when registering your OpenLibrary account.
Open Library’s Security Policy
The Open Library team routinely monitors security alerts, performs SQL injection audits, and takes the security reports we receive seriously. We believe strongly in full transparency when incidents occur, both so our patrons have the best information to make decisions and understand our responses, and so our developer community can help report and address issues.
Thank you for your patience and understanding, and our sincere apologies for the poor behavior of these malicious actors and the impact this has had on our community. As AI tooling makes it easier for malicious actors to attack websites like ours, our team will continue to proactively take steps to put our patrons’ privacy and security first.
Today’s challenge is to find “The Secret of Secrets” by Dan Brown using Open Library search. It’s not impossible, but it is not easy… And it’s not just because the Я is backwards on the cover.
If you search for “the secret of secrets”, you won’t find the right match in the first two pages of results.
In this example, our current search algorithm is biased too heavily toward returning books that have lots of editions to vouch for them, as well as toward other boosting factors (like star ratings) that don’t always produce the desired result.
What search algorithm would perform better? And how do we know whether one approach is better than another?
These are key questions Drini Cami — core maintainer of Open Library search — has been investigating this month.
Is Search Improving?
In order to know whether we’re making changes that improve the quality of our search results, we can’t just change the algorithm, type in a search, and see if the result is better in that one case. We need to apply a consistent framework across a collection of challenging queries and measure how the system — as a whole — performed before versus after.
In Open Library’s case, Drini maintains a Search Evaluation Spreadsheet that measures 100 common searches—everything from “Harry Potter” and “Little Prince” to “The Secret Garden” and “Narnia”. These searches come from our server logs (i.e. popular searches from patrons). It also contains challenging cases that we’ve seen underperform in the past.
For each search query we’ve collected, we define what we expect the “correct” search result to be and then check how often the correct result appears in the top 3 search results (across the search algorithms we’re considering).
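As a rough illustration of this kind of evaluation, the top-3 check can be sketched as a small script. The function names and toy search index below are hypothetical, for illustration only — this is not Open Library’s actual evaluation tooling:

```python
def top_k_hit_rate(cases, search, k=3):
    """Fraction of queries whose expected record appears in the top-k results.

    `cases` maps a query string to the expected work identifier;
    `search` returns a ranked list of work identifiers for a query.
    """
    hits = sum(
        1 for query, expected in cases.items()
        if expected in search(query)[:k]
    )
    return hits / len(cases)

# Toy stand-in for a search engine (hypothetical identifiers).
def fake_search(query):
    index = {
        "harry potter": ["OL82563W", "OL999W"],
        "the secret of secrets": ["OL111W", "OL222W", "OL333W", "OL444W"],
    }
    return index.get(query, [])

cases = {
    "harry potter": "OL82563W",         # ranked 1st -> hit
    "the secret of secrets": "OL444W",  # ranked 4th -> miss
}
print(top_k_hit_rate(cases, fake_search))  # 0.5
```

Comparing this single number before and after an algorithm change gives a consistent signal of whether search, overall, got better or worse.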
Multiplicative Instead of Additive
For the technical crowd, this change (PR #12357) adjusts Work Search’s Solr eDisMax tuning to use multiplicative boosting (via boost) instead of additive boosting (via bf) to reduce over-weighting of popularity signals relative to textual relevance:
Replaces bf-based additive boosts with an eDisMax boost function expression.
Expands/adjusts qf and phrase-boost parameters (pf, pf2) to change match weighting and proximity scoring.
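To illustrate the difference, here is a hedged sketch of the two styles of eDisMax request parameters. The field names, weights, and boost functions below are hypothetical examples, not the actual values from the PR:

```python
# Illustrative Solr eDisMax parameters contrasting additive vs.
# multiplicative popularity boosting. All field names and weights
# here are made-up examples.

# Before: bf ADDS the popularity function's value to the relevance
# score, so a very popular work can swamp a better textual match.
additive_params = {
    "defType": "edismax",
    "q": "the secret of secrets",
    "qf": "title^10 author_name^5",   # fields to match, with weights
    "pf": "title^20",                 # phrase boost on the whole query
    "pf2": "title^10",                # phrase boost on word bigrams
    "bf": "log(edition_count)",       # additive boost
}

# After: boost MULTIPLIES the relevance score, so popularity scales a
# good textual match instead of overpowering textual relevance.
multiplicative_params = {
    "defType": "edismax",
    "q": "the secret of secrets",
    "qf": "title^10 author_name^5",
    "pf": "title^20",
    "pf2": "title^10",
    "boost": "log(sum(edition_count,1))",  # multiplicative boost
}
```

With an additive `bf`, a work’s popularity contributes the same bonus regardless of how weak its textual match is; with a multiplicative `boost`, a poor textual match stays poor no matter how popular the work is.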
Overall, the change has improved the relevance of our test results by roughly 10%:
“The secret of secrets” now appears as the 3rd result.
“My Life” by Bill Clinton went from 101st to 19th
“laws field guide” went from 14th to 1st
Expect these improvements to be live on the main site early next week. Happy reading!
Future Opportunities: Exact Match versus Browsing
Since releasing this blog post, we received a great question internally from Sawood Alam of the Wayback Machine team, who asks:
Have you measured how this change affects the discovery use case, where a patron does not have one specific document in mind but wants to find out what options are out there (as opposed to the lookup use case, where they already know what they are looking for)?
And this is indeed something the Open Library team has been considering. Sawood is pointing out that there are [at least] two modes of searching:
Exact match
Browsing
One may browse in a variety of different ways, but for simplicity I’d like to refer to this as “searching by proxy”. That is to say, instead of searching for an exact book by title, a patron may endeavor to discover a suitable book by any number or combination of proximal qualities, like author, topic, format, or genre. An example is a search for nonfiction books about UFOs that are advertised as textbooks and published before 1950.
For browsing queries of this flavor, it’s difficult to know (in advance) what book(s) should appear in the top-three position in search results as the answer will often be subjective to the searcher.
As a result, an additional approach will need to be instrumented and added to our existing process, one that (a) accurately identifies when a search term targets a proximal quality rather than an exact (e.g. title) match and (b) introduces secondary evaluation metrics, such as:
Success Rate: How often any result is clicked — perhaps called Query Success Rate (QSR)
Relevance: When a result is clicked… how often is this click for a record in a top-3 position — something like Mean Reciprocal Rank (MRR)
Further research is required to understand how these metrics may be combined in a recipe that results in the best experience for patrons. For instance, [when] is it more important to improve relevance versus the general distribution of clicks? Maybe “better” means increasing the ratio of searches-to-clicks by 20% rather than increasing the number of clicks in the top-3 position by 25% (if search-to-clicks were to drop by 5%).
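As a sketch of how such metrics might be computed, assume a hypothetical representation in which each search session records the 1-based ranks of the results the patron clicked (an empty list means no click). This is illustrative only, not Open Library’s actual instrumentation:

```python
def query_success_rate(sessions):
    """Share of search sessions in which the patron clicked any result."""
    clicked = sum(1 for ranks in sessions if ranks)
    return clicked / len(sessions)

def mean_reciprocal_rank(sessions):
    """Average of 1/rank of the best (lowest-ranked) clicked result,
    counting 0 for sessions with no click. Ranks are 1-based."""
    total = 0.0
    for ranks in sessions:
        if ranks:
            total += 1.0 / min(ranks)
    return total / len(sessions)

# Four hypothetical sessions: clicks at rank 1; ranks 3 and 5;
# no click; and rank 2.
sessions = [[1], [3, 5], [], [2]]
print(query_success_rate(sessions))    # 0.75
print(mean_reciprocal_rank(sessions))  # (1 + 1/3 + 0 + 1/2) / 4 ≈ 0.458
```

A change that lifts QSR but lowers MRR means more searches end in a click, yet those clicks land further down the page — exactly the kind of trade-off the evaluation recipe would have to weigh.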
Suffice it to say, as policy changes make it more challenging for some readers to find the exact book(s) they are looking for, it becomes increasingly meaningful to be able to suggest suitable alternatives — and to measure how effective we are at making relevant recommendations. Measuring browse cases is something we expect to work towards in the coming months.
Hi! I’m Catherine, a curious and avid librarygoer, currently pursuing a career transition into the Library and Information Sciences field.
I joined the Open Library Librarian’s Team as a Librarian-In-Training during the summer of 2025 to learn about the world of online libraries, contribute to an open source project, and be part of a library community. My initial objective was to learn how to work with book metadata in MARC format, but what I ended up learning, and what I’m still learning, has far surpassed my expectations. Before I took on this role, I didn’t know how to code in HTML or Markdown (or any code, really). I had never contributed to an open-source project or submitted a GitHub ticket – and now I’ve done all three! I’ve also created a curated collection, tagged books by subject to further categorize them within the Open Library catalog, and learned how to create book carousels using the subject tags to visually display a collection of books on a page.
As I write about and reflect on my Open Library journey, I’m so proud of what I’ve learned and accomplished, and of the wonderful time I’m having engaging with the Open Library community.
Getting Started & Deciding How to Contribute
I discovered Open Library through the Internet Archive while researching volunteer opportunities within libraries. Open Library’s mission resonated with me, and I felt called to join the community to help make knowledge free and accessible. I submitted my application and not long after, I was invited to the community Slack.
When I joined the Open Library Slack channel as a Librarian-In-Training, I was ready and excited to get started but realized I had no idea where or how to begin. I gained my bearings by reading through the instructional documentation and the Librarians In Training (LIT) Guide, letting the information guide me as I explored and familiarized myself with the website’s layout, behaviours, content, and pages. My interest was piqued when I read the How to Create Curated Collections guide and saw how well the Star Wars and Star Trek collections had been curated. Inspired, I decided to contribute a curated collection of my own, though I was unsure what my collection would showcase. It was while going through the list of all the curated collections that I noticed a distinct lack of Canadian literary awards representation. As a Canadian, this was something I needed to remedy! I decided to highlight Canadian content on Open Library by creating a collection for the CBC’s Canada Reads Awards. This is a literary award I am very familiar with, having read many of the championed books, and one that I continuously use to inspire my own reading list.
About the CBC’s Canada Reads Awards
Canada Reads is a radio program broadcast yearly on CBC Radio One in which five celebrity judges champion and debate a book in the hope of it winning and being crowned Canada’s must-read book of the year. The radio show takes place over five days in five hour-long segments, with one book voted off each day. Canada Reads was launched in 2002 and is still going strong. It also has a French equivalent on Radio-Canada called Le combat national des livres.
My familiarity with the books and subject matter helped me understand the scope of the project and visualize the ways in which I wanted to present the collection. My vision for this curated collection was to have displayed, on one page and by each award year, the book winners and contenders along with the people who championed them. It was quite a learning process to set up this collection, and I’m so happy with the way it turned out.
Establishing the Collection
Before I even created a curated collection page for the awards, I compiled a list of all the Canada Reads books in my notes, separated by year from 2002 – 2025, noting the winners and contenders and who championed the books. Thankfully, this was an easy step as all the books are very well documented on the CBC’s website. I brought this information into Open Library by creating personal lists in my Open Library account for each award year and manually searching for and adding the books to their respective lists. With over 100 books to add, this was a tedious, time-consuming, and unsustainable process. I knew there had to be a more efficient way to complete this task, but I didn’t know how. I chipped away at it for a while and learned as I went, also taking the time to update the book metadata to ensure the accuracy of records, editions and general book information. However, I was getting overwhelmed by the process, my questions were adding up, and I couldn’t seem to find the answers I was looking for in the documentation. It was time to turn to the community for help. I was so relieved to discover the weekly Open Library Community Zoom calls on Tuesdays, where I would get to ask my questions to actual human beings – and have conversations!
Open Library Community Calls
Joining that first community call was instrumental in my Librarian-In-Training journey and for the development of my Canada Reads curated collection. I was immediately welcomed, supported, and made to feel like a part of the community. I had the opportunity to present my collection for the first time, share my vision, and explain my pain points. I was met with enthusiasm and incredible insights that eased my overwhelm and helped me move forward. I hadn’t realized how, in vocalizing my project to the community, it would help me feel so connected, supported, and invested in seeing my project through.
Every community call I have attended has provided me with the tidbits of information needed to help me develop the skills required to execute my vision. These community calls have been an invaluable resource and remain a delightful part of my week.
Among the pieces of information I received during these calls was news of the bulk editing tool, which enabled me to automatically add, from a typed list, all the Canada Reads books by year to my personal lists, saving me a lot of time and manual labour. I also learned the intricacies of editing book metadata according to industry standards. And I learned how to set up a curated collection page along with the coding required to create book carousels, which visually enhance the look of my collection pages. Each step has helped me establish my collection and display the information the way I envisioned. I now have a good understanding of the technical knowledge required to set up a collection page, and though there is still plenty to learn, I feel a lot more confident in my technical skills than I did when I started.
Leveling Up the Collection with Subject Tags
Having gained these new skills and technical knowledge, I was encouraged during a community call to take my Canada Reads collection to the next level by adding unique subject tags to each book championed during Canada Reads. This not only categorized the books into a Canada Reads subject page and increased their searchability across the Open Library catalog, it also served as the basis for the coding required to create book carousels.
Subject tags are part of a book’s metadata and are added according to the subjects found in the book as well as any subject the book might be associated with. For Canada Reads, I used three unique subject tags to categorize each championed book into a specific Canada Reads subject, as described in the table below:
Specific subject tag added to all the Canada Reads books according to the year in which they were championed
For each subject that is tagged, Open Library automatically creates a subject page that groups all the tagged books and displays them on the page. The subject page displays a book carousel of all the tagged books. In the section below the carousel, the metadata of those books is displayed by category, showcasing the publishing history, related subjects, places, people, and times. It is a great place to see the book metadata in one place and the variety of topics within a collection. With the right librarian privileges, these subject pages can be edited and spruced up.
I was granted editing permission to maintain the Collection:Canada Reads subject page and decided to set it up in the same way as I had done for my curated collection. I did so for visual consistency as well as ease of editing. By using the same format on both pages, I can copy any updates I make on one page and paste them onto the other. There are technical differences between the Collection:Canada Reads subject page and the Canada Reads Awards curated collection page, but the only visual difference is the logo, which I specifically changed to help me distinguish between the two pages when I’m working on them.
I wanted to include book carousels on my pages because they add a dynamic and interactive element that, to me, feels like a more realistic library experience than scrolling through a wall of text on a webpage. Although the coding for the book carousels confused me at first, I was guided through my confusion during a community call. In these calls, others walked me through the coding process and explained the technical setup of the subject tags. I was able to understand how the unique Collection:Canada Reads subject tags I had been adding would form the basis of the code required to create the book carousels. I used the code below, only changing the `collection:` and `title=` elements to match the collection I wanted to have displayed on the page.
From August 2025 to November 2025, I was focused on building my Canada Reads collection pages, attending the Community calls on a weekly basis, and learning everything I could to make them come to life. I took on this project knowing it would be an excellent challenge, and one I am continuing to meet with openness and curiosity.
This collection has taken on a life of its own. Although I never expected to share it with anyone, I am tremendously glad I joined that first community call to share it with the Open Library team. Since then, I have established two Canada Reads collection pages, updated the metadata on many books, written this blog post, and even got the opportunity to present my collection at the 2025 Open Library Community Celebration in November 2025!
I couldn’t have predicted what I’ve accomplished with this collection, nor the recognition I’ve received. I’m incredibly proud of the work I’ve done and so, so grateful for the Open Library team’s support, without which I wouldn’t have gotten this far.
Next Steps for the Collection
This collection is a work in progress and as of this blog’s publication date, I have completed the award years from 2016 – 2025. Over the coming months, my goal is to add the remaining books from 2002 – 2015 as well as clean up the metadata for each book. I will also add the 2026 books once Canada Reads has concluded in April 2026.
This is a manual process that can be fairly time consuming as editing book metadata varies greatly depending on the popularity of the book and the number of editions it has. Although I would love for my collection to be complete, I am making sure not to rush the process and to take the necessary time to correctly input the information and create a collection that is accurate – and one that I am proud of.
My hope for this collection is for people to enjoy it and hopefully, be inspired to read some Canadian content.
If you would like to contribute to Open Library as a librarian, you can fill out this form and join the Slack channel and the weekly community call. Other contributors will provide mentorship and help you get started.
Open Library volunteers regularly work behind the scenes to build collections and improve access to books from around the world. One of these is Nazar Kotsur, who has contributed as a volunteer librarian since 2022.
A student pursuing a bachelor’s degree in Japanese language and literature, Nazar first learned about Open Library through a language-learning group that shared a list of online resources. Drawn to its open-source mission, he became involved as a volunteer librarian.
One of his contributions is a list that began as a personal resource Nazar uses to track books that appear to be missing from the internet — books or specific editions for which he has not been able to locate a PDF or other digital file.
These were “some interesting books I found, I like or I want to read and I was compiling the list because I couldn’t find the specific editions,” Nazar says.
The list contains works spanning topics from the Ukrainian fight for independence to the country’s history and culture, as well as fiction and literature.
Many of these books have not yet been preserved digitally. While some of the books are modern and might be present in a library, others are from the 1920s or 1930s and could be difficult to find even in physical form. A few of the books on the list are public domain works, which have files that Nazar hopes to later add to Ukrainian Wikisource.
The work has become all the more urgent with the recent closure of Chtyvo, the biggest Ukrainian online library. It contained 87,000+ books, some of them not available anywhere else, including many public domain works. Nazar, who is currently collaborating with others to preserve records from that site as well, says this was a big loss for the Ukrainian humanities and for readers.
Nazar is no stranger to open source projects; today, he is also a part-time Wikisource development consultant for Wikimedia Ukraine.
Wikisource is a project of the Wikimedia Foundation that aims to build a freely accessible online library of source texts, including translations of those works in many languages.
At Wikisource, Nazar organizes proofreading contests and finds new contributors by reaching out to students and professors to help speed up preservation efforts.
“My contributions in Open Library right now mostly consist of fixing the books that were proofread on Wikisource and adding IDs so that the Read button [on Open Library] becomes available,” Nazar says.
This is important because when patrons click the Read button on Open Library, they are redirected to the online book reader with scanned PDF or DjVu files of the edition. This lets them see the pages of editions as images, with the original printed text, color, previous owners’ notes in the margins, and, often for old books, various defects. This is excellent for preservation, but it can make reading harder, especially for those with less-than-perfect vision or who are reading on a small screen.
“Open Library is a great project that has done a lot to preserve various books in digital form and make them available for reading to people around the world,” Nazar says. “Wikisource is quite similar to Open Library in that its goal is to preserve books, and a lot of the files we work on actually come from Open Library, but the way Wikisource goes about this task is different.”
Wikisource editors transcribe the pages of scanned books into digital form, preserving the original structure and formatting. That allows for a better reading experience, in which readers can configure the font and size of the text. The ability to resize text to fit the screen size is more convenient on small screens or when a book has burned-out letters, water damage, or other defects. Most important, these texts can be put into text-to-speech software so that visually impaired people can access them too.
Once a Wikisource ID is added to the edition in Open Library, it will redirect users to the text in Wikisource, where it may be more convenient to read.
Nazar continues to add Wikisource IDs to the books in the lists and many others.
If you would like to contribute to projects like this one, you can indicate your interest in volunteering as a librarian in this form and connect with Nazar in the librarians Slack channel.
Open Library is powered by a global community of volunteers, a small team of staff, and several extraordinary, handpicked volunteer fellows who work alongside staff to tackle ambitious, high-impact projects. This week, we’re featuring the work of Engineering Fellow Ben Deitch, who has made a dramatic impact on the Open Library initiative since 2024.
Ben’s numerous engineering contributions have strengthened Open Library’s experience for hundreds of thousands of patrons. With the mentorship of senior staff engineer Drini Cami, Ben wrote code that enables patrons to:
Find exact book editions from their Reading Logs
Search which books were read within any given year
Discover interesting books based on a sophisticated Reddit-style trending algorithm
Mark books as “Want to Read” from author pages
Prior to Ben’s work, reading logs would show works instead of editions. Ben added the ability to view editions on a reading log. This enables users to track the precise editions of books they have read. The log will also show the right cover for each edition.
Ben also implemented a basic “fuzzy search” for the Solr Search engine, making the overall search system much more tolerant of spelling errors and bringing it closer to modern standards for search engines so that patrons don’t hit dead ends.
In another project, Ben coded a new, image-based preview for user-created book lists, which appears on users’ My Books pages. This feature also enables patrons to see the first few books in a list at a glance.
Historically, search results and book pages have featured a “Want to Read” button that patrons can click to keep track of books of interest. Ben extended book results so patrons can also click “Want to Read” from the author’s page.
In the past, when carousels were rendered for the homepage, facets like language were accidentally dropped when additional results were loaded. Ben fixed this issue so book results stay relevant even when loading multiple pages.
Ben’s work can be found across Open Library – from the search page, to the home page, the author page, and the My Books page. We are grateful to Ben and couldn’t be more proud of his contributions to our Open Library.