ISBN publisher codes

By Edward Betts

There can be more than one way to say the same thing: for example, gramophone record, phonograph record and vinyl record. When libraries write catalog records, they pick one of these terms and stick to it; they use what is known as a ‘controlled vocabulary’. This makes it easier to browse library catalogs.

Traditionally it has been thought that patrons want to browse by author and subject headings, so these fields have been controlled. The data in these fields can be used in other ways: Ross Singer has been experimenting with geographic subject headings.

Publisher is an uncontrolled field. Penguin and Penguin Books are the same publisher, but the name has been entered in catalog records differently, making it difficult to browse by publisher.

A workaround is to use the ISBN field in the catalog record. Almost every book published since 1970 has an ISBN. The ISBN of an English-language book starts with a 0 or 1, followed by a variable-length publisher code, an item number and finally a checksum digit.

For example: 0-14-043531-X
0 = English language
14 = Publisher code
043531 = Item number
X = checksum
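
The breakdown above can be sketched in Python. This is a minimal illustration, not the code used for the list: it assumes the ISBN is already hyphenated (an unhyphenated ISBN cannot be split without a table of publisher code ranges), and the function names `split_isbn10` and `isbn10_check_digit` are made up for this example. The check digit follows the standard ISBN-10 rule: weight the first nine digits 10 down to 2, and choose the last digit so the total is divisible by 11, writing 10 as X.

```python
def isbn10_check_digit(first_nine: str) -> str:
    """Compute the ISBN-10 check digit for the first nine digits."""
    # Weights run from 10 (first digit) down to 2 (ninth digit).
    total = sum(int(d) * w for d, w in zip(first_nine, range(10, 1, -1)))
    value = (11 - total % 11) % 11
    return "X" if value == 10 else str(value)


def split_isbn10(isbn: str) -> dict:
    """Split a hyphenated ISBN-10 into its four parts and verify the checksum."""
    parts = isbn.strip().split("-")
    if len(parts) != 4:
        raise ValueError("expected a hyphenated ISBN-10 with four parts")
    group, publisher, item, check = parts
    digits = group + publisher + item
    if len(digits) != 9 or not digits.isdigit():
        raise ValueError("expected nine digits before the check digit")
    return {
        "group": group,
        "publisher": publisher,
        "item": item,
        "check": check.upper(),
        "valid": isbn10_check_digit(digits) == check.upper(),
    }


print(split_isbn10("0-14-043531-X"))
```

For the example ISBN, the weighted sum of the first nine digits is 111, which leaves a remainder of 1 when divided by 11, so the check digit is 10, written as X.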

We are able to build a list of ISBN publisher codes by picking the most popular publisher name, as it appears in library records, for each code. Using the ISBN, we can start the process of making publisher a controlled field.
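
As a rough sketch of the "most popular name per code" idea, the following Python counts how often each publisher name appears under each code and keeps the most frequent one. The function name `most_common_publisher_names` and the sample records are hypothetical, and the input is assumed to be pairs of (publisher code, publisher name) already extracted from catalog records.

```python
from collections import Counter, defaultdict


def most_common_publisher_names(records):
    """Map each publisher code to the name seen most often for it.

    `records` is an iterable of (publisher_code, publisher_name) pairs.
    """
    counts = defaultdict(Counter)
    for code, name in records:
        counts[code][name.strip()] += 1
    # most_common(1) returns a one-item list of (name, count).
    return {code: names.most_common(1)[0][0] for code, names in counts.items()}


# Hypothetical sample data, for illustration only.
sample = [
    ("0-14", "Penguin"),
    ("0-14", "Penguin Books"),
    ("0-14", "Penguin"),
    ("0-19", "Oxford University Press"),
]
print(most_common_publisher_names(sample))
```

Picking the single most frequent name is a simple choice; it ignores spelling variants and imprints that share a code, which would need extra handling.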

The results:

5 Responses to “ISBN publisher codes”

  1. Javantea says:

    I noticed that openlibrary.org has been giving a 503/500 since 1:30 PM PDT today (7/20). Since there isn’t any other contact info, I thought I’d let you know via this blog.

  2. Karen Coyle says:

    I find it interesting to look at the list of publishers in order by frequency. The top ones are:

    0-19 Oxford University Press 251368
    0-16 U.S. G.P.O. 245442
    0-521 Cambridge University Press 175340
    0-415 Routledge 117731
    0-13 Prentice-Hall 116635
    0-471 Wiley 112967
    0-06 Harper & Row 109797
    0-07 McGraw-Hill 98202
    0-312 St. Martin’s Press 91149
    0-02 Macmillan 75461

    And it’s interesting how many of the top ones are university presses.

    Also, it looks like there are either typo’d ISBNs in the data (I’m sure there are some) or some publishers have shared ISBNs. For example, I find MacGibbon & Kee under a couple of different numbers assigned to others, as well as their own. Anyway, fascinating data here.

  3. Wolf D. LANG says:

    You did a fascinating job here.
    Could you also build a publisher list for the language group 3 (German)?
    regards

  4. When I worked in a bookshop, we had a book that listed all the different ISBN publisher codes – came in handy on occasion.

    How are you managing “imprints” of publishers?

  5. Karen Coyle says:

    “Imprint” can mean a variety of things, from a particular publisher’s series, like “Vintage Classics” to once independent publishing houses that have been purchased by a larger publishing company (increasingly common).

    What we have to work with, however, is simply the publisher name that we receive in the metadata, and that is generally the name that appeared on the title page of the book. There is nothing to link that name to an actual corporate entity (either the one owning the imprint at the time, or the one that may own it today). I think that our data is closer to “imprint” than it is to “publisher”, and that bringing the two together will require some external data. The ISBN prefixes may provide some help in that area. For example, code 0-02, which is generally listed as belonging to Macmillan, has appeared in metadata with these listed in the publisher field:

    Macmillian
    Collier Macmillan
    Free Press
    Maxwell Macmillan International
    Maxwell Macmillan Canada
    Collier Books
    Macmillian Reference USA
    Collier-Macmillan

    If these are what you mean by “imprint”, then imprint is generally what we have in the data.