Help:Using Worldcat data

This page is a help or manual page for the ISFDB database. It describes standards or methods for entering or maintaining data in the ISFDB database, or otherwise working with the database. Other help pages may be found via the category below. To discuss what should go on this page, use the talk page.

If, after exploring the Help system, you still have a question, please visit the Help desk and let us know. We probably know the answer, but we need your help to know what we left out of the help pages.

If you are new to editing the ISFDB, please see Help:Getting Started.


WorldCat is a collection of over 400 million searchable bibliographic records maintained by OCLC (Online Computer Library Center), an inter-library cooperative. It covers books, serials, music, films and some other types of holdings. WorldCat is the largest union catalog currently in existence. The LC Online Catalog includes millions of bibliographic records created over many decades and according to a variety of cataloging policies.

  • Several interfaces to this data are available:
    • Public WorldCat interface is a Web interface available for free since August 2006. Use the {{OCLC}} template when referring to this source in ISFDB note fields and within the ISFDB Wiki.
    • OCLC Fiction Finder provides access to all fiction cataloged by OCLC, including reduced-size cover scans when available.
    • WorldCat genres lets users browse WorldCat data by genre, e.g. science fiction or cyberpunk.
    • The xISBN Web service allows developers to submit an ISBN and retrieve a list of related ISBNs along with their basic bibliographic data.
    • OCLC FirstSearch is another Web interface limited to subscribers.
  • When using data from WorldCat to populate ISFDB publication fields, a number of precautions must be observed:
    • Publisher inaccuracies:
      • WorldCat often appears to shorten publisher names, so the reported publisher name may not match the form on the actual book.
      • Publisher imprints are often not noted at all, and publishers are in some cases so abbreviated as to be ambiguous. For example, "London: Lane" could be either "John Lane" or "Allen Lane".
      • Publication locations (listed with publisher name) may be accurate only to the level of the country, if that. For example, any publisher in England and even some headquartered in Scotland may be listed as "London".
      • Therefore, use WorldCat only to confirm publisher data from other sources, unless no other source is available.
    • Author inaccuracies and ambiguities:
      • When a book is published under a pseudonym known to the WorldCat system (or perhaps to the entering librarian), the "author" field lists the primary author name known to WorldCat rather than the pseudonym. If the "responsibility" field is present in the record, it contains the name as it appears in the book.
      • Often an illustrator is listed as a co-author. Again, check the "responsibility" field, where this is usually spelled out.
      • When an illustrator is listed, there is often no way to tell whether the credit is for interior art, cover art, or both.
      • Anthologies/collections which have contents level data often use abbreviated author names like "R. Heinlein" or "R. A. Heinlein"; there is no easy way of telling how the authors were actually credited. Punctuation, capitalization and even spelling are also often mangled.
    • Date issues:
      • Dates are rarely given more precisely than the year; month and day are usually not noted.
      • Often multiple dates are listed, with the earlier being a copyright date, or the date of an earlier edition, particularly if the later edition is a facsimile.
    • Publication formats must be determined from size in centimeters (cm). Paperbacks are often noted as such, but this includes both "tp" and "pb" in ISFDB terms, so size must be checked.
      • "18cm" is the size of the standard US/Canadian mass market paperback.
      • "19cm" is probably a small tp/hc.
      • 20+ cm is either a tp or a hc, but distinguishing between tps and hcs can be tricky. Sometimes WorldCat will print "(pbk.)" next to the ISBN, which is self-explanatory, but if it doesn't, then it's time to check other sources. Price is not always a reliable indicator, but it can help.
      • If the book was published pre-WWII, it's usually either a hardback or, in some cases, a pamphlet since paperbacks didn't take off until WWII -- first in the UK, then in the US. There were some cheap editions in the early part of the 20th century whose binding sometimes approached the current paperback binding, but few have survived and fewer are owned by libraries. If the catalog information is ambiguous, then enter "unknown" in the format field and record the volume size in the Notes field.
    • Anthologies and collections may have contents listed, but care must be taken when using this information to record titles in the ISFDB. Quite often initial articles will be omitted from the title, e.g. "The House on Blackmore Street" might be listed as "House on Blackmore Street". Don't assume the presence or absence of "The", "A" or "An" without using several secondary sources as a backup.
    • Older works, printed in multiple volumes, often give no page counts but only volume counts.
    • Some editions have multiple records in WorldCat and you have to pull them up side by side to see if there is additional information to be derived from the interplay of different fields. Often, but not always, there is one single "best" record for a given edition/printing. When there is not, multiple OCLC record numbers may need to be recorded as sources.
    • Prices are usually reflective of whatever Baker and Taylor charged the last time the book was available to libraries, which may be significantly higher than the original list price of the book. Also, library editions are sometimes priced differently. Unless the book is fairly recent (and is in the current Baker and Taylor catalog) no price is likely to be listed at all.
    • Some authors are much better covered in WorldCat because the cataloging librarian had access to a specialized collection. For example, Andre Norton's books are exceptionally well covered although the data is not easy to find.
    • WorldCat rarely lists printing numbers and its subject headings leave much to be desired, but you can always trawl other online catalogs for this data (which opens a whole different can of worms). WorldCat does, however, usually indicate a first edition as such.
    • Data listed in brackets such as "[First Ed.]" is by convention inferred rather than stated in the publication. Other data will normally be stated in the publication, but may have been abbreviated.
  • When any data in an ISFDB record is derived from an OCLC/WorldCat record, this should be stated explicitly in the Note field.
    • When an ISFDB publication record corresponds to one or more OCLC/WorldCat records, enter the OCLC record number(s) in the External IDs field.
    • To refer to an OCLC record in the notes, use the {{OCLC}} template, e.g. "Page count from {{OCLC|12345}}, publisher from {{OCLC|67890}}".
  • When a publication is being verified against a WorldCat record, the WorldCat/OCLC number must be entered in the External IDs field.
  • Sometimes a single record will list multiple ISBNs. This generally means that multiple states of the publication were cataloged together, such as hardcover, trade paperback, mass-market paperback, and/or library binding. This appears to be done only when the different states have the same year of publication and page count. It seems particularly likely for a library binding version based on an otherwise identical (except for binding and ISBN) state.
  • Since WorldCat records are derived from reports of individual libraries, if two or more member libraries catalog a publication differently (different forms of the publisher or author name, for example) there may be multiple records for what is actually a single publication. Try to avoid creating multiple ISFDB records in such cases.
  • WorldCat often lists the LCCN, i.e. the Library of Congress Control Number (formerly Catalog Number). This is generally worth recording in the External IDs field of the corresponding ISFDB publication record. If you need to link to a Library of Congress record in notes, use the "LCCN" template, e.g. {{LCCN|61-11702}} or {{LCCN|2000123456}}. Note that for books published before 2000, the LCCN is generally shown as a two-digit year, a hyphen, and a serial number of up to 6 digits. For books published in 2000 and later the year is given as 4 digits and the hyphen is often omitted.
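
The size rules of thumb in the list above (18cm for a mass market paperback, 19cm and up for a tp or hc, "(pbk.)" as a paperback marker) can be sketched as a small helper. This is only an illustration: the function names and the sample description strings are hypothetical, it ignores the pre-WWII caveat, and an "unknown" result means other sources need to be checked.

```python
import re
from typing import Optional


def extract_height_cm(physical_description: str) -> Optional[int]:
    """Pull the height in cm out of a description such as '254 p. ; 18 cm'."""
    match = re.search(r"(\d+)\s*cm", physical_description)
    return int(match.group(1)) if match else None


def guess_format(height_cm: Optional[int], marked_pbk: bool = False) -> str:
    """Return 'pb', 'tp' or 'unknown' following the rules of thumb above.

    marked_pbk is True when the record shows '(pbk.)' next to the ISBN.
    This is a heuristic, not an ISFDB rule: treat the result as a hint.
    """
    if height_cm is None:
        return "unknown"
    if height_cm == 18:
        # Standard US/Canadian mass market paperback size.
        return "pb"
    if height_cm >= 19 and marked_pbk:
        # A paperback larger than mass market size is a tp in ISFDB terms.
        return "tp"
    # 19cm and up without a paperback marker could be tp or hc;
    # sizes below 18cm are not covered by the rules of thumb.
    return "unknown"
```

When the result is "unknown", the advice above still applies: enter "unknown" in the format field and record the volume size in the Notes field.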
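
The two LCCN shapes described above can be checked before a value is pasted into a note. The following is a minimal sketch that assumes only those two shapes: a two-digit year, a hyphen and up to 6 serial digits, or a four-digit year starting with "20", an optional hyphen and up to 6 serial digits. The function names are hypothetical; only the {{LCCN}} template syntax comes from this page.

```python
import re
from typing import Optional, Tuple

# Pre-2000 form: two-digit year, hyphen, serial of up to 6 digits.
OLD_LCCN = re.compile(r"(\d{2})-(\d{1,6})")
# 2000 and later: four-digit year, hyphen often omitted, serial of up to 6 digits.
NEW_LCCN = re.compile(r"(20\d{2})-?(\d{1,6})")


def parse_lccn(raw: str) -> Optional[Tuple[str, str]]:
    """Return (year, serial) if raw matches one of the two shapes, else None."""
    text = raw.strip()
    for pattern in (OLD_LCCN, NEW_LCCN):
        match = pattern.fullmatch(text)
        if match:
            return match.group(1), match.group(2)
    return None


def lccn_note_link(raw: str) -> str:
    """Wrap a recognised LCCN in the LCCN template for use in a note."""
    if parse_lccn(raw) is None:
        raise ValueError(f"Unrecognised LCCN form: {raw!r}")
    return "{{LCCN|" + raw.strip() + "}}"
```

A value that matches neither shape is rejected rather than guessed at, since WorldCat may display LCCNs in other forms that this sketch does not cover.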

How to Tell When a Book Cataloged by WorldCat Never Appeared

WorldCat data is only as good as the data that OCLC receives from individual libraries. If a library enters an announced book into its catalog and the data is transferred to WorldCat, then this record may remain in WorldCat indefinitely even if the book is never published.

There are several ways of telling when an OCLC record is bogus or "vaporware". First, unlike Amazon, libraries rarely enter the page count or the book size until they receive at least one copy, so an OCLC record with an empty page count and no size designation is likely to be vaporware. Second, if OCLC reports that a book supposedly published by a major publisher is held by only 1-10 libraries, something is probably wrong. Third, if you check OCLC's list of libraries that supposedly owned a copy at some point and none (or almost none) of them are hyperlinked (i.e. they do not report a current copy on file), that is a big red flag as well. Finally, WorldCat may list one or two purchasing services, such as Baker & Taylor, as "libraries" holding a publication. These should be disregarded, and if no other libraries are listed, that is again a red flag that the book was never actually published.
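
The red flags above can be collected into a simple checklist. The sketch below assumes the editor has copied the relevant details from a WorldCat record by hand. The class and field names are hypothetical, and the 0.9 threshold for "almost all" is an arbitrary assumption, not an ISFDB standard.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Holding:
    name: str
    hyperlinked: bool                 # True if a current copy is reported
    purchasing_service: bool = False  # e.g. Baker & Taylor; judged by hand


@dataclass
class RecordSummary:
    page_count: Optional[str]
    size: Optional[str]
    major_publisher: bool             # the editor's own judgment
    holdings: List[Holding] = field(default_factory=list)


def vaporware_flags(rec: RecordSummary, almost_all: float = 0.9) -> List[str]:
    """List the red flags described above that apply to this record."""
    flags = []
    if not rec.page_count and not rec.size:
        flags.append("no page count and no size")
    # Purchasing services are disregarded when counting holding libraries.
    libraries = [h for h in rec.holdings if not h.purchasing_service]
    if not libraries:
        flags.append("no real holding libraries listed")
        return flags
    if rec.major_publisher and len(libraries) <= 10:
        flags.append("major publisher but 10 or fewer holding libraries")
    unlinked = sum(1 for h in libraries if not h.hyperlinked)
    if unlinked / len(libraries) >= almost_all:
        flags.append("holding libraries are (almost) all unlinked")
    return flags
```

An empty list does not prove that the book exists, and a non-empty one does not prove that it never appeared; the flags only indicate that other sources should be checked before the publication is entered.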

See also WorldCat and OCLC