Beyond bibliographic records

Our cataloging model revolves around the 'manifestation', the particular edition or version of a work that is to be added to the collection. This is also the unit of bibliographic exchange: we ship around MARC records which have data about 'manifestations'.

These are the 'inputs' into our catalogs and bibliographic systems. There is no necessary reason that they should also be the only outputs, although in practice they usually are. Searches tend to result in lists of entries for manifestations, each of which displays some subset of the data in the bibliographic record.

Recent catalogs have changed this model only slightly. Faceted browse, for example, typically allows manifestations to be brought together by some 'facet', such as subject, place of publication, or date. The facets themselves are potentially interesting ways of organizing data for presentation to a reader, but they don't tend to be used in this way.
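To make the point concrete, here is a minimal sketch of what faceted browse does: it groups manifestation entries under the values of a chosen facet. The records and field names (`subject`, `place`, `year`) are hypothetical placeholders, not a real MARC mapping or any particular catalog's API.

```python
from collections import defaultdict

# Hypothetical search results: simplified manifestation entries.
# Field names are assumptions made for illustration only.
results = [
    {"title": "Title A", "subject": "Astronomy", "place": "London", "year": 1999},
    {"title": "Title B", "subject": "Astronomy", "place": "Edinburgh", "year": 2004},
    {"title": "Title C", "subject": "Botany", "place": "London", "year": 2004},
]


def group_by_facet(records, facet):
    """Bring manifestation titles together under each value of one facet."""
    groups = defaultdict(list)
    for rec in records:
        if facet in rec:
            groups[rec[facet]].append(rec["title"])
    return dict(groups)


by_place = group_by_facet(results, "place")
by_year = group_by_facet(results, "year")
```

Note that the output is still a set of lists of manifestations. The facet value itself ('London', say) is only a label for a list; nothing is said about it as a thing in its own right.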

What might an alternative look like? WorldCat Identities shows how a person or organization might be used as an organizing principle for displaying data. Here, we pull data from many records, recombine it, and present it in an integrated way. So an Identities page has, for example, a list of books by a person and about a person; it has alternative forms of the name of that person; it has related persons (or organizations); it has a concept-based tag-cloud representing the publications by and about that person; and so on. Here is the page for Don Knuth (click on it to see the full page):

knuth.png

We have done some work on similar sorts of pages for works. It would be nice to think of this type of organization for places.

In fact, our bibliographic records contain data about lots of entities in which people have an interest, or about which they ask questions. These include works, people, places, subjects, time periods, ... However, our manifestation-record-oriented view means that we do not always mobilize these data to answer those questions.
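As a rough illustration of the shift from a record-oriented to an entity-oriented view, the sketch below pivots a handful of manifestation records into one 'page' per person, with works by them, works about them, and related persons drawn from co-authorship. It is a toy under stated assumptions: the records, the field names (`creators`, `about`), and the `build_person_pages` helper are all hypothetical, and it does not describe how any existing service is actually built.

```python
from collections import defaultdict

# Hypothetical, heavily simplified manifestation records.
# Real bibliographic records are far richer; these fields are assumptions.
records = [
    {"title": "Title A", "creators": ["Author, Alice"], "about": []},
    {"title": "Title B", "creators": ["Author, Alice", "Writer, Bob"], "about": []},
    {"title": "Title C", "creators": ["Writer, Bob"], "about": ["Author, Alice"]},
]


def build_person_pages(records):
    """Recombine data from many records into one entry per person."""
    pages = defaultdict(
        lambda: {"works_by": [], "works_about": [], "related": set()}
    )
    for rec in records:
        creators = rec.get("creators", [])
        for name in creators:
            pages[name]["works_by"].append(rec["title"])
            # Treat co-creators on the same record as related persons.
            pages[name]["related"].update(c for c in creators if c != name)
        for name in rec.get("about", []):
            pages[name]["works_about"].append(rec["title"])
    return dict(pages)


pages = build_person_pages(records)
alice = pages["Author, Alice"]
```

The same pivot could in principle be keyed on a place, a subject, or a time period instead of a person; what changes is which field supplies the key and which data are gathered around it.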

Of course, we do also manage other data as 'records': name and subject authorities for example. But these are not used extensively as structured data, and are often collapsed to strings in the bibliographic records. Other futures for this type of data are interesting to consider, but not here.

Now, I was prompted to write this by an interesting post by John Mark Ockerbloom, who talks about 'concept-oriented' catalogs. (Note: by 'concept' John means 'thing' or 'entity' or 'object'. In some ways, 'concept' is confusing here because it might be thought that he means a 'subject' in library terms, something about which we have a lot of coded data. That said, we don't have agreed words for all that we want to talk about.)

As more and more knowledge resources become available to users, via the expansion of the Internet, the streamlining of interlibrary loan services, and the mass digitization of print library materials, well-defined, well-documented, and well-connected concepts will become increasingly important for readers that want to find what is most useful to them in a sea of information. While we will never have well-defined concepts for everything readers might be interested in, the concepts that have been defined by someone, somewhere, can serve as valuable guideposts for subsequent information seekers, if we're smart about managing and using them. [Understanding concept-oriented catalogs]

John notes as examples WorldCat Identities and FictionFinder (an earlier prototype designed to show how data could be mobilized around works; I always liked the way this allowed you to search for 'settings', so that you can search for detective novels set in Edinburgh, for example). He also notes the Subject Maps work he is involved in at the University of Pennsylvania.

Libraries have managed bibliographic records, containers of data. The directions above point to a growing interest in seeing how they might more actively manage the data itself, making it work harder to provide information about entities of interest to their readers.

Comments: 1

Dec 09, 2009
Stephen Hearn

There are other possible "organizing principles" in bib record data that could be mined for interesting collective representations besides traditional access point entities like persons, corporate bodies, etc. An aggregation of data from records for resources in a particular language or published in a particular year or place or by a particular publisher could also help profile those concepts in ways that are hard to discern just by browsing a list of manifestation records.