Working around works

There is a significant, if little-read, literature of cataloging theory. A recurrent theme is the balance between gathering like items and discriminating between them. Managing similarity and difference in this way, and making sensible user interface choices, is not straightforward.

The FRBR model represents a recent approach to a part of this question: how to gather things that are in some way instances of the same intellectual work (a discretionary decision at the edges*), and how to distinguish sensibly between these things (critical editions, for example, or translations).

We now have a variety of online systems which present data about 'book-like objects'. They have to make user experience decisions about how best to represent works and their instances.

Interestingly, Goodreads** and LibraryThing seem to default to a work-based view: the entry is at the work level. See for example Dr Zhivago and Dr Zhivago.

Amazon seems to default to a particular 'manifestation' or 'expression' (to use FRBR terms) as in Dr Zhivago. Google Books seems to do something similar - Dr Zhivago. In each case, though, there is an attempt to link to other editions etc from a particular page.

Worldcat.org is more like Amazon and Google. At the moment, it aims to show the most highly held member of a work set in a result, and then link to other editions from that, as in Dr Zhivago.
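
The selection rule described here can be sketched in a few lines of Python. This is an illustration only, not Worldcat's actual implementation: the Edition record, its holdings field, and the function name pick_representative are assumptions made for the example, and the sample figures are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edition:
    """Hypothetical manifestation-level record; field names are assumptions."""
    title: str
    publisher: str
    holdings: int  # number of libraries holding this edition (invented below)


def pick_representative(work_set):
    """Return (representative, other_editions) for one work set.

    The representative is the edition with the most holdings. The remaining
    editions are returned so a display can link to them as "other editions".
    """
    if not work_set:
        raise ValueError("work set is empty")
    representative = max(work_set, key=lambda e: e.holdings)
    others = [e for e in work_set if e is not representative]
    return representative, others


# Made-up sample data, for illustration only.
zhivago = [
    Edition("Doctor Zhivago", "Publisher A", 2100),
    Edition("Doctor Zhivago", "Publisher B", 640),
    Edition("Doktor Zhivago", "Publisher C", 85),
]

shown, linked = pick_representative(zhivago)
print(shown.publisher, [e.publisher for e in linked])
```

A work-first display of the LibraryThing kind would use the same work set but show a page for the set itself, rather than promoting one member of it.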

There are reasons for taking these various approaches and each service makes decisions based on what it is trying to do, and the view it takes of its user interests.

Now, although the idea of a work has been a library concept for some time, the 'manifestation' level has tended to dominate practice, and this has carried forward into online catalogs. The catalog card, and the MARC record, are typically at the manifestation level and systems have been very 'record-based'. They have not done a lot to re-combine the manifestation-level data to present types of user experience other than a record-by-record results display. The main exception to this may be an author-based approach. But typically, local systems have not sought to cluster data into works, or into displays about places, subjects, and the other things about which they have data.
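
As a rough illustration of what re-combining manifestation-level data into works might involve, here is a minimal Python sketch that groups records by a normalized author and title key. The record fields (author, uniform_title, language), the function names, and the normalization rule are all assumptions made for the example; real work-clustering has to cope with far messier data than this.

```python
import re
import unicodedata
from collections import defaultdict


def normalize(text):
    """Strip accents, lowercase, and collapse punctuation and whitespace."""
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def work_key(record):
    """Crude clustering key: normalized author plus normalized title."""
    return (normalize(record["author"]), normalize(record["uniform_title"]))


def cluster_into_works(records):
    """Group manifestation-level records into work sets by key."""
    works = defaultdict(list)
    for record in records:
        works[work_key(record)].append(record)
    return dict(works)


# Invented sample records; only the headings matter for the illustration.
records = [
    {"author": "Pasternak, Boris", "uniform_title": "Doktor Zhivago", "language": "Russian"},
    {"author": "Pasternak, Boris.", "uniform_title": "Doktor Zhivago.", "language": "English"},
    {"author": "Pasternak, Boris", "uniform_title": "Another Work", "language": "English"},
]

works = cluster_into_works(records)
for key, members in works.items():
    print(key, len(members))
```

The same grouping step, keyed on a place or subject field instead, would give the other kinds of display mentioned above.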

Worldcat Identities is one response to this. It aims to pull data from multiple records to create a page which records what we know about a particular identity, Boris Pasternak, for example. Goodreads or LibraryThing will also give you a Boris Pasternak page based on what they know about him. (The National Library of Australia's People Australia is also interesting in this regard. See for example, Patrick White.)

More recently, OCLC Research has been experimenting to see what data is available for display in a consolidated way at the work level. See a sample set of pages here, and some background detail here [pdf].

* A note on 'discretionary'. We cluster stuff based on aggregate cataloger choices. I like Tim Spalding's characterization of the 'cocktail party test' in a blog entry about works and LibraryThing.
** See the comment by Alex Thurman on Goodreads, suggesting it is more like Amazon.

Comments: 3

Aug 04, 2009
Jonathan Rochkind

Regarding 'discretionary', I think this is exactly right. It's important to note that the 'work set' is a subjective and contextual choice, not some objective piece of data waiting to be discovered. But that doesn't mean it's useless. It's very important because (in Western culture at least?) the concept of 'work' exists, and is of value to users.

The FRBR report says: "The concept of what constitutes a work and where the line of demarcation lies between one work and another may in fact be viewed differently from one culture to another." Quite right, but oh well, it's the world we live in.

So traditional cataloging tries to make these work distinctions (to the extent that they are _implied_ in AACR2 choices like 'uniform title') by setting out precise instructions meant to result in choices that match that cultural determination of 'work'.

LibraryThing tries to do it instead by just relying on members of that culture using their intuition, and averaging out everyone's choices and relying on them to reach consensus through discussion.

Neither is more 'correct', and neither is more 'FRBR', just two different approaches to trying to create a collective decision about work sets that is useful to users. Both are discretionary and subjective.

A bit more on this at: http://bibwild.wordpress.com/2009/08/04/maps-territories-and-discretion/

I'd be interested if Worldcat is considering trying to make a 'work' view the default 'landing page' from a search, a bit more like LibraryThing. I suspect this would actually be of more general use than the library legacy practice of always showing individual manifestations as search 'landing pages'.

Lorcan says: "There are reasons for taking these various approaches and each service make decisions based on what it is trying to do, and the view it takes of its user interests." Certainly true as far as it goes -- but I've never seen a clear, written-out analysis of what the reasons for the traditional library manifestation-centered display are, what they are trying to accomplish, what user interests we believe they are meeting.

I suspect that in fact this choice isn't based on any clearly thought-out attempt to meet certain user interests -- but instead is made just because we've always done it that way. Because in the card catalog world it was impractical to do otherwise. And in the online world, it takes a bit more work to do otherwise. Not because doing it this way is necessarily optimal for meeting identified user interests.

Aug 11, 2009
Alex Thurman

As a Goodreads user (and sporadic volunteer "Goodreads librarian") I wish it defaulted to a work-based view, but it really doesn't. The Dr. Zhivago page Lorcan links to is for a single manifestation, with a link to "other editions". There is a work-level field (first published date) that is frequently blank or incorrect, and some effort to consolidate ratings for different editions into a single rating for the work, but on balance the Goodreads default is closer to the Amazon model than to LibraryThing.

Aug 13, 2009
Stephen Hearn

Traditional catalogs contain multiple representations of bibliographic entities. Work and expression level representations tend to occur in indexes, e.g.,

3 Pasternak, ... Dr. Zhivago.
7 Pasternak, ... Dr. Zhivago. English.

Each of these index lines is a brief bibliographic representation, derived from data in the manifestation level records (indicated by the hit counts). The fact that records tend to be at the manifestation level does not mean that only manifestation-level access is provided in traditional catalogs. Indexes need to be figured in as "landing pages" too.
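
To make the derivation concrete, here is a minimal Python sketch of how index lines with hit counts could be built from manifestation-level records. The record fields (author, uniform_title, language) and the function names are assumptions made for the example, not any particular catalog's schema, and the sample records are invented to reproduce the counts above.

```python
from collections import Counter


def heading(record):
    """Format one index heading from a record (hypothetical field names)."""
    parts = [record["author"], record["uniform_title"]]
    if record.get("language"):
        # A language qualifier turns a work-level heading into an expression-level one.
        parts.append(record["language"])
    return " ".join(part.rstrip(".") + "." for part in parts)


def build_index(records):
    """Return sorted (heading, hit_count) pairs derived from the records."""
    counts = Counter(heading(record) for record in records)
    return sorted(counts.items())


# Invented records: 3 with no language qualifier, 7 qualified as English.
original = {"author": "Pasternak, Boris", "uniform_title": "Dr. Zhivago"}
english = {**original, "language": "English"}
records = [original] * 3 + [english] * 7

index = build_index(records)
for line, hits in index:
    print(hits, line)
```

Each printed line is a heading plus the number of manifestation records behind it, which is the sense in which an index line can itself act as a landing page.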