The papers from the recent NISO OpenURL and Metasearch meeting provide a very useful roundup of the state-of-the-art in the declared subject area. But they also touch on other topics facing libraries as they construct distributed information environments. Some things that lodged with me as I skimmed the PowerPoint slides:

  • One stop shop vs an ecosystem of services. The most interesting presentations to me were those about institutional deployment, in particular those at Rochester [ppt by David Lindahl and Jeff Suszczynski] and California State University, San Marcos [ppt by David Walker]. In each case, metasearch is seen as a platform which makes databases available for search to other applications. The aim is to provide a service which fits into patterns of user behavior and abstracts away from the boundaries of database providers. The focus was on putting data where it was useful, not on putting the user in front of a 'one-stop-shop', which is how metasearch often seems to be presented. Both presentations also usefully see metasearch as only one part of a wider system of services which discover, locate, request and deliver resources of interest.
  • Registries. There are presentations of the UK Information Environment Service Registry [ppt by Anne Apps], registry activity within the NSDL project Ockham [ppt by Martin Halbert], and OCLC's registry of OpenURL resolvers [ppt by Phil Norman]. This type of application is becoming more important. Each is exploring how to develop systemwide 'intelligence' about entities required to run distributed applications. As we move further towards distributed services, this type of 'intelligence' becomes more important, and we need good ways of creating, propagating, and sustaining the 'intelligence', or metadata, about these entities. What entities? Well, in the cases above, we are talking about being able to locate and use OpenURL resolvers (a particular type of network service), collections, network services, and agents (typically organisations such as publishers, resource operators, suppliers, etc). This data is needed to populate local knowledge bases, and for other purposes, and will increasingly be available in directory or registry services.
  • Describing collections and services. One of the strands of the metasearch initiative focuses on collection and service description. There has been a significant focus on collection description in recent years, though there is no widescale, consistent deployment of services which create, share or use such descriptions. Of course, archives have long been used to thinking about description at the collection level, but historically it has not been a part of general library practice. 'Services' in the sense used here means the services which make collections available: the network functionality through which they are accessed. Metasearch applications variably use collection and service descriptions and there is a view in some quarters that the burden of creating this data should be shared, in the way that the burden of cataloging is shared through union catalog organizations. The jury is out on this one. Collection level descriptions were mentioned in the recent Open Content Alliance announcement and have been highlighted, to take one example, in the TEL service, a metasearch service across European national library catalogs.
  • Whither metasearch? Despite the level of activity represented here, or maybe because of it, metasearch still seems like an interim approach to me, for business, social and technical reasons. I have discussed this many times in these pages (see links below). Metasearch is a high cost activity; the incentives of data providers, metasearch application providers, and library users may not always be aligned; and it is difficult to build value-added services on top of this lowest-common-denominator, federated resource. This last point is especially important, as the demands on our services grow and as we want to manipulate and mine data to create value. These factors point to the potential value of consolidating data in a smaller number of disciplinary or genre verticals, where one can focus on search quality and adding value. I was interested recently to come across a presentation by Mark Krellenstein, CTO of Elsevier, at an NFAIS meeting on metasearch which discusses these issues along similar lines [ppt].
  • Acronymic flyby. In the collection description area, the metasearch group is proposing a collection description schema which is an adaptation of the Dublin Core Collection Description Application Profile, which in turn is an adaptation of the RSLP Collection Description schema. In the search area, the metasearch group is proposing MXG [see ppt by my colleague Ralph Levan], which is an adaptation of SRU/SRW, which is an adaptation of Z39.50. Each of these cases is an example of a sorry profusion of acronyms which helps no-one. Raymond Yee suggests that we need fewer APIs and fewer metadata specs, and he is right.
  • xISBN. It was nice to see that a couple of presentations mentioned their use of xISBN, our web service that accepts an ISBN and returns ISBNs in the same work set. See the Rochester presentation [ppt] mentioned above and Ross Singer's discussion of his work with OpenURLs [ppt].
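
To make the xISBN idea concrete, here is a minimal sketch of a work-set lookup: given one ISBN, return every ISBN grouped into the same work. It is an illustration only, not the xISBN service or its API. The function names (`normalize_isbn`, `build_index`, `related_isbns`) are hypothetical, the grouping is held in memory rather than fetched over the network, and the ISBN strings are made-up placeholders rather than real identifiers.

```python
from typing import Dict, FrozenSet, List


def normalize_isbn(isbn: str) -> str:
    """Drop hyphens and spaces, and upper-case a trailing 'x' check digit."""
    return isbn.replace("-", "").replace(" ", "").upper()


def build_index(work_sets: List[List[str]]) -> Dict[str, FrozenSet[str]]:
    """Map every ISBN to the full set of ISBNs in its work set.

    Assumption: each inner list holds the ISBNs of one work
    (different editions, bindings, and so on).
    """
    index: Dict[str, FrozenSet[str]] = {}
    for work_set in work_sets:
        members = frozenset(normalize_isbn(i) for i in work_set)
        for isbn in members:
            index[isbn] = members
    return index


def related_isbns(index: Dict[str, FrozenSet[str]], isbn: str) -> List[str]:
    """Return the sorted ISBNs in the same work set as `isbn`.

    An ISBN that is not in the index is returned on its own, so a caller
    can always fall back to searching on the original identifier.
    """
    key = normalize_isbn(isbn)
    return sorted(index.get(key, frozenset({key})))


# Placeholder data: two hypothetical works; the ISBNs are NOT real.
INDEX = build_index([
    ["1111111111", "2-222-22222-2", "333333333x"],
    ["4444444444"],
])
```

A resolver or metasearch application could call `related_isbns(INDEX, "2222222222")` and then search a catalog on each returned ISBN, so that a user holding a citation for one edition is shown holdings for any edition of the work.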

Taken with the discussion of authentication [ppt by my colleague Mike Teets], many of these discussions clearly evoke an environment of services, whether within a library or increasingly within a distributed group of libraries, providers and users. Such environments need directories or registries, and good ways of streamlining communication between parts. These presentations give a snapshot of much of the relevant work going on in the library community now. We have much more to do ...


Related columns and posts:

Note: image from David Walker's presentation [ppt].

Another note: see here for a report of another presentation of the CalState, San Marcos work.

Comments: 1

Aug 30, 2006
Multi Search

Some search engines use metadata strategies or in other words--they are metasearch engines. However, many times this provides a slower way to search, as it will have to request multiple engines in order to just show a few results.