Simpler search

One observation made over time about library standards, and I am thinking mainly of protocols, is that we have overdesigned them: we end up with solutions that are complex enough to cater for every last case, but not simple enough to be widely adopted. For this reason, they have remained within a library niche and, in some cases, have not been widely adopted even within that niche.

A version of this discussion emerged recently in relation to Z39.50 and the related, simpler SRU/W approaches, which reimplement Z39.50 semantics in a web services idiom.

The NISO Metasearch Initiative has been working with the SRU/W developers and has produced a proposal for the NISO Metasearch XML Gateway. As I understand it, the aim is to further simplify the interaction between a metasearch application and a data provider by removing the need for query grammar interoperability, while retaining the ability to return structured data, which facilitates manipulation and presentation. In this way, the cost of implementation is lowered at the expense of interoperability across multiple data providers: each data provider expects a query in its own query grammar.
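To make the trade-off concrete, here is a rough Python sketch of how a metasearch client might build such a request. The parameter names (operation, version, query, maximumRecords, recordSchema) are those of an SRU 1.1 searchRetrieve request. The base URL, the function name, the "mods" schema name and the sample native query are invented for illustration and are not taken from the Implementors' Guide.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_search_url(base_url, query, max_records=10, record_schema=None):
    """Build an SRU-style searchRetrieve URL.

    The query string is passed through untouched, so it can hold CQL or,
    in the gateway scenario described above, a provider's native query.
    """
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": query,
        "maximumRecords": str(max_records),
    }
    if record_schema is not None:
        # Schema short names are defined by each server; "mods" below is an assumption.
        params["recordSchema"] = record_schema
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint, used only for illustration.
BASE = "http://example.org/sru"

# A query in a standard grammar (CQL), portable across providers.
cql_url = build_search_url(BASE, 'dc.title = "metasearch"', record_schema="mods")

# A query in one provider's own (made-up) grammar: the request looks the same,
# but the client has to know this particular provider's syntax.
native_url = build_search_url(BASE, "ti:metasearch AND au:levan")

if __name__ == "__main__":
    print(cql_url)
    print(native_url)
```

The only difference between the two requests is the content of the query parameter, which is where the interoperability cost falls: the response can be equally well structured in both cases, but the client must tailor the query to each provider.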

There is an Implementors' Guide -- the primary author is my colleague Ralph Levan -- on the initiative's Wiki (although when I just tried it the link to the current version did not work).

Currently, the Metasearch XML Gateway defines three levels. Levels 1 and 2 are each a non-conformant subset of the NISO SRW/U standard. Level 3 adds capabilities which make it minimally conformant to SRU.

Comments: 6

Aug 25, 2005
Ross

It's great that this has been released. While I realize that this is only a proposal, it at least gives those of us who are developing potential targets and clients some confidence in moving forward with SRW/U.

I think there are two possible ways to offer metasearch at this point: distributed indexes (Z39.50, SRW/U) and localized indexes. The Google Scholar/Scirus/etc. way of building an index of crawled content (and if it were SRU-based content, i.e. well structured, that'd sweeten the pot even more) makes more sense as storage grows cheaper and network latency stays stagnant. The advantage of this is that there is always the SRW/U robustness to fall back on, when necessary.

This presents other problems, of course: authorization, spidering schedules, branding, etc... but if Google is allowed, why can't some library consortia?

Aug 25, 2005
Lorcan Dempsey

Yes. See the discussion in Three stages of library search.

Aug 26, 2005
Ben Toth

I know it's a bit of a generalisation, but professionally we've had little incentive to simplify the search experience for users and quite a lot of incentive to emphasise the complexity and mystery surrounding search. It concerns me that librarians are not 'getting beyond Boole' because we seem to be locked into a mentality which equates finding with result sets. It's not just the fault of librarians - the industry is locked into a business model - creating and maintaining large sets of metadata - that is increasingly irrelevant to connecting users with the content they need.

There are wonderful opportunities to help users connect with the material they need by simplifying search, hiding complexity, and promoting sharing and pooling of knowledge. But we don't seem to be taking those opportunities, or at least not leading them (Hubmed and Attract are innovative UK services, for example, neither of which is led by librarians).

Aug 26, 2005
Ray Denenberg

It's unfortunate that MXG is sometimes represented as "removing the need for query grammar interoperability". That's not what it's about. MXG does take a gentle, phased approach towards interoperability, defining a "level 0" that provides an escape hatch for native queries. Level 0 is merely an intermediate step towards the goal expressed by level 3, to "support a standard query grammar". The grammar is CQL, the Common Query Language (http://www.loc.gov/cql/).

Aug 26, 2005
Yan Han

I agree with your point about simpler search. In general, people are trying to simplify the process by moving or hiding the complexity. As with the evolution of programming languages, from assembly language to C, then to C++ and Java, we see that programming is getting easier (and less error-prone) through reusable libraries written by a few people but used by tens of thousands of programmers. I believe that, in general, libraries should make their services (catalogs, search/find, ILL) easy to use.

Sep 15, 2005
Matthias Steffens

Thanks for the interesting post!

As a PHP programmer of a bibliographic web application who has recently struggled to implement SRU/W+CQL in it, I would greatly appreciate any means that make it easier for developers to support open standards. So a controlled step-wise approach (levels 0-3) towards full CQL support would be welcome.

After a lot of reading I managed to support SRU/W returning data in MODS XML format. But the really hard bit is the CQL query language (especially since there doesn't seem to be a parsing library for PHP). For now I only support the most basic CQL queries (i.e. no boolean CQL operators and no masking characters or parentheses) in my application since, unfortunately, I don't have the time to code a full-blown parser myself. That said, I'd really *LOVE* to support standards like CQL. I think that, currently, implementation is just not easy enough to gain wider adoption (at least with regard to PHP development, which makes up quite a bit of web development, IMHO).

Regards, Matthias
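
For readers wondering what "the most basic CQL queries" might amount to in code, the following Python sketch parses a single search clause of the form "index relation term" (or a bare term) and rejects anything else, including boolean operators and parentheses. It is an illustrative assumption, not the commenter's implementation and not a conformant CQL parser: the names SearchClause and parse_clause are invented, only a handful of relations are recognised, and masking characters are not interpreted.

```python
import re
from typing import NamedTuple, Optional


class SearchClause(NamedTuple):
    index: Optional[str]     # None when omitted; the server then chooses an index
    relation: Optional[str]  # None when omitted along with the index
    term: str


# One clause only: optional "index relation", then a quoted or single-word term.
_CLAUSE = re.compile(
    r'''^\s*
        (?:(?P<index>[A-Za-z_][\w.]*)\s*
           (?P<relation><>|<=|>=|=|<|>|\b(?:any|all|exact)\b)\s*)?
        (?P<term>"(?:[^"\\]|\\.)*"|\S+)
        \s*$''',
    re.VERBOSE | re.IGNORECASE,
)


def parse_clause(query: str) -> SearchClause:
    """Parse one simple search clause; raise ValueError for anything richer."""
    match = _CLAUSE.match(query)
    if match is None:
        raise ValueError("unsupported or malformed query: %r" % query)
    term = match.group("term")
    if term.startswith('"'):
        # Drop the surrounding quotes and undo backslash escapes.
        term = re.sub(r"\\(.)", r"\1", term[1:-1])
    return SearchClause(match.group("index"), match.group("relation"), term)


if __name__ == "__main__":
    print(parse_clause('dc.title = "simple search"'))
    print(parse_clause("fish"))
    try:
        parse_clause("dc.title = fish and dc.creator = smith")
    except ValueError as err:
        print("rejected:", err)
```

A parser of this size covers single-clause searches only. Boolean operators, nesting and proximity are where a real grammar and a proper parsing library become necessary, which is the gap the comment above describes.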