Lorcan Dempsey's weblog On libraries, services and networks.   

Libraries - systems and technologies

Some reading

 •  Categories: Books, movies and reading ... , GLAM , Libraries - systems and technologies , Libraries - organization and services , Metadata , OCLC

Here are links to several unrelated publications.

Reconfiguring the Library Systems Environment

portal: Libraries and the Academy, Vol. 8, No. 2, April 2008.

http://www.oclc.org/research/publications/archive/2008/dempsey-portal.pdf (.pdf: 195K/18 pp.)

[Lorcan Dempsey: Selected publications [OCLC]]

This is a short piece adapted from an earlier blog entry.

Lavoie, Brian, and Günter Waibel. An Art Resource in New York: The Collective Collection of the NYARC Art Museum Libraries. (.pdf: 136K/18 pp.)

[Books and reports [OCLC - Publications]]

The New York Art Resources Consortium (NYARC) includes the Frick Art Reference Library, the Metropolitan Museum of Art’s Thomas J. Watson Library, and the libraries of the Brooklyn Museum and the Museum of Modern Art. This report describes the results of a study of the aggregate collection of these institutions.

Godby, Carol Jean, Devon Smith, and Eric R. Childress. 2008. "Toward Element-level Interoperability in Bibliographic Metadata." The Code4Lib Journal, 2 (2008-03-24). Available online at: http://journal.code4lib.org/articles/54.

[Publications [OCLC - OCLC Research]]

I mentioned this before, but in a message about another topic.


The Integrated Library System: status and future

 •  Categories: Libraries - systems and technologies

Library Management Systems Study: An Evaluation and horizon scan of the current library management systems and related systems landscape for UK higher education [pdf] is a report commissioned by JISC and SCONUL which has just appeared.

It is quite long, including an environmental scan, interviews with library system vendors, reports of a survey of librarians, and feedback from the reference group assembled for the project. (I was a member of the reference group.)


The network level

 •  Categories: Books, movies and reading ... , Libraries - systems and technologies , Libraries - organization and services , OCLC

Jeremy Frumkin of Oregon State University talks to Merrilee Proffitt about library services and moving to the network level in the second PARcast [file is here]. This is in response to the question "what keeps you awake at night?".

What is a PARcast?

Welcome to the OCLC Programs and Research PARcast page. Here you'll find links to our podcasts—the latest recorded interviews with industry thought leaders and up-and-comers—as well as recorded webinars, or online presentations, from Programs and Research staff. [PARcasts [OCLC - Programs and Research]]

Mark Dimunation of the Library of Congress spoke to Merrilee in the first PARcast. His topic:

Special collections need to keep collecting and building collections of real things, but also need to be smart and be part of the digital conversation. How do libraries create a digital environment where researchers can derive the evidence they need to do their work? [PARcasts [OCLC - Programs and Research]]


The big switch

 •  Categories: General - distributed environments , General - systems and technologies , Libraries - systems and technologies , OCLC

I have just finished Nicholas Carr's The Big Switch. Here is a sample:

The complexity and inefficiency of the client-server model have fed on themselves over the last quarter century. As companies continue to add more applications, they have to expand their data centers, install new machines, reprogram old ones, and hire ever larger numbers of technicians to keep everything running. When you also take into account that businesses have to buy backup equipment in case a server or storage system fails, you realize that, as studies indicate, most of the many trillions of dollars that companies have invested into information technology have gone to waste. [The Big Switch, p. 56]
Most of the software and almost all of the hardware that companies use today are essentially the same as the hardware and software their competitors use. Computers, storage systems, networking gear, and most widely used applications have all become commodities from the standpoint of the businesses that buy them. They don't distinguish one company from the next. The same goes for the employees who staff IT departments. Most perform routine maintenance chores - exactly the same tasks that their counterparts in other companies carry out. The replication of tens of thousands of independent data centers, all using similar hardware, running similar software, and employing similar kinds of workers, has imposed severe penalties on the economy. It has led to the overbuilding of IT assets in almost every sector of industry, dampening the productivity gains that can spring from computer automation. [The Big Switch, p. 57]

Carr makes an analogy with electric power. Many years ago, companies would have had their own power generators. This was very inefficient and we moved to a utility model, where generating capacity was concentrated and delivered to others over the electric grid. He foresees the emergence of a similar model with computing and applications, a movement to a utility model where capacity is delivered as required over the network.

Of course, as he notes, this is already upon us. Think of a couple of prominent examples: Amazon Web Services and the range of Salesforce.com's services.

Amazon provides computation, storage and other services on an on-demand basis. Werner Vogels, Amazon CTO, has an interesting presentation where he talks about Amazon's webscale services and discusses their rationale. The subtitle of the presentation is "compete on ideas, not resources". In terms that echo Carr's, he talks about the 70/30 switch, claiming that 70% of a firm's "time, energy and dollars is spent on undifferentiated heavy lifting" in building out infrastructure, while 30% is spent on "differentiated value creation". Amazon wants to help organizations reverse those numbers, reducing the time spent on undifferentiated, increasingly commodity, infrastructure.

I was looking at My Starbucks Idea the other day and was interested to see that it was powered by force.com. This is a suite of on-demand tools from Salesforce.com which claim to allow you to build enterprise applications without any custom development work. What immediately struck me was the way in which the service was promoted, echoing Carr and Vogels: the strapline is "Finally, focus on innovation, not infrastructure". I liked their line:

Free up the dollars wasted "keeping the lights on"

with a zero-infrastructure model.

The 'big switch' is going to be a major issue for libraries over the next few years. They spend too much time getting their systems to work, and not enough time putting them to work.

Of course, much will depend on what types of services are available to libraries from their providers, and it will be interesting to see how those providers reconfigure their offerings in coming years and what new providers emerge.

Note: I was prompted to note the Big Switch after reading and commenting on Mark Dahl's post here.


ILS

 •  Categories: Libraries - systems and technologies

I did a search on ILS in Google earlier.

The top result was for a website called ILSMart.com where the ILS stands for Inventory Locator Service. It is a supply chain company for the aviation and allied businesses.

Inventory Locator Service is not a bad name for what we call the ILS?

The first library-related entry was the fourth, which was the Wikipedia entry for ILS (Integrated Library System).


Library labs

 •  Categories: Libraries - systems and technologies

The National Library of Australia has set up Library Labs:

The aim of this wiki space, which we are calling Library Labs, is to let our friends and colleagues know what we are doing, to invite comments, questions and feedback and to provide a space for discussion and collaboration.

We are particularly interested in forming a community of Australian business analysts and developers who are working on similar problems and who are interested in interoperable, standards-based solutions that can foster the development of a national information infrastructure. We are also interested in working with colleagues at an international level to provide prototypes and testbeds for new and emerging standards.

[Home - Library Labs - National Library of Australia wiki]

There are links to strategy documents and to prototypes.

Via Warwick Cathro.


An effective web presence?

 •  Categories: Libraries - systems and technologies , Libraries - organization and services

The Library at University College Dublin (UCD) invited expressions of interest in a "website and online environment consultancy" a while ago. I thought that the document they prepared as background was nicely done, and that although specific in many ways to UCD it represented a general statement of common issues being addressed by libraries as they grapple with what constitutes an effective web presence.

Reading the document, two general questions came through for me (expressed in my words):

  1. How does one create some unity of approach across the resources to be presented, in terms both of management and presentation? Some of this discussion is about whether it makes sense to move away from a web-page list/browse presentation to a more managed environment for particular resources. Examples would be moving all electronic resources into Metalib or using LibGuides for library subject guides. I was struck by this comment about the catalogue:
    We have taken it for granted that the catalogue as a separate search interface can happily retain a different look and feel from the website and other interfaces but many of our users do not differentiate the online library platforms in this way [UCD Library Web Consultancy].
    They point to the use of Collexis by the Technical University of Delft to provide search based access to information about the library and its services, and wonder if this approach is preferable to the more common static web page approach.

  2. The second question is in some ways more interesting, and more difficult. The document notes that the university has several network environments (course management system, University portal), and this raises questions about the best target environment for library attention. In which environment should services be delivered?

There is also some discussion of general Web 2.0 considerations, although I liked that they resisted the temptation to lead with this.

They ask this question:

Is there a case for our online presence in effect becoming "the library" and the physical library actually becoming a sort of support or adjunct? Our thinking to date has not moved beyond viewing the online environment as an important, but subordinate component and our entire structure reflects that perspective. [UCD Library Web Consultancy]

Until recently, library place, expertise, and services were vertically integrated around collections. Space was needed to house collections and allow readers and writers to consult them. Expertise organized and interpreted them. Services provided access to them. However, this is all reconfigured in a network environment where readers and writers have many resources to call on. In this context, place, expertise, collections and services continue to co-evolve but are no longer so closely integrated and need to be managed to different ends. So, for example, there may be more emphasis on social learning, on high value face to face interaction, on access to scarce equipment or expertise in the context of place. Or, again, library network services will look more to placing expertise or collections at the point of need in research and learning workflows, and to managing institutional outputs.

Thanks to Ros Pan for the link.


Evergreen and Pines

 •  Categories: Libraries - systems and technologies , Libraries - organization and services

There is a lot of buzz around Evergreen: there is excitement about an open source integrated library system. There is another aspect of it that is interesting, which I haven't seen so much discussion about.

Evergreen was developed to support PINES in Georgia. It was designed to support consortial working. (And yes, I understand that individual institutions are looking at it also.) I think that this is interesting as we will probably see more consortial activity over time as the benefits of shared working and aggregate access become clearer. In this context, for example, it will be interesting to see what impact the availability of 'one big library on the web' in Georgia will have on library use.

PINES experienced a whopping 40 percent increase in lending during the past year. A statewide consortium that has grown to include more than 275 public libraries and affiliated service outlets in 137 counties, the Public Information Network for Electronic Services -- PINES, for short -- offers Georgia citizens a shared catalog of more than 9.3 million items, with a single library card that is welcomed in all member libraries. PINES now boasts more than 1.7 million registered cardholders -- just under 19 percent of the state's population but more than 35 percent of the citizens living in a county served by the system. [Use of Georgia's public libraries continues to rise in Internet Age]

As part of this trend to consortial working, I believe that we will see more collaborative sourcing of shared systems. This results in a concentration of technical capacity, which may make those groups more willing to consider open source solutions like Evergreen.


Personal reference collections as digital libraries

 •  Categories: Libraries - systems and technologies , Research, learning and scholarly communication

We will see much more activity connecting user environments and bibliographic resources. I am thinking of citation managers, reading lists, social bookmarking sites (see CiteULike and unalog) and RSS feeds. Some of these may be specifically supported by the library (e.g. a citation manager service), some may be developed within an academic or scholarly context (e.g. Zotero, CiteULike, ...), and some may be general network services. People have multiple ways of creating personal and shared collections of data and links.

They are also an example of an increasingly important aspect of our bibliographic apparatus - we have discovery or 'rendezvous' experiences outside the library resource, where it would be good to be able to link back into a library service for fulfillment, or indeed into other services. As we expose more data to search engines, this provides another example. We don't have robust, general ways of doing this across resource types.

In this context I was very interested to read a report from work done at the University of Minnesota on the ability to resolve references in the RefWorks collections of graduate students and others. Here is the abstract:

Introduction. Digital library users collect, enhance and manage their online reference collections to facilitate their research tasks. These personal collections, therefore, are likely to reflect users' interests, and are representative of their profile. Understanding these collections offers great opportunities for developing personalized digital library services, such as reference recommender systems.

Method. We recruited subjects by individual e-mails to the users of RefWorks - a web-based personal reference management tool installed for use at the University of Minnesota. To participate, subjects needed to give their consent and share their references with us. 96 subjects participated, majority (65) of who were graduate students, resulting into 30,336 references. Based on the type of the reference, these were stratified into one of the three valid identifying IDs - DOI, ISBN, or URL. Multiple reference resolvers (CrossRef, WorldCat) were used to enhance the overall resolvability of these collections.

Analysis. Descriptive statistics and simple graphics analysis were used to describe the dataset.

Results. Over 90% of the total references in users' personal collections could possibly have a valid ID (DOI, ISBN, URL), and therefore, are potentially resolvable. However, only about 17% of the references in these collections had a valid ID, and fewer than 11% actually resolved successfully. Using a combination of reference resolvers, the total resolvability of the references in these collections was enhanced from under 11% to over 41%.

Conclusions. Users' personal reference collections have a tremendous potential of building, supporting, and enhancing personalized digital library services, such as reference recommender systems.
[Resolvability of References in Users' Personal Collections]
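The stratification step described in the abstract can be pictured with a small sketch. This is not the study's code: the record layout (dicts with optional `doi`, `isbn` and `url` keys), the regular expressions and the sample references are all assumptions made for illustration. It only checks whether a reference carries a syntactically plausible identifier; actually resolving it against CrossRef or WorldCat is a separate step not shown here.

```python
import re

# Assumed, simplified patterns; real DOIs and URLs are more varied.
DOI_RE = re.compile(r"^10\.\d{4,9}/\S+$")
URL_RE = re.compile(r"^https?://\S+$", re.IGNORECASE)


def is_isbn(value):
    """Validate the check digit of an ISBN-10 or ISBN-13 (hyphens and spaces ignored)."""
    s = value.replace("-", "").replace(" ", "").upper()
    if len(s) == 10 and re.fullmatch(r"\d{9}[\dX]", s):
        total = sum((10 - i) * (10 if c == "X" else int(c)) for i, c in enumerate(s))
        return total % 11 == 0
    if len(s) == 13 and s.isdigit():
        total = sum(int(c) * (1 if i % 2 == 0 else 3) for i, c in enumerate(s))
        return total % 10 == 0
    return False


def identifier_type(ref):
    """Return 'DOI', 'ISBN', 'URL', or None for a reference dict."""
    if DOI_RE.match(ref.get("doi", "")):
        return "DOI"
    if is_isbn(ref.get("isbn", "")):
        return "ISBN"
    if URL_RE.match(ref.get("url", "")):
        return "URL"
    return None


def summarize(refs):
    """Count references per identifier type and the share that has any valid ID."""
    counts = {"DOI": 0, "ISBN": 0, "URL": 0, None: 0}
    for ref in refs:
        counts[identifier_type(ref)] += 1
    share = (len(refs) - counts[None]) / len(refs) if refs else 0.0
    return counts, share


# Hypothetical sample collection, not data from the study.
SAMPLE = [
    {"title": "A", "doi": "10.1000/xyz123"},
    {"title": "B", "isbn": "0-306-40615-2"},
    {"title": "C", "url": "http://journal.code4lib.org/articles/54"},
    {"title": "D"},
]

counts, share = summarize(SAMPLE)
print(counts, share)
```

The gap the abstract reports, between references that could have an identifier and those that actually carry a valid one, corresponds to the `None` bucket here.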


Library process automation: the ecology of providers

 •  Categories: Featured , General - systems and technologies , Libraries - systems and technologies , Libraries - organization and services , OCLC

Just as I began to see messages about the publication of Marshall Breeding's report on his survey of libraries' perceptions of their system vendors, I was reading The new economics of the BI market by Jerry Held on The Database Column blog.

He talks about consolidation within the BI (Business Intelligence) market: "After more than a dozen acquisitions made by Business Objects, Cognos, and Hyperion over the past few years, these BI tools/analytics industry leaders were themselves snapped up in a matter of months by SAP, IBM, and Oracle respectively." And he notes the earlier consolidation of the underlying database industry around Oracle, IBM and Microsoft.

Held argues that consolidation has improved the overall BI marketplace. It delivers - he suggests - economies of scale and economies of innovation (and, although he does not mention it by name, economies of scope). These 'mega-vendors' offer a range of products. For some customers, the ability to concentrate interaction with a single vendor, a single helpdesk, and a single contract, and to benefit from discounts, are important benefits. For vendors, it should be possible to remove redundant costs in administration and distribution. Competition between a small number of dominant players is good for the market.

He suggests, however, that the mega-vendors find it difficult to innovate or meet new needs; they have a very full array of products spread over a large customer base. This means that there will always be investment available to new entrants who innovate around technology or business models to meet evolving needs.

He points to open source and SaaS (software as a service) as two important business model innovations. He also provides some technology innovation examples, emphasizing performance and price improvements.

Does this map onto the process automation providers within the library community? Here are some thoughts, focusing on the US environment. (And, full disclosure, OCLC has some offerings in some of the areas I discuss below.)

There has definitely been consolidation within the classic ILS environment. This is good in principle, as the library market - not very big to begin with - has been overpopulated with vendors trying to provide a full range of products. In practice, of course, much depends on how the remaining vendors work through integration issues. We can see some potential economies of scope (as diversifying library needs can be met from a single source) and scale (as development, support and R&D are consolidated).

However, none of these vendors is very large, they operate in a small community, and they have limited organic growth opportunities in their historic core. They have moved to meet diversifying library needs with additional products. Accordingly, we have seen that process automation for the 'bought/physical collection' (the ILS) has been joined by process automation for the 'licensed collection' (metasearch, resolution, knowledge base, ERM), and the 'digital collection' (repositories). Other products have also appeared to meet more specific needs (self-service, e-reserves, ...). Recently, a new category of discovery system has emerged which pulls together institutional data (from the ILS and from repositories), and several products have appeared. Now, each vendor has a significant development challenge in creating this full array of products, and we have seen some licensing of other components (support for metasearch or knowledge base, for example). Interestingly, we have not seen these companies acquire new entrants who are also developing these newer products (more on these below).

And, although we have seen some libraries acquire pieces from different vendors, this is not as widespread as one might expect for some of the reasons suggested above. There are economies in dealing with as few vendors as possible. In addition, the library community has quite a personalized relationship with its ILS vendor community which adds to the incentives to acquire various components from the same vendor.

Marshall suggests that 'dissatisfaction and concern prevail' in this marketplace. I think we can expect further consolidation, as the number of vendors here reduces to two or three, maybe with particular specialties.

What about innovation? There is some concern that there has been little innovation in the classic ILS space, which matches Held's observation. That said, we can point to Ex Libris's collaboration with Herbert Van de Sompel around the deployment of resolution as a service as a notable instance, or experimentation with ERM. It is not surprising that as new areas have been identified we have seen a range of new entrants, sometimes emerging from within the library or academic community. See for example Serials Solutions, which aims to provide a complete approach to licensed collections. The metasearch and resolution arena has seen several companies emerge, some of whom syndicate services to other players. See for example Muse Global, Openly Informatics (now part of OCLC), WebFeat or TDnet. And more recently, as we have seen attention to better discovery environments, Aquabrowser is being deployed by some libraries.

One area where innovation has been slow is in how the library systems apparatus engages with the tools that people are increasingly using to organize their own information spaces, at the browser level, or in social bookmarking, social networking, and other network-level sites.

Business model innovation? Held mentions Open Source and SaaS (Software as a Service). We have seen two major areas of open source development. The first is in the area of repositories, where we see Fedora, DSpace, and EPrints. The effort involved in deployment here may be high. Each initiative has gone through some organizational development, looking for ways to sustain itself, and the role of grant/foundation money has been important. The second is in the ILS arena, where Koha and Evergreen are receiving a lot of attention. Koha is more widely deployed; there have been some recent high-profile commitments to Evergreen. There are also some other areas where open source solutions are in use: metasearch (e.g. Index Data, LibraryFind), text searching (e.g. Lucene, Index Data), and a recent interest in 'next generation catalog' solutions (e.g. Solr, VuFind). Index Data has been active for a while, with a strong niche presence in Z39.50 applications and text searching and metasearch offerings. One interesting development is the emerging support industry here, where Care Affiliates, Index Data, Equinox and LibLime will offer support and consultancy. It will be interesting to see how this range of activity develops in coming years. In part it will probably depend on the ability of this nascent support industry to meet mainstream library requirements for support and reliability; and in part of course on the ability to continue to develop the software.

And what about SaaS? SaaS tends to be used quite loosely. Think simply of three levels. The first is where individual instances of an application are hosted. This may save the library some costs (hardware, sysadmin) but does not really alter the service model in other ways. A second is a 'multi-tenancy' model where multiple customers may be served from the same instance, but each with their own virtual application, potentially with configuration options. This may deliver savings but there may also be service improvements. Enhancements, fixes, etc, are available to all at the same time. Serials Solutions' services might be an example here. The third level becomes more interesting where shared use of a service generates network effects. Take a hypothetical example: a supplier could more easily develop recommender systems across multiple circulation systems. An actual example appears to be provided by Aquabrowser's announcement of its MyDiscoveries feature which aims to share user contributions to the catalog across customer instances. The SaaS model has been rapidly adopted in wider contexts, and while there has been some library adoption, it is interesting that there is not a high level of discussion of the approach.
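The hypothetical recommender example can be sketched in a few lines. This is an illustration only, not a description of any vendor's system: the loan tuples, library and item identifiers are invented, and it assumes that items share a common identifier across tenants (a work-level ID, say), which is itself a significant assumption.

```python
from collections import Counter, defaultdict
from itertools import combinations


def co_loan_counts(loans):
    """Count how often two items were borrowed by the same patron.

    loans: iterable of (library_id, patron_id, item_id) tuples.
    Patrons are kept distinct per library, so 'p1' at two libraries is two people.
    """
    by_patron = defaultdict(set)
    for library, patron, item in loans:
        by_patron[(library, patron)].add(item)
    pairs = Counter()
    for items in by_patron.values():
        for a, b in combinations(sorted(items), 2):
            pairs[(a, b)] += 1
            pairs[(b, a)] += 1
    return pairs


def recommend(item, pairs, top_n=3):
    """Return up to top_n items most often co-borrowed with `item`."""
    scored = [(n, other) for (first, other), n in pairs.items() if first == item]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for _, other in scored[:top_n]]


# Invented circulation data for two tenants of one hosted service.
LIB_A = [("lib_a", "p1", "big_switch"), ("lib_a", "p1", "long_tail")]
LIB_B = [
    ("lib_b", "p1", "big_switch"), ("lib_b", "p1", "long_tail"),
    ("lib_b", "p2", "big_switch"), ("lib_b", "p2", "wikinomics"),
]

print(recommend("big_switch", co_loan_counts(LIB_A)))
print(recommend("big_switch", co_loan_counts(LIB_A + LIB_B)))
```

A single library sees only its own co-borrowing patterns; the pooled view across tenants has more evidence and surfaces an extra title. That difference is the network effect the third level points to.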

Marshall writes:

The year 2007 saw considerable upheaval in the library automation industry. To get some sense of the aftermath of the recent rounds of mergers, acquisitions, product consolidations, and to gauge interest in open source automation systems, I created and executed a survey that aims to measure the prevailing perceptions in libraries. [Perceptions 2007: an International Survey of Library Automation]

What is interesting to me is the extent to which the ecology of library process automation is richer than it was a few years ago. If we think of managing three materials workflows (bought/print, licensed/electronic, digitized/digital), and the progressive movement of libraries into the latter two, then we see that library needs are now potentially met by a wide number of players. The classic ILS vendors remain central players, but they have been joined by others.

The ILS vendors have products in all three areas, and are developing new discovery products. We have seen new entrants in the repository space (including CONTENTdm, now owned by OCLC) and in the licensed materials space (resolver, knowledge base, metasearch, ERM) where a variety of products are available from a range of vendors. In this context, the collection of services within the Cambridge Information Group is interesting (Serials Solutions, RefWorks, Illumina, Aquabrowser as well as other bibliographic products). And, of course, OCLC provides services also. Open source offerings have emerged to meet needs across the board.

We will definitely see more convergence alongside further new entrants. It will be interesting to see how the Open Source offerings develop, and I think that we will see some game-changing offerings in the SaaS space.

I hope Marshall repeats the survey. It would be interesting to extend its scope - if that can be done without too much loss of focus - to consider more of the wider process automation landscape.


Pointer to MyDiscoveries via Meredith Farkas.


To 'extract, transform and load' or to federate

 •  Categories: General - systems and technologies , Libraries - systems and technologies , Libraries - distributed environments , Search

One of the major questions for library systems is the role of metasearch or federation. I have written about this here (Metasearch: a boundary case) and here (Metasearch, Google and the rest).

The issue is that libraries have to manage a range of database resources whose legacy technical and business boundaries do not map very well onto user preferences or behaviors. The approach has been to try to move away from presenting a fragmentary straggle of databases to bundling them in various ways in a metasearch application, sometimes in one big search, sometimes in smaller course or subject bundles. The issues here are well-known, not least of which is that libraries typically have limited control over the performance of the target databases.
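The performance point can be made concrete with a small sketch of a metasearch fan-out. It is a toy under stated assumptions: the three targets, their delays and their records are simulated in-process, and `make_target` and `metasearch` are hypothetical names rather than any product's API. The query is sent to every target in parallel and whatever has not answered within the timeout is simply left out of the merged result.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def make_target(delay, records):
    """Build a simulated remote database that answers after `delay` seconds."""
    def search(query):
        time.sleep(delay)
        return [r for r in records if query.lower() in r.lower()]
    return search


def metasearch(query, targets, timeout):
    """Query all targets in parallel; return (merged hits, names of targets that timed out)."""
    pool = ThreadPoolExecutor(max_workers=len(targets))
    futures = {pool.submit(fn, query): name for name, fn in targets.items()}
    done, not_done = wait(futures, timeout=timeout)
    hits = []
    for future in done:
        hits.extend((futures[future], title) for title in future.result())
    # Do not block on slow targets; their threads finish in the background.
    pool.shutdown(wait=False)
    return sorted(hits), sorted(futures[f] for f in not_done)


# Simulated targets with invented content.
TARGETS = {
    "catalogue": make_target(0.0, ["Library systems", "The Big Switch"]),
    "repository": make_target(0.01, ["Library labs report"]),
    "slow_db": make_target(0.6, ["Library metadata"]),
}

hits, timed_out = metasearch("library", TARGETS, timeout=0.2)
print(hits)
print("no answer in time from:", timed_out)
```

The library running the metasearch application can tune the timeout, but it cannot make the slow target faster; the user either waits or sees an incomplete result set.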

As an alternative, a few libraries have explored consolidating locally loaded data. This can work very well, as it becomes easier to build additional services over a consolidated resource. However, this is a rather too adventurous undertaking for most libraries. Another approach is for a third party to consolidate, and this is what we have seen with Google Scholar, Scopus, WorldCat, and others.

More recently, recognizing the advantages of local consolidation, we have seen the emergence of a new class of library system which pulls together metadata from locally managed stores (e.g. digital repository, ILS, institutional repository, ...) and offers an integrated search. This may still have to work closely with a metasearch engine to integrate access to external databases. ILS vendors are moving in this direction, and through WorldCat Local, OCLC is also addressing this type of integration.

This is a discussion worth returning to, but that is not my purpose here. Rather I wanted to point to an interesting treatment of similar issues from a different domain. Mike Stonebraker, database guru and writer in the group blog, The Database Column, has a post where he contrasts two models of data integration: ETL (extract, transform and load) and federation. The focus is on enterprise systems. The ETL model will typically involve a centralized data warehouse and "for each operational system, they will employ some sort of ETL process to transform data instances into the global schema and then load them into the centralized warehouse".

'Extract, transform and load' is a good characterization of what is involved in consolidation of library data, whether this is attempted locally or through third parties. One of the interesting questions is the sophistication of the 'transform'. Think of author names, for example, or subjects, or other controlled data, and what would be involved to effectively merge data created within different regimes. What is the impact, for search or for faceted display, of limited or no transformation of these elements?
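To make the 'transform' question tangible, here is a minimal sketch of an ETL pass over two sources with different schemas. Everything in it is assumed for illustration: the field names, the sample records, and above all the crude matching rule (lower-cased surname plus first initial, diacritics stripped), which is far weaker than real authority control.

```python
import unicodedata
from collections import defaultdict


def normalize_author(name):
    """Reduce 'Surname, Forename' or 'Forename Surname' to a crude match key."""
    decomposed = unicodedata.normalize("NFKD", name)
    plain = "".join(c for c in decomposed if not unicodedata.combining(c))
    if "," in plain:
        surname, _, forenames = plain.partition(",")
    else:
        parts = plain.split()
        if not parts:
            return ""
        # Naive: takes the last token as the surname.
        surname, forenames = parts[-1], " ".join(parts[:-1])
    forename_tokens = forenames.replace(".", " ").split()
    initial = forename_tokens[0][0] if forename_tokens else ""
    return f"{surname.strip().lower()} {initial.lower()}".strip()


def etl(sources, transform=normalize_author):
    """Extract records from each source, transform the author, load into one index."""
    warehouse = defaultdict(list)
    for source_name, records, author_field, title_field in sources:
        for record in records:                          # extract
            key = transform(record[author_field])       # transform
            warehouse[key].append((source_name, record[title_field]))  # load
    return dict(warehouse)


# Invented records: the same hypothetical author described under three regimes.
SOURCES = [
    ("ils", [{"author": "Müller, Anna", "title": "Title A"}], "author", "title"),
    ("repository", [{"creator": "Anna Müller", "name": "Title B"}], "creator", "name"),
    ("eprints", [{"creator": "A. Muller", "name": "Title C"}], "creator", "name"),
]

print(etl(SOURCES))                          # one author heading, three titles
print(etl(SOURCES, transform=lambda s: s))   # no transformation: three headings
```

With no transformation, an author facet shows three entries for one person. The naive surname rule also shows why this is hard: given "Herbert Van de Sompel" in forename-first order it would pick "Sompel" as the surname.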

Here are the headings Stonebraker uses for his discussion.

  • Data element "heat": Hot data favors ETL
  • Indexing: Federation is harder to optimize
  • Resource management: Faster BI query responses for ETL shops
  • Complexity of the schema change: ETL approach performs less joins
  • Contention (concurrency control): Federation contention challenges
  • Timeliness: ETL approaches must deal with out-of-date data issues
  • Mapping: Federations can't handle some transformations

BI is short for 'business intelligence'. 'Hot' data is data that is accessed often.
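For contrast, the federation side of these headings can be sketched as a query-time fan-out, which is roughly what metasearch does. This is a minimal illustration under stated assumptions: the two in-memory sources, the helper `make_source`, and the function `federated_search` are hypothetical, and nothing here reflects any particular product.

```python
from concurrent.futures import ThreadPoolExecutor


def make_source(records):
    """Wrap a list of records as a searchable 'remote' source (hypothetical)."""
    def search(term):
        return [r for r in records if term.lower() in r["title"].lower()]
    return search


# Each source keeps its own data and its own search; nothing is copied centrally.
SOURCES = {
    "catalog": make_source([{"title": "Metadata basics"},
                            {"title": "Cataloging rules"}]),
    "repository": make_source([{"title": "Thesis on metadata quality"}]),
}


def federated_search(term, sources):
    """Send the query to every source at search time and merge the answers."""
    with ThreadPoolExecutor(max_workers=len(sources)) as pool:
        futures = {name: pool.submit(fn, term) for name, fn in sources.items()}
        merged = []
        for name, future in futures.items():
            # The slowest source sets the overall response time.
            for record in future.result():
                merged.append({"source": name, **record})
    return merged


hits = federated_search("metadata", SOURCES)
```

The data is always current because it is read where it lives, but every query pays for every source, and there is no single place to index, rank or normalize across them.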

Now, while it is clear that our environment is similar in many ways to that discussed here, it would be interesting to do a similar analysis with our domain in mind to see where there are differences. Of course, one issue is that most of the data under discussion here seems to be within institutional control.

Here is his conclusion:

In summary, virtually all enterprises use the ETL approach for data integration. The data federation market is, in contrast, quite small. The place where I see federations as most viable is when there are many, many data sources (e.g., more than 5,000 sources) and BI users utilize only a small number of them at any given time. In this extreme case, the average data element is accessed zero times before it is updated or deleted. In this instance, one is better off leaving the data where it originates. On the other -- more common -- hand, when most data elements get used several times, the ETL approach will continue to be preferred. [To ETL or federate ... that is the question - The Database Column]


Library website analytics

 •  Categories: General - systems and technologies , Learning and research - systems and technologies , Libraries - systems and technologies , Marketing , User experience

Our activities in the network world leave traces. The analysis of these traces is now a major undertaking as organizations mine this data to understand behaviors, to improve their systems, and to refine their offer.

Tony Hirst has a series of posts about 'course analytics':

In contrast to the academic analytics, one of the things I set out to explore was how an off the shelf web stats analytics tool (Google Analytics) could be used to help me learn more about what students were doing with our online course materials, and help me identify what - if anything - a "learning site's" goals could be, and what the site might be optimised for. [OUseful Info: Course Analytics - Prequel]

And further ....

For the moment, what I am interested in is how website analytics can be used applied to online course websites in order to gain a better understanding of online study habits and the bahaviour of students taking an online course. [OUseful Info: Course Analytics, Part 1 - Visitor Behaviour]

He provides some interesting analysis, looking at how students use course materials. He then extends the question to the library website, and based on discussion with his Open University library colleagues he suggests a list of questions that might be tackled with this approach. What sort of search engine searches result in referrals to the library website, for example? How well is actual page popularity mapped by front page navigation options? And so on.

He wonders what success looks like:

How to define library website goals is another interesting exercise... If the site was Amazon, where the aim is to sell goods, a relevant goal page would be a "Thanks for the cash - the goods will be with you in a day or two" page. What is the range of useful, successful transactions on a Library website? [OUseful Info]

He is interested in hearing from libraries that use Google Analytics, or similar off-the-shelf approaches, and about what they are measuring. If you have some experience, leave him a comment .....


CISTI lab

 •  Categories: Libraries - systems and technologies , User experience

I was in CISTI in Ottawa yesterday and saw an interesting and regrettably brief presentation from Glen Newton about the work of their research group. They are doing some nice things to layer useful functionality across large data sets (e.g. clustering, mapping citation patterns, recommendation, ...).

Some examples of their work are visible through CISTI Lab:

CISTI Lab is an experimental site for demonstrating and evaluating prototype software and services developed by CISTI staff and research partners. [NRC-CISTI, Welcome to CISTI Lab]

Projects are linked from a Wiki. They include Ungava:

Ungava - explores ways of navigating full-text search and the visualization of search results for articles in the NRC Research Press collection (demo) and the Colorado State University Libraries catalogue (demo). Includes a drill cloud implementation. [Main Page - CISTI-ICIST LAB WIKI]

The public test collections are smallish. Ungava is built with Lucene and incorporates some work from the Simile project at MIT, including Exhibit and Timeline. It implements COinS. It also uses what it calls drill clouds:

Ungava extends tag clouds to make them a useful tool for search refinement. That is, to use a tag cloud to refine an existing query by adding new elements to the query through interactions with the cloud. As this results in a kind of drill-down search behaviour, these new clouds have been named drill clouds. [Drill Clouds - CISTI-ICIST LAB WIKI]
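The drill-down behaviour described in that quote can be sketched in a few lines. This is not Ungava's code (which is built on Lucene); it is a toy in-memory illustration, and the documents, terms and function names are all invented for the example.

```python
from collections import Counter

# Hypothetical documents, each reduced to a set of index terms.
DOCS = [
    {"id": 1, "terms": {"salmon", "river", "habitat"}},
    {"id": 2, "terms": {"salmon", "river", "dam"}},
    {"id": 3, "terms": {"salmon", "ocean"}},
    {"id": 4, "terms": {"trout", "river"}},
]


def search(query):
    """Return documents that contain every query term (boolean AND)."""
    return [d for d in DOCS if query <= d["terms"]]


def drill_cloud(query, results, size=5):
    """Count terms across the current results, skipping terms already queried."""
    counts = Counter(t for d in results for t in d["terms"] if t not in query)
    return counts.most_common(size)


def drill(query, term):
    """Clicking a cloud term adds it to the query, narrowing the result set."""
    return query | {term}


query = {"salmon"}
results = search(query)                 # documents 1, 2 and 3
cloud = drill_cloud(query, results)     # 'river' is the most frequent term
query = drill(query, cloud[0][0])       # the user clicks the top term
narrowed = search(query)                # documents 1 and 2
```

The difference from an ordinary tag cloud is the last two steps: the cloud is computed from the current result set, and selecting a term refines the existing query rather than starting a new one.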


Processes and repositories

 •  Categories: Digital asset management , Libraries - systems and technologies , Libraries - organization and services , Marketing , Metadata , Research, learning and scholarly communication , The cultural and scholarly record , User experience

I find it convenient to think about current library systems activities in terms of support for three materials workflows: bought/print materials, licensed/electronic materials, and digital/digitized materials. This is being pragmatic rather than pure, and is open to challenge on many grounds. I have discussed these at more length here, and suggested some ways in which they are developing. Development is in two directions: each of the areas continues to develop itself, while at the same time there is a growing desire to find better ways of working across them (e.g. at the discovery layer, or in terms of a more unified approach to metadata creation/management).

Now, we have an agreed and well-understood set of processes around the first category. These are encapsulated in the integrated library system, and still quite strongly influence library organization. These include things like selection, acquisition, cataloging, circulation, catalog, and so on.

We have a less well agreed set of processes around the second area, and an emerging apparatus of systems support. This includes resolvers, ERM systems, A to Z lists, metasearch, and so on. A level of agreement is apparent in that substitutable systems are now available to support this activity. However, differences in organizational structure to support the area and low takeup of ERM systems suggest that we are in early days. One place where there is likely to be further evolution relates to the creation, management and sharing of the data used to drive these systems.

And we have a much less well agreed set of processes around the third area. Libraries are exploring repositories for digitized collections, they are creating institutional repositories, and building workflows for content preparation and ingest, metadata creation, and so on. In fact, there is no agreed level of service in this area: you do not naturally expect to find particular services here in the way, for example, that you expect to find a circulation system. Of course, this lack of agreement makes this a potentially expensive area. There is a lot of figuring out what to do, and routine off-the-shelf tools or services may not necessarily exist across the range of what you want to do.

This is an overly complex systems landscape, and it will have to be rationalized in coming years so that libraries can spend more time putting their systems to work in support of their users and less time actually getting their systems to work together at all.

Anyway, this is by way of prelude to an observation about repositories. A couple of repository launches have come over my horizon in recent weeks.

The first is the Digital Conservancy at the University of Minnesota, which I mentioned the other day. This aims to provide services in relation to two classes of material: faculty research outputs and university administrative materials that traditionally would have gone to the University Archives. As I suggest in my post, this makes a lot of sense: the repository aims to support the full range of institutionally produced intellectual outputs.

The second was the Open University's Open Research Online, "a repository of our research publications and other research outputs." In this case, the service aims to provide support for all the research outputs of OU academics. So, what you will find are deposited open access materials. However, you will also find citations to books, journal articles, and so on, which are not actually available in the repository: you may be referred to a publisher site. The repository aims to provide a full record of research activity, not only the open access materials.

What we have here, then, are well worked-through services which offer overlapping but different views onto their universities' intellectual outputs. This is not a major issue as universities work towards a view of what should be offered and what their constituencies value.

However, in the longer term, lack of agreement about services and supporting processes may be a barrier, on the management side where different systems support is needed, or on the user side where different services from different universities may lead to confusion, reducing the gravitational pull that familiarity supports.

Aside: Of course, in the longer run also, there are interesting questions about the relationship between these institutional services and network level services but that is a discussion for another day.


The network reconfigures the library systems environment

 •  Categories: Featured , Learning and research - distributed environments , Libraries - systems and technologies , Libraries - distributed environments , Libraries - organization and services , User experience

One of the main issues facing libraries as they work to create richer user services is the complexity of their systems environment. Consider these pictures which I have been using in presentations for a while now.

libsystemsenv.png

Reductively, we can think of three classes of systems - (1) the classic ILS focused on 'bought' materials, (2) the emerging systems framework around licensed collections, and (3) potentially several repository systems for 'digital' resources. Of course, there are other pieces but I will focus on these.

In each case what we see is a backend apparatus for managing collections, each with its own workflow, systems and organizational support. And each with its own - different - front-end presentation and discovery mechanisms. What this means is that the front-end presentation mirrors the organizational development over time of the library backend systems, rather than the expectations or behaviors of the users.

You have the catalog here, maybe several options for licensed resources (a-to-z, metasearch, web pages of databases, and so on) over there, and potentially several repository interfaces (local digitized materials, institutional repository) somewhere else.

This is one reason that people have difficulties with the library website. Effectively, it is a layer stretched over a set of systems and services which were not designed as a unit. Indeed, in some cases, they were not originally designed to work on the web at all. So what do we have?

ILS: a management system for inventory control of the 'bought' collection (books, DVDs, etc). The catalog is bolted onto this and gives a view onto this part of the collection. In effect, in virtue of its integration with inventory management, the catalog provides discovery (what is in the collection), location (where those things are) and request (get me those things) in a tightly integrated way. The ILS and catalog may be part of a wider apparatus of provision, and may have mechanisms for interfacing to resource sharing systems of one sort or another. The management side may have interfaces to a variety of other systems for sharing and communicating data: procurement, finance, student records. And there will be a flow of data into the system, from jobbers, as part of a shared cataloging environment, and so on.

Licensed: This has been an area of rapid recent development as the journal literature moved to electronic form. On the backend we now see a variety of approaches, and the frontend can be very confusing with lists of databases and journals presented in various ways, often in uncertain relation to the catalog (where do I look for something?). We are now seeing the emergence here of an agreed set of systems around knowledge-base, ERM, resolution and metasearch, and there is rapidly developing vendor support. This is the range of approaches for which Serials Solutions has proposed the ERAMS name. These systems require the management of new kinds of data, and mechanisms are being put in place, certainly not yet optimal, for the creation, propagation and sharing of this data. With journals data, discovery, location and request are not so tightly coupled as they were with the catalog. Discovery has happened in one set of tools (A&I databases), but then the appropriate title may have to be located in another tool (the catalog for example) and, if not available locally, requested through yet another system. The importance of the resolver, and the enabling OpenURL, has been to tie some of these things together and remove some of the human labor of making connections between these systems. And metasearch has been seen as a way of reducing human labor by providing a unified discovery experience over disparate databases. However, this whole apparatus is still not as well-seamed as it needs to be, and users and managers still do more work than they should to make it all work.
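To illustrate how the OpenURL ties discovery to location, here is a minimal sketch that packs a citation into an OpenURL 1.0 (Z39.88-2004) key/encoded-value link. The resolver address is a made-up placeholder and the function name `build_openurl` is an assumption for the example; the `rft.` keys are from the standard's journal metadata format, and the citation is the portal article listed earlier on this page.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical institutional resolver; each library would supply its own base URL.
RESOLVER_BASE = "https://resolver.example.edu/openurl"


def build_openurl(citation, base=RESOLVER_BASE):
    """Encode a journal-article citation as an OpenURL query string."""
    params = {
        "url_ver": "Z39.88-2004",
        "ctx_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
    }
    # The 'rft.' prefix marks each field as describing the referent,
    # that is, the item the user wants to get to.
    params.update({f"rft.{key}": value for key, value in citation.items()})
    return f"{base}?{urlencode(params)}"


citation = {
    "genre": "article",
    "aulast": "Dempsey",
    "atitle": "Reconfiguring the Library Systems Environment",
    "jtitle": "portal: Libraries and the Academy",
    "volume": "8",
    "issue": "2",
    "date": "2008",
}

link = build_openurl(citation)

# A resolver would parse the same fields back out and look them up
# against its knowledge base of licensed holdings.
received = parse_qs(urlparse(link).query)
```

The discovery tool only needs to know the citation and the user's resolver; the resolver decides, at click time, which licensed copy or request option is appropriate for that user.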

Repository: Libraries are increasingly managing digital materials locally and supporting repository frameworks for those. This includes digitized special collections, research and learning materials in institutional repositories, web archives, and so on. There are a variety of repository solutions available, some open source. Typically, the contents of the repository backend may be available to repository front-ends on a per-repository basis. Here, discovery (what is there), location (where is it) and request and delivery are typically tightly integrated. Repositories may also have interfaces for harvesting or remote query. On the management side, metadata creation and material preparation may still be labor-intensive.

OK, so here are some general observations about this environment:

  • There is still a major focus - in terms of attention, organizational structures, and resource allocation - on the systems and processes around the ILS and the bought collection. In academic libraries, we will surely see some of this move towards the systems and processes around the licensed collections given the rising relative importance of this part of the collection. The repository strand of activity, associated with emerging digital library activities, may, in some cases, be supported from grant or other special resources. It will need to become more routine.
  • The fragmentation of this systems activity, the multiple vendor sources, the different workflows and data management processes, and the absence of agreed simple links between things mean that the overall cost of management is high.
  • There is also another cost: diminished impact and lost opportunity. The awkward disjointedness described above also means that it is difficult to mobilize the consolidated library resource into other environments, course management or social networking systems for example. It is difficult to flexibly put what is wanted where it is wanted.
  • There has been much discussion of library interoperability, but it has tended to be about how to tie together these individual pieces, or about tying pieces to other environments (how do I get my repository harvested for example). There has been less focus on how you might abstract the full library experience for consumption by other applications - a campus portal for example.

This in turn means several things.

  • We will see more hosted and shared solutions emerge, which offer to reduce local cost of ownership. And, of course, we are seeing vendors consider more integration between products. In particular it is interesting seeing the concentration on support for the licensed e-resources emerge strongly, as well as discussion about integrated discovery environments.
  • Over time, we can expect to see some more reconfiguration in a network environment. Shared cataloging and externalizing the journal literature have been two significant reconfigurations in the past. The pace of current developments suggests that we may be ready for other ways of collaboratively sourcing shared operations. For example, does it make sense for there to be library by library solutions for preservation, social networking, disclosure to search and social networking engines, and so on?

The next picture tries to capture an important direction that has emerged in the last year or so.

environments.png

For many of the reasons identified above, we are seeing a growing interest in separating the discovery and presentation front end from the management backend across this range of systems. Why? Well, because it is becoming clearer, as I suggested in my opening, that legacy system boundaries do not effectively map user preferences. And because fragmentation adds to effort and accordingly diminishes impact.

What about the discovery side? So, we saw metasearch, a partial response to fragmentation of A&I databases. We are now seeing a new generation of products from the 'ILS vendors' which look at unifying access to the library collection: Encore, Primo, Enterprise Portal Solution. However, discovery has also moved to the network level. So, folks discover resources in Amazon, Google, Google Scholar. And OCLC is working to create discovery experiences which connect local and network through Worldcat Local, Worldcat.org and Open Worldcat.

And on the management side? Here the variety of workflows and systems adds cost, as resources are managed on a per-format basis. We can expect to see simplification and rationalization in coming years as libraries cannot sustain expensive diversity of management systems. The National Library of Australia's discussion of a 'single business' systems environment, or Ex Libris's discussion of Uniform Resource Management are relevant here. It is likely that there will be a growing investment in collaboratively sourced solutions, as libraries seek to share the costs of development and deployment.

As discovery peels off, then the issue of connecting discovery environments back to resources themselves becomes very important. It is interesting to look at Google Scholar in this regard, as different approaches are required for the three categories identified above. It has worked with OCLC and other union catalogs to connect users through to catalogs and the ILS; it has worked with resolver data to connect users through to licensed materials; and it has crawled repositories and links directly to digital content.

Given this great divide, several issues become very important:

  • Routing, resolution and registries become critical, as one wants to enable users to move easily from a variety of discovery environments to resources they are authorized to use. We need a richer apparatus to support this. (I have discussed the role of registries elsewhere.)
  • Libraries have thought about discovery. There is now a switch of emphasis to disclosure: libraries need to think about how their resources are best represented in discovery environments which they don't manage. (I have also discussed disclosure in more detail elsewhere in these pages.)
  • And again, how we present library services for consumption by other environments becomes an issue. For example, we are lacking an ILS Service Layer, an agreed way of presenting the functionality of the ILS so that it can be placed, say, in another discovery environment (shelf status, place a hold, etc).
  • Better discovery puts more pressure on delivery, whether from a local collection, throughout a consortium, or in broader resource sharing or purchase options. Streamlining the logistics of delivery and providing transparency on status at any stage for the user (as they can do with UPS or Amazon) become more important.
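The idea of an ILS service layer, mentioned in the list above, can be sketched as a small contract that any discovery environment could code against. Everything here is hypothetical: no such agreed interface exists in the text, and the names `ILSServiceLayer`, `ItemStatus`, `InMemoryILS` and `render_result` are invented purely to show the shape of the idea.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class ItemStatus:
    item_id: str
    location: str
    available: bool


class ILSServiceLayer(Protocol):
    """Hypothetical contract: what a discovery environment needs from any ILS."""

    def shelf_status(self, item_id: str) -> ItemStatus: ...

    def place_hold(self, item_id: str, patron_id: str) -> bool: ...


class InMemoryILS:
    """Stand-in for a real ILS; a vendor adapter would offer the same methods."""

    def __init__(self, items):
        self._items = {item.item_id: item for item in items}
        self.holds = []

    def shelf_status(self, item_id):
        return self._items[item_id]

    def place_hold(self, item_id, patron_id):
        if item_id not in self._items:
            return False
        self.holds.append((item_id, patron_id))
        return True


def render_result(ils: ILSServiceLayer, item_id: str) -> str:
    """A discovery front end shows status without knowing which ILS is behind it."""
    status = ils.shelf_status(item_id)
    state = "on shelf" if status.available else "checked out"
    return f"{status.item_id}: {state} at {status.location}"


ils = InMemoryILS([ItemStatus("b1", "Main stacks", True)])
line = render_result(ils, "b1")
hold_ok = ils.place_hold("b1", "patron-42")
hold_missing = ils.place_hold("no-such-item", "patron-42")
```

The point of the sketch is the separation: `render_result` depends only on the contract, so the same discovery code could sit in front of any backend that honours it.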

radiant2.png

And finally ....

We are used to thinking about better integration of library services. But that is a means, not an end. The end is the enhancement of research, learning and personal development. I discussed above how we want resources to be represented in various discovery environments. Increasingly, we want to represent resources in a variety of other workflows. These might be the personal digital environments that we are creating around RSS aggregators, toolbars and so on. Or the prefabricated institutional environments such as the course management system or the campus portal. Or emerging service composition environments like Facebook or iGoogle. As well as in network level discovery environments like Google or Amazon that are so much a part of people's behaviors.

Libraries need to focus more attention on reconfiguring library services for network environments. This is the main reason for streamlining the backend management systems environment. It does not make sense to spend so much time on non-value creating effort.


The integrated library system

 •  Categories: Libraries - systems and technologies

A couple of notes on the Integrated Library System.

A discussion document on the ILS, prepared at the University of Windsor, has been made available. [Future of the Integrated Library System]

This pulls together some commentary on the current state and preferred directions for the ILS. Thanks to the authors for kind references to some of my own pieces!

JISC and SCONUL wish to commission a project to conduct an evaluation and horizon scan of the current Library Management Systems (LMS) landscape in HE. The main focus is on LMS (including Electronic Resource Management systems) but does not preclude consideration of other, related systems. [ITT: Library Management Systems : JISC]

Depending on who wins the bid, the JISC/SCONUL study offers a real opportunity to advance discussion here. There is enough money available to fund a sustained investigation. The report is scheduled to be complete some time in early 2008.

This space is becoming more interesting as the 'classic' ILS vendors reconfigure their offerings, and as the more general library systems environment continues to evolve.


QOTD: Libraries and learning management systems

 •  Categories: Learning and research - systems and technologies , Libraries - systems and technologies , ebooks and other e-resources

I have just acquired Virtual Learning Environments: Using, Choosing and Developing your VLE by Martin Weller. VLE is a term popular in the UK for Course or Learning Management System. Weller is Professor of Educational Technology at the Open University where he directed their VLE project. The Open University is a major user of the open source Moodle system.

Chapter 6 is about the Managed Learning Environment (MLE), a term used for the ensemble of systems and services which support learning in an institution. Weller identifies portal, library, student record system and content management system as important other components in the MLE. His discussion about libraries and the interaction with the VLE is concentrated in this chapter.

The relationship between VLEs and library systems reflects the changes in practice and internal politics wrought by the advent of e-learning perhaps more than any of the other systems. There is a sense in which the very identity of libraries and their function in the educational process is at stake. Just as e-learning has induced much navel gazing and concern amongst educators regarding their role, and the potential commoditization of education, so it is with librarians. The answer, however, is largely the same - e-learning makes the store of information less significant, but in such an information-rich world it makes the skills of dealing with information more valuable. [p. 67]

He then suggests a continuum of potential for the library, from redundant to central. "At one extreme the need for a library becomes superfluous - at its simplest this might be categorized as 'I've got Google, what do I need a library for?'" [p. 67] In this redundant model, necessary materials are loaded into the VLE, and it points to other resources out on the open web.

In the central approach, the library mediates access to content within the VLE, providing value in selection, purposing to particular tasks, metasearch and so on.

The VLE and library interface then is one fraught not only with problematic technical issues, but also with a political dimension. There have been no shortage of projects examining the interface between the two, indeed there is something of a project overload, without a real consensus reached as to the ideal configuration. The main areas where the two systems interface is with the location of resources, and more specifically the following:
  • locating and importing resources into a VLE;
  • storing data about new types of resources, for example learning objects, within library catalogues;
  • managing rights and clearance for resources;
  • indexing and describing resources. [p. 68]

He mentions that the VLE may be managed within different organizational contexts, including some in which it is located within the library. The relationship between the library and the learning management system has indeed been a topic of much discussion in recent years. And we are seeing a growing discussion about the role of the library in relation to e-science and data curation. As more activities move onto the network, workflow and information management become pervasive issues which prompt interesting questions about how academic support services are best configured.


Moving to a 'single business' systems environment

 •  Categories: General - distributed environments , Libraries - systems and technologies , Libraries - distributed environments , Libraries - organization and services

The National Library of Australia has made an interesting report available, National Library of Australia IT Architecture Project Report, March 2007. [pdf] Here is the declared purpose:

The aim of this report is to define the IT architecture that will be needed to support the management, discovery and delivery of the National Library of Australia's collections over the next three years. The current architecture has allowed the library to develop a significant digital library capability over the last decade. Now the burden of maintaining and supporting existing systems and services is increasingly hindering us from bringing new services online, improving the user experience, exploring new ideas or responding to technological change. In the meantime, enormous changes are occurring in the broader environment.

The report identifies three major responses within the context of a new framework for digital library services (I discuss them in a different order from the one in which the report presents them). One, it recommends a move to a service-oriented architecture. The grounds for this are clear, and clearly made in the report. They include the ability to share common services across applications, to be able to respond to change effectively, and to reduce over time the redundancy, cost and complexity of development.

Two, it argues for using open source solutions where they are 'functional and robust'. It notes an amendment to the prior policy, which favored buying over building. The Library will now consider open source solutions based on function and cost comparisons. The assessment of cost will not only include consideration of the direct costs of additional development but also the benefits of contributing code to the community and, interestingly, the opportunity costs of using commercial software whose development path is not aligned with library direction and need. The report notes the possibility of collaboratively sourcing some functionality with partners.

And three, the report talks about a 'single business' approach. This was the most interesting aspect of the document to me, because it underscores a major issue for libraries and the systems they deplo