May 07, 2008
•
Categories:
Books, movies and reading ...
, GLAM
, Libraries - systems and technologies
, Libraries - organization and services
, Metadata
, OCLC
Here are links to several unrelated publications .....
Reconfiguring the Library Systems Environment. portal: Libraries and the Academy, Vol. 8, No. 2, April 2008. http://www.oclc.org/research/publications/archive/2008/dempsey-portal.pdf (.pdf: 195K/18 pp.) [Lorcan Dempsey: Selected publications [OCLC]]
This is a short piece adapted from an earlier blog entry.
Lavoie, Brian, and Günter Waibel. An Art Resource in New York: The Collective Collection of the NYARC Art Museum Libraries. (.pdf: 136K/18 pp.) [Books and reports [OCLC - Publications]]
The New York Art Resources Consortium (NYARC) includes the Frick Art Reference Library, the Metropolitan Museum of Art’s Thomas J. Watson Library, and the libraries of the Brooklyn Museum and the Museum of Modern Art. This report describes the results of a study of the aggregate collection of these institutions.
Godby, Carol Jean, Devon Smith, and Eric R. Childress. 2008. "Toward Element-level Interoperability in Bibliographic Metadata." The Code4Lib Journal, 2 (2008-03-24). Available online at: http://journal.code4lib.org/articles/54. [Publications [OCLC - OCLC Research]]
I mentioned this before, but in a message about another topic.
April 24, 2008
•
Categories:
General - distributed environments
, Metadata
, Standards
There is an interesting note on the Google Webmaster Central Blog:
When we originally launched Sitemaps, we included support for the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) 2.0 protocol, an interoperability framework based on metadata harvesting. In the meantime, however, we've found that the information we gain from our support of OAI-PMH is disproportional to the amount of resources required to support it. Fewer than 200 sites are using OAI-PMH for Google Sitemaps at the moment.
In order to move forward with even better coverage of your websites, we have decided to support only the standard XML Sitemap format by May 2008. We are in the process of notifying sites using OAI-PMH to alert them of the change. [Official Google Webmaster Central Blog: Retiring support for OAI-PMH in Sitemaps]
Via Paul Walk, who remarks:
There are a few ways of looking at this. Perhaps ‘open access’ repositories are less concerned with Google rankings than the typical website owner. Perhaps the penetration of OAI-PMH in the world is still below any level that Google could find particularly interesting - certainly they never went to great lengths to advertise this support while it lasted. Clearly, Google have come to the end of a ‘trial period’ for their support for this protocol in their main indexing service. [paul walk’s weblog » Blog Archive » Google gives up on supporting OAI-PMH for Sitemaps]
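For a repository affected by this change, the practical step is to list item URLs in the standard XML Sitemap format. What follows is only a minimal sketch, assuming the repository can already produce landing-page URLs with last-modified dates; the `records` list and the example.org URLs are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(records):
    """Return a sitemap XML string for (url, lastmod) pairs."""
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for url, lastmod in records:
        entry = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = url
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical repository items (URL, date last modified).
records = [
    ("http://repository.example.org/item/1", "2008-04-01"),
    ("http://repository.example.org/item/2", "2008-04-15"),
]
sitemap_xml = build_sitemap(records)
print(sitemap_xml)
```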
March 31, 2008
•
Categories:
Identity management, IPR and e-commerce
, Metadata
, OCLC
My Programs colleagues have released an interesting review [pdf] of copyright investigation practices across several RLG Partners.
In this project, staff from eight partner institutions participated in copyright investigation interviews between August and September 2007 to share the ways in which their institutions currently obtain copyright permission to provide users with access to high-risk or special collection materials. [Copyright investigation summary report - PDF]
It’s also important to note that staff who participated from almost every institution expressed a sense of “just getting started” or “realigning efforts to be more consistent across campus and across library units.” Almost all of the staff interviewed were in newly created positions; several noted that conducting copyright investigations in a centralized fashion was a new area of focus for their institutions. [Copyright investigation summary report - PDF]
This report is interesting in its own right as a review of practices. It also contributes background information to ongoing work at OCLC exploring a Registry of Copyright Evidence.
March 29, 2008
•
Categories:
Knowledge organization and representation
, Metadata
Stanford researchers collected data from del.icio.us and came to some pretty interesting conclusions about tagging. Of course, they are talking about tagging of web pages where the text of the tagged item is available for indexing.
Social bookmarking is a recent phenomenon which has the potential to give us a great deal of data about pages on the web. One major question is whether that data can be used to augment systems like web search. To answer this question, over the past year we have gathered what we believe to be the largest dataset from a social bookmarking site yet analyzed by academic researchers. Our dataset represents about forty million bookmarks from the social bookmarking site del.icio.us. We contribute a characterization of posts to del.icio.us: how many bookmarks exist (about 115 million), how fast is it growing, and how active are the URLs being posted about (quite active). We also contribute a characterization of tags used by bookmarkers. We found that certain tags tend to gravitate towards certain domains, and vice versa. We also found that tags occur in over 50 percent of the pages that they annotate, and in only 20 percent of cases do they not occur in the page text, backlink page text, or forward link page text of the pages they annotate. We conclude that social bookmarking can provide search data not currently provided by other sources, though it may currently lack the size and distribution of tags necessary to make a significant impact. [Heymann, Paul; Koutrika, Georgia; Garcia-Molina, Hector: Can Social Bookmarking Improve Web Search?]
In general they found that users thought that tags were objective and relevant. They highlight results throughout the paper. I thought the conclusion they drew from this result quite interesting:
Result 11: Domains are often highly correlated with particular tags and vice versa.
Conclusion: It may be more efficient to train librarians to label domains than to ask users to tag pages.
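The kind of measure behind the finding that tags often occur in the text of the pages they annotate can be illustrated roughly as follows. This is a toy sketch, not the researchers' method: the `bookmarks` data is invented and the whole-word matching is an assumption.

```python
def tag_overlap(bookmarks):
    """Fraction of (page, tag) pairs whose tag appears as a word in the page text."""
    total = 0
    found = 0
    for page_text, tags in bookmarks:
        words = set(page_text.lower().split())
        for tag in tags:
            total += 1
            if tag.lower() in words:
                found += 1
    return found / total if total else 0.0


# Invented sample: (page text, tags applied by users).
bookmarks = [
    ("python tutorial for beginners", ["python", "programming"]),
    ("library metadata standards overview", ["metadata", "library"]),
]
print(tag_overlap(bookmarks))
```

A tag that never appears in the page text (here, "programming") is the case where tagging adds information that full-text indexing does not already have.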
March 23, 2008
•
Categories:
GLAM
, Libraries - organization and services
, Metadata
, Research, learning and scholarly communication
, The cultural and scholarly record
An interesting announcement from CLIR about a $4.27M competitive program to describe hidden collections has just appeared. The existence of such collections must be more fully disclosed if they are to release more of their value in research and learning:
With generous funding from The Andrew W. Mellon Foundation, the Council on Library and Information Resources is creating a national program to identify and catalog hidden special collections and archives. The records and descriptions obtained through this effort will be accessible through the Internet and the Web, enabling the federation of disparate, local cataloging entries with tools to aggregate this information by topic and theme. [Hidden Collections]
This is a preliminary announcement and it will be interesting to see how the thinking behind the program is elaborated as more materials appear. The call for proposals will be in June. In particular, I will be interested to see some of the observations about organization, formats and federation frameworks expanded. See for example the following statements which relate to each of these topics respectively:
The program's strategy for building a distributed organization of cataloging and collection information assumes local autonomy and responsibility but also requires centralized agreements concerning governing principles that will ensure enterprise-wide coherence. [Hidden Collections]
Because tightly defined fields can impede interoperability, recent reports on hidden collections emphasize the need to make the categories and schemes of record creation and descriptions less rigid than those of the past. Cataloging special collections and archival materials has routinely been defined as a local practice. The shift to understanding hidden collections as a national problem entails an acknowledgment that in the 21st century, collaboration, coordination, and coherence of response to users is fundamental and takes precedence over local practice. [Hidden Collections]
The process will involve adopting a technology platform (or platforms) that will allow accurate descriptive information to be entered quickly, efficiently, and cost-effectively. The results of each project will be linked to and interoperable with those of all others to form a federated environment that can be built upon over time. Institutions must acknowledge local ownership of the data generated through the program and agree to its persistence. [Hidden Collections]
March 17, 2008
•
Categories:
General - distributed environments
, Metadata

I recently installed the Operator extension in my browser.
Operator leverages microformats and other semantic data that are already available on many web pages to provide new ways to interact with web services. [Operator :: Firefox Add-ons]
Interesting to see it in action on the JISC National eBooks Observatory page above. It recognizes address and contact data. Clicking on address displays an address, and offers to show it on Google and Yahoo maps.
I could also have used LinkedIn as an example where some structured data is also exposed using microformats. LinkedIn is one of the examples used by Yahoo in its much discussed announcement the other day about its support for various structured formats.
While there has been remarkable progress made toward understanding the semantics of web content, the benefits of a data web have not reached the mainstream consumer. Without a killer semantic web app for consumers, site owners have been reluctant to support standards like RDF, or even microformats. We believe that app can be web search. By supporting semantic web standards, Yahoo! Search and site owners can bring a far richer and more useful search experience to consumers. For example, by marking up its profile pages with microformats, LinkedIn can allow Yahoo! Search and others to understand the semantic content and the relationships of the many components of its site. With a richer understanding of LinkedIn's structured data included in our index, we will be able to present users with more compelling and useful search results for their site. The benefit to LinkedIn is, of course, increased traffic quality and quantity from sites like Yahoo! Search that utilize its structured data. [Yahoo! Search Blog: The Yahoo! Search Open Ecosystem]
The entry goes on to describe the support that Yahoo will be providing for microformats, metadata vocabularies and opensearch.
Google released its Social Graph API a little while ago.
It will be interesting to see what incentives the concentrating power of Google and Yahoo provide for the more widespread diffusion of structure in support of a web of data.
March 13, 2008
•
Categories:
Metadata
, OCLC
We have updated the audience level experimental service pages.
In this initiative we are using the pattern of holdings across different types of libraries (school, research, etc) to give a 'hint' about the level of interest of an item (juvenile, research/specialist, ...). You can read more about how we calculate the levels on the project page:
Recognizing that different types of libraries typically serve different populations, OCLC researchers considered whether library types could be related to audience levels. They decided to explore whether the pattern of holdings of materials in WorldCat might be leveraged to provide an audience-level indicator. [Audience Level [OCLC - Projects]]
We have used the audience level on internal projects where materials need to be filtered in a particular way. In Worldcat identities we show an average audience level for an author (Gordon Korman gets 0.17). In this service we roll things up to the work level, and we show a list of manifestations (editions, etc) for each work.
My colleagues constructed an experiment to compare the results we got with this approach with cataloguer assessments. The audience level 'hint' compared reasonably well with the human assignments. A paper on this work will be published in due course.

'Audience level' may not be quite the right name for this. Classics, for example, will get a lower level than you might expect if you just think about 'difficulty'. Thus Spake Zarathustra, for example, has a level of 0.38: because of its 'classic' status it is widely available through public libraries.
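To make the idea concrete, here is a rough sketch of how a holdings pattern could be turned into a single number. It is not the calculation used in the service: the library types, the weights in `TYPE_WEIGHTS` and the holdings counts are all assumptions chosen for illustration.

```python
# Hypothetical weights: lower = more general or juvenile, higher = more specialist.
TYPE_WEIGHTS = {"school": 0.1, "public": 0.3, "academic": 0.6, "research": 0.9}


def audience_level(holdings):
    """Weighted average of library-type weights, weighted by holdings counts."""
    total = sum(holdings.values())
    if total == 0:
        return None  # no holdings, no hint
    weighted = sum(TYPE_WEIGHTS[lib_type] * count for lib_type, count in holdings.items())
    return round(weighted / total, 2)


# Invented holdings counts by library type.
specialist = {"school": 0, "public": 5, "academic": 40, "research": 55}
classic = {"school": 20, "public": 300, "academic": 120, "research": 60}

print(audience_level(specialist))
print(audience_level(classic))
```

In a scheme like this, a 'classic' that is widely held in public libraries is pulled toward the lower end of the scale, whatever its difficulty.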
The experimental service pages have a nice slider feature to show different audience levels in a collection of Zoology books (which were used in the experiment). (And we link all titles through to worldcat.org.)
February 15, 2008
•
Categories:
GLAM
, Metadata
, OCLC
My colleague Günter Waibel writes about a new RLG Programs project looking at the exchange of metadata between museum systems. Go to the entry for more detail.
With the generous support of a $145,000 grant from the Andrew W. Mellon Foundation, RLG Programs will gather a select group of museum partners to accomplish the following:
- Creating a low-barrier / no-cost batch export capability for CDWA Lite XML out of the collections management system used by the participating museums (GallerySystems TMS)
- Modeling data exchange processes using the Open Archive Information Protocol for Metadata Harvesting (OAI-PMH) at the participating museums
- Creating an aggregation of museum content within OCLC Research for analysis
- Discussing the evidence about the relative utility of the aggregation with stakeholders from the museum, vendor and aggregator community
[hangingtogether.org]
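The harvesting step in the list above can be sketched in a few lines. This is a hedged illustration of the OAI-PMH `ListRecords` request and `resumptionToken` paging only, not the project's implementation: the museum.example.org base URL, the record identifiers and the sample response are invented, and `oai_dc` is used as the metadata prefix because the prefix a repository would use for CDWA Lite is locally defined.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix=None, resumption_token=None):
    """Build a ListRecords request; a resumptionToken replaces the other arguments."""
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def parse_page(xml_text):
    """Return (record identifiers, next resumption token or None) for one response."""
    root = ET.fromstring(xml_text)
    ids = [el.text for el in root.findall(".//oai:header/oai:identifier", OAI_NS)]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    token = token_el.text if token_el is not None and token_el.text else None
    return ids, token


# Invented, heavily abbreviated response from a hypothetical museum repository.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:museum.example.org:obj-1</identifier></header></record>
    <record><header><identifier>oai:museum.example.org:obj-2</identifier></header></record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

ids, token = parse_page(SAMPLE)
print(ids)
print(list_records_url("http://museum.example.org/oai", resumption_token=token))
```

A harvester would repeat the request with each returned token until a response arrives without one.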
February 11, 2008
•
Categories:
Metadata
, OCLC
OCLC distributed around 2 million printed catalog cards last year. They are still being used ....
January 30, 2008
•
Categories:
Libraries - distributed environments
, Metadata
, OCLC
I have written before about how registries provide 'intelligence' in the network. Scalable loose coupling between library services will benefit from good ways to discover those services.
The Worldcat Registry includes data for library services (resolver, catalog, virtual reference) which drives Worldcat Local and Worldcat.org. Worldcat.org's 'understanding' of the library network is captured in the Registry.
A while ago the OpenURL Resolver Registry and Gateway were incorporated into the Worldcat Registry. The Registry is openly available and the Gateway is systematically used by several other parties including Zotero:
Zotero 1.0.2 also includes several new site translators, including translators for social media sites Flickr and YouTube. Zotero’s default Open URL resolver has also been changed to the OCLC OpenURL Resolver Gateway, which will allow many Zotero users to automatically find items from their collections in their campus library through the Locate button without editing their preferences. [Zotero: The Next-Generation Research Tool » Blog Archive » Our Most Stylish Release Yet: Zotero 1.0.2]
My colleague Joanna White tells me that use of the gateway is climbing. In November 2007, over 250,000 requests were processed. There are currently about 1,600 resolvers registered.
Here are some details about use and update of the OpenURL Resolver registry and gateway.
[OCLC OpenURL Resolver Registry [OCLC]]
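The role a registry plays here can be shown with a small sketch: look up the resolver for a user's institution, then build an OpenURL (Z39.88-2004, key/encoded-value form) that points at it. The `RESOLVERS` table, the resolver address and the citation values are invented for illustration, and this is not the Gateway's actual interface.

```python
from urllib.parse import urlencode

# Hypothetical registry: institution name -> link resolver base URL.
RESOLVERS = {
    "Example University": "http://resolver.example.edu/openurl",
}


def openurl_for_article(institution, issn, volume, spage, atitle):
    """Build an OpenURL 1.0 link to the institution's registered resolver."""
    base = RESOLVERS[institution]
    params = {
        "url_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
        "rft.genre": "article",
        "rft.issn": issn,
        "rft.volume": volume,
        "rft.spage": spage,
        "rft.atitle": atitle,
    }
    return f"{base}?{urlencode(params)}"


# Placeholder citation values.
link = openurl_for_article("Example University", "1234-5679", "8", "111", "A sample article")
print(link)
```

The point of the registry is that the calling application needs to know only the institution, not the address of each resolver.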
January 30, 2008
•
Categories:
Knowledge organization and representation
, Learning and research - systems and technologies
, Metadata
Phil Barker looks at FRBR in the context of learning object metadata.
The proposed object model borrows from the scholarly works application profile (SWAP) application model, which in turn is based on the Functional Requirements for Bibliographic Records (FRBR) entity model. The rationale behind this was that, firstly, scholarly works may be considered learning materials in higher education, so any model for learning materials would have to describe scholarly works, secondly, the FRBR model is well-tested and seems generic enough to describe many other types of resource (e.g. musical scores and performances, images, online resources). [Learning Materials Application Profile Domain List]
Via Pete Johnston.
January 18, 2008
•
Categories:
Knowledge organization and representation
, Metadata
Our bibliographic systems are like an archipelago: scattered islands which need to be visited individually. In this context I was interested to read Bob Wolven:
Now, however, more radical change seems both possible and responsible in light of developments taking place outside library cataloging. The balkanized system that has characterized information retrieval to date—in which researchers use one tool to find books and journals, another to find journal articles, a third to track poems, and so forth—has allowed library cataloging practices to be evaluated in isolation. Rules and the data they generate are seen as more or less valuable in relation to their impact on the library OPAC; in turn, OPACs are seen as more or less effective for their ability to use and present cataloging data. Now, this hegemony is being challenged: metasearch tools bridge formerly separate search environments; search engines draw on multiple sources to present alternative interfaces to both popular and scholarly resources; full-text aggregations, Google Book Search, and Microsoft's Live Academic Search extend the reach of discovery into the content itself. [In Search of a New Model - 1/15/2008 - netConnect]
I sometimes puzzle over the emphasis on next generation catalogs. Of course, it is easy to understand, given the local control. But it is only one island, an important one, but one destination among several. What about all the other databases?
What questions about the value of the controlled data in our catalog records (names, subjects, etc) will we ask as it begins to be merged more with data created in different regimes? We can already see this happening in the environments that Bob mentions, and in new integrated discovery environments like Primo, Encore and Worldcat Local.
January 18, 2008
•
Categories:
Knowledge organization and representation
, Metadata
, Standards
Bob Wolven has an interesting piece in netConnect about cataloging. He mentions our approach to standards, among other things.
Perhaps worse, the kind of consensus we have demanded drives us toward complexity. Our libraries acquire a vast and wildly diverse set of resources, yet we insist on treating all of them by the same rules. We prize consistency over practicality. If some works, in some contexts, benefit from a precise transcription of statements of responsibility, or from detailed recording of pagination and illustrations, we apply those same principles to all. We apply the same level of subject analysis to the 20-page pamphlet and the 1000-page treatise. We do this not out of obduracy or short-sightedness, but because it's the only way we have found to build trust among what is, after all, a very large and diverse group. [In Search of a New Model - 1/15/2008 - netConnect]
We do sometimes treat standards activity as if the desired outcome were socially acceptable consensus. This has meant that we may allow optionality or discretion in how data is represented, or, for example, we may suggest that data go into notes. This may have been more acceptable when actual data exchange was not very frequent, or data was created for human display. However, as more of our services are supported by communicating applications, and as the volume and variety of data transfers increase, this approach is less useful. Think of how we want to process data for faceted display, or for clustering into works, or think about using data to manage flows into mass digitization or offsite storage where we want to track volumes through workflows. We want to make sure that the full intellectual effort that goes into description is available for re-use by applications.
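A small sketch may help show why consistently coded data matters to applications. It groups records into works using a normalized author/title key; the key, the normalization rule and the sample records are assumptions made for illustration, not a description of any production clustering algorithm.

```python
import re
from collections import defaultdict


def normalize(value):
    """Lowercase and strip punctuation so trivial variations do not split a work."""
    return re.sub(r"[^a-z0-9 ]", "", value.lower()).strip()


def work_key(record):
    """Normalized author/title pair used to group editions into a work."""
    return (normalize(record["author"]), normalize(record["title"]))


def cluster_into_works(records):
    """Map each work key to the ids of the records that share it."""
    works = defaultdict(list)
    for record in records:
        works[work_key(record)].append(record["id"])
    return dict(works)


# Invented records; the first two differ only in punctuation and case.
records = [
    {"id": 1, "author": "Nietzsche, Friedrich", "title": "Thus Spake Zarathustra"},
    {"id": 2, "author": "Nietzsche, Friedrich.", "title": "Thus spake Zarathustra."},
    {"id": 3, "author": "Example, Author", "title": "A Sample Title"},
]
works = cluster_into_works(records)
print(works)
```

An approach like this can only use what is in predictable fields. Anything recorded at a cataloguer's discretion in a free-text note is invisible to it.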
January 10, 2008
•
Categories:
Knowledge organization and representation
, Libraries - organization and services
, Metadata
The final report of the LC Working Group on the Future of Bibliographic Control has been submitted and is now available on the LC website.
On the Record: Report of The Library of Congress Working Group on the Future of Bibliographic Control (January 9, 2008) Read final report [PDF, 442 KB]
[News and Press Releases - Working Group on the Future of Bibliographic Control (Library of Congress)]
Note: I am a member of the Group.
January 09, 2008
•
Categories:
Metadata
, Standards
An interesting announcement about some metadata standards from the Office of the Director of National Intelligence (ODNI) which use Dublin Core. Dale Meyerrose, mentioned in the quote, is associate director of national intelligence and chief information officer at ODNI.
These standards are a part of a broader attempt by Meyerrose and Defense Department Chief Information Officer John Grimes to make information more usable across the intelligence community. Meyerrose, who spoke at a lunch sponsored by the Industry Advisory Council last September, said he has signed memos since June 2007 that focus on creating a data dictionary that deals with security level labeling and one that creates a central repository for intelligence information. [ODNI issues new metadata standards]
The goal here is sharing of information within the intelligence community.
“My goal is to improve collaboration,” Meyerrose said at the event. “Another aspect is how well we collaborate with others charged with the business of intelligence.” [ODNI issues new metadata standards]
Via Stu Weibel.
November 30, 2007
•
Categories:
Knowledge organization and representation
, Metadata
The draft final report of the Working Group on the Future of Bibliographic Control has been made available [PDF] for public comment.
Responses are being accepted by the group until December 15, 2007.
Different communities of bibliographic practice have grown up around different resource types: library collections of books and journals, archives, journal articles, and museum objects and images. As these resources and others become increasingly accessible through the Web, separation of the communities of practice that manage them is no longer desirable, sustainable, or functional. Bibliographic control is increasingly a matter of managing relationships—among works, names, concepts, and object descriptions—across communities. Consistency of description within any single environment, such as the library catalog, is becoming less significant than the ability to make connections between environments: Amazon to WorldCat to Google to PubMed to Wikipedia, with library holdings serving as but one node in this web of connectivity. In today's environment, bibliographic control cannot continue to be seen as limited to library catalogs. [Report on the Future of Bibliographic Control PDF]
November 28, 2007
•
Categories:
GLAM
, Metadata
, OCLC
Reading the report [PDF] of the RLG Programs metadata practice survey, this quote from a respondent jumped out at me:
We use a variety of tools to produce a variety of records. Mature and established systems (such as our ILS) are generally effective. Tools for creation of XML are not as efficient - particularly EAD creation. Creation of EAD and ingest into our XML database is still a very manual process. Our tools are also generally not well integrated. Even when describing the same resource we use the ILS for creating MARC, home grown tools for creating EAD, and perhaps a third tool for creating item level descriptive metadata. [RLG Programs Descriptive Metadata Practices Survey Results - PDF]
It is pretty indicative of general issues to emerge. Metadata creation practices are fragmented across different materials workflows with variable systems support.
... RLG Programs surveyed 18 Partner institutions in July and August 2007 to obtain a baseline understanding of their current descriptive metadata practices. Although we saw some expected variations in practice across libraries, archives and museums, we were struck by the high levels of customization and local tool development, the limited extent to which tools and practices are, or can be, shared (both within and across institutions), the lack of confidence institutions have in the effectiveness of their tools, and the disconnect between their interest in creating metadata to serve their primary audiences and the inability to serve that audience within the most commonly used discovery systems (such as Google, Yahoo, etc.). [RLG Programs Descriptive Metadata Practices Survey Results - PDF]
I was also interested to note that over half the institutions surveyed build and maintain one or more local thesauri.
For more detail see Karen Smith-Yoshimura.
November 26, 2007
•
Categories:
Metadata
I was interested to read the following in a report just released by the Research Information Network in the UK about the completeness of catalogue coverage of research collections:
The study shows significant progress: librarians estimate that 50% of material in their research collections is now covered by online catalogues, compared with 31% five years ago. But much more remains to be done before all the significant material held in UK libraries that may be of value to researchers can be readily traced through online catalogues. Librarians are keen to pursue this work, and we recommend it should remain a high priority for them. [Uncovering Hidden Resources:Progress in extending the coverage of online catalogues - PDF]
This seemed high to me on a first read, before I realised that the survey was restricted to research collections. How do they define a research collection?
Respondents were asked how many special collections for scholarly research their library holds in total. It should be noted that no attempt was made to define the term special collections for scholarly research, so the responses received were based on the interpretation of the responding librarians. In some cases, respondents considered their library’s entire stock to be a special collection, and hence the number of collections held by the responding libraries was found to vary considerably, ranging from 1 to 5000. It should be noted that many of the figures supplied were approximate and that no indication of the size of collections may be inferred from these data. The most common numbers (mode) of collections held were one or three, and the median number of collections was found to be six. [Uncovering Hidden Resources:Progress in extending the coverage of online catalogues - PDF]
The survey was based on responses from 96 libraries. Over half of these were academic, and the balance was divided between public (26%) and specialist (21%). This represented a response rate of 26% for academic libraries, and 12% for public.
The results are not broken down by type of library, or by size of library. This might have been helpful.
November 20, 2007
•
Categories:
Featured
, Metadata
, OCLC
Updated: 11/21/07
I have spoken about library logistics before.
Logistics is about moving information, materials and services through a network cost-effectively. Resource sharing is supported by a library logistics apparatus. The emerging e-resource discovery to delivery chain, tied together with resolution services, is a logistics challenge. Many of the e-resource management issues are like supply-chain management issues. [Lorcan Dempsey's weblog: Library logistics]
It seems to me that recent developments highlight the logistics theme. Think of the systemwide inventory management questions that are beginning to arise in relation to off site storage and mass digitization. Or the issues that arise when we connect multiple discovery environments to backend library - or other - fulfillment options.
I like the UPS slogan about synchronizing commerce. It reminds us of the central role of data in logistics and of the need for integrity of data along supply chains or other processes. I was reminded of this while reading Michael Cairns' interesting post about Booknet Canada and the Global Data Synchronization Network.
Industries other than publishing also battle data reliability and timeliness and, over the years led by umbrella groups such as UCC and EAN (now combined into one organization named GS1), they have developed programs to embrace supply chain efficiency and its' co-relation data integrity. Data Synchronisation (GDSN) is such a program which I have noted a few times in the past (Post). The objective of the GDSN is to ensure that all trading partners are working with the same set of product details that are simultaneously synchronized at a network level and in transaction details such as purchase orders and shipping details. The benefits of synchronised data can extend from 'simple' efficiency improvements in the ordering and receipt process to higher effectiveness in marketing and promotions programs. [PersonaNonData: Five Questions on Global Data Synchronization]
Michael interviews Michael Tamblyn, President of Booknet Canada which is offering services based on GDSN. Among the advantages he suggests are:
Then there is the more forward-looking work: collaborative sales data mining for independents, backlist optimization and forecasting research, industry cost analysis on returns, digital publishing trends, our annual Technology Forum. And on it goes. [PersonaNonData: Five Questions on Global Data Synchronization]
There is a temptation in library discussions to focus on discovery and end-user issues when thinking of bibliographic data. However, bibliographic data is increasingly important to efficient library operations more generally. Think of the blurring of circulation and resource sharing in consortial arrangements, the issues of managing and tracking print collections in the context of the mass digitization and off site storage initiatives, connections between external discovery environments and library systems, resolution and the management of knowledge bases, and so on. Systemwide data synchronization and data integrity issues are becoming more central. Increasingly we recognize that efficient management of resources imposes data needs.
Some examples: What books have been digitized by Google, etc? Is an available-for-use digitized copy of this book available more easily than getting it in 3 days on ILL? How would last copies be registered and curated within a systemwide framework (Ohio, for example, or the UK, or ...)? Can I let a user make an optimum request based on price/speed of delivery balance? Can I do recommender systems across aggregate circulation data, or aggregate resolution data? Can I develop core collection recommendations based on aggregate holdings data? Can I make selection decisions based on a view of what my regional partners are selecting? Can I begin to do some modelling of collections based on the aggregate holdings of off site storage facilities? Can I receive collection development recommendations based on my users' use of Google Scholar? Can I be assured that my users will be linked correctly - and as seamlessly as possible - into my collections from Google, or Worldcat.org, or a growing range of other potential discovery venues? Can I make collection development decisions based on aggregate Counter data?
There is an earlier discussion of some similar data issues by my colleagues and me in a Library Journal article: Making data work harder.
November 08, 2007
•
Categories:
Learning and research - distributed environments
, Metadata
I think that reading lists and citation managers are interesting sites of connection between environments. They are potentially 'portables', travelling portals onto resources. I was interested to see the following discussion of reading lists on the Intute blog:
One solution is to provide links to key quality Internet resources within your VLE (Virtual Learning Environment) or university webpages. However, maintaining these is a challenge for lecturers, because of the time and effort required to regularly link check and update the web pages (2 ). Intute has developed the MyIntute service to make it easy to create and maintain personalised lists of resources that have already been carefully evaluated. It’s then simple to export and publish these. An optional dynamic link between Intute and your VLE or webpage (using JavaScript) automatically updates the links when they are checked by Intute staff. In addition our RSS feeds can provide a regular update of the latest resources added to Intute. Lecturers can also encourage students to keep their own personalised lists within MyIntute, knowing that they can rely on their quality. [Intute Blog » Blog Archive » Integrate Intute content]
They point to an example at the University of Leeds Library where an Intute search box and MyIntute lists have been integrated into their subject pages. The 'selected resources' are from MyIntute lists.

Intute is a national UK initiative, distributed across many universities, which supports effective use of web resources through a directory and other services. They have recently launched their blog.
Intute is run by a national network of academic subject, Internet and information specialists from UK universities, who will use this blog to post news, views and reviews about Intute services, but also about the use of Internet resources to support higher education and research. [Intute Blog]
Via Emma Place.
October 31, 2007
•
Categories:
Libraries - distributed environments
, Metadata
, OCLC
Our Openly colleagues have added a new service, xISSN, alongside xISBN.
The xISSN Web service supplies ISSNs and other information associated with serial publications represented in WorldCat. Submit an ISSN to this service, and it returns a list of related ISSNs and selected metadata. The service is based on WorldCat, the world's largest network of library content and services. The current xISSN database covers 575,573 ISSNs. [WorldCat Web service: xISSN [OCLC - WorldCat Affiliate tools]: Home]
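Since the service is keyed on a submitted ISSN, a client would probably want to check that identifier before sending it. The sketch below does not call xISSN, and the description quoted above does not say how requests are formed. It only applies the standard ISSN check-digit rule (weights 8 down to 2 on the first seven digits, modulo 11, with X standing for 10). The function name `normalize_issn` is hypothetical and used for illustration only.

```python
import re


def normalize_issn(raw: str) -> str:
    """Return the ISSN as NNNN-NNNC, or raise ValueError if it is invalid.

    Hypothetical helper: it validates the identifier locally and makes
    no request to any web service.
    """
    compact = raw.strip().upper().replace("-", "")
    # An ISSN is seven digits followed by a check character (digit or X).
    if not re.fullmatch(r"\d{7}[\dX]", compact):
        raise ValueError(f"not an ISSN: {raw!r}")

    # Weight the first seven digits 8, 7, 6, 5, 4, 3, 2 and sum them.
    total = sum(int(d) * w for d, w in zip(compact[:7], range(8, 1, -1)))
    check = (11 - total % 11) % 11
    expected = "X" if check == 10 else str(check)

    if compact[7] != expected:
        raise ValueError(f"bad check digit in {raw!r}")
    return f"{compact[:4]}-{compact[4:]}"


if __name__ == "__main__":
    print(normalize_issn("03785955"))
```

Normalizing to a single hyphenated form before lookup also means that differently typed versions of the same ISSN compare as equal.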
October 23, 2007
•
Categories:
Metadata
, OCLC
Jon Udell interviews my colleague Stu Weibel about Dublin Core, Worldcat, and related issues.
On this episode of Interviews with Innovators, host Jon Udell invites Stuart Weibel to reflect on his leading role in the Dublin Core Metadata Initiative. They also discuss how databases like the Online Computer Library Center's WorldCat - which consolidates bibliographic data from over 50,000 participating libraries - can enrich our experience of using and contributing to the web. [IT Conversations: Stuart Weibel]
Jon Udell has a nice blog entry about the interview, raising some interesting issues about the use of bibliographic data and services in a web environment.
October 16, 2007
•
Categories:
Digital asset management
, Libraries - systems and technologies
, Libraries - organization and services
, Marketing
, Metadata
, Research, learning and scholarly communication
, The cultural and scholarly record
, User experience
I find it convenient to think about current library systems activities in terms of support for three materials workflows: bought/print materials, licensed/electronic materials, and digital/digitized materials. This is being pragmatic rather than pure, and is open to challenge on many grounds. I have discussed these at more length here, and suggested some ways in which they are developing. Development is in two directions: each of the areas continues to develop itself, while at the same time there is a growing desire to find better ways of working across them (e.g. at the discovery layer, or in terms of a more unified approach to metadata creation/management).
Now, we have an agreed and well-understood set of processes around the first category. These are encapsulated in the integrated library system, and still quite strongly influence library organization. These include things like selection, acquisition, cataloging, circulation, catalog, and so on.
We have a less well agreed set of processes around the second area, and an emerging apparatus of systems support. This includes resolvers, ERM systems, A to Z lists, metasearch, and so on. A level of agreement is apparent in that substitutable systems are now available to support this activity. However, differences in organizational structure to support the area and low takeup of ERM systems suggest that we are in early days. One place where there is likely to be further evolution relates to the creation, management and sharing of the data used to drive these systems.
And we have a much less well agreed set of processes around the third area. Libraries are exploring repositories for digitized collections, they are creating institutional repositories, and building workflows for content preparation and ingest, metadata creation, and so on. In fact, there is no agreed level of service in this area: you do not naturally expect to find particular services here in the way, for example, that you expect to find a circulation system. Of course, this lack of agreement makes this a potentially expensive area. There is a lot of figuring out what to do, and routine off-the-shelf tools or services may not necessarily exist across the range of what you want to do.
This is an overly complex systems landscape, and it will have to be rationalized in coming years so that libraries can spend more time putting their systems to work in support of their users and less time actually getting their systems to work together at all.
Anyway, this is by way of prelude to an observation about repositories. A couple of repository launches have come over my horizon in recent weeks.
The first is the Digital Conservancy at the University of Minnesota, which I mentioned the other day. This aims to provide services in relation to two classes of material: faculty research outputs and university administrative materials that traditionally would have gone to the University Archives. As I suggest in my post, this makes a lot of sense: the repository aims to support the full range of institutionally produced intellectual outputs.
The second was the Open University's Open Research Online, "a repository of our research publications and other research outputs." In this case, the service aims to provide support for all the research outputs of OU academics. So, what you will find are deposited open access materials. However, you will also find citations to books, journal articles, and so on, which are not actually available in the repository: you may be referred to a publisher site. The repository aims to provide a full record of research activity, not only the open access materials.
What we have here, then, are well-worked-through services which offer overlapping but different views onto their universities' intellectual outputs. This is not a major issue as universities work towards a view of what should be offered and what their constituencies value.
However, in the longer term, lack of agreement about services and supporting processes may be a barrier, on the management side where different systems support is needed, or on the user side where different services from different universities may lead to confusion, reducing the gravitational pull that familiarity supports.
Aside: Of course, in the longer run also, there are interesting questions about the relationship between these institutional services and network level services but that is a discussion for another day.
October 04, 2007
•
Categories:
Metadata
, Standards
One area where growing interest in identifiers is very clear is that of people, particularly in their role as authors or creators. In this context, the Names Project in the UK is interesting:
The project is going to scope the requirements of UK institutional and subject repositories for a service that will reliably and uniquely identify names of individuals and institutions. It will then go on to develop a prototype service which will test the various processes involved. This will include determining the data format, setting up an appropriate database, mapping data from different sources, populating the database with records and testing the use of the data. This will provide important information about the future usefulness of a name authority service for institutional and subject-based repositories, and other applications beyond the repository sector. [The Names Project]
The website does not talk about how any ensuing service might be sustained.
The project has produced a useful Landscape report [pdf], documenting relevant standards and projects, including Worldcat Identities and the VIAF project.
The benefits of using a consistent name are clear from a discovery point of view. So it is interesting that many people are inconsistent in how they identify themselves on their works. Search engines have probably made people more conscious of the distinctiveness - or otherwise - of their names. The additional step of unique identification would facilitate various services.
October 04, 2007
•
Categories:
Libraries - organization and services
, Metadata
I am attending events in libraries in the UK this week. I have already been at Oxford and Cambridge Universities, and head to the Open University tomorrow.
The former two have collections of world significance. The third is figuring out how to better serve a widely dispersed population over the network.
I tend to think of four facets of the library: place, collections, expertise and service. In a pre-network age these are vertically integrated around the collections. Place exists to hold the collections. Expertise is devoted to organizing and interpreting the collection for local needs. And services tend to be around acquisition and delivery of the collections.
In a network age, these come apart, and take on new directions. Library space is being reinvented to serve learning and social behaviors. Library collections are diversifying, including not only purchase and licensing of published materials but also the outputs of institutional research and learning, selectively harvested web pages, and other materials. Library expertise is being applied to all aspects of the creation, transmission and use of knowledge to support user productivity. And there is a major new focus on developing network services that reach out into the research and learning behaviors of library users.
In this context, I was interested to see the services listed on the new Open University library web pages. They include metadata services: The Open University has been identifying its metadata needs, to help increase interoperability, retrieval and reuse of its assets. These needs have been considered from the view point of its systems and the requirements of external partners. The University is developing an En |