Thinking about the G5

Here are some random thoughts about the Google library digitization discussions ....

  • If large amounts of the G5 library collections are digitized, indexed and searchable then we have an index to books in all library collections. This initiative potentially improves access to all library collections, provided we have good ways of moving from the Google results into those collections.
  • This in turn raises interesting questions about coverage moving forward. How well do G5 collections, or those parts of them that are digitized, cover the universe of library collections? What is the incremental impact of adding other collections? Related to this: what proportion of any particular library collection is represented in the G5 collection? Again, this underlines the growing interest in collection analysis.
  • Copyright. I have always thought that the distinction we make between 'bought' collections which we add to the collection (books, DVDs, CDs, ...) and 'licensed' collections to which we typically have remote access is a little more grey than it seems. At least for in-copyright materials: this is because one cannot do what one wants with the 'content' of the bought items whatever one does with the physical book or CD or whatever. One of the interesting things about this initiative is that it will turn some 'bought' resources into 'licensable' resources: new agreements may need to be made if the content of the materials is to be used in digital form, even for use within the home institution.
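
The coverage questions above are, at bottom, set comparisons, and a rough sketch may make them concrete. The Python below assumes each collection can be reduced to a set of work-level identifiers; the identifiers, the function names and the sample data are all hypothetical, and real collection analysis would first need to decide what counts as the "same" book.

```python
def coverage(collection, digitized):
    """Fraction of `collection` whose identifiers also appear in `digitized`."""
    if not collection:
        return 0.0
    return len(collection & digitized) / len(collection)


def incremental_gain(digitized, candidate):
    """Number of works `candidate` would add beyond what is already digitized."""
    return len(candidate - digitized)


# Hypothetical work identifiers, for illustration only.
g5 = {"w1", "w2", "w3", "w4"}
local = {"w2", "w3", "w5"}

share_covered = coverage(local, g5)       # proportion of the local collection in G5
new_works = incremental_gain(g5, local)   # works the local collection would add
print(share_covered, new_works)
```

Here two of the three local works are already in the G5 set, and digitizing the local collection would add one new work. The same two functions, run over real holdings data, would answer both the "what proportion" and the "incremental impact" questions.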

Comments: 1

Apr 18, 2005
Candy Schwartz

We had the good fortune to have Dale Flecker from Harvard come to Simmons last week to talk about the G5 project, and some of these same questions were raised. It's made me think about a lot of things, none of them earth-shaking, but all promising fun to come:


  • While apparently Google will not be deduplicating till after the fact, it's likely that post-G5 libraries wishing to scan the non-G5 portions of their collections may have to use deduplicating tools of their own, or perhaps this will come from the OPAC vendor or bibliographic utility community.

  • Libraries presumably will have to negotiate for use of those items which they possess in hard copy but which were part of somebody else's conversion.

  • If I have a 9th printing and the 8th printing has been scanned, will I bother with the 9th if the differences are not significant? What constitutes significant difference?

  • Will Google offer incentives to "complete" the digitizing of everything (similar to being rewarded for entering new records into OCLC)?

  • What's it going to be like doing keyword searching across a text collection of that size? There are some very exciting opportunities for squeezing the last ounce out of the MARC record and all associated item metadata, as well as really testing the efficacy of FRBRized records, clustering, guided navigation, and all of that other fun stuff.



  • Makes me glad I am not going to retire for a while yet.

    Candy Schwartz

    GSLIS, Simmons College