Today Google and CIC announce an agreement to digitize ten million volumes across the CIC libraries. Google has been adding new partners since the first announcement was made about the Google 5. Some folks have wondered what rationale has governed selection of partner opportunities. We do not know, but they sure are moving fast! Here are some early thoughts.
The CIC announcement is interesting for several reasons:
- It is a shared effort across a major group of libraries with significant collections. There appears to be strong CIC institutional commitment. Of course, CIC has a history of collaboratively sourced activities, and this 'pooling' model makes increasing sense given the policy and service challenges that need to be addressed, both in this case and across a range of other issues that libraries face as they support changing research and learning behaviors in a reconfigured network environment. For some things, scale matters.
- The libraries have a shared approach to managing the digital copies, based on shared infrastructure at the University of Michigan, and to serving them up to their user communities. This is another example of collaborative sourcing.
- Google recently advertised for somebody to work on collection development and we seem to be seeing a stronger focus in this area. Collecting areas of importance within each library [pdf] have been identified for attention. Presumably, these decisions have also been influenced by the 'collective collection' of the full Google partnership.
This initiative in turn prompts some more general thoughts about access:
- One of the most valuable features of the Google initiative is that it digitizes book content, allowing fine-grained discovery over topics, people, places and so on. Of course this presents interesting questions about indexing, retrieval, ranking, and presentation but the advantage of having this access seems clear. It drives use and sales, and it supports enquiry. Without it, the book literature is less accessible than the web literature.
- However, as we are beginning to see on Google Book Search, we are really going beyond 'retrieval as we have known it' in significant ways. Google is mining its assembled resources - in Scholar, in web pages, in books - to create relationships between items and to identify people and places. So we are seeing related editions pulled together, items associated with reviews, items associated with items to which they refer, and so on. As the mass of material grows and as approaches are refined this service will get better. And it will get better in ways that are very difficult for other parties to emulate.
- Currently this material is made available within the Google destination site. Google is an advertising engine and its approach depends on aggregating attention for adverts. This approach may be difficult to deploy within a more 'data services' approach where others - especially the partners - have remixable access to content and services. However, the 'utility' value of this resource will be diminished if it is not made available in this way so that others can mobilize these resources within their own environments. Whether and how this gets done remains to be seen. (See the related discussion about the search API.)
- This type of access seems especially important for the partner libraries. In the early days of this activity there was some discussion of the types of services which would be built on top of the digitized books by the libraries. However, it is difficult, and maybe not very sensible, for the libraries to individually invest in some types of service development. An important factor here is that they cannot benefit from the network effects tha



Comments: 1
A couple of thoughts.
Firstly:
"Google currently links through to library services but this needs to get smoother"
The layout of a book's details page (e.g. http://books.google.com/books?id=0CR1AAAACAAJ&dq=dark+is+rising) is made up of a number of sections. If this were extended to allow the addition of gadgets from iGoogle (http://www.google.co.uk/ig?hl=en), then perhaps the integration of library information into this would be a way forward (thus integrating library services into Google rather than the other way round, which I guess would also help with the issue of advertising revenue when integrating Google services into 3rd party applications).
Secondly - Copyright:
What may also be interesting is how texts digitised by Google but not in the public domain might be licensed for use. Perhaps initiatives like the trial CLA digitisation licence in the UK (http://www.cla.co.uk/support/he/HE_TrialPhotocopyingandScanningLicence.pdf) might evolve to allow a specific HE institution to get access to the Google digitised texts?