I thought I would post some numbers here which were prepared by my colleague Brian Lavoie for another purpose. The question was: how many of the books in US libraries are in English?
First of all, what is a book? Deciding what counts as a book involves some choices (are theses in or out, for example?). This analysis uses the definition of 'print books' given in the Google 5 analysis published in D-Lib Magazine a while back [1].
a. All of WorldCat (April 2009): 135.3 million records. Cataloged as "eng": 46 percent (so 54 percent non-English).
b. Print books only (April 2009): 91.2 million records. Cataloged as "eng": 40 percent (so 60 percent non-English).
c. Print books in US libraries (January 2009): 42.5 million records. Cataloged as "eng": 57 percent (so 43 percent non-English).
d. Print books representing combined collections of three academic research libraries participating in GBS (April 2009): 7.2 million records. Cataloged as "eng": 54 percent (so 46 percent non-English).
Note - c is calculated on a slightly earlier version of the database as we had already pulled out US library holdings. The data in d is being looked at for another purpose: hence the slightly arbitrary selection of 3 libraries.
Note - these numbers are for records in the database, which represent 'manifestations' in FRBR terms. If one were to count holdings or actual copies the numbers would be different. The proportion of 'eng' would go up as English titles will be more widely held and in greater numbers of copies.
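For readers who want record counts rather than proportions, here is a minimal sketch of the arithmetic. The totals and percentages are copied from the list above; the function and variable names are my own, and because the published percentages are rounded, the resulting counts are only approximate.

```python
# Record totals (in millions) and the share cataloged as "eng",
# copied from segments a-d above. Names are illustrative only.
SEGMENTS = {
    "a. All of WorldCat": (135.3, 46),
    "b. Print books only": (91.2, 40),
    "c. Print books in US libraries": (42.5, 57),
    "d. Print books, three GBS libraries": (7.2, 54),
}


def split_by_language(total_millions, percent_english):
    """Return (English, non-English) record counts in millions."""
    english = total_millions * percent_english / 100
    return english, total_millions - english


for label, (total, pct) in SEGMENTS.items():
    english, other = split_by_language(total, pct)
    # Percentages are rounded in the source, so treat these as estimates.
    print(f"{label}: ~{english:.1f}M eng, ~{other:.1f}M non-eng")
```

On these figures, segment c works out to roughly 24 million English-language records and 18 million in other languages. As noted above, these are counts of records, not of holdings or copies.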
[1] Here is how the definition of a 'print book' was decided upon and operationalised for the Google 5 analysis. "Although there is no unambiguous bibliographic definition of a book, libraries have often used monographic language materials as a proxy for books, and this practice is adopted for this study. More specifically, in the context of a MARC21 record, a book is defined as a language-based monograph, identified by the codes "a" and "m" in bytes 6 and 7 of the leader, respectively. For the purposes of this study, theses/dissertations and government documents are excluded from the analysis, since these materials are usually acquired and managed as separate segments of the library collection. Records describing books in print format were identified by eliminating all non-print formats, such as digital, microform, Braille, and so on."
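The leader test in the quoted definition is simple enough to sketch in code. This is an illustration only, not the code used in the Google 5 analysis: the function name and the sample leader strings are hypothetical, and it covers only the "a"/"m" check, not the further exclusion of theses, government documents, and non-print formats.

```python
def is_language_monograph(leader: str) -> bool:
    """Check the MARC21 leader test from the quoted definition.

    Position 06 (type of record) must be "a" (language material) and
    position 07 (bibliographic level) must be "m" (monograph).
    """
    return len(leader) >= 8 and leader[6] == "a" and leader[7] == "m"


# Hypothetical 24-character leaders, made up for illustration.
book_leader = "00714cam a2200205 a 4500"       # "a" + "m": counts as a book
serial_leader = "00714cas a2200205 a 4500"     # "a" + "s": a serial
recording_leader = "00714cjm a2200205 a 4500"  # "j" + "m": a music recording

for leader in (book_leader, serial_leader, recording_leader):
    print(leader[6:8], is_language_monograph(leader))
```

A fuller filter would also need to inspect other parts of the record to drop theses, government documents, and non-print formats, which this sketch does not attempt.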
Layar created a ripple of interest a while ago. It is yet to be released. It is an application for Android-based phones which will allow data from various partner resources to be 'layered' over the view through a camera phone. Partners discussed include banks (for ATMs), realtors, and a social network site with data about venues. They describe it as an augmented reality application: objects viewed through the camera may be augmented by data about those objects.
Layar is derived from location based services and works on mobile phones that include a camera, GPS and a compass. Layar is first available for handsets with the Android operating system (the G1 and HTC Magic). It works as follows: Starting up the Layar application automatically activates the camera. The embedded GPS automatically knows the location of the phone and the compass determines in which direction the phone is facing. Each partner provides a set of location coordinates with relevant information which forms a digital layer. By tapping the side of the screen the user easily switches between layers. This makes Layar a new type of browser which combines digital and reality, which offers an augmented view of the world. [Sprxmobile - Layar]

[image: layar.eu via Tito Sierra M-Libraries 09 ppt]
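The mechanism described in the Sprxmobile passage (GPS gives the phone's position, the compass gives its heading, and a layer supplies coordinates) can be sketched roughly as follows. This is my own illustration, not Layar's implementation: the function names, the 60-degree field of view, and the sample layer are all assumptions. It uses the standard initial-bearing formula to decide which points in a layer fall within the camera's view.

```python
import math


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2,
    in degrees clockwise from north."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360


def in_view(heading, bearing, field_of_view=60.0):
    """True if the bearing lies within the camera's field of view,
    centred on the compass heading (field_of_view is an assumed value)."""
    diff = (bearing - heading + 180) % 360 - 180  # signed difference, -180..180
    return abs(diff) <= field_of_view / 2


def visible_points(phone_lat, phone_lon, heading, layer):
    """Names of the layer's points that the camera is currently facing."""
    return [name for name, lat, lon in layer
            if in_view(heading, bearing_deg(phone_lat, phone_lon, lat, lon))]


# A hypothetical partner layer: (name, latitude, longitude).
atm_layer = [("ATM to the north", 1.0, 0.0), ("ATM to the east", 0.0, 1.0)]
print(visible_points(0.0, 0.0, heading=0.0, layer=atm_layer))
```

Switching layers, in this picture, just means passing a different partner's list of coordinates to the same function. A real application would also need distance filtering and screen placement, which are left out here.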
Tito Sierra referenced Layar in his presentation about the WolfWalk project at the Second M-Libraries Conference in Vancouver last week. WolfWalk is a pilot project at NCSU, as I mentioned the other day. It is working on an iPod application which aims to create what Tito called a 'geomobile collection'. Here is what the project pages say:
A pilot project to create a mobile application that enables users to explore NC State campus history using a location-aware map-based interface. The application supports a map view (using Google Maps) with geotagged placemarks for approximately 60 major sites of interest on the NCSU campus, and a browse view for quickly locating a known site by name. Each site has several historical images associated with it that are sourced from NCSU Special Collections Research Center digital archives. [WolfWalk]


Tito discussed how they thought this would be of interest to alumni and was careful to describe it as a modest proof-of-concept. I thought it succeeded very well in demonstrating his contention that the challenge is not just to provide small-screen versions of digital collections but to leverage the capabilities of new mobile technologies to provide new ways of experiencing those collections. In this case, the collections augment the experience of the buildings on campus by providing historic context at the point of interaction; at the same time, the app provides a map-based approach to digital collections.
The screencast and PowerPoint presentation are well worth a look. The WolfWalk pictures above are screenshots of the screencast.
Mlib09
I traveled home from the 2nd M-Libraries Conference at UBC, Vancouver, yesterday. I was interested to come across several relevant news stories in the reading materials I had bought en route: The Globe and Mail, The Economist (last week's, as it turns out), and The Financial Times. This underlined the topicality of the conference themes.
The iPhone was prominent at the conference, in discussion but also in practice, as phones were slipped in and out of pockets and bags throughout. In an interesting presentation about the WolfWalk project, Tito Sierra of NCSU opened with some general remarks about geo-location and touch screens as distinctive capacities supporting new applications. He also reminded people of the importance of the Apple App Store in reducing transaction costs for users: search and acquisition of apps is now straightforward. What Apple has done is to create a network of developers around its successful platform. The App Store is key to this as it allows app developers to find users, and users to find apps, and in the process the value of the iPhone/iTouch is increased.
This point was reinforced in a Financial Times story about Apple's success in countering the effects of the current downturn. John Gapper quotes work by Hagel and Seely Brown of Deloitte which shows that lower costs of entry brought about by regulatory and technical trends are creating stronger competitive challenges for companies. Apple's ability to resist this trend depends on the way in which it has created a platform around which a network of partners has built thousands of apps. So, for the Palm Pre to be successful, for example, it has to compete with the iPhone not only on price and features, but also on its ability to become a platform for app developers. Much of the value of the iPhone now derives from the apps which are available to its users.
I was also struck by the number of Mlib09 delegates who were using netbooks. I suppose you would expect this at such a conference, but this did not make it any less striking. The Economist had an article on netbooks, focusing on their challenge to the computer and software industry generally. They report Gartner figures that 21M netbooks will ship this year, twice as many as last year, accounting for more than 15% of the laptop market. By the end of 2008, netbook pioneer Asus had sold nearly 5M Eees. I was interested to read that Microsoft was heavily discounting Windows XP to netbook providers to counter the Linux challenge. Acer and other firms plan to use Android.
One of the hits of the conference was the discussion by Kate Robinson of the use of QR Codes in the catalog at the University of Bath (blogged here earlier this year). It prompted discussion of the variety of ways in which people and materials could be tied into the network.
The Globe and Mail had several stories about capturing data from codes.
- Databars. A discussion of the use of Databars, smaller than barcodes, in retail and supply-chain operations.
- Samplesaint: a story about how this company, which creates digital media for cell phones, now distributes discount coupons for redemption by on-screen scanning at the checkout. Coupons can be received in various ways, including in response to an on-the-spot request by texting a number found on the relevant shelf.
- There is also a general discussion of the use of cell phones as payment devices.
Interestingly, these were opposite an advert for IBM (featuring a barcode image) which promoted its ability to make supply chains smarter and more efficient.
The presentations from the 2009 RLG Partnership Annual Symposium: Hearing voices: connecting with users, enhancing services, are now available.
Here is how the event was described:
User studies have become a critical component in developing and improving services in our institutions. However, investigations into the needs of users and potential users are expensive and time consuming. This event oriented attendees to the importance of user studies, highlighted findings from recent projects of interest and utility to the RLG Partnership, and laid the groundwork for future collaboration. [2009 Symposium agenda]
I was interested to read Arnold Arcolio's account of user selection on several WorldCat Local implementations. Arnold works for the User Experience Group at OCLC.
See the notes about usability and the symposium from me and from Jim at HangingTogether.
Terry Eagleton said somewhere that Raymond Williams was a librarian's nightmare, meaning presumably that his work crossed academic boundaries and resisted easy classification. Let's have a look using the Classify prototype.
The prototype provides access to more than 36 million WorldCat records that contain Dewey Decimal Classification (DDC) numbers, Library of Congress Classification (LCC) numbers, or National Library of Medicine (NLM) Classification numbers.
The records are grouped using the OCLC FRBR Work-Set algorithm resulting in a work-level summary of the class numbers assigned a title. You can retrieve a classification summary by ISBN, ISSN, UPC, OCLC number, or author/title. [About Classify]
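To make the idea of a work-level summary concrete, here is a minimal sketch of the counting step. It is not the OCLC FRBR Work-Set algorithm, which does the hard part of deciding which records belong to the same work; it simply assumes each record already carries a work identifier. The function name, the identifiers, and the class numbers are all made up for illustration.

```python
from collections import Counter, defaultdict


def summarize_by_work(records):
    """Count how often each class number is assigned within each work.

    `records` is an iterable of (work_id, class_number) pairs, one per
    bibliographic record. Returns {work_id: [(class_number, count), ...]}
    with the most frequently assigned numbers first.
    """
    summary = defaultdict(Counter)
    for work_id, class_number in records:
        summary[work_id][class_number] += 1
    return {work: counts.most_common() for work, counts in summary.items()}


# Invented records: several editions of one work classed in different places.
sample_records = [
    ("work-1", "820.9"),
    ("work-1", "820.9"),
    ("work-1", "942"),
    ("work-2", "301"),
]

for work, classes in summarize_by_work(sample_records).items():
    print(work, classes)
```

A spread of counts across several class numbers for a single work is the kind of pattern one would look for in a title that crosses disciplinary boundaries.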
OK, so here is what happens with The Country and the City, perhaps the work of his that is most likely still to be read. It leans to literature, history, sociology ...

Here is Keywords ...

And here is an early work, The Long Revolution ...

There is quite a bit that could be said here. It would be interesting to explore what the pattern of classification might reveal about intellectual trends or cross-disciplinary work. I will limit myself here to saying that it looks as if Eagleton may be right some of the time ;-)