<?xml version="1.0" encoding="utf-8"?>
<!-- generator="Movable Type/4.23-en" -->
<rss version="0.91">
 <channel>
 	<title>Lorcan Dempsey&apos;s Weblog</title>
 	<link>http://orweblog.oclc.org/</link>
 	<description>On libraries, services and networks. Explores how library services and organization evolve to meet the needs of network users. </description>
 	<language>en-us</language>
 	<webMaster>dempseyl@oclc.org (Lorcan Dempsey)</webMaster>
 	<pubDate>Sat, 04 Feb 2012 15:31:56 -0500</pubDate>
	
	
 	<item> 
    	<title>Big data .. big trend</title> 
        <description><![CDATA[<p>[<i>I spoke at the LITA Top Technology Trends session in Dallas. I had a trend in reserve - <strong>big data</strong> - but did not use it. Here is something along the lines of what I might have said ...</i>]</p>

<p>Big Data is a big trend, but as with expressions for other newly forming areas, it may evoke different things for different people. </p>

<p>A few years ago, academic libraries might have thought of scientific or biomedical data when they heard the expression 'big data'. In particular, the publication of <em><a href="http://research.microsoft.com/en-us/collaboration/fourthparadigm/default.aspx">The Fourth Paradigm: data-intensive scientific discovery</a></em> helped  crystallise awareness of developments in scientific practice. </p>

<p>More recently, however, big data has become a much more general term, across various domains. Indeed, it is now common to read about big data in the general business press. One comes across it in government and medicine, and in education. For example, a recent <a href="http://www.insidehighered.com/news/2012/02/01/using-big-data-predict-online-student-success">article</a> in <em>Inside Higher Ed</em> talks about 'big data' and 'predictive analytics' in relation to course data and student retention. There are two interesting aspects of this: one, the data, and two, the management environment ...</p>

<p>The rise of webscale services which handle large numbers of users and transactions and large amounts of data has made the management of big data a more visible issue. At the same time, as more material is digital, as more business processes are automated, and as more activities shed usage data, organizations are having to cope with greater volume and variety of relatively unstructured data. Analytics, the extraction of intelligence from usage data, has become a major activity. Here is a helpful characterization by Edd Dumbill on O'Reilly Radar.</p>

<blockquote>As a catch-all term, "big data" can be pretty nebulous, in the same way that the term "cloud" covers diverse technologies. Input data to big data systems could be chatter from social networks, web server logs, traffic flow sensors, satellite imagery, broadcast audio streams, banking transactions, MP3s of rock music, the content of web pages, scans of government documents, GPS trails, telemetry from automobiles, financial market data, the list goes on. Are these all really the same thing? [<a href="http://radar.oreilly.com/2012/01/what-is-big-data.html">What is big data?</a>]</blockquote>

<p>In a brief discussion on Facebook of big data as a possible trend, Leslie Johnston provided an interesting perspective on issues from the Library of Congress.</p>

<blockquote>Our collections are not just discovered by people and looked at, they are discovered by processes and analyzed using increasingly sophisticated tools in the hands of individual researchers, using just laptops. And we not only have TB/PB of digital collections, we will have billions of items, so fully manual processing/cataloging is rapidly becoming a thing of the past.</blockquote>

<p>Leslie expanded on some of the actual data  ...<br />
<blockquote><ul>	<li>5 million newspaper pages, images with OCR, available via API, used in NSF digging into data project for data mining, combined with other collections used in new visualizations, and in an image analysis project. </li>	<li>5 billion files of all types in a single institutional web archive - researchers do not search for and view individual archived sites, they analyze sites over time, and characterize entire corpuses, such as campaign web sites over 10 years. </li>	<li>Extreme example: over 50 billion tweets: many research requests received to do linguistic analysis, graph analysis, track geographic spread of news stories, etc. </li>	<li>Collection of 100s of thousands of electronic journal articles, which require article-level discovery: they don't all come with metadata and no one can afford to create it manually.</li></ul> </blockquote></p>

<p>The remark about manual creation of metadata is one example where current processing methods do not scale. Leslie also notes:</p>

<blockquote>And we cannot do manual individual file QA for mass digitization or catalog web archives or tweets without automated extraction. And when we start talking about video and audio, it all requires automated extraction or processing. I know of one request that we process a video to produce an audio-only track so that a transcript could then be automatically generated. LC has 20 PB of video and audio. Can you imagine what it would take to provide that level of service? Researchers started asking a few years ago to get files so they could do it themselves.</blockquote>

<p>The Library of Congress may be a special case, but other organizations are facing similar issues. We are familiar with discussions about research data curation in university settings. Referring to the university challenge, Leslie then points to another interesting example. </p>

<blockquote>I hear this from research libraries, but also from archives, especially state archives that are mandated to take in all state records, physical and electronic. Email archives are already Big Data for a lot of state archives.</blockquote>

<p>Indeed, national or state institutions with responsibility for public records are reconfiguring organizations and systems to manage large volumes of e-records. My colleague Jackie Dooley pointed me to the recent <a href="http://www.whitehouse.gov/the-press-office/2011/11/28/presidential-memorandum-managing-government-records">Presidential Memorandum on Managing Government Records</a>, which has implications for agencies and NARA. </p>

<p>In this context, it is not surprising that we are seeing a growing interest in data mining across domains (Leslie mentions the '<a href="http://www.diggingintodata.org/">digging into data</a>' challenge). The term 'data scientist' is cropping up in job ads and position titles. A couple of years ago, Hal Varian's comments on the importance of data and the skills required to analyse it were widely noticed. </p>

<blockquote>The ability to take data - to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it - that's going to be a hugely important skill in the next decades, not only at the professional level but even at the educational level for elementary school kids, for high school kids, for college kids. Because now we really do have essentially free and ubiquitous data. So the complementary scarce factor is the ability to understand that data and extract value from it. [<a href="http://www.mckinseyquarterly.com/Hal_Varian_on_how_the_Web_challenges_managers_2286">Hal Varian on how the web challenges managers - reg required</a>]</blockquote>

<p>It is clear from this discussion that existing systems are not well suited to manage and analyse these types of data, and this introduces the second topic, the management environment. Indeed, for Dumbill, this is <strong>the</strong> defining characteristic of big data:</p>

<blockquote>Big data is data that exceeds the processing capacity of conventional database systems. The data is too big, moves too fast, or doesn't fit the strictures of your database architectures. To gain value from this data, you must choose an alternative way to process it.</blockquote>

<p>And alternative ways have been emerging, assisted by the webscale companies who had to face these challenges early on. Google provided MapReduce, described by Edd Dumbill as follows:</p>

<blockquote>The important innovation of MapReduce is the ability to take a query over a dataset, divide it, and run it in parallel over multiple nodes. Distributing the computation solves the issue of data too large to fit onto a single machine. Combine this technique with commodity Linux servers and you have a cost-effective alternative to massive computing arrays. [<a href="http://radar.oreilly.com/2012/02/what-is-apache-hadoop.html">What is Apache Hadoop</a>]</blockquote>

<p>MapReduce is a central part of Hadoop, whose development was supported by Yahoo, and whose further development is now supported within the Apache Software Foundation.</p>

<blockquote>Hadoop brings the ability to cheaply process large amounts of data, regardless of its structure. By large, we mean from 10-100 gigabytes and above. How is this different from what went before?</blockquote><blockquote>Existing enterprise data warehouses and relational databases excel at processing structured data and can store massive amounts of data, though at a cost: This requirement for structure restricts the kinds of data that can be processed, and it imposes an inertia that makes data warehouses unsuited for agile exploration of massive heterogenous data. The amount of effort required to warehouse data often means that valuable data sources in organizations are never mined. This is where Hadoop can make a big difference.  [<a href="http://radar.oreilly.com/2012/02/what-is-apache-hadoop.html">What is Apache Hadoop</a>]</blockquote>

<p>The availability of the Hadoop family of technologies (again, nicely <a href="http://radar.oreilly.com/2012/02/what-is-apache-hadoop.html">described</a> by Dumbill) and cheap commodity hardware has made processing of large amounts of data more accessible. Cloud options are also emerging, from Amazon, Microsoft and others. Uptake has been rapid. </p>

<p>So, while Hadoop and related technologies have emerged in the context of the Big Data requirements of webscale companies, they are becoming more widely deployed. Their scalability, coupled with lower cost, has made them an attractive option across a range of data processing tasks. They may be used with 'big data' and not so big data. </p>

<p>In this way, my big data trend may more realistically be two trends. We are indeed having to process greater volume and variety of data. The description of data management at the Library of Congress provides some nice examples. Several technologies, notably the Hadoop framework, have emerged as a result of such challenges. However, these are now also finding broader adoption as they reduce costs and provide greater flexibility. </p>

<p><strong>Coda:</strong> In OCLC Research we have been using MapReduce <a href="http://outgoing.typepad.com/outgoing/2005/06/mapreduce_and_w.html">for several years</a> and more recently have been using Hadoop. We have also been working with colleagues elsewhere in OCLC as we look at where and how Hadoop might provide benefits. </p>]]></description>
		<author>dempseyl@oclc.org (Lorcan Dempsey)</author>
        <link>http://orweblog.oclc.org/archives/002196.html</link> 
        <guid>http://orweblog.oclc.org/archives/002196.html</guid> 
        
                    <category>Analytics and measurement</category>
        
                    <category>General - systems and technologies</category>
        
                    <category>Libraries -  systems and technologies</category>
        
                    <category>OCLC</category>
        

        <pubDate>Sat, 04 Feb 2012 15:31:56 -0500</pubDate>
 	</item> 

 	<item> 
    	<title>Linking not typing ... knowledge organization at the network level</title> 
        <description><![CDATA[<p>'Knowledge organization' seems a slightly quaint term now, but we don't have a better one in general use. Take the catalogue. This has been a knowledge organization tool. When an item is added, the goal is that it is related to the network of knowledge that is represented in the catalogue. In theory, this is achieved through 'adjacency' and cross reference, notably with reference to authors, subjects and works. In practice this has worked variably well. </p>

<p>In parallel with bibliographic data, the library community, notably national libraries, has developed 'authorities' for authors and subjects to facilitate this structure. From our current vantage point, I think we can see three stages in the development of these tools. </p>

<p>1. Label. In the first, subject and name authorities provide lists from which values for relevant fields are chosen. Effectively, they constrain the range and format of subject or name data, providing an agreed text label for a concept or name. Examples are LCSH, Dewey, and the Library of Congress Name Authority File. These provide some structuring devices for local catalogues, but those systems do not exploit the full structure of the authority systems from which the values are taken. Think of what is done, or not done, with classification for example. The classification system may not be used to provide interesting navigation options in the local system, and more than likely is not connected back to the fuller structure of the source scheme. That said, having a consistent label is an advantage, and facilitates matching within and between systems. </p>

<p>2. Data. The second stage is that these authority systems are being considered as resources in themselves, and not just as sources of controlled values for bibliographic description. So, we are seeing the Library of Congress, for example, making LCSH and the Name Authority File available as linked data. OCLC is working with a group of national libraries to synthesize name authority files and make them available as an integrated resource in the <a href="http://www.viaf.org">VIAF</a> service. <a href="http://experimental.worldcat.org/fast/">FAST</a> has recently been made available in this way. The <a href="http://www.surffoundation.nl/en/themas/openonderzoek/infrastructuur/Pages/digitalauthoridentifierdai.aspx">Digital Author Identifier</a>, a national Dutch system for identifying researchers, is interesting in this context. In this arrangement, there is collaboration between the apparatus for uniquely identifying researchers and the national authority file. </p>

<p>3. Network.  In a third stage, as these network-level resources become more richly linkable and as local environments exploit that linking ability, it becomes possible to do more. This type of linking has only just begun though, and it will be interesting to see how it develops. In this context, a URI is added to the label, making it actionable and globally unique. As an example, think again of the catalogue. The structuring devices we employ are about structuring relationships <em>within</em> the catalogue. This would be turned inside out if we not only imported values, but also linked those labels to those external resources. In this way, the item represented could be re-placed in the broad network of knowledge established by the authority file from which it comes. </p>

<p>Of course, alongside this, they may also link to, or draw data from, other navigational, contextual, identifying or structuring resources such as DBpedia, MusicBrainz or Geonames. These and other reference points are likely to be important webscale identity and knowledge organization services. In a sense, more generally, this has already happened, as people orient themselves by links to Wikipedia, MusicBrainz, IMDB and other network level resources. </p>

<p>As in other areas of our activity, we need to think about how activities whose natural level was once local are now moving up to the network level. And once they are at the network level, they have to live alongside other approaches. </p>

<p>If this were to become more common, there are some implications ...</p>

<p><strong>From records to entities</strong> ... we ship data around in 'records', bundles about individual items, and our systems are structured around managing these records. We do not tend to manage data about other things of interest to us to the same extent:   authors, places, people, concepts, works, and so on, the types of things we have in authority files. What would happen if we more clearly described an item by linking it to these files? More generally, we can see stronger interest emerging in some of these other entities, personal names especially. Think for example of how Amazon has created people pages or the growing interest in researcher identification. Or of places, as geolocation services take hold. Freebase is creating an '<a href="http://wiki.freebase.com/wiki/What_is_Freebase%3F">entity graph</a>' giving IDs to millions of entities (people, places and things). </p>

<p>Much of the library linked data discussion has been about making that local record-based data available in different ways. As interesting is the discussion about what key resources libraries will want to link to, and <strong>how they might be sustained</strong>. An important question for national libraries and others who manage some of the schemes mentioned above is how to move into this third phase. What would this mean for library systems or for library data of this type? What resources are important? How should they be sustained? To make this concrete, are the name authority files maintained by national libraries fit for purpose in a network world? Does it make sense to limit their scope to authors identified in a particular library workflow, cataloging, and exclude other authors (of articles, for example)? Does it make sense to limit their creation to a restricted group of specialist librarians? And so on ...</p>

<p>Finally, as knowledge organization moves to the network level, how do library resources relate to others? Can other services leverage <strong>the accumulated investment of the library community</strong>, or does it fade? The organized relationship between the Deutsche Nationalbibliothek and Wikipedia in Germany is an interesting example here, where the German Wikipedia explicitly takes advantage of the structuring work done by the DNB. Wikipedia itself is very interesting in this regard, as it has effectively become an 'addressable knowledgebase'. If I want to tell you about a new concept or movement, or refer you to a place, or mention a person, I can send you a Wikipedia link. What would be required for Wikipedia to take advantage of 'knowledge organization' approaches developed in the library community? </p>

<p><br />
Related entry:</p>

<ul>	<li><a href="http://orweblog.oclc.org/archives/002185.html">Nostalgia, the Dublin Irish Festival and variant forms of names</a></li></ul>]]></description>
		<author>dempseyl@oclc.org (Lorcan Dempsey)</author>
        <link>http://orweblog.oclc.org/archives/002195.html</link> 
        <guid>http://orweblog.oclc.org/archives/002195.html</guid> 
        
                    <category>Featured</category>
        
                    <category>Knowledge organization and  representation</category>
        
                    <category>Metadata</category>
        
                    <category>OCLCr</category>
        

        <pubDate>Sun, 01 Jan 2012 13:19:04 -0500</pubDate>
 	</item> 

 	<item> 
    	<title>End of the digest ....</title> 
        <description><![CDATA[<p>For almost as long as this blog has been going we have had an associated digest. This has gone out to over 800 people. The frequency of the digest has changed as the frequency of posting has gone down. We have decided that it is now time to turn off the digest. While the blog continues, it has become more a venue for occasional comment than a steady stream. Thank you to all those who have subscribed to the digest, and I hope you continue to read entries in the future. </p>

<p>And, as a reminder, I am on Twitter at <a href="http://www.twitter.com/LorcanD">http://www.twitter.com/LorcanD</a>.</p>]]></description>
		<author>dempseyl@oclc.org (Lorcan Dempsey)</author>
        <link>http://orweblog.oclc.org/archives/002193.html</link> 
        <guid>http://orweblog.oclc.org/archives/002193.html</guid> 
        
                    <category>Personal</category>
        

        <pubDate>Thu, 13 Oct 2011 20:47:49 -0500</pubDate>
 	</item> 

 	<item> 
    	<title>Collections are library assets</title> 
        <description><![CDATA[<p>I quite like using the word 'assets' with reference to library collections. We tend to think of assets in positive terms, as things that are valuable. More of that later.  </p>

<p>I was interested to see Rick Anderson remark on the vocabulary used by my colleague Constance Malpas a while ago. This was in the context of a generous note about  Constance's <em>"Cloud-sourcing Research Collections: Managing Print in the Mass-digitized Library Environment." [<a href="http://www.oclc.org/research/publications/library/2011/2011-01.pdf">pdf</a>]</em>.</p>

<blockquote>I confess that I giggle and shudder simultaneously at the thought of referring publicly to books in our collection as "inventory that is increasingly devalued as an institutional asset." That kind of business-school-flavored language will, not to put too fine a point on it, utterly freak out significant segments of any university faculty, not to mention library staff. [<a href="http://scholarlykitchen.sspnet.org/2011/02/28/the-digitized-book-corpus-and-the-cracking-dam/">The Scholarly Kitchen</a>]</blockquote>

<p>The 'business' reference is apt, and I confess that my sense of 'asset' in general conversation has indeed been subtly transformed by the narrower accounting sense. For example in the glossary to Robert C Higgins' <em>Analysis for financial management</em>* we read that an asset is 'Anything with value in exchange'. And turning, as one does, to Wikipedia, I read an <a href="http://en.wikipedia.org/wiki/Asset">accounting definition</a>: 'An asset is a resource controlled by the entity as a result of past events and from which future economic benefits are expected to flow to the entity'.</p>

<p>What is relevant here is the idea that assets are things from which you release value. You expect a return. But assets are not ends in themselves. They are means towards creating value. Of course, this is important because assets have associated costs. Managing collections, for example, is not cost-free. </p>

<p>I remember being struck by some sentences about assets in Higgins' book when I read them first a few years ago: </p>

<blockquote>Some newcomers to finance believe assets are a good thing: the more the better. The reality is just the opposite: Unless a company is about to go out of business, its value is in the income stream it generates, and its assets are simply a necessary means to this end. Indeed, the ideal company would be one that produced income without any assets; then no investment would be required, and returns would be infinite.</blockquote>

<p>Yes, financial metrics lend clarity here, but are not relevant to libraries, for which the question of value is different and less susceptible to measurement. </p>

<p>However, it has been interesting to see the growing debate about print 'assets' in libraries. As the pressure to repurpose space grows and as the print collection releases progressively less value in research and learning, there is a growing interest in managing down print assets. Not unexpectedly, this is in parallel with an emerging interest in securing system-wide preservation of the collective print record. (I provide some examples in the related entries listed below.)</p>

<p>It is clear that research libraries no longer see collections as ends in themselves, or they do not necessarily equate the size of the collection with the value of the library. More is not necessarily better. They also recognise the opportunity costs of managing large print collections.  </p>

<p>As we rethink collections, I think we are seeing them more as assets in the sense I have discussed here, as investment is driven by a stronger sense of how they will be used to generate value in research and learning. Of course, some libraries have thought this way for longer: think of how a busy public library manages its collection. And of course, some libraries will continue to have a mission-driven responsibility to collect significant portions of the scholarly record, although we will probably see more collective approaches here. </p>

<p>Anyway, to get a sense of what I mean, Rick Anderson's <a href="http://www.slideshare.net/CharlestonConference/let-them-eat-everything-by-rick-anderson-university-of-utah">presentations</a> might help ...</p>

<p>Related entries:<br />
<ul>	<li><a href="http://orweblog.oclc.org/archives/002152.html">The service turn</a></li>	<li><a href="http://orweblog.oclc.org/archives/002160.html">The collections shift</a></li>	<li><a href="http://orweblog.oclc.org/archives/002151.html">Managing down collections</a></li>	<li><a href="http://orweblog.oclc.org/archives/002106.html">We're not going anywhere ... Ok, we lied ...</a></li></ul></p>

<p>* It is always a pleasure to read something that is well written. This is a very nice example of fine technical writing. </p>]]></description>
		<author>dempseyl@oclc.org (Lorcan Dempsey)</author>
        <link>http://orweblog.oclc.org/archives/002191.html</link> 
        <guid>http://orweblog.oclc.org/archives/002191.html</guid> 
        
                    <category>Libraries - organization and services</category>
        

        <pubDate>Wed, 31 Aug 2011 15:48:37 -0500</pubDate>
 	</item> 

 	<item> 
    	<title>The ILS, the digital library and the research library</title> 
        <description><![CDATA[<p>Job adverts are interesting for a variety of reasons. They give a sense of skills and attributes in demand. They say something about how the hiring institution wants to present itself. And they can indicate trends. </p>

<p>I have been interested to see three research libraries look for senior digital library posts in recent months.</p>

<ul>	<li><a href="http://www.lisjobnet.com/job-ads/762-a110606-associate-director-for-digital-library-programmes-and-information-technologies/">Associate Director for Digital Library Programmes and Information Technologies</a>, Bodleian Libraries, University of Oxford.</li>	<li><a href="https://academicjobs.columbia.edu/applicants/jsp/shared/frameset/Frameset.jsp?time=1313628502453">Associate Vice President for Digital Programs and Technology Services</a>, Columbia University Libraries/Information Services.</li>	<li><a href="http://www.jobs.ed.ac.uk/vacancies/index.cfm?fuseaction=vacancies.furtherdetails&vacancy_ref=3014396">Head of Digital Library</a>, Information Services, The University of Edinburgh.</li></ul>

<p><strong>Note</strong>: given the nature of these resources, the links may not continue to work indefinitely.</p>

<p>Now, these are different posts in different institutions but there is the common ground that you might expect as research libraries look at creating digital infrastructure, engage with research data needs, explore new modes of scholarly communication, and so on. Each is challenging and interesting and offers a wonderful opportunity to be centrally involved in advancing how libraries support changing research and learning practices. </p>

<p>However, I was struck by something else they have in common. Responsibility for the integrated library system (or library management system) appears to be a part of each post, yet it is not foregrounded in the position description. For these libraries, maybe, the ILS is a necessary part of doing business, but is not the site of major development. Designing and developing digital infrastructure now includes the ILS but is no longer led by it. Or maybe there is some other reason .... ?</p>

<p>Now, considerable time and effort goes into these systems, and they will be reconfigured in coming years. Picking up on my opening remarks though, it is interesting to see where the adverts place the emphasis. <br />
</p>]]></description>
		<author>dempseyl@oclc.org (Lorcan Dempsey)</author>
        <link>http://orweblog.oclc.org/archives/002188.html</link> 
        <guid>http://orweblog.oclc.org/archives/002188.html</guid> 
        
                    <category>Libraries -  systems and technologies</category>
        

        <pubDate>Wed, 17 Aug 2011 20:43:37 -0500</pubDate>
 	</item> 

 	<item> 
    	<title>Preserving musical heritage ...</title> 
        <description><![CDATA[<p>One of the casualties of the London riots last week was a Sony distribution warehouse.</p>

<blockquote>The building, owned by Sony DADC, was also the main HQ for the UK's biggest distributor of independent music, Pias. [<a href="http://www.bbc.co.uk/newsbeat/14490635">More than 1.5m CDs destroyed in Sony warehouse fire</a>]</blockquote>

<p>Interestingly, Sony looked after the stock of more than 150 record labels at the warehouse. According to the BBC story quoted above "As well as CDs, the 20,000 sq m (215,000 sq ft) centre was used to store DVDs, Blu-ray discs and discs used for PlayStation Portable games."</p>

<p> It was depressing reading about the impact on the affected independent labels and artists and the music stores who depend on them (see, for example this <a href="http://www.nme.com/blog/index.php?blog=10&title=london_riots&more=1&c=1&tb=1&pb=1">NME blog</a> entry and this <a href="http://www.guardian.co.uk/music/musicblog/2011/aug/11/how-pias-fire-affects-labels?intcmp=239">Guardian article</a>). Several support initiatives have been set up to work with the labels through this difficulty.  I immediately thought about the heightened awareness about distribution, supply chain management and risk following the Japanese earthquakes earlier this year (see this <a href="http://www.nytimes.com/2011/03/20/business/20supply.html?pagewanted=all">NYT story</a> for example). </p>

<p>One of the consequences of the arson is that some of the labels may not re-issue physical formats of the music. It will only be available to consumers online in digital format. See this note on the Buzzin' Fly label, for example:</p>

<blockquote><strong>09.08.11 Buzzin' Fly stock goes up in flames in warehouse fire during London riots</strong>
London, 13h33 , temperature 21°, humidity 72%, clear. Virtually all Buzzin' Fly and Strange Feeling stock was destroyed in an arson attack on the Sony DADC warehouse in Enfield last night during the London riots. The warehouse contained all records distributed by our distributor, PIAS. Other labels are also badly hit. There are a handful of copies of some releases (and a full download catalogue) left on sale on the Buzzin' Fly online shop, but beyond that it is unlikely much of our stock will ever be repressed if at all. A huge slice of the label's history has been destroyed. [<a href="http://www.buzzinfly.com/buzzed.html">Buzzin' Fly Records</a>]</blockquote>

<p>A couple of things occurred to me. First, it is interesting to see the concentration that has occurred in the physical distribution chain and the vulnerability this created for the indie labels. Second, I was struck again by how one of the benign consequences of the historic model is that preservation is a function of the physical distribution of materials. As national libraries and others look to maintaining a record of the cultural and scholarly environment in a digital world, the model has changed in ways that we don't yet have a very good perspective on. However, it made me realise that I knew little about how well, or not, libraries and other memory institutions (in this case, this seems an appropriate term) are prepared for the acquisition, management and preservation of digital music.</p>]]></description>
		<author>dempseyl@oclc.org (Lorcan Dempsey)</author>
        <link>http://orweblog.oclc.org/archives/002187.html</link> 
        <guid>http://orweblog.oclc.org/archives/002187.html</guid> 
        
                    <category>Digital asset management</category>
        
                    <category>The cultural and scholarly record</category>
        

        <pubDate>Sun, 14 Aug 2011 15:40:26 -0500</pubDate>
 	</item> 

 	<item> 
    	<title>Nostalgia, the Dublin Irish Festival, and variant forms of names</title> 
        <description><![CDATA[<p>The <a href="http://www.dublinirishfestival.org/">Dublin Irish Festival</a> is on this weekend - Dublin, Ohio, that is. I notice that Moya Brennan is <a href="http://www.dublinirishfestival.org/artist/">performing</a>.</p>

<p>As some folks will know, Moya Brennan is an Irish singer who was a member of the well-known family group, <a href="http://www.clannad.ie/biography/index.html">Clannad</a>. They emerged in the 70s, playing very much in a traditional Irish music idiom. As they evolved, they developed a style that was influential in the emergence of the sort of new age, 'celtic' music that became popular and has some well known practitioners. </p>

<p>One of the most vivid memories I have of secondary school is when they put on a concert there (before they were very well known). We had never seen anything like it - songs in Irish, but with a double bass and a harp, long hair, and influences from other types of music. </p>

<p>At that time, Moya was known as Máire Ní Bhraonáin. And indeed, if you look at the Worldcat Identities page for 'Moya Brennan', you will see 'Brennan, Máire' and 'Ní Bhraonáin, Máire' listed as alternatives. (The latter puts the surname in its Irish form.)</p>

<p>'Moya' is an assumed name, perhaps sounding something like the Irish pronunciation of 'Máire'. 'Moya Brennan' is presumably a more memorable name to have outside Ireland than 'Máire Ní Bhraonáin'.</p>

<p>Worldcat Identities does a reasonable job with related Identities, which you can see and follow in the graphical form at the <a href="http://experimental.worldcat.org/IDNetwork/display.html?query=lccn-no98-100171">Worldcat Identities Network page</a>.  It pulls together the other members of Clannad - her brothers (Brennans) and uncles (Duggans) - and some others. </p>

<p>Interestingly, however, Identities, drawing on the underlying bib data, does not reveal what many people might feel is her most interesting related identity.</p>

<p>This is her more famous sister, who was briefly part of Clannad, and whose name is Eithne Ni Bhraonain. Eithne is somewhat better known as ....... <a href="http://www.worldcat.org/wcidentities/lccn-n91-122258">Enya</a>.  </p>

<p>Now, as the relationships in Identities - mined from the bib records - are 'bibliographic', there is no reason that family or other relationships which are not somehow represented in the bibliographic data should be present. In fact, as my colleague Diane Vizine-Goetz pointed out, there is a relationship in the bibliographic data, but at a very low threshold. See the names collocated in the statement of responsibility in this <a href="http://www.worldcat.org/oclc/470985067">Danish record</a> (the names are not consistently controlled in WC).</p>

<p>It seems to me that having the relationships made visible in Identities makes me want to see other important relationships - whether or not they are supported by bibliographic relationships. Not having them there seems to me to be an important limitation. This is because I am seeing things through a names or identity filter and the Identity becomes a subject of interest in itself. We are seeing more of this. Open Library, Amazon, Google Scholar, Trove and others all now present 'author' pages, because authors or creators are important topics of interest in themselves, as well as pivots for searching. </p>

<p>This prompts these brief observations about some inevitable futures if library practices in this area are to continue to be relevant:</p>

<ul><li>As I suggested the last time I spoke about variant forms of Irish names (<a href="http://orweblog.oclc.org/archives/001848.html">Name authorities, crowdsourcing, and Máire Mhac an tSaoi</a>), it would be nice to make suggestions to LC or the National Library of Ireland or the BL or ... However, I have no way of easily doing that. Authorities work - and think NACO here - is a professional activity, hedged around by rules and procedures; it is after all 'authorities' work. However, it would seem sensible to open it up to suggestion and information.</li>	<li>We now have access to several related online sources of data about people and names. Think of the related Wikipedia, Dbpedia and Freebase for example. It would be good to be able to programmatically mine them to enhance the data libraries manage. In Identities, we match names in Wikipedia and link to them when we are reasonably confident the link is correct, but it is not easy to extract relationship information. </li>	<li>One could change authorities practice to record significant related people who are not supported by bibliographic relationships. This seems sensible to me, although it runs counter to the trend to reduce effort - unless a parallel mechanism is found to more widely share the task of creation. One probably would not want to record that <a href="http://www.worldcat.org/wcidentities/lccn-n90-602202">Lorcan Dempsey</a> is <a href="http://www.worldcat.org/wcidentities/np-dempsey,%20laurence">Laurence Dempsey's</a> son; but it would be interesting to know that Martin Amis is Kingsley Amis's son, and not just let a general relationship emerge because they are both the subject of a book, for example. One might note here in passing that it would be nice to consistently type those relationships. </li>
	<li>Finally, the current model in library catalogs is one in which centralised creation of 'knowledge organization' systems (subject/name authorities in particular) is matched by decentralised application of those systems in local environments. There is some consistency, but in general local applications don't make the best use of the data, there may be some variability in application, and links back to the fuller knowledge systems are not present. As more activities move to the network level, it is worth thinking more seriously about network level knowledge organization also, where those knowledge systems become network level resources which others link to. Think of how <a href="http://www.viaf.org">VIAF</a> or the authorities the Library of Congress makes available at <a href="http://id.loc.gov/">http://id.loc.gov/</a> might be used, for example, to provide richer context. Currently, a catalog may include, for example, a name in its controlled form: think of how the catalog could either link to, or actually incorporate, the richer data available in the authority file, or a system like Identities. In the context of this note, this would bring the design and construction of those knowledge organization systems more into focus and encourage some of the approaches mentioned above. This is alongside the work that happens in Wikipedia, or Amazon, or other sources. You can get a sense of this at the U Wisconsin Forward experimental system which pulls data from Freebase, with varying success. See for example this search for <a href="http://forward.library.wisconsin.edu/catalog?q=conor+cruise+o%27brien&qt=search&local=false">Conor Cruise O'Brien</a>. It would be good to see library resources used to enrich results or local data. I will return to the topic of network level knowledge organization in a future post.</li></ul>

<p>(Incidentally, 'knowledge organization' seems a rather antique term .. it would be nice to have a better one.)</p>

]]></description>
		<author>dempseyl@oclc.org (Lorcan Dempsey)</author>
        <link>http://orweblog.oclc.org/archives/002185.html</link> 
        <guid>http://orweblog.oclc.org/archives/002185.html</guid> 
        
                    <category>Knowledge organization and  representation</category>
        

        <pubDate>Fri, 05 Aug 2011 06:55:08 -0500</pubDate>
 	</item> 

 	<item> 
    	<title>Worldcat Identities Network: a &apos;mashup&apos;</title> 
        <description><![CDATA[<p>There has been some nice reaction to the <a href="http://www.oclc.org/research/news/2011-07-28.htm">Worldcat Identities Network</a>.</p>

<p><a href="http://experimental.worldcat.org/IDNetwork/display.html?query=lccn-n50-1905"><span class="mt-enclosure mt-enclosure-image" style="display: inline;"><img alt="win.PNG" src="http://orweblog.oclc.org/win.PNG" width="600" height="365" class="mt-image-none" style="" /></span></a></p>

<p>The initial motivation for this was to put a graphic display of <a href="http://www.worldcat.org/wcidentities/lccn-n50-12860#linkassoc">related Identities</a> into an Identities <a href="http://www.worldcat.org/wcidentities/lccn-n50-12860">page</a>. This did not work out and we decided to make it available as a standalone app. The aim is to show how something could be built on top of the <a href="http://oclc.org/developer/services/WCAPI">Worldcat API</a> and the <a href="http://www.oclc.org/developer/services/identities">Worldcat Identities Web Services</a>. </p>

<p>It is a 'mashup', which prompted me to wonder whether the peak of the mashup has passed. Or at least, we do not hear as much about mashups as before. It is interesting to consider why this is the case, but that is something for a future post perhaps. </p>

<p>Enjoy the Worldcat Identities Network ... Here - pretty randomly - is musician <a href="http://experimental.worldcat.org/IDNetwork/display.html?query=lccn-n92-12823">Paul Brady</a>, artist <a href="http://experimental.worldcat.org/IDNetwork/display.html?query=lccn-n80-23105">Richard Diebenkorn</a>, organizations <a href="http://experimental.worldcat.org/IDNetwork/display.html?query=lccn-n80-23105">The Missouri Botanical Gardens</a> and <a href="http://experimental.worldcat.org/IDNetwork/display.html?query=lccn-n82-2711">Los Alamos National Laboratory</a>, academic and writer <a href="http://experimental.worldcat.org/IDNetwork/display.html?query=lccn-n84-120698">Seamus Deane</a>, poet <a href="http://experimental.worldcat.org/IDNetwork/display.html?query=lccn-n80-109150">Eavan Boland</a>, astronomer <a href="http://www.worldcat.org/wcidentities/lccn-n79-18031">Fred Hoyle</a>, the interesting <a href="http://experimental.worldcat.org/IDNetwork/display.html?query=nc-lincoln%20theatre%20columbus%20ohio">Lincoln Theatre</a>, Columbus, and the boat, <a href="http://experimental.worldcat.org/IDNetwork/display.html?query=lccn-n78-1099">Brendan</a>. </p>

<p>It should be noted that there is no editorial assistance here. The data is pulled from Worldcat and Worldcat Identities. Worldcat Identities is in turn programmatically mined from Worldcat. It would be nice to show the type of each relationship, but that is inconsistently and intermittently available in the bib data. </p>]]></description>
		<author>dempseyl@oclc.org (Lorcan Dempsey)</author>
        <link>http://orweblog.oclc.org/archives/002184.html</link> 
        <guid>http://orweblog.oclc.org/archives/002184.html</guid> 
        
                    <category>Libraries - distributed environments</category>
        
                    <category>Metadata</category>
        
                    <category>OCLC</category>
        
                    <category>OCLCr</category>
        
                    <category>User experience</category>
        

        <pubDate>Sat, 30 Jul 2011 11:22:15 -0500</pubDate>
 	</item> 

 	<item> 
    	<title>Gamification: services and libraries</title> 
        <description><![CDATA[<p>I have been interested to see more notes in my tweetstream about people's exercise or diet regimes. They are typically generated by network services as a by-product of some activity, running or cycling, for example, and are part of a motivating framework. The <a href="http://www.amazon.com/Withings-WBS01-Wifi-Body-Scale/dp/B002JE2PSA">Withings bathroom scale</a> is connected to the network, allows you to set goals and to record and access weight and other data, as well as optionally communicating progress by tweeting your weight. In a similar vein, I recently came across <a href="http://www.stickk.com">stickK.com</a> - "the smartest way to set and achieve your goals". </p>

<blockquote>If you're ready to turn that goal into an accomplishment, you're ready for stickK. 
stickK was founded on the principle that creating incentives and assigning accountability are the two most important keys to achieving a goal. Thus, the "Commitment Contract" was born. </blockquote>

<p>These are all examples of the growth of gamification as a design principle, the application of game mechanics to services and systems. </p>

<p><a href="http://info.keas.com/employee-wellness-tin">Keas.com</a>, a company aimed at reducing corporate healthcare costs by improving employee wellness, puts the 'power of play' at the centre of its method: </p>

<blockquote>Keas is a social game that that promotes employee wellness. Employees form teams and earn points by making healthy choices each week, like eating better, exercising more, or even getting more sleep. Teams participate in wellness challenges and whoever gets the most points wins!</blockquote>

<p>Keas is especially interesting because its co-founder and CTO is the distinguished <a href="http://en.wikipedia.org/wiki/Adam_Bosworth">Adam Bosworth</a>, who made major contributions to products and to general internet technologies while at Microsoft and BEA, before joining Google to work on Google Health. </p>

<p>Bosworth wrote about gamification and Keas on <a href="http://techcrunch.com/2011/06/18/web-of-games/">Techcrunch</a> recently. He quickly identified three major inflection points in the development of computer and network technologies, each bringing about an order of magnitude increase in participation: the emergence of the PC, the emergence of the GUI, and the emergence of the web and mobile phones. The next inflection point, he suggested, was not about numbers but about <strong>engagement</strong>. Engagement is a central aspect of gamification, and is integral to the examples I gave above. Bosworth points to a <a href="http://gamification.org/wiki/Gamification">gamification site</a> which carries this definition: "Gamification is the integration of Game Mechanics and game-thinking in non-game environments to boost Engagement, Loyalty and Fun!".</p>

<p>He argues that inside the term 'gamification' are three important truths. First, social gaming changes engagement in "deep and dramatic ways" by appealing to the "primitive brain in all of us that wants constant rewards, social recognition and adventure". Second, gamification will accelerate the move from physical to digital. He suggests that Groupon has shown that people like the juxtaposition of shopping and games, and that bricks-and-mortar shopping will have to become more fun if it is to slow the trend to online.  And third, he forecasts the complete replacement of PCs by mobile devices within 10 years, noting that "social games lend themselves to this form factor and this form factor is location-aware and constantly with you".</p>

<p>He concludes with an interesting message to developers:</p>

<blockquote>We used to teach information design. Then we taught UI design and UI interaction. But now it will be game mechanics. Within two years (if not already), lack of understanding appointment mechanics, game mechanics and leveling will be as crippling to someone who aspires to design online solutions as it is today for someone who doesn't understand HTML and CSS and AJAX and JQuery.</blockquote>

<p>Most of the examples above draw from health care and wellness, but we can see these techniques in widespread use. The Gamification website, mentioned above, has a <a href="http://gamification.org/wiki/Gamification_Industry">variety of examples</a> from different industries. </p>

<p>I was wondering whether such approaches were feasible in a library environment when I came across <a href="http://www2.hud.ac.uk/tali/support/proj11_lemon.php">Lemon Tree</a>, a project at the University of Huddersfield in the UK. </p>

<blockquote>Lemon Tree seeks to increase the use of library resources through a social, game based elearning platform. Users will register with the system and be able to earn points and rewards for interacting with library resources, such as leaving comments and reviews of library books. Integration with other social networks such as Twitter and Facebook will be built into the system.</blockquote>

<p>There is a <a href="http://library.hud.ac.uk/blogs/projects/lemontree/">project blog</a> and it will launch in the Fall (aka Autumn). Something to watch ... it will be interesting to see if incentives are strong enough to encourage strong participation. Andrew Walsh, who is managing the project, writes about it briefly in an article about the Library's work with social tools. </p>

<blockquote>The rewards users can gain through Lemon tree are developing as we see what works and what our users enjoy, with a massive range of options possible. We are particularly interested though in engaging those people who we know come into our library, but borrow very few books and rarely access our electronic resources. If we can make it fun for them to use the information resources we have and increase their usage then Lemon Tree will have succeeded for us. [<a href="http://eprints.hud.ac.uk/11035/">Tweets, texts & trees</a>.]</blockquote>

<p>I see that Andrew uses two of the words - 'engaging' and 'fun' - which were part of the definition of gamification given above. The third - 'Loyalty' - would be good to see as a result too ... </p>

]]></description>
		<author>dempseyl@oclc.org (Lorcan Dempsey)</author>
        <link>http://orweblog.oclc.org/archives/002183.html</link> 
        <guid>http://orweblog.oclc.org/archives/002183.html</guid> 
        
                    <category>Libraries - organization and services</category>
        
                    <category>Social networking</category>
        
                    <category>User experience</category>
        

        <pubDate>Sun, 24 Jul 2011 19:07:05 -0500</pubDate>
 	</item> 

 	<item> 
    	<title>Spotify and Klout: fungible influence</title> 
        <description><![CDATA[<p>Popular music streaming service Spotify has just launched in the US. For some background see the <a href="http://arstechnica.com/media/news/2011/07/music-service-spotify-finally-to-launch-in-us-on-thursday-morning.ars">Ars Technica story</a>. </p>

<p>One of the interesting aspects of the launch was the tie-in with Klout. <a href="http://corp.klout.com/about">Klout</a> is one of several services which provide analytics around social media activity. It aims to be the 'standard for influence', tracking social media impact. I have spoken about Klout before (<a href="http://orweblog.oclc.org/archives/002176.html">Analysing influence .. the personal reputational hamsterwheel</a>). The <a href="http://corp.klout.com/about">business model</a> includes the matching of 'influencers' in a particular area with providers of products and services relevant to that area. The providers may provide the influencers with 'perks' (upgrades, samples, etc) which positively influence their tweeting. </p>

<p>The 'perk' in the case of Spotify was an early free invite, and if the 'influencer' gets 5 additional people to sign up (see <a href="http://klout.com/perk/Spotify/SpotifyFreeAccounts?passalong=MzEvNTkyMTUvMg&passalongSig=44c1770db70407d894dad449db0350c150b96bd1b440275ec6e4bed7341d4bb0">here</a> for a link to my invite), then he or she gets a free pass to the premium Spotify service. Apparently, the offer was popular, or popular enough to cause Klout to go down for a while (according to <a href="http://techcrunch.com/2011/07/14/dont-have-a-free-spotify-invite-use-your-klout-perks/">Techcrunch</a>). </p>

<blockquote>Klout CEO Joe Fernandez says that Klout has partnered with Spotify to offer free invites to those Klout users who have hight scores in topics relevant to music/entertainment. We don't know the exact number of invites, but Fernandez says Klout is working with Spotify to scale the invitations further. [<a href="http://techcrunch.com/2011/07/14/dont-have-a-free-spotify-invite-use-your-klout-perks/">Techcrunch</a>]</blockquote>

<p>If my experience of the last few days is typical, the interest in Spotify seems to be driving new members to Klout as much as the other way around. I haven't seen discussion of what has happened to the 'influence' threshold. </p>

<p>While there is debate about what exactly Klout measures, some concern about where this model leads, and while it is still early days, I think that this more 'organic' approach to promotion is interesting. As is a service based on the fungibility of influence. </p>

<p>Klout may or may not be successful, but this is another example of how social and algorithmic approaches are changing our communications and media landscape. It is also a rather literal example of those 'hidden persuaders'. </p>

<p>(Slightly edited for style 7/25/2011.)<br />
 </p>]]></description>
		<author>dempseyl@oclc.org (Lorcan Dempsey)</author>
        <link>http://orweblog.oclc.org/archives/002182.html</link> 
        <guid>http://orweblog.oclc.org/archives/002182.html</guid> 
        
                    <category>Analytics and measurement</category>
        
                    <category>Social networking</category>
        

        <pubDate>Tue, 19 Jul 2011 10:20:32 -0500</pubDate>
 	</item> 

 	<item> 
    	<title>Hamster wheeling ....</title> 
        <description><![CDATA[<p>I have found the expression 'hamster wheeling' useful over the last few years. I tend to use it in the context of any frantic effort where the participants have to keep several things going at the same time, and where it seems that slowing down might cause something to fall off. </p>

<p>More specifically, it is an appropriate description for those operations, common in a digital library environment, where a set of grants are used to keep a set of people working on a set of projects. Many of us are familiar with such environments, either as participants or observers. </p>

<p>In this context, I was interested to see a similar expression used in a <a href="http://arstechnica.com/web/news/2011/06/has-the-internet-hamsterized-journalism.ars">report</a> recently in Ars Technica. </p>

<blockquote>But, "these additional responsibilities--and having to learn the new technologies to execute them--are time-consuming, and come at a cost. In many newsrooms, old-fashioned, shoe-leather reporting--the kind where a reporter goes into the streets and talks to people or probes a government official--has been sometimes replaced by Internet searches."</blockquote><blockquote>Thus, those "rolling deadlines" in many newsrooms are increasingly resembling the rapid iteration of the proverbial exercise device invented for the aforementioned cute domestic rodent. The observation was first made by Dean Starkman in a Columbia Journalism Review piece titled "<a href="http://www.cjr.org/cover_story/the_hamster_wheel.php?page=all">The Hamster Wheel</a>."</blockquote><blockquote>The "Hamster Wheel" isn't about speed, the report quotes Starkman as saying. "It's motion for motion's sake... volume without thought. It is news panic, a lack of discipline, an inability to say no." [<a href="http://arstechnica.com/web/news/2011/06/has-the-internet-hamsterized-journalism.ars">Has the Internet "hamsterized" journalism?</a>]</blockquote>]]></description>
		<author>dempseyl@oclc.org (Lorcan Dempsey)</author>
        <link>http://orweblog.oclc.org/archives/002179.html</link> 
        <guid>http://orweblog.oclc.org/archives/002179.html</guid> 
        
                    <category>Miscellaneous</category>
        

        <pubDate>Tue, 14 Jun 2011 09:19:45 -0500</pubDate>
 	</item> 

 	<item> 
    	<title>Books, artifactually and anecdotally speaking  ...</title> 
        <description><![CDATA[<p>Anecdotally speaking ....... it seems to me that the artifactual and design elements of print books are coming more to the fore, albeit in different ways. As the use of ebooks grows this is not really surprising, as this change creates niche opportunities for experiences focused specifically on the possibilities of the print medium. Some related trends are interesting. First, we are seeing physical books repurposed, recycled, or 'upcycled' to perform other functions. Second, we are seeing a focus on the aesthetics of print books, and on book arts in general. And third, we are seeing - I think? - more interest in artists' books, in books as art,   ... </p>

<p>Here are some examples ...</p>

<p>We attended the <a href="http://www.columbuscraftacular.com/">Columbus 'eco-chic craftacular'</a> a couple of weeks ago, pleasantly located only a couple of minutes' walk from our house. I was interested to see 'upcycled' books on at least two stands, where physical books were being repurposed as notebooks and cases. I remember being in the <a href="http://www.newberry.org/general/bookstore.html">bookstore</a> in the Newberry Library last year, and being interested to see some similar examples. At the time of writing, a search on Etsy.com returns over 3.5K results for 'upcycled books' and nearly 7K for 'recycled books' (of course, not all of these are in fact recycled books, but a lot are). </p>

<p>The marvellous Caustic Cover Critic has an <a href="http://causticcovercritic.blogspot.com/2011/06/notting-hill-editions.html">entry</a> about a new publisher, <a href="http://www.nottinghilleditions.com/">Notting Hill Editions</a>. He praises the appearance and design of the books and notes ...</p>

<blockquote>Several of the many, MANY, MANY future-of-the-book-in-the-face-of-ebooks pieces that have been cropping up in newspapers and online recently have mentioned that the way for physical books to survive in the future may be for them to be beautifully made objects. Notting Hill Editions are doing this just right--the look and feel of their books remind me of the precepts set out by master printer/typographer Daniel Berkeley Updike in his lovely The Well-Made Book. [<a href="http://causticcovercritic.blogspot.com/2011/06/notting-hill-editions.html">Notting Hill Editions</a>]</blockquote>

<p>Peter Hirtle alerted me to the Caustic Cover Critic in a comment on a <a href="http://orweblog.oclc.org/archives/002056.html">blog entry of mine</a> which discussed book covers and other artifactual properties of books, in line with the comment in the piece quoted above. </p>

<p>Although, <a href="http://orweblog.oclc.org/archives/002158.html">as I noted recently</a>, I hope that we don't see an either/or approach to publishing, where 'beautiful' books feel they don't need an electronic parallel. </p>

<p>[Here are the reissued Virago classics which I discussed in the <a href="http://orweblog.oclc.org/archives/002056.html">earlier blog entry</a>. Of course, the original Viragos had those iconic green covers. Note the remarks about the current position of Virago in the entry.]</p><p><a href="http://www.littlebrown.co.uk/NewsEvents/News-Archive/Virago-Modern-Classics"><img alt="virago.png" src="http://orweblog.oclc.org/archives/virago.png" width="557" height="620" /></a></p>

<p><br />
Finally, and anecdotally, what about artists' books? Again, it seems to me that I see more on this topic, most recently, for example, <a href="http://museum.stanford.edu/news_room/book-as-art.html">The Art of the Book in California: Five Contemporary Presses</a>, an exhibition at the Cantor Arts Center at Stanford.  </p>

<p>We lived in Bristol for many years, near the University of the West of England's Bower Ashton campus, home of the <a href="http://www1.uwe.ac.uk/cahe/ad">Department of Art and Design</a>. The Department has a <a href="http://www.uwe.ac.uk/sca/research/cfpr/">Centre for Fine Print Research</a>, with a specialisation in <a href="http://www.bookarts.uwe.ac.uk/about.htm">Artists' Books</a>. I mention it here as it provides an entry point into a range of other materials. See for example, <a href="http://www.bookarts.uwe.ac.uk/bodmid.htm">Sarah Bodman</a>'s very recent presentation <a href="http://www.bookarts.uwe.ac.uk/sbtalkldn.pdf">Artists and librarians: making art in and out of books [PDF]</a>. </p>

]]></description>
		<author>dempseyl@oclc.org (Lorcan Dempsey)</author>
        <link>http://orweblog.oclc.org/archives/002178.html</link> 
        <guid>http://orweblog.oclc.org/archives/002178.html</guid> 
        
                    <category>Books, movies and reading ...</category>
        
                    <category>User experience</category>
        
                    <category>ebooks and other e-resources</category>
        

        <pubDate>Wed, 08 Jun 2011 09:28:05 -0500</pubDate>
 	</item> 

 	<item> 
    	<title>Effective web presence ...</title> 
        <description><![CDATA[<p>We are growing used to the idea that simply making things available on the web does not necessarily make them widely discovered or discoverable. I have used the phrase <a href="http://www.google.com/search?q=%22search+engine+interoperability%22+lorcan">'search engine interoperability'</a> in these pages already. This is a play on 'search engine optimization', which some do not like (although I think we should not avoid using a standard industry term). We value interoperability in libraries, and have spent a lot of time thinking about it and building it into our systems and services. However, we tend to think of interoperability in rather heavily structured terms, and to think of it as between library systems, rather than as between library systems and the rest of the web. </p>

<p>Those responsible for library websites, repositories or other systems are indeed now thinking much more about URL patterns, good titles and descriptions, sitemaps, feeds, and so on. In this context, I think it is useful to think about search engine interoperability as managing resources in ways which promote effective crawling, indexing and ranking by search engines. Thinking, in other words, about the robot users of our systems as well as the human users.</p>

<p>Given the importance of this topic, it is good to see that the Strategic Content Alliance in the UK has made several resources available to provide guidance on these topics.</p>

<ol><li><a href="http://www.jisc.ac.uk/media/documents/themes/content/sca/SCAMOREGuide.pdf">Guide to maximising your online presence </a>(PDF) </li>
	<li><a href="http://www.jisc.ac.uk/media/documents/themes/content/sca/SCAMOREChecklist.pdf">A checklist for value from the internet </a>(PDF) </li>
	<li><a href="http://www.jisc.ac.uk/media/documents/themes/content/sca/SCAMOREFieldReports.pdf">Reports from the field - Experiences from those 'at the coalface' April 2011</a> (PDF)</li></ol>

<p>In the <a href="http://www.jisc.ac.uk/whatwedo/programmes/contentalliance/reports/maximisingonlineresources.aspx">high level description of this work</a> they evoke the SEO rationale:</p>

<blockquote>In an age when media, business, government and almost every aspect of modern society vies for the users' attention, how can we ensure that the resources that are being created through public funds reach and engage with their constituent audiences?</blockquote>

<p>The materials summarise good practice as interpreted by the Strategic Content Alliance. They go beyond what is usually thought of as SEO, and stay at a pretty general level. The reports from the 'coal face' are interesting, as site managers talk about their experiences trying to enhance their web presence.</p>

<p>Some of the same ground, with examples of good and bad practice, was covered by Ed Summers of the Library of Congress in a presentation that was well received at a DPLA meeting we both attended recently. Here it is ...</p>

<p><iframe src="https://docs.google.com/present/embed?id=dv89m3d_512fd83d9c5" frameborder="0" width="410" height="342"></iframe></p>

<p><br />
</p>]]></description>
		<author>dempseyl@oclc.org (Lorcan Dempsey)</author>
        <link>http://orweblog.oclc.org/archives/002177.html</link> 
        <guid>http://orweblog.oclc.org/archives/002177.html</guid> 
        
                    <category>General - distributed environments</category>
        
                    <category>Libraries -  systems and technologies</category>
        
                    <category>Libraries - distributed environments</category>
        
                    <category>Websites: design and role</category>
        

        <pubDate>Tue, 31 May 2011 21:57:20 -0500</pubDate>
 	</item> 

 	<item> 
    	<title>Analysing influence .. the personal reputational hamsterwheel</title> 
        <description><![CDATA[<p>Analysing influence has been a central part of academic life. We are very familiar with citation analysis. This is possible because the system allows that metric to be measured and it is seen to be meaningful. </p>

<p>As new measures have become possible in a web environment they too have been taken up. See for example the 'top of' lists at <a href="http://www.ssrn.com/">SSRN</a> or the elaborate set of rankings from <a href="http://ideas.repec.org/top/">RePEc</a>. </p>

<p>And we are also very familiar with the interest in such measures from individual researchers (it would be interesting to see some analysis of disciplinary differences here - are physicists more interested in their H-Index than geographers?). It is interesting to see influential economics blogger <a href="http://gregmankiw.blogspot.com/">Gregory N Mankiw</a> reference the RePEc list from time to time, for example, as well as other rankings. </p>

<p>As we began to use the web for personal interaction, new measures arose. I am thinking in particular of Technorati and blog rankings which were of interest for a while, although less so now as the blogging environment has changed. </p>

<p>And now we have Twitter. I have been looking at <a href="http://www.klout.com">Klout.com</a> recently ... with a mixture of intrigue and dread. Klout describes itself as the "standard for influence" and assigns scores using a proprietary algorithm based on Twitter and Facebook interactions and relationships. What is especially interesting about it is how it is trying to leverage the results of its analysis. An interview with founder Joe Fernandez provides some context. "If Klout CEO and co-founder Joe Fernandez has his way, the Klout score will become a new way of measuring people and their influence online." He makes a comparison with credit ratings:</p>

<blockquote>"Klout is basically your social credit score," Fernandez says. "Consumers should care because it affects the way employers, companies and everyone looks at your ability to spread information as a critical part of the attention economy today."</blockquote>

<p>They have an interesting business model.</p>

<blockquote>Then Klout matches relevant influencers to relevant brands, which provide special offers, especially perks like previews on new product launches. For example if you are influential about technology, HP might send you a free laptop. Or you could get a free upgrade at your favorite hotel.</blockquote><blockquote>The idea is that these influencers will create social media buzz that can help a product take off. (People have to sign up with Klout to receive the offers.)</blockquote>

<p>Klout seems to be coming over my horizon more often. I have seen several stories about its use in recruitment, where a person's Klout score is a factor in assessment. Given this interest, it is only natural that there are also critiques of this type of Twitter analysis in general and of Klout in particular. A moment with Google, for example, reveals <a href="http://michelletripp.com/index.php/2010/08/31/klout-do-you-have-enough-influence-to-get-the-job/">this</a> and <a href="http://www.talentzoo.com/digital-pivot/blog_news.php?articleID=9173">this</a>. </p>

<p>Other services exist: <a href="http://www.twitalyzer.com/">Twitalyzer</a> ("serious analytics for social business") for example and the interesting <a href="http://www.peerindex.net/">PeerIndex</a> ("understand your social capital").</p>

<p>It is clear that personal or social analytics - based on Twitter and other services - is of great interest, and several companies are exploring how to monetise the measurement of personal reputation or influence. The Klout model suggests how such influence might be fungible and gestures to a future in which one's score might lead to preferential access to services or products. This is an unsurprising development in a world where data about more and more of what we do is mined to provide intelligence for other services (what we spend for example, or where we go on the web). </p>

<p>I need to go now and catch up on my tweeting. I can't let my score drop  ... I wonder could it lead to an upgrade on my next transatlantic flight :-)</p>

<p><br />
</p>]]></description>
		<author>dempseyl@oclc.org (Lorcan Dempsey)</author>
        <link>http://orweblog.oclc.org/archives/002176.html</link> 
        <guid>http://orweblog.oclc.org/archives/002176.html</guid> 
        
                    <category>Analytics and measurement</category>
        
                    <category>Personal</category>
        
                    <category>Social networking</category>
        

        <pubDate>Sat, 28 May 2011 08:42:03 -0500</pubDate>
 	</item> 

 	<item> 
    	<title>Sourcing and scaling: the University of California</title> 
        <description><![CDATA[<p>I was interested to see Heather Christenson describe The Hathi Trust as a collaboratively sourced web scale research library in a recent article (Hathi Trust: a research library at web scale. <a href="http://www.hathitrust.org/documents/christenson-lrts-201104.pdf">PDF</a>). </p>

<p>This reminded me of an entry I wrote a while ago about <a href="http://orweblog.oclc.org/archives/002058.html">sourcing and scaling</a> (which is referenced in the article). In a shared network environment, one of the most interesting issues facing libraries is how appropriately to source and scale activities. </p>

<p>A few years ago, an activity like this would have been sourced within the institution: each library would have developed its own infrastructure, user interface, local community outreach, and so on. Now, such an impulse is questioned. It makes sense to source something like this collaboratively. And it is provided at the level of the network: its target user population is the population of web users. </p>

<p>Questions about sourcing and scaling are becoming much more common as the logic of the network reconfigures patterns of information production and use. What does it make sense to do at institutional level? What does it make sense to source elsewhere (repository services in the cloud, for example, or institutional email services from Google)? And what should be left entirely to other providers? At what level, or scale, is it best to do things? Locally, or within a consortium, or ....?</p>

<p>Think of four sourcing options: self (provided locally), collaborative (provided within a group), public (provided through state or national activity), or third party (provided by another commercial or non-commercial entity). </p>

<p>Think of three scaling options: local or institutional, group, and web scale. </p>

<p>These can be put together to give a variety of options. So, for example, Tripod, the shared catalog of Swarthmore, Haverford College and Bryn Mawr, is a collaboratively sourced group solution. PubMed is a web scale public offering. And, as already noted, Hathi Trust is a collaboratively sourced web scale service. </p>

<p>An interesting contrast between the US and many other parts of the world is that often what is done collaboratively in the US may be done through a public agency elsewhere.   For example, Christenson contrasts the HathiTrust as a collaborative activity with something that the JISC, an activity of the public higher education funding councils, might provide in the UK. It is also common in many countries outside the US to have publicly supported union catalogue and related activities. </p>

<p>We can observe two trends. First, there is a trend towards externalisation: libraries are looking to collaboratively source activities or to outsource them to third parties. Think of collaborative activities around managing down print collections here, the WEST project for example, or the growth of shared library systems (the Orbis Cascade Alliance, for example, recently issued an <a href="http://www.orbiscascade.org/index/shared-integrated-library-system-team-2011">RFI</a> about a shared integrated library system). Think of the growing interest in cloud-based sourcing of systems and services. </p>

<p>Second, there is a trend to 'move up' in the network, by doing more things at group level within consortia or public contexts (think of OhioLink or Summit), or by leveraging network level services (think of social networking sites, for example). </p>

<p>The current economic environment further encourages these trends. Institutions look for economies of scale through collaboration. And they also want to focus attention on high value areas, and outsource routine or shared activities.  </p>

<p>I was reminded of these issues while reading a very interesting internal report of the University of California on library services. This is the interim report of the systemwide Library Planning Task Force, convened under the auspices of the Systemwide Library and Scholarly Information Advisory Committee (<a href="http://libraries.universityofcalifornia.edu/planning/taskforce/">splashpage</a>, <a href="http://libraries.universityofcalifornia.edu/planning/taskforce/interim_report_package_2011-05-09.pdf">PDF</a> of report and related material).</p>

<p>A stark environmental picture is presented:</p>

<ul><li>The libraries will experience budget reductions of as much as $52M or 21% of their current budget base over the next six years. "To put this into perspective, this cut is greater than the total library budget of any single UC campus, and roughly equivalent to the budgets of three of our mid-sized campuses, all AAU members."</li>
	<li>The libraries will likely lose the equivalent of $17M in buying power in the same period given publisher price increases. "This is equivalent to the current library materials budgets of two mid-sized campuses, and means a reduction in the systemwide acquisition rate of about 200,000 items per year."</li>
	<li>Existing facilities will run out of space for new materials over the next 5-7 years, at the same time as "demand increases for extended hours and services and technologically well-equipped and flexible learning environments in the libraries' prime campus locations".</li></ul>

<p>They go on to observe that the impact of these factors can be mitigated through collaboration. They propose four strategies:</p>

<ol><li>Expand and collectively manage shared library services.</li>
	<li>Support faculty efforts to change the system of scholarly communication.</li>
	<li>Explore new sources of revenue.</li>
	<li>Improve the existing framework for systemwide planning, consultation, and decision-making.</li></ol>

<p>Of course, the University of California is an unusual institution, bringing together some of the world's major universities in a shared organizational framework. One result of this shared framework has been the <a href="http://www.cdlib.org/">California Digital Library</a>, which concentrates operational and innovation capacity for the whole system. CDL has been responsible for some major services, and is an active partner in the Hathi Trust. Another is the Regional Library Facilities, north and south, for managing print collections. A major recommendation is that the range of such shared services should grow, whether sourced within the UC universities or externally. </p>

<p>Cooperation is difficult, especially where money flows, and impact needs to be seen, at the institutional level. However, given the existing level of shared services, the organizational framework, and the pressures described in this report, it will be interesting to watch what services the UC libraries move to a shared environment over the next few years.</p>

<p>p.s. The report describes the WorldCat Local-based Next Generation Melvyl in these terms: <blockquote>The Next‐Generation Melvyl (NGM) initiative moves the discovery of information for researchers and students to the highest networked level. The initiative takes access to the highest level of aggregation and is vital for the most effective provision of information access and services. Strategically, NGM also positions the UC Libraries to provide aggregated access to a significantly increasing array of full‐text information resources: e.g., the millions of digitized books in the Google Books Project and the HathiTrust.</blockquote><br />
</p>]]></description>
		<author>dempseyl@oclc.org (Lorcan Dempsey)</author>
        <link>http://orweblog.oclc.org/archives/002175.html</link> 
        <guid>http://orweblog.oclc.org/archives/002175.html</guid> 
        
                    <category>Libraries - organization and services</category>
        

        <pubDate>Thu, 19 May 2011 20:21:17 -0500</pubDate>
 	</item> 

 </channel>
</rss>

