Lorcan Dempsey's weblog On libraries, services and networks.   

General - distributed environments

QOTD: mobile

 •  Categories: Books, movies and reading ... , General - distributed environments

I am writing a short piece on mobile communications at the moment and have been interested to see that the whole world is writing about the impact of mobile.

The Economist has a very nice special section with articles on a range of topics (see the display panel on the right of this opening section for a list of articles). There is almost no focus on the technology per se; rather, it looks at how our working and social lives, our buildings and our jobs, and our attitudes and expectations are being reconfigured. The emphasis is not on 'mobility' but on permanent connectivity in an environment where computational and communication capacity is increasingly pervasive. What is our world like when the network is not something that is 'out there', but when potentially all that we do is network-aware?

There are several sections; here is a note from the piece on space:

The fact that people are no longer tied to specific places for functions such as studying or learning, says Mr Mitchell, means that there is “a huge drop in demand for traditional, private, enclosed spaces” such as offices or classrooms, and simultaneously “a huge rise in demand for semi-public spaces that can be informally appropriated to ad-hoc workspaces”. This shift, he thinks, amounts to the biggest change in architecture in this century. In the 20th century architecture was about specialised structures—offices for working, cafeterias for eating, and so forth. This was necessary because workers needed to be near things such as landline phones, fax machines and filing cabinets, and because the economics of building materials favoured repetitive and simple structures, such as grid patterns for cubicles. [The new oases | Economist.com]

I particularly liked this section; it filled out the context for my suggestion a while ago that Starbucks has become 'on-demand space'.

And I was interested to see this little snippet the other day:

Who is the largest camera maker in the world? Nokia. Who is the largest manufacturer of music devices in the world? Nokia. Who is buying the company that provides the map data behind Mapquest? Nokia. [Our Cells, Ourselves - washingtonpost.com]


Google and OAI-PMH

 •  Categories: General - distributed environments , Metadata , Standards

There is an interesting note on the Google Webmaster Central Blog:

When we originally launched Sitemaps, we included support for the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) 2.0 protocol, an interoperability framework based on metadata harvesting. In the meantime, however, we've found that the information we gain from our support of OAI-PMH is disproportional to the amount of resources required to support it. Fewer than 200 sites are using OAI-PMH for Google Sitemaps at the moment.



In order to move forward with even better coverage of your websites, we have decided to support only the standard XML Sitemap format by May 2008. We are in the process of notifying sites using OAI-PMH to alert them of the change. [Official Google Webmaster Central Blog: Retiring support for OAI-PMH in Sitemaps]

Via Paul Walk, who remarks:

There are a few ways of looking at this. Perhaps ‘open access’ repositories are less concerned with Google rankings than the typical website owner. Perhaps the penetration of OAI-PMH in the world is still below any level that Google could find particularly interesting - certainly they never went to great lengths to advertise this support while it lasted. Clearly, Google have come to the end of a ‘trial period’ for their support for this protocol in their main indexing service. [paul walk’s weblog » Blog Archive » Google gives up on supporting OAI-PMH for Sitemaps]
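For a repository that has been exposing its records to Google through OAI-PMH, the change means producing a standard XML Sitemap instead. The following is a minimal sketch of both sides, not a description of any particular repository's setup: the base URL, record URLs, and dates are hypothetical placeholders, and the two function names are invented for illustration. The `ListRecords` verb and `metadataPrefix` argument come from OAI-PMH 2.0, and the `urlset`/`url`/`loc`/`lastmod` elements and namespace come from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def oai_list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build the URL of an OAI-PMH ListRecords request (the harvesting side)."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query


def build_sitemap(entries):
    """Serialize (url, last_modified) pairs as a minimal XML Sitemap string."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical repository endpoint and record pages.
    print(oai_list_records_url("http://repository.example.org/oai"))
    print(build_sitemap([
        ("http://repository.example.org/record/1", "2008-04-20"),
        ("http://repository.example.org/record/2", "2008-04-22"),
    ]))
```

The sitemap only lists locations and modification dates; unlike an OAI-PMH response, it carries no descriptive metadata, so the crawler has to obtain that from the pages themselves.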


Live Mesh - as if you needed another post about it

 •  Categories: General - distributed environments

Microsoft has announced its Live Mesh initiative: you must have heard the rumble? Another network word to join grid, graph, and web.

We will have to see how it rolls out, but it is a reminder of how much is at play as Google, Microsoft and a few others race to build out our network world for work, learning and living.

Here is a concise statement from the Live Mesh blog:

The core philosophy is to make it easy to manage information in a world where people have multiple computing experiences (i.e. PCs and applications, web sites, phones, video games, music and video devices) that they use in the context of different communities (i.e. myself, family, work, organizations). [Live Mesh : Live Mesh as a Platform]

The aim seems to be to allow you to synchronize and share: to synchronize your data across your different device and application environments and to share them within your various affiliation groups. Key components are the integration of local and cloud, and the use of feeds as connective tissue.

Jon Udell extends:

There’s another pattern for Live Mesh applications, one that’s less familiar. In this pattern, a website uses Live Mesh as a pipeline to communicate with Live Mesh users. If you’re running a travel site, or a bank, you can use that pipeline to transmit structured data to your users — for example, itineraries or transaction reports. It’s easy to create those XML feeds, you can leverage the Live Mesh infrastructure to deliver them securely and reliably at scale, they synchronize across all devices in each user’s Live Mesh, and they’re accessible to local applications using same RESTful feed APIs that were used to create them. [Jon Udell]

Check out the memo from Ray Ozzie, chief software architect, describing Microsoft's view of the current environment and how Live Mesh responds to it, and also the interview between Ray Ozzie and Jon Udell. Here is Ozzie on content:

Content has changed at both the “head” and the “tail”. The line between editorialized portals and blogs has blurred, and all are consumed through feeds. Beyond news, movies and music and television have all expanded to embrace the web. And the interrelation of content and community has created a world of “social media”, where both head and tail content is intrinsically social by virtue of community linking, tagging, and ranking. Relationships and collective behavioral intelligence have changed how we stay informed, find and share media, and interact with one another. [Full Text of Ray Ozzie Mesh Memo - ReadWriteWeb]

Phil Wainewright also has a positive assessment based on this early information:

Notice how the application no longer resides on a specific machine — quite a departure from Microsoft’s current licensing regime — but instead is defined in relation to the individual’s mesh. No wonder this isn’t even in beta yet. Imagine how much work has to be done before this can be delivered commercially. [Meshing the desktop into the cloud | Software as Services | ZDNet.com]


The big switch

 •  Categories: General - distributed environments , General - systems and technologies , Libraries - systems and technologies , OCLC

I have just finished Nicholas Carr's The Big Switch. Here is a sample:

The complexity and inefficiency of the client-server model have fed on themselves over the last quarter century. As companies continue to add more applications, they have to expand their data centers, install new machines, reprogram old ones, and hire ever larger numbers of technicians to keep everything running. When you also take into account that businesses have to buy backup equipment in case a server or storage system fails, you realize that, as studies indicate, most of the many trillions of dollars that companies have invested into information technology have gone to waste. [The Big Switch, p. 56]
Most of the software and almost all of the hardware that companies use today are essentially the same as the hardware and software their competitors use. Computers, storage systems, networking gear, and most widely used applications have all become commodities from the standpoint of the businesses that buy them. They don't distinguish one company from the next. The same goes for the employees who staff IT departments. Most perform routine maintenance chores - exactly the same tasks that their counterparts in other companies carry out. The replication of tens of thousands of independent data centers, all using similar hardware, running similar software, and employing similar kinds of workers, has imposed severe penalties on the economy. It has led to the overbuilding of IT assets in almost every sector of industry, dampening the productivity gains that can spring from computer automation. [The Big Switch, p. 57]

Carr makes an analogy with electric power. Many years ago, companies would have had their own power generators. This was very inefficient and we moved to a utility model, where generating capacity was concentrated and delivered to others over the electric grid. He foresees the emergence of a similar model with computing and applications, a movement to a utility model where capacity is delivered as required over the network.

Of course, as he notes, this is already upon us. Think of a couple of prominent examples: Amazon Web Services and the range of Salesforce.com's services.

Amazon provides computation, storage and other services on an on-demand basis. Werner Vogels, Amazon CTO, has an interesting presentation where he talks about Amazon's webscale services and discusses their rationale. The subtitle of the presentation is "compete on ideas, not resources". In terms that echo Carr's, he talks about the 70/30 switch, claiming that 70% of a firm's "time, energy and dollars is spent on undifferentiated heavy lifting" in building out infrastructure, while 30% is spent on "differentiated value creation". Amazon wants to help organizations reverse those numbers, reducing the time spent on undifferentiated, increasingly commodity, infrastructure.

I was looking at My Starbucks Idea the other day and was interested to see that it was powered by force.com. This is a suite of on-demand tools from Salesforce.com which claim to allow you to build enterprise applications without any custom development work. What immediately struck me was the way in which the service was promoted, echoing Carr and Vogels: the strapline is "Finally, focus on innovation, not infrastructure". I liked their line:

Free up the dollars wasted "keeping the lights on"

with a zero-infrastructure model.

The 'big switch' is going to be a major issue for libraries over the next few years. They spend too much time getting their systems to work, and not enough time putting them to work.

Of course, much will depend on what types of services are available to libraries from their providers, and it will be interesting to see how those providers reconfigure their offerings in coming years and what new providers emerge.

Note: I was prompted to note the Big Switch after reading and commenting on Mark Dahl's post here.


Microformats

 •  Categories: General - distributed environments , Metadata

mfebooks.png

I recently installed the Operator extension in my browser.

Operator leverages microformats and other semantic data that are already available on many web pages to provide new ways to interact with web services. [Operator :: Firefox Add-ons]

Interesting to see it in action on the JISC National eBooks Observatory page above. It recognizes address and contact data. Clicking on address displays an address, and offers to show it on Google and Yahoo maps.
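To give a sense of what a tool like this has to work with, here is a rough sketch of pulling hCard contact data out of a page using only the Python standard library. It is not how Operator itself is implemented; the HTML snippet is an invented example rather than the actual markup of the JISC page, and `HCardParser` and `parse_hcard` are hypothetical names. The class names (`vcard`, `fn`, `org`, `street-address`, `locality`, `tel`) are hCard properties.

```python
from html.parser import HTMLParser

# A small subset of hCard property class names to look for.
HCARD_PROPERTIES = {"fn", "org", "street-address", "locality", "postal-code", "tel"}


class HCardParser(HTMLParser):
    """Collect the text of hCard properties found inside a class="vcard" element.

    Simplification: assumes well-nested markup with no void elements
    (such as <br>) inside the card, and reads only the first card fully.
    """

    def __init__(self):
        super().__init__()
        self.depth = 0        # nesting depth inside the vcard element (0 = outside)
        self.current = set()  # property names attached to the innermost open element
        self.card = {}

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self.depth:
            self.depth += 1
            self.current = HCARD_PROPERTIES.intersection(classes)
        elif "vcard" in classes:
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1
        self.current = set()

    def handle_data(self, data):
        text = data.strip()
        if text:
            for name in self.current:
                self.card[name] = self.card.get(name, "") + text


def parse_hcard(html):
    parser = HCardParser()
    parser.feed(html)
    return parser.card


SAMPLE = """
<div class="vcard">
  <span class="fn org">Example Library</span>
  <div class="adr">
    <span class="street-address">1 Example Street</span>,
    <span class="locality">Exampletown</span>
  </div>
  <span class="tel">555-0100</span>
</div>
"""

if __name__ == "__main__":
    print(parse_hcard(SAMPLE))
```

The point is that the structure is already in the page as ordinary class attributes: once the address fields are identified, handing them to a mapping service is a small step.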

I could also have used LinkedIn as an example, where some structured data is also exposed using microformats. LinkedIn is one of the examples used by Yahoo in its much-discussed announcement the other day about its support for various structured formats.

While there has been remarkable progress made toward understanding the semantics of web content, the benefits of a data web have not reached the mainstream consumer. Without a killer semantic web app for consumers, site owners have been reluctant to support standards like RDF, or even microformats. We believe that app can be web search.
By supporting semantic web standards, Yahoo! Search and site owners can bring a far richer and more useful search experience to consumers. For example, by marking up its profile pages with microformats, LinkedIn can allow Yahoo! Search and others to understand the semantic content and the relationships of the many components of its site. With a richer understanding of LinkedIn's structured data included in our index, we will be able to present users with more compelling and useful search results for their site. The benefit to LinkedIn is, of course, increased traffic quality and quantity from sites like Yahoo! Search that utilize its structured data. [Yahoo! Search Blog: The Yahoo! Search Open Ecosystem]

The entry goes on to describe the support that Yahoo will be providing for microformats, metadata vocabularies and OpenSearch.

Google released its Social Graph API a little while ago.

It will be interesting to see what incentives the concentrating power of Google and Yahoo provide for the more widespread diffusion of structure in support of a web of data.


Mashups

 •  Categories: Books, movies and reading ... , General - distributed environments

Raymond Yee is the originator of the useful triple "gather, create, share", and is known for his work on the Scholar's Box. He lectures at the UC Berkeley School of Information.

I have just got a copy of his book:

Yee, Raymond. Pro Web 2.0 Mashups: Remixing Data and Web Services. Berkeley, CA: Apress, 2008.

There is an accompanying blog, Mashup Guide, which also has some material from the book, including a table of contents. The table of contents shows the wide range of approaches and services he looks at.

Here is a scope note from the introduction:

The overall flow of the book is: What can be done with no programming -> programming of one system (through its API) -> figuring out how to combine 2 or several systems -> creating "service composition frameworks" for combining arbitrary systems.

It would be easy to veer off into heavy-duty theory in this book. Instead, we will keep grounded in "practical interoperability" (a grab-what-we-can-from-wherever approach) while dipping into the deeper pools of grand unification efforts (such as the full semantic web vision) that have so far not come to full fruition.

It is nice to see LibraryLookup used as an introductory example in a mainstream text like this. And to see some discussion of LCSH, FAST and Dewey in the chapter on tagging.

Yee claims that the book is useful for the experienced developer as well as for more novice users (with some knowledge of HTML, CSS and JavaScript).

I have passed the book over to my colleague, Ralph Levan, for a more technical review. I will point to his review when it is done.


The two ways of Web 2.0

 •  Categories: Featured , General - distributed environments , Libraries - distributed environments , Libraries - organization and services , Social networking

I find Web 2.0 increasingly confusing as a label; no surprise there. This is not just because of its essential vagueness, but because I think it tends to be used in a couple of very different ways. Where this happens there is bound to be some confusion. Schematically, I will use the labels 'diffusion' and 'concentration' for these two ways.

Diffusion is probably the more dominant of the two. Here it covers a range of tools and techniques which create richer connectivity between people, applications and data; which support writers as well as readers; which provide richer presentation environments. What tends to get discussed here are blogs and wikis; RSS; social networking; crowdsourcing of content; websites made programmable through web services and simple APIs; simple service composition environments; Ajax, Flex, Silverlight; and so on.

Concentration is a major characteristic of our network experience, which often involves major gravitational hubs (Google, Amazon, Flickr, Facebook, propertyfinder.com). These concentrate data, users (as providers and consumers), and communications and computational capacity. They build value by collaboratively sourcing the creation of powerful data assets with their users. The value grows with the reinforcing property of network effects: the more people who participate, the more valuable they become. And opening up these platforms through web services creates more network effects. These sites also mobilize usage data to reflexively adapt their services, to better target particular users or to identify design directions. Of course, these platforms are very closely controlled, and there is an interesting balance of interests between openness and control at various levels in how they manage resources (see for example my discussion of the Amazon and Google APIs).

Interestingly, if you trace Tim O'Reilly's writings on Web 2.0 since the publication of his major defining article you see an emphasis on what I have called 'concentration' come through. (See my note on an interview with Tim O'Reilly by David Weinberger, on which I draw above, and also see O'Reilly blog posts here and here.)

Now, of course, 'concentration' and 'diffusion' are often complementary approaches. The major Internet hubs 'diffuse' their benefits through service and data syndication, APIs, participation, etc., but their value often derives from successfully driving network effects through wide participation and consolidation of data. In fact, many of the 'diffusion' techniques work best when associated with concentrating applications. Think of tagging for example. People have incentives to tag their resources in Flickr or LibraryThing in ways that may not obtain in the library catalog. Scale matters in the context of the social value created in these services (of course, in these examples, folks are also tagging their own resources). You cannot simply add social networking to a site and expect it to work well. Think of all those empty forums.

Much of the library discussion of Web 2.0 is about 'diffusion', about a set of techniques for richer interaction. It is appropriate that libraries should offer an experience that is continuous with how people experience the web.

However, there is a very important way in which the library experience is not continuous with the web. It remains fragmented: it does not have the characteristics of the concentrating, gravitational hubs which characterize so much web use, and are so much a part of O'Reilly's Web 2.0. Fragmented by database boundary, by service boundary (e.g. connecting a discovery experience gracefully to a fulfillment experience through resolution), by library boundary. We are now familiar with the comparison between this fragmented experience and discovery on the web. And we are also familiar with discussion of how the library presence is weakly represented in the major network presences.

However, think also of the library management environment. Think for example of places where data needs to be concentrated to create value: aggregating user data across sites (e.g. counter data), or aggregating user created data (tags, reviews), or aggregating transactions (e.g. circulations, resolver clickthroughs). Motivations here are to drive business intelligence which allows services to be refined (e.g. how does my database usage compare to that of my peer group), to develop targeted services (people who like this, also liked that), to improve local services (e.g. add tags or reviews). These are examples where scale matters, where data may need to be concentrated above the individual library level.

And we are seeing for-fee services emerge which address this need. LibraryThing, for example, syndicates its user-generated tagging to libraries. I am not sure that ScholarlyStats provides a service which compares usage across libraries; it would be interesting to know if there were demand for such a thing.

This then touches on larger questions about sourcing decisions (in what combination of local, collaborative, and third party do libraries acquire their service capacities) and about concentration of library presence (in what combination of library or library and third party are services offered).

For example, I discussed Georgia Pines and OhioLink the other day as examples of groups of libraries collaboratively sourcing a concentrated library presence which increases their gravitational pull.

And libraries are beginning to think more seriously about sourcing services with central web presences. Think for example of the decisions made by the National Library of Australia and the Library of Congress when they chose to use Flickr for significant image projects. NLA is seeking to expand the coverage of PictureAustralia; LC is seeking to collect tags from viewers. In each case, the library wants to benefit from the concentration of users and data that Flickr has created on the web. And to suggest another example, Andy Powell has been raising some intriguing questions about how repository services should be sourced in ways that, again, map onto people's experience of the web: would a consolidated network level service be more motivating than a series of institutional presences? (see here and here). Social networking or other services, he suggests, might flourish at this network level in ways that are not feasible at the institutional level.

When we discuss Web 2.0, there is a temptation to think about blogs and wikis, RSS and a Facebook application, and to stop there. There is also some useful thinking about how to expose web services or data in ways that they can be remixed into other applications. However, Web 2.0 is also about concentration, concentration of data, of users and of communications. We need also to think about how libraries reconfigure services in an environment of network level gravitational hubs, driven by network effects. This will involve greater concentration of library resources in various ways, and also - probably? - greater reliance on other web presences to deliver their services.


Let me be open with you ...

 •  Categories: General - distributed environments , Marketing , Standards

'open' is a word that usually needs to be qualified to be of any use in our conversations. Standing on its own, it is not clear what it means. Unless qualified, the word is like 'home made', 'new' or 'natural': a widely applied promotional label with little informational value.

The storm in a teacup around OpenTranslators is interesting in this context. This is a hosted service from Care Affiliates, working with Index Data and WebFeat. It is a nice idea.

OpenTranslators will allow libraries to use the federated search interface of their choice to access over 10,000 databases using SRU/SRW/Z39.50. The databases consist of: licensed databases, free databases, catalogs, Z39.50, Telnet and proprietary databases. Libraries that already have a Z39.50 client in their OPAC will be able to connect to, not only library catalogs, but also thousands of additional databases. Those libraries that are building or already using an open source federated search tool will now be able to expand the world of information that can be accessed. Finally, for those institutions/organizations building new mashup clients, this will allow them to access and use vast amounts of additional content. [OpenTranslators; the ability to choose the Federated Search Interface and Content of your choice using open standards.]

This is open in the sense that it is placing a standards-based layer over a bundle of useful functionality. The translators can be accessed through a well-defined public interface (in this case SRU/SRW/Z39.50), the definition of which is under the control of no single organization. This is a well-established and long-standing sense of 'open', as in 'open standards' or 'open systems'.

However, calling a service 'open' in this sense says absolutely nothing about the business model or configuration under which it is made available. It might be available for free, on a fee-for-use basis, as a subscription service; it might be available as locally deployable software, as a service in the cloud, and so on. In this case, Care Affiliates are making a subscription service available to users on a hosted basis.

To use the service you need appropriate client (SRU/SRW/Z39.50) capability. There are many choices here. Libraries will have this capability as part of software they buy from a vendor, or in some cases as part of an open source package. There is no necessary link between 'open source' and the use of the 'OpenTranslators' service.
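At its simplest, SRU client capability amounts to sending an HTTP request with a handful of standard parameters and parsing the XML that comes back. The sketch below builds such a request URL. The parameter names (`operation`, `version`, `query`, `maximumRecords`) are from SRU 1.1 and the query is written in CQL; the endpoint address and the `sru_search_url` helper are hypothetical and do not describe the actual OpenTranslators service.

```python
from urllib.parse import urlencode


def sru_search_url(base_url, cql_query, max_records=10, version="1.1"):
    """Build an SRU searchRetrieve request URL for a CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": version,
        "query": cql_query,
        "maximumRecords": max_records,
    }
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical endpoint; any SRU server would be addressed the same way.
    print(sru_search_url("http://sru.example.org/db", 'dc.title = "big switch"', 5))
```

Because the interface is defined by the standard rather than by the provider, the same function would work against any conforming server, whether it is free, subscription-based, hosted, or installed locally.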

While this new development was generally welcomed, some of the responses discussed the service in terms of 'open source' or even 'open access'.

This prompted the following from Dan Chudnov (who had taken notes and was naming names). Language mattered, he suggested, and .....

Open source, open access, and open standards are completely different activities undertaken by completely different combinations of people in completely different circumstances. To conflate them all because of the common word "open" is shortsighted enough - to misapply the terms against the intent of the proponents of each of these separate categories of endeavors is to sow distrust. [Welcome to 1998 | One Big Library.]

Looking at these exchanges I was reminded of early discussion around OAI where the 'O' for open looked towards 'open access' but also towards open standards independent of the business model supporting the application.

The Open Archives Initiative develops and promotes interoperability standards that aim to facilitate the efficient dissemination of content. The Open Archives Initiative has its roots in an effort to enhance access to e-print archives as a means of increasing the availability of scholarly communication. Continued support of this work remains a cornerstone of the Open Archives program. The fundamental technological framework and standards that are developing to support this work are, however, independent of the both the type of content offered and the economic mechanisms surrounding that content, and promise to have much broader relevance in opening up access to a range of digital materials. [Open Archives Initiative]


Web-Scale development: "the data center is the computer"

 •  Categories: General - distributed environments

More computing is happening in the cloud. Personal and corporate applications are being sourced from network based services. 'cloud' is a good way of describing the consumer experience: the consumer does not have to worry about the details of implementation or carry the burden of physical plant. However, it is a little misleading in an important way. As users pull more from the cloud, providers increasingly concentrate capacity to meet the need. We move to a utility model. Such utilities require large computational capacity, and considerable physical plant (space, power, cooling, ...). The cloud is fed from massive physical infrastructure.

One of the interesting stories of our times is the rush to build large processing plants by Google, Microsoft and others. Web-scale computing exists alongside major physical presences.

Communications of the ACM is celebrating its 50th anniversary with a special issue. It has a couple of short articles which address issues of operating at this scale.

David Patterson talks about the dramatic differences between "developing software for millions to use as a service versus distributing software for millions to run their PCs" (Technical perspective: the data center is the computer). He quotes Luiz Barroso of Google: "The data center is now the computer". He talks about the challenges of writing applications where the target deployment environment is a data center, and introduces MapReduce, an approach Google developed to address this issue. The challenge is to architect for large systems made up of thousands of individual computers.

Jeffrey Dean and Sanjay Ghemawat of Google describe the MapReduce programming model (MapReduce: simplified data processing on large clusters). What struck me here was the scale of operation. Here is the abstract:

MapReduce is a programming model and an associated implementation for processing and generating large datasets that is amenable to a broad variety of real-world tasks. Users specify the computation in terms of a map and a reduce function, and the underlying runtime system automatically parallelizes the computation across large-scale clusters of machines, handles machine failures, and schedules inter-machine communication to make efficient use of the network and disks. Programmers find the system easy to use: more than ten thousand distinct MapReduce programs have been implemented internally at Google over the past four years, and an average of one hundred thousand MapReduce jobs are executed on Google's clusters every day, processing a total of more than twenty petabytes of data per day. [MapReduce]

Head-spinning, as Niall Kennedy suggests:

It's some fascinating large-scale processing data that makes your head spin and appreciate the years of distributed computing fine-tuning applied to today's large problems. [Google processes over 20 petabytes of data per day]

My colleague Thom Hickey introduces MapReduce here and talks about using it here.
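For readers who have not met the model, the split into a map function and a reduce function described in the abstract can be illustrated with the usual word-count example. This is a toy, single-process sketch and not Google's implementation: `run_mapreduce` is a hypothetical stand-in for the runtime, and it does none of the parallelization, failure handling, or scheduling that make the real system interesting.

```python
from collections import defaultdict


def map_fn(doc_id, text):
    """Map: emit an intermediate (word, 1) pair for every word in a document."""
    for word in text.lower().split():
        yield word, 1


def reduce_fn(word, counts):
    """Reduce: combine all intermediate values emitted for one key."""
    return word, sum(counts)


def run_mapreduce(inputs, map_fn, reduce_fn):
    """Toy runtime: apply map, group intermediate values by key, then reduce."""
    groups = defaultdict(list)
    for key, value in inputs:
        for out_key, out_value in map_fn(key, value):
            groups[out_key].append(out_value)
    return dict(reduce_fn(key, values) for key, values in groups.items())


if __name__ == "__main__":
    documents = [("a", "the cloud the grid"), ("b", "the web")]
    print(run_mapreduce(documents, map_fn, reduce_fn))
```

The programmer writes only `map_fn` and `reduce_fn`. Since each map call sees one input and each reduce call sees one key, a real runtime is free to spread those calls across thousands of machines.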


QOTD: clouds and control

 •  Categories: General - distributed environments

A Wired interviewer asks Nick Carr when the Big Switch from the desktop to the cloud will happen.

Carr: Most people are already there. Young people in particular spend way more time using so-called cloud apps — MySpace, Flickr, Gmail — than running old-fashioned programs on their hard drives. What's amazing is that this shift from private to public software has happened without us even noticing it. [Q&A: Author Nicholas Carr on the Terrifying Future of Computing]

And about the balance between individual liberation and institutional control:

Carr: Computers are technologies of liberation, but they're also technologies of control. It's great that everyone is empowered to write blogs, upload videos to YouTube, and promote themselves on Facebook. But as systems become more centralized — as personal data becomes more exposed and data-mining software grows in sophistication — the interests of control will gain the upper hand. If you're looking to monitor and manipulate people, you couldn't design a better machine. [Q&A: Author Nicholas Carr on the Terrifying Future of Computing]

Interoperability

 •  Categories: General - distributed environments , Miscellaneous

Driving to yet another shopping opportunity this evening, I heard a story about 'e-prescriptions' on NPR. I was gobsmacked, as they say, when I heard a snippet from John Kerry, which included the word, interoperability, yes, that's interoperability.

I looked up the transcript later:

"We have the technology. We have the interoperability. We know how to make this happen. But not enough people are embracing this rapidly enough," said Sen. John Kerry (D-MA) at a Capitol Hill news conference last week. Kerry, along with several other Democrats and Republicans, is sponsoring a bill that would initially give doctors a bonus in their Medicare payments if they start e-prescribing. [NPR : Congress Looks to Require Electronic Prescriptions]

I wonder is it a word that current presidential hopefuls would be advised to use or to avoid ;-)

Moving to the network level

 •  Categories: General - distributed environments , The cultural and scholarly record

It was an interesting week for announcements about network level services.

Amazon announced SimpleDB:

Amazon SimpleDB is a web service for running queries on structured data in real time. This service works in close conjunction with Amazon Simple Storage Service (Amazon S3) and Amazon Elastic Compute Cloud (Amazon EC2), collectively providing the ability to store, process and query data sets in the cloud. These services are designed to make web-scale computing easier and more cost-effective for developers. [Amazon.com: Amazon SimpleDB, Amazon Web Services]

Enough to make Nick Carr suggest in his commentary that a 'tipping point' approaches. Tipping, that is, from locally deployed software to processing capacity available on demand in the 'cloud'.

And then, Dan Cohen wrote about the Zotero Commons:

The Zotero-IA alliance will create a “Zotero Commons” into which scholarly materials can be added simply via the Zotero client. Almost every scholar and researcher has documents that they have scanned (some of which are in the public domain), finding aids they have created, or bibliographies on topics of interest. Currently there is no easy way to share these; giving them a central home at the Internet Archive will archive them permanently (before they are lost on personal hard drives) and make them broadly available to others. [Dan Cohen’s Digital Humanities Blog]

One of the benefits of network-level services like Flickr or SlideShare is that they allow you to 'add' your materials to the public web and provide you with a URL to facilitate sharing. This is a motivation here also: "one of the great advantages of the Zotero Commons at IA will be the transport of scholarly materials currently residing on personal hard drives to a public space with stable, rather than local, addresses". It will be interesting to see if the Zotero Commons develops the network effects that characterise various of the successful network level services.

QOTD: a feed-based universe

 •  Categories: General - distributed environments , User experience

The BBC website is a major network hub. It has been through a redesign and a beta site is available for inspection. The home 'page' becomes a "page composition layer".

From a conceptual point of view, the widgetization adopted by Facebook, iGoogle and netvibes weighed strongly on our initial thinking. We wanted to build the foundation and DNA of the new site in line with the ongoing trend and evolution of the Internet towards dynamically generated and syndicable content through technologies like RSS, atom and xml. This trend essentially abstracts the content from its presentation and distribution, atomizing content into a feed-based universe. Browsers, devices, etc therefore become lenses through which this content can be collected, tailored and consumed by the audience. [BBC Internet Blog - A lick of paint for the BBC homepage]

I will be interested to see how this plays out. Will people actually spend a lot of time customizing it for their use?
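The 'lenses' idea in the BBC quote, where content is abstracted from its presentation, can be sketched in a few lines of Python. This is an illustration only and not anything the BBC has described: the sample feed, the URLs and the function names (`parse_items`, `as_text`, `as_html`) are all assumptions. The same parsed items are simply rendered two different ways.

```python
import xml.etree.ElementTree as ET
from html import escape

# A made-up RSS 2.0 fragment standing in for a real feed.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example headlines</title>
    <item><title>First story</title><link>http://example.org/1</link></item>
    <item><title>Second story</title><link>http://example.org/2</link></item>
  </channel>
</rss>"""


def parse_items(feed_xml):
    """Extract title and link from each item, ignoring presentation entirely."""
    root = ET.fromstring(feed_xml)
    return [
        {"title": item.findtext("title"), "link": item.findtext("link")}
        for item in root.iter("item")
    ]


def as_text(items):
    """One 'lens': a plain-text list, e.g. for a small device."""
    return "\n".join(f"* {i['title']} <{i['link']}>" for i in items)


def as_html(items):
    """Another 'lens': an HTML list, e.g. for a widget on someone else's page."""
    rows = "".join(
        f'<li><a href="{escape(i["link"], quote=True)}">{escape(i["title"])}</a></li>'
        for i in items
    )
    return f"<ul>{rows}</ul>"


items = parse_items(SAMPLE_FEED)
text_view = as_text(items)
html_view = as_html(items)
```

Neither rendering function knows where the content came from, which is the point of the quote: the feed is the unit of distribution, and the page is just one consumer of it.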

The personal to global traverse

 •  Categories: Digital asset management , General - distributed environments , Research, learning and scholarly communication , Social networking , User experience

Network services have accustomed us to move from the personal to the global. Think of iTunes. I have my own local library on my PC which I can synchronize with mobile devices. It is also tightly integrated with the global network iTunes. And the MiniStore uses aggregate buying patterns to make recommendations to me based on what I have in my 'library'.

Variations of this pattern are repeated everywhere. Flixster allows me to rate movies, and relates my ratings to those of my 'friends' and to the aggregate global network level (Flixster drives the Movies application in Facebook). del.icio.us, LibraryThing, Flickr: I can move from my own collection to a global resource in various ways, often assisted by navigational features based on shared attributes across collections and items.

Of course, the dynamic is different in different places. In LibraryThing, for example, the 'global' data level is made up from aggregate personal collections, and central to the service is the idea that connections between our collections are important connections between us. In iTunes, the 'global' data level is already provided as an indication of available purchases, and I do not get to see other people's collections. Although, as already suggested, I benefit from 'hints' based on aggregate buying decisions. In this way, the balance between 'personal' and 'social' value varies across services.

At the same time, we have seen a related interest in all sorts of ways in creating personal collections which may draw materials from many services. Look at Zotero or the work of the SIMILE project, for instance. These personal collections may or may not connect up to global or shared data layers.

Whatever the context, and whether or not the service has a social orientation, the idea of traversing from the personal to the global is becoming an important characteristic of our web experience. Yet another thing for libraries to think about as they work towards reconfiguring services for the web environment ...

Aside: I am reminded of Dan Chudnov's suggestion that the professional mission of librarians is to help people build their own libraries.

Some notelets on Facebook and the social graph

 •  Categories: General - distributed environments , Knowledge organization and representation , Social networking

Some holiday morning notelets ....

1. The social graph in action. I felt a tremor in the social graph this week. A bundle of my Facebook befrienders attended the CETIS conference. I was suddenly aware of status lines, notes, imported blog entries. I had a sense of some of what was discussed and could follow up if I wanted. It happened in the background. It was like the weather: I had a sense of what was happening without having to do much investigation. Incidentally, CETIS have done a nice job in collecting some of the network amplification of the conference on the website: blog posts, del.icio.us bookmarks, and so on.

2. The social graph, not. Facebook's flatness does not very well accommodate our layered and multidimensional social lives. A lot to talk about there, but this is still a holiday morning notelet .... To pick a simple and relatively straightforward example: what to do with an unwelcome invitation to be a 'friend' from your boss? I assume we will see a more nuanced way of managing the ways in which we present ourselves emerge over time. Which raises issues about how we port or share our represented identities, something that we do not do well now. The social graph is site-specific.

3. Net, web, graph. Tim Berners-Lee gave the social graph expression a lift yesterday in a post about the evolution of our networked environment. He talks about a net/web/graph stack. The 'net' allowed us to address computers directly, abstracting away from the underlying connection paths. The 'web' allowed us to address documents, abstracting away from the machines on which they reside. In each case, new and unanticipated value was built on the navigable spaces the net and the web created. The 'graph', Tim Berners-Lee suggests, allows us to work with the things that documents are about, friends, flights, proteins, customers and so on, abstracted away from the documents or sites themselves. If represented appropriately, and he uses the example of FOAF, applications can combine and recombine data about things across multiple documents and sites. So, an application could combine what various sites know about me and my relationships. So yes, in these terms, the social graph meets the semantic web. Of course, we have yet to see whether Facebook believes that the social graph is actually greater than the Facebook graph.
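As a rough picture of what 'combining what various sites know about me' might look like, here is a small Python sketch. It assumes, hypothetically, that two sites each publish statements as (subject, predicate, object) triples and identify people by the same URI. The URIs, the people and the helper names are invented for illustration; `foaf:name` and `foaf:knows` are properties from the FOAF vocabulary, written here as plain strings rather than parsed from real RDF.

```python
from collections import defaultdict

# Hypothetical identifiers; the example.org URIs are placeholders.
ALICE = "http://example.org/people/alice#me"
BOB = "http://example.org/people/bob#me"
CAROL = "http://example.org/people/carol#me"

# What two different (imaginary) sites say, as subject/predicate/object triples.
site_a = [
    (ALICE, "foaf:name", "Alice"),
    (ALICE, "foaf:knows", BOB),
]
site_b = [
    (ALICE, "foaf:knows", CAROL),
    (CAROL, "foaf:name", "Carol"),
]


def merge_graphs(*sources):
    """Union the triples from every source; shared URIs make them line up."""
    merged = set()
    for triples in sources:
        merged.update(triples)
    return merged


def describe(graph, subject):
    """Collect everything the merged graph says about one thing."""
    description = defaultdict(set)
    for s, p, o in graph:
        if s == subject:
            description[p].add(o)
    return dict(description)


graph = merge_graphs(site_a, site_b)
alice = describe(graph, ALICE)
```

After the merge, Alice's relationships from both sites sit side by side. The merge only works because both sites used the same identifier for her, which is the portability problem raised in the second notelet.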

Search web service

 •  Categories: General - distributed environments , Libraries - distributed environments , Search , Standards

Under the auspices of OASIS appears a discussion document about the 'search web service'.

The Search web service is a means of opening a database to external enquiry in a standardized manner that facilitates discovery of query and response possibilities and makes it possible for heterogeneous databases to be queried simultaneously with the same or similar queries. Client software can be easily configured using a standardized XML explain document that is accessible from the base URL or via the explain operation. In contrast with protocols such as SQL and XQuery, detailed knowledge of a database’s structure is not necessary as the explain document contains parsable information on server defaults, searchable indexes and record schemas that are returned in the response. [OASIS Specification Template]

There is a cryptic note about its relationship to SRU:

This specification is based on the SRU (Search Retrieve via URL) specification which can be found at http://www.loc.gov/standards/sru/. It is expected that this standard, when published, will deviate from SRU. How much it will deviate cannot be predicted at this time. The fact that the SRU spec is used as a starting point for development should not be cause for concern that this might be an effort to fast track SRU. The committee hopes to preserve the useful features of SRU, but not to preserve those that are not considered useful. [OASIS Specification Template]

There is a wiki for the OASIS group working on this.
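To make the explain and search operations a little more concrete, here is a hedged Python sketch that builds request URLs in the style of SRU as published by the Library of Congress. Since the OASIS document says it expects to deviate from SRU, this should not be read as the eventual standard. The base URL is hypothetical, the helper names are my own, and whether a given server supports the `dc.title` index is exactly the kind of thing its explain document would tell you.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical endpoint; substitute the base URL of a real SRU server.
BASE_URL = "http://sru.example.org/catalogue"


def explain_url(base_url, version="1.1"):
    """URL for the explain operation, which describes the server's indexes and schemas."""
    params = {"operation": "explain", "version": version}
    return f"{base_url}?{urlencode(params)}"


def search_url(base_url, cql_query, max_records=10, version="1.1"):
    """URL for a searchRetrieve request carrying a CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": version,
        "query": cql_query,  # urlencode takes care of quoting spaces and quotes
        "maximumRecords": max_records,
    }
    return f"{base_url}?{urlencode(params)}"


explain = explain_url(BASE_URL)
search = search_url(BASE_URL, 'dc.title = "distributed environments"')
```

Nothing is sent over the network here; the sketch only shows that the whole request lives in the URL, so the same query string could be pointed at several heterogeneous databases by changing nothing but the base URL.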

Discovery happens elsewhere, again

 •  Categories: Books, movies and reading ... , General - distributed environments , Search

The decision by the New York Times to open up for general reading the formerly for-fee TimesSelect parts of its website is being widely discussed. The rationale given is interesting.

Since we launched TimesSelect in 2005, the online landscape has altered significantly. Readers increasingly find news through search, as well as through social networks, blogs and other online sources. In light of this shift, we believe offering unfettered access to New York Times reporting and analysis best serves the interest of our readers, our brand and the long-term vitality of our journalism. We encourage everyone to read our news and opinion – as well as share it, link to it and comment on it. [A Letter to Readers About TimesSelect - New York Times]

This is another indication that discovery happens elsewhere. The material is currently not available to people who come to the website, but more importantly it is not available for crawling, linking, quoting, commenting. It is not open to the web. The website is not the focus of a user's attention: the web is, and for material to be discoverable it must be open to the ways in which web users discover and share materials .... elsewhere.

Incidentally, I was struck by the comment that the online landscape has changed significantly in two years. That's two years!

Discovery happens elsewhere

 •  Categories: General - distributed environments , Libraries - distributed environments , Marketing , Search , User experience

I have been using the phrase 'discovery happens elsewhere' in recent presentations. I think it captures quite nicely an increasingly important part of how we think about our services.

No single website is the sole focus of a user's attention. Increasingly people discover websites, or encounter content from them, in a variety of places. These may be network level services (Google, ...), or personal services (my RSS aggregator or 'webtop'), or services which allow me to traverse from personal to network (Delicious, LibraryThing, ...).

This means thinking about services in different ways. About how we disclose stuff to other discovery environments; about where our metadata is; about URL structures, RSS feeds, and so on.

I have suggested before that it would be an interesting experiment to think about our services as if they had no user interface. Here maybe it would be interesting to think about services as if they could only be reached from some other place. It makes you think about the variety of other places that discovery happens.

Credits. 'Discovery happens elsewhere' is influenced by Steve Rubel's use of the phrase 'traffic happens elsewhere' in his discussion of what he calls the 'cut and paste' web.

Syndicating moi

 •  Categories: General - distributed environments , Marketing , User experience

I was looking at Macenstein earlier. And was slightly surprised after a while to notice that amongst the Amazon ads appearing on the right were some for very familiar stuff.

Turns out that Macenstein is feeding me my own personalized Amazon data. And it looks as if other people might be similarly surprised as there are several links to 'privacy information'.

This explains that Macenstein is an Amazon Associate Web site.

Your browser automatically sends any Amazon cookies on your computer to our server when you view this type of Amazon.com link on an Associate Web site. (For more information about how Amazon uses cookies, see our Privacy Notice.)
Although we may use your Amazon.com cookie to determine whether you are a recognized Amazon visitor and to offer personalized content (such as product recommendations) and special offers, we do not keep or attempt to construct a record of the Web sites you visit. [Amazon.com Associates Privacy Information]

And in response to a question about whether these 'placements' are the same as the personalized recommendations you would see on Amazon itself:

They can be. The products you see listed when visiting an Amazon Associate's site can be based on a variety of factors, such as that site's topics and sales history. We might also show you items based on your own personal purchase history at Amazon.com. The Associate Web site hosting this Amazon.com link does not have access to these "personalized" recommendations. [Amazon.com Associates Privacy Information]

I thought this was interesting for several reasons. It is another example of how what we see is increasingly situational, dependent on what a service knows about us. I saw this from our home machine. I would see something different from my work machine. It is also an example of what Steve Rubel says in a very interesting post about making content embeddable in fine-grained ways: "traffic is becoming something that happens elsewhere".

In the very near future portals including iGoogle, My Yahoo and Netvibes as well as social networks will be able to easily inhale the smallest pieces of content from across the web. Don't wait. Start now to make everything on your website embeddable. Traffic is becoming something that happens elsewhere, not just on your site. [Micro Persuasion]

Incidentally, I was looking at Macenstein because of the Steve Wozniak story noted by John Naughton.

The Rubel reference, well worth a read, is via dlf-dispatches. Twittering Stu also discusses it.

A catalogue in your face

 •  Categories: General - distributed environments , Libraries - distributed environments , Libraries - organization and services , Metadata , Social networking , User experience

I was interested to see the Page Tools in the University of Alberta catalogue (look in the left hand bar below). A reader can send a correction or suggestion to the library: it would be interesting to know how many folks use this option and what types of suggestion are made. Also interesting to see the ability to save the page to various bookmarking sites.

Folks will notice that I am looking at this through the library's new Facebook application.

We are pleased to announce the University of Alberta Libraries Facebook application. This new application allows access to our library catalogue, Ask Us Services, RefWorks and Get It Citation linker from within the Facebook platform. [Library News » Library Services Available Through Facebook]

As libraries place themselves more and more in other environments it would also be good to begin to see some numbers about what the impact on use of services is.

[Screenshot: University of Alberta catalogue viewed through the library's Facebook application (ualbertasmaller.png)]

Web world

 •  Categories: General - distributed environments , Miscellaneous

I answered a short questionnaire the other day on libraries and technology. I found myself wanting to rewrite some of the questions to avoid the sense they gave of technology as an external agent acting on libraries. Libraries are co-evolving with people's behaviors which are themselves being reconfigured in a network environment. Technology is not external. I was interested to read this general piece about the emergence of the web by Fintan O'Toole this morning in the Irish Times:

Yet all of these contradictions point to the ultimate power of this creation: its uncanny ability to mirror humanity. Unlike all previous new technologies, it is not a set of tools outside of ourselves. The web meshes machines and people more completely than any previous technology and its contradictions are ours. It is our unjust societies that maintain the digital divide as a new global class system, our capacity for generosity that created the collectivist not-for-profit ethic of the web, our greed that is making it increasingly subject to corporate takeovers, our desire for connection that drives its curiosity, our damaged minds that generate its dangers. It is, virtually, the way we are. [ireland.com - In Focus - Virtual Ireland]

Incidentally, I quoted extensively from a marvelous piece on public libraries by Fintan O'Toole a while ago, "Reading, writing and rebelling: growing up with public libraries".

Microsoft in the cloud

 •  Categories: General - distributed environments , General - systems and technologies

There is a short piece about Microsoft plans for network level services on CNET. Not much detail and still indicating a direction rather than reporting a lot to look at. Nevertheless, it is an interesting direction.

Microsoft is in the early stages of a plan that will see virtually its entire lineup of underlying Internet services opened up to developers, the software maker made clear this week.
In addition to making available its existing services, such as mail and instant messaging, Microsoft also will create core infrastructure services, such as storage and alerts, that developers can build on top of. It's a set of capabilities that have been referred to as a "Cloud OS," though it's not a term Microsoft likes to use publicly...
... But, quibbles over nomenclature aside, Microsoft made clear this week that it aims to play the same role on the Internet that it plays today on the desktop--that of providing its own applications as well as the underlying plumbing and tools that developers use to build their products. [Microsoft's 'Cloud OS' takes shape | CNET News.com]
