I was very interested to read this brief piece about the 'new discipline' of 'computational advertising':
Web advertising is the primary driving force behind many Web activities, including Internet search as well as publishing of online content by third-party providers. A new discipline - Computational Advertising - has recently emerged, which studies the process of advertising on the Internet from a variety of angles. A successful advertising campaign should be relevant to the immediate user's information need as well as more generally to user's background and personalized interest profile, be economically worthwhile to the advertiser and the intermediaries (e.g., the search engine), as well as be aesthetically pleasant and not detrimental to user experience. [ACL-08: HLT - Tutorials]
This is from the notice about a tutorial session at ACL-08: HLT which is taking place in Columbus in June. The conference combines the Annual Meeting of the Association for Computational Linguistics (ACL) with the Human Language Technology Conference (HLT) of the North American Chapter of the ACL.
Given the nature of the conference, the tutorial has a particular focus:
In this tutorial, we focus on one important aspect of online advertising, namely, contextual relevance. It is essential to emphasize that in most cases the context of user actions is defined by a body of text, hence the ad matching problem lends itself to many NLP methods. At first approximation, the process of obtaining relevant ads can be reduced to conventional information retrieval, where one constructs a query that describes the user's context, and then executes this query against a large inverted index of ads. We show how to augment the standard information retrieval approach using query expansion and text classification techniques. We demonstrate how to employ a relevance feedback assumption and use Web search results retrieved by the query. This step allows one to use the Web as a repository of relevant query-specific knowledge. We also go beyond the conventional bag of words indexing, and construct additional features using a large external taxonomy and a lexicon of named entities obtained by analyzing the entire Web as a corpus. Computational advertising poses numerous challenges and open research problems in text summarization, natural language generation, named entity extraction, computer-human interaction, and others. The last part of the tutorial will be devoted to recent research results as well as open problems, such as automatically classifying cases when no ads should be shown, handling geographic names, context modeling for vertical portals, and using natural language generation to automatically create advertising campaigns. [ACL-08: HLT - Tutorials]
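To make the 'first approximation' in that abstract concrete, here is a minimal sketch of matching ads to a page context through an inverted index. The ad texts, function names, and simple term-overlap scoring are all assumptions made for illustration; nothing here is taken from the tutorial itself, which goes on to describe query expansion, classification, and richer features.

```python
from collections import defaultdict
import re

def tokenize(text):
    """Lowercase and split on non-alphanumeric characters (a deliberately naive tokenizer)."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]

def build_index(ads):
    """Map each term to the set of ad ids whose text contains it."""
    index = defaultdict(set)
    for ad_id, text in ads.items():
        for term in tokenize(text):
            index[term].add(ad_id)
    return index

def match_ads(index, context, top_n=3):
    """Score ads by the number of distinct context terms they share; best first."""
    scores = defaultdict(int)
    for term in set(tokenize(context)):
        for ad_id in index.get(term, ()):
            scores[ad_id] += 1
    # Sort by descending score, then by ad id so that ties are deterministic
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [ad_id for ad_id, _ in ranked[:top_n]]

# Hypothetical ads, invented for this example
ADS = {
    "ad1": "cheap flights to Dublin",
    "ad2": "handmade jewelry and crafts",
    "ad3": "Dublin hotels and city breaks",
}
INDEX = build_index(ADS)
```

Calling match_ads(INDEX, "Planning a trip: flights and hotels in Dublin") ranks the two travel ads ahead of the crafts ad. Note that the crafts ad still scores a point on the word "and", which hints at why plain bag-of-words matching needs the refinements the abstract describes.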
When we tried out the Kindle a while ago, my son immediately began to touch the screen. But no, the only effect was to leave marks.
This morning in our local Borders I noticed that they had little notices stuck above the screens of their enquiry system. They said that these were not touch screen systems and that people should use the tracker ball and button.
It is interesting how things come to be expected ....
We have been looking at Etsy at home recently: "your place to buy and sell all things handmade". It is just nice to use. I especially like the useful search by color ;-)
Turn your ideas into reality with Alchemy! Buyers can post requests for custom handmade items, and then sellers bid on the opportunity to make the goods. Please check out the Alchemy overview help guide and the rules for Alchemy before getting started. Have fun! [Etsy :: Alchemy - Public Listings]
They have a range of interesting looking community sections, including virtual labs (billed as "live workshops and online classes"), but I have not tried them out so don't have a sense of how active they are. They look nice though!
I was interested to see that Etsy had made the Webware top 100 Web apps for 2008 list.
Just slightly more than half of all the votes cast in the Webware 100 went to the top 10 vote-getters. Six of these top 10 are no surprise at all: Facebook, Firefox, Google, iTunes, MySpace, and YouTube. But the other four may not be as familiar to most Webware readers:
From Fred Stutzman, who is looking for network-free moments in which to 'code, write or create':
In an attempt to resist the encroachment of network into the spaces of productivity, I've created Freedom. Freedom is a Mac application that disables your computer's networking capabilities for a selected time interval. Some of you may turn off your network when you need to be productive; I've done that, but always found myself popping the network on at my next break (and losing 20 minutes to YouTube/Wikipedia/etc). Freedom takes this approach a step further, locking you out of your network for your selected time interval; Freedom enforces freedom. [Unit Structures: Productive Unit Structures: Introducing Freedom]
Having moved from London to the Mid-west we are very aware of the impact of location on life-style, the switch from public transport to the car being a major example. In this context, I was interested to read this paragraph by Castells et al in Mobile communication and society: a global perspective which I am reading quickly to help me address a writing obligation in which I am currently delinquent ;-)
Another critical difference between national systems is related to the predominant transportation method: in the States, for example, where most people drive their own cars, certain types of mobile communication activities (such as SMS) are less viable. In contrast, where public transport is the main means of movement (as in parts of Asia and Europe) people have a greater ability to use wireless technologies on-the-go and consequently develop expertise faster. [Mobile communication and society: a global perspective, p.37]
I sometimes wonder about reading in this regard, but have not done the work to see if there has been any research about the correlation between 'predominant transportation method' and reading levels. For example, I would read a newspaper on my way to work when I traveled by train or bus; I spend less time reading newspapers now that I drive.
I have just finished Nicholas Carr's The Big Switch. Here is a sample:
The complexity and inefficiency of the client-server model have fed on themselves over the last quarter century. As companies continue to add more applications, they have to expand their data centers, install new machines, reprogram old ones, and hire ever larger numbers of technicians to keep everything running. When you also take into account that businesses have to buy backup equipment in case a server or storage system fails, you realize that, as studies indicate, most of the many trillions of dollars that companies have invested into information technology have gone to waste. [The Big Switch, p. 56]
Most of the software and almost all of the hardware that companies use today are essentially the same as the hardware and software their competitors use. Computers, storage systems, networking gear, and most widely used applications have all become commodities from the standpoint of the businesses that buy them. They don't distinguish one company from the next. The same goes for the employees who staff IT departments. Most perform routine maintenance chores - exactly the same tasks that their counterparts in other companies carry out. The replication of tens of thousands of independent data centers, all using similar hardware, running similar software, and employing similar kinds of workers, has imposed severe penalties on the economy. It has led to the overbuilding of IT assets in almost every sector of industry, dampening the productivity gains that can spring from computer automation. [The Big Switch, p. 57]
Carr makes an analogy with electric power. Many years ago, companies would have had their own power generators. This was very inefficient and we moved to a utility model, where generating capacity was concentrated and delivered to others over the electric grid. He foresees the emergence of a similar model with computing and applications, a movement to a utility model where capacity is delivered as required over the network.
Of course, as he notes, this is already upon us. Think of a couple of prominent examples: Amazon Web Services and the range of Salesforce.com's services.
Amazon provides computation, storage and other services on an on-demand basis. Werner Vogels, Amazon CTO, has an interesting presentation where he talks about Amazon's webscale services and discusses their rationale. The subtitle of the presentation is "compete on ideas, not resources". In terms that echo Carr's, he talks about the 70/30 switch, claiming that 70% of a firm's "time, energy and dollars is spent on undifferentiated heavy lifting" in building out infrastructure, while 30% is spent on "differentiated value creation". Amazon wants to help organizations reverse those numbers, reducing the time spent on undifferentiated, increasingly commodity, infrastructure.
I was looking at My Starbucks Idea the other day and was interested to see that it was powered by force.com. This is a suite of on-demand tools from Salesforce.com which claim to allow you to build enterprise applications without any custom development work. What immediately struck me was the way in which the service was promoted, echoing Carr and Vogels: the strapline is "Finally, focus on innovation, not infrastructure". I liked their line:
Free up the dollars wasted "keeping the lights on" with a zero-infrastructure model.
The 'big switch' is going to be a major issue for libraries over the next few years. They spend too much time getting their systems to work, and not enough time putting them to work.
Of course, much will depend on what types of services are available to libraries from their providers, and it will be interesting to see how those providers reconfigure their offerings in coming years and what new providers emerge.
Note: I was prompted to note the Big Switch after reading and commenting on Mark Dahl's post here.
There are six panels. One displays a (short) list of university podcasts; another displays standard virtual tour stuff.
The other four are more interesting. One displays the University's Wikipedia entry. One displays photos from Flickr (I am not sure how they are being selected: is it more than the 'university of southampton' tag?). One displays videos from YouTube (again, I am not sure if these are any videos which show up on a 'university of southampton' search or if some other selection criteria apply).
And finally, one displays a tag cloud which links through to underlying del.icio.us pages of links to University of Southampton pages. So, for example, the jobs tag links through to a page of links about University policies, amenities and so on that might be of interest to somebody looking for a job. In this case, there is more active management of the collection by the user 'Southampton'.
I liked what I assume to be the intent here (there are no links to explanatory material). However, this seems like a sketch of what one might do, rather than a fully worked-through presence. For example, why not display the full del.icio.us tag cloud which gives richer access to the Southampton pages? What would the best approach be to showcasing research and learning outputs?
The site is designed by Precedent, "specialists in strategic thinking, digital communications and brand communications". A moment with Google revealed:
James Soutar, a senior branding and communications consultant at Precedent, said the web would be "the principal battlefield" in the competition for students. Information on consumer and social networking sites, such as Facebook, could become as influential as that on universities' own websites, he added. [Times Higher Education - Post-92 websites fail on the basics]
He talks about consolidation within the BI (Business Intelligence) market: "After more than a dozen acquisitions made by Business Objects, Cognos, and Hyperion over the past few years, these BI tools/analytics industry leaders were themselves snapped up in a matter of months by SAP, IBM, and Oracle respectively." And he notes the earlier consolidation of the underlying database industry around Oracle, IBM and Microsoft.
Held argues that consolidation has improved the overall BI marketplace. It delivers - he suggests - economies of scale and economies of innovation (and, although he does not mention it by name, economies of scope). These 'mega-vendors' offer a range of products. For some customers, the ability to concentrate interaction with a single vendor, a single helpdesk, and a single contract, and to benefit from discounts, is an important benefit. For vendors, it should be possible to remove redundant costs in administration and distribution. Competition between a small number of dominant players is good for the market.
He suggests, however, that the mega-vendors find it difficult to innovate or meet new needs; they have a very full array of products spread over a large customer base. This means that there will always be investment available to new entrants who innovate around technology or business models to meet evolving needs.
He points to open source and SaaS (software as a service) as two important business model innovations. He also provides some technology innovation examples, emphasizing performance and price improvements.
Does this map onto the process automation providers within the library community? Here are some thoughts, focusing on the US environment. (And, full disclosure, OCLC has some offerings in some of the areas I discuss below.)
There has definitely been consolidation within the classic ILS environment. This is good in principle, as the library market - not very big to begin with - has been overpopulated with vendors trying to provide a full range of products. In practice, of course, much depends on how the remaining vendors work through integration issues. We can see some potential economies of scope (as diversifying library needs can be met from a single source) and scale (as development, support and R&D are consolidated).
However, none of these vendors is very large, they operate in a small community, and they have limited organic growth opportunities in their historic core. They have moved to meet diversifying library needs with additional products. Accordingly, we have seen that process automation for the 'bought/physical collection' (the ILS) has been joined by process automation for the 'licensed collection' (metasearch, resolution, knowledge base, ERM), and the 'digital collection' (repositories). Other products have also appeared to meet more specific needs (self-service, e-reserves, ...). Recently, a new category of discovery system has emerged which pulls together institutional data (from the ILS and from repositories), and several products have appeared. Now, each vendor has a significant development challenge in creating this full array of products, and we have seen some licensing of other components (support for metasearch or knowledge base, for example). Interestingly, we have not seen these companies acquire new entrants who are also developing these newer products (more on these below).
And, although we have seen some libraries acquire pieces from different vendors, this is not as widespread as one might expect for some of the reasons suggested above. There are economies in dealing with as few vendors as possible. In addition, the library community has quite a personalized relationship with its ILS vendor community which adds to the incentives to acquire various components from the same vendor.
Marshall suggests that 'dissatisfaction and concern prevail' in this marketplace. I think we can expect further consolidation, as the number of vendors here reduces to two or three, maybe with particular specialties.
What about innovation? There is some concern that there has been little innovation in the classic ILS space, which matches Held's observation. That said, we can point to Ex Libris's collaboration with Herbert Van de Sompel around the deployment of resolution as a service as a notable instance, or experimentation with ERM. It is not surprising that as new areas have been identified we have seen a range of new entrants, sometimes emerging from within the library or academic community. See for example Serials Solutions, which aims to provide a complete approach to licensed collections. The metasearch and resolution arena has seen several companies emerge, some of whom syndicate services to other players. See for example Muse Global, Openly Informatics (now part of OCLC), WebFeat or TDnet. And more recently, as we have seen attention to better discovery environments, Aquabrowser is being deployed by some libraries.
One area where innovation has been slow is in how the library systems apparatus engages with the tools that people are increasingly using to organize their own information spaces, at the browser level, or in social bookmarking, social networking, and other network-level sites.
Business model innovation? Held mentions Open Source and SaaS (Software as a Service). We have seen two major areas of open source development. The first is in the area of repositories, where we see Fedora, DSpace, and EPrints. The effort involved in deployment here may be high. Each initiative has gone through some organizational development, looking for ways to sustain itself, and the role of grant/foundation money has been important. The second is in the ILS arena, where Koha and Evergreen are receiving a lot of attention. Koha is more widely deployed; there have been some recent high-profile commitments to Evergreen. There are also some other areas where open source solutions are in use: metasearch (e.g. Index Data, LibraryFind), text searching (e.g. Lucene, Index Data), and a recent interest in 'next generation catalog' solutions (e.g. Solr, VuFind). Index Data has been active for a while, with a strong niche presence in Z39.50 applications and text searching and metasearch offerings. One interesting development is the emerging support industry here, where Care Affiliates, Index Data, Equinox and LibLime will offer support and consultancy. It will be interesting to see how this range of activity develops in coming years. In part it will probably depend on the ability of this nascent support industry to meet mainstream library requirements for support and reliability; and in part of course on the ability to continue to develop the software.
And what about SaaS? SaaS tends to be used quite loosely. Think simply of three levels. The first is where individual instances of an application are hosted. This may save the library some costs (hardware, sysadmin) but does not really alter the service model in other ways. A second is a 'multi-tenancy' model where multiple customers may be served from the same instance, but each with their own virtual application, potentially with configuration options. This may deliver savings but there may also be service improvements. Enhancements, fixes, etc, are available to all at the same time. Serials Solutions' services might be an example here. The third level becomes more interesting where shared use of a service generates network effects. Take a hypothetical example: a supplier could more easily develop recommender systems across multiple circulation systems. An actual example appears to be provided by Aquabrowser's announcement of its MyDiscoveries feature which aims to share user contributions to the catalog across customer instances. The SaaS model has been rapidly adopted in wider contexts, and while there has been some library adoption, it is interesting that there is not a high level of discussion of the approach.
Marshall writes:
The year 2007 saw considerable upheaval in the library automation industry. To get some sense of the aftermath of the recent rounds of mergers, acquisitions, product consolidations, and to gauge interest in open source automation systems, I created and executed a survey that aims to measure the prevailing perceptions in libraries. [Perceptions 2007: an International Survey of Library Automation]
What is interesting to me is the extent to which the ecology of library process automation is richer than it was a few years ago. If we think of managing three materials workflows (bought/print, licensed/electronic, digitized/digital), and the progressive movement of libraries into the latter two, then we see that library needs are now potentially met by a wide number of players. The classic ILS vendors remain central players, but they have been joined by others.
The ILS vendors have products in all three areas, and are developing new discovery products. We have seen new entrants in the repository space (including ContentDM, now owned by OCLC) and in the licensed materials space (resolver, knowledge base, metasearch, ERM) where a variety of products are available from a range of vendors. In this context, the collection of services within the Cambridge Information Group is interesting (Serials Solutions, RefWorks, Illumina, Aquabrowser as well as other bibliographic products). And, of course, OCLC provides services also. Open source offerings have emerged to meet needs across the board.
We will definitely see more convergence alongside further new entrants. It will be interesting to see how the Open Source offerings develop, and I think that we will see some game-changing offerings in the SaaS space.
I hope Marshall repeats the survey. It would be interesting to extend its scope - if that can be done without too much loss of focus - to consider more of the wider process automation landscape.
The issue is that libraries have to manage a range of database resources whose legacy technical and business boundaries do not map very well onto user preferences or behaviors. The approach has been to try to move away from presenting a fragmentary straggle of databases to bundling them in various ways in a metasearch application, sometimes in one big search, sometimes in smaller course or subject bundles. The issues here are well-known, not least of which is that libraries typically have limited control over the performance of the target databases.
As an alternative, a few libraries have explored consolidating locally loaded data. This can work very well, as it becomes easier to build additional services over a consolidated resource. However, this is a rather too adventurous undertaking for most libraries. Another approach is for a third party to consolidate, and this is what we have seen with Google Scholar, Scopus, Worldcat, and others.
More recently, recognizing the advantages of local consolidation, we have seen the emergence of a new class of library system which pulls together metadata from locally managed stores (e.g. digital repository, ILS, institutional repository, ...) and offers an integrated search. This may still have to work closely with a metasearch engine to integrate access to external databases. ILS vendors are moving in this direction, and through Worldcat Local, OCLC is also addressing this type of integration.
This is a discussion worth returning to, but that is not my purpose here. Rather I wanted to point to an interesting treatment of similar issues from a different domain. Mike Stonebraker, database guru and writer in the group blog, The Database Column, has a post where he contrasts two models of data integration: ETL (extract, transform and load) and federation. The focus is on enterprise systems. The ETL model will typically involve a centralized data warehouse and "for each operational system, they will employ some sort of ETL process to transform data instances into the global schema and then load them into the centralized warehouse".
'Extract, transform and load' is a good characterization of what is involved in consolidation of library data, whether this is attempted locally or through third parties. One of the interesting questions is the sophistication of the 'transform'. Think of author names, for example, or subjects, or other controlled data, and what would be involved to effectively merge data created within different regimes. What is the impact, for search or for faceted display, of limited or no transformation of these elements?
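To give a feel for how much rides on the 'transform', here is a minimal sketch of an ETL pass that merges records from two stores on a normalized author key. The record layout, the source names, and the crude 'surname, initial' key are assumptions for illustration only; a key this blunt will happily conflate different authors who share a surname and first initial, which is exactly the kind of trade-off the questions above point to.

```python
import re
import unicodedata

def normalize_author(name):
    """Reduce an author string to a crude match key: 'surname, first-initial'."""
    # Strip accents and lowercase
    text = unicodedata.normalize("NFKD", name)
    text = text.encode("ascii", "ignore").decode("ascii").lower()
    # Drop life dates such as '1942-' or '1900-1980'
    text = re.sub(r"\d{4}\s*-\s*(\d{4})?", "", text)
    # Replace anything that is not a letter, comma, or whitespace with a space
    text = re.sub(r"[^a-z,\s]", " ", text)
    if "," in text:
        # Inverted form: 'surname, forenames'
        surname, _, forenames = text.partition(",")
    else:
        # Direct form: treat the last word as the surname
        parts = text.split()
        surname = parts[-1] if parts else ""
        forenames = " ".join(parts[:-1])
    surname = " ".join(surname.split())
    forename_parts = forenames.replace(",", " ").split()
    initial = forename_parts[0][0] if forename_parts else ""
    return f"{surname}, {initial}".strip(", ")

def etl(sources):
    """Extract records from each source, transform author names, load into one store."""
    warehouse = {}
    for source_name, records in sources.items():      # extract
        for record in records:
            key = normalize_author(record["author"])   # transform
            warehouse.setdefault(key, []).append((source_name, record["title"]))  # load
    return warehouse

# Hypothetical records from two locally managed stores
SOURCES = {
    "ils": [{"author": "Castells, Manuel, 1942-", "title": "Title A"}],
    "repository": [{"author": "Manuel Castells", "title": "Title B"}],
}
```

With no transform at all, the two records above would sit under two different author headings in a faceted display; with this one they land under a single key, "castells, m".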
Here are the headings Stonebraker uses for his discussion.
Data element "heat": Hot data favors ETL
Indexing: Federation is harder to optimize
Resource management: Faster BI query responses for ETL shops
Complexity of the schema change: ETL approach performs less joins
Timeliness: ETL approaches must deal with out-of-date data issues
Mapping: Federations can't handle some transformations
BI is short for 'business intelligence'. 'Hot' data is data that is accessed often.
Now, while it is clear that our environment is similar in many ways to that discussed here, it would be interesting to do a similar analysis with our domain in mind to see where there are differences. Of course, one issue is that most of the data under discussion here seems to be within institutional control.
Here is his conclusion:
In summary, virtually all enterprises use the ETL approach for data integration. The data federation market is, in contrast, quite small. The place where I see federations as most viable is when there are many, many data sources (e.g., more than 5,000 sources) and BI users utilize only a small number of them at any given time. In this extreme case, the average data element is accessed zero times before it is updated or deleted. In this instance, one is better off leaving the data where it originates. On the other -- more common -- hand, when most data elements get used several times, the ETL approach will continue to be preferred. [To ETL or federate ... that is the question - The Database Column]
Google Book Search: Document Understanding on a Massive Scale [PDF] is a brief treatment of issues faced by Google as they grow their corpus of digitized books and work to make it useful in various ways.
Luc Vincent of Google discusses OCR (issues of many languages occurring unpredictably in variously formatted volumes, at scale), and then focuses on issues of document understanding.
In addition to OCR, making these books easily accessible and useful on http://books.google.com has required developing a number of additional state-of-the-art systems. These include systems for automatically deskewing, cropping and cleaning-up scanned book pages, which is critical as pre-processing prior to OCR, but also to generate clean and small images for efficient web serving. While this may be a well understood problem for high-quality documents, doing this well on scanned century-old book pages is no small feat. Most of the advanced systems developed for Google Book Search however involve some form of Document Understanding and as such, come after OCR in the book processing pipeline. Systems that have been developed, are being developed or are being considered as interesting research challenges include: [Google Book Search: document understanding on a massive scale PDF]
These challenges include: page ordering, language identification, chapter identification, content linking (relate table of contents to appropriate boundaries, index entries to pages, ...); summarization; metadata extraction and cross validation; topic identification; book clustering and linking (create relationships between volumes).
He also discusses ranking:
Specifically, how should books that match a particular query be ranked? The web is notorious for its rich graph of hyperlinks, famously exploited by Google’s PageRank algorithm [6]. This structure applies somewhat to technical publications, which typically contain numerous references to other technical publications. However the universe of books is different and most books (eg, novels) do not contain any references. Novel approaches therefore had to be developed, exploiting an array of new signals. Additionally, these techniques were recently extended to allow “blending” of book search results with web search results when appropriate. [Google Book Search: document understanding on a massive scale PDF]
The paper outlines presentation options based on copyright status and also discusses how Google supports the document understanding community through the release of software and data sets.
I was interested that there was no discussion of social features.
In libraries we worship interoperability, in the abstract at least. We believe it is an unalloyed good.
My snappy interoperability tag is "recombinant potential": things are interoperable to the extent that they are capable of being combined or recombined with other things.
We are traveling with a laptop, head phones, two cell phones, a Blackberry, a digital camera and three iPods (two Nanos and a shuffle). I am sure that there is other stuff I am not remembering or do not know about ;-)
This requires us also to carry a variety of chargers, and, as they are US devices which we want to use in Ireland, a couple of adapters. Can we mix and match these, combining chargers and devices? Using headphones with the cell phone with music on it? Of course not, or only in limited ways.
I mentioned the other day that I had left my laptop cable behind. Can I borrow somebody else's laptop cable? Of course not. And we need to find another way to charge the iPods.
Sure, there is a small industry creating various 'recombining' devices, but this requires additional thinking and investment, something we are not organized or inclined to do.
Now, I am sure that I could rustle up a literature on the economics of all of this, suggesting why vendors are interested in this level of lock-in. But for the moment, I just wish that their recombinant potential were higher, reducing our traveling clutter and increasing our convenience.
It is interesting to read about the developments within Oracle's new on-demand customer relationship management offering which is being discussed in various places. What is striking is how the familiar Web 2.0 approaches are coming to enterprise applications: mashups between various data sources, extensive data mining, sharing and commentary.
Here is Phil Wainewright:
Attendees at SIIA’s OnDemand Summit on Friday were given a sneak preview of some of the new applications being unveiled tomorrow at the Oracle OpenWorld conference. Let me tell you frankly, I was seriously impressed. Oracle has evidently done some hard thinking about how Web 2.0 technologies and ideas can be adapted to practical enterprise use. The next time a C-level executive asks you how Web 2.0 affects the enterprise, you’ll be able to point to Oracle’s new CRM OnDemand applications. They are a powerful demonstration of how Web 2.0 can be applied in an enterprise environment. [» Oracle goes for the CRM jugular | Software as Services | ZDNet.com]
He goes on to provide some details - and it is worth reading them. There is also some commentary from his ZDNet colleagues. Library user expectations are being shaped by major network services on the web. It looks as if expectations for backoffice systems will begin to be raised as well.
Our activities in the network world leave traces. The analysis of these traces is now a major undertaking as organizations mine this data to understand behaviors, to improve their systems, and to refine their offer.
Tony Hirst has a series of posts about 'course analytics':
In contrast to the academic analytics, one of the things I set out to explore was how an off the shelf web stats analytics tool (Google Analytics) could be used to help me learn more about what students were doing with our online course materials, and help me identify what - if anything - a "learning site's" goals could be, and what the site might be optimised for. [OUseful Info: Course Analytics - Prequel]
And further ....
For the moment, what I am interested in is how website analytics can be used applied to online course websites in order to gain a better understanding of online study habits and the bahaviour of students taking an online course. [OUseful Info: Course Analytics, Part 1 - Visitor Behaviour]
He provides some interesting analysis, looking at how students use course materials. He then extends the question to the library website, and based on discussion with his Open University library colleagues he suggests a list of questions that might be tackled with this approach. What sort of search engine searches result in referrals to the library website, for example? How well is actual page popularity mapped by front page navigation options? And so on.
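To make those two questions concrete, here is a minimal sketch in Python. It is not Tony's method and it does not use Google Analytics; it assumes you have a list of referrer URLs and a table of page view counts from somewhere, and all of the URLs, page paths, and function names are made up for illustration.

```python
from collections import Counter
from urllib.parse import urlparse, parse_qs

# Hypothetical referrer URLs, as they might appear in a web server log.
REFERRERS = [
    "http://www.google.com/search?q=open+university+library",
    "http://www.google.com/search?q=journal+databases",
    "http://search.example.org/find?q=open+university+library",
    "http://www.example.edu/courses/t184",  # an ordinary link, no search terms
]


def search_terms(referrers, param="q"):
    """Count the search phrases carried in each referrer's query string.

    Assumes the search engine passes the phrase in a parameter named `param`.
    """
    counts = Counter()
    for url in referrers:
        query = parse_qs(urlparse(url).query)
        for phrase in query.get(param, []):
            counts[phrase.lower()] += 1
    return counts


def navigation_gaps(page_views, front_page_links, top_n=3):
    """Return the most viewed pages that the front page does not link to."""
    ranked = sorted(page_views, key=page_views.get, reverse=True)[:top_n]
    return [page for page in ranked if page not in front_page_links]
```

Calling `search_terms(REFERRERS)` gives a tally of the phrases that brought people to the site, and `navigation_gaps` points at popular pages that are missing from the front page navigation. A real analysis would need to cope with the different parameter names that different search engines use.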
He wonders what success looks like:
How to define library website goals is another interesting exercise... If the site was Amazon, where the aim is to sell goods, a relevant goal page would be a "Thanks for the cash - the goods will be with you in a day or two" page. What is the range of useful, successful transactions on a Library website? [OUseful Info]
He is interested in hearing from libraries who use Google Analytics, or similar off the shelf approaches, and about what they are measuring. If you have some experience, leave him a comment .....
I propose that a resource and its URI ought to have an intuitive correspondence. …. URIs should have a structure. They should vary in predictable ways: you should not go to /search/Jellyfish for jellyfish and /i-want-to-know-about/Mice for mice. If a client knows the structure of the service’s URIs, it can create its own entry points into the service. ….. URIs do not technically have to have any structure or predictability, but I think they should. This is one of the rules of good web design, ….. [RESTful web services. Leonard Richardson and Sam Ruby. P. 83]
Lacking a Strunk and White Elements of Style for URI namespace, we’ve made a mess of it. It’s long past time to grow up and recognize the serious importance of principled design in this infinitely large namespace. [RESTful Web Services « Jon Udell]
I was reminded of these while reading Michael Panzer's discussion of URI patterns and Dewey the other day.
Although the Dewey Decimal Classification is currently available on the web to subscribers as WebDewey and Abridged WebDewey in the OCLC Connexion service and in an XML version to licensees, OCLC does not provide any “web services” based on the DDC. By web services, we mean presentation of the DDC to other machines (not humans) for uses such as searching, browsing, classifying, mapping, harvesting, and alerting.
In order to build web-accessible services based on the DDC, several elements have to be considered. One of these elements is the design of an appropriate Uniform Resource Identifier (URI) structure for Dewey. [025.431: The Dewey blog: Designing identifiers for the DDC]
Many organizations are probably having similar discussions, and this is certainly part of a general exploration of these issues within OCLC.
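The point about predictability is easy to show in a few lines of Python. This is only a sketch: the base address, the path segments, and the function names are hypothetical, and the pattern is not the one proposed for Dewey or used by any OCLC service.

```python
from urllib.parse import quote

BASE = "http://example.org"  # hypothetical service root


def search_uri(term):
    """Every search uses the same pattern: /search/{term}."""
    return f"{BASE}/search/{quote(term)}"


def class_uri(notation, lang=None):
    """A hypothetical pattern for a classification number: /class/{notation}[/{lang}]."""
    uri = f"{BASE}/class/{quote(notation)}"
    if lang:
        uri += f"/{quote(lang)}"
    return uri


# A client that knows the pattern can build its own entry points.
for term in ("Jellyfish", "Mice"):
    print(search_uri(term))
print(class_uri("025.431", lang="en"))
```

Because jellyfish and mice are reached through the same template, a client never has to discover a second convention like /i-want-to-know-about/Mice.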
The goal of an academic library is to be the best in the world at serving the unique teaching, learning and research needs of its home academic institution by being active participants in the creation, transmission and dissemination of knowledge. [p. 10]
She closes the chapter like so:
We cannot simply rest on our knowledge that the students, members of the rising Net Generation, are different. We must understand how and why and embrace those differences - not ignore, reject, or dismiss them. Our roles as translators requires us to meet undergraduates where they are, mentally, physically, and virtually, and help bring them to where the faculty reside. If we cannot begin to deepen our affinity with undergraduate students now, how much more daunting and difficult the task will be when they become our Net Generation faculty. [p. 11]
I have been in a couple of discussions recently which raise a related issue. To what extent is faculty's perception of the library based on memories of their use when they were undergraduates and graduate students?
In her final chapter, the author discusses five guiding principles for the academic library under the following headings:
Adopting an R&D culture
Rethinking "library as place"
Accepting that the library is not the virtual place
Jane Hart is compiling a list of the top tools folks use in their learning and working lives. It is aggregated from the top tens of contributors active in e-learning.
It is broadly what you would expect near the top, with Firefox, Skype, Delicious and Google stuff heading the list. Some of the individual lists have some less pervasive 'tools' and they are worth a scan.
I was interested to see that Google Reader was joint third (mentioned by fourteen people), ahead of Bloglines which was in ninth (mentioned by ten).
There is a short piece about Microsoft plans for network level services on CNET. Not much detail and still indicating a direction rather than reporting a lot to look at. Nevertheless, it is an interesting direction.
Microsoft is in the early stages of a plan that will see virtually its entire lineup of underlying Internet services opened up to developers, the software maker made clear this week.
In addition to making available its existing services, such as mail and instant messaging, Microsoft also will create core infrastructure services, such as storage and alerts, that developers can build on top of. It's a set of capabilities that have been referred to as a "Cloud OS," though it's not a term Microsoft likes to use publicly...
... But, quibbles over nomenclature aside, Microsoft made clear this week that it aims to play the same role on the Internet that it plays today on the desktop--that of providing its own applications as well as the underlying plumbing and tools that developers use to build their products. [Microsoft's 'Cloud OS' takes shape | CNET News.com]
Manuel Castells' Rise of the Network Society is a loose, baggy monster of a book, to hijack an expression. It is full of memorable phrases. One that has stuck in my mind is his catchy definition of space as 'material support for time-sharing social practices'. This is in the context of how networks create new spaces for social practices.
I mentioned Eduserv and Second Life the other day. I was sent a follow-up note pointing to the press release [pdf] about the projects it is funding on learning and 'virtual world' spaces. And Andy Powell has posted an interesting personal reflection about Second Life, its appeal, and its potential.
Whether this level of attention is justified is another matter of course. As with the early days of the Web, what we are seeing at the moment is a lot of experimentation - with no-one being quite sure what works well and what doesn't. We're seeing lots of people in the education sector getting excited, getting involved, getting in-world, and then trying to work out what the hell they are going to do when they get there. Those people are usually operating alone or in small units - there is still little high-level strategic commitment to Second Life or 3-D virtual worlds. [Sage - eFoundations]
At the same time, Herbert sent me a note recently about a very interesting looking conference at Ghent University, new Google partner and Herbert's former workplace.
The International Conference on Analogous Spaces interrogates the analogy between spaces in which knowledge is preserved, organized, transferred or activated. Although these spaces may differ in material, virtual, or operational ways, there are resemblances if one examines their ‘structure,’ ‘form’ and ‘architecture’. How do these spaces co-exist and interrelate? [___ Analogous Spaces ___]
The call for papers [pdf] cites Castells, and also notes Paul Otlet as an inspirational figure.
It is organized around three themes. Here is the second.
The second theme deals with the space of knowledge and memory. How can we compare the encyclopedia and the museum, the book and the library, the diagram and the database? How do they use architecture to structure knowledge and how is architecture used as a metaphor of memory? [___ Analogous Spaces ___]
It certainly looks as if material support for time-sharing social practices will be provided in a variety of ways ....
Update: I just saw Dave Tosh's post querying the value of Second Life in an educational context.
And while talking about other lives, those interested in thinking more about Second Life in an educational setting can follow the discussion on the Eduserv Foundation blog (second life on eFoundations) where this is a particular interest. They held a dual-world symposium (rl-sl does not quite have the ring about it that ac-dc does) recently around the theme Virtual worlds, real learning? See the debriefing and follow blog commentary via Technorati.
Andy Powell (aka Art Fossett) has a short video briefly describing Eduserv, the Foundation and its Second Life activities. Watch out for the discussion of, ahem, slashups (i.e. second life mashups), and the description of the recent Eduserv Foundation grants to projects exploring educational uses of Second Life.
There are lots of news stories, but little news, about the deal to develop a major virtual venue between the Swedish Entropia Universe and the Cyber Recreation Development Corporation (CRD), the online entertainment division of the Beijing Municipal People's Government.
China will soon have its first cash-based virtual world, where millions of people can work, socialize, learn and fall in love, Sweden-based Entropia Universe announced Wednesday. ....
... "An important aspect for this project is also the positive effects on our environment that we foresee," Liu said in a release.
"People will actually be able to work from home inside Entropia Universe, as many people do today, even from rural areas, thereby decreasing the amount of pollution generated by travel." [China Post - Taiwan Business,World Business,台灣財經]
The story goes on to quote Entropia saying "that as many as seven million people will be able to access China's virtual universe simultaneously and the hope is to attract some 150 million residents from around the Earth to socialize and do business there". It is planned to launch in August next year, so it will be interesting to see whether it develops in line with current aspirations.
There has been some discussion - less than I expected - about Google's steps to develop a unified search across its services (blogsearch, booksearch, YouTube, etc) so that blogs, video, books, maps, and so on are returned in results on the main Google site.
This latest refinement sounds simple, but it isn't. According to the Californian technology powerhouse, it is a result of two years' work by more than 100 engineers and involved a major revamp of the company's software platform. [Google takes search to next level | Guardian Unlimited Business]
This is a major step given the central importance of ranking to Google and the different ranking models that it employs across these individual services. The first signs of the integration are showing up and more stuff will be progressively introduced.
Google's vision for universal search is to ultimately search across all its content sources, compare and rank all the information in real time, and deliver a single, integrated set of search results that offers users precisely what they are looking for. Beginning today, the company will incorporate information from a variety of previously separate sources – including videos, images, news, maps, books, and websites – into a single set of results. At first, universal search results may be subtle. Over time users will recognize additional types of content integrated into their search results as the company advances toward delivering a truly comprehensive search experience. [Google Press Center: Press Release]
This is all intriguing. It will be fascinating to see how they handle a major transition which has a significant impact on the core of the Google user experience and how well they deliver on the promise of integration.
We sometimes talk about Google as if it is something fixed: we know what it does and how it works. This is especially so in library discussions where the library experience is compared to the Google experience, as if it were something that was going to continue in its current form. However, look at what they have been doing with Google Booksearch and now look at this big change. We do not know what Google will be like in three years' time - it will certainly not be the Google of today.
What I find most interesting about these directions is the gradual introduction of additional navigation options. We are used to hearing people talk about the 'simple search box' as a goal. But a simple search box has only been one part of the Google formula. PageRank has been very important in providing a good user experience, and effective ad placement is important for their revenue model. However, as we move to merged results over a mixed resource base, a single ranked list becomes less useful and other browse/navigation options become more important. We are seeing Google experiment with ways to narrow by resource type, navigate by related terms, offer related searches, and so on. Basically, they are mining their data to offer a richer 'texture of suggestion' than they have in the past. Search may start with the simple search box, but then a variety of directions are opened up based on the results.
We can see this emerge as a pattern: a simple entry point into a richer navigation space. This is emerging in our library catalogs which are moving to think about how to better exploit the structure of the data to create navigable relations (faceted browsing, FRBR, ...). In this way, the user follows data paths presented after an initial search rather than having to make complicated choices up front before seeing any results. And this highlights again the need to make our bibliographic data work harder in systems and services.
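Here is a small Python sketch of that pattern: a simple search first, then facets computed from the results to offer ways of narrowing. The records, field names, and functions are invented for illustration and do not describe how any particular catalog or search engine works.

```python
from collections import Counter

# Hypothetical bibliographic records with a few structured fields.
RECORDS = [
    {"title": "Network Society", "format": "book", "language": "eng"},
    {"title": "The Network Society (audio)", "format": "audio", "language": "eng"},
    {"title": "La société en réseaux", "format": "book", "language": "fre"},
    {"title": "Network analysis methods", "format": "book", "language": "eng"},
]


def search(records, term):
    """The simple entry point: a case-insensitive match on the title."""
    term = term.lower()
    return [r for r in records if term in r["title"].lower()]


def facets(results, fields):
    """Count the values of each field across the results already retrieved."""
    return {field: Counter(r[field] for r in results) for field in fields}


def narrow(results, field, value):
    """Follow one of the offered paths by keeping only matching results."""
    return [r for r in results if r[field] == value]


hits = search(RECORDS, "network")
print(facets(hits, ["format", "language"]))
print(narrow(hits, "format", "book"))
```

The user types one word, and only then sees the choices (format, language) that the data itself suggests. The richer the structure in the records, the more of these paths can be offered.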