The current issue of Newsweek discusses a new book on the use of aggregated user data, Super Crunchers: why thinking-by-numbers is the new way to be smart. As more activities move online, we have more data about intentions, choices, and attention. The punchline of the story is that service choices are increasingly guided by intelligence mined from this data, which in turn signals a reduced role for intuition and human judgment. This points to a major change in the networked environment, as services develop reflexively based on data-driven reconfiguration. Data about what we do is collected everywhere and used to improve services, target offers, and deepen the relationship the supplier has with the end user.

But according to a new book by Ian Ayres, an econometrician and law professor at Yale, this is a microcosm of a powerful trend that will shape the economy for years to come: the replacement of expertise and intuition by objective, data-based decision making, made possible by a virtually inexhaustible supply of inexpensive information. Those who control and manipulate this data will be the masters of the new economic universe. Ayres calls them "Super Crunchers," which is also the title of his book, ... In fields from criminal law (where statistical projections of recidivism are taking discretion away from judges and parole boards) to oenophilia (where a formula involving temperature and rainfall is a better predictor of the quality of a vintage than the palates of the most vaunted experts), "intuitivists" are on the defensive against the Super Crunchers. ...
... But the same explosion of computing power gives large companies powerful new tools with which to entice—or, in some cases, to torment—their customers. "It's going to be easier to find the products and services we want," Ayres predicts. "The sellers are doing the work for us." Amazon's computers know what we'll like even before we figure it out for ourselves; Netflix customers, says Ayres, like the movies the service recommends better than the ones they choose on their own. But auto dealers can use the same kinds of data to calculate to a fine point just how far they can push their customers on price and loan rates. When airlines cancel a flight, Ayres writes, they use an algorithm to predict which customers are most vulnerable to being lured away by a competitor and to give them, not the airline's own best customers, priority in rebooking. [How Data Mining Is Replacing Intuition - Newsweek Technology - MSNBC.com]

It is interesting to think about libraries in this context, where there is a contrast with the general environmental trend. While a key potential strength of the library is its proximity to and knowledge of its users, its automated customer relationship management is shallow and short-lived. Typically it is not data-driven: it rarely offers prompts of the kind "based on what you have already borrowed, you might be interested in this", "people who borrowed this also borrowed that", or "people who searched for this ended up borrowing that". The use of such data in support of collection development is variable. Now, there are some good reasons for restraint, to do with privacy or the quality of the library experience. And we do not have good mechanisms for aggregating such data to improve the service. But as personal contact with all users is not likely, and as people increasingly expect personalized experiences, recommendations, and other features, we will likely see more options emerge.
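To make the "people who borrowed this, also borrowed that" idea concrete, here is a minimal sketch in Python. It simply counts how often two items appear in the same patron's borrowing history. The function names, the sample loan records, and the use of opaque patron identifiers are all assumptions made for illustration; this is not how any particular library system works, and a real service would need to address the privacy concerns noted above before aggregating loan data at all.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical sample data: (patron_id, item_id) pairs. Patron IDs are
# assumed to be anonymized tokens, not real identities.
SAMPLE_LOANS = [
    ("p1", "A"), ("p1", "B"),
    ("p2", "A"), ("p2", "B"), ("p2", "C"),
    ("p3", "A"), ("p3", "C"),
    ("p4", "B"),
]


def build_co_borrow_counts(loans):
    """Count, for each item, how many patrons also borrowed each other item."""
    items_by_patron = defaultdict(set)
    for patron, item in loans:
        items_by_patron[patron].add(item)

    co_counts = defaultdict(Counter)
    for items in items_by_patron.values():
        # Every pair of items borrowed by the same patron counts once.
        for a, b in combinations(sorted(items), 2):
            co_counts[a][b] += 1
            co_counts[b][a] += 1
    return co_counts


def also_borrowed(co_counts, item, n=3):
    """Return up to n items most often co-borrowed with `item`.

    Ties are broken alphabetically so the output is deterministic.
    """
    ranked = sorted(co_counts.get(item, {}).items(),
                    key=lambda kv: (-kv[1], kv[0]))
    return [other for other, _ in ranked[:n]]


if __name__ == "__main__":
    counts = build_co_borrow_counts(SAMPLE_LOANS)
    print(also_borrowed(counts, "B"))
```

Raw co-occurrence counts like these favor items that are already popular, so a sketch of this kind says nothing about whether the suggestions would actually be good ones.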

Update: there is a piece by Ayres in today's FT extracted from the book.

Comments: 1

Aug 31, 2007
Ryan Shaw

Most of the machine learning experts I know are not nearly as confident as Ayres in the superiority of their algorithms to intuition, especially since, as most of them will admit, designing machine learning algorithms itself requires a healthy dose of intuition. It is also extremely hard when evaluating the success of data mining to factor out the effects of the way these technologies are being presented to users. Are Google's results really more relevant, or do we believe they're more relevant because they're Google's results? (See [1].) Likewise for Netflix's recommendations. I am a rabid consumer of music--Last.fm and its ilk have loads of data on my listening habits. But their algorithmic recommendations are sterile and predictable compared to the recommendations of the music bloggers I follow. As for libraries--do we really want recommendation engines impinging on that territory? How would scholarship be affected if collaborative filters drove scholars toward convergence on the same kinds of sources?


[1] http://jcmc.indiana.edu/vol12/issue3/pan.html