What is data curation? Here is one view:

Data curation is the active and on-going management of data through its lifecycle of interest and usefulness to scholarship, science, and education. Data curation activities enable data discovery and retrieval, maintain its quality, add value, and provide for re-use over time, and this new field includes authentication, archiving, management, preservation, retrieval, and representation.
As the practice of science and other disciplines changes, data curation has become a major issue.

In some cases a systemwide response will be required, as the scale and complexity of the task mean that it may not make sense to address it individually in many institutional contexts. However, it is also clear that institutions need to figure out appropriate responses. I have spoken before about some of the programmatic responses, and about some of the reports that have outlined the issues.

Last Friday I drove across the snowy flat landscape between Columbus and the University of Illinois at Urbana-Champaign, where I was presenting at the Graduate School of Library and Information Science (GSLIS). I was very interested to discover that GSLIS is introducing a data curation concentration. Here is the rest of the paragraph from which the quoted piece above comes:

The Data Curation Education Program (DCEP) concentration within our ALA-accredited master of science offers a focus on data collection and management, knowledge representation, digital preservation and archiving, data standards, and policy ..... Our program will provide a strong focus on the theory and skills necessary to work directly with academic and industry researchers who need data curation expertise. [Master of Science: Concentration in Data Curation]

This is an intriguing development, and I understand that there has been strong interest in the program.

Aside: Folks interested in this topic may find the presentations from the Second International Digital Curation conference relevant. They cover a range of topics with good international participation. There is a section on education for 'data scientists' which includes a presentation from GSLIS colleagues.

Comments: 1

Feb 26, 2007
K Bodling

An important aspect of this is that the "life cycle" can be MUCH longer than one might at first think. It doesn't take a lot of reading to come across accounts of scientists -- currently active scientists -- finding and fruitfully mining data streams that are decades and decades old. The folks doing this work have to cultivate a very long view.