Google and OAI-PMH

There is an interesting note on the Google Webmaster Central Blog:

When we originally launched Sitemaps, we included support for the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) 2.0 protocol, an interoperability framework based on metadata harvesting. In the meantime, however, we've found that the information we gain from our support of OAI-PMH is disproportional to the amount of resources required to support it. Fewer than 200 sites are using OAI-PMH for Google Sitemaps at the moment.

In order to move forward with even better coverage of your websites, we have decided to support only the standard XML Sitemap format by May 2008. We are in the process of notifying sites using OAI-PMH to alert them of the change. [Official Google Webmaster Central Blog: Retiring support for OAI-PMH in Sitemaps]

Via Paul Walk, who remarks:

There are a few ways of looking at this. Perhaps ‘open access’ repositories are less concerned with Google rankings than the typical website owner. Perhaps the penetration of OAI-PMH in the world is still below any level that Google could find particularly interesting - certainly they never went to great lengths to advertise this support while it lasted. Clearly, Google have come to the end of a ‘trial period’ for their support for this protocol in their main indexing service. [paul walk’s weblog » Blog Archive » Google gives up on supporting OAI-PMH for Sitemaps]


Apr 24, 2008
Dorothea Salo

Clunky, overengineered protocol used only by libraries loses out to fast-n-simple format created by a Big Kahuna.

I'm shocked. Shocked, I tell you. Who could have anticipated it?

Apr 24, 2008
Jonathan Rochkind

So... would it be easy to write a tool that produces a Google XML Sitemap for any OAI-PMH provider?
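
As a rough sketch of what such a tool might involve: the Python below takes one OAI-PMH ListRecords response in oai_dc format and emits a Sitemap urlset. It rests on several assumptions: that each record carries its landing-page URL as an http(s) dc:identifier (not every repository does this), that the function names extract_entries and build_sitemap are purely illustrative, and that the response has already been fetched. A real harvester would also need to follow resumptionToken paging and split output once a sitemap file reaches its URL limit.

```python
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def extract_entries(oai_xml):
    """Return (url, datestamp) pairs from one ListRecords response (oai_dc)."""
    root = ET.fromstring(oai_xml)
    entries = []
    for record in root.iter(f"{{{OAI_NS}}}record"):
        datestamp = record.findtext(f"{{{OAI_NS}}}header/{{{OAI_NS}}}datestamp")
        for ident in record.iter(f"{{{DC_NS}}}identifier"):
            text = (ident.text or "").strip()
            # Assumption: the first http(s) dc:identifier is the landing page.
            if text.startswith(("http://", "https://")):
                entries.append((text, datestamp))
                break
    return entries


def build_sitemap(entries):
    """Serialize (url, datestamp) pairs as a Sitemap <urlset> string."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, datestamp in entries:
        node = ET.SubElement(urlset, "url")
        ET.SubElement(node, "loc").text = url
        if datestamp:
            # OAI-PMH datestamps are UTC dates or datetimes, usable as lastmod.
            ET.SubElement(node, "lastmod").text = datestamp
    return ET.tostring(urlset, encoding="unicode")
```

Records without a usable URL (deleted records, or ones whose identifiers are only handles or DOIs in non-URL form) are simply skipped here, which is probably the hard part to get right across "any" provider.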

Apr 24, 2008
Dorothea Salo

There's been an unofficial Google Sitemap hack for DSpace for half of forever. The funny thing was when its use was recommended to keep Google crawlers from crashing DSpace installs.

Apr 30, 2008
Herbert Van de Sompel

Google never really used OAI-PMH, so there's not much news here.
As to "used only by libraries", I note that the Linked Data community is coming up with creative approaches to leverage existing OAI-PMH infrastructure. And I also note that an illustrious web guru seems to quite like what he sees. I don't think I'll comment on the "clunky and overengineered" aspect of the protocol.