The main project I'm focused on at work right now relates to uniquely identifying the people who write and referee articles for our journals. Our referee database is pretty good, but even that has a number of duplicate entries, as I've been finding. In one case the name was the same but with the first and last names switched; in another we'd somehow created a record with a slightly modified version of the surname (and a note that the name was wrong). It's a lot trickier than one might at first imagine, but an open public solution for researcher identification could be a big help. There seem to be several projects in the works in that regard:
There are at least two general issues:
(1) Identity of authors of previously published articles
We think we can get most of our records linked in this way to unique instances of authors, but we're inevitably going to have some percentage of erroneous relationships: some papers linked to the wrong author, and some authors listed two or more times as being different people.
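To make the two kinds of duplicates mentioned earlier concrete (first and last names swapped, and a slightly altered surname), here is a rough sketch of how candidate pairs might be flagged for human review. It is not our actual matching code: the record layout `(id, first, last)`, the function names, and the 0.85 similarity threshold are all assumptions chosen for illustration, and the sample names are made up.

```python
import unicodedata
from difflib import SequenceMatcher
from itertools import combinations


def normalize(name):
    """Lowercase and strip accents/punctuation so 'Müller' and 'Muller' compare equal."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return "".join(c for c in no_accents.lower() if c.isalpha() or c == " ").strip()


def name_key(first, last):
    """Order-independent key: sorting the parts catches swapped first/last names."""
    return tuple(sorted((normalize(first), normalize(last))))


def likely_duplicates(records, threshold=0.85):
    """Yield (id_a, id_b, reason) for record pairs that look like the same person.

    records: iterable of (id, first, last) tuples (hypothetical layout).
    threshold: assumed cutoff for the difflib similarity ratio.
    """
    for (id_a, first_a, last_a), (id_b, first_b, last_b) in combinations(records, 2):
        key_a, key_b = name_key(first_a, last_a), name_key(first_b, last_b)
        if key_a == key_b:
            yield id_a, id_b, "same name (possibly swapped)"
            continue
        similarity = SequenceMatcher(None, " ".join(key_a), " ".join(key_b)).ratio()
        if similarity >= threshold:
            yield id_a, id_b, "similar name"


# Made-up records for illustration only.
records = [
    (1, "Jane", "Smith"),
    (2, "Smith", "Jane"),    # first and last names switched
    (3, "Jane", "Smyth"),    # slightly modified surname
    (4, "Robert", "Jones"),
]

for id_a, id_b, reason in likely_duplicates(records):
    print(id_a, id_b, reason)
```

Anything a check like this flags is only a candidate: two different people can share a name, which is exactly why some erroneous links are inevitable without an identifier the authors themselves control.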
ResearcherID in a sense doubles the problem - ISI makes no attempt to verify authorship claims, and it doesn't even seem to provide a way to uniquely list articles more than 10 years old (I uploaded all my 1995 and earlier publications there, and it doesn't seem to understand anything about them - no citation data, etc.). At least CrossRef would solve the "what article are you talking about" problem by ensuring unique article identifiers in the first place. And then, how do we uniquely associate our author records with a ResearcherID number? ISI doesn't currently provide a protocol for an individual to prove to a third party that they own a particular ResearcherID, and the web services they do provide are tied to Web of Science subscriptions. An individual can acquire more than one ResearcherID, and list subsets of the same publications under multiple identities, if they choose (or if they just forgot they'd registered previously).
AuthorClaim seems to be based on an internal database of articles (most of mine were not there, though it found over 1000 matching my name!), so it controls the article side of things - doing this with the CrossRef database and DOIs as article identifiers would make sense. That doesn't seem to be what it's doing yet, but at least articles should be uniquely identified there, so it doesn't have the ResearcherID problem on that front. On the other hand, it does seem to have similar issues with claims and potential for author duplication.
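Once articles are uniquely identified by DOI, unverified claims at least become comparable. As a sketch of the idea (not how ResearcherID or AuthorClaim actually work), suppose each claim is a tuple of identity, DOI, and position in the author list; that layout, the `rid:` labels, and the DOIs below are all hypothetical. Two identities claiming the same author slot on the same article are then either one person registered twice or a mistaken claim:

```python
from collections import defaultdict
from itertools import combinations

# Common ways a DOI gets written; the list is an assumption, not exhaustive.
DOI_PREFIXES = ("https://doi.org/", "http://doi.org/",
                "https://dx.doi.org/", "http://dx.doi.org/", "doi:")


def normalize_doi(doi):
    """Strip resolver prefixes and lowercase (DOIs are case-insensitive)."""
    doi = doi.strip()
    for prefix in DOI_PREFIXES:
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    return doi.lower()


def overlapping_identities(claims):
    """Count, per pair of identities, the author slots that both of them claim.

    claims: iterable of (identity, doi, author_position) tuples (hypothetical layout).
    Returns {(identity_a, identity_b): number_of_shared_slots}.
    """
    by_slot = defaultdict(set)
    for identity, doi, position in claims:
        by_slot[(normalize_doi(doi), position)].add(identity)

    shared = defaultdict(int)
    for identities in by_slot.values():
        for pair in combinations(sorted(identities), 2):
            shared[pair] += 1
    return dict(shared)


# Made-up claims: rid:A and rid:B both claim the same two author slots.
claims = [
    ("rid:A", "10.1234/abc.1", 1),
    ("rid:B", "doi:10.1234/ABC.1", 1),
    ("rid:A", "10.1234/abc.2", 2),
    ("rid:B", "https://doi.org/10.1234/abc.2", 2),
    ("rid:C", "10.1234/abc.2", 1),
]

print(overlapping_identities(claims))
```

A check like this can only surface conflicts; deciding whether a conflict is a duplicate registration or a wrong claim still needs verification that neither service provides.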
OpenID, generically, does not help either, although it does provide the third-party proof-of-ownership piece that ResearcherID is missing right now. An individual can have many different OpenIDs, just as they can have many different email addresses, and an OpenID associated with an individual is, in practice, probably about as useful as an email address for uniquely identifying them. We already have email addresses for essentially all our (corresponding) authors of the last decade, and for a good fraction of those from the last two decades, and it's still tough to figure out exactly who's who.
Unless there's some strong motive for researchers to stick to a unique non-shared ID in self-identifying, or other actors in the research system force such a unique ID somehow, this issue of duplicate records for older work is not going away.
(2) Identity of authors from the point of submission through publication/citation etc.
The requirements for handling issue (2) are technically straightforward but perhaps practically difficult due to the need for cooperation:
The requirements I'd suggest to get full authoring relationships for historical data, issue (1), are technically trickier but perhaps practically easier:
There's a lot of potential here, but it's going to be slow going without a widely used and agreed-upon author identifier that is unique, validatable, and permanent.