Where's the fraud?

In previous posts I have discouraged discussion of Michael Mann's work since I had not investigated it at all myself - but inevitably it came up anyway. There were a couple of interesting comments from Steve Mosher and AMac that I am highlighting in this post. If commenters here agree that the Tiljander case is the closest thing anybody has come up with to show consistent misbehavior by climate scientists (following the basic "fraud-like" criteria I set out), then I commit to looking into it myself and trying to understand why scientists and bloggers seem to be disagreeing about it. AMac's denial of "fraud" while calling it an "honest mistake" seems odd to me - if it's really an "honest mistake", it should be acknowledged, not repeated.

Or if folks here think the Tiljander case is not a real problem but some other hockey stick "trick" or feature is definitely fraudulent, I'll look there. Tell me what your best case is!

Just because scientists are human - that is, biased, inconsistent, lazy, argumentative, prone to mistakes and to playing "politics", and so on - does not make some piece of science a fraud. Scientists in their natural state are fiercely competitive with one another - recognition for solving some problem or being first to discover some new truth about the world is all that matters. Tearing down somebody else's work, if you're right, is always grounds for praise. As long as a piece of science makes some collection of predictions about the world and there are measurements to verify those predictions, then no matter what the biases or mistakes of the scientists involved - as long as they are not being deliberately fraudulent - the truth will prevail. Of course, without that check and balance from nature, even without fraud, science can get wildly speculative (*cough* string theory *cough*).

Human frailties can mar any piece of scientific work, and this shouldn't surprise anybody. The worry is that some pieces of work that people have come to respect and rely on have been, in some manner, fabricated and are themselves wrong. But fraud is hard to perpetuate in science - it almost always turns up later, when others try to do the same experiment or analysis over again and consistently get some different result. On the other hand, if there has not been any actual fraud, what's the problem? The science is still right, even if the scientists behaved abominably (and I've personally witnessed some pretty abominable stuff from people who received great honors...). That's sort of the beauty of the objectivity that science's intrinsic competition and constant reference to nature force on you: personalities really don't matter, only the truth does - it's only the thought that counts, as I wrote some time ago.

But - there have been cries of fraud. Let's try to get to the bottom of them. Here are 5 objective criteria for clear continuing fraud that I posted here in a comment the other day:

(1) A result (graph, table, number) presented in a peer-reviewed or IPCC article that was false - i.e. said to be one thing, but was actually something else. Incomplete presentation is not sufficient - I want to see something that was actually false (such as this AR4 case would have been if it had worked out). Truncation doesn't count unless they claimed to be presenting a whole series and clearly actively concealed the truncation. End-point smoothing doesn't count (for example the Briffa 2001/NCDC graph) unless they specified how they were handling the endpoints and did it differently. Etc.

(2) Where the falsity made a material difference to the overall message of the graph, table or number. That is, the visual or mental impact of the difference is obvious to a cursory look at the presentation, and doesn't require detailed magnification of the curve or looking at the last decimal point to see any difference.

(3) Where the problem, identified by blogger or scientist or whoever, has been presented in a clear manner demonstrating the "wrong" and "right" versions for all to see

(4) Where the original scientific group responsible has not responded with acknowledgment of the error and corrected the record as far as possible, and committed not to make the same mistake again

(5) Where the original group has in fact repeated the error more than once, after public disclosure of the problem.

I'm reposting here two lengthy responses to this outline, and encourage further discussion of these in the comments below:

From AMac:

Tiljander/Mann Fraud?

...Short answer: No, But.

This freestanding comment is a Reply to MikeN's "Interesting, I think" (Sun, 6/27/2010 - 00:36) and Arthur Smith's "On the 'my standard' question" (Sat, 06/26/2010 - 18:26). This seeming side-issue may illuminate some of the points being discussed in connection with the termination of the Briffa series in 1960.

Arthur listed 5 criteria in "On the 'my standard' question". Paraphrasing,

(1) A false result is presented in a peer-reviewed article or IPCC report.
(2) The falsity made a material difference to the overall message of a graph, table or number.
(3) The "wrong" and "right" versions of the identified problem have been presented in a clear manner.
(4) The authors have not acknowledged and corrected the error, or committed to not repeat the mistake.
(5) The authors have repeated the error, after public disclosure of the problem.

Fraud
Two definitions for "Fraud":

a: deceit, trickery; specifically: intentional perversion of truth in order to induce another to part with something of value or to surrender a legal right
b: an act of deceiving or misrepresenting: trick

We're in tricky [sic] territory already: accusers can mean (or claim to mean) that they're discussing "misrepresentation", but the charge of evil intent is present or nearby. Lack of care and precision in statements made by scientists and advocacy bloggers is one of the major polarizing factors in the AGW dispute, IMO. Steve covered this ground nicely in Climategate: Not Fraud, But 'Noble Cause Corruption' (also note the cries for blood in the comments).

It's tractable to evaluate what somebody wrote in a journal article, much less so to ascertain what was in their heart at the time of writing. To me, this says most "fraud" charges will be either wrong or unprovable. They'll always be red flags to a bull (bug or feature?).

Tiljander
As described in the Methods and SI of Mann08 (links here), Prof. Mann and co-authors set out to catalog and use non-dendro proxies that contain temperature information. They assembled candidates and looked at their behavior over the time of the instrumental record, 1850-1995. During this time of warming, the calculated mean temperature anomaly in most CRUtem cells (5 deg longitude x 5 deg latitude) rose. Proxies with parameters that also rose passed screening and progressed to the validation step (see the paper). The four measures (varve thickness, lightsum, X-Ray Density, and darksum) taken by Mia Tiljander from the lakebed varved sediments of Lake Korttajarvi, Finland, also passed validation, and thus were used in the two types of paleotemperature reconstructions (EIV and CPS) that make up the paper's results. The authors recognized potential problems with the Tiljander proxies, but used them anyway. Because of their length (extending back much earlier than 200 AD) and the strength of their "blade" signal (Willis Eschenbach essay), the proxies are important parts of the reconstructions.
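
As an illustrative aside on the screening step just described: the sketch below is a toy version of correlation-based proxy screening, not Mann08's actual procedure or code. The Pearson test, the 0.10 significance threshold, the single comparison series, and all of the names in it are assumptions made purely for illustration.

import numpy as np
from scipy.stats import pearsonr

def screen_proxy(proxy, instrumental, p_threshold=0.10):
    # Correlate a candidate proxy against the local instrumental
    # temperature-anomaly series over the calibration window and report
    # whether it passes a simple significance screen.
    r, p = pearsonr(proxy, instrumental)
    return {"r": r, "p": p, "passes": p < p_threshold}

# Purely synthetic data for an 1850-1995 calibration window.
years = np.arange(1850, 1996)
rng = np.random.default_rng(0)
instrumental = 0.005 * (years - 1850) + rng.normal(0.0, 0.2, years.size)
candidate = 2.0 * instrumental + rng.normal(0.0, 0.5, years.size)
print(screen_proxy(candidate, instrumental))

Note that a screen of this kind keys on the strength of the correlation, not on the physical interpretation or sign of the proxy; whether that indifference is acceptable is exactly what the "orientation" dispute discussed below turns on.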

The evidence is overwhelming that Prof. Mann and co-authors were mistaken in their belief that the Tiljander proxies could be calibrated to CRUtem temperature anomaly series, 1850-1995. The XRD proxy is discussed here. The issue was recently raised again by A-List climate scientist and RealClimate.org blogger Gavin Schmidt at Collide-a-Scape, in The Main Hindrance to Dialogue (and Detente). Gavin and Prof. Mann's other allies are unable to address the matters of substance that underlie this controversy; see my comment #132 in that thread.

Arthur's 5 Criteria and Mann08/Tiljander
(0) Mann08's use of the Tiljander proxies is not fraud, IMO. All evidence points to an honest mistake.

(1) False result presented in a peer-reviewed article? Yes.

(2) Falsity made a material difference to the overall message of [graphs]? Hotly contested. Mann08 has many added methodological problems, making it difficult to know (see comment #132 and critical posts linked here). IMO, this demonstrated failure of key Mann08 methods (screening and validation) calls the entire paper into question.

(3) Clear presentations of "wrong" and "right" versions of the identified problem? Hotly contested. Gavin believes that the twice-corrected, non-peer-reviewed Fig S8a shows that errors with Tiljander (if any) don't matter. I rebut that in comment #132 and in this essay.

(4) The authors have not acknowledged and corrected the error, or committed to not repeat the mistake. Yes. In their Reply published in PNAS in 2009, Mann et al. called the claims of improper use of the Tiljander proxies "bizarre."

(5) The authors have repeated the error, after public disclosure of the problem. Yes. Mann et al. (Science, 2009) again employed the Tiljander proxies Lightsum and XRD in their inverted orientations (ClimateAudit post); see lines 1063 and 1065 in "1209proxynames.xls" downloadable in zipped form from sciencemag.org (behind paywall).

Summary, and Lessons for the Briffa Series Truncations
The key issue is not fraud. Nor is it that authors of peer-reviewed articles make mistakes. Everybody--scientists, book authors, and climate-bloggers included--makes mistakes.

Instead, the important question is: Does climate science adhere to Best Practices? Appropriately, scientists and bloggers scrutinize articles that cast doubt on the Consensus view of AGW, as shown by Tim Lambert in the 2004 radians-not-degrees case. What about papers that support the Consensus view? Are such errors in those papers picked up? Do the authors correct those papers, too?

Best Practices don't mainly concern the detection of glaring, easily-understood errors like a radian/degree mixup or an upside-down proxy. There are a host of issues -- as there are with drug research, structural engineering, mission-critical software validation, and a large number of other areas. I won't enumerate them -- beyond a plea for the correct and rigorous use of statistical tools. Recent threads at Collide-a-scape are full of suggestions and insights on this question, from AGW Consensus advocate scientist Judith Curry, and many others.

The key to the Tiljander case is the defective response of the scientific establishment and the AGW-advocacy-blogging community. I think it teaches that paleoclimatology is a young science that has yet to establish Best Practices (as the concept is understood by other specialties, by regulators, or by the scientifically-literate lay public). To the extent that Best Practices should be obvious -- e.g. prompt acknowledgement and correction of glaring errors -- scientists' and institutions' responses merit a "D" or an "F" to this point.

Broadly speaking, I think scientifically-literate Lukewarmers and skeptics accept the analysis of the last few paragraphs. In contrast, opinion-setters in the climate science community and among AGW-Consensus-advocacy bloggers emphatically reject it.

IMO, these differing perceptions explain much of the gulf between the opening positions of Steve Mosher and Arthur Smith on the general question of the justification for the 1960 truncation(s) of the Briffa series, and on the specific question of Steve's error in ascribing the padding of the AR4 truncation to a splice with the instrumental record.

From Steve Mosher

I think this is a really important comment. It lets me describe the central thesis of the book and our view of things.

What the mails detail is the creation of a bunker mentality. This mentality is best illustrated by some of the mails written by Mann. Essentially it is a vision of a battle between climate scientists and skeptics. Us and them. I put aside the question of whether this mentality was justified or not. The important thing is that this mentality existed. Jones, in an interview after Climategate, confirms the existence of this mentality. I do not think there is any evidence that contradicts this observation. The mentality existed. It is reflected in the language and the actions. What I try to focus on is how this mentality shapes or informs certain behaviors. We struggled a great deal with the language to describe the behavior. Fraud was too strong a description. I would say, and did say, that the mentality eroded scientific ethics and scientific practices. It led to behaviors that do not represent "best practices." These behaviors should not be encouraged or excused. They should be fixed.

When we try to make this case we face two challenges. We face a challenge from those who want to scream fraud, and we face a challenge from those who want to defend every action these individuals took. Finding that middle road between "they are frauds" and "they did no wrong" was difficult, to say the least. In the end it's that middle ground that we want to claim. The mails do not change the science (we said that many times in the book), but the behaviors we see are not the best practices. We deserve better science, especially with the stakes involved. If our only standard is the standard you propose, then I don't think we get the best science. I'll just list the areas in which I think the bunker mentality led people to do things they would not ordinarily do. And things we would not ordinarily excuse.

A. Journals. There are a few examples where the mails show the small group engaging in, or contemplating, behaviors that don't represent best practices.

1. Suggesting that "files" should be kept on journal editors who make editorial decisions you don't agree with
2. Influencing reviewers of papers.
3. Inventing a new category ("provisionally accepted") for one paper so that it can be used by the IPCC.

B. Data archiving and access.
1. Sharing data that is confidential with some researchers while not sharing it with others. If it's confidential, it's confidential. If it's not, then it's not.
2. Failing to archive data.

C. Code sharing.

1. Withholding code when you know that the code differs from the method described in the paper, and correspondents cannot replicate your results because of this discrepancy - and you know they cannot replicate BECAUSE of this failure of the paper to describe the code completely.

D. Administrative failures.
1. Failure to discharge one's administrative duties. see FOIA.

E. Failure to faithfully describe the total uncertainties in an analysis.

As you can see, and as we argue, none of these touches the core science. What we do argue is this: the practices we can see in the mails do not constitute the best practices. I've argued at conservative sites that this behavior did not rise to the level of fraud, and I took my lumps for failing to overcharge the case. On the other hand, those who believe in AGW (as we do) are unwilling to acknowledge any failings. We were heartened by Judith Curry's call for a better science moving forward. We think that the behaviors exhibited do not represent the best science. We think we can and should do better. The gravity of the issue demands it. So on one side we hear the charges of fraud. That's extreme. On the other side we hear a misdirection from the core issue. When we point out that best practices would require code and data sharing, for example, the answer is "the science is sound." We don't disagree. What we say is that the best path forward is transparency and openness. Acknowledge that the decisions made were not the best and pledge to change things going forward.

Concern E is the heart of the matter WRT chapter 6 of WG1. In our view Briffa was put under pressure to overstate the case. That's not fraud. It's not perpetuating false statements. If you study the mails WRT the authoring of that chapter, you will come away with the impression that Briffa was under pressure to overstate the certainty. That doesn't make AGW false. It cannot. It is, however, a worrisome situation.

Here is Rind advising the writing team.
"pp. 8-18: The biggest problem with what appears here is in the handling of the greater
variability found in some reconstructions, and the whole discussion of the 'hockey stick'.
The tone is defensive, and worse, it both minimizes and avoids the problems. We should
clearly say (e.g., page 12 middle paragraph) that there are substantial uncertainties that
remain concerning the degree of variability - warming prior to 12K BP, and cooling during
the LIA, due primarily to the use of paleo-indicators of uncertain applicability, and the
lack of global (especially tropical) data. Attempting to avoid such statements will just
cause more problems.
In addition, some of the comments are probably wrong - the warm-season bias (p.12) should
if anything produce less variability, since warm seasons (at least in GCMs) feature smaller
climate changes than cold seasons. The discussion of uncertainties in tree ring
reconstructions should be direct, not referred to other references - it's important for
this document. How the long-term growth is factored in/out should be mentioned as a prime
problem. The lack of tropical data - a few corals prior to 1700 - has got to be discussed.
The primary criticism of McIntyre and McKitrick, which has gotten a lot of play on the
Internet, is that Mann et al. transformed each tree ring prior to calculating PCs by
subtracting the 1902-1980 mean, rather than using the length of the full time series (e.g.,
1400-1980), as is generally done. M&M claim that when they used that procedure with a red
noise spectrum, it always resulted in a 'hockey stick'. Is this true? If so, it constitutes
a devastating criticism of the approach; if not, it should be refuted. While IPCC cannot be
expected to respond to every criticism a priori, this one has gotten such publicity it
would be foolhardy to avoid it.
In addition, there are other valid criticisms to the PC approach....."

The PARTICULARS of this are unimportant. What matters is Rind's advice about treating uncertainties in a forthright manner. All of our criticism of Briffa can be summed up in one sentence: he didn't give the most forthright description of the uncertainties. That's it. Whether it was his treatment of McIntyre's paper or his failure to disclose the truncated data in the clearest manner, that is the take-home point we want to stress.
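
An illustrative aside on the McIntyre and McKitrick point in the Rind excerpt above, since "short-centering" is easier to see in code than in prose: the sketch below is a toy re-creation, not M&M's or Mann's code, and the AR(1) coefficient, the number of series, and the "hockey stick index" used to summarize the result are all assumptions. It illustrates the claim that centering each series on the calibration-period mean alone, rather than on the full-period mean, can pull a hockey-stick-shaped leading principal component out of pure red noise.

import numpy as np

def make_red_noise(n_series, n_years, phi=0.9, seed=0):
    # AR(1) "red noise" pseudo-proxies containing no climate signal at all.
    rng = np.random.default_rng(seed)
    x = np.zeros((n_years, n_series))
    for t in range(1, n_years):
        x[t] = phi * x[t - 1] + rng.normal(0.0, 1.0, n_series)
    return x

def leading_pc(data, center_rows):
    # Center each series on its mean over `center_rows` only, then take the
    # leading principal component of the centered data matrix.
    centered = data - data[center_rows].mean(axis=0)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    return u[:, 0] * s[0]

def hockey_stick_index(pc, calib_rows):
    # How far the calibration-period mean of a PC departs from its
    # full-period mean, in standard deviations (sign ignored).
    return abs(pc[calib_rows].mean() - pc.mean()) / pc.std()

years = np.arange(1400, 1981)     # 1400-1980
calib = slice(-79, None)          # last 79 years, roughly 1902-1980
noise = make_red_noise(n_series=70, n_years=years.size)

pc1_short = leading_pc(noise, calib)         # the "short" centering at issue
pc1_full = leading_pc(noise, slice(None))    # conventional full-period centering

print("hockey stick index, short-centered:", hockey_stick_index(pc1_short, calib))
print("hockey stick index, full-centered: ", hockey_stick_index(pc1_full, calib))

Comparing the two printed indices over repeated runs is one miniature way to check the "Is this true?" question Rind poses; it says nothing, of course, about whether real proxy data behave like red noise.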

However, I'm not sure I understand why Steve is reluctant to cry "fraud" - or AMac for that matter. As I noted at the start, if "Tiljander" (whatever that means, I have not looked into it at all myself) makes a difference and they're still doing it wrong without acknowledging the error, then either the error hasn't been explained in a clear enough manner (showing how it makes any difference), or that's real fraud. Same with Steve Mosher's complaint about what Briffa was "forced" into (whether or not the emails provide enough context for these conclusions I can't say either - again, I haven't looked into it myself). But if what Steve says here is true, then there was a numerical, quantitative parameter - uncertainty - that was mis-stated in the IPCC reports. A proper analysis would show a different number. If Steve is right, somebody should be able to do that proper analysis, get the right number, and show how it makes a difference. Persistence in using the wrong number after it's been shown there's a correct one would again be an instance of continued, repeated fraud.

So - is it there, or not? What's the best case? Comments welcome on Mann in particular here, thanks!

Comments


The program's code was

The program's code was archived online. The code shows R2 verification being calculated. In addition, Mann reported R2 verification for the 1820 step of his reconstruction. What about my accusation makes no sense?

What seems to make no sense to me is your comment, "If not, why all the doubts about what he said?" There is no doubt about what he said. I have no idea where you get this from. He said he did not calculate R2 verification scores. That was a lie.

I have no idea why you think this is a matter of, "he said, but we think he should have said...." It isn't. This is a matter of Mann lying, and being caught on it.

Arthur, this is another

Arthur, this is another popular denialist misrepresentation. So what if r^2 is in the code... it is (usually) inappropriate for testing the skill of a reconstruction, and so wasn't used, like the man said. More appropriate are RE/CE.

Why this is so is clearly explained in the appendix of

http://www.cgd.ucar.edu/ccr/ammann/millennium/refs/Wahl_ClimChange2007.pdf

With pictures :-)

This response is nonsense.

This response is nonsense. The first thing to note is that this claim of R2 verification being "inappropriate" for reconstructions is a (relatively) new one. It was not advanced in Mann's work. It was not even raised until years later, as a supposed justification for Mann withholding adverse test results. If Mann felt the R2 scores weren't meaningful, he could have displayed the results and explained why they didn't matter. He did not. He hid the results, then denied calculating them.

The idea that the R2 verification scores were "inappropriate" could not possibly justify Mann lying about calculating them. Moreover, I don't accept the idea of R2 being "inappropriate." While you could say this is a legitimate dispute over a statistical methodology, you would be wrong. Remember, Mann listed the R2 verification score for the 1820 step of his reconstruction.

If R2 verification is "inappropriate," obviously it shouldn't have been used in the paper at all. One obviously cannot argue it was "inappropriate" when it gave adverse results, but "appropriate" when it gave good results.

Mann calculated the R2 verification scores. In his paper, he included a good result from it. He also hid the bad results from it. There is no excuse.
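
To make the statistics being argued over here concrete: r^2, RE, and CE are all computed over a verification window held out from calibration. The sketch below uses the standard definitions of these statistics but is an illustration with invented toy numbers, not code from Mann's paper or from Wahl and Ammann. It shows how a reconstruction that tracks the shape of the observations but carries a constant bias can score a high r^2 while failing RE and CE badly, which is the usual argument for preferring RE/CE.

import numpy as np

def verification_stats(obs_verif, rec_verif, obs_calib_mean):
    # r^2: squared correlation between reconstruction and observations.
    # RE:  1 - SSE relative to a "no-knowledge" prediction equal to the
    #      calibration-period mean of the observations.
    # CE:  1 - SSE relative to the verification-period mean (a stricter test).
    err = np.sum((obs_verif - rec_verif) ** 2)
    r = np.corrcoef(obs_verif, rec_verif)[0, 1]
    re = 1.0 - err / np.sum((obs_verif - obs_calib_mean) ** 2)
    ce = 1.0 - err / np.sum((obs_verif - obs_verif.mean()) ** 2)
    return {"r2": r ** 2, "RE": re, "CE": ce}

# Invented toy example: a reconstruction with the right shape but a constant
# 0.4-degree offset over a 50-year verification window.
rng = np.random.default_rng(1)
obs = np.linspace(-0.3, 0.3, 50) + rng.normal(0.0, 0.05, 50)
rec = obs + 0.4 + rng.normal(0.0, 0.05, 50)
print(verification_stats(obs, rec, obs_calib_mean=0.0))
# Roughly: r2 near 0.9, RE and CE strongly negative.

Whether r^2 or RE/CE is the right yardstick for any particular reconstruction is the substantive dispute; the sketch only shows that the two kinds of score can disagree sharply.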

The takeaway point of all of

The takeaway point of all of this is that if you don't use bad data, you don't get his results. Nobody has admitted this.

Because it's not true!

Remove Tiljander and remove all tree-ring proxies -- remember, that includes the bristlecone pines -- and you still get a solid hockey stick at least back to 1000 AD... in spite of having flushed most of your data down the drain. Look at the plots, man!

I have looked at the plots.

I have looked at the plots. In the original plot, the temperatures around 1000 AD barely touched the baseline (0 degrees). The recent temperatures fell shy of .3 degrees. In the version without the questionable proxies, the 1000 AD temperatures are all the way up to .4 degrees.

A hockey stick requires a flat shaft which ends in an upward curve. If the temperatures around 1000 AD were greater than the temperatures in recent times, obviously you do not have a flat shaft. This means there is no hockey stick.

The temperatures in the past come out higher than recent temperatures if you remove the questionable proxies. This means those proxies are the basis for the conclusion.

Do you really want to get

Do you really want to get into dendro issues, as to what is or is not a valid proxy? That is a very large field. Better to stick to math and algorithm analysis. Nevertheless, the NAS Report is available on the subject. McIntyre says the follow-on papers they cite also use the bristlecones that they reject. On the fraud level, this was a key point for Atte Korhola, as Mann had a directory with analysis of one of his papers without bristlecones that showed a great deal of sensitivity.

If it is a question of mere

If it is a question of mere scientific opinion on "what is valid", then that's one thing. If there's objective evidence one way or another on "what is valid", that's another. I'm looking for the objective facts on this, if any. Points of legitimate scientific debate certainly should not be the issue; eventually they will be resolved one way or another, or perhaps have been since the original articles. If somebody persists in using something "invalid" after the legitimate scientific debate on it is over, that's the serious issue here.

> people put up dozens of

> people put up dozens of slightly different claims ... but rarely do they seem to get very specific about it. Much of it boils
> down to opinion about choices that could have been made one way or another by any reasonable person at the time.
> ... Be very specific in what you are claiming the problem is ...

cf http://simondonner.blogspot.com/2009/10/climate-science-filibuster.html

Brandon Shollenberger -- Hank

Brandon Shollenberger -- Hank Roberts makes a useful point in this "people put up dozens of slightly different claims" comment and cite from Simon Donner.

In trying to communicate across the divide, we have one group that is "too skeptical" about A and "too credulous" about B; another group that is "too skeptical" about B and "too credulous" about A.

Where A = the claims of mainstream climate scientists, and B = the claims of climate-science critics.

Arthur Smith has asked for specific, concrete examples of critics' grievances, to examine and investigate. This is perforce going to place a lot of possibly weak mainstream climate science out of bounds. In other words, passing this test doesn't mean that mainstream climate science is "out of the woods," i.e. shown to be comparably robust to better-established and less-politicized physical sciences. But that is simply my opinion, and one that is emphatically not shared by many readers here. I don't have the knowledge or the expertise or the time to make a general case that would convince a skeptic on this point of my opinion.

So Arthur's approach of focusing on verifiable details has the potential to be useful to all parties, though we don't know how the exercise will end: maybe with added insight as to substance or process for one "tribe", or for both tribes. Or maybe for neither.

Hank Roberts: As you note in the Michael Mann's errors thread, you commented on Tiljander back in October, and so presumably know the arguments. What is your view of these two questions:

"Are the four Tiljander proxies calibratable to the instrumental temperature record, 1850-1995?"

and

If Mann08’s use of the Tiljander proxies was valid, they necessarily were used in an extraordinary fashion: in contradiction to the original specialist authors’ interpretations for lightsum and XRD.

"Is it acceptable scientific practice for Mann08’s Methods section to be silent on these highly unconventional uses of the Tiljander proxies?"

AMac, you say, "So Arthur's

AMac, you say, "So Arthur's approach of focusing on verifiable details has the potential to be useful to all parties, though we don't know how the exercise will end: maybe with added insight as to substance or process for one "tribe", or for both tribes. Or maybe for neither."

I agree with the idea and approach. Focusing on single issues can be a good way to reach agreement, or at least understanding. That is why I gave the example I gave. It is a simple issue. When it was revealed Mann's reconstruction failed R2 verification, Mann claimed he hadn't calculated the R2 verification scores. This claim was repeated many times. It was untrue.

The issue is extremely simple. There are three facts at hand. I can't imagine a simpler example to start on.

Brandon, it would help if you

Brandon, it would help if you had sources for your facts. Right now, we have 3 assertions:
Mann's code included R2 verification calculation (I thought the issue was he didn't release the code?)
Mann's reconstruction failed R2
Mann claimed to not calculate R2

I would be glad to provide

I would be glad to provide references for anything I talk about. I didn't provide references in my initial post because it is hard to know what things people will want references for. For example, a reference on Mann's code would probably be wanted by everyone. On the other hand, do I need to provide a reference when I say the caption for a figure in Mann's paper discusses the R2 verification score of the 1820 step of his reconstruction (Figure 3)? Is it reasonable for me to expect people here to already have Mann's paper? This sort of uncertainty is why I prefer to list my claims in a clear manner, then let people ask for references. So, moving on to the claims:

Mann's code included R2 verification calculation (I thought the issue was he didn't release the code?)

Before providing a reference, I'd like to clear up something. As you mention, Mann refused to release his code. This was problematic as the procedural descriptions in his paper were not sufficient to know how he handled some things. Indeed, some of what he did was not documented in anything he published (I would rather not discuss this part right now as it would just sidetrack us. We can revisit it later if anyone would like). Without Mann's code, it was impossible to reconcile his work with attempts to replicate it.

However, Mann did archive some (not all) of his code later on. While this can all be confusing at first glance, the two issues are separate. Now then, you can download the code being discussed here: ftp://holocene.evsc.virginia.edu/pub/MANNETAL98/METHODS/multiproxy.f
Of note is a ClimateAudit post: http://climateaudit.org/2005/07/23/cross-validation-r2-source-code-refer...

Mann's reconstruction failed R2

There are many possible references I could provide for this as it has been discussed many times. However, I think the most convincing reference comes from a paper by Ammann and Wahl. They wrote a paper in response to McIntyre and McKitrick, defending Mann's work. Their paper lists the R2 verification scores for Mann's reconstruction. Oddly enough, a link to this paper was provided in response to one of my posts just a little while ago by Martin Vermeer (see Table 1S): http://www.cgd.ucar.edu/ccr/ammann/millennium/refs/Wahl_ClimChange2007.pdf

Mann claimed to not calculate R2

Offhand, the best reference I can provide is this: http://climateaudit.org/2006/03/16/mann-at-the-nas-panel/ If you'd like, I can probably track down something more tangible. Unlike the other two claims I made, I have never heard this claim disputed, so I never thought to keep track of a source for it. I don't think it really matters, because even if you disregard this source, and even this whole point, Mann's behavior would still be horribly inappropriate. The most you could say by throwing this out is that Mann calculated the R2 verification score, saw it was bad, and simply refused to ever disclose it. In other words, whether or not he lied about the paper later doesn't change the paper, and the paper is the main issue.

If there are any other things you would like references for, feel free to ask.

Edit: Peculiar thing. The link to Mann's code seems not to be working. I downloaded it two days ago without any problem, and the link had been up for years. I don't know if it is a temporary problem, or if they decided to take it down for some reason. If nothing else, I can e-mail the copy I downloaded to anyone who wants it.

Your post probably killed the

Your post probably killed the link. It was probably one of those special directories that Mann set up, that the IT department doesn't want open. ftp://holocene.evsc.virginia.edu?

http://www.meteo.psu.edu/~mann/shared/research/MANNETAL98/

Compare this to your code.

That seems to be the same.

That seems to be the same. The METHODS directory contains a file by the same name and size. When I downloaded it, it appeared to be identical.

With that resolved, is there anything else which needs to be covered? Given the above references, does anyone dispute that Mann calculated the R2 verification scores, reported those which were good for his paper, withheld those which were adverse, then lied about having ever calculated the R2 verification scores?

It seems pretty straightforward, though I will be glad to offer clarification or information if any is needed.

I haven't examined this in

I haven't examined this in detail, and I don't intend to until I've looked through the Tiljander case. If you're sure on this R2 business, I'll take a look later. But the problems with your assertions on this are:

* Does the code in question necessarily calculate the scores you are talking about every time it's run?
* If certain input parameters or settings are needed to do that calculation, how can you know whether Mann ran it with those settings?
* Even if the code clearly was run to calculate these scores, how do you know Mann looked at them?

Because only if he both calculated and looked at the results can you call his statements on this a lie.

I think he's saying that Mann

I think he's saying that Mann reported some R2 results, so it makes sense to think he calculated all of them. There are different steps in the program, back to 1800, back to 1700, etc.

Oh come on. Mann reported

Oh come on. Mann reported the R2 verification scores for the 1820 step of his reconstruction. He put it in his paper.

You don't even need to look at his code to know he calculated R2 verification scores.

Yes, Gavin Schmidt's response

Yes, Gavin Schmidt's response to the pending questions on Tiljander is at the C-a-s post that Michael Tobis links.

That post was linked and discussed earlier in this thread, at "Well you did get" and at "This is a response to Martin".

I've paraphrased Gavin's (and C-a-s commenter toto's) response to the question,

Are the Tiljander proxies calibratable to the instrumental record, 1850-1995?

as, "I don't know, and it doesn't matter."

I think each reader should consider whether this is an acceptable answer to a scientific issue (as opposed to, say, an adequate response to some courtroom examination).

As a thought experiment, consider Tim Lambert's discovery of the Methods error made by McKitrick and Michaels, mixing up Degrees and Radians. Would a defense of McKitrick and Michaels that ran "I don't know, and it doesn't matter" be acceptable to you?

In that case, McKitrick and Michaels acknowledged their error. Isn't that a better outcome than settling for "I don't know"? Rather than claiming "and it doesn't matter," McKitrick and Michaels recalculated their findings and the associated statistics. So we can check the P values in the corrected manuscript, and see whether "it matters" (it significantly weakens their findings, in my opinion).

Advocates of "I don't know, and it doesn't matter" should show what distinguishes calls for McKitrick and Michaels to investigate their possible error on radians, from calls for Mann et al. to investigate their possible error on Tiljander. In my opinion, both are pro-science developments.

This exchange upthread (Reply to "Hmm. So, Kaufmann") with Martin Vermeer shows another subtlety with the "I don't know" part of Gavin's answer. "I don't know" could mean one of several related things:

1. "I don't know--I haven't investigated the issue."
2. "I don't know--having looked into it, I haven't come to a conclusion yet."
3. "I don't know--I've looked into it, but I lack the interest/expertise/time/etc. needed to draw a conclusion."
4. "I don't know--it's not one of those questions that I or anyone else can answer definitively."

So I should have followed up with Gavin by asking, "Does 'I don't know' mean #1, #2, #3, #4, or something else?"

I just noticed that an easy

I just noticed that an easy way to read the threads when they are in low volume is to click "more" in the left column, which shows the most recent posts.

While perhaps the behavior

While perhaps the behavior meets your definitions, I don't think Mann is guilty of fraud. If he were guilty, why is he releasing his code so people can see how guilty he is? His code and data publications are much better in his later papers than in his early ones. He is not willing to cede any ground to Steve McIntyre, yet his behaviour in later papers shows him adopting Steve's critiques, Tiljander excepted. We find a sensitivity study without bristlecones, dropping PC analysis, more code release, etc.