Where's the fraud?

In previous posts I have discouraged discussion of Michael Mann's work since I had not investigated it at all myself - but inevitably it came up anyway. There were a couple of interesting comments from Steve Mosher and AMac that I am highlighting in this post. If commenters here agree that the Tiljander case is the closest thing anybody has come up with to show consistent misbehavior by climate scientists (following the basic "fraud-like" criteria I set out) then I commit to looking into it myself and trying to understand why scientists and bloggers seem to be disagreeing about it. AMac's denial of "fraud" while calling it an "honest mistake" seems odd to me - if it's really an "honest mistake" it should be acknowledged, not repeated.

Or if folks here think the Tiljander case is not a real problem but some other hockey stick "trick" or feature is definitely fraudulent, I'll look there. Tell me what your best case is!

Just because scientists are human - that is, biased, inconsistent, lazy, argumentative, mistake-prone, inclined to play "politics", etc. - that does not make some piece of science a fraud. Scientists in their natural state are fiercely competitive with one another - recognition for solving some problem or being first to discover some new truth about the world is all that matters. Tearing down somebody else's work, if you're right, is always grounds for praise. As long as there is some collection of predictions about the world from a piece of science and measurements to verify those predictions, then no matter what the biases or mistakes of the scientists involved, as long as they are not being deliberately fraudulent, the truth will prevail. Of course, without that check and balance from nature, even without fraud, science can get wildly speculative (*cough* string theory *cough*).

Human frailties can mar any piece of scientific work, and this shouldn't surprise anybody. The worry is that some pieces of work that people have come to respect and rely on have been, in some manner, fabricated and are themselves wrong. But fraud is hard to perpetuate in science - it almost always turns up later when others try to do the same experiment or analysis over again and consistently get some different result. On the other hand, if there has not been any actual fraud, what's the problem? The science is still right, even if the scientists behaved abominably (and I've personally witnessed some pretty abominable stuff from people who received great honors...). That's sort of the beauty of the objectivity that the intrinsic competition and reference to nature of science forces on you: personalities really don't matter, only the truth does - it's only the thought that counts as I wrote some time ago.

But - there have been cries of fraud. Let's try to get to the bottom of them. Here are 5 objective criteria for clear continuing fraud that I posted here in a comment the other day:

(1) A result (graph, table, number) presented in a peer-reviewed or IPCC article that was false - i.e. said to be one thing, but was actually something else. Incomplete presentation is not sufficient - I want to see something that was actually false (such as this AR4 case would have been if it had worked out). Truncation doesn't count unless they claimed to be presenting a whole series and clearly actively concealed the truncation. End-point smoothing doesn't count (for example the Briffa 2001/NCDC graph) unless they specified how they were handling the endpoints and did it differently. Etc.

(2) Where the falsity made a material difference to the overall message of the graph, table or number. That is, the visual or mental impact of the difference is obvious to a cursory look at the presentation, and doesn't require detailed magnification of the curve or looking at the last decimal point to see any difference.

(3) Where the problem, identified by blogger or scientist or whoever, has been presented in a clear manner demonstrating the "wrong" and "right" versions for all to see.

(4) Where the original scientific group responsible has not responded with acknowledgment of the error and corrected the record as far as possible, and committed not to make the same mistake again.

(5) Where the original group has in fact repeated the error more than once, after public disclosure of the problem.

I'm reposting here two lengthy responses to this outline, and encourage further discussion of these in the comments below:

From AMac:

Tiljander/Mann Fraud?

...Short answer: No, But.

This freestanding comment is a Reply to MikeN's "Interesting, I think" (Sun, 6/27/2010 - 00:36) and Arthur Smith's "On the 'my standard' question" (Sat, 06/26/2010 - 18:26). This seeming side-issue may illuminate some of the points being discussed with the termination of the Briffa series in 1960.

Arthur listed 5 criteria in "On the 'my standard' question". Paraphrasing,

(1) A false result is presented in a peer-reviewed article or IPCC report.
(2) The falsity made a material difference to the overall message of a graph, table or number.
(3) The "wrong" and "right" versions of the identified problem have been presented in a clear manner.
(4) The authors have not acknowledged and corrected the error, or committed to not repeat the mistake.
(5) The authors have repeated the error, after public disclosure of the problem.

Fraud
Two definitions for "Fraud":

a: deceit, trickery; specifically: intentional perversion of truth in order to induce another to part with something of value or to surrender a legal right
b: an act of deceiving or misrepresenting: trick

We're in tricky [sic] territory already: accusers can mean (or claim to mean) that they're discussing "misrepresentation", but the charge of evil intent is present or nearby. Lack of care and precision in statements made by scientists and advocacy bloggers is one of the major polarizing factors in the AGW dispute, IMO. Steve covered this ground nicely in Climategate: Not Fraud, But 'Noble Cause Corruption' (also note the cries for blood in the comments).

It's tractable to evaluate what somebody wrote in a journal article, much less so to ascertain what was in their heart at the time of writing. To me, this says most "fraud" charges will be either wrong or unprovable. They'll always be red flags to a bull (bug or feature?).

Tiljander
As described in the Methods and SI of Mann08 (links here), Prof. Mann and co-authors set out to catalog and use non-dendro proxies that contain temperature information. They assembled candidates and looked at behavior over the time of the instrumental record, 1850-1995. During this time of warming, the calculated mean temperature anomaly in most CRUtem cells (5 deg longitude x 5 deg latitude) rose. Proxies with parameters that also rose passed screening and progressed to the validation step (see the paper). The four measures (varve thickness, lightsum, X-Ray Density, and darksum) taken by Mia Tiljander from the lakebed varved sediments of Lake Korttajarvi, Finland also passed validation, and thus were used in the two types of paleotemperature reconstructions (EIV and CPS) that make up the paper's results. The authors recognized potential problems with the Tiljander proxies, but used them anyway. Because of their length (extending much earlier than 200 AD) and the strength of their "blade" signal (Willis Eschenbach essay), the proxies are important parts of the reconstructions.

The evidence is overwhelming that Prof. Mann and co-authors were mistaken in their belief that the Tiljander proxies could be calibrated to the CRUtem temperature anomaly series, 1850-1995. The XRD proxy is discussed here. The issue was recently raised again by A-List climate scientist and RealClimate.org blogger Gavin Schmidt at Collide-a-Scape, The Main Hindrance to Dialogue (and Detente). Gavin and Prof. Mann's other allies are unable to address the matters of substance that underlie this controversy; see my comment #132 in that thread.

Arthur's 5 Criteria and Mann08/Tiljander
(0) Mann08's use of the Tiljander proxies is not fraud, IMO. All evidence points to an honest mistake.

(1) False result presented in a peer-reviewed article? Yes.

(2) Falsity made a material difference to the overall message of [graphs]? Hotly contested. Mann08 has many added methodological problems, making it difficult to know (see comment #132 and critical posts linked here). IMO, this demonstrated failure of key Mann08 methods (screening and validation) calls the entire paper into question.

(3) Clear presentations of "wrong" and "right" versions of the identified problem? Hotly contested. Gavin believes that the twice-corrected, non-peer-reviewed Fig S8a shows that errors with Tiljander (if any) don't matter. I rebut that in comment #132 and in this essay.

(4) The authors have not acknowledged and corrected the error, or committed to not repeat the mistake. Yes. In their Reply published in PNAS in 2009, Mann et al. called the claims of improper use of the Tiljander proxies "bizarre."

(5) The authors have repeated the error, after public disclosure of the problem. Yes. Mann et al. (Science, 2009) again employed the Tiljander proxies Lightsum and XRD in their inverted orientations (ClimateAudit post); see lines 1063 and 1065 in "1209proxynames.xls" downloadable in zipped form from sciencemag.org (behind paywall).

Summary, and Lessons for the Briffa Series Truncations
The key issue is not fraud. Nor is it that authors of peer-reviewed articles make mistakes. Everybody--scientists, book authors, and climate-bloggers included--makes mistakes.

Instead, the important question is: Does climate science adhere to Best Practices? Appropriately, scientists and bloggers scrutinize articles that cast doubt on the Consensus view of AGW, as shown by Tim Lambert in the 2004 radians-not-degrees case. What about papers that support the Consensus view? Are such errors in those papers picked up? Do the authors correct those papers, too?

Best Practices don't mainly concern the detection of glaring, easily-understood errors like a radian/degree mixup or an upside-down proxy. There are a host of issues -- as there are with drug research, structural engineering, mission-critical software validation, and a large number of other areas. I won't enumerate them -- beyond a plea for the correct and rigorous use of statistical tools. Recent threads at Collide-a-scape are full of suggestions and insights on this question, from AGW Consensus advocate scientist Judith Curry, and many others.

The key to the Tiljander case is the defective response of the scientific establishment and the AGW-advocacy-blogging community. I think it teaches that paleoclimatology is a young science that has yet to establish Best Practices (as the concept is understood by other specialties, by regulators, or by the scientifically-literate lay public). To the extent that Best Practices should be obvious -- e.g. prompt acknowledgement and correction of glaring errors -- scientists' and institutions' responses merit a "D" or an "F" to this point.

Broadly speaking, I think scientifically-literate Lukewarmers and skeptics accept the analysis of the last few paragraphs. In contrast, opinion-setters in the climate science community and among AGW-Consensus-advocacy bloggers emphatically reject it.

IMO, these differing perceptions explain much of the gulf between the opening positions of Steve Mosher and Arthur Smith on the general question of the justification for the 1960 truncation(s) of the Briffa series, and on the specific question of Steve's error in ascribing the padding of the AR4 truncation to a splice with the instrumental record.

From Steve Mosher

I think this is a really important comment. It lets me describe the central thesis of the book and our view of things.

What the mails detail is the creation of a bunker mentality. This mentality is best illustrated by some of the mails written by Mann. Essentially it is a vision of a battle between climate scientists and skeptics. Us and them. I put aside the question of whether this mentality was justified or not. The important thing is that this mentality existed. Jones in an interview after climategate confirms the existence of this mentality. I do not think there is any evidence that contradicts this observation. The mentality existed. It is reflected in the language and the actions. What I try to focus on is how this mentality shapes or informs certain behaviors. We struggled a great deal with the language to describe the behavior. Fraud was too strong a description. I would say and did say that the mentality eroded scientific ethics and scientific practices. It led to behaviors that do not represent "best practices." These behaviors should not be encouraged or excused. They should be fixed.

When we try to make this case we face two challenges. We face a challenge from those who want to scream fraud and we face a challenge from those who want to defend every action these individuals took. Finding that middle road between "they are frauds" and "they did no wrong" was difficult to say the least. In the end it's that middle ground that we want to claim. The mails do not change the science (said that many times in the book), but the behaviors we see are not the best practices. We deserve better science, especially with the stakes involved. If our only standard is the standard you propose, then I don't think we get the best science. I'll just list the areas in which I think the bunker mentality led people to do things they would not ordinarily do. And things we would not ordinarily excuse.

A. Journals. There are a few examples where the mails show the small group engaging in behaviors or contemplating behaviors that don't represent best practices.

1. Suggesting that "files" should be kept on journal editors that make editorial decisions you don't agree with.
2. Influencing reviewers of papers.
3. Inventing a new category ("provisionally accepted") for one paper so that it can be used by the IPCC.

B. Data archiving and access.
1. Sharing data that is confidential with some researchers while not sharing it with others. If it's confidential, it's confidential. If it's not, then it's not.
2. Failing to archive data.

C. Code sharing.

1. Withholding code when you know that the code differs from the method described in the paper and correspondents cannot replicate your results because of this discrepancy. And you know they cannot replicate BECAUSE of this failure of the paper to describe the code completely.

D. Administrative failures.
1. Failure to discharge one's administrative duties. See FOIA.

E. Failure to faithfully describe the total uncertainties in an analysis.

As you can see, and as we argue, none of these touches the core science. What we do argue is this. The practices we can see in the mails do not constitute the best practices. I've argued at conservative sites that this behavior did not rise to the level of fraud. And I took my lumps for failing to overcharge the case. On the other hand, those who believe in AGW (as we do), are unwilling to acknowledge any failings. We were heartened by Judith Curry's call for a better science moving forward. We think that the behaviors exhibited do not represent the best science. We think we can and should do better. The gravity of the issue demands it. So on one side we hear the charges of fraud. That's extreme. On the other side we hear a misdirection from the core issue. When we point out that best practices would require code and data sharing, for example, the answer is "the science is sound." We don't disagree. What we say is that the best path forward is transparency and openness. Acknowledge that the decisions made were not the best and pledge to change things going forward.

Concern E is the heart of the matter WRT chap 6 of WG1. On our view Briffa was put under pressure to overstate the case. That's not fraud. It's not perpetuating false statements. If you study the mails WRT the authoring of that chapter you will come away with the impression that Briffa was under pressure to overstate the certainty. That doesn't make AGW false. It cannot. It is however a worrisome situation.

Here is Rind advising the writing team.
"pp. 8-18: The biggest problem with what appears here is in the handling of the greater
variability found in some reconstructions, and the whole discussion of the 'hockey stick'.
The tone is defensive, and worse, it both minimizes and avoids the problems. We should
clearly say (e.g., page 12 middle paragraph) that there are substantial uncertainties that
remain concerning the degree of variability - warming prior to 12K BP, and cooling during
the LIA, due primarily to the use of paleo-indicators of uncertain applicability, and the
lack of global (especially tropical) data. Attempting to avoid such statements will just
cause more problems.
In addition, some of the comments are probably wrong - the warm-season bias (p.12) should
if anything produce less variability, since warm seasons (at least in GCMs) feature smaller
climate changes than cold seasons. The discussion of uncertainties in tree ring
reconstructions should be direct, not referred to other references - it's important for
this document. How the long-term growth is factored in/out should be mentioned as a prime
problem. The lack of tropical data - a few corals prior to 1700 - has got to be discussed.
The primary criticism of McIntyre and McKitrick, which has gotten a lot of play on the
Internet, is that Mann et al. transformed each tree ring prior to calculating PCs by
subtracting the 1902-1980 mean, rather than using the length of the full time series (e.g.,
1400-1980), as is generally done. M&M claim that when they used that procedure with a red
noise spectrum, it always resulted in a 'hockey stick'. Is this true? If so, it constitutes
a devastating criticism of the approach; if not, it should be refuted. While IPCC cannot be
expected to respond to every criticism a priori, this one has gotten such publicity it
would be foolhardy to avoid it.
In addition, there are other valid criticisms to the PC approach....."

The PARTICULARS of this are unimportant. What matters is Rind's advice about treating uncertainties in a forthright manner. All of our criticism of Briffa can be summed up in one sentence. He didn't do the most forthright description of the uncertainties. That's it. Whether it was his treatment of McIntyre's paper, or failing to disclose the truncated data in the clearest manner, that is the take home point we want to stress.

However, I'm not sure I understand why Steve is reluctant to cry "fraud" - or AMac for that matter. As I noted at the start, if "Tiljander" (whatever that means, I have not looked into it at all myself) makes a difference and they're still doing it wrong without acknowledging the error, then either perhaps the error hasn't been explained in a clear enough manner (showing how it makes any difference), or that's real fraud. Same with Steve Mosher's complaint about what Briffa was "forced" into (whether or not the emails provide enough context for these conclusions I can't say either - again I haven't looked into it myself). But if what Steve says here is true, then there was a numerical quantitative parameter - uncertainty - that was mis-stated in the IPCC reports. A proper analysis would show a different number. If Steve is right somebody should be able to do that proper analysis and get the right number, and show how it makes a difference. Persistence in using the wrong number after it's been shown there's a correct one would again be an instance of continued repeated fraud.

So - is it there, or not? What's the best case? Comments welcome on Mann in particular here, thanks!

Comments

Arthur, I've consistently

Arthur,

I've consistently argued that it is not a case of fraud. In its simplest form I would see a fraud as saying or doing something you KNOW to be wrong. I have no issue with appending a temperature series as long as it is FULLY described and defended. I have no issue with Briffa's truncation. I say so in the book. I list the following options that Briffa had:

1. Adjust the temperature series (as Rob Wilson and Esper have both done on occasion)
2. Show the full series
3. Delete the whole series
4. Truncate the series.
5. Reject Dendrochronology in its entirety.

The only issue is documenting, fully documenting, the uncertainty (how the answer changes) when you make each of these choices. It is a choice that has uncertainty. The only way to estimate the IMPACT of the choice is to calculate the answer under each of these choices. What happens if you don't truncate (elevated MWP)? What happens if you truncate (we know that for the 1960 date)? Truncate at various dates? What happens if you take out the series entirely? We don't know that. What happens if you throw out all tree rings? Some people have done that.

Again, same with Tiljander. I have no issue with Mann using the series. However, once he finds out that Tiljander TRUNCATED a part of the series, then I think a complete treatment by Mann should look at all the various cases. Kaufman apparently first used the NON-truncated data and then switched to the truncated data. No problem there. No fraud there. AS LONG as this CHOICE is documented and its impact measured. Let the numbers be what they are.

I guess my bias comes out here. Personally, every time I ever made choices about the treatment of raw data (include, exclude, adjust) we had to document all of these choices and show the impact on the final result. The difference in those final results demonstrated the impact of the choices we made as analysts. The MATH was the math. The CHOICE of which data to use and which data not to use IS a potential source of uncertainty. You cannot estimate it without doing the sensitivity study. If you don't do the sensitivity study, then you are overestimating your certainty. You are in effect holding to a position that the truncation makes no substantial difference. That's an assumption. IF TRUE, then it's an argument for NOT TRUNCATING. If FALSE, it's an argument for documenting the effect of the truncation.
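The sensitivity study described in this comment can be sketched in a few lines. This is only a toy illustration on made-up data: the series, the "reconstruction" (a plain average), the truncation year, and the `late_mean` summary are all assumptions chosen for clarity, not anything taken from Briffa, Mann, or the IPCC chapter.

```python
import math
import statistics

def reconstruct(series_list):
    """Toy reconstruction: average whatever series have a value in each year."""
    n_years = len(series_list[0])
    recon = []
    for year in range(n_years):
        values = [s[year] for s in series_list if s[year] is not None]
        recon.append(statistics.fmean(values))
    return recon

def late_mean(recon, start=60):
    """Summary statistic (an assumption): mean of the reconstruction from `start` on."""
    return statistics.fmean(recon[start:])

N_YEARS = 100
# Three hypothetical well-behaved series.
base = [[0.1 * math.sin(0.3 * i + k) for i in range(N_YEARS)] for k in range(3)]
# One hypothetical series that drifts downward after year 60.
divergent = [0.1 * math.sin(0.3 * i + 3) - 0.02 * max(0, i - 60) for i in range(N_YEARS)]
# The same series with everything from year 80 on marked as missing.
truncated = divergent[:80] + [None] * (N_YEARS - 80)

# Each data-handling choice gets the identical analysis applied to it.
choices = {
    "full series": base + [divergent],
    "truncated at year 80": base + [truncated],
    "series deleted": base,
}

results = {name: late_mean(reconstruct(series)) for name, series in choices.items()}
spread = max(results.values()) - min(results.values())

for name, value in results.items():
    print(f"{name:>22}: {value:+.3f}")
print(f"spread across choices: {spread:.3f}")
```

The spread across the choices is the quantity the comment asks to see documented: if it is small, the choice hardly matters; if it is large, it belongs in the stated uncertainty.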

In Steve Mosher's comment

In Steve Mosher's comment beginning Arthur, I've consistently argued that it is not a case of fraud (Sun, 06/27/2010 - 16:32), he discusses data truncation in the Briffa case, and then in the Tiljander case.

I believe that he misspoke as far as Tiljander. The issue there is not "truncation" of data series, but (1) their calibration to the instrumental record, and (2) in two cases, their use in an inverted orientation with respect to the orientation claimed by the relevant authorities.

ya truncation is the wrong

ya truncation is the wrong word.

Arthur, I'm not sure why you

Arthur, I'm not sure why you write, "AMac's denial of 'fraud' while calling it an 'honest mistake' seems odd to me."

The "it" in "calling it" has two possible antecedents.

(1) -- Mann08's authors' decision when writing the manuscript to use the four Tiljander proxies as they did.
(2) -- The authors' responses to post-publication claims from McIntyre and others that at least two of the Tiljander proxies were used in an "upside-down" orientation.

Taking (1) first, the straightforward interpretation is simply a lack of adequate due diligence on the part of Prof. Mann and his co-authors. If you inspect the traces of the four Tiljander proxies, there is an obvious orientation for each, with respect to temperature. It turns out that "obvious" is correct for darksum, but inverted for lightsum and XRD -- with respect to the orientations proposed by the only relevant authorities, namely Mia Tiljander and her co-workers in Boreas (2003) and her dissertation (2005). (Literature links here; blog links here.) Tiljander et al don't offer a temperature-related interpretation for their fourth proxy, varve thickness, so that's an ambiguous call.

Tiljander warned that increasing local non-climate-related factors rendered the varve series' use as climate proxies increasingly suspect after 1720. Those factors are mainly farming, peat cutting, road- and bridge- building, and lake eutrophication. Mann08 quotes Tiljander on this general point. They then went on to see if the proxies could be used. The proxies all passed the Mann08 screening and validation tests -- but the increasing deposition of sediments due to these local effects appears to have swamped the climate influences, 1850 - 1995.

I simply don't see any fraud in this account.
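How a contaminated proxy could pass a correlation screen, and end up in the wrong orientation, can be shown with a toy example. Everything here is hypothetical: the synthetic temperature and proxy series, the 0.3 threshold, and the rule of taking the orientation from the sign of the correlation are assumptions for illustration, not the screening or validation procedure actually used in Mann08.

```python
import math
import statistics

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

years = list(range(1850, 1996))

# Hypothetical instrumental anomaly: slow warming plus a small oscillation.
temperature = [0.005 * (y - 1850) + 0.1 * math.sin(0.5 * y) for y in years]

# Hypothetical proxy whose physical meaning is "higher value = colder" ...
TRUE_SIGN = -1
climate_part = [TRUE_SIGN * 0.5 * t for t in temperature]
# ... plus a growing non-climatic signal (say, sediment from local land use).
contamination = [0.02 * (y - 1850) for y in years]
proxy = [c + k for c, k in zip(climate_part, contamination)]

r = pearson(proxy, temperature)
passes_screen = abs(r) > 0.3          # hypothetical screening threshold
inferred_sign = 1 if r > 0 else -1    # orientation taken from the calibration fit

print(f"correlation with temperature: {r:+.2f}")
print(f"passes screen: {passes_screen}")
print(f"inferred orientation {inferred_sign:+d}, physical orientation {TRUE_SIGN:+d}")
```

In this toy setup the rising contamination dominates the calibration window, so the proxy correlates positively with temperature even though its climate component runs the other way.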

On to (2). It seems to me that "fraud" would mean that Mann08's authors knowingly lied about their use of the Tiljander proxies, in response to challenges on the topic. After much travail, McIntyre and McKitrick got a brief Comment published in PNAS on 2 Feb. 2009; Mann et al. offered a brief Reply. The authors called McIntyre and McKitrick's charges "bizarre," but did not explicitly make a claim about their use of the Tiljander proxies. Since that time, to my knowledge, Mann and co-authors have avoided making any statement about Mann08's uses of the proxies, either with respect to their calibratability (all four) or their orientation (lightsum and XRD). In my opinion, it would be a big stretch to conflate "nonresponsiveness" with "fraud".

As a side note, informed allies of the Mann group also go to great lengths to not make definitive declarations about the technical issues of the Tiljander proxies. For example, with the comments I have made in mind, see what Gavin Schmidt (A-list climate scientist and RealClimate.org blogger) actually wrote at Collide-a-scape within the past fortnight. He decries the "pathology" of my blog comments and offers further opinions and tortuous explanations in the comments there -- but never does he issue a straightforward statement on the merits of Mann08's uses of the proxies.

I hope that offers some additional perspective on the matter.

AMac - if you believe that

AMac - if you believe that your argument is correct, then either Mann and co. understand it or they don't. If they understand it and persist, that sounds like deliberate fraud. If they don't understand it - well, why don't they? Something about the presentation? The source? If you are right it doesn't sound like an "honest mistake" any more. If this is the clearest case, let's look at it and try and understand - that's what I'm willing to do here.

Arthur, I understand your

Arthur, I understand your point but do not agree--to the extent that I went back and reviewed the dictionary definition of "fraud." Perhaps Prof. Mann or one of his co-authors will speak up, and shed some light on this issue.

To me, the most important thing about Tiljander -- assuming my perspective turns out to be broadly valid -- is what this case says about the conventions of science. As you point out, there will always be scientists who display human frailties. Science runs via institutions like journal editorship, peer review, project funding, tenure, science-blogs, the press, and government, academy, UN, and inter-governmental reviews and consensus projects. Are these institutions robust enough so that personalities, politics, groupthink, whatever-else, can't be played as trump cards?

Well, the question to me

Well, the question to me really is what's the closest thing to fraud, the worst behavior that's been documented, that has the strongest case here. If it doesn't meet criteria 1 and 2 then it seems like it really didn't impact the science in any significant way. If it does meet those but not 4 or 5, i.e. the researchers admitted and corrected the problem after it was identified, then I agree, honest mistake. If you've got 1-4 there's a problem, it seems clearly an indicator of troublesome behavior that has a significant negative impact on the science. That's the part that matters.
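The decision logic in this comment can be written out as a small sketch. The labels follow the wording of the thread where it gives one; two branches are not spelled out in the comment (what to call a case that is uncorrected but not yet clearly demonstrated, and a case that is corrected yet repeated), so the handling of those is an assumption marked in the code. The `Case` and `classify` names are made up for this illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Case:
    false_result: bool          # criterion 1
    material: bool              # criterion 2
    clearly_demonstrated: bool  # criterion 3
    uncorrected: bool           # criterion 4
    repeated: bool              # criterion 5

def classify(case: Case) -> str:
    # Fails 1 or 2: it did not affect the science in any significant way.
    if not (case.false_result and case.material):
        return "no significant impact on the science"
    # Meets 1 and 2 but neither 4 nor 5: admitted and corrected.
    if not case.uncorrected and not case.repeated:
        return "honest mistake"
    # Assumption: the thread does not name this branch.
    if not case.clearly_demonstrated:
        return "needs a clearer demonstration"
    # All five criteria met.
    if case.uncorrected and case.repeated:
        return "clear continuing fraud"
    # Criteria 1-4 met (assumption: also used for the odd 1-3 plus 5 case).
    return "troublesome behavior"

# AMac's scoring of Mann08/Tiljander marks 1, 4 and 5 as met and
# 2 and 3 as "hotly contested", so try every setting of those two.
for material in (False, True):
    for demonstrated in (False, True):
        case = Case(True, material, demonstrated, True, True)
        print(f"material={material!s:5} demonstrated={demonstrated!s:5} -> {classify(case)}")
```

Run this way, the outcome depends entirely on the two contested criteria, which is where the disagreement in the thread sits.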

Yes, I think the institutions are extremely robust, under most circumstances, to the usual human personalities, politics etc. I've seen truth win out amongst even some pretty rotten behavior. The exceptions are fields that are so small that you only have essentially 1 collaboration (ie. 1 leader and lots of followers) doing all the work (which climate science, even a small portion of it like dendrochronology or whatever it's called, doesn't seem to qualify as) or fields that don't have the fundamental grounding in experimental objective reality that you get with physical sciences (i.e. the cliquish stuff you seem to get in the social "sciences").

But anything can go wrong - the proof of a problem is for another group (whether bloggers or another group of scientists) to come in and do the analysis work to prove that some piece of work really came to a wrong conclusion. Normally that happens within the field itself between competing groups. Reality trumps everything else.

Arthur, for the executive

Arthur, for the executive summary go to Gavin's comment that AMac refers to, and take it from there. Spoiler warning: check out the figure link and don't imbibe while reading the comments "responding" to that one ;-)

This is a response to Martin

This is a response to Martin Vermeer's "Arthur, for the executive" comment (Mon, 06/28/2010 - 03:04).

Martin, thanks for linking to Gavin's summary. In it, he claims to be repeating an explanation of Mann08/Tiljander issues that--for some inscrutable reason--I have been failing to acknowledge and accept as the definitive answer (hence his "pathology" charge). It's a swell paragraph, so let me reprint it here.

Given the methodology used in that particular paper (Mann et al, 2008) (weighting based on a local calibration to temperature in the modern period), the 'tiljander' proxies can only be used one way. If there is a contamination in the modern period by non-climatic influences (which the originating authors suggested there might be), then they just can't be used. This issue was clearly acknowledged in the M08 paper and both of these possibilities (with and without 'tiljander') were shown (it made almost no difference to the final reconstruction).

As Lucia and others pointed out downthread, that's an answer, but it is not responsive to the questions I have posed. Neither Gavin, nor Mann08's authors, nor any other defender of the paper has ever addressed those actual questions, to my knowledge.

Martin, I would be very interested in your replies. Arthur's, as well.

Here are the two questions. You can find background and links farther down that C-a-s thread, in Comment #127 and Comment #132.

"Are the four Tiljander proxies calibratable to the instrumental temperature record, 1850-1995?"

And

If Mann08’s use of the Tiljander proxies was valid, they necessarily were used in an extraordinary fashion: in contradiction to the original specialist authors’ interpretations for lightsum and XRD.

"Is it acceptable scientific practice for Mann08’s Methods section to be silent on these highly unconventional uses of the Tiljander proxies?"

Well, you did get *some*

Well, you did get *some* reply there to your first question - for example, 'toto' #63 says:

“We don’t know, because it depends on whether the modern anthropogenic effects in Tiljander’s data actually erased the temperature signal – something that is very possible, but not certain. Things being so, we can do either of two things – use the series, or not. Being wary of a priori decision, we do both. Turns out, it doesn’t change much concerning the exceptional status of late 20th century warming.”

which of course you responded to in #69.

Steve Bloom apparently answered "yes" in some way, while over at Stoat I think I saw an answer from William Connolley that looked like "no". I assume your answer is "no", correct? Can you point to the best exposition of why the answer is "no"?

Also, if others have used Tiljander but chopped off the post-1850 part, then how did they calibrate? Is calibration to the instrumental temperature record 1850-1995 essential? Is it central to Mann's method? I haven't looked yet, but I want to know your best answers on these. Thanks.

Response to Well you did get

Response to Well you did get *some*.

Gavin's rambling answer to

"Are the four Tiljander proxies calibratable to the instrumental temperature record, 1850-1995?"

in C-a-s Comment #29 can be paraphrased as

"Maybe, and it doesn't matter."

The answer of toto's that you quote from Comment #63 is a more elegant phrasing of the same idea.

Here's a thought experiment for you and Martin. Suppose that you strongly objected to the conclusions reached by the paper in question. Would an answer like "Maybe, and it doesn't matter" to a key methodological question serve as a strong defense of that paper, in your mind? (I hope not.)

Steve Bloom kind-of indicated "yes" at C-a-s. But kinda sorta isn't all that precise. William Connolley's position is too subtle, nuanced, and complex for me to summarize. I went through and annotated the first of our three exchanges here. I could do the same for the other two, but it's a lot of work collating and re-formatting all of my failed-moderation comments. I'm not sure it'd be worth it.

> Also, if others have used Tiljander but chopped off the post-1850 part, then how did they calibrate?

It varies. Kaufman (2009, Science) performed a splice of sorts, after correcting the orientations of the flipped proxies. At first glance, Mann (2009, Science) repeated the procedure used in Mann08, retaining lightsum and XRD in their upside-down orientations.

> Is calibration to the instrumental temperature record 1850-1995... central to Mann's method?

Yes. A read of the Methods makes this apparent.

Would an answer like "Maybe,

Would an answer like "Maybe, and it doesn't matter" to a key methodological question serve as a strong defense of that paper, in your mind? (I hope not.)

If that answer is factually valid, why wouldn't you accept it? If it doesn't matter, it cannot be key.

Also, Gavin essentially

Also, Gavin essentially responded *no* to your first question in an early comment there under a conditional:

"If there is a contamination in the modern period by non-climatic influences (which the originating authors suggested there might be), then they just can't be used."

Another question - has this data been used by other people for reconstructions since (or before) the Mann 08 paper?

This is a response to "Also,

This is a response to "Also, Gavin essentially" Mon, 06/28/2010 - 11:12

Also, Gavin essentially responded *no* to your first question in an early comment there under a conditional

Arthur, in a case like this, do you think conditional answers serve to advance understanding? Or -- are they sometimes the source of unnecessary complexity, and future confusion?

Recall, this is a simple question about the application of a key method in an oft-cited paper in a prominent peer-reviewed scientific journal (at position #5 by one measure of influence).

If what Gavin meant by his conditional response was, "I don't know," then he could have saved us a bit of trouble by phrasing it that way.

It's more than "I don't

It's more than "I don't know". The thing he doesn't know, apparently, is whether or not the temperature response really had "contamination in the modern period by non-climatic influences". If it did, then he agrees with you. That's a reasonably strong statement.

In particular, it reduces your question to the one of contamination - the original authors, according to Gavin, "suggested there might be". Is there other strong evidence of this? Why would Gavin seem to doubt it?

Regarding contamination Can I

Regarding contamination

Can I refer you to this post from March 2010: The Newly-Discovered Jarvykortta Proxy -- II. It includes the relevant quotes from Mia Tiljander et al.'s 2003 paper in Boreas that address the contamination issue.

It also includes a discussion of how Tiljander interpreted higher values and lower values of the Lake Korttajarvi XRD data series. The usages of the lightsum and XRD proxies in Mann08 contradict Tiljander's interpretations, while Mann08's usage of the darksum proxy is consistent with Tiljander.

You might find the graphical depictions of the XRD proxy to be helpful, as well.

There might be authorities other than Mia Tiljander and her co-authors who interpret these proxies differently. If so, I am unaware of their identities. It would be a big step forward if Mann08's defenders would reference their work.

Arthur, you write,

Why would Gavin seem to doubt it?

To whom should you address that question?

Amac a couple relevant

Amac

a couple relevant threads on contamination

http://climateaudit.org/2008/09/03/mann-et-al-2008-korttajarvi/

And see what the authors think about the upside down issue

http://climateaudit.org/2010/02/06/say-my-name-–-february-rerun/

So if Arthur wants to know what the original authors think about it (repeating the mistake, the small group of climate scientists who did this), it's a good place to start.

http://climateaudit.org/2010/02/06/say-my-name-–-february-rerun/

AMac, I know my answer isn't

AMac, I know my answer isn't going to satisfy you. But it's the only answer you're going to get from me. A satisfactory answer (to you) would require me making up things, which I will not do.

To the first question: perhaps, perhaps not. I don't know. But it doesn't matter (much). And that I do know.

To the second question: you mean 'upside down' use? Again, it doesn't matter, due to the way the calibration is done. Flip them any way you like, and the result is the same. That includes the 'correct' (non-extraordinary, non-unconventional) orientations -- even if we don't know what they are. This is my best understanding.

BTW these are the answers Gavin gave you too. Feel free to fail to understand them, or find them philosophically unsatisfying; but please don't deny that you got them.

Martin - while perhaps not

Martin - while perhaps not expressed correctly, from what I've read so far there may be a real issue with the "upside down" business.

Yes, it is true that "the result is the same" in Mann's analysis no matter the sign (or presumably any linear modification) of the input data series.

The problem is that, if this "contamination" claim is correct, this particular series contains two parts - an old part that may have one relationship with temperatures, and a modern part that spuriously has (perhaps) an opposite relation with temperatures. The "flip" is in the data series itself, not in how it was input to the calibration process. Which means that, as Gavin put it, it should not have been used in Mann's analysis. Or it could have been used if the calibration was done to the early part and the modern part discarded.

That is, if the claim of modern contamination is correct.
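The distinction drawn above can be illustrated with a small synthetic sketch. It assumes a plain least-squares calibration of temperature against a single proxy over a modern window, which is a simplification and not the Mann08 procedure; all series, slopes, and function names here are made up for illustration. It shows that flipping the sign of the input changes nothing, while a proxy whose modern segment is contaminated yields an inverted picture of the early period.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y on x; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def reconstruct(proxy, temp, n_cal):
    """Calibrate on the last n_cal points, then apply the fit to the whole proxy."""
    intercept, slope = fit_line(proxy[-n_cal:], temp[-n_cal:])
    return [intercept + slope * p for p in proxy]


def correlation(x, y):
    """Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Synthetic temperatures (hypothetical): a cooling early period, then modern warming.
early_temp = [0.3 - 0.006 * i for i in range(100)]
modern_temp = [-0.3 + 0.02 * i for i in range(40)]
temp = early_temp + modern_temp

# Clean proxy: responds to temperature with a negative sign throughout.
clean = [-2.0 * t for t in temp]
# Contaminated proxy: same early response, but the modern segment is driven by
# a non-climatic influence that happens to rise along with the warming.
contaminated = [-2.0 * t for t in early_temp] + [1.5 + 0.05 * i for i in range(40)]

n_cal = 40
rec_clean = reconstruct(clean, temp, n_cal)
rec_flipped = reconstruct([-p for p in clean], temp, n_cal)
rec_contam = reconstruct(contaminated, temp, n_cal)

# Flipping the input sign leaves the reconstruction unchanged.
max_diff = max(abs(a - b) for a, b in zip(rec_clean, rec_flipped))
# The clean proxy recovers the early period; the contaminated one inverts it.
early_corr_clean = correlation(rec_clean[:100], early_temp)
early_corr_contam = correlation(rec_contam[:100], early_temp)
print(max_diff, early_corr_clean, early_corr_contam)
```

In this toy setup the "flip" that matters sits inside the contaminated series itself, so no choice of input orientation repairs it; only calibrating on an uncontaminated segment, or dropping the series, would.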

Yes, that agrees with my

Yes, that agrees with my understanding.

The good news is of course that the solution with the dodgy proxies removed is so close to the standard one. This is precisely what sensitivity analysis is for: to CYA for things you cannot be sure of :-)

(BTW I disagree with AMac's contention that this small change is a red flag: on the contrary, MikeN is right that this is just because 7 is so much smaller than 1209.)

This is a response to Martin

This is a response to Martin Vermeer's "AMac, I know my answer isn't going to satisfy you." Mon, 06/28/2010 - 14:07. (My browser is flipping between "flat list" and "threaded list" displays, making it hard to associate Replies with their predicates.)

To the contrary, I think you have supplied terrific answers to the two questions. Because they are more concise than Gavin's, they're better than his, as well.

Question 1 is

"Are the four Tiljander proxies calibratable to the instrumental temperature record, 1850-1995?"

And Martin's answer is

Perhaps, perhaps not. I don't know. But it doesn't matter (much). And that I do know.

Question 2 is

"Is it acceptable scientific practice for Mann08’s Methods section to be silent on these highly unconventional uses of the Tiljander proxies?"

And Martin's answer is

It doesn't matter, due to the way the calibration is done.

Arthur, do you concur with Martin on these two answers?

Martin, may I ask you a follow-up question?

Mann08 Figure S9 shows the 15 Northern Hemisphere proxy records that pass the screening procedure back to at least AD 818, including the four Tiljander proxies (from the figure legend).

Question 1 as asked concerns four of these records (Tiljander's lightsum, XRD, darksum, and thicknessmm). The assumption is that the authors' answer for the other 11 proxies is, "Yes, they are all calibratable, as we outlined in the Methods."

Would your confidence in Mann08's results decline if the tally was 5-to-10? 6-to-9? What about 14-to-1, or 15-to-0 (i.e., "perhaps none of our longest-duration proxies are calibratable, but it doesn't matter (much)")?

If at some point in this progression, you lost faith in the calculated reconstructions of Mann08: what would trigger such a change of heart?

AMac, I won't answer

AMac,

I won't answer this question concretely as I haven't done the sums, but just observe that back to those early years, the number of available proxies diminishes dramatically. So yes, if you start throwing out proxies that seem questionable, at some point you are going to lose those early years (which are pretty weak already as noted in the paper). But the later years are much stronger, as witnessed by surviving even the removal of all tree-ring proxies.
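The counts mentioned in this thread give a rough sense of why the early years are more exposed. The sketch below assumes that 7 is the number of questioned series and 1209 the size of the full network (as MikeN's remark implies), and that 4 of the 15 long records in Mann08 Figure S9 are the Tiljander proxies; the helper name is hypothetical.

```python
def share(questioned, total):
    """Fraction of a proxy set made up of questioned series."""
    return questioned / total


# Counts as cited in the thread, not re-derived from the paper.
full_network = share(7, 1209)
long_records = share(4, 15)

print(f"full network: {full_network:.2%}")
print(f"records reaching back to AD 818: {long_records:.2%}")
```

The same handful of series is well under 1% of the full network but more than a quarter of the longest records, which is why the answer can differ between the recent and the early parts of the reconstruction.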

OT, this is something Lucia didn't see by the way. The removal of tree-ring proxies is just a sensitivity study, verifying that tree rings as a group largely tell the same story as non-tree rings as a group. To take the tree-ring-less result and use it to argue -- whatever, is just wrong. No-one seriously suggests that indeed all tree-ring proxies are faulty: rather to the contrary, as this sensitivity study suggests.

>Flip them any way you like,

>Flip them any way you like, and the result is the same.

This answer is at odds with ClimateAudit's analysis. Given the code is available, this should be resolvable.

I would add that if the proxy is not contaminated, then that means it has a divergence problem, or things in that area have gotten much colder while the rest of the planet got warmer. In that case, it is still an error to flip the proxy upside-down and interpret it in the opposite fashion to the author's statement. This would take the medieval warm period and say it was cold. I believe the term is spurious regression.

So, two obvious questions

So, two obvious questions would be.

1. If a potentially contaminated series makes no substantial difference, why include it? And why would Kaufman, upon learning of the potential contamination, remove it?
http://climateaudit.org/2009/09/03/kaufmann-and-upside-down-mann/

The point is a minor one. But it would seem to me that if you get the same result by excising the suspect part of the series, then you should excise it and take the potential objection off the table as Kaufman has. Again, I have no issue with excising corrupt data (in tree rings or whatever) as long as the issue is noted and documented. Perhaps Mann refuses to excise it because McIntyre complained about it. We won't know, but I'd say Kaufman takes the correct approach from a best practices standpoint.

2. If flipping the series makes no difference, then WHY keep it upside down? A difference that makes no difference, makes no difference. This is the strongest argument for using the orientation the original authors suggested. Kaufman, recognizing the wisdom of this, flips the series. It changes the result in minor ways: see below

Reports: “Recent warming reverses long-term Arctic cooling” by D. S. Kaufman et al. (4 September 2009, p. 1236). Of the 23 previously published proxy temperature records included in the synthesis, 4 were corrected to conform to the interpretations of the original authors, and one was updated by omitting the high-pass filter. The 10-year mean proxy values are now corrected in the supporting online material (table S2) and at www.ncdc.noaa.gov/paleo/pubs/kaufman2009. The primary trends of the Arctic temperature reconstruction, however, are not changed, including the millennial-scale summer cooling that was reversed by strong warming during the 20th century and (on the basis of the instrumental record) continued through the last decade. Two of the corrected records (Lake Korttajärvi, Lake Lehmilampi) were not included in the calibration of proxy values to mean summer temperature, but the other three (DYE-3, Hallet Lake, and Haukadalsvatn) were. The supporting material now includes the corrected version of the calibration equation (r² = 0.76, P < 0.05). The resulting corrected temperature reconstruction is shown below, along with the original version taken from Fig. 3C. The corrected temperature trend through 1900 (green line in Fig. 3C) is –0.21° ± 0.06°C per 1000 years rather than –0.22° ± 0.06°C per 1000 years as originally reported. The corrected regional sensitivity of summer (JJA) temperature to orbital forcing inferred from the proxy-based reconstruction is 0.06° ± 0.03°C per W m⁻² rather than 0.07° ± 0.02°C per W m⁻² as originally reported.

Now, this example shows the principal issue between guys like AMac and me and other people.

The sensitivity changes from 0.07°C per W m⁻² to 0.06 (the trend itself moves from –0.22°C to –0.21°C per 1000 years). For Arthur and perhaps Martin this is no big deal.
You are talking about a 10-15% change in sensitivity. Some people look at this and say "AGW is still true, so the difference makes no difference." IT MAKES NO DIFFERENCE to your interests. It makes no difference to the truth of AGW. That's all true and beside the point. The point is having the best estimate. You think 10% doesn't matter. Good, then let's agree on the lower estimate, if 10% doesn't matter. So the issue can be distilled down to this. These small differences don't matter to YOU. They don't contradict or change your opinion. So why would you object to estimates that are marginally lower? Because McIntyre found the issue?

In a nutshell: if excluding the series makes no difference, then exclude it. If excluding a suspect part makes no difference, then excise it. If putting it in the same orientation as the original research suggests makes no difference, then use the original orientation. "It makes no difference" cuts TWO ways as an argument.
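For reference, the relative size of the two corrections reported in the Kaufman erratum quoted above can be worked out directly from its central values. This is only arithmetic on the published numbers; the function name is hypothetical and the stated uncertainties are ignored.

```python
def relative_change(old, new):
    """Absolute change as a fraction of the originally reported value."""
    return abs(new - old) / abs(old)


# Temperature trend through 1900, in degrees C per 1000 years.
trend_change = relative_change(-0.22, -0.21)
# Sensitivity of summer temperature to orbital forcing, in degrees C per W m^-2.
sensitivity_change = relative_change(0.07, 0.06)

print(f"trend: {trend_change:.1%}, sensitivity: {sensitivity_change:.1%}")
```

The trend moves by roughly 4.5% and the sensitivity by roughly 14%, and both shifts fall within the uncertainties quoted in the erratum.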

Steve, I would agree with you

Steve,

I would agree with you when starting from scratch. But for an existing paper you have to be responsive to issues brought up by showing how they affect the reported results. If they completely undermine them, you have to retract; short of that, completely rewriting the paper is not usually an option.

BTW I would have left out these questionable proxies. But it's a judgment call, and Mann made this choice.

Ah, I see you mean something

Ah, I see you mean something different with your second question. Why the orientation that follows from the Mann calibration of these proxies is the reverse of Mia Tiljander's interpretation?

Yes, sure they could have commented on this... OTOH these are interpretations and inherently uncertain. If the Mann interpretation is wrong, we arrive again at the alternative that the proxies are unusable. Which is considered in the paper. So, no big issue.

I don't think it suffices to

I don't think it suffices to say that this is an interpretation.
If I were to ignore the contamination issue, I could use Tiljander and build a reconstruction that looks more like the 1990 IPCC temperature graph.
Somehow I don't think RealClimate would leave it as a matter of interpretation. They would demand an answer for why I used Tiljander despite contamination issues.
I see nothing in the Mann paper that deals with Tiljander here. The SI mentions the contamination, uses it anyways, and conducts a sensitivity analysis to say the results are not affected without it and 3 other proxies. It says see S1 dataset for details, but I can see no details there that are relevant to this issue. So there is nothing in the paper that deals with Mann's interpretation of the issue.
Is it OK to just use something in opposite fashion to the author's intent, and make no claim as to why?
If I were to produce a reconstruction along the lines I describe, it would be fraud.

I think a case can be made that the proxy can be used despite the contamination: global warming allowed for more development which caused the problem. However, Mann makes no case, other than results not affected substantially.
Now Mann's use isn't just to ignore the contamination issue, but to use the proxy upside-down. This detail is not recognized.

http://pajamasmedia.com/blog/

http://pajamasmedia.com/blog/climategate-not-fraud-but-noble-cause-corru...

"Same with Steve Mosher's complaint about what Briffa was "forced" into (whether or not the emails provide enough context for these conclusions I can't say either - again I haven't looked into it myself). But if what Steve says here is true, then there was a numerical quantitative parameter - uncertainty - that was mis-stated in the IPCC reports. A proper analysis would show a different number. If Steve is right somebody should be able to do that proper analysis and get the right number, and show how it makes a difference. Persistence in using the wrong number after it's been shown there's a correct one would again be an instance of continued repeated fraud."

I think your assumption is wrong. There isn't a calculation that was done wrong. There is an analysis that is incomplete.
Let's just take Briffa's truncation. Without the truncation, Briffa has argued that the MWP could be elevated.

"During the second half of the twentieth century, the decadal-scale trends in wood density and summer temperatures have increasingly diverged as wood density has progressively fallen. The cause of this increasing insensitivity of wood density to temperature changes is not known, but if it is not taken into account in dendroclimatic reconstructions, past temperatures could be overestimated… In the areas where the growth data extend through to the warm late 1980s and early 1990s (NEUR, WSIB, CSIB, ESIB), the divergence is at a maximum in the most recent years. Over the hemisphere, the divergence between tree growth and mean summer temperatures began perhaps as early as the 1930s; became clearly recognisable, particularly in the north, after 1960; and has continued to increase up until the end of the common record at around 1990."

That observation, that the MWP could be overestimated, is begging to have a number put on it. Also, you'll see the uncertainty in WHEN the divergence begins. 1930? "Clearly recognisable after 1960"? That as well invites some examination. What happens to our estimate of the MWP under all cases? It's THAT uncertainty I am referring to. The choice to truncate or not is NOT a logical necessity. It is a choice with a probability of being the correct thing to do. So there is an associated error as well. That uncertainty has not been calculated incorrectly, it hasn't been calculated at all. It could be small, in which case the overestimate is small. Choosing to truncate is defensible. What is the impact on the final answer? Choosing to throw the whole series out is defensible. What is the impact on the final answer? Examining the temperature record is a defensible choice. Wilson, for example, rejected the CRU series in one of his studies to "improve" the final answer. These are all choices. None of them is logically derivable from antecedents. If they don't produce different results, then that's a mathematical result. If they do, then you have insight into the importance of the analyst's CHOICE of how to treat this data. Not calculating these differences is hardly fraud. It's incomplete work. It's a question mark. It won't change the science that says CO2 causes warming. IT CAN'T. It may change our certitude that the MWP was slightly cooler than today or slightly warmer, also a difference that makes no difference in the grand scheme of things.
That, however, is no excuse for failing to do a complete job. Further, when I pick my car up from the mechanic and note that he didn't change the oil, he cannot get away with the following: "change your own damn oil." I am well within my rights as a customer to note that he hasn't finished the job to my satisfaction. He's not cheating me. Perhaps he just forgot. Perhaps he ran out of time. But none of that alters the fundamental fact that the car doesn't have any oil in it, or maybe it's a quart low. I really just want the oil filled properly.
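One way to "put a number on it" is to rerun the calibration under each defensible choice and report the spread. The sketch below does this on purely synthetic decadal data: the temperature ramp, the size of the post-1930 divergence, the past proxy value of 0.2, and all names are assumptions made for illustration, and none of it is Briffa's data or method.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y on x; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def estimate_past(proxy, temp, last_index, past_proxy_value):
    """Calibrate on points 0..last_index, then convert one past proxy value."""
    intercept, slope = fit_line(proxy[: last_index + 1], temp[: last_index + 1])
    return intercept + slope * past_proxy_value


# Synthetic calibration period, one value per decade from 1880 to 1990.
years = [1880 + 10 * i for i in range(12)]
temp = [-0.2 + 0.05 * i for i in range(12)]
# The proxy tracks temperature through 1930, then falls progressively behind.
proxy = [t if y <= 1930 else t - 0.04 * ((y - 1930) // 10)
         for t, y in zip(temp, years)]

# Hypothetical medieval proxy value; with a perfect proxy it would map to 0.2.
past_value = 0.2
choices = {"truncate after 1930": 1930,
           "truncate after 1960": 1960,
           "no truncation": 1990}
estimates = {label: estimate_past(proxy, temp, years.index(end), past_value)
             for label, end in choices.items()}
spread = max(estimates.values()) - min(estimates.values())

for label, value in estimates.items():
    print(f"{label}: {value:.2f}")
print(f"spread across choices: {spread:.2f}")
```

In this toy case, keeping the divergent decades inflates the past estimate, in line with the overestimation the quoted passage warns about, and the spread across the three choices is the kind of number the argument above asks for.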

Steve, if you think you're

Steve, if you think you're being "polite" by not claiming fraud, just cut it out. Honesty about the situation trumps politeness. What do you think "noble cause corruption" is supposed to be, if it doesn't involve some sort of wrong-doing by the parties involved? Don't mince words, say what you think is correct.

And if there's no fraud or something close to it, then all you're saying is some people were honest but lazy or spiteful or playing politics and didn't do everything they could. Well, tough, that's true of every piece of scientific work out there. There's always more work that could be done. I don't get the point of your long whines about this if you really don't think there was something wrong with the substantive seemingly-objective conclusions of the work.

"Steve, if you think you're

"Steve, if you think you're being "polite" by not claiming fraud, just cut it out. Honesty about the situation trumps politeness. What do you think "noble cause corruption" is supposed to be, if it doesn't involve some sort of wrong-doing by the parties involved? Don't mince words, say what you think is correct."

I am saying what I actually mean. Being polite has nothing to do with it. Please refrain from imputing motives to me. Not all wrongdoing is fraud. Not all wrongdoing changes the science. Not all wrongdoing should be blithely ignored. You seem to want to hold a bright line. If it's fraud it matters and nothing else matters. I'll take the FOIA as the first example. It's fairly clear, if one believes the ICO, that CRU (Jones and Palmer) did not handle Holland's FOI according to the regulations. They stretched, bent, and probably broke the rules. That didn't change the science.
Was their behavior fraud? No. Is their behavior something that you want to countenance? You want to let it go unchecked, simply because it does not rise to the level of fraud? I don't. Two months ago I had a conversation with the NOAA FOIA officer. Her response to me? "I don't care WHO you are Mr Mosher, I don't care what Dr. Peterson says, if you have a right to the documents you will get them." And I did. Contrast that with the behavior we see detailed in the mails: Jones talking to the FOIA officer about the kind of people who read CA and send in FOIA. Unprofessional. Let's take Jones's treatment of data. In the very same mail he SENDS confidential data to Rutherford. And he says, if anybody asks for this data under FOIA, he will argue that the data is confidential. And if the FOIA officer orders him to give it to people, he will delete it. That attitude toward data sharing is not fraud. Do you want to argue that this attitude is laudable? Do you want the next generation of scientists to think that confidentiality agreements can be broken on a whim? Or that FOIA doesn't matter? Do you? I don't. Let's take Phil Jones's refusal to send code to McIntyre: people argue that you don't need code because papers describe the method. Really?

Dear Phil,

In keeping with the spirit of your suggestions to look at some of the other multiproxy
publications, I've been looking at Jones et al [1998]. The methodology here is obviously
more straightforward than MBH98. However, while I have been able to substantially emulate
your calculations, I have been unable to do so exactly. The differences are larger in the
early periods.

Since I have been unable to replicate the results exactly based on available materials, I
would appreciate a copy of the actual data set used in Jones et al [1998] as well as the
code used in these calculations.

There is an interesting article on replication by Anderson et al., some distinguished
economists, here [1]http://research.stlouisfed.org/wp/2005/2005-014.pdf discussing the
issue of replication in applied economics and referring favorably to our attempts in
respect to MBH98.

Regards, Steve McIntyre

From: Phil Jones
To: mann@xxxxxxxxx.xxx
Subject: Fwd: CCNet: DEBUNKING THE "DANGEROUS CLIMATE CHANGE" SCARE
Date: Wed Apr 27 09:06:53 2005

Mike,
Presumably you've seen all this - the forwarded email from Tim. I got this email from
McIntyre a few days ago. As far as I'm concerned he has the data - sent ages ago. I'll
tell him this, but that's all - no code. If I can find it, it is likely to be hundreds of
lines of
uncommented fortran ! I recall the program did a lot more that just average the series.
I know why he can't replicate the results early on - it is because there was a variance
correction for fewer series.
See you in Bern.
Cheers
Phil

Is that fraud? I think not. I think that is obstruction of science. I can go on. The point is, neither AMac nor I will be goaded into making fraud claims (I'll slip up on my own, thank you). One final example. Mann, repeating a mistake after McIntyre pointed it out. Then finally correcting it. Then failing to acknowledge the person (McIntyre) who found the error.

http://climateaudit.org/2008/11/09/the-rain-in-spain/

Fraud? No. Stubbornness? Yup. Unprofessional conduct for not attributing the correction properly? Yup. Do you want this type of behavior to be ignored? It's not fraud, the science doesn't change, move along. In my old world (aerospace) these types of lax practices were a WARNING sign. We were not trying to save the planet, only make planes that didn't crash. So best practices were demanded.

As I said, "noble cause corruption" comes pretty close to describing this. I'll give you an example. A police officer fills out an arrest report. He writes down the wrong time for the arrest. A simple mistake. The guy is guilty. The cop creates a bogus report changing the time. Does that violate procedures? Yup. Does it make the guy not guilty? Nope. Do we want to encourage this kind of behavior? No. Why? Because noble cause corruption can start with minor infractions of procedures and escalate into vigilantism.

The bottom line is you do not want to criticize anyone for anything less than fraud. That's clear. Just argue that sloppiness is preferred. Just argue that it is best practice. just argue that you want people to NOT share data or code, especially with critics. Just argue that giving credit to people who point out errors is UNacceptable. Just argue that your obligations to follow FOIA laws SHOULD be ignored. Just argue that you should do what ever you can get away with SHORT of fraud. Just make those arguments. Its what you believe, correct?

You continue:
"And if there's no fraud or something close to it, then all you're saying is some people were honest but lazy or spiteful or playing politics and didn't do everything they could. Well, tough, that's true of every piece of scientific work out there. There's always more work that could be done. I don't get the point of your long whines about this if you really don't think there was something wrong with the substantive seemingly-objective conclusions of the work."

It's clear that you don't. It's clear that you believe people should be lazy and sloppy and spiteful and political. That is what you think they should do? Correct? In fact, you think it's laudable. Right? If you don't agree with these then you DO get the point of the whines. So stop being coy. If you think it is better to be lazy, then defend it. If you don't think it's better to be lazy, then speak up. If you don't care, then I'll suggest that you don't care too much about the quality of climate science. And if you don't care about the quality of climate science, then I think people should ignore the conclusions or at least retain a healthy degree of skepticism about them. I would prefer that they take the science seriously. I would prefer that people NOT defend laziness. So either tell people they should be lazy and spiteful or tell them they should not, but don't defend it lazily.

All your actual complaints

All your actual complaints seem petty personality-level stuff. Sure it's "wrong" by some moral standard. But if there's no proof of actual harm to the science then you've got nothing, that's what I'm trying to tell you.

And then there's the "slippery slope" nonsense - it "can ... escalate into vigilantism" - really? Please! If you have evidence of "vigilantism" then go point it out, get people fired. Quit the petty pussy-footing.

Scientists aren't paid well enough for the time they put in, to demand "best practices" from some arbitrary outside perspective. Pay them as well as professional engineers and maybe you'll get something more.

I remember when I worked at Argonne National Lab for a couple of years as a postdoc ($30,000 a year with no benefits, 60-hour weeks - and that was a good salary compared to most places), we got a bunch of visits from "Tiger Teams" from DOE HQ. They were trying to get us to adopt some sort of fuzzy "quality" perspective, industrial ideas about best practices. We laughed.

We laughed because that place was home to thousands of scientists, and almost every one of them was doing something wildly different from everybody else, with their own theoretical physics experience, their own software code, their own experimental apparatus, or their own special materials of one sort or another. There were a few large facilities where common safety practices and such could be sorted out and initiated, but that was certainly not the rule. Our little theory group of a dozen or so people had almost as many different computer hardware platforms (Sun, SGI, HP, DEC, NeXT, plus large parallel processing machines at the computing center); we were running simulations across the spectrum of condensed matter physics. And one individual in a given year would likely work on half a dozen separate projects, with some overlap based on experience, but often very very different details. Our documentation was our publications - it was certainly a very productive time for me. But it was productive because we were at the forefront of science, doing things nobody else had done before - there were no "quality standards" or best practices other than the peer review of our colleagues and journal publication. If we bothered with more than that, productivity would have greatly suffered.

If some moral failing you are complaining about doesn't matter to the science, then it truly does not matter.

I think perhaps Lucia can

I think perhaps Lucia can offer a different perspective on working at labs. I certainly can offer a different perspective working with more diverse computer platforms and yes much larger teams and yes longer hours and blah blah blah. In the end you take public money, you have to shape up. But it's good to see in writing the kind of attitude that you support. I think that says it all.

Ha! Every one of us could

Ha! Every one of us could have gotten far-better-paying jobs in industrial settings if we hadn't loved what we were doing. "you take public money, you have to shape up" - hardly likely. Half my grad school class ended up on Wall Street as it is, raking in millions. Impose arbitrary nonsense like "best practices" in any kind of rigid style and you'll kill science in this country.

To be a little less cavalier

To be a little less cavalier - two points:

(1) First, you are interpreting "the kind of attitude that (I) support" from my writing. It seems to me you are completely misinterpreting my point. Just as Steve McIntyre did with my entire article the other day (*NOT* about climateaudit!) and just as I suspect you have done with much of the commentary you have made here and elsewhere about motivations. You cannot know those things, and you seem to have a tendency to misread them.

(2) My real issue here is that there are no "best" practices in curiosity-driven scientific research. If you are working in an area with established "best" practices, then you are sheltering yourself somewhere in well-worn territory, not doing your job as a scientist out on the frontier of knowledge. Applied research and development is, and should be, quite different - and I think that's what some of the commentary about "regulatory" science that's been out there is referring to. Much of climate science has been, up to now, curiosity-driven, basic research. There are also applied research areas within essentially the same fields - NCAR does a lot of practical weather-forecasting related work for instance. In those areas it certainly makes sense to establish standards for data handling, provenance, quality, publishing, etc. But with basic curiosity-driven research there just is no set of "best" standards that will ever make sense, because by definition the scientist is doing things that haven't been done before and nobody can know what will work and what won't, what makes sense and what are just blind alleys, etc, until the work is done. The closest you can get is looking at what other scientists in closely related research areas are doing - and they'll be looking at you, that goes on all the time - various levels of peer review, conferences, collaborations, etc.

That doesn't make curiosity-driven researchers careless. To the contrary! We repeatedly try things out, rework old problems in new ways, try to gain insights and verify that we're not deluding ourselves in some way, or haven't broken some piece of our equipment or software or mental framework. Aside from the routine cross-checking while porting thousands of lines of (often Fortran) code all over the place, I had at least two episodes in my research career where my care uncovered serious bugs in what we were relying on.

In the first case, DEC had been shipping a chip for months that, under certain circumstances, only achieved single-precision floating point accuracy, even though you were doing double-precision calculations. Very rarely, once in millions of calculations, the last bits of a double-precision operation would be completely randomized. I sent them code that replicated the problem, and they (very quietly) acknowledged it and fixed the chip.

In the second case, I discovered that a piece of quantum chemistry software used very widely in the research community and even sold commercially had the wrong implementation of a density-functional component: they had relied on the originally published paper, and not noticed there was an erratum that corrected the formula. I discovered the problem by comparing the output from the standard q. chem. software with our home-built software that had the right formula. We communicated with the original authors and they again acknowledged the problem and corrected their code.

Sure you could turn those examples into "best practices" of some sort - "always crosscheck calculations" perhaps. Evidently some people don't do that as much as they should. But who's going to go around forcing people to cross-check what they're doing, or checking that they're cross-checking? 99.99% of the time they get the same result both ways and that's that, everything's fine.
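To make "always crosscheck calculations" concrete, here is a minimal sketch of the kind of comparison I mean. Everything in it is hypothetical: `suspect_ratio` is a stand-in for a routine that quietly delivers only single-precision accuracy, not a reproduction of the DEC chip problem or of the quantum chemistry bug.

```python
import math
import struct


def to_single(x: float) -> float:
    """Round a double-precision float to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def reference_ratio(a: float, b: float) -> float:
    """Trusted implementation: plain double-precision division."""
    return a / b


def suspect_ratio(a: float, b: float) -> float:
    """Hypothetical faulty implementation: result is only single-precision accurate."""
    return to_single(a / b)


def cross_check(inputs, rel_tol=1e-12):
    """Return the inputs on which the two implementations disagree."""
    return [
        (a, b)
        for a, b in inputs
        if not math.isclose(reference_ratio(a, b), suspect_ratio(a, b), rel_tol=rel_tol)
    ]


# 1/4 and 3/8 are exact in both precisions, so only 1/3 exposes the fault.
inputs = [(1.0, 4.0), (1.0, 3.0), (3.0, 8.0)]
disagreements = cross_check(inputs)
print(disagreements)
```

The point of the toy inputs is that the two routines agree on most of them; the fault only shows up for values that need the extra bits, which is why this sort of bug can go unnoticed for months.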

The only reliable "best practice" that works for curiosity-driven research is the motivation of the researcher to get things right, to not delude themselves, to not publish stuff that's wrong. And it works, for the most part, spectacularly well.

"Same with Steve Mosher's

"Same with Steve Mosher's complaint about what Briffa was "forced" into (whether or not the emails provide enough context for these conclusions I can't say either - again I haven't looked into it myself)."

please do not remove my nuance. Do not counsel me to retain it for your benefit and then remove it when quoting me.

"If you study the mails WRT to the authoring of that chapter you will come away with the impression that Briffa was under pressure to overstate the certainty"

Peck: to Briffa

Lastly, we wanted you to know that we can probably win another page
or two (total, including figs and refs) if you end up needing it.
Susan didn't promise this, but she gave us the feeling that we could
get it if we ask - but probably only for your section, and maybe an
extra page for general refs (although we're not going to mention this
to the others, since we're not sure we can get it). Note that some of
the methodological parts of your sections should go into supplemental
material - this has to be written just as carefully, but it gives you
another space buffer. All this means you can do a good job on
figures, rather than the bare minimum. We're hoping you guys can
generate something compelling enough for the TS and SPM - something
that will replace the hockey-stick with something even more
compelling.

From: Jonathan Overpeck
To: Eystein Jansen
Subject: Re: Bullet debate number 2
Date: Wed, 15 Feb 2006 16:36:46 -0700
Cc: Keith Briffa

thanks. Agree on the attribution front, but what about being more specific (at least a
little) about what the "subsequent evidence" is. Is there really anything new that gives us
more confidence?

Keith? Eystein?

thx, peck

Briffa:

"Peck, you have to consider that since the TAR , there has been a lot of argument re
"hockey stick" and the real independence of the inputs to most subsequent analyses is
minimal. True, there have been many different techniques used to aggregate and scale
data - but the efficacy of these is still far from established. We should be careful not
to push the conclusions beyond what we can securely justify - and this is not much other
than a confirmation of the general conclusions of the TAR . We must resist being pushed
to present the results such that we will be accused of bias - hence no need to attack
Moberg . Just need to show the "most likely"course of temperatures over the last 1300
years - which we do well I think. Strong confirmation of TAR is a good result, given
that we discuss uncertainty and base it on more data. Let us not try to over egg the
pudding.
For what it worth , the above comments are my (honestly long considered) views - and I
would not be happy to go further . Of course this discussion now needs to go to the
wider Chapter authorship, but do not let Susan (or Mike) push you (us) beyond where we
know is right."

On my reading that looks like Briffa is being put under some pressure. Also, when Rind advises him to treat
these things in a forthright manner, Overpeck tells Briffa to take what Rind says with a grain of salt.

One line of defense is to argue that Briffa was put under pressure, but didn't succumb. That will lead to a discussion
of why he chose to redact the tree ring series when an expert reviewer requested that he show it, as he apparently had in the original published paper.

I'm not convinced Tiljander

I'm not convinced Tiljander meets criteria 1 or 2. I suppose upside-down usage meets 1.
I'd have to see if Tiljander makes a difference with criteria 2, but with 1200 proxies and that algorithm, I'd be surprised if it did.

Tiljander meets which

Tiljander meets which criteria?

Re: I'm not convinced Tiljander meets criteria 1 or 2 by MikeN (Sun, 06/27/2010 - 20:29) --

My view of the five criteria is presented in the body of Arthur's post.

Re: "(1) False result presented in a peer-reviewed article?" Proper due diligence would have revealed that the Tiljander proxies are uncalibratable to the instrumental temperature record, 1850-1995, due to increasing contamination by non-climatic local factors during the 19th and 20th centuries. The upside-down orientation of lightsum and XRD is simply fallout from using the correlation between those series and CRUtem gridcell temperature. The non-climate effects might as easily have yielded a spurious rightside-up-oriented correlation. In fact, in the case of darksum, they did.

Re: "(2) Falsity made a material difference to the overall message of [graphs]?" 1209 proxies were used, but most of them are tree-ring chronologies. One of the two main findings of this paper was supposed to be the concordance between tree-ring proxies and various other proxies. In addition, most of the proxies cover relatively recent times. The four Tiljander proxies are among the few (one to two dozen; I can't recall the exact number) that extend earlier than ~1000 AD. The second main finding was the extension of the paleotemperature reconstructions into the first millennium.

As best I can tell, the Tiljander proxies are used in the calculations to produce Figs. 2 (a,b,c,d), 3 (both parts), S2, S4 (8 of 12 parts), S5 (4 of 6 parts), S6 (4 of 6 parts), S7 (a,b), S8 (original and both revisions), S10 (both parts), S11 (both parts), S14 (both parts), S15 (both parts), and S16 (both parts). At Collide-a-scape, Gavin claimed the with-Tiljander and without-Tiljander traces in the twice-revised non-peer-reviewed Fig. S8a were "similar" (Comment #29). But Lucia inspected the same traces, and thought that the without-Tiljander curve showed much, much greater variability (Comment #45), a significant difference. So, what would recalculation of all the affected figures show? I don't know. At this point, some reasonable people can already see material differences to the overall message.

And as I noted in Comment #49, there is a subtle but important problem with Gavin's interpretation, if indeed the Tiljander proxies are uncalibratable. Because that strongly suggests that the addition of invalid data leads to a reconstruction that is nearly as good as one that is built on valid data. This ought to be a red flag, in my opinion.

In my comment "Tiljander

In my comment "Tiljander meets which criteria?" (Sun, 06/27/2010 - 23:16), I said,

The four Tiljander proxies are among the few (one to two dozen; I can't recall the exact number) that extend earlier than ~1000 AD. The second main finding [of Mann08] was the extension of the paleotemperature reconstructions into the first millennium.

Here are those numbers. By my count from the Mann08 SI file "1209proxynames.xls", including the four Tiljander proxies, there are 37 Northern Hemisphere series that extend earlier than 800 AD. Of these, 22 are not tree-ring series.

AMac - is it your

AMac - is it your understanding that Mann's procedure does some sort of weighting of the reconstruction that varies by time period, or is it some kind of constant weighting? That is, if a Tiljander curve is x% of the reconstruction in the year 500 AD, is it the same x% at 1000 AD and 1500 AD, or is it some different number? Not sure I'm asking the question properly (I'm really completely new to this reconstruction business) but I hope you get my drift. Thanks.

This is a response to

This is a response to Arthur's query in the comment "AMac - is it your".

Arthur, my understanding is that there would have to be two answers, one for the CPS procedure and one for the EIV procedure. I believe that in both cases, weighting varies, depending on the interval. I recall that the reconstruction period is broken into defined intervals, perhaps of a century each, and that a proxy's contribution in one interval need not be the same as in another one.

This philosophy would allow for the use of proxies that cover only part of the time period of interest. As noted above, very few of the 1209 proxies extend into the earliest times.
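As a rough illustration of why a proxy's share can change from one interval to the next, here is a toy sketch. It is not Mann08's CPS or EIV procedure: the equal weighting, the proxy names, the coverage dates, and the interval boundaries are all assumptions made up for the example.

```python
def proxy_share_by_interval(coverage, intervals):
    """Equal-weight share of each proxy in each interval.

    coverage:  proxy name -> (first_year, last_year) of available data
    intervals: list of (start_year, end_year)
    A proxy counts for an interval only if it covers the whole interval.
    """
    shares = {}
    for start, end in intervals:
        available = [
            name
            for name, (first, last) in coverage.items()
            if first <= start and last >= end
        ]
        weight = 1.0 / len(available) if available else 0.0
        shares[(start, end)] = {name: weight for name in available}
    return shares


# Hypothetical proxies; only one reaches back before 900 AD.
coverage = {
    "long_proxy": (200, 1995),
    "medieval_proxy": (900, 1995),
    "recent_proxy_a": (1400, 1995),
    "recent_proxy_b": (1500, 1995),
}
intervals = [(500, 600), (1000, 1100), (1500, 1600)]

shares = proxy_share_by_interval(coverage, intervals)
for interval, by_proxy in shares.items():
    print(interval, by_proxy)
```

In this toy setup the long proxy is 100% of the composite around 500 AD, 50% around 1000 AD, and 25% around 1500 AD, simply because fewer series are available in the earlier intervals.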

For a definitive answer, there are more knowledgeable people that could be approached. RealClimate.org's comment section can be pretty responsive to queries like this. "Bishop Hill" (Andrew Montford) wrote a book that covered this topic; posting the question on his blog could yield a "skeptical" perspective. We could see if they match.

Steve, there are two issues

Steve, there are two issues with Tiljander. The divergence issue, and the upside-down usage, which perhaps results from the divergence.
Kaufman cut off the later portion, and still managed to use it upside-down.

ya mike. The discrepency

ya mike. The discrepancy between Mann's use and Kaufman's use should get Arthur's attention. but it won't.

Steve, do you agree the

Steve, do you agree the Tiljander stuff is, in your view, the closest thing to a clear case of "fraud" or whatever you want to call it - something that actually affects the science, and seems to have been willfully ignored (and repeated?) by the climate science community? If you and AMac agree on this, I'll definitely put some effort into looking into it more. I've already started reading several of the threads AMac pointed to and I think I understand the claims a bit better now. I'll summarize when I've had a chance to read more.

Not fraud. Its more like

Not fraud. It's more like protective stupidity.

http://en.wikipedia.org/wiki/Crimestop

The history Mann has is rather clear. He refuses to openly and transparently make any corrections that would involve
giving credit to McIntyre. The rain in Seine is the best example.

It's the same pattern that does not allow people to even THINK about issues that fall short of fraud. Because if you admit to
any shortcomings the fear is ( I've heard it expressed many times) that the skeptics will take advantage of any slip, however minor.
It's that fear, fear that the slightest increase in doubt will lead to inaction, that drives the decisions to not even look at the quality issues.

You're talking a lot about

You're talking a lot about motive and intent here. I don't believe you have any way of knowing those things you say about Mann's motives are true; no further discussion of motive please. Let's once again at least try to stick to real substantiatable facts, ok?

I wouldn't call it fraud.

I wouldn't call it fraud. I'd say it is a mistake, which, when pointed out, went uncorrected.
I think TAR meets criteria 1-4, and will try to get those sources. I will pursue that on the old thread.
I think you have already agreed that it constitutes fraud if the details are as Mosher wrote.

http://www.pnas.org/content/106/6/E10.full?ijkey=687b9a07eb0706917914aa8...

From Mann's response:
The claim that “upside down” data were used is bizarre. Multivariate regression methods are insensitive to the sign of predictors. Screening, when used, employed one-sided tests only when a definite sign could be a priori reasoned on physical grounds. Potential nonclimatic influences on the Tiljander and other proxies were discussed in the SI, which showed that none of our central conclusions relied on their use.

1) One possibility is that Mann is correct and Steven McIntyre is simply wrong.
2) The other is Mann makes no effort to evaluate the issue, either due to a) incompetence, b) laziness, or c) anger. 2a could mean he reached the wrong conclusion as well.

3) The other is he is just being stubborn, which in this case I guess constitutes fraud. With another author making the criticism, perhaps he is more likely to concede.

I don't rule out 2a in this case.

The Stoat threads had William

The Stoat threads had William Connolley starting out confidently repeating claims from 'the Team' regarding Tiljander, and AMac got him to a position where he didn't sound so confident.

Jeff Id has some technical details.

By the way, most of the criticism seems to be with CPS, while EIV is ignored almost as a joke. I'll focus on CPS.

Here's my take on how Tiljander and Mann 08 operate.

The Tiljander proxy (there are actually 4, and I'm not sure which one I'm talking about) has a medieval warm period, and then shows the modern time to be very cold. In the proxy, high values represent cold, and low values represent warm, so the figure looks like a hockey stick.

The modern cold portion is considered to be influenced by non-climate factors, and should not be used.

Kaufman's Arctic warming paper cut off the late portion, but used the proxy upside-down, confused by the hockey stick shape. I believe a correction from Mann would have averted this mistake, as Kaufman shared a Mann coauthor. The calibration was also corrected to reflect that Tiljander was not used in the later portion.

Mann 08 takes 1200 proxies, then filters and calibrates the ones that match the modern temperature record. These filtered, calibrated proxies are then averaged together. Mann 08 did not cut off the proxy. Instead, the later proxy 'cold' period, which has high values, is interpreted as being correlated with temperatures, and the medieval warm period in the proxy is treated as cold, since it is a low value.

Mann said "Multivariate regression methods are insensitive to the sign of predictors."
Now this could be a Gavinesque 'you're an idiot, nothing to see here' response.
It could also mean that Mann thinks his algorithm is doing something different from what is claimed.
When I asked on ClimateAudit, I was told that the algorithm will not flip a proxy to match correlation; the proxies have to be input by hand in the proper orientation. So if the proxy values had been inverted, so cold is lower, and high is warmer, this proxy would have been dumped by the algorithm.
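To separate the two claims, here is a small sketch with made-up numbers. It is not Mann 08's code and does not settle which behavior the CPS step actually has; it only shows that a least-squares fit gives the same fitted values whether a predictor is entered as is or flipped, while a one-sided correlation screen (a hypothetical `passes_one_sided_screen` with an arbitrary threshold) depends on the orientation the series is entered in.

```python
def mean(xs):
    return sum(xs) / len(xs)


def correlation(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def ols_fit(x, y):
    """Fitted values from the least-squares line y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    intercept = my - slope * mx
    return [intercept + slope * a for a in x]


def passes_one_sided_screen(x, y, threshold=0.5):
    """Hypothetical screen: keep a series only if it correlates positively."""
    return correlation(x, y) > threshold


# Made-up calibration data: high proxy values go with cold temperatures.
temperature = [0.0, 0.1, 0.3, 0.2, 0.5, 0.6]
proxy = [5.0, 4.6, 4.1, 4.4, 3.2, 3.0]
flipped = [-v for v in proxy]

fit_original = ols_fit(proxy, temperature)
fit_flipped = ols_fit(flipped, temperature)

print(fit_original)
print(fit_flipped)
print(passes_one_sided_screen(proxy, temperature))
print(passes_one_sided_screen(flipped, temperature))
```

The regression absorbs the flip into the sign of the slope, so the fit is unchanged. The one-sided screen does not, so the same series is kept in one orientation and dropped in the other.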

By the way, I'm thinking this

By the way, I'm thinking this Tiljander business seems to be turning into a sort of amusing riff on the Briffa "truncation" archiving/publishing question.

That is, if Tiljander really thought their data was contaminated in the modern period, perhaps they should not have published it but instead published a truncated series that contained only the data they thought was reliable. Wouldn't that have cleared this problem right up?

That's a fascinating point,

That's a fascinating point, Arthur. Can you expand on it?

What would have been the optimum date for Tiljander et al to have truncated their archived data?

Why do you think I would know

Why do you think I would know the answer? I'm really completely new to this problem.

But if Tiljander and co. had good reason to believe it, they would surely have had good reasons to pick an "optimum date" to truncate too.

The best option would be Steve's #2 suggestion - publish both the full series (with caveat: *do not use this for reconstructions unless you know what you're doing*) and a truncated series ("suitable for use in reconstructions"). That way people unfamiliar with the details of the analysis would have a dataset that the originators felt was clean to use in their work, and people interested in looking into the divergence issues would have the full series to work with too.

Anyway, I think it's interesting that this highlights one serious problem with the "publish everything" approach...