Michael Mann's errors

[UPDATE July 1, 2010: Penn State just today issued a final report on their investigation into allegations of misconduct by Dr. Mann. Note their criteria for misconduct were as follows:

(1) fabrication, falsification, plagiarism or other practices that seriously deviate from accepted practices within the academic community for proposing, conducting, or reporting research or other scholarly activities;
(2) callous disregard for requirements that ensure the protection of researchers, human participants, or the public; or for ensuring the welfare of laboratory animals;
(3) failure to disclose significant financial and business interest as defined by Penn State Policy RA20, Individual Conflict of Interest;
(4) failure to comply with other applicable legal requirements governing research or other scholarly activities.
where "research misconduct does not include disputes regarding honest error or honest differences in interpretations or judgments of data, and is not intended to resolve bona fide scientific disagreement or debate."

The report concludes "there is no substance to the allegation against Dr. Michael E. Mann"; the worst they could say was that he was somewhat "careless" in sharing unpublished manuscripts with colleagues. However, this was a pretty high-level review, and seems it did not get into the "Tiljander" issue. So we'll see where that goes here.]

Deep Climate has a new post up concerning the IPCC TAR Figure 2.21 "hockey stick" curves prepared by lead author Michael Mann for the 2001 report. This follows up on my post looking into similar questions regarding the AR4 Figure 6.10b "hockey sticks" in the latest (2007) report. Despite Steven Mosher's claims that the same rather subtle "trick" of padding the Briffa curve with instrumental temperatures was used in both figures, my analysis showed it could not have been in the AR4 case. Deep Climate has now shown that it's possible this padding was used for the Briffa curve in the TAR figure, but it makes essentially no difference (0.01 degrees over about 10 years, almost imperceptible in the full graph). The Mann curve in the TAR figure ("MBH") clearly was padded from 1980 with instrumental temperatures in the end-point smoothing, while the Jones curve in that figure clearly was not. The details of Mosher's (and McIntyre's, in this case) accusations about Mann and Briffa also seem to be contradicted by the actual record as DeepClimate shows, but that's another matter.

In any case, there definitely was an issue with the way Michael Mann was doing end-point smoothing in several of his early published papers, and in at least one of the curves in this IPCC report. As noted in the previous discussion, Mann admitted to this several years ago and indicated it wouldn't happen again, and it doesn't seem to have. So, an error, with minor impact on a couple of curves, since repented of.

In my last post, I called for the best examples of anything close to fraud by climate scientists, in particular Mann. There certainly are a number of cases where he has made similar errors that seem not to have substantially affected the results, but still were indicative of a certain carelessness, and sometimes stubbornness in recognizing the problem. And there are other examples of minor errors by other groups, for example several glitches in the GISS instrumental temperature analysis over the years that caused minor shifts in historical temperature numbers, particularly regionally.

Of course the worst case of unacknowledged errors in climate science was probably the UAH satellite analysis, which for 26 years was substantially under-reporting warming due to an algebra error in the analysis.

People make mistakes; as long as they correct them when the error is detected, it's really not such a big deal. The question I raised in my "Where's the fraud?" post was whether there was any substantive error of some sort that had not been admitted or corrected after exposure.

The strongest case seems to be the Tiljander proxy issue in a 2008 paper, as outlined by AMac in comments on my last post, and reference links there. Since nobody has proposed anything stronger, I plan to look into it in some further detail to make sure I understand the alleged error myself in its proper context. Here's how I understand it from my reading so far of the case (which hasn't yet included looking at any of the scientific papers or data involved, merely reading comments from others):

* Several data series were published by Tiljander et al which they believed could act as a temperature proxy (in ways they described) but only up to about 1720; for more recent years the authors believed the data were contaminated by human influences that masked any temperature signal

* Michael Mann and coauthors included the Tiljander series in a 2008 reconstruction paper. They were apparently aware of the possible contamination issue and made note of it in supplemental material that included a graph comparing the reconstruction with and without the Tiljander (and a few other suspect) series

* What Mann apparently was not aware of, or did not acknowledge, was that the contamination, for at least some of the Tiljander series, was bad enough to reverse the correlation between temperature and their data, so that Mann's calibration method would spuriously turn some of the Tiljander temperature proxies upside down in the pre-1720 period (at least if Tiljander's temperature correlation claims are correct). Mann's calibration method was regression-based and so didn't care whether the input was "upside down" or the right way up, it would determine a linear coefficient of whatever sign made the modern temperature correlation work.

* Kaufman published a similar reconstruction in 2009, using a different calibration method that did not rely on modern-period calibration but did depend on each proxy having the right sign. Since some of the Tiljander proxies in that case were truncated to the earlier period, using them would have made sense, except that some were apparently used with the wrong sign. A correction was published when this was discovered.

* McIntyre and McKitrick published a comment on Mann 2008 making the "upside down" accusation, but not being clear that the issue was a twisting of the correlation in Tiljander itself

* Mann's reply indicated he didn't understand the twist either.

* There's been a lot of blogosphere discussion of this, but one side seems to keep saying "upside down" while the other side says "it doesn't matter to the reconstruction", i.e. the usual people talking past each other. The real problem is the twist in the data, which nobody seems to talk about.
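
To illustrate the sign-indifference point in the bullets above, here is a minimal Python sketch on synthetic data. It is not Mann's code or method: the series, the calibration window, and the function names `ols_fit` and `reconstruct` are all made up for illustration. It only shows that an ordinary least-squares calibration gives the same reconstruction whether a proxy is supplied the right way up or flipped, because the fitted slope simply changes sign.

```python
import random

def ols_fit(x, y):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def reconstruct(proxy, temps, calib):
    """Fit temps against proxy over the indices in `calib`,
    then apply that fit to the whole proxy series."""
    a, b = ols_fit([proxy[i] for i in calib], [temps[i] for i in calib])
    return [a + b * p for p in proxy], b

random.seed(0)
n = 200
calib = range(150, 200)  # hypothetical "modern" calibration window

# Synthetic temperatures and a synthetic proxy that rises with temperature.
temps = [0.01 * i + random.gauss(0, 0.1) for i in range(n)]
proxy = [2.0 * t + random.gauss(0, 0.2) for t in temps]
flipped = [-p for p in proxy]  # the same proxy, entered "upside down"

recon_up, slope_up = reconstruct(proxy, temps, calib)
recon_down, slope_down = reconstruct(flipped, temps, calib)

# The slopes have opposite signs; the reconstructions agree.
max_diff = max(abs(u - d) for u, d in zip(recon_up, recon_down))
print(slope_up, slope_down, max_diff)
```

The flip side of this indifference is the concern raised above: if contamination reverses the apparent correlation in the calibration window, the fit will happily pick the sign that works there and carry it back through the whole series.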

Open questions to me are:

(1) Is the "twist" real - was the original Tiljander paper correct in assignment of certain temperature correlations which reversed in the modern period?

(2) Should Mann and friends have figured out the twist from the blogosphere discussion, the M&M comment, and Kaufman's correction, rather than interpreting "upside down" literally and not understanding it, as their reply and various defenses since suggest?

(3) Is it possible, despite the one figure showing no substantive difference to the reconstruction, that use of Tiljander did make a material difference to some of the conclusions of Mann 2008 (and apparently later papers using the same data)?

If the answer to all three of these questions is "Yes", then that adds up to a somewhat damning allegation against Mann's work on this. Evidently he's made some minor errors in the past; if these allegations hold up, then I agree this one is a bit worse.

So, I'll spend a bit of time looking into these; references and help welcome in the comments, thanks.

Comments

Arthur, I salute you for your

Arthur, I salute you for your willingness to look into this. It seemed like a bottomless pit of confusion over a small matter to me so I did not take up AMac's invitation to look into it. There has been considerable discussion on Stoat about this, as doubtless you already know. My impression was only that William's conclusions did not satisfy AMac and that I was unlikely to have more patience for the matter than William did.

I think it's good that somebody takes a fresh look but even in the worst case I still don't see this rising to front page scandal status.

I think I've already done

I think I've already done this. http://scienceblogs.com/stoat/2009/11/tiljander_again.php (and links).

(1) I don't think calling it a twist, or a reversal of correlation, is helpful. The issue is the non-climate information in the recent series, which introduces what looks like an opposite correlation.

(2) Dunno. The original McI comment is "Their non-dendro network uses some data with the axes upside down, e.g., Korttajarvi sediments, which are also compromised by agricultural impact (M. Tiljander, personal communication), and uses data not qualified as temperature proxies (e.g., speleothem δ13C). " [http://www.pnas.org/content/106/6/E10.full?ijkey=6054afc9aeb848de052d626d7d74d9d42eb8291f&keytype2=tf_ipsecsha]. That to me simply gets the issue wrong, at least in the lead, and "compromised" is too weak for what McI asserts.

(3) I doubt it, for the reasons given in my post: the Tiljander series should have been strongly deweighted by Mea's algorithm and hence not matter much.

Yup - I just want to

Yup - I just want to understand it myself. Back when you were trying to understand it, I was ignoring it. It does seem to be the strongest case claimed against Mann, so if there's really not much to it, well...

I don't think you can say

I don't think you can say this is the strongest case against Mann unless you include a repeat usage in 2009.

Agreed, that adds

Agreed, that adds considerable weight (sorry I didn't list that in my summary).

Arthur, those are nice "open

Arthur, those are nice "open questions". Also good to see Michael Tobis and William Connolley weighing in.

On Question (1), here are the most relevant quotes from Tiljander et al (Boreas, 2003).

Figure 5 legend, page 570 -

High X-ray density corresponds to high amount of mineral matter (light grey value tints in X-ray film) and low X-ray density corresponds to dark grey values caused by a higher proportion of organic matter.

Figure 9 legend, page 573 -

LS (light sum) is the sum of grey values and describes the amount of mineral matter. DS (dark sum) = LSmax – LS and describes the amount of organic matter.

Page 571 -

The above-mentioned factors, the amounts of inorganic and organic matter, form the basis of the climate interpretations. Periods rich in organic matter indicate favourable climate conditions, when less snow accumulates in winter by diminished precipitation and/or increased thawing, causing weaker spring flow and formation of a thin mineral layer. In addition, a long growing season thickens the organic matter. More severe climate conditions occur with higher winter precipitation, a longer cold period and rapid melting at spring, shown as thicker mineral matter within a varve.

I agree with William that calling it a "twist" may not be helpful. I commented at AGW Observer that the discussion of the orientations of the lightsum and XRD proxies as interpreted by Tiljander and as employed by Mann08 often runs into semantic difficulties. Flipped, inverted, upside-down, (and twisted) have clear meanings in everyday English, but problems have repeatedly arisen. It is cumbersome but perhaps useful to refer to the specifics of the interpretation. I think Tiljander’s view is:

darksum — pre-1720, higher values generally correspond to higher temperatures.
lightsum — pre-1720, higher values generally correspond to lower temperatures.
XRD — pre-1720, higher values generally correspond to lower temperatures.
thickness — no explicit interpretation is offered.

The ultimate effect of the use of a given proxy on the final paleotemperature reconstruction must be one of the following:

1. For all years, a higher value of the proxy causes the reconstruction to report a higher temperature (than would be the case if that proxy had a lower value for that year).

2. For all years, a higher proxy value causes the reconstruction to report a lower temperature (than would be the case if that proxy had a lower value for that year).

3. For some years, a higher proxy value causes the reconstruction to report a higher temperature. For other years, a lower proxy value causes a higher reported temperature.

4. The value of the proxy in question has no effect on the reported temperature.

My interpretations are that -

#1 is consistent with Tiljander’s interpretation of darksum.

#2 is consistent with Tiljander’s interpretation of lightsum and XRD.

#3 is non-intuitive, but possible. To my knowledge, there is no evidence that Mann08 employed such an approach.

#4 is a trivial possibility, whereby the proxy has no effect.

With respect to Mann08, it appears to me that all four Tiljander proxies (darksum, lightsum, XRD, and thickness) were used such that they contributed to the paleotemperature reconstructions as outlined in #1.

With respect to Tiljander03's interpretation, Mann08's uses would appear to be in agreement for darksum and discordant for lightsum and XRD (no interpretation was offered for thickness).

(I have compiled links to relevant papers, archives, etc. here; additions and corrections welcome.)

Side note: Earlier, I

Side note: Earlier, I expressed frustration with controlling threading of comments at this site. Arthur recommended getting an account here (N.S.'s home page, left side). That has greatly helped with reading of comments, and with writing them.

Arthur, thanks for hosting these discussions.

Thanks everybody for your

Thanks everybody for your efforts to get to the core of this often confusing debate. Like Arthur I haven't read the source material so what I write here draws on what appears to be considerable agreement between Michael, William and AMac on what Mann actually did.

I'm not a paleoclimatologist and I'm not a statistician, but the effect on the final results seems to have been quite small, and we're told that Mann also produced supplemental material omitting proxies considered to be suspect, including those now in question.

It seems to be a characteristic of these critiques of Paleo work by Mann that they focus on minor errors, later detected and corrected, which have no actual effect on the conclusion. It's good to know that the conclusions of work in this field are so very robust!

AMac boringly continues to

AMac boringly continues to assert that Tiljander's interpretation of the data should be given more weight than anyone else's. While AMac probably feels he needs to do this in order to keep his arguments afloat, as a semi-interested non-technical observer of all of this I am moved to wonder what other baseless assertions he might be making.

> AMac boringly continues to

> AMac boringly continues to assert that Tiljander's interpretation of the data should be given more weight than anyone else's.

Steve Bloom, would you specify which paper you have in mind when you mention "anyone else's [interpretation of Tiljander's data]"? If you mean Mann08 itself, that would imply (1) that Mann08's authors indeed have a different interpretation than Tiljander et al presented in 2003, and (2) that it would be proper scientific procedure for Mann08's authors to greatly revise the original authors' interpretations without so stating. These issues were discussed in the prior thread; see e.g. Martin Vermeer's comments.

As to your speculations on my motives and character, they strike me as incorrect, and as ill-suited to a discussion of the topics that Arthur Smith set out in the body of this post.

AMac, I agree there's no need

AMac, I agree there's no need to discuss your motives and character.

Regarding your continuing assertion that it would not be "proper scientific procedure for Mann08's authors to greatly revise the original authors' interpretations without so stating," this would be according to who? I don't recall your answering that elsewhere, and yet you keep citing to Tiljander's interpretations as if your assertion is true and that they should thus somehow be granted greater weight than anyone else's. You need an authority for that, I'm afraid, and a pretty powerful one.

Tell you what though, I'll agree that it can be characterized as impolite (in a purely social sense) if Tiljander thinks so. Have you asked?

Steve Bloom, let us stipulate

Steve Bloom, let us stipulate for the moment that the Mann08 authors used the Tiljander lightsum and XRD proxies as described under (1) upthread (Arthur and Ari aren't sure if this is the case, yet).

1. For all years, a higher value of the proxy causes the reconstruction to report a higher temperature (than would be the case if that proxy had a lower value for that year).

There are two possibilities. My view of what probably happened is -

Prof. Mann and co-authors unwittingly performed faux calibrations of darksum, lightsum, XRD, and thickness to the instrumental record. Although they quoted Tiljander03's cautionary text on the problem of post-1720 contamination, they did not exhibit sufficient due diligence on this issue. Thus, as they went on with their reconstruction work and prepared their PNAS manuscript, they weren’t aware of what they had done. They simply missed it.

You are championing the notion that Mann08's authors knowingly oriented lightsum and XRD in a manner that contradicts the analysis of the proxies' creators in Boreas--which is the sole published interpretation of these proxies.

You are further proposing that Mann08's authors knowingly failed to disclose their unconventional interpretations in the abstract, introduction, methods, results, discussion, or supplemental information of their PNAS paper, despite highlighting other, lesser issues that might confound the interpretation of these proxies.

Finally, you assert that it would be proper scientific procedure for Mann08's authors to greatly revise Tiljander et al's interpretations without so disclosing. You demand that I produce a powerful authority in support of the idea that the published interpretations of the scientists who analyzed the varve series should "be granted greater weight than anyone else's." In this context, "anyone else" must indeed mean "anyone else," as you've offered no citations to other authors' competing interpretations of these proxies.

If it turns out that your account is correct, I think that would provide valuable information as to publication standards in climate science, as compared to other physical sciences. But in the absence of evidence, it seems unlikely.

For a powerful authority in support of my position, I offer Richard Feynman's well-regarded essay "Cargo Cult Science."

AMac, agreed with everything

AMac, agreed with everything except your use of the word 'faux'. Mann et al. knowingly oriented those two proxies in the way they did, by a method they believed on good grounds to be valid; what they missed was the disagreement with Tiljander's interpretation. Why can't you see that distinction?

'faux' implies that we (you as the user of the word, or majestic we as the scientific community) know what the correct orientation is. Do we? I would say this is a legitimate point of discussion. You have read Tiljander et al., you know how speculative this interpretation stuff is. Let's be humble on this ;-)

Martin Vermeer wrote, Why

Martin Vermeer wrote,

Why can't you see that distinction?

Why can't I see what distinction?

"Mann et al. knowingly oriented those two proxies in the way they did" -- yes, I have indicated that this is my opinion.

"by a method they believed on good grounds to be valid" -- yes, I have indicated that this is also my opinion (and satisfies Occam's Razor)

"what they missed was the disagreement with Tiljander's interpretation." -- yes, I have also indicated that this is also my opinion.

What's left?

'faux' implies that we (you as the user of the word, or majestic we as the scientific community) know what the correct orientation is. Do we?

I have repeatedly indicated, including in direct responses to your comments, that I believe that the correct orientation should be taken to be the orientation that was assigned by the scientists who originally analyzed these data. This is reinforced by the failure of those who argue otherwise to point to any authorities who have offered any contrasting interpretations of these varve proxies.

Perhaps you believe that this is an important avenue to explore further. Perhaps you, or Arthur, or Ari will decide to assert something along the lines of, "I believe that there is good evidence that XRD and lightsum should be interpreted such that higher values cause a paleotemperature reconstruction to report a higher temperature (than would be the case if that proxy had a lower value for that year)." At that point, yes, I agree that this issue will become, again, a legitimate point of discussion.

"Again", because that is close to the theory floated by Steve Bloom in his comment "AMac boringly continues to". This very comment is a response to your response to my response to Steve Bloom's response to my response to "AMac boringly continues to"!

'faux' implies that we... know what the correct orientation [of XRD and lightsum are.]

To be clear: Based (1) on the publications of the qualified authorities who have interpreted these proxies, and (2) on the reasonable physical interpretations that these experts have offered, it is my opinion that we can be reasonably confident that we know the correct orientations of XRD and lightsum.

So at the end you return to a

So at the end you return to a simple argument from authority. OK, I'll take that standard. Now, what are the relative qualifications of the Mann authors vs. the Tiljander authors? To underscore the obvious, the former have far greater authority in the field.

Steve Bloom, it seems to me

Steve Bloom,

it seems to me that the point of our exchange isn't for you to convince me, or for me to persuade you (though that would be nice).

Rather, we have each had the opportunity to explain our reasoning to interested readers, and to support our respective positions as best we can.

In that regard, I think the part of this comments thread that led from your "AMac boringly continues to" comment to here has served its purpose.

Arthur -- any thoughts on this?

Steve Bloom (not verified)

Steve Bloom (not verified) wrote:
Regarding your continuing assertion that it would not be "proper scientific procedure for Mann08's authors to greatly revise the original authors' interpretations without so stating," this would be according to who?

There is absolutely no question that proper scientific procedure dictates that if you cite a source, then deviate from the original interpretation of the findings of that source, you must clearly note this, and explain why.

It is perfectly appropriate to cite a paper and then disagree with its conclusions, but if you simply use the result without further comment, the reader must be able to assume that you agree with the original author. The only alternative would be that the reader would be required to follow all references and read all cited papers to find out where there is disagreement - and even then, you would be in the dark as to why the authors of the derived work decided to deviate from previous interpretations.

Aha, so it's according to Ulf

Aha, so it's according to Ulf (not verified). Well, then, AMac is vindicated: An anonymous commenter on the internet has agreed with him. It was doubtless the smallest of oversights that Ulf neglected to point to any of the doubtless multitudinous sources where this consensus about scientific procedure is documented.

Can you provide a single

Can you provide a single source which says it is okay to cite a paper's data, completely disagree with its conclusions, not mention the disagreement, and have that disagreement be used as a component of your analysis?

I am pretty sure most people will immediately think "misrepresentation" when they hear that.

Tony, if by "Michael" you

Tony, if by "Michael" you mean me, you are wrong about me. I've studiously said next to nothing about this, except that it doesn't strike me as earthshakingly important even in the least favorable (to Mann) interpretation.

I would nevertheless like to see a clear summary, and appreciate Arthur's efforts in that direction.

>As noted in the previous

>As noted in the previous discussion, Mann admitted to this several years ago and indicated it wouldn't happen again, and it doesn't seem to have.

What are you referring to when you say this?

This is the first time I've seen Mann mention this, and that was last year November.
http://www.realclimate.org/index.php/archives/2009/11/the-cru-hack/comme...

Ah, I hadn't checked the

Ah, I hadn't checked the date, you're right, I'm not aware of anything earlier. Nevertheless, it seems to have only been done with the MBH99 data, and not since.

I don't see much in the way

I don't see much in the way of repentance or error admitted in the comment. It is just a method being described. Reads to me as going out of the way to avoid conceding that instrumental temperatures were used.

Also, it is likely that it was used in MBH98 as well, since Phil Jones referred to Mike's Nature trick, and MBH99 was in GRL.

Well, maybe you're not

Well, maybe you're not accustomed to the ways in which scientists typically understate things. The comment explicitly stated that instrumental temperatures were used: "padding with the mean of the subsequent data (taken from the instrumental record)" but also includes what is tantamount to an embarrassed admission that they did things wrong earlier:

"The methods used for this end-point problem in smoothing are problematic, often ambiguous and various alternative approaches have been used ... Over the past 5 years ... we have favored an "optimal boundary condition" approach" ... In some earlier work though ... (instrumental)".

That's about as close as you ever see to somebody admitting wrongness in earlier work (at least in a case like this where it really doesn't make much difference - if there was a substantive effect you obviously ought to get much more explicit discussion of the issues).

MikeN, sorry I was actually

MikeN, sorry I was actually referring to Michael Tobis, whose comment above I took to be an endorsement of William M. Connolley's position as expressed on his own blog, Stoat. Although AMac still seems to have reservations I think there is now substantive agreement on the facts.

EDIT: Oh scrub that, I misread the attribution lines. Sorry for needlessly involving you, MikeN. Sorry for misrepresenting your position, Michael Tobis.

> is the twist

> is the twist real?

http://scienceblogs.com/stoat/2009/11/tiljander_again.php#comment-2050236
(prior discussion with amac about deciding by looking at pictures vs. data in papers about varve interpretation)

ps, a late pointer update

ps, a late pointer update (original one went bad) from the very end of that Stoat thread:
http://scienceblogs.com/stoat/2009/11/tiljander_again.php#comment-2508945

Hank Roberts, your citation

Hank Roberts, your citation of that Brauer et al paper on changes to the Lake Meerfelder varve record during the Younger Dryas (~12,700 BP) was a helpful addition to the conversation on Tiljander at Stoat. Thanks for supplying the updated URL for the reprint.

Our discussion of Brauer started with your Comment #16, continuing with comments 18, 19, 20, 23, 25, and 30, and winding up with my comment 32.

Arthur, your rendering of the

Arthur, your rendering of the Kaufman et al. method is too vague to be useful. Read the paper!

On the time line (search

On the time line (search eastangliaemails dot com with "kaufman"; otherwise google scholar):

- The Kaufman et al. manuscript was submitted on or before March 23, 2009.
- It came back asking for revisions on or before May 26.
- It appeared in Science Sept. 4 (but the electronic version may have been earlier).
- On Sept. 5, Kaufman receives his first hate mail :-( and discusses the Tiljander issue. Mann is cc:ed. A correction is outlined.
- On Nov. 27, Mann et al. 2009 "Global signatures.." appears in Science.
- In Feb. 2010 Kaufman et al.'s correction appears.

Note especially the several months' delays in the publication process.

Seems I remembered everything

Seems I remembered everything wrong, as I was out of the country earlier this year, and I remember the Kaufman correction coming out. I was the first to notify ClimateAudit about it.

Checking now, I see a post that dates a corrigendum at Oct 2009.

I'm going to work on this

I'm going to work on this code for the next few days. Will start with Steve McIntyre's R code, and go from there.
Let me start by pointing out that I think that Mann is incorrect in his PNAS comment regarding regression being blind to sign of predictor. From his matlab code:
for i=1:m1-1 % This is for searching annually-resolved proxies
    if (z(3,i)==9000 | z(3,i)==8000 | z(3,i)==7500 | z(3,i)==4000 | z(3,i)==3000 | z(3,i)==2000) &...
            x(kk,i+1)>-99999 & x(kkk,i+1)>-99999 &...
            z(1,i)>=ilon1 & z(1,i)<=ilon2 & z(2,i)>=ilat1 & z(2,i)<=ilat2 & z(ia,i)>=corra
        n=n+1;
        %%%% low pass filter to 0.1
        temp=x(kk:kkk,i+1)*sign(z(ia,i));
        [smoot,icb,ice,mse0]=lowpassmin(temp,0.1);
        yc(1:kkk-kk+1,n)=smoot;
        locc(1:2,n)=z(1:2,i);
    end
end

Note the *sign(z(ia,i)), suggesting a flipping of the proxy to match the temperature record.
However, prior to that, we already have if z(ia,i)>=corra, and corra=.106 so the Tiljander series is already dropped if it is negatively correlated. There are other sections of code that have abs(z(ia,i))>=corr... but Tiljander is in class 4000.

Remains to be tested, as I have no idea what this code is doing so far.

MikeN, you may be better off

MikeN, you may be better off trying to set up a Matlab environment to run the actual code. This piece of code doesn't look pretty (ugh!) and if the rest is like this, I doubt you can infer behaviour from just looking at it.

MikeN, I take back the

MikeN, I take back the previous. I just looked at the relevant code myself, and find it well-written and it is quite obvious what it does.

http://www.meteo.psu.edu/~mann/supplements/MultiproxyMeans07/code/codecps/

Relevant files are gridproxy.m and gridboxcps.m. One produces a (large) amount of output which is further processed by the other.

As you say, the sign() function will flip a proxy if it is input the wrong way around. After that, in gridboxcps, the proxy will be 1) normalized -- i.e., its mean centered on zero and its variance scaled to unity -- and 2) calibrated to the instrumental values in its 5x5 degree grid box, again by shifting the mean to the instrumental mean, and scaling the variance to the instrumental variance (as the name "composite plus scale" implies). As variances (or their roots, standard deviations) are always positive, this does not further change the proxy orientation at any point.

So, it is clear that Mann's claim that the CPS algorithm as implemented is 'blind' to a proxy's input orientation, is correct. It is also true however as you say, that the screening applied at input is not 'blind'. The code distinguishes three cases (and agrees with the description in the paper):
1) It is a priori known that this proxy type is oriented upward. This includes the Tiljander proxies (which may, as we now know, be a wrong assumption for two of them).
2) It is a priori known that this proxy type is oriented reversed.
3) We have no a priori knowledge of this proxy type's orientation.
This is done separately for annually-resolved and decadally-resolved proxies.

Only in case 3) will the screening procedure also be 'blind' to the input orientation, as two-sided screening is used (the abs() function). In the other cases a proxy that is input 'upside down' will be thrown out already at the (one-sided) screening stage. So it is also true to say that it is not possible to input a proxy the wrong way around. That is, if the a priori assumed orientation for this proxy type is correct. And for Tiljander, that is precisely the twist...

Ugh.

I've been busy and haven't run any code yet, but it appears we agree on what the code is doing.

>So it is also true to say that it is not possible to input a proxy the wrong way around.
That's not true. If you input a proxy the wrong way, it gets dropped, which is not the intended effect.

Saying Mann is correct in this instance is 'bizarre'. He was specifically asked about Tiljander, and responds that the algorithm is blind. Several commenters went off of this and thought that a proxy would be flipped back if it was entered upside-down. This only applies in case 3. Did you determine what proxies go into this case?

>That's not true. If you input a proxy the wrong way, it gets dropped, which is not the intended effect.

MikeN, I agree -- with the benefit of hindsight. But, if you not only flip the proxy but also re-classify it as upside down, it will get through the pre-screening, and after that, everything is the same. And EIV doesn't even do a pre-screening. Not so bizarre, I would think.

It is IMHO hopeless to try to reconstruct the thinking processes of Mann or McI. It is clear that they were talking past each other.

...and given that McI's claim was that the proxies were input 'upside down', pointing out that you cannot even do that (whether due to them being flipped back, or not even getting through the pre-screening, or the whole algorithm being sign insensitive) is, I think, pertinent. Even if Mann only made part of this argument.

It's certainly helpful to look at the Mann08 code, as Martin Vermeer, MikeN, and Ari Jokimäki are doing (have done).

It is also worth returning now and again to the main questions that Arthur proposed examining. He listed them in the body of this post, after "Open questions to me are:". I have rephrased them in my own words to avoid the use of the terms "twist" and "upside down." Here and elsewhere, use of such terms can derail discussion, due to different interpretations applied by various commenters.

(1) Was the original Tiljander03 paper correct in assigning these temperature correlations to the pre-1720 period:
Darksum - higher signals correlate to warmer temperatures;
Lightsum - higher signals correlate to cooler temperatures;
X-Ray Density - higher signals correlate to cooler temperatures.

Tiljander03 claims that these climatic signals were progressively overwhelmed, beginning in the 18th Century and continuing through the 20th Century. They state that increasing agricultural activity, roadbuilding activity, and lake eutrophication led to these non-climate-related trends:
Darksum - local activities led to generally increasing signals over that time;
Lightsum - local activities led to generally increasing signals over that time;
X-Ray Density - local activities led to generally increasing signals over that time.

Is the account in Tiljander03 correct? Do plausible alternative accounts exist?

Mann08 did not explicitly refer to Tiljander03's pre-1720 climate-based correlations. Mann08 explicitly referenced Tiljander03's post-1720 cautions about non-climate contamination of the proxies. Did Mann08 go on to properly take these cautions into account?

Did the Methods of Mann08 correctly calibrate Darksum, Lightsum, XRD, and Thickness to the instrumental temperature record, 1850-1995? As a side question: if not, were Lightsum and XRD calibrated as follows for the reconstruction period, 200-1850:
Lightsum - higher signals correlate to warmer temperatures;
X-Ray Density - higher signals correlate to warmer temperatures.

(2) Should Mann and coauthors have figured out the problems with calibration of the Tiljander proxies to the instrumental record from the discussion in the blogosphere, then from the M&M Comment in PNAS, and then from the correction to the Kaufman Science 2009 manuscript? Is the Mann et al. Reply to M&M in PNAS responsive to the criticisms of the calibration issues that were raised? In particular, does the Reply's discussion of "upside down" constitute a failure to deal with the calibration issues?

(3) Is it possible, despite the third version of Fig. S8a showing no [or "modest" - AMac] substantive difference to the temperature anomaly trace in the reconstruction period, that use of the Tiljander proxies made a material difference to some of the conclusions of Mann08? Would that criticism extend to some conclusions of later papers using the same data in similar fashion?

I hope that the investigations of the handling of the Tiljander proxies by the code for the CPS and EIV procedures will allow some of these points to be settled. In particular, it should allow resolution of these questions:

For Lightsum, under CPS as implemented in Mann08: during the reconstruction period, do higher signals correlate to cooler temperatures or to warmer temperatures?

Similarly for Lightsum, under EIV.
Similarly for XRD, under CPS.
Similarly for XRD, under EIV.

AMac, the only question that has some scientific relevance is (3). The rest is lawyerly stuff for which I have no patience. But to answer (3), I must answer part of (1), so here goes.

(1) I believe that the Tiljander orientations are likely correct, and their caution on the contamination in later years almost certainly correct.

(3) I do not believe this is realistically possible. Any effect from the mishandling of Tiljander drowns in the other, substantial uncertainties both statistical and structural, especially going back in time before 1000AD. These uncertainties are well described in the paper. I would have no hesitation to use these reconstructions in my own work.

Ad (2):

>Should Mann and coauthors have figured out the problems with calibration of the Tiljander proxies to the instrumental record from the discussion in the blogosphere,

Thanks for making me laugh ;-)

Laughter's indeed a tonic. But as for (2), you should probably thank Arthur: my revisions to wording, but that was the second point that he raised near the end of his post.

I wonder if you would expand on your statement

>(1) I believe that the Tiljander orientations are likely correct

There are (of course) two possible orientations:

* Higher proxy signal corresponds to higher temperature, and
* Higher proxy signal corresponds to lower temperature.

In the case of the Lightsum and XRD proxies, which orientations were employed in Mann08's CPS and EIV analyses?

AMac, I see now that I was ambiguous. What I meant was that the understanding, or interpretation, by Tiljander et al, on what was the proper orientation of these proxies, was likely correct.

Mann et al. of course assumed all lake sediment proxies to be oriented 'upward'. The paper says so, and -- if what MikeN asserts, that 4000 is the code for this proxy type, is true -- so does the code.

Martin Vermeer, thanks for the clarification.

Here is what I understand you to be saying in the just-prior comment.

* Tiljander03 was likely correct in their interpretation of Lightsum and XRD: that higher values correspond to lower temperatures.

* Mann08 assumed all the Tiljander proxies were oriented 'upward.' Thus, Mann08 interpreted Lightsum and XRD in this fashion: that higher values correspond to higher temperatures.

Is this paraphrasing correct?

Yep.

Martin, your clarity is much appreciated. Ari Jokimäki appears to have arrived at this conclusion, as well.

MikeN, was that the CPS code? CPS does depend on the sign. The EIV method of analysis does not depend on the sign of the predictor as far as I know (if this is the EIV code, that might be what you are seeing with the sign function).

It was the CPS code. Based on ClimateAudit comments, the EIV code is a total mess and not a serious algorithm at all. Basically upside-down Tiljander times ten.

Not true IMO, see my analysis elsewhere on EIV. It is indeed, as John Sully points out, insensitive to a predictor's orientation and does not include a screening step. So Mann's assertion is 100% correct here.
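The general reason a regression-based method is insensitive to a predictor's orientation can be shown with a toy Python example. Note the assumptions: this uses plain one-variable least squares as a stand-in, not the EIV/RegEM procedure in Mann08, and the names and numbers are invented for illustration.

```python
def ols_fit(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(x, slope, intercept):
    """Fitted values for the predictor series x."""
    return [intercept + slope * a for a in x]

# Toy data (hypothetical): a predictor (proxy) and a target (temperature).
proxy = [1.0, 2.0, 4.0, 3.0, 5.0]
temperature = [0.6, 0.5, 0.2, 0.3, 0.1]
flipped = [-a for a in proxy]  # same proxy, opposite orientation

slope, intercept = ols_fit(proxy, temperature)
slope_f, intercept_f = ols_fit(flipped, temperature)

fitted = predict(proxy, slope, intercept)
fitted_flipped = predict(flipped, slope_f, intercept_f)

# Flipping the predictor flips the sign of the fitted coefficient,
# so the fitted temperatures come out the same either way.
print(slope, slope_f)
print(max(abs(a - b) for a, b in zip(fitted, fitted_flipped)))
```

The orientation that ends up being used is whatever the calibration data dictate, which is also why the quality of the calibration-period data matters for such a method.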

Here's some strong evidence in favor of the second part of the Gavin/toto answer to the Calibratability Question ("I don't know, and it doesn't matter.") To be clear, this argues against my assertion that "it does matter."

Back in September 2008, blogger Jeff Id did some analysis of the proxies used in Mann08, and emulated some of Mann08's reconstructions, with much success. His post is Mann 08 Series Weight Per Year.

Jeff estimates the contributions of various types of proxies at various points in time. His figure, "Percent Contribution to M08 Temp Reconstruction by Year" is worth examining in detail. It shows that in the earliest parts of the reconstruction he studied -- say, up through ~1100 or ~1300 -- the results are dominated by two types of proxies: cave precipitation records (yellow), and the Punta Laguna proxies (greenish-blue).

The contribution of the Tiljander proxies (gray) is quite modest (well under 5%), throughout.

The "Tiljander" argument matters, even if Jeff Id's graph "Percent Contribution to M08 Temp Reconstruction by Year" and/or twice-revised Fig. S8a correctly represent the modest contribution of these proxies to the reconstruction.

Comments in a recent post by IPCC AR5 WG2 author Ed Carr provide a timely illustration. The 9 July 2010 post at "Open the Echo Chamber" is "Apparently, we have learned nothing..." In the ensuing 50+ comment thread, some skeptics chime in with, well, skepticism about the IPCC process and the AR4 report. Here, knowledgeable and articulate commenter 'caerbannog' reviews some reasons why criticisms of Prof. Mann's work by Steve McIntyre are ill-founded. He says,

...you can generate hockey-stick-shaped leading principal components via [a non-centered PCA] method. But there’s an easy way to distinguish a “noise” hockey-stick scenario from one where the hockey-stick results from a real temperature signal.
You look at the eigenvalue magnitudes...

Any competent analyst trying to extract a temperature signal from proxy data via the PCA method would look at the eigenvalue magnitudes before deciding whether to proceed with the regression steps.

Had Mann’s eigenvalues looked like McIntyre’s noise eigenvalues, Mann would most likely said to himself, “There’s not much of a common temperature signal in my tree-ring data here; don’t think that I will be able to do much with it.”

If Mann’s tree-ring eigenvalues had looked anything like McIntyre’s noise eigenvalues, Mann certainly would have realized that his tree-ring data did not contain any temperature information worthy of publication.

This whole “spurious hockey-stick” argument used against Mann is completely without merit.

Later in that conversation with 'Nullis in Verba', 'caerbannog' adds,

Having a solid background in formal math/statistics is important, but it is also very important to be able to “relate” the numbers you’ve crunched with physical, real-world processes. That’s where many stats/mathematics types come up short — they haven’t worked with “real world” data enough to get practical experience with “real world” scenarios.

And there’s no substitute for working with lots of “real world” data to get an appreciation for this.

At the (current) tail of the comments, I link to this post of Arthur's and remark,

...The use of proxies in the Mann group’s September 2008 article in PNAS (Mann08) sheds much light on [the use by Prof. Mann of screening, verification, and calibration steps in the evaluation and use of proxy datasets for paleotemperature reconstructions.] In particular, the Lake Korttajarvi varved lakebed sediments characterized by Tiljander et al (2003) are an important test case of this claim:

“Mann08 demonstrates methods of proxy selection and calibration for paleotemperature reconstruction that are robust.”

In my opinion, analysis of Mann08's use of the Tiljander proxies shows that this recent, high-profile paper clearly fails the “robustness” claim.

The dialog between 'caerbannog' and 'Nullis in Verba' demonstrates that many people who are conversant with much of the technical detail of the controversies about the "Hockey Stick" are unaware of the issues associated with the selection and use of the Tiljander proxies. Being uninformed, they have not considered Tiljander's implications for the larger arguments they advance.

AP: I don't think Mike's comment was very direct or repentant either. He was very quick to mix in the "but it doesn't matter". (This is a trait with him.) Also, I have seen other places where he concedes a point in such an opaque manner that it does not even read like he is correcting himself. Read back the differencing of the Ritson work in how red are your proxies. Mike, 2 or 3 times in a row, said there was no self-differencing by Ritson, then after even sympathetic commenters called him on it the 3rd time, he finally changed his story. But his comment was NOT "I was wrong 3 times, it is self-differenced", instead it was a very opaque statement. Read back to that Ritson statement and see what I mean. I think Moshpit (who I have criticized some) did better in coming clean...

P.S. Mann and McI are both similar in their reluctance to admit things. I couldn't even get Mann to make a comment (one way or the other) as to whether he was amiss in not documenting the short centering PCA. Not even asking if the method was right or wrong, just if it should have been mentioned! (Note: and it doesn't change the recon much, but it sure changes PC1 and PC1 was a shown graphic in his paper).