Proposition 04: False

Tuesday, January 5th, 2010

Well, the final GSOD readings for 2009 have now been released. Time to finish this off…

rmw42@pandora:~/NOAA$ ./average 2009
rmw42@pandora:~/NOAA$ cat Average-* | sort > Averages.txt
rmw42@pandora:~/NOAA$ cat Averages.txt | ./warm5-2009
1484 stations had 2009 as one of their 5 warmest years
4187 stations did not have 2009 as of their 5 warmest years
11746 stations rejected for having insufficient data

Only 26% of the weather stations with enough data (1484 of 5671) have 2009 anywhere in their top 5 – I guess that’s a reason for the Met Office not to count their climate chickens too early, as December was brutally cold and pushed the year out of the record books.
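For anyone wanting to reproduce this kind of count without the original scripts, here is a minimal sketch of a "top N warmest years" test in Python. It is not the code behind warm5-2009: the data layout (a dict of station -> {year: annual mean}) and the min_years rejection threshold are assumptions made purely for illustration.

```python
def in_top_n(year_means, target_year, n=5, min_years=10):
    """Return True/False if target_year is/isn't among the n warmest years,
    or None if the station has too little data to judge.

    year_means: dict mapping year -> annual mean temperature (hypothetical layout).
    min_years:  assumed rejection threshold, not taken from the original script.
    """
    if target_year not in year_means or len(year_means) < min_years:
        return None
    # Sort the years from warmest to coldest and keep the first n.
    warmest = sorted(year_means, key=year_means.get, reverse=True)[:n]
    return target_year in warmest


def tally(stations, target_year, n=5, min_years=10):
    """Count stations as (in top n, not in top n, rejected)."""
    counts = {True: 0, False: 0, None: 0}
    for year_means in stations.values():
        counts[in_top_n(year_means, target_year, n, min_years)] += 1
    return counts[True], counts[False], counts[None]
```

The percentage quoted above is then just the first count over the sum of the first two: 1484 / (1484 + 4187) is about 26%.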

GHCN data

Monday, December 14th, 2009

I’ve been looking at the GHCN data as a possible replacement for, or addition to, the GSOD series. It’s mostly a set of monthly averages, though there is a 1.7 GB lump of “daily” data (which, glancing at extracts, appears to have a huge number of missing entries).

The GHCN does appear to have better coverage of Africa and of 1934, so it will be interesting to see how that affects the results.

Proposition 04: Chillin’

Monday, December 14th, 2009

Still only provisional results. The previous results were from the 2009 GSOD data to December 3rd. I’ve just downloaded the data up to December 10th and re-run the code:

rmw42@pandora:~/NOAA$ cat Averages.txt Average-2009.txt | sort | ./warm5-2009
3294 stations had 2009 as one of their 5 warmest years
2303 stations did not have 2009 as of their 5 warmest years
11767 stations rejected for having insufficient data

So, adding a week of cold December temperatures, the split has gone from 3665:1906 (66%:34%) to 3294:2303 (59%:41%). Still well in favour of 2009 being a very hot year, but another three weeks of cold might tip the balance…

Proposition 06: Probably False

Sunday, December 13th, 2009

A fairly simple check to compare two years’ annual mean temperatures:

rmw42@pandora:~/NOAA$ cat Averages.txt | ./compareyears 1934 1998
12 stations had 1934 warmer than 1998
3 stations did not have 1934 warmer than 1998
15139 stations rejected for having insufficient data

As expected, 1934 is warmer, but there aren’t many stations that far back in the daily data – the daily records only start in 1929. This would need to be confirmed against the raw monthly GHCN figures, which are more complete for the ’30s.

The deltas are:

1.7F, 0.3F, 1.2F, 0.3F, 0.0F, -0.8F, 3.4F, -0.8F, -0.4F, 1.5F, 1.7F, 3.6F, 3.1F, 8.5F, 1.7F

These give a mean difference of +1.66F, median of +1.49F, and quartiles of +0.04F and +3.06F.

0.01C is 0.018F, which is less than the lower quartile, so at least 75% of the results (actually 80%) are greater than the claimed difference. 15 samples is about the minimum required to assume that a Gaussian approximation is valid, and these results strongly suggest that the claim is false, but more data would be helpful.

Provisional result: 1934 was warmer than 1998 by 0.8-0.9 degrees Celsius.
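The summary figures can be re-derived from the fifteen deltas listed above. The sketch below uses Python's statistics module. It works from the rounded one-decimal values, so the results differ slightly from the quoted ones (which presumably came from unrounded data), and the quartile convention used here (the module's default "exclusive" method) is an assumption about how the original quartiles were computed.

```python
import statistics

# Per-station deltas (1934 minus 1998, in Fahrenheit), as listed in the post.
deltas_f = [1.7, 0.3, 1.2, 0.3, 0.0, -0.8, 3.4, -0.8, -0.4,
            1.5, 1.7, 3.6, 3.1, 8.5, 1.7]


def f_diff_to_c(delta_f):
    """Convert a temperature *difference* from Fahrenheit to Celsius.

    Differences scale by 5/9 only; the 32-degree offset does not apply.
    """
    return delta_f * 5.0 / 9.0


mean_f = statistics.mean(deltas_f)
median_f = statistics.median(deltas_f)
# Default method is 'exclusive'; other quartile conventions give other values.
q1_f, _, q3_f = statistics.quantiles(deltas_f, n=4)

print(f"mean     {mean_f:+.2f}F = {f_diff_to_c(mean_f):+.2f}C")
print(f"median   {median_f:+.2f}F = {f_diff_to_c(median_f):+.2f}C")
print(f"quartile {q1_f:+.2f}F and {q3_f:+.2f}F")
print(f"claimed  0.01C = {0.01 * 9.0 / 5.0:.3f}F")
```

On the rounded inputs this gives a mean of about +1.67F (0.93C) and a median of +1.5F (0.83C), which is where the 0.8-0.9 degrees Celsius range comes from.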

Proposition 06: 1934 vs 1998

Sunday, December 13th, 2009

Someone called iCowboy posted a comment on Iain Dale’s Diary saying:

1934 vs. 1998 – the difference is 0.01 Celsius between those two years, but the statistics definitely show that 1998 sits well within a warm period, 1934 was generally cooler.

in response to claims that 1934 was actually warmer than 1998.


1934 is (at most) 0.01 degree Celsius warmer than 1998


Scan through the station-year records, find any which cover both 1934 and 1998 with reasonably-complete (>90%) data and check the temperature difference between them.
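As a rough illustration of that scan, here is a minimal Python sketch. It is not the compareyears script itself: the record layout (station id, year, annual mean, fraction of days present) is a hypothetical one chosen for the example, and the real Averages.txt format may well differ.

```python
def compare_years(records, year_a, year_b, min_coverage=0.9):
    """Compare two years station by station.

    records: iterable of (station_id, year, mean_temp, coverage) tuples,
             where coverage is the fraction of days with data (hypothetical layout).
    Returns (deltas, rejected): deltas maps station -> mean(year_a) - mean(year_b)
    for stations with usable data in both years; rejected counts all the others.
    """
    all_stations = set()
    usable = {}
    for station, year, mean_temp, coverage in records:
        all_stations.add(station)
        # Keep only the two years of interest, and only if reasonably complete.
        if year in (year_a, year_b) and coverage > min_coverage:
            usable.setdefault(station, {})[year] = mean_temp

    deltas = {
        station: years[year_a] - years[year_b]
        for station, years in usable.items()
        if year_a in years and year_b in years
    }
    return deltas, len(all_stations) - len(deltas)


if __name__ == "__main__":
    sample = [
        ("S1", 1934, 55.0, 0.95), ("S1", 1998, 53.5, 0.99),
        ("S2", 1934, 60.0, 0.50), ("S2", 1998, 61.0, 0.99),  # 1934 too sparse
        ("S3", 1998, 40.0, 1.00),                            # no 1934 record
    ]
    print(compare_years(sample, 1934, 1998))
```

A positive delta means the first year was warmer at that station, so the "warmer" count is simply the number of positive deltas.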

The Met Office speaks

Wednesday, December 9th, 2009

Here is a possible source for some of Jo Steele’s comments in the Metro.

This gives us another, clearer formulation for Proposition 03:

The decade 2000–2009 has been warmer, on average, than any other decade in the previous 150 years;

as well as some other statements which might bear investigation.

Proposition 04: Possibly True

Wednesday, December 9th, 2009

Not going to confirm this one yet, as there are ~4 weeks’ more readings still to come – in the coldest part of the year in the Northern Hemisphere – which I expect will drag the annual mean temperature down somewhat.

However, using the data to the first week of December, we get:

rmw42@pandora:~/NOAA$ cat Averages.txt Average-2009.txt | sort | ./warm5-2009
3665 stations had 2009 as one of their 5 warmest years
1906 stations did not have 2009 as of their 5 warmest years
11771 stations rejected for having insufficient data

which shows the proposition to be true in 66% (i.e. near enough two thirds) of stations for now, though if the December average is something like 10F colder than the summer this could change substantially.

It also shows a potential flaw in my “>=90% of days” check – a year can miss out almost an entire month and still be “OK”. That month could just as easily be July as January, and so could move temperatures up or down. I believe missing days will be somewhat randomly distributed, but this should be checked.
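To make the flaw concrete, the sketch below builds a year with every day of July missing and runs it through a 90%-of-days check. It also shows one possible stricter test, requiring each calendar month to meet the threshold on its own. That per-month check is a hypothetical tightening for illustration, not something the existing scripts do, and the set-of-(month, day) representation is likewise an assumption.

```python
import calendar


def all_days(year):
    """Every (month, day) pair in the given year."""
    return {(m, d)
            for m in range(1, 13)
            for d in range(1, calendar.monthrange(year, m)[1] + 1)}


def annual_ok(days_present, year, threshold=0.9):
    """The existing style of check: enough of the year's days are present."""
    return len(days_present) / len(all_days(year)) >= threshold


def monthly_ok(days_present, year, threshold=0.9):
    """A stricter, hypothetical check: every month must meet the threshold."""
    for month in range(1, 13):
        days_in_month = calendar.monthrange(year, month)[1]
        present = sum(1 for (m, _) in days_present if m == month)
        if present / days_in_month < threshold:
            return False
    return True


# A year with all of July missing: 334 of 365 days, about 91.5% coverage.
no_july = {(m, d) for (m, d) in all_days(2009) if m != 7}
print(annual_ok(no_july, 2009))   # True: passes the annual check
print(monthly_ok(no_july, 2009))  # False: fails once months are checked
```

Whether the stricter test is worth the extra rejected stations depends on how the missing days are actually distributed, which is the thing that still needs checking.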

Proposition 05: False or Undefined

Wednesday, December 9th, 2009

Well, that one was remarkably easy. Take the sorted annual mean recorded temperatures for each station and see if the top result is 1998:

rmw42@pandora:~/NOAA$ cat Averages.txt | ./topyear
1163 stations had their warmest year in 1998
4985 stations did not have their warmest year in 1998
9970 stations rejected for having insufficient data

That would be a “no”, then. 19% of stations report 1998 as their warmest year, which means 81% do not. Clearly the proposition is not true.

19% is quite a lot, though, and 1998 was clearly a very warm year. Is it likely that another year scores a much higher proportion? If not, the proposition is ill-defined: there is no such thing as a warmest year for the entire planet, as different parts are warm at different times. That’s different from the proposition being false – instead the proposition is meaningless.

Possible follow-up: what does the histogram of “warmest years” look like? Would need to be careful to take account of incomplete coverage… Similarly, the Mean Reciprocal Rank of a year would be interesting to see.
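Both follow-ups are straightforward to sketch. The Python below is only an outline under assumed inputs (a dict of station -> {year: annual mean}, a hypothetical layout), and it deliberately ignores the coverage problem: stations with short records give a year a better rank more easily, so the raw numbers would need normalising before being trusted.

```python
from collections import Counter, defaultdict


def warmest_year_histogram(stations):
    """Count how many stations have each year as their single warmest."""
    return Counter(max(year_means, key=year_means.get)
                   for year_means in stations.values() if year_means)


def mean_reciprocal_rank(stations):
    """For each year, average 1/rank over the stations that report it.

    Rank 1 is a station's warmest year. No correction is made for record length.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for year_means in stations.values():
        ranked = sorted(year_means, key=year_means.get, reverse=True)
        for rank, year in enumerate(ranked, start=1):
            totals[year] += 1.0 / rank
            counts[year] += 1
    return {year: totals[year] / counts[year] for year in totals}


if __name__ == "__main__":
    demo = {
        "A": {1997: 10.0, 1998: 12.0, 1999: 11.0},
        "B": {1997: 9.0, 1998: 8.0, 1999: 10.0},
    }
    print(warmest_year_histogram(demo))
    print(mean_reciprocal_rank(demo))
```

A year that is every station's warmest would score an MRR of 1.0, so the score gives a graded alternative to the all-or-nothing "warmest year" count.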

Propositions 03, 04, 05: Hot decade

Wednesday, December 9th, 2009

Today’s Metro article “China lambasts US and EU emission pledges” by Jo Steele contains a barrage of claims:

Meanwhile, the Noughties have been the warmest decade on record and this year has been one of the five hottest, weather experts warned.

While 1998 remains the hottest single year since records began in 1850, the last ten years have been the warmest, the Met Office revealed.

Wow. It ends with what looks like a repeat of (false) Proposition 02, but the others can stand a bit of scrutiny.

Proposition 03:

The Noughties have been the warmest decade on record

Proposition 04:

This year [2009] has been one of the five hottest [years on record]

Proposition 05:

1998 remains the hottest single year since records began


Well, propositions 04 and 05 seem fairly straightforward – top 1 and top 5s to go with the top 10 analysis already done. A bit of Perl hacking should get results for those today, though since we’re still in 2009 the final result will have to wait until January. Proposition 03 will take a bit more work, but not much – do we consider rolling decades or bucket the temperature record by (year/10)? Rolling decades is probably the more general version, but both are interesting tests to run.
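The two ways of defining a decade can be sketched side by side. This is a Python outline rather than the Perl that will actually be used, and it assumes a single station's record held as a dict of year -> annual mean (a hypothetical layout). Both functions skip decades with any missing year, which is one choice among several.

```python
def bucketed_decades(year_means):
    """Group years by (year // 10); return {decade_start: mean} for full decades."""
    buckets = {}
    for year, temp in year_means.items():
        buckets.setdefault((year // 10) * 10, []).append(temp)
    return {decade: sum(temps) / len(temps)
            for decade, temps in buckets.items() if len(temps) == 10}


def rolling_decades(year_means):
    """Return {start_year: mean} for every run of 10 consecutive years present."""
    means = {}
    for start in sorted(year_means):
        window = [year_means.get(year) for year in range(start, start + 10)]
        if None not in window:  # skip windows with a missing year
            means[start] = sum(window) / 10
    return means


if __name__ == "__main__":
    # Synthetic, steadily warming record purely for demonstration.
    record = {year: float(year - 1990) for year in range(1990, 2010)}
    print(bucketed_decades(record))
    print(max(rolling_decades(record).items(), key=lambda kv: kv[1]))
```

The bucketed version only ever compares 1990-1999 with 2000-2009 and so on, while the rolling version would also notice if, say, 1996-2005 beat both.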

So, time to get cracking!

First draft of distance-to-ocean

Monday, December 7th, 2009

Have reprocessed the data from yesterday, and graphed it.

Africa has some problems with station counts – there are none producing daily data in the 1950s or the early 1970s. I believe this might be different in the monthly GHCN dataset, but I’ve been using the GSOD figures instead – another strand to check. The station counts are shown below:

As a result, I’ve had to trim the data somewhat. 1956 and 1968 show major differences from the adjacent years, so are too volatile to include, and most of the other removed years have at most a dozen stations. There are some differences in station placement from the 1957-1967 series to the post-1973 one, but the latter seems quite stable:

All in all, I’d say this doesn’t show any particular trend in station placement in Africa with respect to distance from the ocean.

In Europe, we have:

which is a rather different story. The station counts are good for the entire period, and – aside from some difficulties immediately after WW2 – stations seem to have moved significantly closer to the water: something like 100 km (20%) closer to the ocean between the early 1960s and the mid/late 1970s, and stable ever since.

The spreadsheets are here: Africa and Europe.
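This post doesn't spell out how distance to the ocean is measured, so the following is only one plausible way to do it, sketched in Python: take the great-circle (haversine) distance from each station to the nearest of a set of coastline points, then average per year. The coastline point list, the station tuples and the function names are all hypothetical, and a real run would need a proper coastline dataset.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def distance_to_ocean_km(station, coast_points):
    """Distance from a (lat, lon) station to the nearest coastline point."""
    lat, lon = station
    return min(haversine_km(lat, lon, c_lat, c_lon)
               for c_lat, c_lon in coast_points)


def mean_distance_by_year(stations_by_year, coast_points):
    """Average distance to the ocean over the stations active in each year.

    stations_by_year: dict of year -> list of (lat, lon) tuples (hypothetical).
    """
    return {
        year: sum(distance_to_ocean_km(s, coast_points) for s in stations)
              / len(stations)
        for year, stations in stations_by_year.items() if stations
    }
```

Brute-force nearest-point search is slow for a detailed coastline, but for a few thousand stations per year it should be good enough for a first draft.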