Posts Tagged ‘proposition 02’

Propositions 03, 04, 05: Hot decade

Wednesday, December 9th, 2009

Today’s Metro article “China lambasts US and EU emission pledges” by Jo Steele contains a barrage of claims:

Meanwhile, the Noughties have been the warmest decade on record and this year has been one of the five hottest, weather experts warned.

While 1998 remains the hottest single year since records began in 1850, the last ten years have been the warmest, the Met Office revealed.

Wow. It ends with what looks like a repeat of (false) Proposition 02, but the others can stand a bit of scrutiny.

Proposition 03:

The Noughties have been the warmest decade on record

Proposition 04:

This year [2009] has been one of the five hottest [years on record]

Proposition 05:

1998 remains the hottest single year since records began


Well, propositions 04 and 05 seem fairly straightforward – top-1 and top-5 checks to go with the top-10 analysis already done. A bit of Perl hacking should get results for those today, though since we’re still in 2009 the final result for Proposition 04 will have to wait until January. Proposition 03 will take a bit more work, but not much – do we consider rolling decades, or bucket the temperature record into calendar decades by (year/10)? Rolling decades is probably the more general version, but both are interesting tests to run.
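The two decade definitions can be compared with a short sketch. This is Python rather than the Perl mentioned above, the function names are hypothetical, and the sample temperatures are made up purely for illustration; it only shows how "bucketed" and "rolling" decades differ for one station's annual means.

```python
from collections import defaultdict

def bucket_decades(annual_means):
    """Average annual means per calendar decade, keyed by (year // 10) * 10."""
    buckets = defaultdict(list)
    for year, temp in annual_means.items():
        buckets[year // 10 * 10].append(temp)
    return {decade: sum(vals) / len(vals) for decade, vals in buckets.items()}

def rolling_decades(annual_means, window=10):
    """Average every run of `window` consecutive years, keyed by its first year.

    Runs containing a missing year are skipped rather than averaged short.
    """
    result = {}
    for start in sorted(annual_means):
        years = range(start, start + window)
        if all(y in annual_means for y in years):
            result[start] = sum(annual_means[y] for y in years) / window
    return result

# Made-up annual means (degrees F) for illustration only, not real station data.
sample = {year: 50.0 + 0.1 * (year - 1980) for year in range(1980, 2000)}

bucketed = bucket_decades(sample)
rolling = rolling_decades(sample)
warmest_start = max(rolling, key=rolling.get)
print(sorted(bucketed), warmest_start)
```

Bucketing gives one value per calendar decade, while the rolling version gives one value per possible ten-year run, so it can find a warm stretch that straddles a decade boundary.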

So, time to get cracking!

Mea culpa

Monday, December 7th, 2009

This morning, taking a nice long bath, I realised there could be a problem with my parsing of the temperature records. This turns out to be the case.

Following the format specification, I’d jumped to column 25, skipped any spaces or tabs, read the digits, read a dot (silently aborting if not present), and then optionally read another digit. That produced records for 285057 station-years.

What I didn’t do, and what my checks had failed to spot, was handle negative Fahrenheit temperatures. I guess I’m not used to winters being colder than salted ice. A minus sign is neither whitespace nor a digit, so it fell through both of those tests, and the parser then silently aborted at the next one because a minus sign isn’t a dot either! Net result: no records from any station where the temperature dips below 0.0°F. Oops. I’ve fixed that (check if there’s a minus sign, and negate the value read if so) and also put in a message on stderr if that abort is triggered.
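A minimal sketch of the corrected parsing logic, written in Python rather than the C of the actual parser, which is not shown here. The function name, the 1-based column argument, and the wording of the error message are assumptions for illustration.

```python
import sys

DIGITS = "0123456789"

def parse_temp(line, column=25):
    """Read a signed temperature such as '-12.3' starting at a 1-based column.

    Returns a float, or None (with a message on stderr) if no number is found.
    """
    i = column - 1
    n = len(line)
    # Skip any spaces or tabs before the value.
    while i < n and line[i] in " \t":
        i += 1
    # The fix: accept an optional minus sign before the digits.
    negative = i < n and line[i] == "-"
    if negative:
        i += 1
    start = i
    while i < n and line[i] in DIGITS:
        i += 1
    # Require at least one digit followed by a dot; report instead of failing silently.
    if i == start or i >= n or line[i] != ".":
        print(f"parse_temp: no temperature found in {line!r}", file=sys.stderr)
        return None
    i += 1
    # Optionally read one digit after the dot.
    end = i + 1 if i < n and line[i] in DIGITS else i
    value = float(line[start:end])
    return -value if negative else value
```

For example, `parse_temp(" " * 24 + "  -4.5")` reads a below-zero value, which the original logic would have dropped.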

I’ve run through all the years, with no messages on stderr, and there are now 394264 station-years of data produced. Code will follow when I figure a good way to post it.

I’ve updated the Proposition 02 results; the fix has actually increased the margin, from 97.7% to 98.2% of stations not showing the warmest ten years as post-1997.

Proposition 02: False

Sunday, December 6th, 2009

Some more horrific Perl, but it did the job…

rmw42@pandora:~/NOAA$ cat Average.txt | ./warmest 10
32 stations had 10 of their 10 warmest years post 1997
1360 stations did not have 10 of their 10 warmest years post 1997
11347 stations rejected for having insufficient data

So, I make that 98% of weather stations finding that the warmest ten years in their history were not post-1997. That’s quite shocking, really. I think it’s safe to say that the statement is completely and utterly false – if 98% of weather stations active for the last 24 years don’t show the last 12 as containing the ten warmest, by what measure can we claim they were the warmest years?

I want to test this to see if there’s any pattern to the stations used, whether requiring good data integrity skews the results, and whether rejecting so many stations (~90% of the total) was necessary – but I think the reasons I gave on Friday are sound. It doesn’t matter a damn to me if a station had good data during WW2 – if it hasn’t been active for 60 years, it can’t tell me how warm 2007 was! And surely a station giving only one temperature reading per year – yes, there are some like that – is hopeless?

As Adam and Jamie might say: Myth Busted!

Updated 2009/12/7 09:23:

rmw42@pandora:~/NOAA$ cat Average.txt | ./warmest 10
44 stations had 10 of their 10 warmest years post 1997
2460 stations did not have 10 of their 10 warmest years post 1997
13614 stations rejected for having insufficient data
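The percentages quoted for both runs follow directly from the counts printed above. The short Python check below is only arithmetic on those counts; the function name is hypothetical.

```python
def summarise(all_post_1997, not_all, rejected):
    """Turn the three station counts into the two percentages discussed."""
    accepted = all_post_1997 + not_all
    total = accepted + rejected
    return {
        # Share of accepted stations whose ten warmest years are NOT all post-1997.
        "pct_not_all": 100.0 * not_all / accepted,
        # Share of all stations rejected for insufficient data.
        "pct_rejected": 100.0 * rejected / total,
    }

first = summarise(32, 1360, 11347)      # original run
updated = summarise(44, 2460, 13614)    # run after the parsing fix

for label, result in (("first", first), ("updated", updated)):
    print(f"{label}: {result['pct_not_all']:.1f}% not all post-1997, "
          f"{result['pct_rejected']:.1f}% rejected")
```

This reproduces the 97.7% and 98.2% figures, and puts the rejected share at roughly 89% for the first run and roughly 84% for the updated one.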

Data ahoy!

Sunday, December 6th, 2009

I’ve finally uploaded the early years’ weather data (about 350 MB) to my shell account, which took about three hours this morning. I’ve processed it, so I now have a complete set of averages: about 280000 station-years.

I can use this with the country station list/distances to determine the set of stations in each year – something like

join -t',' -o 1.3 AfricaStations.txt Average-1969.txt | sort -n

will get me the distances (3rd field in the first file, so 1.3) for all the stations active in 1969 in order of distance. Then I just need to get the order statistics and graph the result to get the answers for Proposition 01.

For Proposition 02, I can use the same station-year average temperature list, sorted by station, to extract and check the warmest years. I think that will need some Perl…

Temperature records

Saturday, December 5th, 2009

The NOAA data is awkwardly arranged. It’s in (literally) thousands of gzip files, one per year per station, stored in yearly tar files. About 3 GB compressed, so probably double that uncompressed. My PC chokes on them as the virus checker sees an archive file and decides to look inside, so I’ve had to download them using a shell account on a Linux box. I’ve now got the years 1929-1959 and 1970-1973 on my PC, and 1960-1969 and 1974-2009 on the shell account.

I’ve written a C program (“annual”) to parse the temperature records and calculate the mean (and standard deviation), and a shell script to de-TAR a year’s data to a temp directory and to do

gzip -dc "$f" | tail -n +2 | ./annual "$1" >> "Average-$1.txt"

for each station record ($f). The ‘tail’ call strips off the first line (column headers), and $1 passes through the year given to the shell script. All this results in a comma-separated text file containing station ID, year, mean temperature, number of samples, and the standard deviation.

Now just to create another shell script to loop through this lot, and deal with the files on my home PC as well as the ones on the Linux box… If I ‘nice’ everything hopefully nobody will notice it running all night!

Proof of concept

Friday, December 4th, 2009

I’ve downloaded the data for station 723150-03812, Asheville Municipal Airport in North Carolina. This is the one the NCDC/NOAA use for their sample data and is, I guess, their local airport.

Throwing the full 1948-2009 data – about 26000 records – into an Excel spreadsheet, using the SUMIF and COUNTIF functions to pull out the relevant days, and sorting the results shows that the ten warmest years in Asheville are (from lowest to highest): 1974, 1980, 2001, 1999, 2002, 1991, 2007, 2009, 1998, 1990

2009 is an aberration as there are a bunch of cold days to come before the end of the year, but it’s clear that there have been equally warm years in recent decades.

This surprised me: I expected at least eight or nine of the ten warmest years to fall after 1997 if the effect were clear-cut – particularly with all the fuss people have made about airport locations for weather stations, the El Nino in ’98, and the satellite data showing warming throughout the 1990s.

It remains to be seen how typical (or not) Asheville is…

Proposition 02: The warmest ten years

Friday, December 4th, 2009

I was reading the Evening Standard on my way home tonight, and saw the Rt Hon Ed Miliband MP’s op-ed – “Climate change sceptics are today’s flat-earth brigade”. In it he mentions a line I’ve heard numerous times before.


The 10 warmest years on record have all occurred since 1997


It seems to me that this statement is either true or false, and can be checked using temperature records from weather stations – the sort of data I’m collecting already. For any particular station, the statement is either true or false. For the planet as a whole, a clear majority of stations should ‘vote’ for the statement being true.

  • Collect daily station records from the NOAA
  • Exclude stations which don’t include the years 1985 – 2008 (12 years either side of 1997 – it’s no use using the station if the test is necessarily true because it’s only post-1997 or necessarily false because there aren’t ten years after 1997 to be the warmest)
  • Exclude stations without records for at least 90% of days
  • Calculate the annual mean temperature for each station in each year
  • Sort station data by annual mean temperature to find the ten warmest years, see if they’re all after 1997 or not
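The steps above can be sketched for a single station as follows. This is an illustrative Python version, not the code actually used; the function name and parameters are hypothetical. It assumes the 90% coverage rule applies to each required year separately (counted against a 365-day year), and that "since 1997" includes 1997 itself, which matches the "12 years either side" description.

```python
def classify_station(annual_means, day_counts, first_year=1985, last_year=2008,
                     cutoff=1997, top_n=10, min_coverage=0.9):
    """Return True/False for the claim at one station, or None if it is rejected.

    annual_means: {year: mean temperature}; day_counts: {year: days with readings}.
    """
    for year in range(first_year, last_year + 1):
        # Reject stations missing any year in the required span.
        if year not in annual_means:
            return None
        # Assumption: coverage is checked per year against 365 days.
        if day_counts.get(year, 0) < min_coverage * 365:
            return None
    # Rank all years by annual mean, warmest first, and keep the top ten.
    warmest = sorted(annual_means, key=annual_means.get, reverse=True)[:top_n]
    # Assumption: "since 1997" counts 1997 itself.
    return all(year >= cutoff for year in warmest)

# Made-up station with steadily rising means, for illustration only.
years = range(1985, 2009)
rising = {y: 50.0 + 0.1 * (y - 1985) for y in years}
full = {y: 365 for y in years}
print(classify_station(rising, full))
```

Counting the True and False results across stations, and tallying the None results separately as rejections, gives the station "vote" described above.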

I expect there’ll be some stations for which it’s true and some for which it’s false – there will always be random weather effects – but if recent years are the warmest this should be clear from the measurements. It would be strange to imagine a warm year which left most places colder!