Posts Tagged ‘code’

Mea culpa

Monday, December 7th, 2009

This morning, taking a nice long bath, I realised there could be a problem with my parsing of the temperature records. This turns out to be the case.

Following the format specification, I’d jumped to column 25, skipped any spaces or tabs, read the digits, read a dot (silently aborting if not present), and then optionally read another digit. That produced records for 285057 station-years.

What I didn’t do, and what my checks had failed to spot, was handle negative Fahrenheit temperatures. I guess I’m not used to winters being colder than salted ice. A minus sign is neither whitespace nor a digit, so it fell through both of those tests and then hit the silent abort at the next one, because it’s not a dot either! Net result: no records from any station where the temperature dips below 0.0F. Oops. I’ve fixed that (check if there’s a minus sign, and negate the value read if so) and also put in a message on stderr if that abort is triggered.
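For anyone wanting to picture the fix, here is a minimal sketch of the corrected field reader. It is written in Python purely for illustration and is not the code I actually ran. The function name `parse_temperature`, the 1-based reading of "column 25", and the insistence on at least one digit before the dot are all assumptions of mine rather than anything the format specification is quoted as saying.

```python
import sys

DIGITS = "0123456789"

def parse_temperature(line, column=25):
    """Read a Fahrenheit value starting at a 1-based column (assumed).

    Returns a float, or None after warning on stderr if the field is malformed.
    """
    i = column - 1  # assumption: the spec counts columns from 1
    n = len(line)
    # Skip any spaces or tabs.
    while i < n and line[i] in " \t":
        i += 1
    # The step that was missing: an optional minus sign.
    negative = i < n and line[i] == "-"
    if negative:
        i += 1
    # Read the integer digits.
    start = i
    while i < n and line[i] in DIGITS:
        i += 1
    digits = line[start:i]
    # A dot must follow; complain on stderr rather than aborting silently.
    if not digits or i >= n or line[i] != ".":
        print(f"unparseable temperature field: {line!r}", file=sys.stderr)
        return None
    i += 1
    # Optionally read one decimal digit.
    fraction = line[i] if i < n and line[i] in DIGITS else "0"
    value = float(f"{digits}.{fraction}")
    return -value if negative else value
```

With this shape, a field such as `-12.3` comes back as a negative float instead of vanishing, and anything that still fails the dot test at least leaves a line on stderr.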

I’ve run through all the years, with no messages on stderr, and there are now 394264 station-years of data produced. Code will follow when I figure out a good way to post it.

I’ve updated the Proposition 02 results. The fix has actually increased the margin: 98.2% of stations, up from 97.7%, do not show the warmest ten years as post-1997.

Station locations against time

Sunday, December 6th, 2009

I don’t know why I chose Perl; it’s a hateful language. Turns out it wasn’t quite so simple – ‘sort’ works lexicographically (i.e. as text) so “117” is less than “20” as 1 < 2. I had to convert the lines to integers and sort on those, using delightful syntax like

@values = sort {$a <=> $b} @values;

and then split cases for odd/even list lengths for the median and again for upper/lower quartiles. Still, in the end it gives me the results I’m looking for – a list of comma-separated values for year, number of stations, mean distance, median distance, lower quartile distance and upper quartile distance. Or “year,0,,,,” if there are no stations. That should be just what I need to paste into Excel and produce graphs…
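As a rough sketch of that whole step, here it is in Python rather than the Perl I actually used. The names `median` and `summary_line` are made up for this example. The quartile convention (median of each half, leaving out the middle value when the count is odd) is one common choice and an assumption here, and so is the fallback for a single station.

```python
def median(sorted_values):
    """Median of an already sorted, non-empty list."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2:  # odd length: the middle element
        return sorted_values[mid]
    # even length: average of the two middle elements
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2

def summary_line(year, distances):
    """Build 'year,count,mean,median,lower quartile,upper quartile'."""
    if not distances:
        return f"{year},0,,,,"
    # Convert to integers first so the sort is numeric, not textual.
    values = sorted(int(d) for d in distances)
    n = len(values)
    half = n // 2
    lower = values[:half]
    # Assumed convention: skip the middle value when the count is odd.
    upper = values[half + 1:] if n % 2 else values[half:]
    # Assumed fallback: a single station has empty halves, so reuse its value.
    q1 = median(lower) if lower else values[0]
    q3 = median(upper) if upper else values[-1]
    mean = sum(values) / n
    return f"{year},{n},{mean},{median(values)},{q1},{q3}"
```

Sorting the raw strings `"117"` and `"20"` would leave `"117"` first, which is exactly the trap above; converting to integers before sorting avoids it.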