Tuesday, 31 July 2012

Mike's Olympic medal table 30 July 2012


Rank  Country          Gold  Silver  Bronze  Score
   1  China               9       5       3    130
   2  United States       5       7       5    100
   3  Italy               2       4       2     46
   4  France              3       1       3     44
   5  Japan               1       4       6     43
   6  Korea               2       2       2     36
   7  DPR Korea           3       0       1     35
   8  Russia              2       0       3     28
   9  Australia           1       2       1     23
  10  Kazakhstan          2       0       0     22
  11  Romania             1       2       0     21
  12  Brazil              1       1       1     18
  13  Hungary             1       1       1     18
  14  Germany             1       1       0     16
  15  Netherlands         1       1       0     16
  16  Ukraine             1       0       2     15
  17  United Kingdom      0       2       2     14
  18  Georgia             1       0       0     11
  19  Lithuania           1       0       0     11
  20  South Africa        1       0       0     11
  21  Colombia            0       2       0     10
  22  Cuba                0       1       0      5
  23  Mexico              0       1       0      5
  24  Poland              0       1       0      5
  25  Thailand            0       1       0      5
  26  Chinese Taipei      0       1       0      5
  27  Azerbaijan          0       0       1      2
  28  Belgium             0       0       1      2
  29  Canada              0       0       1      2
  30  Indonesia           0       0       1      2
  31  India               0       0       1      2
  32  Moldova             0       0       1      2
  33  Mongolia            0       0       1      2
  34  Norway              0       0       1      2
  35  New Zealand         0       0       1      2
  36  Serbia              0       0       1      2
  37  Slovakia            0       0       1      2
  38  Uzbekistan          0       0       1      2

Olympic medal table

From time to time, I plan to post an alternative version of the Olympic medal table.  Conventionally, countries are ranked by the number of gold medals, followed by silver if the gold tally is equal, followed by bronze.  This means that a country with five silver medals and no gold medals would be ranked behind a country with one gold medal.  This seems a bit unsatisfactory.  At http://www.topendsports.com/events/summer/medal-tally/rankings-weighted.htm there is a list of different sets of weightings that have been applied to the three medal counts to produce an overall score.  In the conventional system the weightings are effectively 1,000,000 or some suitably high number for gold, 1,000 (say) for silver, and 1 for bronze, at least for the purpose of ranking countries (not estimating their relative performance).  That's how you express the fact that no matter how many silvers you have (within reason) you will rank lower than a country with one gold.

The weighting system I am about to suggest doesn't appear in that list.  It comes from an article I read in Mathematics Today some years ago.  This is how it's derived:

Let's say a gold medal is worth G points, a silver medal S points and a bronze B points.  Now all that is going to matter is their relative scores, so let's completely randomly (!) say that G + S + B = 18.  Now let's have a 3-dimensional coordinate system with axes labelled G, S and B.  A set of three weights will then be a point in this 3-dimensional space, and the job is to choose the "best" point in some sense.  The fact that G + S + B = 18, together with the obvious facts that G > 0, S > 0 and B > 0, means that the point will actually be on an equilateral triangle between the points (18, 0, 0), (0, 18, 0) and (0, 0, 18), as shown here.

Now let's put this equilateral triangle flat on the paper and ask ourselves where our chosen point may be.  We can take some new facts into account.  One is that a silver is not going to be worth more than a gold.  This cuts the triangle in half, and our chosen point will be somewhere in the grey triangle BGP:

The next fact is that a bronze is not going to be worth more than a silver (or a gold, of course).  This cuts down our region of possible points still further to the triangle RGP:


Beyond that, it becomes a matter of subjective taste where we place our point.  The conventional weighting-for-ranking system is close to point G.  The system where you just count the total number of medals is point R.  But our reasoning is that, in the absence of any further guidance, we'll place our point in the "centre" of the grey triangle.  Now triangles have many different "centres" but the most reasonable one for us to use here is the centroid, the centre of gravity, which is the red spot with coordinates (11, 5, 2).  (Now you see why I chose them to add up to 18!)
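
For anyone who wants to check the centroid, here is a minimal Python sketch.  The diagrams are not reproduced here, so the coordinates of P and R are inferred from the description (P where gold equals silver with bronze at zero, R where all three weights are equal) and should be treated as assumptions rather than as the article's own working.

```python
from fractions import Fraction

def centroid(points):
    """Average the vertices coordinate by coordinate."""
    n = len(points)
    return tuple(sum(Fraction(p[i]) for p in points) / n for i in range(3))

TOTAL = 18                        # G + S + B, as chosen in the text
G = (TOTAL, 0, 0)                 # all the weight on gold
P = (TOTAL // 2, TOTAL // 2, 0)   # assumed: gold = silver, bronze = 0
R = (TOTAL // 3,) * 3             # assumed: gold = silver = bronze

print(tuple(int(c) for c in centroid([G, P, R])))  # (11, 5, 2)
```

Using exact fractions avoids any rounding doubt, and it shows why 18 is convenient: it divides by both 2 and 3, and the three vertices average to whole numbers.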

So there is an answer.  We have a points system with 11 for gold (echoes of This is Spinal Tap), 5 for silver and 2 for bronze.  My impression is that this remains quite an "elitist" scoring system because each medal is worth more than double the one below, but it is going to be a bit fairer for ranking than the conventional system.
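
As a quick illustration of the 11/5/2 system, here is a small Python sketch that scores and ranks a few rows taken from the 30 July table at the top of this post.  The function and variable names are just for this example.

```python
# Weights from the centroid argument: 11 for gold, 5 for silver, 2 for bronze.
WEIGHTS = (11, 5, 2)

def score(gold, silver, bronze):
    """Weighted score for one country's medal counts."""
    g, s, b = WEIGHTS
    return g * gold + s * silver + b * bronze

# A few rows from the 30 July 2012 table: (country, gold, silver, bronze).
medals = [
    ("France", 3, 1, 3),
    ("DPR Korea", 3, 0, 1),
    ("Italy", 2, 4, 2),
    ("Japan", 1, 4, 6),
]

# Rank by weighted score, highest first.
ranked = sorted(medals, key=lambda row: score(*row[1:]), reverse=True)
for country, g, s, b in ranked:
    print(f"{country:<10} {g} {s} {b} {score(g, s, b):>4}")
```

The input list is in conventional (gold-first) order; under the weighted score Italy moves ahead of France, and Japan ahead of DPR Korea.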

Other constraints could be applied: for example, we could impose that the ratios G/S and S/B are the same (though with this system they are already quite close, at 2.2 and 2.5 respectively).  Or we could say that a gold is worth no more than a silver and a bronze, so G<=S+B (this would lead to G=8, S=6.5 and B=3.5, which would be better expressed as 16, 13 and 7 - a system which seems very generous to silver medal holders).
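
The G<=S+B variant can be checked the same way.  In this sketch the three vertices of the smaller region were worked out by hand from the constraints (they are not given in the text), so treat them as an assumption.

```python
from fractions import Fraction as F

# Assumed vertices of the region G >= S >= B >= 0, G <= S + B, G + S + B = 18.
vertices = [
    (F(9), F(9), F(0)),        # G = S + B with B = 0
    (F(9), F(9, 2), F(9, 2)),  # G = S + B with S = B
    (F(6), F(6), F(6)),        # G = S = B
]

# Centroid: average each coordinate over the three vertices.
g, s, b = (sum(v[i] for v in vertices) / 3 for i in range(3))

print(float(g), float(s), float(b))        # 8.0 6.5 3.5
print(int(2 * g), int(2 * s), int(2 * b))  # 16 13 7
```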


Sunday, 17 June 2012

My first computer program

I've just read Bill Gates's Wikipedia entry and was interested by the fact that he wrote his first computer program at the age of 13 or 14, to play tic-tac-toe (noughts and crosses to us Brits).  This reminded me that I have been meaning for years to write up my first computer programming experience, in 1975 at the age of 15 or 16, when I was a sixth-former at RGS Guildford (then a state grammar school).  The computing facility available to us was an ICL 1900 series machine, which seemed to take up most of the fourth floor of a modern building at the local technical college.  To begin with, our method of entering code was by filling in forms, which staff at the Tech would transfer onto punched cards.  We would submit our lines of FORTRAN IV code, and a few days later receive a printout, usually with compilation (syntax) errors, and eventually, after some corrections, with the output from our program.  Later, we would punch our own cards using entirely mechanical punches, for which we needed to learn the two or three-hole codes for each of the ASCII characters.

As a reasonably bright youngster in the early days of school computing, my idea for a first program was quite ambitious.  I had started to learn the guitar, so I thought it would be cool to do some computer-generated music.  I made a list of guitar chords and drew up a two-dimensional table of conditional probabilities for one chord followed by another.  So for example, a D7 might have a 60% chance of being followed by a G, a 20% chance by a C and a 20% chance by an A7.  For a G, it might have been 30% C, 20% E minor, 20% D7, 20% A minor and 10% B minor.  The program would run in a loop, starting with a particular chord in the set, and generating subsequent chords randomly according to the probability table.  I was fascinated by the idea of changing this table and seeing whether the music that came out would have a different "feel".  Reasonably advanced stuff in those days (or so I like to think)!
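
The original FORTRAN IV is long gone, so what follows is not a reconstruction, just a minimal Python sketch of the same idea: a table of conditional probabilities and a loop that picks each chord from the row of the previous one.  Only the D7 and G rows use the figures quoted above; the other rows are invented so that the example runs.

```python
import random

# Transition table: current chord -> {next chord: probability}.
# D7 and G follow the figures in the text; all other rows are made up.
TRANSITIONS = {
    "D7": {"G": 0.6, "C": 0.2, "A7": 0.2},
    "G":  {"C": 0.3, "Em": 0.2, "D7": 0.2, "Am": 0.2, "Bm": 0.1},
    "C":  {"G": 0.5, "D7": 0.5},
    "A7": {"D7": 1.0},
    "Em": {"Am": 0.5, "C": 0.5},
    "Am": {"D7": 1.0},
    "Bm": {"Em": 1.0},
}

def generate(start, length, rng=random):
    """Return `length` chords, each drawn from the row of the previous chord."""
    chords = [start]
    while len(chords) < length:
        row = TRANSITIONS[chords[-1]]
        nxt = rng.choices(list(row), weights=list(row.values()))[0]
        chords.append(nxt)
    return chords

# 50 chords across the page, seeded so that a run can be repeated.
print(" ".join(generate("G", 50, random.Random(1975))))
```

Changing the numbers in the table and re-running is exactly the experiment described above: same loop, different "feel".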

This project immediately set me apart as an awkward customer, because I needed a random number generator.  I can't remember how I did that in the end, but I don't think it was a built-in function.  What I do remember, with pride and embarrassment in equal measure, is the results of my efforts.  Pride because, of the ten or so of us, my printout was the only one that came back first time without any compilation errors.  But embarrassment because there didn't seem to be any output at all.  My teacher and I thought this must be a one-off problem, so we re-submitted  the code with no changes (we didn't think of inserting "hello world" print statements or anything like that).  Back it came the following week, same problem.  Nothing!  When the same thing happened the third time I noticed that the printouts did have something on them: a black rectangular blob in the top left corner.  Here is the explanation:  I had chosen to output my chords as single-digit numbers (dealing with characters and strings seemed too hard for a first program).  A chord would be printed each time a print statement was encountered in my loop.  Normally, that would produce a new chord on each line.  But I wanted to save paper and print the chords across the page.  I had been told that the "+" control character would make the next piece of output appear on the same line.  But what I had not appreciated was that there was an implicit carriage return as well.  So the 50 chords I asked for were indeed printed - all on top of each other!  The following week I had my first readable output and I remember playing my first piece of computer-composed music - which has been lost to posterity.

My second project was much more ambitious, and never properly completed.  It just had to be computer chess.  That was a tall order in those days.  I just about managed to write a program to play legal chess, but did not make much progress on getting it to play well!

Tuesday, 13 March 2012

The National Grid, Eastings and Northings

Yesterday I reported some nasty fly-tipping on one of the country lanes on the way to work.  I reported its location using the standard Ordnance Survey system of two letters and six digits, so SU 731189.  I copied it to the Hampshire police, who had to phone me to check the location as they "don't use that system".  "We use Eastings and Northings" said the copper.  But the OS system is Eastings and Northings: the 731 is Eastings in units of 100 metres and the 189 is Northings.  But the police mean something different.  They use two six-digit numbers which do the job of the letters and the numbers of the OS system. Their two numbers are measured in metres from the "false origin" of the national grid.  The true origin is at 2° west and 49° north.  The 2° is important - it's the "central meridian" going down the middle of the country, which makes it a good basis for the map projection they use.  But the trouble with the true origin is that you would have to have negative Eastings for everything west of the line.  So they have a "false origin" 400 km to the west, somewhere beyond the Isles of Scilly, so all the Eastings are positive.

Now, to keep the numbers manageable, they use letters to define 100 km grid squares.  The first letter chooses one of 25 squares, each 500 km on a side, in a pattern like this:

A B C D E
F G H J K
L M N O P
Q R.S T U
V W X Y Z

with the false origin at the bottom left corner of the S (where I've put a dot).  They're not all needed - in fact H, N, O, S and T cover the whole of Great Britain.  Then the second letter selects a 100 km square using the same pattern, but within the 500 km square already chosen.

So now we can go from our OS grid reference to our eastings-and-northings numbers-only reference as used by the police.  In my example, the big square S places us at the false origin, and the U then takes us 400 km east and 100 km north.  That 4 and 1 give us the first digits of our eastings and northings.  So our map reference SU 731189 becomes 473100 118900.  The zeros are because the OS reference is only accurate to 100 metres, while the 12-digit one is accurate to one metre.  Actually, it should be a 13-digit reference - the northings really need an extra digit to allow us to get up to square H, which covers Orkney and Shetland.
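
The conversion just described can be sketched in a few lines of Python.  The function names are made up for this example, and it assumes the reference is written as two letters, a space, and an even number of digits (such as SU 731189).

```python
# 5 x 5 letter pattern (no I), top row first, as in the grid above.
LETTERS = "ABCDEFGHJKLMNOPQRSTUVWXYZ"

def letter_offset(letter):
    """Column and row of a letter, with row 0 at the bottom of the pattern."""
    index = LETTERS.index(letter)
    return index % 5, 4 - index // 5

def to_eastings_northings(ref):
    """Convert e.g. 'SU 731189' to eastings and northings in metres."""
    square, digits = ref.split()
    col1, row1 = letter_offset(square[0])   # 500 km square
    col2, row2 = letter_offset(square[1])   # 100 km square within it
    # The false origin is the bottom-left corner of S: column 2, row 1.
    e100 = (col1 - 2) * 5 + col2            # first eastings digit
    n100 = (row1 - 1) * 5 + row2            # first northings digit(s)
    half = len(digits) // 2
    scale = 10 ** (5 - half)                # 3 digits each -> units of 100 m
    easting = e100 * 100_000 + int(digits[:half]) * scale
    northing = n100 * 100_000 + int(digits[half:]) * scale
    return easting, northing

print(to_eastings_northings("SU 731189"))  # (473100, 118900)
```

The values `e100` and `n100` are the "first digits" mentioned above, so the same two lines of arithmetic work for any pair of grid letters.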

Now finally here's a handy table matching two-letter grid squares to the first eastings digit and the first (two) northings digit(s).  I haven't found this table on the OS website, but it might be there somewhere.  Most of the other stuff is covered by a nice little presentation that starts here.


Second          First letter
letter    H     N     O     S     T
  A      0,14  0,9   5,9   0,4   5,4
  B      1,14  1,9   6,9   1,4   6,4
  C      2,14  2,9   7,9   2,4   7,4
  D      3,14  3,9   8,9   3,4   8,4
  E      4,14  4,9   9,9   4,4   9,4
  F      0,13  0,8   5,8   0,3   5,3
  G      1,13  1,8   6,8   1,3   6,3
  H      2,13  2,8   7,8   2,3   7,3
  J      3,13  3,8   8,8   3,3   8,3
  K      4,13  4,8   9,8   4,3   9,3
  L      0,12  0,7   5,7   0,2   5,2
  M      1,12  1,7   6,7   1,2   6,2
  N      2,12  2,7   7,7   2,2   7,2
  O      3,12  3,7   8,7   3,2   8,2
  P      4,12  4,7   9,7   4,2   9,2
  Q      0,11  0,6   5,6   0,1   5,1
  R      1,11  1,6   6,6   1,1   6,1
  S      2,11  2,6   7,6   2,1   7,1
  T      3,11  3,6   8,6   3,1   8,1
  U      4,11  4,6   9,6   4,1   9,1
  V      0,10  0,5   5,5   0,0   5,0
  W      1,10  1,5   6,5   1,0   6,0
  X      2,10  2,5   7,5   2,0   7,0
  Y      3,10  3,5   8,5   3,0   8,0
  Z      4,10  4,5   9,5   4,0   9,0

Saturday, 14 January 2012

Friday 13th

One or two people have asked me to explain why the 13th of the month is more likely to fall on a Friday than on any other day.  There's no special reason why it happens to be a Friday, but it arises from two pieces of luck, one of which explains why there is any variation at all between the days of the week, and the other leading to Friday being top of the list.

The first piece of luck is to do with the length of our calendar cycle.  We use the Gregorian calendar, which has a 400-year cycle of year lengths.  Why 400?  Because the rule is that we have a leap year every four years, except for the century years which don't, except every fourth century when we do.  So 2100, 2200 and 2300 will not be leap years, but 2000 was - a double exception which meant that nobody noticed how special it was (apart from the fear of the millennium bug of course, but that's another story).  Now if you work out the number of days in that 400-year cycle, it comes to 400 x 365 + 97 (because of 100 leap years minus the 3 special non-leap years), which is 146,097.  The coincidence is that that huge number of days happens to be a whole number of weeks (20,871 to be precise).  If it weren't a whole number of weeks, then the length of the calendar cycle taking days of the week into account would be 2,800 years (because it would take seven 400-year cycles for everything to come back to start on the same day of the week).  And the 13th (or any other day of the month) would over that 2,800 years occur an equal number of times on every day of the week.
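
The arithmetic is easy to check with a short Python sketch that applies the leap-year rule directly.  The choice of 2000 to 2399 is arbitrary; any 400 consecutive years give the same totals.

```python
def is_leap(year):
    """Gregorian rule: every 4th year, except centuries not divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

leap_years = sum(is_leap(y) for y in range(2000, 2400))
days = 400 * 365 + leap_years
weeks, remainder = divmod(days, 7)

print(leap_years, days, weeks, remainder)  # 97 146097 20871 0
```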

So, given that the 400-year cycle is a whole number of weeks, all that remains is to count up how many times the 13th of the month falls on each day of the week.  We know straight away that there will not be an even spread, because there are 4,800 months in the cycle, and 4,800 is not divisible by 7.  When you count it up, this is what you get for each day:

Sunday 687
Monday 685
Tuesday 685
Wednesday 687
Thursday 684
Friday 688
Saturday 684

So Friday wins!  Not by much... but what surprises me a bit is that the variation is more than just one day.  It all comes down to the effect of the three very specific hiccups in the cycle introduced by the century years.

As for the 13th of January, this will be on a Friday, Sunday or Wednesday 58 times each, a Monday or Tuesday 57 times each and a Thursday or Saturday 56 times each.
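
These counts can be reproduced by brute force.  Here is a minimal Python sketch using the standard `datetime` module; the range 2000 to 2399 is an arbitrary choice of one full 400-year cycle.

```python
from collections import Counter
from datetime import date

# date.weekday() numbers the days from Monday = 0 to Sunday = 6.
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

all_months = Counter()   # every 13th in the cycle
january = Counter()      # 13 January only

for year in range(2000, 2400):
    for month in range(1, 13):
        day = DAYS[date(year, month, 13).weekday()]
        all_months[day] += 1
        if month == 1:
            january[day] += 1

for day in DAYS:
    print(f"{day:<9} {all_months[day]:>4} {january[day]:>3}")
```

The first column of numbers covers all 4,800 months and the second covers the 400 Januaries.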

So now you know.  Not that it will have any effect on anyone of course.  Pure geekery!

Tuesday, 10 January 2012

Glass half full? Glass half empty?

I am generally an optimistic type - some might say a "glass half full" person.  But I'm pessimistic about the survival of those metaphors, particularly if pedantic geeks like me get their teeth into them.  Because of course, "glass half full" could equally allude to pessimism.  It depends on whether the glass is in the process of being filled or emptied at the time.  And it depends, too, on what adverbs you might use with the phrase.  All this leads to a list of possibilities:

Glass being filled
Glass only half full: pessimistic - need to wait
Glass already half full: optimistic - not long now
Glass only half empty: optimistic - it was emptier before
Glass still half empty: pessimistic - it's taking ages

Glass being emptied
Glass only half empty: optimistic - still plenty to enjoy
Glass already half empty: pessimistic - oh dear, well, I suppose I was very thirsty 
Glass only half full: pessimistic - I'm sure there was more left than that!
Glass still half full: optimistic - I'm savouring this.

So all this just goes to show...