Wednesday, August 26, 2015

St Petersburg Paradox

I was having lunch with a teacher friend the other day, and we got to discussing some interesting examples of how statistics and probability can get kind of weird. He loved the Birthday Problem and decided to use it for his class, but was particularly fascinated by the trickier St Petersburg Paradox.

The problem goes thus: there is a game that costs X dollars to play, which simply involves tossing a coin. You start with a pot of $2, and every time the coin comes up heads the banker doubles the pot. As soon as the coin comes up tails the game ends, and you get to walk away with the pot. The question is: what is a reasonable amount of money X to pay to play the game?

Where the paradox comes in is how statistics defines 'fair'. Usually we calculate the average, or "expected", amount of money to be made from the game by totalling up all of the possible payouts, each weighted by how likely it is. In this game, we have a 50:50 chance of getting $2 (the first throw being a tail), a 1/4 chance of getting $4 (a head, then a tail), a 1/8 chance of getting $8 (heads, heads, tails) and so on. That means we can expect on average $1 from the worst-case scenario (it's $2, and happens half the time, and $2 x 1/2 = $1), another $1 from the heads-tails scenario ($4 x 1/4 = $1) and so on. This process goes on forever - it's always possible to get more heads - so the average amount we expect to win in this game is $1 + $1 + $1 + .... = infinite money, and that's how much we should apparently spend to play the game.
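To see where the infinity comes from, here's a quick check (my own sketch, not part of the original post) that every extra possibility adds exactly another dollar to the expected value:

```python
# Outcome k pays $2**k and happens with probability 1/2**k, so each term
# contributes exactly $1 - the running total never settles down.
expected = 0.0
for k in range(1, 31):            # just the first 30 possibilities
    payout, probability = 2 ** k, 0.5 ** k
    expected += payout * probability

print(expected)                   # 30.0, and it keeps growing with more terms
```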

This obviously doesn't make sense. For a start, you're always going to lose at some point, so it's physically impossible to make infinite money no matter how many times you get heads. The problem is that the idea of an expected amount of money is built on the assumption that we care about the long run - it assumes we are playing this game infinitely many times and taking the average. But when we play infinitely many times, we suddenly have access to the end of the rainbow where we're making infinite money. The point is that infinity is a mathematical construct that we never see in reality. Usually we can deal with it pretty happily without weird things happening, but this is a weird game, and it breaks our usual assumptions.

What we can do instead is see what's most likely to happen to our winnings as we keep playing. For a single game, it's pretty clear that most of the time we'll win either $2 or $4 (with a 50% and 25% chance respectively), occasionally $8 (12.5%), but we're not likely to win much more than that. If we play two games, then our worst-case scenario is that we'll win $4, with a 1/2 x 1/2 = 1/4 chance. There are two ways we can win $6 - we can win $2 then $4, or $4 then $2. Both of these options have a 1/2 x 1/4 = 1/8 chance of happening, so overall we've got a 1/4 chance of that happening too. We can calculate the other possibilities the same way - obviously we have to stop at some point, but we can go far enough to get a decent idea. We can then keep going and see what happens when we play more and more games in a row, as getting bigger jackpots becomes more and more likely.
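If you want to do that bookkeeping yourself, here's a rough sketch of one way to do it (my own code, not the blog's - I've truncated the single-game payouts at $2^20 so the calculation stays finite):

```python
from collections import defaultdict

# Single-game payouts: $2 with probability 1/2, $4 with 1/4, ... (truncated).
single = {2 ** k: 0.5 ** k for k in range(1, 21)}

def winnings_distribution(n_games):
    """Probability of each possible total after n_games (up to the truncation)."""
    dist = {0: 1.0}
    for _ in range(n_games):
        new = defaultdict(float)
        for total, p in dist.items():
            for win, q in single.items():
                new[total + win] += p * q
        dist = dict(new)
    return dist

# After two games: $4 and $6 should each turn up with probability 1/4, as above.
two_games = winnings_distribution(2)
print(two_games[4], two_games[6])   # ~0.25 and ~0.25
```

Dividing each total by the number of games played then gives a distribution of average winnings per game, like the one in the graph below.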

Of course, the best way to do this is with a computer to avoid all those pesky calculations. Here is a graph of the possibilities over the course of 100 games:

Lighter colours represent where a possibility is relatively likely, and darker colours where it is unlikely. You can see little waves towards the top-left of the graph - this is where, after a few games, there's a small but decent chance of getting a single big win which overwhelms all of the other winnings. Especially when not many games have been played, it's more likely that you'll get a single big win and a lot of small wins than multiple medium-sized wins.

The blue line represents the median average win, and is surrounded by red interquartile lines - the idea is that half of the time, your winnings per game after a certain number of games will be between the two red lines. For example, after 50 games, it's 50-50 whether your average winnings are above or below $8.20 (the median), and half the time your average winnings will be between $6.12 and $12.44. So if you paid only $6 a game, you're probably doing pretty well at this point!

The most important part of this graph is that these numbers keep going up as we play more games, meaning that the game becomes more and more reliably profitable. Further along the graph, the computer can no longer keep track of the larger winnings (which is why the red line disappears), so we need another way to work out what happens beyond 100 games. Using results cited in this paper, we can actually estimate the median winnings per game as

$2.55 + log2(number of games)

So after 100 games, $9.20 looks like a reasonable price - paying that price, half the time we'll end up ahead, the other half we won't. Note that the distribution is what statisticians call skewed - even though we only come out ahead half the time after 50 games, the "good" half is a lot better than the "bad" half is bad.

Let's say that we really want to milk this game for all it's worth, and we've found a game online that we can make our computer play for us. If we can play a million games a second and leave our computer running for a year, that's over 30 trillion games. If we put that into our formula, we get a median win of $47.40 per game. If we paid that much per game to play, we'd expect to lose a lot of money at the start but make it back as the games wore on and we got more and more jackpots, breaking even after a year. However, if we only paid $9.20 as before, we'd expect to be doing OK by 100 games (i.e. after 100 microseconds), and by the time our program had been running for a year, we'd be looking at profits of around $1,200 trillion - 700 times Australia's GDP and enough to basically rule the world.
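As a back-of-the-envelope check of those numbers (again my own sketch, just plugging into the formula above):

```python
import math

def median_win_per_game(games):
    return 2.55 + math.log2(games)                   # the estimate quoted above

print(round(median_win_per_game(100), 2))            # ~9.19, i.e. the $9.20 figure

games_in_a_year = 1_000_000 * 60 * 60 * 24 * 365     # a million games a second
print(f"{games_in_a_year:.1e}")                      # ~3.2e13 - over 30 trillion
print(round(median_win_per_game(games_in_a_year), 2))  # ~47.39, the $47.40 figure

# Paying $9.20 a game, the median profit after a year of this comes out at:
profit = (median_win_per_game(games_in_a_year) - 9.20) * games_in_a_year
print(f"${profit / 1e12:,.0f} trillion")             # roughly $1,200 trillion
```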

Unfortunately, no casino will ever host this game, online or otherwise, for exactly this reason. Sooner or later, the house will always lose.

Thursday, April 23, 2015

Quadruple rainbow!

A couple of days ago, someone at a train station in New York tweeted this photo of a quadruple rainbow:

Like most people, I'd never even heard of such a thing! Some reasonably reputable sites assured me that it does exist, and that it's caused by a combination of two effects:

The first is that light can take two different paths of reflection within a water droplet, which gives us a second, "secondary" rainbow.

The second is the effect of a body of water, usually behind the observer along with the sun, reflecting the sun - it acts just like another (though less bright) sun shining from a different location, and gives us another pair of rainbows. Because the second pair of rainbows is from another "virtual sun", the centre of the rainbow is in a different place so they're offset a bit from the first pair, hence the weird shapes.

So, that makes sense. But there's water everywhere! Rainbows aren't that uncommon, and even double rainbows are seen occasionally, so why are quadruple rainbows so rare? I've never seen one, and I've seen plenty of double rainbows!

First, the reason we don't see rainbows all the time is that we need the sun to be shining behind us and rain falling in front of us, so there are water droplets for the sunlight to reflect off. Often the weather is one or the other - either all rain and clouds (hence no sun) or no rain. Also, if the sun is too high in the sky, a rainbow can't happen - a raindrop has to bend the light by a certain amount (40-42° for a normal rainbow, 50-53° for a secondary one), so the sunlight and the reflections you need for a rainbow can't both reach your eye if the sun's elevation is more than about 42° above the horizon. For a secondary rainbow the sun only needs to be less than about 53° above the horizon, but the reflections are a lot weaker (the light has to bounce off the inside of the raindrop twice instead of once, losing brightness with each extra bounce) so it's a lot harder to see unless the conditions are just right.
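A quick way to see the cut-off (my own simplification: the top of a bow sits at roughly the bow's radius minus the sun's elevation):

```python
# Apparent elevation of the top of a rainbow, using ~42° for the primary bow
# and ~51° for the secondary. Negative means the bow is below the horizon.
def bow_top_elevation(sun_elevation, bow_radius=42):
    return bow_radius - sun_elevation

for sun in (10, 30, 45, 55):
    print(sun, bow_top_elevation(sun), bow_top_elevation(sun, bow_radius=51))
# By ~42° the primary bow has sunk below the horizon; the secondary hangs on
# until the low 50s, but is much fainter.
```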

Here's a drawing from this site showing the path of light from the sun (yellow lines) at sunrise or sunset, and how they bounce off raindrops to create rainbows at the blue and red colours (these are reflected at different angles, hence the colours of a rainbow). The higher the sun in the sky, the more downward-pointing those yellow lines will be, and the closer to the horizon and harder to see the rainbow will be (to see the effect, tilt your head to the left and imagine the ground is still horizontal from your perspective).

To get the other two rainbows, we also need a body of water the right distance away behind us (it's also possible in front, but more difficult) to create another "sun" that will also make two rainbows - this is already harder to pull off, because the reflected sun will be dimmer depending on how good a mirror the body of water is.

We can do a bit of geometry and work out what the required distances would be. Given rain a certain distance in front of us, this plot shows, relative to that distance, how high the raindrops need to be (above the ground) and how far behind us the water needs to be for the primary and secondary (dotted in the plot) rainbows - both the original and reflected versions - to occur:

We can work out a few things from this. First, the apparent heights of the original primary and secondary rainbows (green) are never much larger than the distance the rain is away - the higher the sun is in the sky, the lower that height is relative to the distance. Since the rain goes right down to the ground, this doesn't really restrict us at all until the rainbow drops below the ground and we can't see it.

However, for the reflected rainbows (black), it's the opposite - the higher the sun is in the sky, the higher we expect the raindrops to be, and they're almost always going to be at least as high as the rain is far away. We'd expect raindrops to usually be less than 6km high based on this site, so the rain should be closer than that at least.

Using the timestamp in the Twitter post (converted to New York time, 5:57am) and this site, we can actually work out where the sun was in relation to New York when the picture was posted (and, hopefully, taken). It turns out it was not long after sunrise, so the sun was quite low in the sky, at an elevation of about 8.2°. The blue line on the graph represents this. The highest points of our secondary original rainbow (green, dotted) and our primary reflected rainbow (black, plain) should be at around the same place, with the original slightly lower, and this looks to be the case in the image. So far so good! The primary original (green, plain) and secondary reflected (black, dotted) sit below and above these two respectively.

The next thing we need to check is whether the water lines up - the water for the primary reflected rainbow should be about 7 times further away than the rain, and for the secondary reflected rainbow about 12 times further. The direction of the sun at that time of morning was about 81°, so just north of east. If we assume the rain was about 1km away, looking at a map of the location there are two likely-looking patches of shallow, calm water about 7km and 12km away in that direction, at Oyster Bay and Cold Spring Harbor respectively.
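For the curious, here's roughly how those 7x and 12x figures fall out of the geometry (my own sketch, assuming flat ground with the observer, rain and water all at about the same level):

```python
import math

sun_elevation = math.radians(8.2)   # sun elevation at 5:57am
rain_distance = 1.0                 # assume the rain was about 1km away

def water_distance(bow_radius_deg):
    # The top of a reflected bow appears (bow radius + sun elevation) above the
    # horizon, so the relevant raindrops sit at roughly this height:
    drop_height = rain_distance * math.tan(math.radians(bow_radius_deg) + sun_elevation)
    # The reflected ray climbs at the sun's elevation angle from the water
    # surface behind us up to that raindrop in front of us:
    return drop_height / math.tan(sun_elevation) - rain_distance

print(round(water_distance(42), 1))   # ~7km for the primary reflected bow
print(round(water_distance(53), 1))   # ~12km for the secondary reflected bow
```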

So after doing some detective work, it looks like the quadruple rainbow is not only plausible, but the result of a combination of unlikely yet entirely possible events!

Tuesday, November 4, 2014

The reality of Melbourne Cup betting

An interesting proposition just came up on my Twitter feed. Tom Waterhouse, the smug git you see on TV spruiking his bookmaking service, is offering $25 million to anyone who can pick the first 10 runners in the Melbourne Cup in order - and it only costs you $10 per try! (Apparently you get 10 tries at most - don't get greedy now!)

Sounds good, right?

Well, no. Intuitively most people can smell a rat straight away - after all, it's difficult to win the lottery and that's only 6 numbers that need picking (though there are more numbers to pick from). But it's far worse than that. Even with two horses scratched, if we naively assume each horse has the same chance of winning, then picking the first horse is a 1 in 22 chance. Picking the next horse is then a 1 in 21 chance (as we can eliminate the winner), and so on until the 10th horse is a 1 in 13 chance. All up, the probability of getting the lot right is about 1 in 2 million million. Even if the entire population of Australia, about 23.6 million people (including children!), put in their 10 bets each, there would only be a 0.01% chance that anyone would win.
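Here's that naive calculation spelled out (a sketch, assuming everyone picks a different combination):

```python
import math

runners, places = 22, 10
combinations = math.perm(runners, places)   # 22 * 21 * ... * 13
print(f"{combinations:,}")                  # ~2.3 trillion ("2 million million")

bets = 23_600_000 * 10                      # everyone in Australia, 10 bets each
print(f"{bets / combinations:.2%}")         # ~0.01% chance that anyone wins
```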

The smart gamblers are probably now thinking "well, each horse doesn't have an equal chance of winning - I can exploit that!" And they'd be right! So let's look at the odds of each horse winning (I've used the fixed odds at Betfair, but feel free to substitute your own). We can estimate the probability of each horse winning from the bookmaker's odds, and this is as accurate a representation as we're likely to find without significant effort - after all, it's in the bookmaker's interest to know the probabilities as accurately as they can in order to make the most money! A formula to estimate the probability of a particular horse winning is:

1/<the odds for your horse> divided by the sum of (1/odds) for every horse.

Doing this gives us a probability of about 16% for the favourite, Admire Rakti, winning. So let's put our money on the favourite winning, followed by the second favourite in second place, and so on. Once our favourite goes past the line first, we then need the second favourite (either Fawkner or Lucia Valentina) to come next. We can estimate the probability of this happening by dividing its probability of winning (about 13%) by the combined winning probabilities of all the remaining horses (about 84%), giving us a probability of about 15% of this horse coming second given that our favourite has already come first.
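Carrying that step all the way down to tenth place looks something like this (a sketch only - the decimal odds below are placeholders I've made up for illustration, not the real Betfair prices, so the final number won't exactly match the figure below):

```python
# Hypothetical decimal odds for the 22 runners, favourite first.
odds = [5.5, 7.0, 7.0, 9.0, 12.0, 15.0, 18.0, 21.0, 26.0, 34.0, 41.0,
        51.0, 51.0, 61.0, 67.0, 81.0, 81.0, 101.0, 101.0, 126.0, 151.0, 151.0]

# The formula above: each horse's implied win probability.
inverse = [1 / o for o in odds]
win_probs = [x / sum(inverse) for x in inverse]

# Probability that the ten most-favoured horses fill the first ten places in
# order: at each step, divide the next favourite's win probability by the
# combined win probabilities of the horses still in the running.
probability, remaining = 1.0, sorted(win_probs, reverse=True)
for _ in range(10):
    probability *= remaining[0] / sum(remaining)
    remaining.pop(0)

print(f"about 1 in {1 / probability:,.0f}")
```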

We rinse and repeat like this until we get through the first ten horses, which gives us a much nicer final probability of 1 in 28 million. Naively, you might think that if everyone in Australia had a go at this, surely with 236 million bets we'd be able to do it pretty easily.

Unfortunately, though, if everyone in Australia put their bets in here, they're not all going to be able to pick this most likely scenario. If they did, then either everyone would win, sharing the $25 million and getting about $1 each back from their outlay of $100, or nobody would win! Instead, everyone would have to organise to pick the 236 million most likely combinations between them. And then, even if someone managed to win, Tom Waterhouse would still be pocketing $2,360 million and only having to shell out $25 million, making his smug face even more unbearable...

Friday, July 25, 2014

Fruits of procrastination

Winter tends to be a bit slow for me, in terms of work and productivity at least. It gets that little bit harder to concentrate, or stay motivated on tasks that are... the less fun parts of my job as a research scientist.

To that end, I thought I'd keep this blog alive by sharing some of my afternoon's procrastination, which I thought was kind of cool and a real reflection of how, even now in 2014, we're still a fair way away from 'science fiction' in a lot of our endeavours. Artificial intelligence is a big one of these - we've achieved a lot since electronic computers hit the scene not so long ago, but our imaginations, at least for now, far outstrip what we've been able to do. Precisely because it excites people's imaginations, progress is heavily trumpeted - and make no mistake, some cool things have been done, especially in AI-friendly environments such as strategy games (chess is probably the most obvious example here).

Unfortunately, things get much more difficult for AIs when we go from simple games, where the options are finite and often manageable, to more realistic real-world tasks where numerous things need to be coordinated at once - something our human brains evolved to deal with, but computers have no such base to work from. The programmer can of course give the computer insights into how humans would deal with things, and sheer processing speed can help make up some of the difference - any first-person computer gamer can attest to AIs being potentially very skilful (though often easily fooled by unusual strategies).

My afternoon's procrastination has involved looking at RoboCup - a series of competitions based around the game of soccer (or football, depending where you're from). The AIs actually look reasonably clever in the simulated 2D version. Keep in mind that to keep some degree of 'realism' each AI player has been given some simulated 'noise' to their sensors so they don't have perfect information, much like players in real life.

Once you get to 3D though, things start looking seriously clunky. Each virtual robot has 22 different joints to control - and it shows. They're very good at doing set combinations of movements (like a set shot at goal, given enough time) but it's not exactly what you'd call graceful...

When you convert this to real life robots, things get even worse. Really the only thing these robots can do consistently well is get up after they've fallen over - and after watching this video for any length of time you'll understand why this is a vital necessity:

The stated goal of RoboCup is that "by the middle of the 21st century, a team of fully autonomous humanoid robot soccer players shall win a soccer game, complying with the official rules of FIFA, against the winner of the most recent World Cup". At the moment that looks kind of optimistic, but when you consider how far computing came from the earliest personal computers in the 80s to the present, then extrapolate to 30 years in the future, their goal doesn't seem quite so unrealistic.

Monday, April 21, 2014

DIY animal surveys (part 2)

So after the success of my first forays into using motion detection to film the neighbourhood cats, I thought maybe I'd get a little bolder and set up the equipment next to the house. I originally decided against this because I thought any cats (especially kittens) would be scared off by the proximity to light and humans, but considering how bold the last one was, it'd be worth a try!

The next morning, a quick perusal of the food bowl suggested that nothing had been eaten, so I wasn't feeling particularly optimistic as I went to review the footage - yet again, I needn't have worried. This time I picked up not one, but two feline feeders, obviously working together:

After their first joint perusal of the offerings on display, they individually came back to the bowl...

... and laptop...

... again...

... and again - often looking around curiously at objects (or potentially off-screen cats) as they did so.

The black cat was evidently the wilier of the two - while the above photos were all taken in the space of five minutes, it returned a couple of hours later apparently having ditched its companion to see if any tastier food had magically appeared in the bowl that it could have for itself.

Tuesday, April 15, 2014

DIY animal surveys

Our neighbourhood is a cat neighbourhood. Walking along the streets at dusk or after dark, you can see at least a small handful of local cats prowling around or sitting smugly on their owners' driveways, soaking up the last bit of heat of the day. So it didn't come as any surprise to me that every time I put outside the scraps that our own (indoor) cat for whatever reason didn't eat, they'd invariably be gone the following day.

I thought it worth investigating exactly which cat was taking these scraps. We've occasionally seen kittens wandering around our yard and more regularly around the neighbourhood, and I was a bit concerned for their welfare - so I thought it would be good to know if they were feeding in our yard and whether they could be collected for a rescue shelter.

So I got out my old crappy laptop with its old crappy webcam and set it up outside in our garage, somewhere that rain/wind wouldn't bother it (though it's old enough that I wouldn't have been too distraught if something did happen to it), and turned on a motion capture software program (I can thoroughly recommend yawcam - it's free!). I was unsure whether our nightly visitor would be put off by the outside light I left on for the webcam to be able to see, and my fiancee was understandably cynical as to whether the process would work at all. So come next morning, I rushed out to reclaim my laptop, and after flicking through the images captured during the night, felt vindicated at seeing this photo come up at 12.11am:

I needn't have worried, though, as I'd forgotten two basic attributes of cats. Firstly, they are curious and attracted to new and interesting objects - and secondly, they're attracted to warm objects. The laptop that had been running all night out in the cold was both of these things! Thus, at 2.23am, the vision went entirely black, followed by images of the cat walking directly in front of the laptop sniffing at it:

Then an hour later at 3.14am it returned for another look at the laptop before scurrying off, not to be seen again in the footage (though it may well have returned - the laptop stopped recording when Windows decided to restart after downloading a security update... a lesson for anyone wanting to try this at home!)

It just goes to show that with the modern (and sometimes slightly less modern) technology we have available and take for granted, it's actually pretty easy to set up some fun and interesting projects to see what's just outside your door. It's probably worth noting, though, that the webcam didn't actually pick up any evidence of said cat eating the food left out for it, even though it was definitely gone the next morning!

Tuesday, January 7, 2014

Testing the Bechdel Test

So, recently this article came out showing that of the top 50 movies of 2013, those that passed the Bechdel Test made more money overall at the US Box Office than those that didn't. For those not in the know, the Bechdel Test evaluates whether a movie has two or more named women in it who have a conversation about something other than a man. The test seems simple enough to pass, but surprisingly quite a lot of movies don't! Of the 47 top movies that were tested, only 24 passed the test (and at least* seven of those were a bit dubious). Gravity was understandably excluded from the test because it didn't really have more than two named characters**, and apparently no-one has bothered to test the remaining two.

The article comes with this nifty little infographic:

I've seen a couple of complaints on the web from people saying that this isn't enough proof - the somewhat ingenuous reasoning I saw was that the infographic shows totals and not averages, so it can't prove that the average Bechdel-passing film performs better. Though there are more passes (24) than fails (23), that difference is nowhere near enough to account for the almost 60% difference in total gross sales. The averages can quickly be calculated from the infographic above - the average passing film made $176m and the average failing film made $116m, still a very substantial $60m difference!

A more reasonable criticism is that it may be possible that things just happened this way by chance. Maybe this year a handful of big films happened to be on the passing side, and if they had failed there'd be no appreciable difference? Well, we can test that as well using the information in the infographic. All we need to do is run what's called a randomisation test - this is where we randomly allocate the 50 movies in this list to the "pass", "fail" and "excluded" categories in the same numbers as in the real case (so 24 passes, 23 fails, 3 excluded). We can use a random number generator to do this, or if you're playing along at home, put pieces of paper in a hat, whatever. We repeat this process a large number of times (I did it 10 million times) and see how often we can replicate that $60m difference between passing and failing films, or better, by chance alone.
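Here's roughly what that randomisation test looks like in code (a sketch only - the grosses list below is stand-in data, whereas the real test used each of the 50 movies' actual US box-office grosses from the article):

```python
import random

# Stand-in grosses in $m; swap in the 50 real figures to reproduce the result.
grosses = [random.uniform(50, 450) for _ in range(50)]
labels = ["pass"] * 24 + ["fail"] * 23 + ["excluded"] * 3

def average_difference(grosses, labels):
    passes = [g for g, l in zip(grosses, labels) if l == "pass"]
    fails = [g for g, l in zip(grosses, labels) if l == "fail"]
    return sum(passes) / len(passes) - sum(fails) / len(fails)

observed = average_difference(grosses, labels)   # ~$60m with the real data

beaten = 0
trials = 100_000                                 # the post ran 10 million
for _ in range(trials):
    random.shuffle(labels)                       # re-deal pass/fail/excluded at random
    if average_difference(grosses, labels) >= observed:
        beaten += 1

print(f"{beaten / trials:.2%} of random shuffles beat the observed difference")
# (with the real grosses, this comes out at about 0.71%)
```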

It turns out that when you put your pieces of paper in a hat to run your own test, you'll only be able to beat the actual difference about 0.71% of the time, or about 1 in 140 times. This is pretty good evidence that it's not a fluke, and that passing the Bechdel Test really was associated with better box-office results this past year. One thing we can't say based on this is whether it's a direct effect - i.e. that people consciously or subconsciously decided to go and watch passing films over failing films. It could be that some indirect, or confounding, effect is behind the phenomenon. For example, maybe directors who write films that pass the test tend to be better filmmakers in other ways, which makes people want to watch their films more? Either way, a trend towards more women in substantial roles in films can be no bad thing! (Though it's worth mentioning that passing the Bechdel Test by no means guarantees a "substantial role", and even failing movies can have their strong points - see this link.)

* Having watched Man of Steel, I'd argue that it was pretty dubious too - I think the only not-about-a-man conversations between two women were one-sided one-liners (hardly a conversation)... In any case, any feminist points it may have gained were swiftly taken away in my book by the female US Air Force captain being mostly portrayed as a ditz rather than as the dedicated leader of people that the rank requires. More here.
** So I'm told. I haven't watched it yet.