The Poisson distribution, named after French mathematician Siméon Denis Poisson, is a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time or space if these events occur with a known constant rate and independently of the time since the last event.
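For reference, the definition above can be written as a formula. The symbols are our own notation rather than anything used elsewhere in this article: $\lambda$ is the expected number of events in the interval and $k$ is the number of events actually observed.

```latex
P(X = k) = \frac{\lambda^{k} e^{-\lambda}}{k!}, \qquad k = 0, 1, 2, \dots
```

Both the mean and the variance of the distribution equal $\lambda$, so a single figure (the expected number of events) fixes every probability.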
The Poisson distribution can be useful for modelling events to find value, but it has limitations that you should be aware of before following it blindly.
Looking at the example game below, you will see that the over 2.5 goals market is sitting very close to the evens line. This would lead us to conclude that the expected number of goals in this game is 2.5.
If we put the figures into a Poisson distribution calculator, using 2.5 as the expected goals figure, we come out with the following results.
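If you do not have a calculator to hand, the same kind of figures can be reproduced in a few lines of Python. This is a minimal sketch using only the standard library. The function names `poisson_pmf` and `over_probability` are our own labels for illustration, and the odds printed are fair decimal odds with no bookmaker margin, which is an assumption.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k events when the expected number is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def over_probability(line, lam):
    """Probability that the count exceeds a half-goal line such as 2.5."""
    max_under = math.floor(line)  # highest count that still settles as 'under'
    p_under = sum(poisson_pmf(k, lam) for k in range(max_under + 1))
    return 1.0 - p_under

lam = 2.5  # expected goals, assumed here from the evens line
for goals in range(6):
    print(f"{goals} goals: {poisson_pmf(goals, lam):.4f}")

p_over = over_probability(2.5, lam)
print(f"over 2.5:  p = {p_over:.4f}, fair decimal odds = {1 / p_over:.2f}")
print(f"under 2.5: p = {1 - p_over:.4f}, fair decimal odds = {1 / (1 - p_over):.2f}")
```

With an expected goals figure of 2.5, this prices over 2.5 at roughly 2.19 and under 2.5 at roughly 1.84, not evens on both sides.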
If we wanted to have confidence in our calculations, it would be nice if these two aligned. So the question is, which of these methods of benchmarking is the most accurate?
The Poisson method has the advantage that it is based on the over 2.5 goals market, which is generally the most liquid of these types of markets. However, it has the disadvantage that we need to accurately assess the expected number of goals. Even in this example, where the evens line looks like it is giving us 2.5 as an average, can we say this is correct? It may be an even-money bet to be over or under 2.5, but that tells us where the median of the distribution sits, not its mean. The distribution of goals is discrete, non-negative and skewed to the right, so the mean is higher. In fact, we would need to use an average of 2.65 goals for the Poisson calculation to come out with odds of 2.02.
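The step from a market price back to an average number of goals can be sketched as a simple search. The code below is an illustration under stated assumptions, not a definitive method. The function names are hypothetical, the price is treated as a fair decimal price with no margin, and bisection is just one convenient way to solve for the mean.

```python
import math

def over_2_5_probability(lam):
    """P(3 or more goals) under a Poisson model with mean lam."""
    p_under = math.exp(-lam) * (1 + lam + lam ** 2 / 2)
    return 1.0 - p_under

def implied_expected_goals(decimal_odds, lo=0.01, hi=10.0, tol=1e-9):
    """Bisect for the mean whose over 2.5 probability equals 1 / decimal_odds."""
    target = 1.0 / decimal_odds
    while hi - lo > tol:
        mid = (lo + hi) / 2
        # The over probability rises with the mean, so move the matching bound.
        if over_2_5_probability(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(round(implied_expected_goals(2.02), 2))  # over 2.5 priced at 2.02
print(round(implied_expected_goals(2.00), 2))  # over 2.5 priced at exactly evens
```

A price of 2.02 on the overs corresponds to an average of about 2.65 goals, and an exact evens price to roughly 2.67. Either way, the figure is noticeably above the 2.5 that the line itself suggests.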
Given the difficulty in assessing the average number of goals, our preference is to use the Betfair figures in our model; it is more time-consuming, but we believe it is more accurate.
When Poisson breaks down
There are certain circumstances that suit the use of a Poisson calculation and there are times when it is not appropriate.
Poisson is good when:
You can accurately estimate the expected par score for a particular event.
The scoring method is such that the score goes up in individual discrete increments (so one ‘point’ at a time).
The ‘points’ are scored independently of each other.
Poisson is bad (or not so good) when:
The par score cannot be determined accurately.
Points are scored in different multiples (1 run, 2 runs, 3 runs, 4 runs or 6 runs from any particular ball in cricket).
Point scoring is affected by previous events (in a football match, you are more likely to win another corner if you’re currently taking a corner).
One example of events that are not entirely independent cropped up when England played Afghanistan in the 2019 Cricket World Cup. England won the toss and elected to bat. At this point the spreads were indicating that six sixes during England's innings would be a par score.
The scorecard (from http://www.espncricinfo.com) shows England hit a new record of 25 sixes, with their captain Eoin Morgan dispatching 17 of these. Using Poisson with a par of six, we would have odds of nearly 170 million to one against there being 25 or more sixes.
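The size of that number is easy to check. The sketch below sums the upper tail of a Poisson distribution with a mean of six. The function name `poisson_tail` and the cut-off of 100 terms are our own choices for illustration; the terms shrink so quickly that the cut-off has no practical effect on the answer.

```python
import math

def poisson_tail(k_min, lam, terms=100):
    """P(X >= k_min) for a Poisson variable with mean lam, summed directly."""
    # Summing the upper tail term by term avoids the rounding loss of
    # computing 1 - P(X < k_min) when the answer is tiny.
    term = math.exp(-lam) * lam ** k_min / math.factorial(k_min)
    total = 0.0
    for k in range(k_min, k_min + terms):
        total += term
        term *= lam / (k + 1)  # ratio of successive Poisson probabilities
    return total

p = poisson_tail(25, 6.0)
print(f"P(25 or more sixes) = {p:.3e}")
print(f"roughly 1 in {1 / p:,.0f}")
```

The probability comes out at a little under six in a billion, which is where the figure of nearly 170 million to one comes from.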
Putting aside the possible inaccuracy of using the spreads site to provide a par score, it is clear that in these circumstances the sixes cannot be independent of each other. The batsman gains in confidence, the bowler loses his, and the batting team, spurred on by breaking records, try to hit more sixes than they might have done had the previous sixes not gone before them.
Poisson can be a useful tool when used carefully, but it isn’t a silver bullet for all circumstances.