Using world ranking to predict the results of the 2019 Rugby World Cup pool stages

Data visualisation for the pool stages of the 2019 Rugby World Cup

Published on Aug 2019. Modified on Aug 2019. Read time: six and a half minutes. Written by Tom Hazledine.

This year (2019) is a Rugby World Cup year. I like data visualisation, and I like rugby. So here's my primitive attempt to calculate the results of the pool stages of the 2019 World Cup. My simple heuristic is that world ranking (as of Aug 22nd 2019) is a predictor of a team's success. In effect, my simple algorithm states that a team with a higher ranking will always beat a team with a lower ranking.

Just here for the predictions? Skip to the end.

Note: I'm not trying to predict who will win a match, just who is expected to win. When the inevitable "upsets" occur (like Japan vs. South Africa in 2015), I want to be able to say "wow! They only had a {N}% chance of winning, but they did it!"

So let's translate that calculation into code:

const chanceOfWinning = (teamRank, oppositionRank) => {
    const combinedRanks = teamRank + oppositionRank;
    // Because a rank of 1 is the best, the team's share is the opposition's rank
    const invertedRanking = combinedRanks - teamRank;
    const percentage = (invertedRanking / combinedRanks) * 100;
    return percentage.toFixed(1); // Round to 1 decimal place (toFixed returns a string)
};

chanceOfWinning(2, 13); // "86.7"
// NZL are ranked `2`, and ITA are ranked `13`
// Therefore NZL have an 86.7% chance of beating ITA

With this primitive algorithm, I can produce a "likelihood of winning" percentage for any pairing of teams. And on first inspection it looks pretty good (based on my own subjective opinion of who should win a given match). New Zealand are currently ranked #2 in the world, and should be expected to crush a #13 side like Italy. Samoa and Russia (#16 and #20 respectively) should be a much closer match, but you'd expect Samoa to emerge victorious.

NZL (#2) 86.7% vs. 13.3% ITA (#13)

SAM (#16) 55.6% vs. 44.4% RUS (#20)

There are problems with using rank

But this method starts to look a bit shaky when we include the #1 ranked team in the world (a crown recently claimed by Wales at the time of writing). Not because Wales are particularly special, but because this algorithm massively favours numerically low ranks, and #1 most of all. I'd expect Wales to crush Uruguay (#19 in the world), but would not expect them to have such an easy time against Australia (ranked #6). The ranking-based algorithm predicts both matches would be walkovers:

WAL (#1) 95% vs. 5% URA (#19)

WAL (#1) 85.7% vs. 14.3% AUS (#6)
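To make the bias towards the #1 spot concrete, here is a minimal sketch that runs the same rank-based formula for the top-ranked side against a handful of opponents. The helper name `chanceByRank` and the list of opponent ranks are my own choices for illustration; only #6 and #19 correspond to matches discussed above.

```javascript
// Same rank-based formula as above, written in one expression.
// `chanceByRank` is a hypothetical name used only for this sketch.
const chanceByRank = (teamRank, oppositionRank) =>
    ((oppositionRank / (teamRank + oppositionRank)) * 100).toFixed(1);

// Illustrative opponent ranks for the #1 side to face.
const opponentRanks = [2, 6, 10, 19];

const rows = opponentRanks.map(rank => ({
    opponentRank: rank,
    chance: chanceByRank(1, rank),
}));

rows.forEach(row => {
    console.log(`#1 vs #${row.opponentRank}: ${row.chance}%`);
});
```

For the #1 side the formula reduces to n / (n + 1) for an opponent ranked n, so even the #2 side is given only a one-in-three chance.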

And there's another problem with using world rankings. Rankings, by their very nature, are ordinal. By ranking alone, the difference between #1 and #2 is the same as the difference between #2 and #3, and so on... Whereas in reality, some teams are much closer than their mere ranking would suggest.

Using points rather than ranking

A better metric to use as the base for our calculation would be points. World Rugby, the sport's governing body, uses a points system to determine the world rankings. These points are based on match performance, and range from zero to one hundred (the top side generally has a rating of somewhere near 90 points). In late August 2019, Wales have 89.43 points and New Zealand have 89.40 - it's tight at the top! Australia are on 84.05 and Uruguay have 65.18 points.

Using points rather than rank changes our algorithm slightly (we no longer need to invert the team's value, as higher points are better).

const chanceOfWinning = (teamPoints, oppositionPoints) => {
    const combinedRanks = teamPoints + oppositionPoints;
    const percentage = (teamPoints / combinedRanks) * 100;
    return percentage.toFixed(1);
};

Plumbing our examples into this calculation produces a much tighter set of matches. The end results are still the same (in this system, a team with higher points will always beat a team with lower points, in just the same way as the team with the better ranking always wins).

NZL (#2) 55.4% vs. 44.6% ITA (#13)

SAM (#16) 51.6% vs. 48.4% RUS (#20)

WAL (#1) 57.8% vs. 42.2% URA (#19)

WAL (#1) 51.6% vs. 48.4% AUS (#6)
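As a quick check, here is a small sketch that feeds the points quoted earlier in the post into the points-based formula. The `points` lookup object and the name `chanceByPoints` are assumptions made for this example; the values themselves are the late August 2019 figures given above.

```javascript
// World Rugby points as quoted in this post (late August 2019).
const points = { WAL: 89.43, NZL: 89.40, AUS: 84.05, URA: 65.18 };

// Points-based formula: a team's share of the combined points.
// `chanceByPoints` is a hypothetical name used only for this sketch.
const chanceByPoints = (teamPoints, oppositionPoints) =>
    ((teamPoints / (teamPoints + oppositionPoints)) * 100).toFixed(1);

const walVsUra = chanceByPoints(points.WAL, points.URA);
const walVsAus = chanceByPoints(points.WAL, points.AUS);
const walVsNzl = chanceByPoints(points.WAL, points.NZL);

console.log(walVsUra); // "57.8"
console.log(walVsAus); // "51.6"
console.log(walVsNzl); // "50.0"
```

The last line is the interesting one: by rank alone, #1 vs. #2 would come out at roughly two-to-one, whereas the points put Wales and New Zealand almost exactly level.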

These results look a little better than the ranking-only method. The delta between WAL/URA and WAL/AUS looks more realistic, and whoever is in the #1 spot has less of an unfair advantage. But now the amounts look wrong. Any theory that gives Italy a 44.6% chance of beating New Zealand must be inaccurate.

Increasing the weighting

The points-based system is a better reflection of the team's relative chance of winning, but to my eyes the results aren't extreme enough. It gives too much credit to the lower-tier teams, and not enough to the top-tier ones. For the calculation to better match my expectations, it needs to favour the teams at the top of the rankings. Not only that, but it needs to do it progressively - so a team in the middle gets a bit of a boost, but not as much as those at the top get.

I need to write a function that will adjust the points value of each team. The easiest way to get the result I'm after is to raise each team's points to a power.

const adjustment = num => Math.pow(num, 5);

const chanceOfWinning = (teamPoints, oppositionPoints) => {
    const combinedRanks = adjustment(teamPoints) + adjustment(oppositionPoints);
    const percentage = (adjustment(teamPoints) / combinedRanks) * 100;
    return percentage.toFixed(1);
};

I started with 2 as the exponent, and that was better than nothing, but still not enough. 10 was too extreme, and in the end I settled on 5. Raising each team's points to the power of 5 gave me a set of probabilities that looked about right. That formula added just enough of a notch in the middle of the graph, thereby increasing the likelihood of a top-tier team beating a lower-tier one.

NZL (#2) 74.6% vs. 25.4% ITA (#13)

SAM (#16) 57.9% vs. 42.1% RUS (#20)

WAL (#1) 82.9% vs. 17.1% URA (#19)

WAL (#1) 57.7% vs. 42.3% AUS (#6)
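To get a feel for how much the exponent matters, here is a rough sketch that makes it a parameter and tries the values mentioned above (plus 1, which is the unadjusted points formula) on Wales vs. Uruguay. The function name `chanceWithExponent` and the `byExponent` list are assumptions for this example, not part of the original code.

```javascript
// Points quoted in this post for Wales and Uruguay.
const WAL = 89.43;
const URA = 65.18;

// Same calculation as chanceOfWinning, with the exponent passed in.
// `chanceWithExponent` is a hypothetical name used only for this sketch.
const chanceWithExponent = (teamPoints, oppositionPoints, exponent) => {
    const team = Math.pow(teamPoints, exponent);
    const opposition = Math.pow(oppositionPoints, exponent);
    return ((team / (team + opposition)) * 100).toFixed(1);
};

// Exponent 1 leaves the points untouched; 2, 5 and 10 are the values I tried.
const byExponent = [1, 2, 5, 10].map(exponent => ({
    exponent,
    chance: chanceWithExponent(WAL, URA, exponent),
}));

byExponent.forEach(({ exponent, chance }) => {
    console.log(`exponent ${exponent}: WAL ${chance}%`);
});
```

An exponent of 1 reproduces the 57.8% from the plain points version and 5 gives the 82.9% shown above. The favourite's chance grows with every increase in the exponent, which is why 10 felt too extreme.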

Results for all the pools

This is of course only based on my experience of rugby and my own highly subjective opinions. But it is still anchored in reality because I'm using the points as a starting point, and treating each team equally (as much as I want to give England a boost, the algorithm doesn't support it).

Ironically, this calculation shows that the draw for this World Cup does give England a slight boost. The top 8 teams make it through to the quarter-finals, as you would expect. But when it comes to the semis, 4th-ranked South Africa miss out while 5th-ranked England manage to sneak in, a side effect of the pools being drawn years before the event. On the other hand, the fact that all of the top 8 make it into the quarters probably shows that the draw process works fairly well (or at least that the rankings have been comparatively static).

I'm not expecting these predictions to come true - there's a lot more to success in rugby than simple rankings. But I do find this kind of objective analysis useful for setting expectations. Looking at these predictions, I'll make more of an effort to see matches I might otherwise have passed on. Tonga vs. USA, for instance, looks like it'll be a close one. As do Scotland vs. Japan and New Zealand vs. South Africa (although after this year's Championship you don't need an algorithm to tell you that'll be a real grudge match!).
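The pool tables below come from playing every team in a pool against every other team and counting the predicted wins. Here is a minimal sketch of that round-robin, assuming the team with more points always wins. The `poolStandings` helper is hypothetical, and the example uses only the three Pool D teams whose points are quoted in this post, so it is not a full pool.

```javascript
// Play every pairing once and count wins, assuming higher points always win.
// `poolStandings` is a hypothetical helper written for this sketch.
const poolStandings = teams => {
    const wins = new Map(teams.map(team => [team.name, 0]));

    for (let i = 0; i < teams.length; i++) {
        for (let j = i + 1; j < teams.length; j++) {
            // Assumption: on equal points the first-listed team takes the win.
            const winner = teams[i].points >= teams[j].points ? teams[i] : teams[j];
            wins.set(winner.name, wins.get(winner.name) + 1);
        }
    }

    // Most wins first.
    return [...wins.entries()]
        .map(([name, count]) => ({ name, wins: count }))
        .sort((a, b) => b.wins - a.wins);
};

// Only the Pool D teams whose points appear in this post.
const standings = poolStandings([
    { name: "Uruguay", points: 65.18 },
    { name: "Wales", points: 89.43 },
    { name: "Australia", points: 84.05 },
]);

standings.forEach((team, index) => {
    console.log(`${index + 1}. ${team.name}: ${team.wins} wins`);
});
```

Because the higher-rated side always wins, a pool of five teams with distinct points always ends up with win counts of 4, 3, 2, 1 and 0.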

Pool A matches

Pool A results

Pos.	Team	Wins
1st	Ireland	4 wins
2nd	Scotland	3 wins
3rd	Japan	2 wins
4th	Samoa	1 win
5th	Russia	0 wins

Pool B matches

Pool B results

Pos.	Team	Wins
1st	New Zealand	4 wins
2nd	South Africa	3 wins
3rd	Italy	2 wins
4th	Canada	1 win
5th	Namibia	0 wins

Pool C matches

Pool C results

Pos.	Team	Wins
1st	England	4 wins
2nd	France	3 wins
3rd	Argentina	2 wins
4th	United States	1 win
5th	Tonga	0 wins

Pool D matches

Pool D results

Pos.	Team	Wins
1st	Wales	4 wins
2nd	Australia	3 wins
3rd	Fiji	2 wins
4th	Georgia	1 win
5th	Uruguay	0 wins

