Using Data Science to Predict AFL Results: Can We Crack the Code?

Posted Jun 20, 2021

By Matt Anderson

6 min read

Australian Rules Football (AFL) is an absolute rollercoaster of a sport. It’s fast, physical, and unpredictable—exactly what makes it thrilling for fans. But when it comes to predicting the outcomes of matches, even seasoned footy fans struggle to call it right. Enter data science, a tool that’s revolutionised the way we look at sports predictions. In this post, I’ll explore how data science can be used to predict AFL results, the methods and models involved, and whether it’s worth the effort.

Why Predicting AFL Results Is So Challenging

Before we dive into data science, let’s set the scene. Unlike more rigidly structured sports like tennis or basketball, AFL has an almost chaotic feel to it. Each match involves a wide range of variables, from player injuries and team tactics to weather conditions and even umpiring. Add to that a good old-fashioned upset here and there, and you’ve got a sport that’s notoriously difficult to predict with accuracy. But difficult doesn’t mean impossible.

What Makes AFL Predictable? Data Points That Matter

So what do we have to work with when trying to predict AFL outcomes?

Team Performance Metrics: These include statistics like inside 50s, clearances, contested possessions, and goal accuracy. AFL is rich with game stats, many of which are crucial to determining a team’s performance on the field.
Player Statistics: Individual player stats such as disposals, tackles, marks, and goals can provide insight into how much a specific player contributes to the team’s chances of winning. Advanced metrics like Player Ratings and metres gained are also handy for deep analysis.
Game Location: Home ground advantage is a real thing in AFL. For example, teams playing at the MCG tend to perform better if it’s their regular home ground. Similarly, interstate travel (e.g., a Perth team flying to play in Melbourne) can affect performance.
Weather Conditions: Rain, wind, and even the temperature can heavily influence how a game plays out. Wet weather can make handling the ball harder, leading to lower scores.
Head-to-Head Matchups: Past encounters between two teams can offer clues. Some teams just seem to have the wood over others, regardless of form.
Injuries and Team Selection: Missing key players due to injury or suspension, or the inclusion of a debutant, can be game-changing factors.

The Data Science Toolbox

Once you’ve got your hands on this mountain of data, what do you do with it? Here’s where the fun starts. Data scientists typically rely on a range of models and algorithms to process this information and make predictions. Let’s break down some of the most common ones:

1. Regression Models

Linear regression models are a simple way to predict outcomes based on various input factors. You’d take variables like past performance, player form, and weather to predict the margin or probability of a win. For instance, if we know Team A performs 10% better on average when playing at home and Player X is in great form, we can apply a weighted score to forecast the likely outcome.

2. Machine Learning Algorithms

More complex than regression, machine learning models like Random Forests or XGBoost are commonly used in sports analytics. These models can handle a huge amount of data, analysing the relationships between variables in a non-linear way. They also “learn” as they go, getting better as more games are fed into them. With machine learning, the model might start to recognise patterns like how specific player matchups affect the overall team dynamic or how weather impacts goal-kicking accuracy.

3. ELO Rating System

The ELO system, originally used for chess, has been adapted to many sports, including AFL. It’s a dynamic model where teams’ ratings are updated after each game. A team gains points for a win and loses points for a loss, but the amount of points exchanged depends on factors like the opponent’s rating and whether the game was home or away.

4. Monte Carlo Simulations

This is a statistical technique that runs thousands (or millions) of simulations of a game, adjusting for random variables like player performance or team tactics. Over time, it can provide a probabilistic outcome for a match, showing the percentage chance of each team winning.

So, Does It Work?

Yes and no. Predicting AFL results with data science is possible, but it’s not foolproof. Even the best models can’t account for every variable—like a star player having an off day or a controversial umpiring decision. However, with enough data, these models can significantly improve the chances of making accurate predictions over time.

For example, sports betting platforms often use sophisticated models to set odds, and many fans build DIY prediction models using publicly available data. Some claim to reach up to 70-80% accuracy on predictions over a season, but this can vary dramatically from week to week.

Case Study: Predicting the AFL Grand Final

Let’s say we wanted to predict the outcome of the 2023 AFL Grand Final. We could feed the model with:

Form data: How have the teams performed in the lead-up?
Player performance: Key matchups, like how a dominant ruckman may affect clearances.
Location and conditions: Is it being played at the MCG, and what’s the weather forecast?
Head-to-head history: How have these teams fared against each other this season?
Injuries: Are there any notable absentees?

Using this data, a machine learning model might predict a 65% chance of one team winning, with an expected margin of 12 points. It’s not a guarantee, but it gives you a more informed perspective.

The Human Element

No matter how good the data and models are, there’s always a human element in AFL—moments of brilliance or disaster that no algorithm can predict. Think of the miracle goals after the siren, the underdog victories, and the unforeseen injuries. This is part of why we love the game.

For data science to truly “crack the code” of AFL, it would need to quantify not just performance metrics but intangible elements like team spirit, crowd influence, and individual player psychology. We’re not there yet, but who knows what the future holds?

Building Your Own AFL Prediction Model

If you’re a data science enthusiast or just a footy fan looking to get more analytical, building your own AFL prediction model is totally doable. Tools like Python, R, and libraries like scikit-learn can help you start building a basic model. There’s a heap of free AFL data available, such as from the official AFL website, AFL Tables, or Champion Data (for a fee). From there, it’s about experimenting with different models, variables, and simulations to see what works best.

Wrapping Up

Data science is a powerful tool in AFL predictions, but it’s not a crystal ball. Models can get close, and they often offer more insight than pure guesswork, but the unpredictable nature of footy means surprises are always around the corner. Still, for those of us who love numbers, analysing the game in this way adds another layer of excitement to the sport. So the next time you’re watching a match, know that behind every kick and every goal, there’s a wealth of data just waiting to be crunched.

And who knows? You might just crack the code to predicting footy results!

Let me know what you think about the topic, and if you want, I can show you how to build a simple AFL prediction model in Python. Could be a fun project for the off-season!

This post is licensed under CC BY 4.0 by the author.