Using Artificial Intelligence to Understand Movement Patterns in Team Sports

by Fusion Sport | 4th May, 2017

As the striker streams down the sideline, the defenders need to make a decision. Do they shut down the striker’s space, or zone off and cover the outlet? Caught in two minds, the defenders hold off, allowing a free shot at goal. Luckily, the goalie makes a great save, but an unmarked attacker gathers the rebound and scores into an open net. As coaches, what do we tell our defenders to do here? While many may have opinions, new research from Disney Research, California Institute of Technology and STATS might provide us with objective data to help answer these questions [1].

Position Data

Accurate player position data is now readily available to sporting organisations. Technology companies such as Opta or STATS have allowed for highly detailed annotated event and position data through video tracking and manual tagging, while more advanced measures can come from wearable sensors, such as GPS units and inertial measurement units, which we’ve written about previously. With the availability of this data, sporting organisations are beginning to move away from ‘what happened’ (e.g. Team A beat Team B) and towards ‘why did that happen’ (e.g. how could we have prevented their midfield from dominating?) [2].

Drowning in Data

While the aforementioned technologies have helped sporting organisations acquire this type of data, analysing it is a different beast. Given the complexity, size and contextual nature of positional data, most of an analyst’s time is spent producing high-level reports — not to mention that expert domain knowledge is required to gain insight from the raw data.

Figure 1: Acquisition of team sport data is often only a small part of the picture.
Image taken from Stein et al. (2017) [2]


An example of how domain expertise is necessary to draw insight out of raw position data comes from a fantastic Grantland article written in 2013 about the Toronto Raptors analytics team. The article talks about an idea called ‘ghosting’, in which coaches and analysts identify where they think a defender ‘should’ be on the court. With this information, they can assess a player’s effectiveness by comparing the player’s actual position to where they should have been. While this is an innovative approach to assessing contextual player positioning, it is time-consuming and relies on subjective decisions.
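To make the comparison concrete, here is a minimal sketch of how the gap between a defender and their ghost might be scored. The data layout (one (x, y) coordinate in metres per frame) and the function names are assumptions for illustration only; they are not taken from the Grantland article or from any team's actual tooling.

```python
import math
from typing import List, Tuple

# Assumed layout: one (x, y) coordinate in metres for each tracking frame.
Position = Tuple[float, float]


def ghost_deviation(actual: List[Position], ghost: List[Position]) -> List[float]:
    """Per-frame straight-line distance between a defender and their ghost."""
    if len(actual) != len(ghost):
        raise ValueError("actual and ghost tracks must have the same number of frames")
    return [math.dist(a, g) for a, g in zip(actual, ghost)]


def mean_deviation(actual: List[Position], ghost: List[Position]) -> float:
    """Average distance from the ghost across the whole play."""
    distances = ghost_deviation(actual, ghost)
    if not distances:
        raise ValueError("tracks must contain at least one frame")
    return sum(distances) / len(distances)


# Illustrative numbers only: the defender starts on the ghost, then lags behind it.
actual_track = [(10.0, 5.0), (11.0, 5.0), (12.0, 6.0)]
ghost_track = [(10.0, 5.0), (14.0, 9.0), (15.0, 10.0)]

print(ghost_deviation(actual_track, ghost_track))
print(mean_deviation(actual_track, ghost_track))
```

A single average like this hides when in the play the defender drifted, so in practice an analyst would probably look at the per-frame distances alongside the video.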


This original ghosting work motivated the Disney Research team [1]. In their paper, they take positional data from STATS that contains 2D coordinates of every player (and the ball) during each game of a Premier League season, as well as human-annotated events (such as tackles, goals and shots). They then use deep learning, the same broad family of techniques that Google DeepMind used in AlphaGo, to learn the complex interactions and patterns in the positional data.

From this, the team could develop automated ‘ghosts’ that showed, in various contexts, where an average defender would position themselves. They can also train on a specific subset of the data to understand, for example, where a defender from one of the top four teams would position themselves in that situation. The video below shows that when the defenders are replaced by ghosts modelled on a top team, the estimated probability of the attacking team scoring drops from 70% to 40% [1].
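The idea of an ‘average’ ghost versus a ‘top team’ ghost can be illustrated with a deliberately simple stand-in. The sketch below is not the deep imitation learning method of Le et al.; it swaps in a plain nearest-neighbour average, and the `Sample` fields, the `top_team` flag and the `knn_ghost` function are all hypothetical names invented for this example. Given the ball position, it averages where defenders stood in the most similar past situations, optionally restricting the pool to top-team defenders.

```python
import math
from typing import List, NamedTuple, Tuple

Position = Tuple[float, float]


class Sample(NamedTuple):
    ball: Position      # ball (x, y) in metres at the moment of interest
    defender: Position  # where the defender actually stood
    top_team: bool      # hypothetical flag: defender plays for a top-four side


def knn_ghost(samples: List[Sample], ball: Position,
              k: int = 2, top_only: bool = False) -> Position:
    """Average defender position over the k past situations closest to `ball`."""
    pool = [s for s in samples if s.top_team] if top_only else list(samples)
    if not pool or k < 1:
        raise ValueError("need at least one matching sample and k >= 1")
    nearest = sorted(pool, key=lambda s: math.dist(s.ball, ball))[:k]
    x = sum(s.defender[0] for s in nearest) / len(nearest)
    y = sum(s.defender[1] for s in nearest) / len(nearest)
    return (x, y)


# Made-up history, for illustration only.
history = [
    Sample(ball=(80.0, 10.0), defender=(70.0, 12.0), top_team=False),
    Sample(ball=(82.0, 12.0), defender=(72.0, 14.0), top_team=False),
    Sample(ball=(81.0, 11.0), defender=(77.0, 11.0), top_team=True),
    Sample(ball=(78.0, 9.0), defender=(75.0, 9.0), top_team=True),
    Sample(ball=(20.0, 30.0), defender=(15.0, 30.0), top_team=True),
]

average_ghost = knn_ghost(history, ball=(80.0, 10.0), k=2)
top_team_ghost = knn_ghost(history, ball=(80.0, 10.0), k=2, top_only=True)
print(average_ghost, top_team_ghost)
```

In this toy data the top-team ghost ends up closer to the ball than the average ghost. A real model would condition on every player's position and movement over time, which is the part the deep learning approach handles and this sketch does not.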

Call to Action

As technologists, we are always trying to assist coaches to perform their role. Rather than replacing their decision making, capturing their domain expertise with automated analytical processes allows them to do more of what they do best — coaching!

So, knowing this, what does the ghosting model reveal about the scenario introduced at the start of this article? Defender ghosts from a top-level team pushed up to challenge the striker, while the winger dropped deep into defence. This meant that a) the initial shot would have been taken under pressure and b) the rebound would likely have been cleared.

Figure 2: Example of ghost players (in white) versus actual defenders (in blue) for a defensive scenario.
Image taken from Le et al. (2017) [1]

Are you trying to analyse positional data? How do you manage it? Our analytics team would love to hear from you — let us know on Twitter or Facebook or as always email us at


1. Le HM, Carr P, Yue Y & Lucey P (2017). Data-Driven Ghosting using Deep Imitation Learning.
2. Stein M, Janetzko H, Seebacher D, Jäger A & Nagel M (2017). How to Make Sense of Team Sport Data: From Acquisition to Data Modeling and Research Aspects. Data; DOI: 10.3390/data2010002.