Posts

Common football visualizations and statistics: Passing Networks

More detailed and readily available data and a thriving analytics community on Twitter have produced a series of new statistics and visualizations for football fans. In this series I will try to give a brief definition of various metrics, background on the charts and sources for where to find them going forward.

Passing Networks

Passing networks contain two main types of information: the average position of each player and the most frequent passing links between players of one team. Each node typically represents the average location of a player. This adds information on a team's game plan and tactical approach beyond the lineup sheets and graphics we usually see before a game. In those pregame charts, lineups tend to be symmetrical and pressed into assumed tactical formations which don't necessarily represent what is happening on the pitch. Each edge or link between nodes highlights a frequent passing path between two players. More important passing links ca
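As a rough sketch of how the nodes and edges of a passing network could be derived, the R snippet below uses a small made-up table of pass events. The data frame `passes`, its column names, the 0-100 pitch grid and the choice to average pass origins (rather than all touches) are assumptions for illustration, not a real data feed.

```r
# Hypothetical pass events for one team (coordinates on a 0-100 pitch grid).
passes <- data.frame(
  passer   = c("A", "A", "B", "B", "C", "A"),
  receiver = c("B", "B", "C", "A", "A", "C"),
  x        = c(20, 30, 50, 60, 70, 10),
  y        = c(40, 50, 30, 20, 60, 30),
  stringsAsFactors = FALSE
)

# Nodes: average location of each player, here taken from where they passed.
nodes <- aggregate(cbind(x, y) ~ passer, data = passes, FUN = mean)

# Edges: number of passes between each pair of players, ignoring direction.
pair <- ifelse(passes$passer < passes$receiver,
               paste(passes$passer, passes$receiver, sep = "-"),
               paste(passes$receiver, passes$passer, sep = "-"))
edges <- as.data.frame(table(pair), stringsAsFactors = FALSE)
names(edges) <- c("pair", "passes")

# Most frequent links first; these would be drawn as the thickest lines.
edges <- edges[order(-edges$passes), ]
```

In a chart, `nodes` would give the point positions and `edges$passes` the line widths, usually after dropping links below some minimum pass count.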

Common football visualizations and statistics: Expected Goals

More detailed and readily available data and a thriving analytics community on Twitter have produced a series of new statistics and visualizations for football fans. In this series I will try to give a brief definition of various metrics, background on the charts and sources for where to find them going forward.

Expected Goals (xG)

Expected goals try to evaluate the quality of chances. Every shot is assigned a value based on the historical success rate (probability of scoring) of similar shots, so the value ranges between 0 and 1. Different properties of the shot can be used to assess its value: its location on the pitch, whether it was a header or a kicked shot, whether it occurred during open play or from a free kick or corner, and whether it followed a long series of passes or a rebound. Advanced models also include defensive pressure, i.e. how defenders are positioned around the shot taker. This metric can be useful in various ways. By comparing expected goals and actual goals for a single player we can see how g
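To make the comparison of expected and actual goals concrete, here is a minimal R sketch. The `shots` table and its per-shot `xg` values are invented for illustration; in practice they would come from an xG model. Summing the shot values gives a player's expected goals, and the difference to goals scored shows over- or underperformance.

```r
# Hypothetical shots: xg is the modelled scoring probability (0 to 1),
# goal is 1 if the shot was scored and 0 otherwise.
shots <- data.frame(
  player = c("P1", "P1", "P1", "P2", "P2"),
  xg     = c(0.25, 0.25, 0.50, 0.125, 0.375),
  goal   = c(0, 1, 1, 0, 0)
)

# Expected goals and actual goals per player are simple sums over shots.
totals <- aggregate(cbind(xg, goal) ~ player, data = shots, FUN = sum)

# Positive values: more goals than the chances suggested; negative: fewer.
totals$diff <- totals$goal - totals$xg
```

With these made-up numbers, P1 scores 2 goals from 1.0 xG, while P2 scores none from 0.5 xG.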

Quantifying Injury Rates

As a Bayern Munich supporter, I have seen injuries play a big part in the past campaigns I have followed. For most of the past season (2017/2018) Manuel Neuer was out with a broken foot. For the second leg against Real Madrid in the Champions League semi-finals the team additionally had to compensate for the injuries of Jerome Boateng, Arturo Vidal, Kingsley Coman and Arjen Robben. While Pep Guardiola was still coach at Bayern, his dispute with the long-serving team doctor Mueller-Wohlfahrt culminated in the doctor resigning after 38 years with the team (Mueller-Wohlfahrt was later reinstated as the team doctor in 2017). The Data There is not much structured and comparable data out there on injuries and time lost due to injuries (at least not that I am aware of). The great website fussballverletzungen.com/ (in German) regularly surveys the type, duration and frequency of all injuries across the Bundesliga. Transfermarkt has a pretty detailed overview of missed matches for each player, but it mak

Fun With Google Trends: Most Controversial Refereeing in the Champions League

Michael Oliver's decision to award a last-minute penalty to Real Madrid against Juventus was a controversial one. The Juventus players protested immediately, after which Gianluigi Buffon was sent off for dissent in what was expected to be the last Champions League game of his career. After Cristiano Ronaldo converted the penalty for a last-minute winner, the discussion moved to social media, resulting in threats to Michael Oliver and his wife over the following weeks. Let's have a look at the most controversial refereeing decisions in the Champions League. As a proxy for how controversial a decision is, we will use the Google search interest for the referee in the month of a Champions League fixture officiated by that referee. Google Trends data starts in 2004, and I looked at the referees with the most CL matches since 2009, which should cover the most influential referees. Some manual data cleaning is necessary as search interest can also be influenced by other factors, e.g. other

Is Possession Data Getting More Extreme?

Possession is probably one of the most frequently cited football statistics. It has gained even more media attention as we have seen a larger divergence in possession across teams. Teams' approaches are often broadly categorized as either possession-oriented or counter-attacking: think of Barcelona, Bayern Munich and Manchester City vs Chelsea, Borussia Dortmund and Liverpool. At first glance the possession statistic seems fairly trivial: who controls the ball for most of the 90 minutes? There are however some details that can have quite an impact on the final number. The simplest approach is to approximate possession with touches. This can however favour teams that try to play short and controlled passes and prefer a slow build-up play. At the other extreme you could try to assess possession for every second of the game, and even interpret an uncontrolled clearance as giving up possession. The OptaPro Blog has written a very good article on this exact topic. In thi
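The touch-based approximation mentioned above boils down to each team's share of all touches. A minimal R sketch, with invented touch counts:

```r
# Hypothetical touch counts for one match (made-up numbers).
touches <- c(Home = 620, Away = 380)

# Possession approximated as each team's share of all touches, in percent.
possession <- round(100 * touches / sum(touches), 1)
```

Here the home side would be credited with 62% possession. A time-based measure could give a different split for the same match, which is why the method behind a quoted number matters.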

[How-To] How to add a custom table to a ggplot chart

For one of my recent posts I needed to add a customized and dynamic table to a ggplot chart. This can often be helpful to display information in addition to the main idea you are presenting in the plot. In my example I wanted to show free kick conversion rates by season in a table format. I found this to be far from straightforward, so I will try to lay out the main concepts here in case anybody has the same problem. To make this as engaging as possible I will use an example based on the diamonds data set, which ships with the ggplot2 package. This data set has information on the size, cut, color, price and some other characteristics of diamonds. You can follow the steps below by copying the code into RStudio or any other R environment.

library(ggplot2)
head(diamonds)
## # A tibble: 6 x 10
## carat cut color clarity depth table price x y z
## <dbl> <ord> <ord> <ord> <dbl> <dbl> <int> <dbl>
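One possible way to place a table inside a ggplot chart, sketched here under assumptions, is to turn a data frame into a graphical object with `tableGrob()` from the gridExtra package and position it with ggplot2's `annotation_custom()`. The choice of gridExtra, the summary shown (median price per cut) and the table position are illustrative and may differ from the approach the post itself takes.

```r
library(ggplot2)
library(gridExtra)  # assumption: tableGrob() is one way to build the table

# Summary to display next to the points: median price per cut.
price_by_cut <- aggregate(price ~ cut, data = diamonds, FUN = median)

# Turn the data frame into a drawable table without row names.
table_grob <- tableGrob(price_by_cut, rows = NULL)

# Place the table in the sparsely populated lower-right corner.
# xmin/xmax/ymin/ymax are given in data coordinates (carat and price).
p <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point(alpha = 0.1) +
  annotation_custom(table_grob, xmin = 3.5, xmax = 5, ymin = 0, ymax = 7500)
```

Printing `p` draws the scatter plot with the table on top. Because the table is built from a data frame, it updates whenever the underlying summary changes.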

Lionel Messi's free kick stats

I got a lot of feedback from people wanting to compare Cristiano Ronaldo's free kick stats with those of Lionel Messi. See them below in direct comparison: Lionel Messi vs Cristiano Ronaldo