Big data doesn't always get us closer to the truth, especially when a fair bit of luck is involved. Many think this applies to football/soccer games (Sally and Anderson, for example, say that soccer results are 50% luck). Yet data analysis provides valuable, sometimes counter-intuitive insights into this beautiful sport and into the science of winning and losing in general.
How many measurable elements of a soccer game contribute to the outcome? The 2014 FIFA World Cup statistics page displays scores calculated with sophisticated motion analysis from thousands of player movements, along with more straightforward measures such as goals scored, short, medium and long passes, pass completion rate, blocked and saved shots, attempts off target, tackles and blocks. And then there are the flops, screams, winces, poundings of the grass and other theatrical elements that may also decide the fate of the game.
Even a quick glance at the FIFA statistics page (refreshed after each game) can bring surprises. On June 24th, the best defending team was Colombia, which advanced to the next round, while the best attacking team, Côte d'Ivoire, and the team with the highest number of successfully completed passes, Spain, did not make it. The leaderboard now features winning teams as the best attacker (France, as of June 25th) and the best passer (Germany, as of June 26th), but obviously neither of these achievements alone is sufficient to predict the winner.
Data from recent seasons of the English Premier League showed that scoring a goal was only half as valuable as not conceding a likely goal. Yet England - #6 on the list of top attackers, #8 on the list of best passers and #11 on the list of best defenders - did not make it into the top 16, while France - #32 as a defender and one of the very best attacking teams - has advanced to the round of 16. Only three of the top ten attacking teams made it through the 1st round, and all three - Argentina, France and Germany - won their respective groups. Compare that with the ten best defending teams, seven of which advanced to the round of 16. Note that all of them finished second, not first, in their groups. So the ability to defend against counter-attacks is definitely crucial to success, but the propensity to attack increases both the risk and the potential return.
A good example of skill in soccer is the elegant passing style of Spanish players dubbed Tiki-taka. This approach, based on speed, unity and a comprehensive understanding of the field geometry, helped Spain win Euro 2008, the 2010 FIFA World Cup and Euro 2012. Network analysis of interactions among the players (Cotta et al., 2013; Peña & Touchette, 2012) highlighted the importance of skillful passes, yet the ability to pass well doesn't always lead to success - as was demonstrated by the Brazilian team during the 2013 FIFA Confederations Cup (they won despite possessing the ball only 47% of the time vs 53% for Spain) and by the Netherlands and Chile, which knocked Spain out in the group stage.
How can we measure luck in soccer separately from skill?
One way would be to forecast game outcomes in terms of probabilities (as a researcher from Wolfram Alpha did for the upcoming round of the 2014 World Cup - see the figure) and then look at the distribution of actual results of games between these teams. Another useful tool, adopted from ice hockey analysis, is PDO - the sum of a team's shooting percentage (the fraction of its shots that result in a goal) and its save percentage (the fraction of opponents' shots that do not). Neither approach was able to pinpoint particularly lucky teams. In an analysis of the 2010/2011 season, good and bad teams appeared to be equally "lucky" or "unlucky".
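To make the PDO definition concrete, here is a minimal sketch in Python. The function name `pdo`, the choice to scale the result to 100, and the sample numbers are all assumptions made for illustration; they are not taken from the FIFA page or from any of the analyses mentioned above.

```python
def pdo(goals_for, shots_for, goals_against, shots_against):
    """Return PDO on a 0-200 scale (hypothetical helper, not a library API).

    shooting percentage = goals scored / shots taken
    save percentage     = 1 - goals conceded / shots faced
    """
    if shots_for <= 0 or shots_against <= 0:
        raise ValueError("both shot counts must be positive")
    shooting_pct = goals_for / shots_for
    save_pct = 1 - goals_against / shots_against
    # Scaling to 100 is an assumed convention; some analysts use 1000 instead.
    return 100 * (shooting_pct + save_pct)


# Made-up numbers: a team that converts 15 of 100 shots
# and concedes 5 goals from the 100 shots it faces.
example = pdo(goals_for=15, shots_for=100, goals_against=5, shots_against=100)
print(round(example, 1))
```

Because every goal scored by one team is a goal conceded by another, the two percentages sum to 1 when pooled over a whole league, so on this scale a value well above or below 100 is usually read as a run of good or bad fortune rather than as lasting skill.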
"I am a great believer in luck," goes a line often attributed to Thomas Jefferson, "and I find the harder I work, the more I have of it."
Perhaps we need to focus on values like the amount of running during the game - as a proxy for hard work? US midfielder Michael Bradley holds the trophy for the 1st round of the World Cup (largest distance covered), but the average distance run by players who advanced to the next round and by those who did not seems to be about the same. However, if we compare players of higher and lower standard, the differences become more dramatic. In one study, top-class players performed 28% more high-intensity running and 58% more sprinting than moderate players (Mohr et al., 2003). Better goalkeepers ran more too, as seen in a recent Italian study (Padulo et al., 2014). So hard work (and the good health to carry on) is very important indeed - at least in order to join the elite soccer club.
The amount of data created every minute for the analysis of soccer games is absolutely amazing (and will keep growing with newer optical tracking tools and high-tech Kinexon balls). To accurately predict the outcome of a game played by almost equally skilled and hardworking teams, one needs to know the minute-to-minute movements of every player. Like the weather, the scores of such games might be hard to forecast past a certain timeframe. Yet better models and more sophisticated computations will yield more accurate results (as shown by Aurametrix for the subtle cause-effect relationships contributing to how you feel on a daily basis).
But for now let's call it luck when we can't see the unseen and predict things before they happen. And let's enjoy top-flight soccer for the next few weeks.
REFERENCES
López Peña, J., & Touchette, H. (2012). A network theory analysis of football strategies. In C. Clanet (ed.), Sports Physics: Proc. 2012 Euromech Physics of Sports Conference, p. 517-528, Éditions de l'École Polytechnique, Palaiseau, 2013. (ISBN 978-2-7302-1615-9) arXiv: 1206.6904v1
Cotta, C., Mora, A., Merelo, J., & Merelo-Molina, C. (2013). A network analysis of the 2010 FIFA world cup champion team play. Journal of Systems Science and Complexity, 26 (1), 21-42. DOI: 10.1007/s11424-013-2291-2
Padulo, J., Haddad, M., Ardigò, L. P., Chamari, K., & Pizzolato, F. (2014). High frequency performance analysis of professional soccer goalkeepers: a pilot study. The Journal of Sports Medicine and Physical Fitness. PMID: 24921614
Mohr, M., Krustrup, P., & Bangsbo, J. (2003). Match performance of high-standard soccer players with special reference to development of fatigue. Journal of Sports Sciences, 21 (7), 519-528.