Saturday, June 28, 2014

On Luck, Skill and Hard Work - in Soccer and Life

Big data doesn't always get us closer to the truth, especially when a fair bit of luck is involved. Many think this applies to football/soccer games (Sally and Anderson, for example, say that soccer results are 50% luck). Yet data analysis provides valuable, sometimes counter-intuitive insights into this beautiful sport and the science of winning and losing in general.

How many measurable elements of a soccer game contribute to the outcome? The 2014 FIFA World Cup statistics page displays scores calculated with sophisticated motion analysis from thousands of player movements, along with more straightforward measures such as goals scored, short, medium and long hits, completion rate of passes, blocked and saved shots, attempts off target, tackles and blocks. There are also flops, screams, winces, poundings of the grass and other theatrical elements that may decide the fate of the game.

Just a quick glance at the FIFA statistics page (refreshed after each game) might bring surprises. On June 24th, the best defending team was Colombia, which advanced to the next round, while the best attacking team, Côte d'Ivoire, and the team with the highest number of successfully completed passes, Spain, were not able to make it. The leader board now features winning teams as the best attacker (France, as of June 25th) and the best passer (Germany, as of June 26th), but obviously neither of these achievements alone is sufficient to predict the winner.

Data from recent seasons of the English Premier League demonstrated that scoring a goal was half as valuable as not conceding a likely goal. Yet England - #6 on the list of top attackers, #8 on the list of best passers and #11 on the list of best defenders - did not make it into the top 16, while France - #32 as a defender and one of the very best attacking teams - has advanced to the round of 16. Only three teams among the top ten attackers made it through the 1st round, and all three - Argentina, France and Germany - are leaders in their respective groups. Compare that with the ten best defending teams, seven of which advanced to the round of 16. Note that all of them finished second, not first, in their groups. So the ability to defend against counter-attack is definitely crucial to success, but the propensity to attack increases both the risk and the potential return.

A good example of skill in soccer is the elegant passing style of Spanish players dubbed Tiki-taka. This approach, based on speed, unity and a comprehensive understanding of the field geometry, helped Spain win Euro 2008, the 2010 FIFA World Cup and Euro 2012. Network analysis of interactions among the players (Cotta et al., 2013; Peña & Touchette, 2012) highlighted the importance of skillful passes, yet the ability to pass well doesn't always lead to success - as was demonstrated by the Brazilian team during the 2013 FIFA Confederations Cup (they won despite possessing the ball only 47% of the time vs 53% for Spain) and by the Netherlands and Chile, who knocked Spain out in the group stage.

How can we measure luck in soccer separate from skill?
One way would be to forecast game outcomes in terms of probabilities (as a researcher from Wolfram Alpha did for the upcoming round of the 2014 World Cup - see the figure), then look at the distribution of actual results of games between these teams. Another useful tool, adopted from ice hockey analysis, is PDO - the sum of a team's shooting percentage (the fraction of its shots that result in a goal) and its save percentage (the fraction of opponents' shots that do not). Neither approach was able to pinpoint particularly lucky teams: in an analysis of the 2010/2011 season, good and bad teams appeared to be equally "lucky" or "unlucky".
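As a rough illustration of the PDO idea, here is a minimal Python sketch. It assumes percentages are computed from shots on target and expressed as fractions, so a value near 1.0 is "average luck"; the `TeamShots` class, its field names and the numbers are all made up for this example and do not come from any real team or from the FIFA statistics page.

```python
from dataclasses import dataclass


@dataclass
class TeamShots:
    """Season totals for one team (hypothetical field names)."""
    goals_for: int
    shots_on_target_for: int
    goals_against: int
    shots_on_target_against: int


def pdo(team: TeamShots) -> float:
    """Shooting percentage plus save percentage, both as fractions.

    Assumes both shot counts are non-zero.
    """
    # Fraction of the team's own shots on target that became goals.
    shooting_pct = team.goals_for / team.shots_on_target_for
    # Fraction of opponents' shots on target that did NOT become goals.
    save_pct = 1 - team.goals_against / team.shots_on_target_against
    return shooting_pct + save_pct


# Made-up numbers, for illustration only.
example = TeamShots(goals_for=60, shots_on_target_for=200,
                    goals_against=40, shots_on_target_against=160)
print(round(pdo(example), 3))
```

Across a whole league, every goal scored is also a goal conceded, so the league-wide PDO is exactly 1.0 by construction. A team sitting well above or below that is usually read as having run hot or cold, which is why PDO is used as a proxy for luck rather than skill.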

"I am a great believer in luck," said Thomas Jefferson, "and I find the harder I work, the more I have of it."

Perhaps we need to focus on values like the amount of running during the game - as a proxy for hard work? US midfielder Michael Bradley holds the trophy for the 1st round of the World Cup (largest distance covered), but the average distance run by players who advanced or did not advance to the next round seems to be about the same. However, if we compare players from higher and lower divisions of national teams, the differences in these distances become more dramatic. In one Danish study, top-class players performed 28% and 58% more high-intensity running and sprinting, respectively, than the moderate players (Mohr et al., 2003). Better goalkeepers ran more too, as seen from a recent Italian study (Padulo et al., 2014). So hard work (and good health to carry on) is very important, indeed. At least in order to join the elite soccer club.

The amount of data created every minute for the analysis of soccer games is absolutely amazing. In order to accurately predict the outcome of a game played by almost equally skilled and hardworking teams, one needs to know the minute-to-minute movements of every player. Like the weather, the scores of such games might be hard to forecast past a certain timeframe. Yet better models and more sophisticated computations will yield more accurate results (as shown by Aurametrix for subtle cause-effect relationships contributing to how you feel on a daily basis).

But for now let's call it luck when we can't see the unseen and predict things before they happen. And let's enjoy top-flight soccer for the next few weeks.

Javier López Peña, & Hugo Touchette (2012). A network theory analysis of football strategies. In C. Clanet (ed.), Sports Physics: Proc. 2012 Euromech Physics of Sports Conference, p. 517-528, Éditions de l'École Polytechnique, Palaiseau, 2013. (ISBN 978-2-7302-1615-9) arXiv: 1206.6904v1

Cotta, C., Mora, A., Merelo, J., & Merelo-Molina, C. (2013). A network analysis of the 2010 FIFA world cup champion team play. Journal of Systems Science and Complexity, 26 (1), 21-42 DOI: 10.1007/s11424-013-2291-2

Padulo J, Haddad M, Ardigò LP, Chamari K, & Pizzolato F (2014). High frequency performance analysis of professional soccer goalkeepers: a pilot study. The Journal of sports medicine and physical fitness PMID: 24921614

Mohr M, Krustrup P, Bangsbo J. (2003) Match performance of high-standard soccer players with special reference to development of fatigue. J Sports Sci. 21(7):519-28.

Sunday, April 13, 2014

The Curse of the Internet

It's hard to imagine our lives without the Internet - either mobile or desktop.

[Figure: Reddit question - "Someone from 1950s appeared today... what's most difficult thing about life to explain to them." Answer: "A device in pocket capable of accessing all information known to man. Use it to look at pictures of cats and argue with strangers."]

The Internet has become a catalyst of innovation, an essential tool in business and social life. It brought new levels of participation and access to knowledge. It enabled new forms of interaction, albeit mostly utilized for entertainment purposes (as in the famous answer of a Reddit user to a now deleted question captured in the figure on the right).

But despite all the advantages and conveniences, does the Internet really serve us or is it the other way around?

Internet companies, large and small, are quietly but forcefully collecting our life's data hoping to have us "on the leash."

If people want to use a web service, the service gets away with almost anything. Google knows about our friendships and the content of our Gmail and Google Voice conversations. They see the places we go or want to go on maps and how we spend time on millions of websites. Amazon knows about our tastes and interests, phone carriers have nearly minute-by-minute accounts of months and years of our lives, and credit card companies are building our psychographic profiles. Target stores can figure out their customers' health conditions before they do... and if you think other companies are better at protecting sensitive information (remember the giant data breach?), think again.

Discovered this week, a major security flaw dubbed "Heartbleed" had existed for over two years. The defect in encryption technology used by many websites and networking equipment makers has put millions of passwords and other sensitive information at risk. It is just another reminder of why you should scrutinize the security of the Internet and other web-connected gadgetry.

Vulnerabilities can be found everywhere. The network of a big oil company was hacked through the online menu of a Chinese restaurant popular with employees. Target was breached through its heating and cooling system. Printers, thermostats, videoconferencing equipment, household items, even vending machines and gas pumps can be used to gain access to your data. And so can employees of the companies collecting data. Last year there were multiple cases when stolen patient identification information was used to file unauthorized income tax returns.

The recently published SANS healthcare cyberthreat report reveals that health care networks (hospitals, insurance carriers, pharmaceutical companies, web sites, software and devices) have been and continue to be compromised by successful cybercriminal attacks. Health networks seem to have the weakest Internet security among sites dealing with sensitive information, often failing to address very basic issues and remaining vulnerable to off-line password guessing and user impersonation attacks.

Trust is especially important in health care. As the days of blind trust that 'doctor knows best' are becoming a distant memory, new cases of security breaches can lower trust further, discouraging the use of digital health services and the disclosure of important medically relevant information.

At present, most digital health products and corporate wellness programs fail both companies and patients. There are many fundamental flaws responsible for that. And the lack of trust is not going to make them any more successful.

Paraphrasing Derek Thompson's passage about Facebook and Amazon, for the Internet of Things for Health and Wellness to succeed, we have to embrace a new version of intimacy that felt natural when the good old-fashioned country doctor made house calls. The machines have to know us. Will we let them?


Pogue D (2014). The curse of the cloud. Scientific American, 310 (2) PMID: 24640327

Wu F, & Xu L (2013). Security analysis and Improvement of a Privacy Authentication Scheme for Telecare Medical Information Systems. Journal of medical systems, 37 (4) PMID: 23818249

The SANS-Norse Healthcare Cyberthreat Report:

Sunday, December 15, 2013

From Cyber Zombiness to Ambient Awareness

Dr. Phlox, the Enterprise surgeon, responded to a comment about movies (aka stories unfolding on the screen) by answering: "Well, we had something similar a few hundred years ago, but they lost their appeal when people discovered their real lives were more interesting."

At this stage of our evolution, virtual characters and screens are taking over our lives. We stare at our devices - smartphones, tablets, laptops, TVs - 12 or more hours a day. We are less fully aware of our surroundings and reality. We have already lost many of our ancient abilities - like the ability to recognize certain smells related to survival and identity (check the color wheel below). We have lost the desire to move fast (see this study about today's kids taking a minute and a half longer to run a mile than kids did in the 1980s, or this piece about marathon runners), and we are losing fine motor skills - more and more children cannot hold a pencil.

The average American spends about 5 minutes less socializing offline than 10 years ago, and does a little less of everything that involves communicating with the real world. Perhaps this is the reason for headlines like "Completely oblivious cellphone users didn't see a gunman in their midst" or "Americans don't trust each other anymore."

Meanwhile, more and more devices are watching everything we do - online and offline. This holiday season, at least a thousand retailers - from large chain stores to small boutiques - will track shoppers' movements in real time using iBeacons, Gimbal sensors and other high-tech innovations. Mobile phones will soon be analyzing our emotions, and aisles in the grocery store and mannequins modeling clothes will be able to scan our faces and offer well-timed in-store commercials and coupons. So will smart TVs and cars. And technology worn by other people - such as Google Glass or the Kapture wristband. And even more technology to watch when we are being watched, so we can smile when on camera. Everything will spy on us and we are gradually becoming used to it, outsourcing self-awareness and self-management to machines.

Will this ever lead to devices bringing real value to human life? And new ways to extend our senses and cognitive abilities, and enrich our daily lives without interfering with them?

Only time will tell.


Colin IM, & Paris I (2013). Glucose meters with built-in automated bolus calculator: gadget or real value for insulin-treated diabetic patients? Diabetes therapy : research, treatment and education of diabetes and related disorders, 4 (1), 1-11 PMID: 23250633

Scott Wallsten (2013). What Are We Not Doing When We're Online The National Bureau of Economic Research DOI: 10.2139/ssrn.1966654

Shoham, A. and Pesämaa, O. (2013), Gadget Loving: A Test of an Integrative Model. Psychol. Mark., 30: 247–262. doi: 10.1002/mar.20602
