Compromised numbers: Why the statistic you see may not be actual possession

One of the amazing statistics to come out of last Wednesday’s UEFA Champions League match was the possession number. Barcelona was reported by UEFA as having held the ball 72 percent of the time, an amazing figure against a club of Chelsea’s caliber. For those who have tried to find significance in correlations between possession and victories, the number must have been both remarkable and beguiling. After all, Barcelona lost, giving more credence to the hypothesis’ main qualm: What if one team doesn’t care about holding the ball?

The next day, the possession story got even more confusing. Supreme stat overlords Opta reported that Chelsea had only managed 20 percent of the ball. What? Even less time in possession? How freakish is this data point going to get?

That, however, is not the story. At least, it’s not the story in light of what Graham MacAree notes at Chelsea fan site We Ain’t Got No History. As he’s found out, Opta seems to be miscalculating possession; or, better put, Opta is not reporting a number consistent with the normal expectation for a possession stat.

The normal expectation: When one team has the ball, they’re in possession. I think we can all agree on this, right? This still leaves a lot of gray area. For example, who gets credit for possession when midfield chaos leaves neither side in control? Does one team get possession on a goal kick, when most goal kicks lead to 50-50 midfield challenges? And more broadly, what happens when play is dead but the game clock is running?

I’ve always assumed this is like a chess clock. When one team controls the ball, you hit a button that sends their dials turning. When the other fully regains possession, you hit a button. One clock stops. The other starts running. Those in-between moments? They’re governed by one rule: Until possession changes, don’t touch anything.

That, apparently, has nothing to do with Opta’s calculations. In fact, Graham’s research suggests Opta doesn’t even run a clock, which may be why they never report possession in terms of time. Instead, the relation between reported possession and total passes suggests Opta just uses passes. As Graham found out, if you take a team’s pass attempts and divide them by the game’s total attempted passes, you have Opta’s possession stat.

What does this mean? Let’s take a totally fake scenario. Barcelona plays three quick passes before trying a through ball that rolls to Petr Cech. It all takes four seconds, while Petr Cech keeps the ball at his feet for eight seconds before picking it up, holding it for five seconds, then putting it out for a throw in, which takes eight more seconds to put back into play.

Despite Barcelona having possession for only four of those 25 fake seconds, they’d have 80 percent of Opta’s possession (three good passes plus one bad, while Chelsea had only Cech’s unsuccessful pass). A logical expectation of a zero-sum possession figure would have that as either 16 percent or (if you credit the time out of play as Barça’s, since they’d have the ensuing throw) 48 percent Barcelona’s. Or, if you do a three-stage model (that’s sometimes reported in Serie A matches), you’d have 16 percent Barcelona, 52 percent Chelsea, and 32 percent limbo/irrelevant.
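For anyone who wants to check the arithmetic on that fake sequence, here is a minimal Python sketch. The event log, the function names and the `dead_time_to` switch are all made up for illustration; this is not Opta’s or UEFA’s actual method, just the pass-share and clock-based calculations as described above.

```python
# Hypothetical event log for the made-up 25-second sequence above.
# Each entry: (who has the ball, seconds elapsed, pass attempts in that spell).
EVENTS = [
    ("Barcelona", 4, 4),  # three completed passes plus the failed through ball
    ("Chelsea", 8, 0),    # Cech with the ball at his feet
    ("Chelsea", 5, 1),    # Cech holding it, then putting it out for a throw-in
    ("dead", 8, 0),       # ball out of play before Barcelona's throw-in
]


def pass_share(events, team):
    """Team's pass attempts as a percentage of all pass attempts."""
    total = sum(passes for _, _, passes in events)
    own = sum(passes for who, _, passes in events if who == team)
    return 100 * own / total


def time_share(events, team, dead_time_to=None):
    """Team's share of the clock; dead time is optionally credited to one side."""
    total = sum(seconds for _, seconds, _ in events)
    own = sum(
        seconds
        for who, seconds, _ in events
        if who == team or (who == "dead" and dead_time_to == team)
    )
    return 100 * own / total


print(pass_share(EVENTS, "Barcelona"))                            # 80.0
print(time_share(EVENTS, "Barcelona", dead_time_to="Chelsea"))    # 16.0
print(time_share(EVENTS, "Barcelona", dead_time_to="Barcelona"))  # 48.0
print(time_share(EVENTS, "Chelsea"))                              # 52.0
print(time_share(EVENTS, "dead"))                                 # 32.0
```

The first line is the pass-based figure, the next two are the zero-sum clock readings, and the last two are the Chelsea and limbo shares in the three-stage model.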

Of the three methods of reporting possession, Opta’s bears the least resemblance to reality; or, it’s the one that deviates furthest from what we expect from a possession stat.

Ironies being a thing these days, there are two here. First, Opta is the unquestioned leader in soccer data management. How could this happen?

Second, Opta isn’t trying to hide their methods. In fact, they’ve published a post on their site detailing not only their practices but their motivations and research, an investigation that found their approach “came up with exactly the same figures (as time-based methods) on almost every occasion.”

You would think two curmudgeons like Graham and myself would have found this, right? Graham had a reader point it out to him, while a representative from Opta magnanimously pointed me to the piece without the seemingly necessary indignation of explaining how a Google search works. After all Graham’s work and head-scratching – after my lack of work and similar head-scratching – we could have just gone to Opta’s site.

“We try to be as transparent as possible with this stuff,” Opta said when I asked them about it. Certainly, they should be commended for being so up front about their methods. After all, they’re a business that makes money off their work. They don’t need to give away their secrets.

But that’s a secondary issue. The main one: Why is a data house like Opta, reputed as the industry standard, taking this shortcut? Or, why haven’t they renamed their measure? Granted, the perception that it is a shortcut may have more to do with our expectations than their intent, though based on their defense in the post, it’s clear they do see this as an accurate way of describing possession.

Still, the number they publish is completely redundant to the raw passing numbers also distributed. Why put the measure out at all if not to check a “possession stat” box on a list of deliverables?

Opta’s possession stat shouldn’t be cited in reporting, and if it is, the word “possession” shouldn’t be used to describe it. Reader expectations for anything labeled “possession” are drastically different than what Opta’s producing. The number is confusing to the point of being misleading. It’s becoming counter-information because of its poor packaging.

Even though Opta’s post on the topic is 14 months old, most will be surprised to hear this “news.” It’s disconcerting for anybody who is hoping a SABR-esque revolution’s on the horizon. Almost all of the huge volume of data to which we have access has been useful, but where people are expecting something akin to linear weights to be published tomorrow, we can’t even agree on the terms (let alone the significance of them).

Graham probably puts it better:

I’m completely fine with keeping track of passing volume – I’ve done it before myself. What’s frustrating, from an analyst’s point of view, is that we’re being sold a dud. A statistic that ostensibly measures possession measures something that is not possession, and gets repeated as authoritative anyway.

And people wonder why football statistics don’t get taken very seriously.

Sky Blue’s Sam Kerr named NWSL league MVP for 2017

Photo by Tony Feder/Getty Images

CHICAGO (AP) Sky Blue FC forward Sam Kerr has been named the National Women’s Soccer League’s Most Valuable Player for this season.

Kerr, a standout on the Australian national team, had a league-record 17 goals and became the league’s fifth Golden Boot winner to also be named MVP.

She became the first NWSL player to score four goals in a game on Aug. 19 against the Seattle Reign. Kerr rallied Sky Blue from a three-goal deficit to beat the Reign 5-4.

Kerr’s award was announced Friday. The league had earlier announced the season’s other award winners: North Carolina’s Abby Dahlkemper was named Defender of the Year, teammate Ashley Hatch was named Rookie of the Year, Portland’s Adrianna Franch was named Goalkeeper of the Year and the Thorns’ Mark Parsons was named Coach of the Year.

The Mendy Effect: Pep praises injured back

Photo by Laurence Griffiths/Getty Images

Full back Benjamin Mendy cost Pep Guardiola and Manchester City $68 million, but perhaps he’ll be just as valuable as a very expensive sports psychologist.

The “Shark Team” member was amusing on social media even before his ACL injury sent him to the sidelines until at least April, but he’s become a must-follow Twitter fixture with his in-game messages (See some of his work below).

City faces Burnley at 10 a.m. ET Saturday (Watch live on CNBC and online via NBCSports.com).

It’s fairly clear his act has translated in private, too, as apparently Mendy is just as good in group messages (Let us in, Pep. We won’t tell anyone). From ManCity.com:

“Usually, players who are out for a long time with injury are sad. They sometimes train apart and feel isolated.

“Mendy decided to be present. He is communicating on social media, WhatsApp and he calls his teammates and messages me. He is going to be so important outside the pitch because people like him make the atmosphere much better.”

It’s not surprising for anyone who’s been following the former Le Havre, Marseille and Monaco man.

Keep in mind, these Tweets below are from the last few days alone!

Ozil to Manchester United?!? Wenger reacts to gossip

Photo by Shaun Botterill/Getty Images

Sometimes, even the biggest Arsene Wenger detractors have to feel for the guy.

Coming off a thrilling late win in Serbia, one that saw Olivier Giroud cap off a team goal straight out of the creative Wenger playbook, the manager should’ve been discussing how to stretch those good vibes into this weekend’s visit to Goodison Park.

Yet no. Instead of talking about how the Gunners would respect the struggling Toffees, Wenger had to address speculation from several outlets claiming Mesut Ozil would move to Manchester United, perhaps as soon as January.

Feeling it? No, no, Wenger was not feeling it. From Arsenal.com:

“We have to deal with all kinds of speculation when the players are at the end of their contracts. On the other hand, to be professional is to give 100 per cent as long as you are somewhere. For the rest, we came out many times and said that’s the situation. It [the media] can come out tomorrow and say that he extends his contract here. It will be exactly the same, it will not change anything. When you play the next game, commit 100 per cent. … When a player plays for Arsenal Football Club, his commitment cannot be linked with the length of his contract, it has just to be linked with the responsibility and the ambition he has to win the football game.”

Of course, most big clubs have to deal with such drama on a year-to-year basis and, yes, having Ozil and Alexis Sanchez still in town was completely avoidable. But the idea that Ozil could leave, for free, to Manchester United? We’re sure Gooners the world over will be thrilled with the gossip.

MLS Decision Day preview: Much at stake

Jonathan Hayward/The Canadian Press via AP

Four teams can claim a Western Conference second round berth, while four more can earn a valuable first-round bye in the East.

Yep, there’s plenty to play for beyond the West’s final playoff spot Sunday during Major League Soccer’s Decision Day, when every team will take the pitch for 4 p.m. ET kickoffs.

Here’s what we do know regarding the playoffs:

  1. Supporters’ Shield-winning Toronto FC gets a first-round bye, while No. 6 seed New York Red Bulls are headed on the road for a first-round playoff
  2. New England, Montreal, Philadelphia, Orlando, DC, Minnesota, Colorado, and LA will not make the playoffs
  3. Full stop.

So, yes, this will be fun.

First, let’s look at the Eastern Conference Standings ahead of Sunday’s extravaganza:

Eastern Conference
Team                  GP   W   D   L  GF  GA   GD  Home    Away    PTS
Toronto FC            33  20   8   5  72  35   37  13-3-1  7-5-4    68
x – New York City FC  33  16   8   9  54  41   13  10-4-2  6-4-7    56
x – Chicago           33  16   7  10  61  44   17  12-3-2  4-4-8    55
x – Atlanta           33  15   9   9  68  38   30  11-2-3  4-7-6    54
x – Columbus          33  16   5  12  51  47    4  12-2-3  4-3-9    53
x – New York          33  13   8  12  51  46    5  9-6-2   4-2-10   47

— New York City FC controls its bye destiny, though Columbus could join them on 56 points and would pass them on tiebreakers (wins).

— If that happens, Chicago could claim the second bye with a win or draw in Houston (The Fire owns the goal differential tiebreaker).

— If New York City and Chicago lose or draw, Atlanta could finish second with a home win over TFC.

— Columbus can finish second with a win and non-wins for Chicago and Atlanta.

Western Conference
Team                  GP   W   D   L  GF  GA   GD  Home    Away    PTS
x – Vancouver         33  15   7  11  49  47    2  9-5-3   6-2-8    52
x – Portland          33  14   8  11  58  49    9  10-4-2  4-4-9    50
x – Seattle           33  13  11   9  49  39   10  10-5-1  3-6-8    50
x – Sporting KC       33  12  13   8  39  27   12  10-6-1  2-7-7    49
x – Houston           33  12  11  10  54  45    9  11-4-1  1-7-9    47
San Jose              33  12   7  14  36  58  -22  9-5-2   3-2-12   43
FC Dallas             33  10  13  10  43  47   -4  7-7-2   3-6-8    43
Real Salt Lake        33  12   6  15  47  54   -7  8-4-4   4-2-11   42

Byes

— Vancouver finishes first with a win or draw at Portland. The ‘Caps could finish as low as third with a loss to Portland and a Seattle win versus Colorado.

— Portland finishes first — and wins the Cascadia Cup — with a win over visiting Vancouver.

— Seattle can claim a first round bye with a win over visiting Colorado and a Vancouver win over Portland.

— Sporting KC can finish second with a win at Real Salt Lake and non-wins for Portland and Seattle.

Final playoff spot

— San Jose claims the sixth seed with a home win over Minnesota. They can also finish sixth with a draw joined by non-wins for FC Dallas at home to LA and Real Salt Lake at home versus SKC.

— FC Dallas claims the sixth seed with a win over LA and a San Jose draw or loss versus Minnesota. FCD gets sixth with a draw, and a San Jose loss coupled with a RSL loss or draw versus SKC.

—  Real Salt Lake gets sixth with a win over SKC, and non-wins from San Jose and Dallas. RSL could also get sixth with a draw and losses for San Jose and Dallas.
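For the curious, the race for sixth can be brute-forced. The sketch below runs all 27 combinations of results for San Jose, FC Dallas and Real Salt Lake, using the points, wins and goal difference from the table above. It rests on two assumptions that are not official: ties are broken by points, then wins, then goal difference (the order the scenarios above imply), and every win or loss is by exactly one goal. The names `STANDINGS`, `RESULT` and `sixth_seed` are made up for illustration.

```python
from collections import Counter
from itertools import product

# (points, wins, goal difference) before Sunday, from the table above.
STANDINGS = {
    "San Jose": (43, 12, -22),
    "FC Dallas": (43, 10, -4),
    "Real Salt Lake": (42, 12, -7),
}

# Change in (points, wins, goal difference) for each result.
# Assumption: every win or loss is by exactly one goal.
RESULT = {"W": (3, 1, 1), "D": (1, 0, 0), "L": (0, 0, -1)}


def sixth_seed(results):
    """Return the team that finishes highest of the three chasers.

    Assumed tiebreak order: points, then wins, then goal difference.
    """
    final = {}
    for team, result in results.items():
        points, wins, goal_diff = STANDINGS[team]
        d_points, d_wins, d_goal_diff = RESULT[result]
        final[team] = (points + d_points, wins + d_wins, goal_diff + d_goal_diff)
    # Tuples compare left to right, which matches the assumed tiebreak order.
    return max(final, key=final.get)


# Count how many of the 27 result combinations send each team through.
tally = Counter()
for outcome in product("WDL", repeat=3):
    tally[sixth_seed(dict(zip(STANDINGS, outcome)))] += 1

for team, count in tally.most_common():
    print(f"{team}: sixth in {count} of 27 combinations")
```

The counts treat every combination as equally likely, so they show how many paths each team has, not the odds of taking one.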

(Photo by Peter G. Aiken/Getty Images)

Schedule
FC Dallas vs. LA Galaxy
DC United vs. New York Red Bulls
San Jose vs. Minnesota United
Real Salt Lake vs. Sporting KC
Houston vs. Chicago
Seattle vs. Colorado
Philadelphia vs. Orlando City
Portland vs. Vancouver
Montreal vs. New England
New York City vs. Columbus
Atlanta vs. Toronto

Predictions

— NYCFC hangs on for a draw against Columbus, earning a bye, leaving Chicago to host New York Red Bulls and Atlanta off to Columbus for the first round of the playoffs (We have Toronto beating Atlanta on Sunday).

— Vancouver and Portland draw, while Seattle beats Colorado. The ‘Caps and Sounders get byes, while Seattle takes back the Cascadia Cup.

— San Jose beats Minnesota, gaining the West’s sixth seed. The Quakes head to Portland for the first round, while SKC hosts Houston.