Compromised numbers: Why the statistic you see may not be actual possession


One of the amazing statistics to come out of last Wednesday’s UEFA Champions League match was the possession number. Barcelona was reported by UEFA as having held the ball 72 percent of the time, an amazing figure against a club of Chelsea’s caliber. For those who have tried to find significance in correlations between possession and victories, the number must have been both remarkable and beguiling. After all, Barcelona lost, giving more credence to the hypothesis’ main qualm: What if one team doesn’t care about holding the ball?

The next day, the possession story got even more confusing. Supreme stat overlords Opta reported that Chelsea had only managed 20 percent of the ball. What? Even less time in possession? How freakish is this data point going to get?

That, however, is not the story. At least, it’s not the story in light of what Graham MacAree notes at Chelsea fan site We Ain’t Got No History. As he’s found out, Opta seems to be miscalculating possession; or, better put, Opta is not reporting a number consistent with the normal expectation for a possession stat.

The normal expectation: When one team has the ball, they’re in possession. I think we can all agree on this, right? This still leaves a lot of gray area. For example, who gets credit for possession when midfield chaos leaves neither side in control? Does one team get possession on a goal kick, when most goal kicks lead to 50-50 midfield challenges? And more broadly, what happens when play is dead but the game clock is running?

I’ve always assumed this is like a chess clock. When one team controls the ball, you hit a button that sends their dials turning. When the other fully regains possession, you hit a button. One clock stops. The other starts running. Those in-between moments? They’re governed by one rule: Until possession changes, don’t touch anything.
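For the curious, here is a rough Python sketch of that chess-clock idea. It is only an illustration of the assumption described above, not anyone's actual method: the function name `chess_clock`, the event format, and the team labels and timestamps are all made up for the example.

```python
def chess_clock(events, final_whistle):
    """Total seconds on each team's clock.

    events: list of (second, team) pairs, in time order, each marking the
    moment a team takes full control of the ball (hypothetical format).
    Between events nothing is touched, so all in-between time stays on the
    clock of whichever team last had control.
    """
    totals = {}
    # Pair each event with the next one; the last runs to the final whistle.
    ends = [second for second, _ in events[1:]] + [final_whistle]
    for (start, team), end in zip(events, ends):
        totals[team] = totals.get(team, 0) + (end - start)
    return totals


# Made-up 100-second stretch: A has it, B wins it at 0:30, A wins it back at 1:15.
events = [(0, "A"), (30, "B"), (75, "A")]
clock = chess_clock(events, final_whistle=100)
print(clock)  # {'A': 55, 'B': 45}
```

Because one clock is always running, the two totals always add up to the full elapsed time, which is what makes this a zero-sum number.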

That, apparently, has nothing to do with Opta’s calculations. In fact, Graham’s research suggests Opta doesn’t even run a clock, which may be why they never report possession in terms of time. Instead, the relation between reported possession and total passes suggests Opta just uses passes. As Graham found out, if you take a team’s pass attempts and divide them by the game’s total attempted passes, you have Opta’s possession stat.
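Written out, the calculation Graham describes is a single division. The sketch below is my own rendering of it in Python; the function name `pass_share` and the pass counts are invented for illustration and are not taken from Opta or from any real match.

```python
def pass_share(team_attempts, opponent_attempts):
    """Percentage of all attempted passes in the game made by one team."""
    total = team_attempts + opponent_attempts
    if total == 0:
        raise ValueError("no passes attempted, share is undefined")
    return 100 * team_attempts / total


# Invented numbers: one side attempts 600 passes, the other 200.
share = pass_share(600, 200)
print(share)  # 75.0
```

Note that no clock appears anywhere in it. Completed or not, every attempted pass counts the same, and time on the ball counts for nothing.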

What does this mean? Let’s take a totally fake scenario. Barcelona plays three quick passes before trying a through ball that rolls to Petr Cech. It all takes four seconds, while Petr Cech keeps the ball at his feet for eight seconds before picking it up, holding it for five seconds, then putting it out for a throw-in, which takes eight more seconds to put back into play.

Despite Barcelona having possession for only four of those 25 fake seconds, they’d have 80 percent of Opta’s possession (three good passes plus one bad, while Chelsea had only Cech’s unsuccessful pass). A logical expectation of a zero-sum possession figure would have that as either 16 percent or (if you credit the time out of play as Barça’s, since they’d have the ensuing throw) 48 percent Barcelona’s. Or, if you do a three-stage model (that’s sometimes reported in Serie A matches), you’d have 16 percent Barcelona, 52 percent Chelsea, and 32 percent limbo/irrelevant.
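If you want to check the arithmetic, here is the fake scenario as a few lines of Python. The durations and pass counts come straight from the example above; the variable names and the `out_of_play` label are just my own bookkeeping.

```python
# (who controls the ball, seconds), in order, from the fake scenario
segments = [
    ("Barcelona", 4),    # three quick passes plus the through ball
    ("Chelsea", 8),      # Cech with the ball at his feet
    ("Chelsea", 5),      # Cech holding the ball
    ("out_of_play", 8),  # waiting on the throw-in
]
passes = {"Barcelona": 4, "Chelsea": 1}  # attempted passes

total = sum(sec for _, sec in segments)  # 25 seconds


def seconds(side):
    """Seconds credited to one side (or to out_of_play)."""
    return sum(sec for who, sec in segments if who == side)


# Pass-based figure: Barcelona's share of attempted passes.
pass_based = 100 * passes["Barcelona"] / sum(passes.values())

# Time-based, dead time left on Chelsea's clock.
time_based = 100 * seconds("Barcelona") / total

# Time-based, dead time credited to Barcelona (they have the throw).
time_based_with_throw = 100 * (seconds("Barcelona") + seconds("out_of_play")) / total

# Three-stage model: each side's time plus a separate limbo bucket.
three_stage = {
    side: 100 * seconds(side) / total
    for side in ("Barcelona", "Chelsea", "out_of_play")
}

print(pass_based, time_based, time_based_with_throw)  # 80.0 16.0 48.0
print(three_stage)  # {'Barcelona': 16.0, 'Chelsea': 52.0, 'out_of_play': 32.0}
```

Same 25 seconds, four different answers for Barcelona, and the pass-based one sits furthest from any of the clock-based figures.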

Of the three methods of reporting possession, Opta’s bears the least resemblance to reality; or, it’s the one that deviates furthest from what we expect from a possession stat.

Ironies being a thing these days, there are two here. First, Opta is the unquestioned leader in soccer data management. How could this happen?

Second, Opta isn’t trying to hide their methods. In fact, they’ve published a post on their site detailing not only their practices but their motivations and research, an investigation that found their approach “came up with exactly the same figures (as time-based methods) on almost every occasion.”

You would think two curmudgeons like Graham and myself would have found this, right? Graham had a reader point it out to him, while a representative from Opta magnanimously pointed me to the piece without the seemingly necessary indignation of explaining how a Google search works. After all Graham’s work and head-scratching – after my lack of work and similar head-scratching – we could have just gone to Opta’s site.

“We try to be as transparent as possible with this stuff,” Opta said when I asked them about it. Certainly, they should be commended for being so up front about their methods. After all, they’re a business that makes money off their work. They don’t need to give away their secrets.

But that’s a secondary issue. The main one: Why is a data house like Opta, reputed as the industry standard, taking this shortcut? Or, why haven’t they renamed their measure? Granted, the perception that it is a shortcut may have more to do with our expectations than their intent, though based on their defense in the post, it’s clear they do see this as an accurate way of describing possession.

Still, the number they publish is completely redundant to the raw passing numbers also distributed. Why put the measure out at all if not to check a “possession stat” box on a list of deliverables?

Opta’s possession stat shouldn’t be cited in reporting, and if it is, the word “possession” shouldn’t be used to describe it. Reader expectations for anything labeled “possession” are drastically different than what Opta’s producing. The number is confusing to the point of being misleading. It’s becoming counter-information because of its poor packaging.

Even though Opta’s post on the topic is 14 months old, most will be surprised to hear this “news.” It’s disconcerting for anybody who is hoping a SABR-esque revolution’s on the horizon. Almost all of the huge volume of data to which we have access has been useful, but where people are expecting something akin to linear weights to be published tomorrow, we can’t even agree on the terms (let alone the significance of them).

Graham probably puts it better:

I’m completely fine with keeping track of passing volume – I’ve done it before myself. What’s frustrating, from an analyst’s point of view, is that we’re being sold a dud. A statistic that ostensibly measures possession measures something that is not possession, and gets repeated as authoritative anyway.

And people wonder why football statistics don’t get taken very seriously.

Bravo fit again, but will he start Chile’s Confed Cup group finale?

Photo credit should read YURI KADOBNOV/AFP/Getty Images

MOSCOW (AP) Claudio Bravo is fit again and could start in goal against Australia at the Confederations Cup on Sunday, Chile coach Juan Antonio Pizzi said Saturday.

Bravo – who is Chile’s joint most-capped player with Alexis Sanchez – hasn’t played since April 27, when he injured his calf for Manchester City in a derby game with Manchester United.

“Claudio is fit, he’s managed to train the last couple of days just like his other teammates,” Pizzi said. “He’s ready and available to play.”

Pizzi brushed off concerns about a lack of match fitness, saying that “quite obviously we take into account that factor” but players like Bravo “are of such good quality that it isn’t that important they haven’t played in the last couple of months.”

Stand-in Johnny Herrera played in Chile’s 2-0 group stage win over Cameroon and Thursday’s 1-1 draw with Germany.

Gary Medel was substituted with a minor injury while playing in defense for Chile against Germany. Teammate Francisco Silva said Saturday that Medel had complained of “a very small muscle contraction issue” but was now fit.

Pizzi said he will aim to tire out Australia with Chile’s trademark all-action style, even though his team struggled for energy in the latter stages against Germany.

“This energy drop we had in the second half didn’t damage us too much because the opposing team couldn’t maintain a high pace because of the demands we’d imposed on them,” Pizzi said of the Germany game.

“We’re going to try to get (the Australians) tired as well and use this to beat our opponent, and we hope this is going to translate into goals.”

USMNT’s Wood extends Hamburg contract through 2021

Photo credit: Hamburg / Twitter: @HSV

HAMBURG, Germany (AP) American forward Bobby Wood extended his contract with Bundesliga side Hamburger SV on Saturday to 2021.

Wood, who joined Hamburg from second-tier Union Berlin, scored five goals and set up two more in 28 Bundesliga games. He had 17 goals in 31 second-division games for Union the season before.

“Not only his goals count for us, but his readiness to run and challenge,” Hamburg sporting director Jens Todt said. “Bobby is a key player for our offense and a real team player.”

Wood has eight goals in 32 appearances for the United States.

MLS Snapshot: NYCFC run rampant on Red Bulls, win 2-0

Photo credit: NYCFC

The game in 100 words (or less): The only thing standing in the way of New York City FC avenging their infamous 7-0 loss to the New York Red Bulls with a lopsided demolition job of their own, on Saturday, was an otherworldly goalkeeping performance from Luis Robles. It was the Red Bulls shot-stopper, with his four saves on the afternoon (three of them coming in spectacular fashion), who kept Jesse Marsch’s side within touching distance for more than an hour. Jack Harrison was denied early on by Robles, but got the better of him not long after for the game’s opening goal. Heroics from Robles kept the score at 1-0 for another 32 minutes, before Ben Sweat’s (accidental?) header made it 2-0 in the 65th minute. The Red Bulls, on the other hand, managed their first shot on target in the 80th minute. That’s three wins in a row for NYCFC, who go seven points clear of their Hudson River rivals and keep Toronto FC in sight at the top of the league table, five points ahead.


Three moments that mattered

18′ — Robles goes full-stretch to deny Harrison — David Villa’s vision and Rodney Wallace’s hold-up play created the chance for Harrison, but Luis Robles’ acrobatics denied the 20-year-old Englishman in spectacular fashion.

33′ — Harrison not to be denied this time — Sweat delivered the ball to Harrison near the top of the box, and the second-year man did everything right with what’s a really, really difficult chance to take — facing away from goal, first-time, ball traveling across the goalkeeper, upper-90 to the far post.

65′ — Sweat loops a header past Robles for 2-0 — Sweat probably didn’t mean it, but the ball hit the back of the net, and that’s all that matters. Not a bad time to score your first MLS goal, either.


Man of the match: Luis Robles

Goalscorers: Harrison (33′), Sweat (65′)

Watford signs Will Hughes from Derby County

Getty Images

Watford has completed the capture of 22-year-old central midfielder Will Hughes, a fantastic transfer for one of England’s younger talents.

Hughes, despite his young age, racked up 189 appearances for Derby County (even while missing significant time in 2015 with an ACL tear) and now gets his first shot at the Premier League, and with it potentially a chance to push his way into the England fold. Hughes has been a staple of the England youth system, making 22 appearances for the country’s U-21 side, but has yet to feature for the senior team.

The fee for the transfer was undisclosed but reports have tabbed the amount at around $10 million.

Hughes came close to making the Premier League with Derby County on multiple occasions, reaching the Championship playoffs in both 2014 and 2016. Now, he’ll battle the likes of Valon Behrami, Tom Cleverley, Etienne Capoue and Abdoulaye Doucoure for a spot in Watford’s midfield.

The club release confirmed that Hughes has not yet completed his medical, and will do so when he returns to the U.K. from competing in the U-21 European Championships in Poland.