
It’s perhaps a testament to Sloan’s waning influence on the sports analytics scene as a whole that so few articles came out of the annual Boston conference this year, but Tom Worville’s roundup got me thinking.
It’s worth reading in full, but towards the end, Worville speculates on a missing link in data science in football at the moment:
Finally, I foresee there being a shortage of what I’d call ‘translators’ in the industry. There are probably only a handful of people right now who have a job like this, which involves designing and implementing a strategy around the use of data within a club and has a relevant tactical understanding of the game. This person is responsible for choosing what technology, data and products are used, the personnel who are hired and is adept in understanding the needs across the academy, recruitment and analysis departments. They’re not (always) doing the more technical work, but they have a complete understanding of it. As teams begin to hire more technical staff, having someone non-technical to pull it all together seems like a vital step to getting buy-in and actually feeding into decision making processes.
This generated some Twitter debate about whether what Worville is describing is more appropriately a skill than a position, and I’m inclined to agree: there is no reason for clubs to create an additional job, but they might consider hiring technical directors with a strong understanding of data science. It’s not like these people don’t exist; Grétar Steinsson, Everton’s chief European scout, comes to mind, as does someone like Tim Bezbatchenko, now with the Columbus Crew.
But I think there is also room for the kind of analyst described in this piece out of Sloan, a ‘people’ analyst, someone who builds trust by going beyond database work and getting to know people throughout the organization.
The author, Cade Massey, uses Houston Astros analyst Sig Mejdal as an example.
How does Mejdal spend his time? In the summer of 2017 he was a coach in Troy, New York, deep in the Astros minor league system. This 51-year-old was wearing a uniform, coaching first base, warming up players, and eating with the team after games. The top analyst in the organization spent his summer evenings riding the team bus between small towns in upstate New York!
The Astros are considered a model for blending analytics with traditional expertise. They took this unusual approach with Mejdal because of their commitment to embedding analytics in the organizational DNA. They wanted to break down the barriers that typically exist between those who think in regressions and those who can hit 95-mile-per-hour fastballs. They wanted to create opportunities for players and coaches to ask “the analyst” questions and for the analyst to ask questions of them.
Again, I would argue these people already exist in the football sphere but fly mostly under the radar. Having met many analysts, I also think they would legitimately claim that the time-consuming work of compiling and cleaning data, along with some of the stringent cultural norms of the more conservative clubs, would make this kind of approach to the role difficult, if not impossible.
But I think it’s worth it. However, I also think trust comes from more than mere glad-handing or possessing an intimate knowledge of the game.
To offer an example, Statsbomb has produced a couple of interesting articles on Manchester United under Ole Gunnar Solskjaer, and on how the club is likely over-performing at the moment based on expected goals. The first, from Mike Goodman, makes the basic argument that yes, Man United are better under Solskjaer, but they are also luckier.
There’s nothing particularly ambiguous about what’s going on with United. They have both improved and gone on a hot streak at the same time. And here it’s important to talk about what exactly it means to get hot. From an xG perspective it simply means whatever is causing the team to score more and concede less is not accounted for in the model. Because the model works, and we know that teams over time largely converge to where the xG model predicts they will be, whatever is causing United’s hot run of form, is likely to be temporary, even as the underlying improvement proves more durable.
A few days later, following United’s 2-0 loss to Arsenal, James Yorke added this, based on the club’s xG performance this season:
Man Utd should give Solskjaer the job, but they should also give it him with a realistic expectation of what he can achieve in a given timeframe. They have a large upwards gap to traverse and it will take them time even if they get everything right. As one of the world’s wealthiest clubs, they should have a long term strategy to achieve those goals. Now, who is going to decide which players they buy this summer?
These are important questions, and this isn’t the first time that Man United have provided an interesting case study in the power of understanding underlying numbers: the club’s final season under Sir Alex Ferguson produced a lot of articles on United’s failure to adequately fortify the squad ahead of his departure.
However, from United’s perspective, the knowledge that they are ‘over-performing’ based on expected goals isn’t inherently interesting. Every year, many clubs over- or underperform their way to Champions League berths or relegation scares. This is part of the game. The real question is what United should do after this season to maintain or improve their current form.
For that, the analysis needs to get far more granular. To what degree can we safely ascribe United’s actual improvement to Solskjaer’s influence? What specifically are the weaknesses in the squad that need to be addressed to tick up xG for and lower xG against in future seasons? What are some specific, realistic transfer targets that will make the most impact in these areas? And what should United’s approach be in the transfer market should their first, second or third options fall through? At what point is adding a player no longer worth it as a way to stabilize underlying performance metrics?
I’m not insinuating that Statsbomb’s writers and analysts are unaware of these problems or lack strong, evidence-based opinions about them with regard to Man United in particular. They are in business for this sort of thing and are smart to keep it to themselves.
I just think the perception that analysts are naive about the dirty business of running a football club, and about the brutal, messy, expensive and chaotic (and exploitative) business of securing football transfers, still lingers throughout the sport, in part because analysts tend to gloss over this sort of thing when they present their models.
I think to earn that trust, a lot of analysts building models and making decent predictions about team form should consider speaking to these less-heralded, messier realities, the kind that you sometimes don’t know about until you’ve worked at a club.
The good news is that many (most?) analysts writing publicly today have now worked at clubs in some capacity. I just think they should start to go beyond recruitment models that rank players, or assessments that give rough estimates of expected table finish, and offer something like, say, an algorithmic approach to recruitment that takes into account how football transfers ACTUALLY work, something roughly akin to the 37% rule, maybe.
And they should make this work public, where possible.
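For readers who haven’t met the 37% rule, here is a minimal sketch of the idea in Python. It simulates the textbook version of the problem: candidates arrive in random order, you must accept or reject each one on the spot, and you only care about landing the single best. You look at roughly the first 37% (1/e) of candidates without committing, then take the first one who beats everyone seen so far. The shortlist size, the scores and the function names are all hypothetical, and real transfers obviously break these assumptions (targets can be revisited, fees matter, second-best is often fine), which is rather the point of asking for models that account for how the market actually works.

```python
import math
import random


def pick_with_cutoff(scores, cutoff):
    """Watch the first `cutoff` candidates without committing, then take the
    first later candidate who beats all of them (or the last one if none does)."""
    benchmark = max(scores[:cutoff], default=float("-inf"))
    for score in scores[cutoff:]:
        if score > benchmark:
            return score
    return scores[-1]


def success_rate(n_candidates, cutoff, trials=20_000, seed=0):
    """Estimate how often the cutoff rule lands the single best candidate."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Hypothetical player ratings, drawn at random for illustration only.
        scores = [rng.random() for _ in range(n_candidates)]
        if pick_with_cutoff(scores, cutoff) == max(scores):
            hits += 1
    return hits / trials


n = 20                        # hypothetical shortlist size
cutoff = round(n / math.e)    # roughly 37% of the shortlist
rate_37 = success_rate(n, cutoff)   # look first, then commit
rate_first = success_rate(n, 0)     # just sign the first name on the list

print(f"37% rule: best target signed in {rate_37:.0%} of simulations")
print(f"First available: best target signed in {rate_first:.0%} of simulations")
```

Under these idealized assumptions the cutoff rule should find the best candidate far more often than grabbing the first option. A version useful to a club would have to relax the assumptions one by one, which is where knowledge of how deals really get done comes in.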