Smart, Stupid Stats
Issue   |   Tue, 04/23/2013 - 22:25

More than 98 percent of the statements in this column are true. That previous sentence was the exception. It was a lame example of a useless statistic. More than just conjectures, statistics feature a specificity that can have a powerful impact on our thinking about a particular topic. That first sentence is a glaring example of what can go wrong when they are not subjected to the proper scrutiny. Unlike Snapple Facts, however, most sports statistics don’t suffer from questionable veracity. More often than not, sports stats are well researched, and seldom can they simply be dismissed as false.

Nevertheless, their potential to mislead is pernicious, precisely because it’s so easy to get lost in a deluge or dearth of factual information.

A true statistic may be just as misleading as an overtly false one. The batting average of baseball players provides a useful example. Consider two hitters: one with a batting average of .250; the other with a .300 batting average. The former hits at a rate in line with the average for MLB players, while the latter has a significantly higher batting rate than the average major leaguer.

But as Bill James, the preeminent sabermetrician who’s made a career of reframing the importance assigned to statistics in baseball, has written, the difference between an average and above-average hitter isn’t readily discernible from their batting performance in a single game. Over 12 at-bats, the hitter with a .250 average should collect three hits; the .300 hitter about four (3.6, strictly speaking). Twelve at-bats amounts to three games or so. And thus the difference between “average” and “above-average” hitters is, at most, an extra hit every three games. Moreover, should the player batting .250 chance into a couple of hits in a single game, one might leave the stadium thinking he was the best hitter on the team. In a small enough sample size, random chance looms large over outcomes, which is one of the reasons why a one-game playoff features more excitement than a prolonged series.
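To put rough numbers on that intuition, here is a minimal sketch. It assumes each at-bat is an independent trial with a fixed chance of a hit (a simple binomial model, which real hitting only approximates) and four at-bats per game, in line with the "twelve at-bats, three games" figure above. The function name `prob_at_least` is made up for illustration.

```python
from math import comb

def prob_at_least(hits, at_bats, avg):
    """Chance of at least `hits` hits in `at_bats` tries (binomial model).

    Assumes every at-bat is independent with the same hit probability `avg`.
    """
    return sum(
        comb(at_bats, k) * avg**k * (1 - avg) ** (at_bats - k)
        for k in range(hits, at_bats + 1)
    )

for avg in (0.250, 0.300):
    expected_hits = 12 * avg                # expected hits over 12 at-bats
    multi_hit = prob_at_least(2, 4, avg)    # two or more hits in one game
    print(f"{avg:.3f} hitter: {expected_hits:.1f} expected hits in 12 at-bats, "
          f"P(2+ hits in a 4 at-bat game) = {multi_hit:.3f}")
```

Under these assumptions, the .250 hitter has a multi-hit game a little more than a quarter of the time, and the .300 hitter a little more than a third of the time. On any given night, the "average" hitter looking like the better one is hardly a rare event.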

The example above also exposes the fundamental problem with batting average: that if the question is how good is a hitter, batting average doesn’t provide all that useful an answer. Batting average is nothing more or less than the rate at which a hitter records a hit per at-bat. Hits are wonderful athletic feats. But a hit isn’t necessary to get on base, or to move base runners up, or to drive home a run. It may be tempting to treat batting average as the quintessential measure of a hitter’s ability. The question is, the ability to do what?

The utility of any stat pivots on the question of precisely why we’re interested in it. In other words, the key to understanding what a stat tells us is to understand the question we’re looking to answer by means of the stat. At a baseball game, if the question is which batter gives a team the best chance to score a run, the answer is not simply the player with the highest batting average because it’s not just hits that amount to runs.

Sports leagues stockpile stats because they are useful tools in the quest for objective clarity about athletic performance. When faced with a stat, our first question inevitably must be why am I looking at this? Only then can we proceed to what am I looking for? And once that’s clear, we need to go further: is the emergent trend here a product of chance? If it isn’t, is it sustainable? What underlying changes — in the performance of the team, of the player, or of the competition — does this stat point to?

The same kind of misleading information we encountered earlier is prevalent in basketball statistics as well. Consider two players with the following stats: player A averages 10 points, two rebounds and four assists per game; player B averages 15 points, four rebounds and six assists per game. For simplicity, assume that the two players are physically identical and have the same playing style. Based on this information, can we decide who the better player is?

This might complicate matters: player A comes off the bench and plays only 18 minutes per game; player B is a starter who is on the court for 36 minutes a game. Standardization of our two hypothetical players’ performances, then, might be one way to derive useful insights from the statistics we have. The most popular method for doing this is extrapolating a player’s output over a 36-minute period. And in our example, it’s clear that even if player A didn’t double his output once his playing time was doubled, he’s still doing more with less, when compared with player B. And so we might expect that player A is the better player.
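The per-36-minute extrapolation can be sketched in a few lines. This is the simplest possible version: it assumes output scales linearly with minutes, which is exactly the assumption the column goes on to question. The helper name `per_36` is made up for illustration; the stat lines are the two hypothetical players described above.

```python
def per_36(stats, minutes_played):
    """Scale per-game averages to a 36-minute basis.

    Assumes production grows in direct proportion to playing time.
    """
    scale = 36 / minutes_played
    return {name: value * scale for name, value in stats.items()}

# Player A: bench player, 18 minutes a game.
player_a = per_36({"points": 10, "rebounds": 2, "assists": 4}, minutes_played=18)
# Player B: starter, 36 minutes a game (so his line is unchanged).
player_b = per_36({"points": 15, "rebounds": 4, "assists": 6}, minutes_played=36)

print("Player A per 36:", player_a)
print("Player B per 36:", player_b)
```

On this basis player A projects to 20 points, four rebounds and eight assists, against player B’s 15, four and six. Even a player A who fell well short of doubling his numbers would still compare favorably.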

Remember, however, that in a 48-minute game, there’s no guarantee that every minute of play will feature the same level of competition. And if we really want to find out which player is better, we have to base our evaluation on a fair comparison. In basketball, each player’s contributions are necessarily shaped by the presence — or absence — of the remaining talent on the floor (the strategy of the coach is in play too, but that’s more unwieldy a variable to account for).

And so it wouldn’t be fair to compare the performance of a starter playing against other starters to the performance of a player who could start but instead comes off the bench to face lesser competition. Even if player A is an excellent bench player, it might be true that given extended playing time against starters, his production would drop off. Conversely, if player B is only an average starter, his performance might improve in limited minutes against bench players.

Thus one way to level the field for our comparison might be to weight each player’s performance not only for the number of minutes he plays, but also for the quality of the competition he plays against. In that light, it becomes far less ambiguous how much each player is making of the opportunities he’s given, and thus which of them is better.

Sports statistics are not in and of themselves misleading. But the inferences one may choose to draw from them, or be quietly led to draw from them, can be dangerously out of touch with reality. Factual as they may be, the only way to really digest a stat is with a grain of salt.
