Last week, on a road trip with my father, we got into a discussion about advanced metrics in baseball, and how they are often more useful in evaluating players than some of the traditional metrics we all grew up with. We went back and forth for quite a while (due in no small part to the fact that I had to stop and explain concepts like Batting Average on Balls In Play, which helps to explain the influence of luck), and I finally hit upon this as a central thesis: advanced metrics in baseball are used by smart front offices to determine how they can make their team incrementally better. I think it’s pretty self-evident that a guy like Alex Rodriguez is good at baseball. Unfortunately, there are precious few super-elite performers like Rodriguez, and not every team is able to secure the services of such a player. Thus, teams like the Tampa Bay Rays need to find other ways to produce runs and wins. This is where the advanced metrics come in. Where can a team find a way to score one extra run, prevent one extra run, and win one more game, all on a budget? The players who can provide that incremental improvement aren’t found in the premium shopping bin of the free agent market. They’re found among journeyman utility players, bullpen arms, and unheralded starting pitchers. I like to think that by the end of our conversation, my father had a little better understanding of what advanced metrics in baseball were all about.
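Since BABIP comes up again later, here’s a quick sketch of what it actually measures. The formula is standard; the stat line plugged in at the bottom is an invented example, not any particular player’s season.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies):
    """Batting Average on Balls In Play: batting average computed only
    over balls the defense can actually field -- home runs and
    strikeouts are removed, since no fielder is involved in those."""
    return (hits - home_runs) / (at_bats - strikeouts - home_runs + sac_flies)

# Invented stat line for illustration: 180 H, 20 HR, 600 AB, 100 K, 5 SF
rate = babip(180, 20, 600, 100, 5)
print(round(rate, 3))  # roughly .330, close to the league-typical range
```

A hitter whose BABIP runs far above or below roughly .300 is often getting lucky or unlucky on balls in play, which is exactly why the stat helps separate skill from luck.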
Then yesterday, I stumbled across this piece. After the jump, I’ll take a look at some of the author’s logical missteps, and try to put advanced metrics in a little clearer context.
We start out with an analogy about buying cars, and how people become fixated on horsepower and gas mileage. Apparently, people have become fixated on these two measurements, often to the detriment of other factors. (Hmm, I wonder why that is?) So, with that as the premise, let’s dig in.
Unfortunately, this obsession with horsepower and fuel economy turns out to be a big mistake. The explanation is simple: The variables don’t matter nearly as much as we think. Just look at horsepower: When a team of economists analyzed the features that are closely related to lifetime car satisfaction, the power of the engine was near the bottom of the list. (Fuel economy was only slightly higher.) That’s because the typical driver rarely requires 300 horses or a turbocharged V-8. Although we like to imagine ourselves as Steve McQueen, accelerating into the curves, we actually spend most of our driving time stuck in traffic, idling at an intersection on the way to the supermarket. This is why, according to surveys of car owners, the factors that are most important turn out to be things like the soundness of the car frame, the comfort of the front seats and the aesthetics of the dashboard. These variables are harder to quantify, of course. But that doesn’t mean they don’t matter.
Almost immediately, I have a problem with what’s being posited here. Unless you’re buying a car just for fun, you’re looking for specific things out of the vehicle you’re about to purchase. Is it going to be a work truck? Is it for hauling the family around town and on vacation road trips? Is it slated to be somewhat of a beater, a daily driver just to get you to and from work? Depending on the answers to those questions, gas mileage, horsepower, soundness of frame, and seat comfort take on varying degrees of importance. Additionally, you can in fact test drive these vehicles before you sign on the line which is dotted. That way, you can tell how comfortable the seats are and how much you like the aesthetics of the dashboard. If soundness of the frame is a primary concern, there are several places where you can research it (Car and Driver and Motor Trend come to mind). Bottom line: if you’re not test driving a vehicle before you buy it to ensure an appropriate comfort level, and you’re not researching the aspects of the car’s construction as they pertain to durability, you’re not very good at car shopping. Also worth pointing out: you can in fact quantify seat comfort and dashboard aesthetics. Those aren’t “intangibles”. They’re very tangible things. If you’re deciding between three cars, test drive each, then rank each on those aspects. Seems pretty simple to me. Moving on.
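The “rank each car on those aspects” suggestion really is that simple in practice. Here’s a minimal sketch; the cars, scores, and weights are all made up for illustration, and the weights just encode one hypothetical buyer’s priorities.

```python
# Hypothetical 1-10 test-drive scores for the supposedly
# "unquantifiable" factors; weights reflect this buyer's priorities.
weights = {"seat_comfort": 0.5, "dashboard": 0.2, "frame": 0.3}
scores = {
    "Car A": {"seat_comfort": 8, "dashboard": 6, "frame": 9},
    "Car B": {"seat_comfort": 9, "dashboard": 7, "frame": 6},
    "Car C": {"seat_comfort": 6, "dashboard": 9, "frame": 8},
}

# Rank the cars by weighted score, highest first.
ranked = sorted(
    scores,
    key=lambda car: sum(weights[f] * scores[car][f] for f in weights),
    reverse=True,
)
print(ranked)
```

Swap in your own weights and the ranking changes with your priorities, which is the whole point: these factors are tangible the moment you bother to score them.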
But this is not a column about cars. My worry is that sports teams are starting to suffer from a version of the horsepower mistake. Like a confused car shopper, they are seeking out the safety of math, trying to make extremely complicated personnel decisions by fixating on statistics. Instead of accepting the inherent mystery of athletic talent — or at least taking those intangibles into account — they are pretending that the numbers explain everything. And so we end up with teams that are like the worst kind of car. They look good on paper — so much horsepower! — but they fail to satisfy. The dashboard is ugly, the frame squeaks, and the front seats make our ass hurt.
Again, this makes one a poor car shopper or a poor GM. If you’re not evaluating everything, you’re not doing your job. When you’re assembling a football or basketball team, you also need to evaluate how new pieces will integrate into your existing roster, and how other players will be affected by such an addition. Hold onto that thought; we’ll come back to it.
This is largely the fault of sabermetrics. Although the tool was designed to deal with the independent interactions of pitchers and batters, it’s now being widely applied to team sports, such as football and basketball. The goal of these new equations is to parse the complexity of people playing together, finding ways to measure quarterbacks while disregarding the quality of their offensive line, or assessing a point guard while discounting the poor shooting of his teammates. The underlying assumption is that a team is just the sum of its players, and that the real world works a lot like a fantasy league.
I’m going to get nit-picky here for a minute. Sabermetrics is a term that should be used exclusively as it relates to baseball, if at all. The term sabermetrics was first used by Bill James, and comes from the acronym SABR, which stands for the Society for American Baseball Research. So if you’re using sabermetrics to evaluate football or basketball players, you’re wasting your time. Personally, I think we should call all of these new measures advanced metrics, since they differ from sport to sport, and there are some very useful metrics that have been developed independently of SABR, even if they are in the same spirit as the work done by that group. To get to the actual point of the above paragraph: yes, sometimes people use individual advanced metrics in team sports to compare players while ignoring certain team aspects like offensive line quality or teammate shooting percentage. Often, the goal of such a comparison isn’t a real world application, but rather a fantasy sports application. When GMs are player shopping, however, you can be fairly certain that they are taking those other things into account (assuming the GM in question is good at his job). That’s how a team can take a player who didn’t perform particularly well in one situation, plug him into their organization, and have a reasonable expectation of success at his given position.
In many respects, sabermetrics has dramatically improved personnel decisions. By relying on unusual measurements of performance, such as base runs and plus-minus ratings, teams have been able to identify neglected talent, whom they can sign on the cheap. Sabermetrics has also helped sports executives double-check their instincts. Instead of blindly trusting some errant whim — and thus making a terrible trade or picking the wrong free agent — they can consult the math. If the Giants had trusted the numbers, for instance, they wouldn’t be saddled with Aaron Rowand’s five-year, $60 million contract. (He batted .230 last year.) They would have realized that his OBS and OPS is pretty mediocre, especially once his two outlier seasons are taken into account.
I’m a little confused here. Advanced metrics are good? That’s exactly what I’ve been trying to say. Just to nit-pick a bit more, plus-minus ratings are fairly useless in both basketball and hockey, and OPS isn’t really a great measure either. I have no idea what the strikethrough portion is supposed to be, unless it’s one of those “clever” made up acronyms used by people to make lame jokes about advanced metrics.
For a nerd like me, this quantification of sports has been tremendous fun. Thanks to obsessive websites, even the casual fan now has access to statistical tools that would have boggled the mind of a GM 10 years ago. Sabermetrics has also transformed the act of being a spectator, so that watching a game is no longer just about cheering for our hometown team. The numbers have given us a whole new way to think about sports, elevating the conversation beyond disappointed groans, ecstatic high-fives, and subjective opinions.
Yay for advanced metrics!
But sabermetrics comes with an important drawback. Because it translates sports into a list of statistics, the tool can also lead coaches and executives to neglect those variables that can’t be quantified. They become so obsessed with the power of base runs that they undervalue the importance of not being an asshole, or having playoff experience, or listening to the coach. Such variables are the sporting equivalent of a nice dashboard. They can’t be quantified, but they still count.
I take particular issue with this paragraph as it relates to baseball. The importance of not being an asshole is largely irrelevant in baseball, and you need look no further than Reggie Jackson to suss that out. Playoff experience can be similarly overvalued. Andruw Jones hit two home runs, drove in six runs, hit .400, had a .500 on-base percentage, and posted a .750 slugging percentage in the 1996 World Series as a 19-year-old rookie. Just because a player has been there before, that doesn’t elevate his talent level. As far as listening to the coach, I’m not aware of too many players that willfully disregard signs when at bat or on base, or ignore positioning instruction when in the field. If they did, they wouldn’t get much playing time. As far as other team sports go, several players that could be considered an asshole are some of the best in their league (Kobe?), and playoff experience is vastly overrated (Aaron Rodgers?). If those things count, they count far less than you think they do.
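For anyone fuzzy on where rate stats like those come from, here’s how a slash line is computed. The counting stats below are constructed purely to reproduce the quoted rates (.400/.500/.750); they are NOT Jones’s actual box score.

```python
def obp(hits, walks, hbp, at_bats, sac_flies):
    """On-base percentage: times reached base per plate appearance."""
    return (hits + walks + hbp) / (at_bats + walks + hbp + sac_flies)

def slg(singles, doubles, triples, home_runs, at_bats):
    """Slugging percentage: total bases per at-bat."""
    return (singles + 2 * doubles + 3 * triples + 4 * home_runs) / at_bats

# Constructed line matching the quoted rates -- not an actual box score:
# 20 AB, 8 H (5 singles, 1 double, 2 HR), 4 BB, 0 HBP, 0 SF
on_base = obp(8, 4, 0, 20, 0)    # 0.500
slugging = slg(5, 1, 0, 2, 20)   # 0.750
ops = on_base + slugging         # 1.250
print(on_base, slugging, ops)
```

An OPS of 1.250 over any stretch is monstrous, which is the point: the kid with no playoff experience put up elite numbers on the biggest stage.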
This is the moral of the Dallas Mavericks. By nearly every statistical measure, the Mavs were outmanned by most of their playoff opponents. (According to one statistical analysis, the Los Angeles Lakers had four of the top five players in the series. The Miami Heat had three of the top four.) And yet, the Mavs managed to do what the best teams always do: They became more than the sum of their parts. They beat the talent.
Here the author clearly disregards the fact that basketball, especially playoff basketball, is heavily dependent on match-ups. Dirk Nowitzki was a nightmare to guard for both the Lakers and the Heat. This applies to other areas of the most recent NBA postseason as well. The Chicago Bulls had the best regular season record in the league, but the match-ups favored the Heat. The Orlando Magic had a better regular season record than the Hawks, but the match-ups favored Atlanta. Those things aren’t intangible, either. Were we really to expect that the combination of Brandon Bass and Ryan Anderson was a good match-up for Josh Smith? Or that either of them could guard Al Horford? The next few paragraphs go on to talk glowingly about J.J. Barea, and that’s fine, I guess, but there’s nary a word about the Finals MVP. Nor is there any mention of how LeBron James and Dwyane Wade played an insane number of minutes throughout the season, and then their minutes went UP in the playoffs. I think those would be important things to discuss when talking about the Finals, but that’s just me.
Here’s my problem with sabermetrics — it’s a useful tool that feels like the answer. If we were smarter creatures, of course, we wouldn’t get seduced by the numbers. We’d remember that not everything that matters can be measured, and that success in sports (not to mention car shopping) is shaped by a long list of intangibles. In fact, we’d use the successes of sabermetrics to focus even more on what can’t be quantified, since our new statistical tools take care of the stats for us. We are finally free to think about how those front seats feel.
Paragraphs like this one are why David Eckstein has a World Series MVP trophy. His 2006 World Series performance was in every way inferior to the one I mentioned above when talking about Andruw Jones. Again we have the car seat comfort analogy, but as you know by now, that’s not an intangible, and it can be measured, if only by comparison.
But that’s not what happens. Instead, coaches and fans use the numbers as an excuse to ignore everything else, which is why our obsession with sabermetrics can lead to such shortsighted personnel decisions. After all, there is no way to quantify the fierce attitude of a team that feels slighted, or the way even the best players can be undone by the burden of expectations, or how Kendrick Perkins meant more to the Celtics than his rebounding stats might suggest.
I’d like an example or two of how advanced metrics have resulted in shortsighted personnel decisions, and I’m not sure Kendrick Perkins qualifies. The Celtics clearly felt they needed someone who possessed the qualities of a player like Jeff Green, and they managed to get that player, and added a 2012 draft pick. Did the trade of Kendrick Perkins drastically change the playoff fortunes of the 2010-11 Boston Celtics? Color me skeptical.
For reasons that remain mysterious, some teammates make each other much better and some backup point guards really piss off Ron Artest. These are the qualities that often determine wins and losses, and yet they can’t be found on the back of a trading card or translated into a short list of clever equations. This is the paradox of sports statistics: What the math ends up teaching us is that sports are not a math problem.
I don’t think these reasons are that mysterious. Some teammates make each other much better because they’re better at their given sport. Some backup point guards really piss off Ron Artest because Ron Artest is a crazy person. See, I debunked both mysteries, and didn’t have to use any math. As for the qualities that determine wins and losses, that seems pretty simple. In baseball, does a hitter get on base a lot, or does he hit for power? Does a pitcher strike out a lot of batters? Does a defender shrink the area into which a batter can drop a base hit? To answer each of those questions beyond a simple yes or no, you need some sort of metric that measures each of those things. The last line there reminds me of my favorite anti-stat tagline: “sports aren’t played on a spreadsheet!” Well of course not. Let’s imagine for a moment that you’re a small business owner who sells widgets. In order for your business to run at optimum efficiency, you need to know what the cost of sale is for your widgets, what staff costs are going to be, what it costs to open the door and turn on the lights every day, and what price point to set for your widgets. Is your business being transacted on a spreadsheet? No, but if you want it to be successful, you’re measuring it on one.
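The widget-shop arithmetic can be made concrete. All the dollar figures below are assumptions invented for the example; the point is that the break-even calculation is exactly the kind of measuring you do on a spreadsheet even though the business happens on the shop floor.

```python
import math

def break_even_units(daily_fixed_costs, price, unit_cost):
    """Widgets you must sell per day before the shop turns a profit:
    fixed costs divided by the margin earned on each widget."""
    margin = price - unit_cost
    if margin <= 0:
        raise ValueError("price point must exceed the cost of sale")
    return math.ceil(daily_fixed_costs / margin)

# Assumed, purely illustrative numbers:
# $500/day fixed costs (rent, lights, staff), $4 cost of sale, $9 price
print(break_even_units(500, 9, 4))  # 100 widgets per day to break even
```

Drop the price point to $8 and the break-even volume jumps to 125 a day, which is precisely the trade-off a spreadsheet (or an advanced metric) makes visible before you find out the hard way.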