“In God we trust, all others bring data.” – W. Edwards Deming
You would think that the coming transition to data-driven decision making would be welcome. Cheered, even. But you would be wrong. And this year’s National League Cy Young Award voting proves it.
You don’t have to know much – anything, really – about baseball to comprehend the simple fact that a won-loss record doesn’t accurately reflect, let alone measure, a pitcher’s ability. That might seem odd, even counterintuitive, but just consider the following:
- A pitcher can surrender ten runs and “win” the game
- A pitcher can surrender no runs and “lose” the game
Need we continue? Given those facts, it would seem self-evident – beyond obvious – that any argument for or against a particular pitcher that begins with his won-loss record in determining performance is just foolish.
And yet that’s exactly what columnists are doing today, and have been for decades.
The background, in brief, is this: there were four or five pitchers in the National League this year who had very good seasons, statistically. More, actually, but it’s tough to make a case for anyone outside, say, five.
In years past, when the traditional (read: newspaper) writers were solely responsible for handing out this award, it is probable that the St Louis Cardinals’ Chris Carpenter would have won the Cy Young award. The Cardinals ace received strong support from the traditionalist crowd, not least because of his 12-1 record down the stretch. In spite of this, he did not win. The Cy Young went instead to San Francisco Giant Tim Lincecum.
This year, for the first time – it wasn’t until 2008 that the Baseball Writers Association of America was willing to admit “internet” writers to its ranks – two of the non-traditionalist writers (both of whom are on Twitter, naturally) had votes. ESPN.com’s Keith Law and Baseball Prospectus’ Will Carroll picked their NL Cy Young candidates. And both left Carpenter off their ballots entirely, which led to Lincecum winning the award.
Why? How could these two leave the candidate the traditionalists favored to win off their ballots entirely? Because of the data.
First, both writers ignored the won/loss records for the reasons discussed above. Second, Carpenter threw fewer innings than the other contenders, meaning that he had less time to be valuable to his club. Third, Lincecum performed consistently better in statistics that attempt to normalize for variables like park effects and team defense that the pitcher has no control over.
Here’s Law on the subject:
Lincecum was a no-brainer, and it’s disappointing to see that a majority of voters on the award whiffed on the easiest part of the three-part question. Lincecum led the NL in FIP (Fielding Independent Pitching) and WAR (Wins Above Replacement), both of which normalize a pitcher’s stats to account for the help he received from his defense, and he led both categories by wide margins. He also led the NL in VORP, which adjusts for park but not for defense, by a narrow margin. I understand that many voters are uncomfortable with these advanced stats, but Lincecum also finished second in the NL in (unadjusted) ERA, but threw 36 more innings than the guy in front of him, Chris Carpenter.
Carpenter’s innings total was the main reason he ended up off my ballot. He pitched extremely well when on the mound, but not well enough to close the value gap between him and the three pitchers I listed, each of whom threw at least 27 innings more than Carpenter. Both Carpenter and Wainwright received significant help from their defense, while neither Lincecum nor Vazquez could say the same.
As for Vazquez, he ranked ahead of Wainwright in the advanced metrics anyway, but I also gave him extra credit for pitching in the most difficult division in the NL, one in which he had to face two great offenses and only one patsy.
As for the win total of each pitcher: I ignored that, because, as I’ve said for years, it tells us nothing useful about how well the pitcher performed.
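For anyone who hasn’t run into FIP before, a rough sketch of the commonly published formula may help show why it sets the defense aside: it counts only home runs, walks, hit batters and strikeouts, the outcomes a fielder never touches. The two stat lines below are invented for illustration and are not any real pitcher’s numbers, and the constant of 3.10 is an assumed league-scaling value that changes from season to season.

```python
def fip(hr, bb, hbp, k, ip, constant=3.10):
    """Fielding Independent Pitching, using the commonly published weights.

    `constant` is an assumed league-scaling value (it varies by season);
    `ip` is innings pitched as a true decimal, e.g. 225.333 for 225 1/3.
    """
    if ip <= 0:
        raise ValueError("innings pitched must be positive")
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# Hypothetical stat lines, made up purely for illustration.
pitcher_a = fip(hr=10, bb=60, hbp=5, k=250, ip=225.0)
pitcher_b = fip(hr=15, bb=40, hbp=5, k=150, ip=190.0)

print(f"Pitcher A FIP: {pitcher_a:.2f}")
print(f"Pitcher B FIP: {pitcher_b:.2f}")
```

Notice that wins, losses and even hits allowed appear nowhere in the calculation. Lower is better, as with ERA.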
The fallout? St Louis fans went berserk. As a passionate fan of the game myself, I could have understood that, if not condoned it. But that wasn’t the end of it. Colleagues in the industry, such as Sports Illustrated’s Jon Heyman, hammered Carroll and Law for their omission. Closer to home, the Boston Globe’s Nick Cafardo – who, in the interests of full disclosure, I’ve had my issues with – questioned the omission by noting that “Carpenter was 12-1 with a 2.19 ERA from July 5 to the end of the season.”
The vote – and the trend towards data-driven decision making behind it – was controversial enough, in fact, that a Mercury News columnist wrote an article just on that subject.
Some are seeing this glass as half full: a mere two informed and educated voters managed to swing the vote away from the traditionalists. Yay.
Call me a cynic, but the glass is half empty here. It would be one thing if those criticizing the approach had taken the time to understand the metrics involved. By and large, they have not. Many are like Hall of Famer Murray Chass, formerly a sportswriter for the New York Times who now writes a blog that he doesn’t call a blog because he doesn’t like blogs, arguing that ignorance is actually a virtue (here’s the response from Nate Silver, of FiveThirtyEight.com fame). Here’s Chass:
To me, VORP epitomized the new-age nonsense. For the longest time, I had no idea what VORP meant and didn’t care enough to go to any great lengths to find out. I asked some colleagues whose work I respect, and they didn’t know what it meant either.
Finally, not long ago, I came across VORP spelled out. It stands for value over replacement player. How thrilling. How absurd. Value over replacement player. Don’t ask what it means. I don’t know.
If the objections were substantive, and based on metrics that actually related to pitcher performance, that would also be welcome.
Even if the dissent were irrational, but handled respectfully, I could live with it. But instead, as ever, rationalism is under siege. Today it’s baseball, which you might not care about, but tomorrow it’ll be healthcare, which you probably do. Or web design. And so on.
All I see when I look at what played out here is a template for all of the battles that are going to be fought, as those with data struggle against the entrenched opinions, feelings and bias of those who think these are a reasonable basis for making decisions.
To those who would fight those battles, I salute you. Bring your data, as Deming recommends, but if what happened to Carroll and Law is any indication, you should bring a thick skin as well.