Whole New Ballgame: Should Math Trump Myth in Baseball?

When you can’t express it in numbers, your knowledge is of a meager and unsatisfactory kind.

— Lord Kelvin

The 2012 baseball season was, statistically, a monster season for Detroit’s third baseman, Miguel Cabrera. He did what no player had done since Carl Yastrzemski in 1967. He won baseball’s “Triple Crown,” leading the American League in its three most venerable offensive statistics: batting average, home runs, and RBIs. Cabrera finished the season with .330, 44, and 139, respectively. In any other era, Cabrera would have been a no-brainer to be named American League MVP, but this year the subject was a matter of heated debate.

Mike Trout, at 21, is already a superstar. He looks like Mickey Mantle and runs like Willie Mays, and after this, his first full MLB season, he is already counted among history’s best outfielders. That he was even considered a challenge to Miguel Cabrera is evidence of a growing schism in the world of baseball.

Baseball’s upstart, mathematically inclined new wave is a group of adherents to sabermetrics, a term derived from SABR, the acronym of the Society for American Baseball Research, which is based in Phoenix. These rebels are distinct from baseball traditionalists in their quest to accurately quantify every aspect of the game. This trend has created a culture many classic fans claim undermines the very essence of baseball.

Part of baseball’s allure, even to its most traditional fans, is that statistics make it comprehensible. One of the earliest newspaper box scores appeared in the New York Herald for a game between the All-Brooklyn and All-New York teams. The only information detailed was the number of “hands-left” (outs) each player made and the number of runs each scored.

But it was baseball, and they were numbers. The combination of the two seemed fated. As sportswriter Alan Schwarz wrote: “After the last out had been made and the sun had set on the ball field, the numbers emerged to glow like gas lamps, lighting the way to a new appreciation for the game.”

Those gas-lamp numbers proliferated thanks to Brooklyn Englishman Henry Chadwick, who developed an array of new statistics he believed would “obtain an accurate estimate of a player’s skill.” Most of Chadwick’s numbers accounted for fielding and pitching, the essence of the game in his era. But baseball, it turned out, was quantifiable in ways other sports simply weren’t. The game’s symmetry—nine innings of three outs each—allowed for accurate comparisons between any number of games, any number of players, or any number of teams.

Dalliances in baseball statistics by men of great genius produced remarkable advances in understanding America’s pastime. In the 1950s George Lindsey, a military analyst at Air Defense Command in Quebec, felt that combat data was “murky and incomplete,” so he struggled to assess the values of individual decisions made in battle.[1] He grew to believe the same analysis could be applied to baseball. Lindsey meticulously recorded details of more than 400 games each season and eventually published “Statistical Data Useful for the Operation of a Baseball Team.” No major league team consulted Lindsey’s work.

Earnshaw Cook, a Maryland metallurgist, was the first to bring serious baseball analysis to the masses with his 1964 book, Percentage Baseball. Many of Cook’s findings were opaque, even for his most educated readers. Cook frequently quoted Francis Bacon. Some passages were in Latin. His equations were gibberish to those who held sway in professional baseball.

It took nearly two decades for the power of statistics to reach MLB’s spheres of influence. But it finally did, thanks to a graveyard-shift security guard at a pork-and-beans plant in Kansas. Bill James self-published the Bill James Baseball Abstract annually, beginning in 1980. The Abstract used painstaking statistical analysis to explain why teams won and lost. At the time, James was treated as a pariah by major league front offices, though his work grew in popularity among serious baseball fans.[2] Bill James became a kind of hero after Michael Lewis’ Moneyball chronicled the Oakland Athletics’ improbable success in 2002.

Moneyball: The Art of Winning An Unfair Game focuses on the scouting and management side of the Oakland Athletics, the first team to fully integrate statistical analysis into its management. Billy Beane was a former first-round pick whose major league career never reached the potential scouts had predicted. Once his own career fizzled in the early 90s, he became an advance scout for the Athletics, the last team he played for. Eventually he worked his way up to general manager. Beane’s front office challenge was that the Athletics were poor and had to compete against much more affluent franchises. Typically, the Athletics’ payroll was close to $30 million for a season; the Yankees, on the other hand, spent around $200 million.

Beane turned to sabermetrics—a top-tier statistician, after all, cost less than an all-star shortstop. The 2002 season looked particularly bleak for Oakland; elite East Coast teams had picked up three of their best players from 2001. But Beane used sabermetrics to cobble together a team of statistically undervalued misfits, who, despite conventional wisdom, managed to win the American League Western Division.

More recently, the Tampa Bay Rays, a laughingstock since their 1998 inception, adopted a management strategy similar to moneyball that some call equityball. Andrew Friedman, Matt Silverman, and Stuart Sternberg, all Wall Street veterans, took over baseball operations in 2005 and based their decisions on the market principle of “positive arbitrage,” the exploitation of price differences between markets. As Friedman put it, “I love players I think that I can get for less than they are worth.” At the heart of the Rays’ rebuilding effort was a commitment to press every small statistical advantage, “the 52-48 edge,” so the Rays could compete with juggernaut franchises. Once again, math trumped myth. Tampa Bay rose above its dismal history to play in the 2008 World Series, win its division in 2010, and secure the American League Wild Card slot in 2011.

Wins Above Replacement, or WAR, is the God-particle of today’s sabermetrics. It expresses a player’s value to his team in a single, discrete numeral. WAR essentially asks, “If Player X had to be replaced, how much more often would his team lose?” A player’s WAR takes into account his wRAA, UBR and UZR.[3] These statistics represent offense, base running, and defense. Through complex mathematics, WAR yields the number of team wins attributable to the efforts of any single player.

WAR is why Mike Trout was, despite Cabrera’s historic triple-coronation, in serious contention for the MVP Award. Trout’s WAR at the end of the 2012 season was an astonishing ten. In other words, had Trout not played for the Angels, they would have won only 79 games (instead of 89), placing them well out of playoff contention weeks before the season’s end. Cabrera’s WAR, on the other hand, was an excellent, but less-than-supreme, 7.1. Hence, Trout was the MVP candidate with the most value, at least in the purest statistical sense of his WAR rating. As Tom Ley wrote,[4] “Cabrera’s season was a neat oddity; Trout’s a supernova.”

This year’s MVP race became a kind of battleground between the old-school baseball thinkers, who waxed lyrical about the game’s tradition and mystique, and the new-school statisticians, who extolled baseball’s essential quantifiability.

Mike Trout and the Tampa Bay Rays notwithstanding, do sabermetrics make baseball better? What do we lose when Wall Street trading becomes the model for something as fabled and sepia-toned as baseball? What are we left with when myth is plumbed and plundered?

Or do sabermetrics serve that other great American myth: the Horatio Alger rags-to-riches tale? Do statistics allow the scrappy band of underdogs to take on, and vanquish, the giant?

While the debate has been settled for the year, one thing remains certain: math and myth are inextricably linked in the world of baseball, and a first-rate statistician can influence a team’s season as much as a first-rate draft pick. As Tom Verducci wrote about this year’s MVP race, “The pejorative nonsense about ‘new school’ and ‘old school’ was sad. Everybody uses advanced statistics, though how they weigh them varies. In fact, if Albert Reach can get on a Hall of Fame ballot next month essentially for publishing a baseball magazine for seven years in the 19th century (it helped sell his baseballs), someday Sean Forman, the brains behind baseballreference.com, should be on one.”

[1] Lindsey’s wife June hated baseball and felt it was a waste of her husband’s intellect. She held a Ph.D. from Cambridge in X-ray crystallography and worked with Nobel Prize-winning scientist Dorothy Hodgkin. She also created the adenine and guanine strands for Crick and Watson’s DNA model.

[2] James did not work for a baseball franchise until John Henry of the Boston Red Sox hired him as a data analyst in 2003. The Red Sox won the World Series in 2004.

[3] Weighted Runs Above Average, Ultimate Base Running, and Ultimate Zone Rating, in case you’re keeping score at home.

[4] In his aptly titled article, “Let’s Admire Miguel Cabrera’s Triple Crown Before We Put the Triple Crown Into the Dustbin of History.”

MATT RUCKER lives and writes in Phoenix, the hub of baseball’s statistical universe.