Sports and politics are remarkably similar. Both have billions of dollars at stake, engender profound emotions and revolve around larger-than-life individuals. And both share a culture of data devoted to predicting future results more accurately than ever. Don't just ask Moneyball's Billy Beane - ask statistician Nate Silver, a man who has led the charge in changing both baseball and political forecasting over the last decade.
I have been following Silver’s work for years - he helped inspire me to write for a living. Silver is now best known for his work at FiveThirtyEight, a political blog published by The New York Times that uses advanced statistical analysis to predict the outcome of elections. Silver came under fire within the last week when his model showed that President Barack Obama was a heavy statistical favorite to win Tuesday's election. Silver bet Joe Scarborough, host of MSNBC’s “Morning Joe,” $1,000 last week that Obama would be re-elected.
Many people lambasted Silver for the bet and his belief that Obama will win. But if you have read Silver over the years, you know that he is not necessarily betting on Obama. He is betting on the accuracy of his model.
Lies, Damn Lies, And Statistics
Silver may now be a national name, especially after his bet with Scarborough, but his roots and methods are not based in politics. Before dishing out political insights on “Morning Joe,” Silver was a baseball man.
Silver’s first foray into the controversial world of predictive analysis came at Baseball Prospectus (BP), a publication devoted to the study of advanced baseball statistics (known as sabermetrics). From 2003 to 2009, Silver was responsible for BP’s prediction engine, known as PECOTA. Named after MLB journeyman Bill Pecota, the full name of the prediction engine is Player Empirical Comparison and Optimization Test Algorithm. By 2005, PECOTA was recognized by top baseball minds as one of the most accurate prediction tools in the game.
From an outsider’s perspective, there is nothing simple about how PECOTA works. But you do not have to be a baseball historian to see how the model functions as a tool for statistical analysis and projection.
PECOTA projects an individual baseball player's career based on his similarities to other baseball players in the past. For instance, PECOTA will produce a probabilistic three-year forecast of a player’s performance, taking into account his age, handedness, past performance and peripheral statistics (such as position and fielding metrics). So if you have a shortstop approaching his age 35-37 seasons with a .249 career batting average, the engine will look for similar players and see how they did at that age. It turns out that this approach predicts the shortstop's performance with a fair degree of accuracy - better than other methods.
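To make the "find similar players" idea concrete, here is a minimal Python sketch of a nearest-neighbor projection. It is not PECOTA itself, whose internals are not described here. The player records, the two features used (age and career batting average), the scaling constants and the function names are all assumptions made up for illustration.

```python
from math import sqrt

# Hypothetical historical players (invented data, not real statistics).
# "next_avg" is the batting average each player posted the following season.
HISTORICAL = [
    {"name": "Player A", "age": 35, "career_avg": 0.251, "next_avg": 0.240},
    {"name": "Player B", "age": 36, "career_avg": 0.248, "next_avg": 0.235},
    {"name": "Player C", "age": 35, "career_avg": 0.290, "next_avg": 0.280},
    {"name": "Player D", "age": 28, "career_avg": 0.250, "next_avg": 0.262},
    {"name": "Player E", "age": 34, "career_avg": 0.245, "next_avg": 0.238},
]


def distance(target, other, age_scale=1.0, avg_scale=0.010):
    """Scaled Euclidean distance between two players.

    The scales are assumptions: one year of age counts as much as
    ten points of batting average.
    """
    d_age = (target["age"] - other["age"]) / age_scale
    d_avg = (target["career_avg"] - other["career_avg"]) / avg_scale
    return sqrt(d_age ** 2 + d_avg ** 2)


def project_next_avg(target, history, k=3):
    """Project next-season average from the k most similar past players."""
    nearest = sorted(history, key=lambda p: distance(target, p))[:k]
    # Closer comparables get more say in the projection.
    weights = [1.0 / (1.0 + distance(target, p)) for p in nearest]
    projection = sum(w * p["next_avg"] for w, p in zip(weights, nearest)) / sum(weights)
    return projection, [p["name"] for p in nearest]


# The 35-year-old shortstop with a .249 career average from the example above.
shortstop = {"age": 35, "career_avg": 0.249}
projection, comparables = project_next_avg(shortstop, HISTORICAL)
print(f"Comparables: {comparables}, projected average: {projection:.3f}")
```

A real engine would compare many more attributes and produce a range of outcomes rather than a single number, but the basic move is the same: look up the closest historical matches and let their results inform the forecast.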
The Model Works - Mostly
The important thing here is the way Silver creates his models. Essentially, he takes data sets and applies logical analytical methods to reach a deeper understanding of them. Like any good statistician, he is not looking for a specific outcome, but rather for the most finely tuned pattern recognition engine to predict future results.
To people who understand the analysis of complex data sets, Silver’s conclusions make perfect sense given the information he has. The day before the election, Silver’s model gives Obama an 86.3% chance of defeating former Massachusetts governor Mitt Romney. According to Silver’s numbers, Obama will reap 307.2 Electoral College votes to Romney’s 230.8. Silver’s prediction is the result of running his analytical model on poll results from individual states, weighting each poll in a state based on its pollster's track record.
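For readers who want to see how weighted state polls can turn into a win probability, here is a rough Python sketch. It is not FiveThirtyEight's actual model. The states, poll margins, pollster weights, electoral vote counts and the assumed polling error are all invented, and treating each state's error as independent is a deliberate simplification.

```python
import random

# Hypothetical polls: state -> list of (candidate's margin in points, pollster weight).
# A higher weight stands in for a pollster with a better track record.
POLLS = {
    "State X": [(2.0, 1.0), (4.0, 0.5), (-1.0, 0.25)],
    "State Y": [(-3.0, 1.0), (-1.0, 1.0)],
    "State Z": [(0.5, 0.8), (1.5, 0.4)],
}
ELECTORAL_VOTES = {"State X": 20, "State Y": 15, "State Z": 10}


def weighted_margin(polls):
    """Average the poll margins, giving more trusted pollsters more weight."""
    total_weight = sum(weight for _, weight in polls)
    return sum(margin * weight for margin, weight in polls) / total_weight


def simulate(polls_by_state, electoral_votes, trials=10000, error_sd=3.0, seed=538):
    """Monte Carlo simulation: return (win probability, mean electoral votes).

    error_sd is an assumed polling error in points, drawn independently
    for each state in each trial.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs match
    margins = {state: weighted_margin(p) for state, p in polls_by_state.items()}
    needed = sum(electoral_votes.values()) // 2 + 1  # majority of all votes
    wins = 0
    total_ev = 0
    for _ in range(trials):
        ev = sum(
            electoral_votes[state]
            for state, margin in margins.items()
            if rng.gauss(margin, error_sd) > 0  # candidate carries the state
        )
        total_ev += ev
        if ev >= needed:
            wins += 1
    return wins / trials, total_ev / trials


win_prob, mean_ev = simulate(POLLS, ELECTORAL_VOTES)
print(f"Win probability: {win_prob:.1%}, average electoral votes: {mean_ev:.1f}")
```

Averaging electoral votes over thousands of simulated elections is one way a forecast can land on a fractional figure such as 307.2, even though no candidate can win a fraction of a vote.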
Essentially, Silver applied Moneyball analysis to politics. But just as traditional baseball types rejected sabermetrics when it told them things they didn't want to hear, much of the political establishment - especially those who disagree with FiveThirtyEight's conclusions - decries him as a fraud.
Silver’s PECOTA days demonstrate that he is not a fraud, but his method is far from foolproof. FiveThirtyEight’s reliance on polling numbers (which are often volatile or biased) creates a margin of error that could make Silver's analysis completely wrong. But Silver has a lot on the line: his reputation, and probably his career, depend on his model accurately predicting the outcome of this election.
There is a bigger story here. This is the second time Silver has used big data and smart algorithms to drive a movement that turned an entire industry on its head. In both cases, these industries had long and storied traditions and conventional wisdoms ripe for disruption by smarter, more efficient analysis of existing data. That doesn't mean Silver is always right. But it does mean that you can't analyze baseball - and possibly by Wednesday, politics - the same way again.
Will Silver’s election predictions prove accurate? Will Marco Scutaro ever be an All-Star again? Let us know what you think of Silver's methods in the comments.