Sunday, November 11, 2012

Nate Isn't That Great

The election is over, and for some strange reason, Nate Silver's prediction achievements are being lauded--almost as if we have some new Einstein in our midst.

After all, Nate correctly predicted the outcomes of all 50 states.  Do you know what the chances of that happening are?

It turns out to be much higher than what one might expect.  Nate was not the only one to predict all 50 states correctly, but he's the only one you'll hear about, because he's the one who works for a major newspaper.

You see, Nate used something called a Monte Carlo simulation.  These are neat models where you input some "known" variables and allow random fluctuations to change the outcomes.  You run thousands of election simulations through the random generator, and you study the outcomes.  If a majority of the simulations show Obama winning, then overall, you expect Obama to win.
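
To make the idea concrete, here is a minimal sketch of a Monte Carlo election simulation in Python.  It is not Nate's model.  The state names, electoral vote counts, poll margins, and noise levels below are all made up for illustration, and the noise model (one shared national shift plus independent state noise) is just an assumption.

```python
import random

# Hypothetical inputs: state -> (electoral votes, polled margin for candidate A, in points).
# All numbers are invented for illustration; they are not real poll data.
STATES = {
    "Safe A": (200, 10.0),
    "Safe B": (190, -10.0),
    "Lean A": (60, 3.0),
    "Lean B": (50, -2.0),
    "Tossup": (38, 0.5),
}

def simulate_once(rng, state_sd=3.0, national_sd=2.0):
    """Run one simulated election and return candidate A's electoral votes."""
    # Assumed noise model: a shared national shift plus independent state noise.
    national_shift = rng.gauss(0.0, national_sd)
    total = 0
    for ev, margin in STATES.values():
        simulated_margin = margin + national_shift + rng.gauss(0.0, state_sd)
        if simulated_margin > 0:
            total += ev
    return total

def run_simulations(n=10000, seed=0):
    """Run n simulated elections with a fixed seed for repeatability."""
    rng = random.Random(seed)
    return [simulate_once(rng) for _ in range(n)]

results = run_simulations()
# Share of simulations in which candidate A reaches 270 of the 538 electoral votes.
win_probability = sum(r >= 270 for r in results) / len(results)
print(f"Candidate A wins in {win_probability:.1%} of simulations")
```

The "known" variables are the poll margins, and the random generator supplies the fluctuations.  Everything interesting comes from counting how often each outcome shows up in `results`.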

If you go to Nate's site and scroll down to the graph that shows "Electoral Vote Distribution" you can see the outcomes of each of the individual simulations.  It appears that a few simulations had Obama winning as few as 200 electoral votes at one extreme and as many as 370 at the other.  The higher the line on that chart, the more simulations there were that resulted in that electoral count outcome.

The highest line is at 332.  20% of the simulations showed this count.  This is EXACTLY the count that Obama got.  Wowzers!  The second highest line is at 303, which basically gives Florida to Romney.  That outcome occurred in 16% of the simulations.  If you take the average of all the outcomes, that gives the expected outcome of 313.

This is all neat, but one problem with Monte Carlo simulations is that the mode or mean (depending on how you set it up) is already set BEFORE you run the simulations.  That is, it will be determined by the non-random inputs you put into the model.
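
Here is a small Python sketch of that point, under some loud assumptions: three hypothetical states with made-up electoral votes and poll margins, clear leads in each, and independent random noise.  With those inputs, the most common simulated total is simply the total you get by handing each state to its poll leader.  In a real model with close states or correlated errors, the mode does not have to line up this neatly.

```python
import random
from collections import Counter

# Hypothetical inputs (not real data): state -> (electoral votes, polled margin for A).
INPUTS = {"P": (20, 5.0), "Q": (11, 4.0), "R": (6, -5.0)}

def simulated_totals(inputs, n=5000, sd=3.0, seed=1):
    """Return candidate A's electoral total in each of n simulated elections."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n):
        # A state goes to A when its polled margin plus random noise is positive.
        total = sum(ev for ev, margin in inputs.values()
                    if margin + rng.gauss(0.0, sd) > 0)
        totals.append(total)
    return totals

totals = simulated_totals(INPUTS)

# The mode: the most frequent simulated total.
mode_total = Counter(totals).most_common(1)[0][0]

# The no-simulation answer: give each state to whoever leads in the polls.
poll_total = sum(ev for ev, margin in INPUTS.values() if margin > 0)

print(mode_total, poll_total)
```

Both numbers come out the same here because the random noise is centered on the poll margins, so the inputs decide where the peak of the histogram lands.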

The biggest inputs into Nate's model were poll results.  As each newer poll gave more up-to-date information, Nate updated his model.  And guess what!  The mode (highest point) of his simulation outcomes MATCHED the polls.

In other words, it wasn't Nate Silver who predicted the outcome; it was the POLLS.  Nate only compiled the information with neat bells and whistles, and passed that information to us, the consumers.

Now, as a mathematician, I totally ate up Nate's predictions.  I love the stats that he compiled, and how he showed which states had the biggest probability of switching to the other side, and how much sway each individual voter had on the election.  That's all cool stuff.  Yet, that's not what he's getting press about.  Rather, it's the fact that he correctly predicted all 50 states.

But he didn't do that.  I'll say it again ... it was the POLLS.

If you don't believe me, follow this simple exercise.

First go to Nate's site and scroll down to the map.  Look at his predictions.

Next go to CNN (or your favorite news source) to check out the actual results.  Compare the maps and say, "Ooh Ahh!" over Nate's predictions as you confirm that every single one of the states matches.

Next go and view CNN's own projections from a couple of weeks before the election.  These predictions come from poll data plus whatever cool math tricks they use.  Just in case the link is broken, I'll point out that their projection excludes eight "battleground" states: NV, CO, IA, WI, OH, VA, FL, NH.  Now, if you compare CNN's projections in the remaining 42 states, you'll see that they accurately predict each one.

And finally, visit my newly found actuary friend's site.  His name is Patrick.  At the top, you'll see his projections.  All fifty states matched perfectly.  Not only that, scroll down to his histogram, and you'll see that it's very similar to Nate's.  The mode is 332.  Next comes 347 (most likely giving NC to Obama) and in third place comes 303 (giving FL to Romney).

I've been enjoying my friend's site and eating up his stats and comparing to Nate's stats.  But while Nate's in the spotlight, my friend (who doesn't work for a major newspaper) isn't.

Now that you've compared the results of three different projections, what do all three of these have in common?

Amazing accuracy in predicting the actual outcome, and a reliance on POLLS.

While Nate and Patrick and others should continue to do what they do and produce cool stats, let's give credit where credit is due and let's acknowledge where the true predictive power comes from: the POLLS.

Update 11/12/2012:  After having seen videos of Nate Silver after the election and seeing that he is not gloating, I would like to add that Nate appears to be a wonderful person and a professional mathematician.  Please note that I do not believe that Nate made any errors in his projections, nor did he use faulty methods.

It's just that the one feat of predicting the point estimate of each of the 50 states was actually the easiest thing to predict, as the polls already had this covered.  Anyone could have taken the average of the latest major polls in each state, added up the electoral votes, and arrived at the same "correct" prediction.  That part isn't particularly impressive.
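
That poll-averaging recipe is short enough to sketch in a few lines of Python.  The states, electoral votes, and poll margins here are hypothetical placeholders, and a plain unweighted average is my assumption; it is not how Nate or anyone else necessarily weights polls.

```python
from statistics import mean

# Hypothetical data: state -> (electoral votes, recent poll margins for candidate A).
# Positive margins mean A leads.  None of these numbers are real polls.
POLLS = {
    "State X": (29, [1.0, -0.5, 2.0]),
    "State Y": (18, [3.0, 2.5, 4.0]),
    "State Z": (15, [-2.0, -3.5, -1.0]),
}

def point_estimate(polls):
    """Call each state for the leader of its poll average and total A's electoral votes."""
    calls = {}
    total_a = 0
    for state, (ev, margins) in polls.items():
        winner = "A" if mean(margins) > 0 else "B"
        calls[state] = winner
        if winner == "A":
            total_a += ev
    return calls, total_a

calls, total_a = point_estimate(POLLS)
print(calls)    # which candidate each state is called for
print(total_a)  # candidate A's electoral votes from the called states
```

No random numbers are involved at all.  The state-by-state calls fall straight out of the poll averages.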

But if you haven't already, I invite you to dig into Nate's blog and look at all the other cool stats.  It's very difficult to gauge the accuracy of what "could have been" but you can be impressed with all the complexity and coolness.

And then visit my friend Patrick's blog, whose model isn't quite as complex, but is nonetheless pretty cool.
