As part of my poll-obsessing, I finally checked out fivethirtyeight, recommended to me by Alex. The short version is that Nate Silver, the site’s author, is also a leader of Baseball Prospectus. He is credited with creating the very powerful PECOTA system, which rethinks baseball statistics (mostly through pure intelligence, though some of the math exceeds the AD&D level) and in the process creates a much better explanatory and predictive tool. (It also played no small part in making fantasy baseball popular, and even in helping baseball make a comeback when people thought the fast-paced, pre-felonious NBA was going to surpass America’s pastime.)
fivethirtyeight is, and I don’t think this is oversimplifying, doing for political polling what Silver did for baseball stats: finding truths by refining, critiquing, and improving simplistic polling data. Today’s post on the site produced one of those aha moments:
I have gotten an increasing number of questions about the GWU/Battleground Poll, which presently gives John McCain a 2-point national lead, even as essentially every other current national poll shows Barack Obama with a lead of at least 5 points.
Just because a poll is an outlier doesn’t necessarily mean that it’s doing something wrong. Pollsters may have legitimate reasons for having a different perspective on the election, and they may also occasionally produce odd results due to chance alone.
In this case, however, the poll seems to be making a relatively fundamental mistake: it is not weighting by age.
For months, I’ve been wondering why the hell some polls have been reporting a neck-and-neck race while others show Obama steadily gaining ground. (Even stranger: why on earth is the always admirable John McCain pulling such silly stunts, throwing Hail Marys, if it’s a dead heat?) Finally, someone explains it, and oh, how bizarrely simple it turns out to be.
For those who are curious, here’s the age breakdown of the Battleground poll in question:
18-34 17%
35-44 12%
45-64 40%
65+ 31%
Compared to the US Census/2004 election data:
18-34 26%
35-44 17%
45-64 38%
65+ 19%
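To put a number on the gap, here is a rough sketch of what weighting by age would do with the two breakdowns above. The function and variable names are mine, not the pollster’s or fivethirtyeight’s, and I’m assuming the simplest possible scheme: each age group gets scaled by its Census share divided by its share of the poll. Real pollsters weight on more than one variable, so treat this as an illustration only.

```python
# Age shares in percent, copied from the two listings above.
poll_sample = {"18-34": 17, "35-44": 12, "45-64": 40, "65+": 31}
census_2004 = {"18-34": 26, "35-44": 17, "45-64": 38, "65+": 19}


def age_weights(sample, target):
    """Weight per age group: target share divided by sample share."""
    return {group: target[group] / sample[group] for group in sample}


def reweighted_share(support_by_group, target):
    """Topline support once each group counts by its target share.

    support_by_group is the fraction of each age group backing a
    candidate. The post does not give those figures, so this function
    is defined here but not called with real numbers.
    """
    total = sum(target.values())
    return sum(support_by_group[g] * target[g] for g in target) / total


weights = age_weights(poll_sample, census_2004)
for group, weight in weights.items():
    print(f"{group}: {weight:.2f}")
# 18-34: 1.53
# 35-44: 1.42
# 45-64: 0.95
# 65+: 0.61
```

In other words, under this assumed scheme every 18-34 respondent would count about one and a half times, and every 65+ respondent only about six-tenths of a time.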
Pretty clear. This poll massively overrepresents older voters, who in local polling have been averse to Obama for a variety of reasons, and massively underrepresents the younger voters whom Obama has targeted in campaign activities and who are likely to respond to the post-Baby Boomer voice he’s cultivated.
So simple, no math. I can’t tell if I’m impressed by the baseball-stats freaks or disgusted by the innumeracy of the media, or even of literate, newspaper-reading people.