Friday, February 5, 2010

Understanding Polls

Maurice Ferre, former Mayor of Miami and declared candidate for the Democratic nomination for the U.S. Senate seat from Florida now occupied by George Lemieux, “is touting poll results that show he does almost as well as Meek when matched up against likely Republican candidates.”  The poll, conducted by Fabrizio, McLaughlin & Associates, posed hypothetical matchups pitting Governor Crist and Speaker Rubio against either Congressman Meek or Mayor Ferre. In each pairing Ferre fared about as well as Meek – in Ferre’s words, the two are “virtually tied.”



“If the election for U.S. Senator were held today and the candidates were (ROTATE) Charlie Crist, the Republican and Kendrick Meek, the Democrat, for whom would you vote?”

             All Voters   GOP   Democrat   Independent
Crist            47        69      26          55
Meek             29        10      53          17
Undecided        24        22      21          28

“If the election for U.S. Senator were held today and the candidates were (ROTATE) Charlie Crist, the Republican and Maurice Ferre, the Democrat, for whom would you vote?”

             All Voters   GOP   Democrat   Independent
Crist            49        68      31          54
Ferre            27         9      49          19
Undecided        24        23      20          27

“If the election for U.S. Senator were held today and the candidates were (ROTATE) Marco Rubio, the Republican and Kendrick Meek, the Democrat, for whom would you vote?”

             All Voters   GOP   Democrat   Independent
Rubio            42        75      12          45
Meek             30         5      60          16
Undecided        28        20      28          39

“If the election for U.S. Senator were held today and the candidates were (ROTATE) Marco Rubio, the Republican and Maurice Ferre, the Democrat, for whom would you vote?”

             All Voters   GOP   Democrat   Independent
Rubio            43        75      16          42
Ferre            27         4      54          20
Undecided        29        21      31          38

Ferre is, more or less, right. The difference between Meek and Ferre is about two percentage points, and that is less than the margin of error (3.5%). This was a poll of “800 likely voters polled between Jan 27, 2010 and Jan 28, 2010.”


I like this poll quite a bit (from what I can tell here) – and not for the results. But some things to note for consumers of polls (this will be old hat for my faculty friends, connoisseurs of Nate Silver’s website, and (hopefully) my former students): polling largely comes down to achieving a “good” sample of the population, asking fair questions, and making reasonable interpretations of the data.

Is this a good sample? Well, the population under study is the people who will vote in the general election in November 2010. This poll uses “likely voters” – which is, by and large, a euphemism for “voted in the last election.” Since past behavior is a pretty good indication of future behavior, the likely-voter screen is the most common route here. When elections suggest (say, as they did in ’08 with Obama) that new voters could be important (voters who were under the age of 18 in 2004 could not be counted among “likely voters” in 2008), some adjustments need to be made. These adjustments occur through what is called “weighting.” Weighting the data means applying different multipliers to sub-samples of the data when making projections about the whole. This is done when the sub-sample numbers do not reflect the share of that group expected in the full population under study. So, for example, if we expect that 20% of the electorate in November will be Cuban, and our Cuban sub-sample is only 10% of the sample, we effectively double-count those responses to account for the small sub-sample. Alternatively, if the sample wound up being 60% women, and we expected women to be 50% of the voting population, we would discount that sub-sample by 1/6. What all of this should say is that it is really important to get the sample right. And by “right” we mean free from systematic bias or error.
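The two weighting examples above can be sketched in a few lines of Python (my own illustration of the arithmetic, not the pollster’s actual procedure):

```python
def weight(sample_share, expected_share):
    """Multiplier applied to each respondent in a subgroup so that the
    subgroup contributes its expected share of the electorate, rather
    than its (over- or under-represented) share of the sample."""
    return expected_share / sample_share

# Cubans are 10% of the sample but an expected 20% of the electorate:
print(weight(0.10, 0.20))  # 2.0 -- each response effectively counted twice

# Women are 60% of the sample but an expected 50% of the electorate:
print(weight(0.60, 0.50))  # about 0.833 -- discounted by 1/6
```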
If we polled people only at night, individuals who work at night would not be part of the sample – AND – if people who work at night vote differently than people who work during the day (or don’t work), we would have a problem. Thus we typically want what is called a “random” sample. To get a perfectly random sample, we would have a giant database of all possible voters, use a random number generator to pick the “right” names, and then contact those people. We can’t really do that – but we can do things like “RDD,” random digit dialing, to call people. We can then ask qualifying questions, such as “did you vote for any Presidential candidate in 2008?” and include only the “yes” answers in the poll. I can’t tell here exactly how this survey crew randomized their sample, but there are some other indicia of quality here.

Notice in the question wording the insertion of (ROTATE)? That means the person asking the question would change around the order of the names in the question. This is one way to guard against things like “response set bias” – the tendency of individuals to answer in a particular way, largely driven by the framing of the question. This is an issue with referenda on ballots; voters tend to vote “yes” more often than “no” – regardless of the substance. So the fact that the polling group rotated the order here is a pretty good sign that the firm is thinking about the right issues.

I also like the large number of individuals polled (800) and the low margin of error (3.5%). The sample size and the margin of error work together (along with the confidence level, which is not reported here, but the social science standard is 95%). If we hold one of the three constant (sample size, margin of error, confidence level), the other two are linked. That is to say, if we keep the confidence level the same, increasing the sample size (to a point) shrinks the margin of error.
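The sample-size/margin-of-error relationship can be checked with the standard worst-case formula (my illustration; 1.96 is the z-score corresponding to the 95% confidence standard mentioned above):

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Worst-case margin of error for a sample of n: z=1.96 gives 95%
    confidence, and p=0.5 maximizes p*(1-p), the most conservative case."""
    return z * math.sqrt(p * (1 - p) / n)

# 800 respondents yields the reported ~3.5% margin of error:
print(round(margin_of_error(800) * 100, 1))   # 3.5
# Quadrupling the sample only halves the margin of error:
print(round(margin_of_error(3200) * 100, 1))  # 1.7
```

The diminishing returns are why most statewide polls stop at a sample in the high hundreds: each extra point of precision costs roughly four times the interviews.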
What all of this means is that if we conducted the poll twenty times, using the same randomization techniques and the same questions, we would expect that 19 out of 20 times (95%), the estimates of candidate support would vary by no more than 3.5 percentage points from the estimates listed above. So we can be reasonably sure that there is very little difference between Meek and Ferre in how they poll against either Crist or Rubio. That might not be enough confidence (or might be too much margin of error) if one were experimenting with a new drug for organ rejection in transplants, but if one is simply trying to figure out how to run a political campaign (and appeal to donors), these results are pretty good for Ferre.
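The “19 out of 20” interpretation can be checked with a toy simulation (my own illustration, assuming a hypothetical “true” support level of 47%, matching Crist’s topline above):

```python
import math
import random

# Repeat the poll 2,000 times against an assumed true support of 47% and
# count how often the sample estimate lands within the margin of error.
random.seed(1)
true_p, n, z = 0.47, 800, 1.96
moe = z * math.sqrt(true_p * (1 - true_p) / n)

trials = 2000
inside = 0
for _ in range(trials):
    # One simulated poll: 800 independent respondents.
    estimate = sum(random.random() < true_p for _ in range(n)) / n
    if abs(estimate - true_p) <= moe:
        inside += 1

# Roughly 19 polls in 20 should land inside the margin of error.
print(inside / trials)
```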