Even some election observation experts don't believe in sampling voting stations for a quick count or PVT. They think it is better to take all the voting stations and do away with the risk of sampling altogether. This seems like common sense, but it is flawed.
According to Vote Count Verification: A User’s Guide For Funders, Implementers, And Stakeholders, Democracy International, 2011:
Although international and domestic groups have conducted sample-based PVTs in dozens of countries since 1988, PVTs have sometimes drawn controversy in some quarters of the international community. National election authorities, foreign aid officials, and technical advisers have sometimes questioned the feasibility and accuracy of a vote count verification exercise based on statistical sampling, even though the use of statistical sampling in polling and research is widely accepted among social scientists, media organizations, public opinion researchers and politicians around the world. They also worry that a separate, unofficial vote projection that diverges from the official count might foment postelection unrest.
Misgivings among election authorities and national political elites about the purposes and methodology of PVTs are not surprising. Election authorities rarely like the idea of independent organizations, domestic or foreign, threatening to second guess the official results or offering their own reports of the election outcome. Foreign involvement in such exercises can also be seen as a threat to local sovereignty or hurt national pride because it seems to imply that national authorities require international oversight.
The reason that collecting data from all the units (a census) might not give results as reliable as collecting data from some of the units (a sample) is the vastly larger scale of operation required for the former. This is a well-known fact in the census and survey community. Even the seemingly simple and routine tasks of collecting vote count results from the voting stations, transmitting them to headquarters, and tabulating the results are no exception to this rule.
The critically important transitional elections in Indonesia in June 1999 produced considerable controversy among both domestic and international actors.
In response to substantial public mistrust of the official election authorities, a coalition of Indonesian universities called the Rectors' Forum, with advice from NDI, proposed a sample-based PVT.
... Apparently, for the first time, however, development agency officials and technical advisers questioned the intellectual basis of a sample-based PVT. In particular, some PVT critics questioned the PVT's reliance on statistics. They claimed, incorrectly, that random statistical sampling would not work in the absence of extensive baseline demographic data or could not be used for proportional representation elections. This was a fundamental misunderstanding of the principles of statistics.
Yet because of these unfounded concerns about a sample-based PVT, many Indonesian election and government officials, a number of foreign technical advisers, and some development agency officials initially opposed the PVT. Some urged instead that an independent vote tabulation should consist of a comprehensive PVT, which would attempt to collect all the results from several hundred thousand polling stations in the country, much as NAMFREL had attempted to do in the Philippines in 1986.
Subsequently, key international actors organized an unofficial comprehensive count in Indonesia, called the Joint Operations Media Center (JOMC). It was organized on behalf of the Indonesian election commission with funding and technical assistance from American, Australian, and Japanese organizations and the United Nations Development Program (UNDP). Before the election, one of the international organizers promised a “facility . . . capable of reporting reliable results of the elections at the earliest practical moment.”
The JOMC's spokesperson told the media he hoped that 50 percent of the results would be known by the day after polling.
... The JOMC was ultimately unable to collect meaningful results. By the morning after election day, it was reporting less than 1/4 of 1 percent of the vote, a meaningless number. Even by three days after the elections, the JOMC could report only 7.8 percent of the vote count, still too small to support any conclusions about the outcome of the elections. ... Rather than reassuring Indonesians and the international community about the integrity of the vote count, the JOMC parallel count actually undermined confidence by raising expectations that it could not meet. Both the sample-based PVT and the comprehensive JOMC ultimately failed to build confidence in the integrity of the reported election results.
Leaving aside the complex issues of PVT vs. exit polls, sample PVT vs. comprehensive PVT, and vote count verification in general, you may like to relax for a moment and have some fun playing around with the sample size for a PVT using real-life voting data. You can do that with what is known as computer simulation. You can read up on the rationale, the philosophy, and all the nice and impressive things about simulation later, if you like (pardon me, I didn't).
To start, you will need a bit of knowledge about using computers. I assume that you have installed R on your computer and know how to run a script file with it. If you haven't installed the simFrame package, install it.
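If the packages aren't installed yet (simFrame here, plus hexbin, which comes up later for the plots), a one-time setup along these lines does it, using the standard install.packages route:

```r
## Install simFrame (and hexbin, used later for the plots) if missing
for (pkg in c("simFrame", "hexbin")) {
  if (!requireNamespace(pkg, quietly = TRUE)) install.packages(pkg)
}
```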
As for the data, download the precinct-level 2012 US elections data for Texas from the Harvard Elections Data Archive on Harvard Dataverse. You can download the data file in tab-delimited text format, R data format, or the original Stata file format. Unfortunately, the R data file doesn't work; the Stata file is fine. I don't know for sure whether precinct-level election data is the same thing as voting-station-level election data. I assumed it is, and for the purpose of our exercise no harm is done if the two are not exactly equal.
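To read the Stata file, the foreign package that ships with R works. A minimal sketch follows; the file name is only a placeholder for whatever your Dataverse download is actually called, and I'm assuming the total-votes column g2012_USP_tv can be used to drop the zero-vote precincts:

```r
library(foreign)   # ships with R; read.dta() reads Stata .dta files

## "tx_2012.dta" is a placeholder; use the name of your downloaded file
tx <- read.dta("tx_2012.dta")

## Keep only the precincts that recorded any votes
tx <- tx[tx$g2012_USP_tv > 0, ]
nrow(tx)   # 8674 for the Texas 2012 file (8952 precincts minus 278 with 0 votes)
```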
The handbook for quick count/PVT by NDI mentioned in my previous post gives a detailed description of how to determine the sample size. The report by the Committee for Free and Fair Elections in Cambodia (COMFREL), Parallel Vote Tabulation Through Quick Count for 2008 National Assembly Elections, October 2008, shows that it followed the NDI approach. Among other resources, the ACE encyclopedia (version 1.1) notes: "On the whole and probably in a rather random way, one might say that there is an inclination towards doing quick counts on 10% of the population in the case of transition elections (e.g., Chile in 1988, Panama in 1989 and Bulgaria in 1990)." The Handbook for Domestic Election Observers by OSCE/ODIHR, 2003, observes similarly: "Experience shows that where there is little demographic data and the population is quite diverse, the tendency is to use a relatively large sample, such as 10 per cent of polling stations. Where the opposite is true, a smaller sample can be used and provide sufficiently credible and accurate results for national elections."
In the methodology note of its report Pakistan General Elections 2008: Election Results Analysis, the Free and Fair Election Network explains:
Experience with past PVTs has shown that drawing a sample of 25-30 polling stations provides sufficient data, within a relatively small margin of sampling error, to assess the reasonableness of official election results. Adding additional polling stations to the sample, even when the number of total polling stations is large, does not improve the margins of sampling error dramatically.
The reason for this statistical principle is that a PVT works with “cluster samples” – each polling station “cluster” averages 1,000 registered voters, and 25 polling stations in a constituency produces a sample of 25,000 voters (25 polling stations x 1,000 voters each) which is much more than statistically sufficient to permit comparisons with official results.
... As part of the world’s largest PVT, almost 16,000 Polling Station Observers (PSOs) from the Free and Fair Election Network (FAFEN) witnessed and recorded the actual vote count in a statistically valid sample of 7,778 randomly-selected polling stations during the 2008 Pakistan National and Provincial Assembly Elections. The national sample of 7,778 polling stations represented almost eight million registered voters.
Ordinary people, and even some experts, find it hard to believe that taking 25 or 30 voting stations out of the large number in a constituency would give a good enough estimate of the true voting results. For our exercise we have downloaded the Texas data for the 2012 elections. It includes data for 8,952 precincts, of which 278 have 0 votes. It covers election results for U.S. President and for the U.S. and state House of Representatives and Senate. For this exercise you will take the votes for President.
Here's how you could play around with the sample size for a PVT. Take a simple random sample of 25 precincts out of the 8,674 with any votes. Then total up the votes for "g2012_USP_dv" (Democratic votes), "g2012_USP_rv" (Republican votes), and "g2012_USP_tv" (total votes) in this sample. Then estimate the Texas totals by scaling the sample totals up, that is, multiplying each by 8674/25.
Theoretically you would want to do this for an infinite number of samples. Obviously you can't. As someone said, running 10,000 samples won't hang your computer and is about as close to infinity as you can comfortably get. So you run the simulation with 10,000 samples. Finally, you estimate the total votes for Texas by taking the mean of the estimates from the 10,000 samples. Then you can compare these with the known results for Texas to see how accurate they are.
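The whole procedure can be sketched in a few lines of base R before bringing in any package. This is only a sketch, not my actual script: since the Texas file isn't bundled here, it generates made-up stand-in counts for the 8,674 precincts; with the real data you would use the g2012_USP_dv and g2012_USP_rv columns instead.

```r
set.seed(2012)

## Synthetic stand-in for the 8,674 Texas precincts with any votes;
## with the real data, dv and rv would be the g2012_USP_dv and
## g2012_USP_rv columns of the downloaded data frame.
N <- 8674
dv <- rpois(N, lambda = 380)   # made-up Democratic votes per precinct
rv <- rpois(N, lambda = 520)   # made-up Republican votes per precinct

n <- 25       # precincts per sample
k <- 10000    # number of simulated samples

## For each sample, expand the sample totals by N/n to estimate state totals
est <- t(replicate(k, {
  idx <- sample(N, n)          # simple random sample of precincts
  (N / n) * c(dem = sum(dv[idx]), rep = sum(rv[idx]))
}))

## Average the 10,000 per-sample estimates and compare with the truth
sim_tot  <- colMeans(est)
true_tot <- c(dem = sum(dv), rep = sum(rv))
round(100 * sim_tot / true_tot, 2)   # accuracy in percent, very near 100
```

Both accuracy figures come out very close to 100 percent, which is the whole point: 25 precincts per sample, averaged over many samples, recover the totals almost exactly.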
I did that with the simFrame package, whose SampleControl and runSimulation functions take care of the repeated sampling and estimation. Running it on the Texas data, you should get results close to these:
(i) For total votes
Vote_For SimulatedTotVotes TrueTotVotes AccuracyPercent
1 Democrats 3302674 3307609 99.85
2 Republican 4562952 4568788 99.87
3 Total 7986507 7997303 99.86
(ii) For percentage of total votes
Vote_For SimulatedPCVotes TruePCVotes AccuracyPercent
1 Democrats 41.35 41.36 99.98
2 Republican 57.13 57.13 100.00
In sampling terms, a PVT consists of a sample of clusters (the voting stations). When the clusters differ greatly in "size", the precision of the estimates suffers. Stratifying the voting stations by "size" and sampling independently within each group (stratum) can improve the precision of the vote count estimates.
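To see what stratification buys when vote counts are strongly tied to precinct size, here is a small made-up illustration (everything in it is synthetic; note that with equal-sized strata, taking equal numbers per stratum amounts to proportional allocation):

```r
set.seed(1)

## Synthetic precincts whose vote counts grow with precinct "size"
N <- 9000
size  <- round(exp(rnorm(N, 6, 0.8)))   # registered voters, skewed
votes <- rpois(N, 0.6 * size)           # turnout tied to size

n <- 30
k <- 5000

## Plain SRS of 30 precincts, expanded by N/n
srs_est <- replicate(k, (N / n) * sum(votes[sample(N, n)]))

## Stratify by size tertile and take 10 precincts from each stratum
stratum <- cut(size, quantile(size, 0:3 / 3),
               include.lowest = TRUE, labels = FALSE)
idx_h   <- split(seq_len(N), stratum)
str_est <- replicate(k, sum(sapply(idx_h, function(idx)
  (length(idx) / 10) * sum(votes[sample(idx, 10)]))))

## The stratified estimates are noticeably less spread out
c(srs_sd = sd(srs_est), strat_sd = sd(str_est))
```

The spread (standard deviation) of the stratified estimates comes out visibly smaller than that of the plain SRS estimates, while both stay centered on the true total. Whether the Texas precincts show a strong enough size effect for this to pay off is what the scatter-plots below probe.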
I guess one way to look into this with our Texas data is to draw a scatter-plot with the ratio of Republican votes to Democratic votes on the y-axis and total votes on the x-axis. We can then see whether this ratio changes with the "size" (number of voters) of the precincts.
Here's the scatter-plot:
The same scatter-plot done with the package "hexbin" is here:
Note that both have a regression line drawn on the graph. From these two graphs, I guess, I can make out that stratification would not be very effective in this situation. I also have a hunch that plain systematic sampling would work well here.
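For anyone who wants to reproduce the two plots, here is a sketch. The data are again a synthetic stand-in (with the real file, the y variable would be g2012_USP_rv / g2012_USP_dv and the x variable g2012_USP_tv), and the hexbin version assumes the hexbin package is installed:

```r
library(hexbin)
library(lattice)

set.seed(7)

## Synthetic stand-in for the precinct data
total <- round(exp(rnorm(5000, 6, 1)))            # total votes per precinct
ratio <- rlnorm(5000, meanlog = log(1.4), sdlog = 0.5)  # Rep/Dem ratio

## Plain scatter-plot with a least-squares regression line
plot(total, ratio, pch = ".", xlab = "Total votes", ylab = "Rep/Dem ratio")
abline(lm(ratio ~ total), col = "red")

## The hexbin version, with the regression line added in the panel
p <- hexbinplot(ratio ~ total, xbins = 40,
                panel = function(x, y, ...) {
                  panel.hexbinplot(x, y, ...)
                  panel.lmline(x, y, col = "red")
                })
print(p)
```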
Although this simulation exercise is aimed at PVT, it could be useful in helping convince the skeptics that sampling really works. In a sense, I was hoping to give young people and ordinary folks a peek at simulation, PVT, and sampling. Once they are interested, I'm sure they would like to try out the beautiful hexbin plots too.
Look for resources on simulation, PVT, and hexbin on the Web; learn more, experiment, and enjoy (more like advising myself)! And improve on my ideas and code, would you?