Also the US census is inaccurate because not everyone fills those papers out. There is no way that the Census Bureau can keep track of 100% of the people in the US at one time. That's impossible. Flawed results based on incomplete data do not equal statistical fact.
In response to your specific concerns, read Chapter 8, Page 19 of Summary File 3: 2000 Census of Population and Housing Technical Documentation (Page 951 in a PDF viewer). This section specifically addresses nonresponse (failure to get a questionnaire back; examples would be transient workers, seasonal workers, homeless people, etc.), respondent and enumerator error (a respondent failing to understand a question on the questionnaire, someone going door to door misreading the question or misunderstanding the response, people lying on the questionnaires) and processing error (someone types things in wrong).
They reduced the error in the following ways:
- Questionnaires were placed in public places for people to pick up.
- Introductory letters and post cards were sent.
- The introductory letter included information for requesting your questionnaire in Spanish and many other languages.
- A toll-free phone number was set up to clarify questions and help people find census forms.
- Local officials were allowed to raise concerns about the census before it was administered and they would be addressed.
- Page 21 of Chapter 8 explains how multiple answers from the same location are resolved. For example, someone who mailed in the questionnaire that was mailed out to them and also picked up another one at the post office would be filtered correctly. This filtering is done by a computer. It increases the accuracy on one level but may also decrease it on another. The error introduced by this filtering technique is calculable, however.
A statistical survey does not need to ask everyone in the whole entire world, country, state, etc., to be accurate. That is what random sampling and large sample sizes are for. Because of the sampling techniques used to sample residents, there is no sampling error. There is only the non-sampling error I mentioned above. This is more pronounced over small populations (Alaskan villages) and basically a non-issue over large areas, for example, the nation. So when the Census says on average there are 2.59 people living in every residential unit in the US, it is very accurate to say that there are 2.59 people in every house. The statistic that 6.9% of all Americans who owned their own house in the year 2000 were over the age of 65 is very accurate. Both of these statistics came from the 2000 Census. The next couple are made up. When we say the average number of people in every house is 2.60 in Georgia, 2.58 in Atlanta, and 2.44 in West Atlanta, we get increasingly less accurate because of the smaller number of people polled and the filtering techniques mentioned above. The cutoff for accurate data seems to be about 400 people, if I'm reading this correctly. So you'll still get decent data for Atlanta; it just won't be as accurate as the national data.
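If you want to see the "fewer people polled means less accurate" effect for yourself, here is a rough sketch. It is not the Census Bureau's method, just a generic simulation. The household-size weights are invented for illustration, and `spread_of_sample_means` is a helper name I made up. It repeatedly draws random samples of different sizes from a pretend population and measures how much the sample average wobbles.

```python
import random
import statistics


def spread_of_sample_means(population, sample_size, trials, rng):
    """Standard deviation of the sample mean across repeated random samples."""
    means = [
        statistics.fmean(rng.sample(population, sample_size))
        for _ in range(trials)
    ]
    return statistics.stdev(means)


rng = random.Random(2000)  # fixed seed so the run is repeatable

# Hypothetical household sizes (1 to 6 people); the weights are made up.
population = rng.choices(
    [1, 2, 3, 4, 5, 6], weights=[26, 33, 17, 14, 7, 3], k=200_000
)

# Measure the wobble of the average for small, medium and large samples.
spreads = {
    n: spread_of_sample_means(population, n, trials=200, rng=rng)
    for n in (40, 400, 4_000)
}

for n, spread in spreads.items():
    print(f"sample of {n:>5}: average typically off by about {spread:.3f} people")
```

The wobble shrinks as the sample grows, which is the same reason a figure for the whole nation is steadier than one for a single neighborhood.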
Statistics derived from this data are less accurate. The way you can tell the difference is that data from the Census Bureau will be cited as coming from the Census Bureau. Here is an example of a derived statistic. The Census Bureau put the US population at 308,851,000 in 2010 (not from the census itself). Using this number and the 6.9% measured in 2000, if we make the assumption that the populations did not change that much over 10 years, we can say 21,310,719 old people own homes. Besides the fact that mixing new data with old isn't a good idea, we just introduced error into the data.
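To make the derived-statistic example concrete, here is the multiplication as a small sketch. The two inputs are the figures quoted in the post. The "what if the share drifted half a point" part is purely my own assumption, there to show how an error in the borrowed 2000 percentage carries straight through to the derived number. `derived_count` is a made-up helper name.

```python
POPULATION_2010 = 308_851_000  # 2010 population figure quoted in the post
SHARE_2000_TENTHS = 69         # 6.9%, stored in tenths of a percent for exact math


def derived_count(population, tenths_of_percent):
    """Apply a percentage (given in tenths of a percent) to a population."""
    return population * tenths_of_percent // 1000


estimate = derived_count(POPULATION_2010, SHARE_2000_TENTHS)

# Hypothetical: suppose the real 2010 share was half a point lower or higher.
low = derived_count(POPULATION_2010, SHARE_2000_TENTHS - 5)
high = derived_count(POPULATION_2010, SHARE_2000_TENTHS + 5)

print(f"derived estimate: {estimate:,}")
print(f"if the share drifted by 0.5 points: {low:,} to {high:,}")
```

A half-point drift in the old percentage moves the derived figure by about a million and a half people in either direction, and nothing in the derived number itself tells you that.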
I'm sure you know the census is used to determine the number of representatives that each state gets, how much financial aid for school districts and roads each state gets, etc. This means that politicians have really good reason to make the statistics as accurate as possible. They want their states to have more say on issues so that legislation their state wants has a higher chance of being passed. They want their states to get as much financial aid for schools and roads as possible. Every question has to be reviewed and approved by Congress. The error control is very tight. The error control is transparent. The Census Bureau published the document above so that you can calculate the non-sampling error. There is no single non-sampling error that applies to every statistic, which is why no such figure is published. It depends on many things, like the percent error you consider acceptable. Whether you want that data to apply to 99% of Alaskan villagers or 99.999% of Alaskan villagers will make a big difference.
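Here is a rough illustration of why the 99% versus 99.999% choice matters. This is the textbook normal-approximation margin of error, not the procedure from the Census documentation, and the inputs (a spread of 1.4 people per household, 400 households) are numbers I made up for the example.

```python
import math
from statistics import NormalDist


def margin_of_error(std_dev, sample_size, confidence):
    """Half-width of a two-sided normal-approximation confidence interval."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * std_dev / math.sqrt(sample_size)


# Hypothetical inputs: 400 households, household sizes spread by about 1.4 people.
m99 = margin_of_error(std_dev=1.4, sample_size=400, confidence=0.99)
m99999 = margin_of_error(std_dev=1.4, sample_size=400, confidence=0.99999)

print(f"99% confidence:     plus or minus {m99:.3f} people")
print(f"99.999% confidence: plus or minus {m99999:.3f} people")
```

Same data, same sample, but the more certainty you demand, the wider the band you have to accept.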
The statement "There's only three kinds of lies: lies, damn lies and statistics," is intended to be read as a tongue-in-cheek comment about misrepresenting and misinterpreting statistics and a warning to statisticians not to try to make conclusions that they can't. Statistics are a tool. Cars don't kill people, drivers do. Statistics don't lie, people do. You can use statistics to tell a lie but you can also use them to make very accurate predictions and assessments of populations.
I've really had it with all the statistics trolls on this forum. First Lord Tony, now you. Would you believe me that the gravitational constant is 6.67300 × 10⁻¹¹ m³ kg⁻¹ s⁻²? Would you believe me that we determined this by measuring it? Would you believe me that that constant was arrived at because the experiment was repeated multiple times and then a statistical method was applied to determine if these results were statistically significant or just random error? Did you know that the reason the supermarket you shop at does not instantly explode into a huge inferno and kill everyone inside just because someone crashes their car into the front of it and it catches on fire is that Underwriters Laboratories tested all of the wall systems multiple times, statistically analyzed the results and then determined that the wall system could resist a certain amount of heat for a certain amount of time, and that this 10 foot stretch of wall would also scale for a 1 foot, 100 foot, or even 1000 foot stretch of wall, without burning up every single wall on the face of the Earth? Did you know that the model for the wingspan and airspeed needed to lift a given weight was derived from statistical analysis of data gathered from experimentation? They didn't test every single possible weight and wingspan combination. Did you know this is why your plane doesn't crash into the fence at the end of the runway, because of some flimsy statistical process that someone used to analyze the change in data for different shapes and lengths to determine a formula to figure out how long to make the wings? Did you know that voting for the President of the United States is a statistical operation, that an assumption of who should be President is made from a set of statistics gathered by polling nearly everyone, even though not everyone voted? Every citizen in this nation is affected by the President's decisions and should have a say in who should be President. Maybe everyone who didn't vote would have voted for McCain and he would have won! Fortunately we do not have to poll everyone in the United States because we know that if we give everyone an equal opportunity to vote we will get enough responses that we can make a very accurate statistic about who should be President.
Now that you know this, are you afraid of standing inside of buildings, or that airplanes will magically drop out of the air, or that gravity will suddenly stop working? Are you going to challenge the legitimacy of voting on the principle that it is a statistical process, that not 100% of people were measured, and that it is therefore obviously not an accurate assessment of the true opinion of the United States?
tl;dr: If statistics were not accurate we would not use them. There are a variety of error control methods that can be used. Everything is based on statistics. Any count of people, wages, GDP, opinion poll, etc., is a statistic. The Census Bureau statistics are very accurate, accept them already and move on. You are welcome to continue to live in ignorance and not accept the validity of statistics, but please understand that an employer determines your wages on a set of statistics, exchange rates and the prices of goods are based on statistics, votes are based on statistics and everyone else seems to believe in those. Some of them even pull information from the census (for example, wages)!