People who don’t trust statistics are stupid and wrong

I’m sick of the meme in our politics and culture that statistics “can’t be trusted”. You hear it from both the right and the left. It’s not only factually incorrect, it contributes to a culture of stupidity.

“Poll results don’t tell you that much, because you can manipulate a poll to give you pretty much any result you want,” says some guest on Hardball last night. In some ways, it’s not his fault that he’s parroting this talking point. It’s all over our culture, and it’s not even partisan: I hear people from all over the political spectrum complaining (when it suits them) that poll data can be deliberately twisted and biased and manipulated. People can manipulate the wording of a question to influence how people will answer, or can manipulate the way the results are reported to produce whatever impression they like.

These problems do exist. A higher percentage of people in this country believe that “abortion should be allowed in some circumstances” than describe themselves as “pro-choice”, even though some people would call these the same thing. A higher percentage of people in this country support each of the components that make up “Obamacare” than will say they support “Obamacare” itself. Wording matters, and subtle changes in wording can produce different results.

Similar problems exist in scientific measurements, as well. One group will say “The earth is cooling!” because they are measuring 10-year moving average annual mean temperatures in key locations on major landmasses, while another group will say “The earth is warming!” because they are measuring polar ocean temperatures. This kind of thing causes confusion and is a real problem that needs to be addressed.

So why am I complaining?

Because statistics never lie; people do.

Statistics are not the problem! The problem is the people who will construct a survey that asks one thing, and then report the result as if it means something else. The problem is people who will ask a complex and nuanced question like “Do you think Obama could have done a better job on some issues?” and then report the results with fireworks and bullhorns as “People disapprove of Obama!” The problem is when overly eager news anchors say that Obama is ahead of Romney in the polls 48%-47% when the margin of error on the polls is plus or minus 2 points.
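
To make that margin-of-error point concrete, here is a minimal sketch of the arithmetic. The sample size and the 48%-47% split are hypothetical, and I’m assuming a simple random sample with the usual normal approximation, which real polls only approximate:

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a proportion p from a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical poll: 2,400 respondents, candidate A at 48%, candidate B at 47%.
moe = margin_of_error(0.48, 2400)
print(f"Margin of error: +/- {moe * 100:.1f} points")  # about +/- 2.0 points

# A one-point gap is well inside a two-point margin of error, so reporting
# "A leads B 48-47" as news is an interpretation problem, not a math problem.
```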

The numbers aren’t wrong. The people are reporting them wrong.

So when somebody declares haughtily on the television set or in a news opinion piece that “statistics can be manipulated to give you any result you want”, he is wrong. He is downright factually wrong.

The numbers aren’t being manipulated; the story about the numbers is being manipulated.

In other words: The statistics aren’t the problem; the guy describing what the statistics mean is the problem.

You might say, “Well, that’s a little nit-picking, isn’t it? What’s the big difference?”

When you get masses of people believing that “statistics cannot be trusted”, you are contributing to the decay of science in our culture. You are telling people: Do not trust the scientific method. Do not trust numbers. Do not trust it when you see complex graphs that are the results of year-long studies published in scientific journals. Why? Because those are statistics… and everybody knows “statistics can’t be trusted”.

This is undermining science. And what happens when you undermine science? You get crap like this:

In these economic times the “failure” or “success” of any “financial crisis” comes not from numbers but humans. Ask yourself if you are making/taking home more money or even if you have a job?… Are your family, friends, or neighbors having a “financial crisis” or are they unemployed? Again numbers and stats can reveal anything – talk to others and you’ll find the real answer.

This is a comment that was left on an article on another website. It is a clear product of the “I don’t trust statistics!” mindset, and it is so stupid it hurts.

Really? You really think you can get a better idea about what’s going on in the country by asking a few of your neighbors?

If I judged all of Texas (much less the United States) by the neighborhood I live in, I’d think Texas was a deep blue state and mainly populated by gay men. You can’t come to conclusions about how the world works by asking a handful of people who live just outside your doorstep. That’s a deeply, deeply stupid mindset to have.
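
For what it’s worth, here is a toy simulation of why “ask your neighbors” fails: when opinions are geographically clustered, a handful of neighbors tells you almost nothing, while even a modest random sample lands close to the truth. All the numbers are invented purely for illustration.

```python
import random

random.seed(0)

# Hypothetical population of 1,000,000 people, 40% of whom hold some view.
# The view is geographically clustered: everyone who holds it sits together
# in the list, the way like-minded people cluster in neighborhoods.
population = [1] * 400_000 + [0] * 600_000

# "Asking your neighbors": 50 people from one spot in that clustered population.
neighborhood = population[:50]

# A simple random sample the size of a typical poll.
poll = random.sample(population, 1000)

print(f"True rate:           {sum(population) / len(population):.1%}")    # 40.0%
print(f"Neighborhood of 50:  {sum(neighborhood) / len(neighborhood):.1%}")  # 100.0%
print(f"Random poll of 1000: {sum(poll) / len(poll):.1%}")                 # about 40%
```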

That is why we need to fight this meme, this trope, this “common wisdom” that is infecting our culture. It’s undermining rational analysis. It’s what leads to a culture of climate change deniers and creationists, and even you liberals are contributing to that when you spout the tired old line that you “don’t trust statistics”.

Every time you hear someone say “statistics can lie!” or “I don’t trust statistics!” you have to stand up. You have to stand up, and say: You are wrong.  Statistics never lie. Statistics always correctly report the information that they are actually about.

Statistics always represent the truth about the thing they are measuring.

Your duty, as an observer, is to make sure you completely understand what they are really measuring.

Because if you don’t, then someone can lie to you about that.

Numbers don't lie




  1. Marc Burnett says:

    Science said at one time, based on the numbers, that whites were intellectually superior to minorities. Science said that whites were physically a different species than minorities. Numbers showed that higher crime rates were caused by minorities and that public aid is given to and abused by minorities as a whole.

    ALL OF THESE FACTS HAVE BEEN PROVEN WRONG! It is not just the misrepresentation of numbers, but where you look for information, what information you ignore, and the desired agenda.

    • Greg Stevens says:

      “Science said at one time based on the numbers, whites were intellectually superior to minorities.”

      This is actually an excellent example of the distinction that I was trying to make in the article between an error in the NUMBERS vs. an error in the INTERPRETATION of the numbers.

      What you are talking about isn’t an actual error in numbers per se (at least not in most cases, although there are some situations where calculations themselves have been wrong).

      Rather, what has happened is that a statistical result (e.g. “White children have higher IQs than black children when controlling for family socio-economic status”) has been interpreted to mean a particular thing (e.g. “White people are genetically smarter than black people”), which then has been shown to be an incorrect interpretation of the statistics because there are alternative explanations and confounding factors (e.g. other cultural, developmental, and opportunity-related factors) that could lead to the same result without implying a genetic or “inborn” basis.

      But the key point of my article, when I say “people who don’t trust statistics are stupid and wrong”, is that the number ITSELF WAS NOT WRONG. There is a statistical difference between white and black children’s IQs. The STATISTIC ITSELF isn’t at issue… the issue is with the interpretation of the statistic, and how one concludes one should act as a result.

      • Jordan Sharps says:

        I think people don’t trust the people gathering the data to be accurate. It isn’t just that people misinterpret statistics; the methods used during these studies are sometimes flawed or outright corrupted by bias toward a desired outcome (finding the answer you want). I think people should do more meta-analysis on studies. Before you trust any statistics, you should take the time to analyze the studies’ methods. Do the researchers provide any analysis? Do you know the identity of the organization or individuals performing the study? It is actually rude for you to make the case that people are “stupid” because they say you shouldn’t trust “statistics.” Numbers can be inaccurate; I can only agree with you that they don’t literally “lie.” People reporting or presenting the data can lie. However, statistical analysis requires intellectual honesty (in other words, do some meta-analysis).

        • Greg Stevens says:

          Thanks for your comment, Jordan! I’m sorry that you feel my title, calling people “stupid”, is rude. In some sense you’re right: it’s a bit of an exaggeration in order to convey my strong emotions about this topic.

          But if you’ll humor me for a moment, I’d like to go into why I actually feel so strongly about this. Sure, you’re right, there are specific individual things that one can and should be skeptical about whenever talking about statistics, especially statistics related to a controversial subject. We need to be aware that not all methods of collecting data are sound, we need to make sure that what is really being measured is the same as what we are being TOLD is measured, and so on. In that sense, being a vigilant consumer of statistics is very important.

          But I also think that saying “we should be vigilant consumers of statistics” is different from saying “I don’t trust statistics”. The latter is a blanket statement, and too often it is used as an a priori reason to simply dismiss any statistics the speaker does not like. As a general statement (“don’t trust statistics!”) it has been leveraged by anti-science people as a meme for promulgating distrust of science. It has been co-opted into larger anti-science arguments, such as:

          “The government gives you all of these statistics about the economy getting better, but you shouldn’t pay attention to that. Talk to your neighbors! Talk to your friends at the bar! Are they all happy? Do they feel things are great? No! Therefore, you should ignore the statistics!”

          This is horrible, terrible thinking. It’s anti-factual and anti-science, and it is just one more facet of a larger movement to distrust science and data in society.

          And it is within THAT framework that I have such strong emotions about the phrase “I don’t trust statistics!!” I’m against the mindset that so often goes along with it.

          Your specific points are well-stated and correct. Use multiple sources, check methodologies when possible, look at meta-analyses. But I’d still maintain — even given all of that — that those things shouldn’t be bagged up into the broader slogan “don’t trust statistics!”. It is that rhetoric that is damaging. More than anything, I’m making a call for people to be more precise in their critique.

          I hope that makes sense.

  2. Marcus says:

    As someone who has studied statistics, I can testify to how easy it is to manipulate data. There are so many different tactics you can use to give the desired outcome. Even if the raw data is negative towards the party I represent, I can still make them look good. Only trust direct parameters or raw data.

    • Greg Stevens says:

      Marcus: Thanks for your comment.

      I did try to make it VERY clear in my article that I know people are able to use statistics in a deceptive way. But the point I was trying to make–and it is a fine point to be sure–is that the way they deceive you is by deceiving you about the interpretation of the numbers. Unless they are actually using incorrect formulas (which sometimes happens), the numbers themselves aren’t “wrong” …. it’s the interpretation of them that’s wrong.

      I’ve even written about this before. Mark Levin will say “The wealthiest people pay the largest share of federal tax revenue!” That statistic is completely true. But he’s deceptive in the conclusions that he draws from it: he uses it to imply that they pay more than their “fair share”, when in fact their “share” of the revenue is so high because wealth distribution in this country is so skewed. Even if their tax RATE is low, their total tax AMOUNT will still be high.
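
      Here’s a back-of-the-envelope sketch of that mechanism, with made-up numbers chosen only for illustration (they are not real income or tax figures): a group can pay a lower effective rate and still account for most of the revenue, simply because it holds most of the income.

      ```python
      # Purely illustrative figures -- not actual income or tax data.
      groups = {
          "wealthiest 10%": {"income": 5_000_000, "rate": 0.20},
          "everyone else":  {"income": 3_000_000, "rate": 0.25},
      }

      total_revenue = sum(g["income"] * g["rate"] for g in groups.values())
      for name, g in groups.items():
          paid = g["income"] * g["rate"]
          print(f"{name}: rate {g['rate']:.0%}, share of revenue {paid / total_revenue:.0%}")
      # wealthiest 10%: rate 20%, share of revenue 57%
      # everyone else:  rate 25%, share of revenue 43%
      ```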

      So it’s stuff like that that makes me want to distinguish between “not trusting statistics” and “not trusting how people interpret the statistics”. The statistics themselves aren’t the problem. It’s what people try to get you to infer from them. 😉

      One last point: I’d challenge you a little when you say you trust “raw data”. In most realistic data scenarios, raw data is a list of hundreds if not thousands or millions of individual data points. It’s literally incomprehensible as “raw data”. The key is to figure out the best way to summarize it. But I’d have to ask you to give a meaningful example of what it even means to trust “raw data” that is not summarized in SOME way or other.
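
      Just to illustrate what I mean (with simulated numbers, not real income data): even deciding to report a mean rather than a median is already a summarizing choice, and with skewed data the two answer different questions.

      ```python
      import random
      import statistics

      random.seed(1)

      # Simulated "raw data": 10,000 household incomes with a long right tail.
      # As a raw list it is unreadable; any usable view of it is a summary.
      incomes = [random.lognormvariate(10.8, 0.8) for _ in range(10_000)]

      print(f"Mean income:   ${statistics.mean(incomes):>9,.0f}")
      print(f"Median income: ${statistics.median(incomes):>9,.0f}")
      # The mean sits well above the median because of the skew. Neither number
      # "lies" -- they just summarize the same raw list in different ways.
      ```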

  3. Dan D. says:

    Theoretical statistic: In 2012, Faketopia had 1 person mauled to death by cats. In 2013, Faketopia had 3 people mauled to death by cats. OR: Between 2012 and 2013, deaths by cat maulings rose by 200%. Seeing that both are correct, which one would you use to push for the “Cat ban” you want to pass? That’s what people mean when they say “you can’t trust all statistics”.

    • Greg Stevens says:

      I understand that that is what people mean. However, as I tried to point out in the article, what you really shouldn’t trust is the way the statistics are being presented and interpreted. The statistic ITSELF is not “wrong”. It’s factually correct, and if you understand what it means (and what it doesn’t mean) you will be safe from making bad decisions based on it.

      In some sense, I feel that this is the responsibility of both people reporting statistics (to do it in a way that isn’t deliberately confusing or biased) AND people reading statistics (to know what questions to ask, and to know what statistics do or do not imply when they are cited).

      For example, in this case, I have a little mental “warning flag” that goes off any time I see a “percent increase” type of statistic, alerting me to look for what the actual raw numbers are. If the raw numbers are not reported, then my immediate reaction is to not trust whoever is presenting the statistic, and to ask for more information.
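
      As a quick sketch of that warning flag, using the made-up Faketopia numbers from the comment above plus an equally made-up large-base comparison: the same percent increase reads very differently once you can see the base.

      ```python
      def describe_change(before, after):
          """Report a change both ways: raw counts and percent increase."""
          pct = (after - before) / before * 100
          return f"{before:,} -> {after:,} (a {pct:.0f}% increase)"

      # The hypothetical Faketopia numbers: one death, then three.
      print(describe_change(1, 3))             # 1 -> 3 (a 200% increase)

      # The identical percent increase on a large base is a very different story:
      print(describe_change(10_000, 30_000))   # 10,000 -> 30,000 (a 200% increase)
      ```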

      But again I’ll stress: this isn’t the fault of the statistics themselves. It’s the fault of the person who reports a % increase without saying what the numbers are actually increasing from and to. 🙂

  4. J. Anderson says:

    Greg, are you saying that you have never heard of or experienced when statistical data that was once held as highly credible and believed in, was proven completely wrong?

    What’s more stupid: strongly believing in data as fact, when in fact it is false but at the time seemed true, and your belief actually hurt other people in the process because you took a belief and acted on it; OR realizing that statistical data is something that you as an individual person can never know is true or not, and therefore you don’t believe it: you consider it could be true, but you consider it could also be false?

    Seems to me that the latter is more sound to reality, and the former is no different than believing the words of the bible are fact.

    • Greg Stevens says:

      You ask me this: “Greg, are you saying that you have never heard of or experienced when statistical data that was once held as highly credible and believed in, was proven completely wrong?”

      This is actually one of the most important points that I want to make: What you have described ALMOST NEVER HAPPENS.

      I need to call attention to the way you worded it: unless someone has performed a mathematical error, it is EXTREMELY rare to find that the statistical data are wrong.

      What you are thinking of — and all of those examples that you think you remember of “statistics being wrong” — are cases where people have interpreted the conclusions of the statistics incorrectly, or they have drawn an incorrect inference from a number. In these cases, the number that was calculated is not wrong. What is wrong is the meaning that people attached to the number.

      This is a very important distinction, because it’s a problem when people take a wishy-washy “oh numbers could be right or could be wrong I’ll just believe whatever I want” attitude toward science and data.

      The thing that you should be skeptical about — the thing that DESERVES criticism and skepticism — is always the theory, meaning, and interpretation that people attach to numbers.

      But (unless there was an error in the actual math, which almost never happens), it is simply not factually true to say “the statistics were wrong”.

      The distinction matters.

      • J. Anderson says:

        I think we are running into a semantics issue on what we mean by “Statistical Data”. Here is what the dictionary definition says it means:

        “Statistical data is described as a mathematical body of science that pertains to the collection, analysis, interpretation or explanation, and presentation of data, or as a branch of mathematics concerned with collecting and interpreting data.”

        You said that unless someone performed a mathematical error, it is extremely rare to find that the statistical data is wrong. But based on the definition above, statistical data doesn’t just involve the calculation of the math, it also involves interpretation of the collection of data. And as soon as someone interprets the data, that is the beginning of information pollution, because now you are no longer relying on the data alone, you are relying on how it is interpreted. And usually it isn’t just one person in this process interpreting the data, and the scientists don’t always fully agree on the interpretations. This all happens before the media further pollutes the information.

        So from what I understand from your response, you are saying that Statistical Data is just the calculated math, and the interpretation that happens afterwards is not a part of the process, when based on the definition above, it is.

        So when people say they don’t trust the data, they aren’t talking about the actual math, which the public rarely even gets to see in its raw form directly from the scientists before interpretation. They are talking about the end result, after it is interpreted and explained by those involved, which is the complete form of statistical data. They don’t trust the explanation and interpretation of the data. And rightly so, because there have been many examples of Statistical Data that was wrong; again, I’m not talking about the math, I’m talking about the full presentation to us of what they say is true.

        It would be awesome if we could easily see the math directly from the actual people who created the data, but then we would still be directly interpreting the data we are seeing. This suggests that Statistical Data by definition is usually so loose, regardless of the fact that we are talking about numbers, that we have to have an interpretation process for those numbers and what they really stand for.

        So, based on this, statistical data isn’t fact. It’s theory. “Fact” is what science can prove. There is no need for interpretation when the science speaks for itself.

        • Greg Stevens says:

          It could very well be a semantic issue. As someone who comes from the perspective of academia and who has done research for years, I’m used to thinking of “data” as numerical. Even in the way that research is often presented in journals that deal with quantitative numerical data, there is most often a “results” section that has numbers and graphs and statistical tests, but that is deliberately thin on interpretation, and then a “conclusion” section that interprets the meaning of the results. A perfectly strict delineation isn’t always possible, of course, but usually an attempt is made.

          So that’s the context I’m coming from. Perhaps that drives, in part, my preference for distinguishing between saying “the statistics are wrong” and saying “the interpretation of these numbers is wrong”, as to me the latter seems more PRECISE. It focuses the critique, and is more informative about how to fix the problem: do not simply dismiss the numbers, find out what they REALLY mean.

          And ultimately, part of the driving motivation I have for pushing this agenda comes down to this practical matter: apart from bickering about semantics, there are real implications for the way people respond to science. People need to learn that when they doubt a result, the best scientific response is not to simply say “oh these numbers are wrong”, but to say “what do these numbers really mean and how can we try to measure the thing we really WANT to measure?”

          So there it is: on the surface you’re probably right that we are having a semantic disagreement, but I hope I’ve explained why I feel it is an issue of more than merely semantic importance.

  5. Saywhat says:

    Statistics don’t lie but bad modeling can give us some awful statistical data that many people trust. More peer reviewed studies that make it through publishing have these errors than not.

    • Greg Stevens says:

      “More peer reviewed studies that make it through publishing have these errors than not.”

      Do you have any evidence at all for this claim?

      Or is it just something you take on faith because you like to think that stuff is wrong?

  6. Ronald Bush says:

    I detest all the credibility given to statistics. Statistics are not reliable for drawing conclusions because many times (I don’t want to say all the time) they do not divulge every circumstance of the study or lead to a concrete conclusion that is the same for everyone who sees it. It is borderline impossible to completely illustrate every variable condition in a statistic, but it is necessary to do so to avoid the statistic being too open to interpretation. I’m not going to use a random statistic to support my point; rather, I’d like to analyze the credibility of all statistics. Scientific peer-reviewed data is one of the most reliable sources of data, because the data obtained from it is not open to interpretation, and because it is testable, observable, and replicable. I do not hold statistics to a regard as high as scientific peer-reviewed data, nor do I believe you do either. The thing that annoys those who are “stupid and wrong”, as you put it, is when people do hold statistics to that high regard and policy debates become objective science vs. interpretation, like in climate change debates. I think it is necessary, for the sake of showing the whole argument about statistics, for you to mention this in your article in a rewrite or a new article.

    • Greg Stevens says:

      Hey Ronald! Thanks for your comment.

      Well, first I just want to clear something up as a matter of terminology…. the data that you find in scientific experiments and peer-reviewed journals are ALSO statistics, they are just statistics that are measuring the results of experiments, rather than statistics measuring things selected from “out there in the world”. So when you talk about the difference between scientific experimental data and political data, both are “statistics”. A statistic is just a mathematical thing: a measurement that talks about the relationships between numbers. So you need to be careful when you say things like “I detest all the credibility given to statistics” … when in fact what you seem to mean is “I detest the credibility given to statistics about non-experimental measurements.”

      And in that, I think you have a point. It’s much easier to come to conclusions about statistics that are based on the numbers that come from the results of an experiment, than statistics that are just correlations or analyses of measurements of stuff “out there in the world.” Political data, survey data, and even a lot of medical data (since in a lot of cases people aren’t allowed to perform medical experiments on people…. darn), are very messy because they are not controlled.

      However, here is MY feeling about that stuff: the problem isn’t the statistics themselves. The problem is with people reading too much into their meaning, OR misinterpreting their meaning altogether. So once again, I’ll stand by the main thesis of my post: people are wrong to “not trust statistics”, because the real problem isn’t the NUMBERS themselves… the problem is people being dishonest (or just wrong?) about what those numbers mean.

      Just for clarity, I’ll give you a basic example of stuff that happens all the time. The idea of “correlation” is one of the most mangled statistical measures out there. People find a “correlation” between two things, and they assume that one of them causes the other… even though you can find a correlation between all kinds of things for other reasons (e.g. there is a correlation between the population of the earth and the year… it doesn’t mean that the calendar CAUSES people to be born or that having more people CAUSES the years to advance).

      On the flip side, people find “no correlation” and interpret it as “no relationship” between two things. This is also wrong. Correlation only measures LINEAR relationship: you can have two variables that are completely dependent on one another in a cyclical way, and they will have a correlation of zero, even though they are strongly related.

      So the problem isn’t that the numbers are wrong or misleading… it’s that people don’t understand what the numbers mean, and they attach all kinds of meanings that are incorrect. If I say there is zero correlation between daylight and time, that is technically true. What is incorrect is when someone who doesn’t understand math concludes that there is “no relationship” between daylight and time.
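
      If it helps, here is a tiny sketch of both failure modes with toy numbers. The population figures are only rough, and statistics.correlation needs Python 3.10 or later:

      ```python
      import statistics

      # 1) A near-perfect correlation that implies no causation:
      #    world population (rough figures, in billions) against the calendar year.
      years = [1960, 1970, 1980, 1990, 2000, 2010, 2020]
      population = [3.0, 3.7, 4.4, 5.3, 6.1, 7.0, 7.8]
      print(statistics.correlation(years, population))  # roughly 0.999

      # 2) A zero correlation that hides a perfect relationship:
      #    y is completely determined by x, but the dependence isn't linear.
      x = list(range(-5, 6))
      y = [v * v for v in x]
      print(statistics.correlation(x, y))  # 0.0
      ```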

      Anyway, sorry for being so long-winded, but I thought this point was worth making. You are right to be skeptical of natural data that is not experimentally measured. But the REASON to be skeptical isn’t the “fault” (per se) of statistics. Statistics aren’t the problem. Interpretation of statistics is the problem.
