30 comments:
I read this article all worried, only to be relieved when the statistical error was revealed. At work, I'm always asking "Is this just noise?" which also comes in handy when I'm listening to political debates.
Actually statistics is an error in the first place.
The actual meaning of 95% confidence is that if you run the same procedure 100 times on some assumed random system without the condition, it will come out positive only about 5 times.
But you don't have an actual assumed random system. So the 5 may not apply.
That's when everything is done right.
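The "5 times out of 100" claim can be checked with a small simulation. This is a minimal sketch under stated assumptions, not anything from the article: the function name, group size, trial count, and seed are all made up, both groups are drawn from the same normal distribution with known unit variance (so there is no real effect), and a two-sided z-test stands in for "the procedure".

```python
import math
import random

def false_positive_rate(trials=2000, n=30, seed=0):
    """Fraction of no-effect experiments that a two-sided z-test flags at the 5% level."""
    rng = random.Random(seed)
    z_crit = 1.96  # two-sided 5% cutoff for a standard normal
    se = math.sqrt(2.0 / n)  # standard error of the difference in means (unit variance assumed)
    hits = 0
    for _ in range(trials):
        # Both groups come from the same distribution: the "assumed random system".
        a = [rng.gauss(0.0, 1.0) for _ in range(n)]
        b = [rng.gauss(0.0, 1.0) for _ in range(n)]
        diff = sum(a) / n - sum(b) / n
        if abs(diff / se) > z_crit:
            hits += 1
    return hits / trials

rate = false_positive_rate()
print(f"flagged {rate:.1%} of experiments with no real effect")
```

The rate should land near 5%, but only because the simulated data really do match the model the test assumes. With real data that match is exactly what is in doubt.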
I read the article at the link and hope to find the time to read the original source, but the scope is surprising. That is, it is surprising if psychologists are trying to make inferences about the difference in differences without the appropriate statistical estimates. It's not so surprising if they are simply reporting within-group effects and are ignoring the between-group differences. Also, the article at the link seems to imply that the standard errors for the within-group and difference-in-differences tests are the same. They are not.
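A small worked example shows why the standard errors differ. It is a sketch with hypothetical numbers: the 30% and 15% drops echo the article's illustration, but the standard errors of 10 points are assumed here, and the groups are assumed independent so that their variances add.

```python
import math

# Hypothetical inputs: the drops echo the article's example (30% and 15%);
# the standard errors are assumed for illustration and are not from any paper.
mutant_drop, se_mutant = 30.0, 10.0
healthy_drop, se_healthy = 15.0, 10.0

Z_CRIT = 1.96  # two-sided 5% cutoff under a normal approximation

def z_score(estimate, se):
    """Estimate divided by its standard error."""
    return estimate / se

# Within-group tests: each drop is compared with zero.
z_mutant = z_score(mutant_drop, se_mutant)
z_healthy = z_score(healthy_drop, se_healthy)

# Difference in differences: for independent groups the variances add,
# so its standard error is larger than either within-group standard error.
diff = mutant_drop - healthy_drop
se_diff = math.sqrt(se_mutant**2 + se_healthy**2)
z_diff = z_score(diff, se_diff)

for label, z in [
    ("mutant drop", z_mutant),
    ("healthy drop", z_healthy),
    ("difference in drops", z_diff),
]:
    verdict = "significant" if abs(z) > Z_CRIT else "not significant"
    print(f"{label}: z = {z:.2f} ({verdict})")
```

Under these assumed numbers one within-group drop is significant and the other is not, yet the difference between the two drops is itself not significant. That is the pattern the article describes.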
Social "science" curricula need to add statistical analysis techniques as a prerequisite. Otherwise a scholar may mistake association for causation.
The clearest example of intentional misrepresentation I recall is the testimony of the Yale expert in the Bush/Gore butterfly ballot lawsuit. I saw this live, on television right here in St. Louis.
He stated the undervote was statistically associated with the butterfly ballot, as if association was proof of causation. My wife came downstairs to ask what I was shouting about.
In English, that's translated as, "the undervote was coincidental and not scientifically proven to be caused by the butterfly ballot."
Although the Bush team left a long pause before their next question, no one on either legal team openly challenged his statement!!
The science is settled and you, sir, are anti-science!
Psychology or neuroscience?
Statistics is at the foundation of research science in studies of data.
If the credentialed ones are now faking the statistics when they are not faking the data, then we will never learn anything new that is not myth.
Science is dead.
A lack of integrity kills another blessing in life.
But at least we got rid of Judeo-Christian morality that once produced the integrity. That means fewer restraints on everything, you know.
"...a little tricksy..."?
Give me back the statistics, my precious.
Is 15% significant? There seems to be a subjective issue in all of this. The overall tone seems to be "people highlight differences more than they should."
Which is a true point, but I think this is also factored in by academics when they read other works.
From the article...
They've identified one direct, stark statistical error so widespread it appears in about half of all the published papers surveyed from the academic psychology research literature.
No, it does not.
Also from the article...
How often? Nieuwenhuis looked at 513 papers published in five prestigious neuroscience journals over two years. In half the 157 studies where this error could have been made, it was.
So it happened in 78.5 out of 513 papers, or ~15.3%.
If you're going to write an article about people's statistical mistakes, get your numbers right.
Note: Getting it wrong in 50% of the cases where you could get it wrong is very bad. You don't need to make it sound even worse.
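The arithmetic behind the two percentages can be checked in a few lines. This is only a sketch: the variable names are made up, and, like the comment above, it treats each of the 157 studies as if it were one of the 513 papers.

```python
papers_surveyed = 513   # papers Nieuwenhuis looked at
studies_at_risk = 157   # studies where the error could have been made

# "In half the 157 studies ... it was."
studies_with_error = studies_at_risk / 2

share_of_at_risk = studies_with_error / studies_at_risk
share_of_all = studies_with_error / papers_surveyed

print(f"studies with the error: {studies_with_error}")
print(f"share of studies that could have made it: {share_of_at_risk:.1%}")
print(f"share of all papers surveyed: {share_of_all:.1%}")
```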
There is pressure to have results, and positive ones to get research funds, or get the drugs to market, but as one comment noted: Mr. Goldacre is falling into the same trap as these academics that have made flawed claims in their papers. Now it is true that most often, in the within-group analysis it is easier to achieve significance (making Mr. Goldacre's comment likely); however, other factors impact this likelihood, such as the size of the groups and how tightly the data cluster around the mean in each of the two types of analyses. Hence, Mr. Goldacre's statement (above as I quoted) is a bit misleading. See how easily one can misinterpret statistics?
"You don't need to make it sound even worse."
They were illustrating the practice of making results sound more impressive. Very tricksy of them!
The article in the Guardian describes one major problem with statistical design in academic literature. A deeper problem is the almost universal assumption that the normal distribution, which is the basis for most statistics, is--well--the norm.
If a phenomenon varies more than the normal curve suggests (e.g., a Cauchy distribution), then all the usual measures of statistical significance would be overstated. I've heard several times of "500-year" floods on the Mississippi, but I wouldn't be surprised if the erroneous use of statistical tests based on the normal curve was a reason.
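To put rough numbers on the heavy-tail point, the sketch below compares how much probability a standard normal and a standard Cauchy distribution place beyond the same distance from the center. The function names are made up for illustration. Note also that the Cauchy "unit" is a scale parameter and not a standard deviation, since the Cauchy distribution has no finite variance.

```python
import math

def normal_two_sided_tail(x):
    """P(|Z| > x) for a standard normal variable."""
    return math.erfc(x / math.sqrt(2.0))

def cauchy_two_sided_tail(x):
    """P(|C| > x) for a standard Cauchy variable (location 0, scale 1)."""
    return 1.0 - (2.0 / math.pi) * math.atan(x)

for x in (2.0, 3.0, 5.0):
    print(
        f"beyond {x:.0f} units: "
        f"normal {normal_two_sided_tail(x):.6f}, "
        f"Cauchy {cauchy_two_sided_tail(x):.6f}"
    )
```

An event the normal curve rates as very rare can be fairly ordinary under a heavy-tailed model. That would be consistent with a "500-year" flood turning up more often than its label suggests.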
Well, that's embarrassing.
The Blonde says statistics is "not normal thinking".
Guess she's right.
In half the 157 studies where this error could have been made, it was.
Does this mean that in 356 papers this error could not have been made? Did 356 papers not deal with experiments involving differences in differences?
I am confused.
A lot of paychecks depend on getting published, even if it's pointless research that no one will ever read or use.
All you have to do is read Steve McIntyre's work to realize that there is so much statistical incompetence and statistical shenanigans (sometimes it's hard to tell the difference) coming from the pro-AGW crowd that everything they do is suspect. That's why lots of us are skeptical. It seems that they don't use professional statisticians on most of their papers. They wing it themselves and get things wrong (or misunderstand what they are doing) routinely. You'd think with trillions of dollars at stake the very least that would happen would be that every finding would be rigorously examined for its statistical soundness by those specializing in that field. But that doesn't appear to happen. The AGW crowd is a closed circle and real statisticians aren't invited in.
The danger of spreadsheets.
I suspect there are a lot of fields where statistical techniques are necessary to analyze data, but don't directly help with the design or implementation of experiments. That would explain why there are so many fields that use statistics extensively, but have practitioners who are not statistically adept.
Getting the clap at conferences? Why would they want to do that?
At the Colorado School of Mines, they used to have a cheer: "Rippidy Rip, Rippidy Rap, Referee's got the (clap)(clap)(clap)." We didn't say he had the clap, we clapped our hands. Perhaps it had something to do with his vision, or lack thereof.
I recently spent a semester in a College of Education on a Big Ten campus as a doctoral student. The statistical method favored there is the analysis of variance. One of the tenets is that they cannot prove cause and effect; they can only prove that there is no cause and effect. Sounds like a bunch of guessing in the dark.
We were warned, by the way, about continuous pagination. Some journals have continuous pagination where the first journal of a volume year will start at page 1, and end at page 98, say. Then the next journal will start at page 99, and so on through the year. If we cited a journal that used continuous pagination, and we used the journal number in our citation, our professors might mark us down from an A. While
Althouse, A., 2011. "On the Terrace", Wisdom of Meade Journal, Vol. 58, No. 2, 213-215.
and
Althouse, A., 2011. "On the Terrace", Wisdom of Meade Journal, Vol. 58, 213-215.
might not really look different, they are different enough for the student to be taken down a letter grade.
Those pompous preening peacock professors are more interested in getting published, and cited, than they are in really doing anything useful and constructive. That just reinforced my desire to return to the classroom, and actually teach students. No wonder there are no advances in education in this country, other than bloating the budgets for administrators and hiking tuition.
Though I assume the findings of the linked original research are sound, I'm not convinced Goldacre's account of them is accurate. In his hypothetical example of the healthy cell's reaction to chemical exposure, he suggests "...there is a bit of a drop, but not as much – let's say 15%, which doesn't reach statistical significance." If that "which" were changed to an "and," I could see his point. But it's not the 15% drop which is not significant, but the 15% result.
He probably would have been better off choosing different example numbers; perhaps mutant drop of 30%, healthy drop of 20%, with a difference of 10%. The two 15s make it more confusing than is necessary.
This ties into my pet peeve of non-stats-savvy people discussing science; the meaning of the word "significant" is very different in science than in the everyday world.
Academic psychologists actually get a lot more statistical training than, say, medical researchers.
But there's a tremendous lack of clarity about what inferential statistics (significance testing, confidence intervals) actually do or dont' do for the researcher.
For instance, every inferential procedure that I've ever encountered presupposes random sampling from the population to which the researcher wishes to generalize.
Psychologists basically never do random sampling. Other social scientists hardly ever do.
Oops!
"...presupposes random sampling from the population..."
In Poli Sci, when doing state-level studies, researchers will often forgo confidence intervals entirely, noting that since all 50 states, or the universal set, are included, they do not apply. Of course, this is not entirely true, but it's good enough for government work.
The one that always got me was when working with a given data set, additional statistical caveats should be provided each time the numbers are accessed. I always used it as an excuse to not think about work while at home lest I have to adjust my confidence downward (that's a joke by the way). Of course no one ever followed the practice and no one really cared.
By the way sorepaw, I see you don't know what "don't" means since you mis-spelled it (or would a mis-placed apostrophe be considered a punctuation error?). It's such a basic concept, I don't know how to begin to discount your contribution to the discussion.
There are other examples of this...odds ratios from logistic regression are routinely misinterpreted, for example.
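One common misreading, and this is an assumption about what the comment has in mind, is to treat an odds ratio as if it were a ratio of probabilities. A minimal sketch with made-up risks of 60% and 30% shows how far apart the two can be:

```python
def odds(p):
    """Convert a probability into odds, p / (1 - p)."""
    return p / (1.0 - p)

# Hypothetical outcome probabilities in two groups (not from any study).
risk_exposed = 0.60
risk_unexposed = 0.30

risk_ratio = risk_exposed / risk_unexposed
odds_ratio = odds(risk_exposed) / odds(risk_unexposed)

print(f"risk ratio: {risk_ratio:.2f}")
print(f"odds ratio: {odds_ratio:.2f}")
```

Reading the odds ratio here as "that many times as likely" would overstate the effect. The two measures only come close to each other when the outcome is rare in both groups.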
JTriangle Man you said:
"The clearest example of intentional misrepresentation I recall is the testimony of the Yale expert in the Bush/Gore butterfly ballot lawsuit."
I also admire you for posting all this, and I can say that I really have a good time reading all these posts here. Keep on posting.
Despite the fact that I found this particular blog post quite helpful, I could not help but question whether or not the stats you used are accurate.
I have learned many important things from your post and I want to thank you for sharing this post.