In addition:
I'm surprised how much Scott Alexander distorts Merino's study when he describes it. As if he hadn't read it.
SA's claim 1: "if you called a hotline and said you had COVID, they sent you an emergency kit".
That is not what the study says. Patients who tested positive were *given* a kit (under certain conditions). SA seems to be confusing this with the phone *monitoring* that a subset of the patients were offered: they were called a few days after receiving a positive test, to check how they felt and to advise them on whether to go to hospital based on their description of their own health state.
SA's claim 2: "18,074 people got the kit"
Inaccurate: 83,000 patients got the kit, of whom 77,381 were included in the study; 18,074 of those received a monitoring phone call.
SA's claim 3: "Their control group is people from before they started giving out the kits, people from after they stopped giving out the kits, and people who didn’t want the kits."
Where does that come from? Mexico didn't stop giving out the kits (not right after the study, at least). Here's what the study mentions: "The control group are positive symptomatic patients, from 23 November to 28 December, and the treated group are positive symptomatic patients from 28 December to 28 January."
(Mexico City started delivering the kits to any person who tested positive, under certain conditions, as of Dec 28, 2020.)
SA's claim 4: "There are differences in who got COVID early in the epidemic vs. later, and in people who did opt for medical kits vs. didn’t. To correct these, the researchers tried to adjust for confounders"
The reasons the authors adjusted for confounders are not exactly the ones SA mentions. Contrary to what SA suggests, there is no "early in the pandemic" period involved in the study: the control and treated groups come from consecutive, roughly month-long windows: Nov 23 to Dec 28, 2020 for the control group, and Dec 28, 2020 to Jan 28, 2021 for the treated group.
Besides — and this has nothing to do with Scott Alexander — thanks to your paper, I've just found out that Merino's paper has been withdrawn by the preprint server where it had been published. And the reasons are incredible: https://socopen.org/2022/02/04/on-withdrawing-ivermectin-and-the-odds-of-hospitalization-due-to-covid-19-by-merino-et-al/
I've now updated the piece with your observations. In fact, I was motivated to check deeper, and found a couple more. :)
Enzo, thank you. No matter how far I dig, there's always more, it seems. I will be incorporating your findings in the article as soon as I get a chance.
Not sure if you saw the appendix at the bottom; I have a timeline of the absurd Merino withdrawal.
You're right: I had missed the appendix :-)
Are you a long time SA reader? Any thoughts as to why he would make so many errors such as the ones you picked up? I'm starting to wonder if he used a researcher or something to extract the facts from the papers.
For the record, the same kind of debate is taking place in France. Many bloggers or youtubers I used to admire because they seemed to "follow the science" have totally disappointed me during the Covid crisis, spitting on a huge part of science and on researchers because they were not doing RCTs and "that's the only way science works, anything else is BS"...
To be honest, I had never heard of SA before you started debating his positions on ivermectin.
(I'm probably more familiar with the debates in France.)
Another hypothesis, since I sometimes behave this badly myself: when I think I know in advance why I'll disagree with a paper and I lack time or energy, I (sometimes, I insist :-D) jump to the key points in the paper to find a confirmation of what I expected, and do not thoroughly read the whole paper. That can lead me to miss important points. Maybe SA did the same with Merino's paper...
That's my dominant hypothesis. Overconfidence in one's ability to figure out what's going on in a topic this complex.
> But, we now know that when tested in the real world, well, it’s not true. RCTs don't seem to yield more information than observational studies.
There are famous counter-examples like hormone replacement therapy: https://archive.ph/lCCwf
Cochrane does try to integrate non-RCTs but with care: https://training.cochrane.org/handbook/current/chapter-14#section-14-2
There are always counter-examples. If you look at the reference I used, it also has examples where the RCTs performed worse. The real question is whether one or the other can be reliably said to be superior, and there doesn't seem to be any support for that position. I find that very surprising, but that's the nature of learning as I go.
My understanding is this: it is much, much easier to cheat and get the conclusion you want with observational studies (and easier still with mathematical models...).
With observational studies, sometimes you do not even have to cheat at all: just exploiting biases in the data (healthy-user bias, immortal time bias...) is enough to get the results you want. For instance, there are hundreds of (flawed) pharma-sponsored observational studies claiming flu vaccines work.
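As a toy illustration (entirely made-up numbers, nothing to do with any of the studies discussed here), immortal time bias alone can make a do-nothing treatment look strongly protective:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200_000

# Everyone has the same true 10% 30-day mortality; the "kit" does nothing.
dies = rng.random(n) < 0.10
death_day = rng.integers(1, 31, size=n)   # day of death, for those who die
kit_day = rng.integers(1, 31, size=n)     # day the kit would be delivered
offered = rng.random(n) < 0.60            # only 60% are ever scheduled to get one

# Immortal time: you can only be recorded as "treated" if you were still
# alive on the day the kit arrived.
treated = offered & (~dies | (death_day >= kit_day))

print(f"mortality, treated:   {dies[treated].mean():.1%}")
print(f"mortality, untreated: {dies[~treated].mean():.1%}")
# The "treated" group looks far safer, even though the kit has zero effect:
# dying early removes you from the treated group by construction.
```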
But, as is now clear with the Pfizer (and Together!) trials, some people have become experts at cheating RCTs, both in the protocol and afterwards. So what matters is the independence (good faith) and competence of the researchers.
I like RTE's drawing on RCTs: https://roundingtheearth.substack.com/p/the-chloroquine-wars-part-iv
> The real question is if one or the other can be reliably said to be superior, and there doesn't seem to be any support for that position.
I don't think that's what Cochrane conclude (from my second link); emphasis on the last two sentences:
"When using the new Risk Of Bias In Non-randomized Studies of Interventions (ROBINS-I) tool (Sterne et al 2016), an assessment tool that covers the risk of bias due to lack of randomization, all studies may start as high certainty of the evidence (Schünemann et al 2018). The approach of starting all study designs (including NRSI) as high certainty does not conflict with the initial GRADE approach of starting the rating of NRSI as low certainty evidence. This is because a body of evidence from NRSI should generally be downgraded by two levels due to the inherent risk of bias associated with the lack of randomization, namely confounding and selection bias. Not downgrading NRSI from high to low certainty needs transparent and detailed justification for what mitigates concerns about confounding and selection bias (Schünemann et al 2018). Very few examples of where not rating down by two levels is appropriate currently exist."
I don't treat Cochrane as some kind of oracle. They tend to be an extremely conservative middle ground, so that's sometimes useful to establish a baseline, but that's not to say they're not drowning in their own biases.
Thus, I don't think their guidelines serve as evidence in and of themselves.
It seems to me that the thrust of your point is that studies should be evaluated by their details rather than their design, and each may have some value as part of a broad picture. I think this aligns with Cochrane's guidelines (Schünemann et al., 2018).
But it seems to me the difference is that Cochrane's guidelines have more nuance than "[whether] [RCTs] or [non-RCTs] can be reliably said to be superior"; they provide an actual methodology for evaluating biases (of which confounding is just one type). I think this is a much more useful framework than a vague equivalence.
The alternative is the anti-hierarchy of evidence school. One place to start is Blunt (2015): Hierarchies of evidence in evidence-based medicine.
It is actionable, which is what Cochrane generally favors. But by being overly mechanistic it has the effect of eliminating nuance while simultaneously creating the impression of rigor. For me to take such procedures seriously, I'd need to see some data, as well as a good explanation of how Goodhart's-law concerns are handled in this framework.
But isn't there value in being mechanistic, like a checklist before plane takeoff? Even the study you cited ends with, "Factors other than study design per se need to be considered when exploring reasons for a lack of agreement between results of RCTs and observational studies", and Schünemann et al. (2018) provides some such factors.
My point is that things like residual confounding are potentially serious, so by concluding that there's a vague, a priori equivalence between RCTs and non-RCTs, we might forget to analyze all the potential biases. In other words, there's something to the intuition that there is a hierarchy of evidence with RCTs at the top, although that's a partially flawed intuition as Anglemyer et al. point out, and we need more nuance around RCTs. But I worry that your conclusion might create blind spots around non-RCTs.
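To make "residual confounding" concrete, here is a minimal, hypothetical sketch (invented numbers, not from any study discussed here): even after adjusting for a noisily measured confounder, a treatment with zero true effect still shows one.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

u = rng.normal(size=n)                              # true confounder (say, frailty)
treat = (u + rng.normal(size=n) > 0).astype(float)  # frailer patients are more likely to be treated
outcome = u + rng.normal(size=n)                    # outcome driven by frailty only; treatment does nothing
u_measured = u + rng.normal(size=n)                 # the confounder as we actually measure it (with error)

# "Adjusted" analysis: regress the outcome on treatment plus the measured confounder.
X = np.column_stack([np.ones(n), treat, u_measured])
beta, *_ = np.linalg.lstsq(X, outcome, rcond=None)
print(f"estimated treatment effect after adjustment: {beta[1]:.2f}")
# Well away from zero, even though the true effect is exactly zero:
# adjusting for a noisy proxy leaves residual confounding behind.
```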
In 2018, Adrian Currie published the book, "Rock, Bone, and Ruin," as part of the MIT Press series, "Life and Mind: Philosophical Issues in Biology and Psychology." Currie treats three of the historical sciences: geology, paleontology, and archaeology. He could have included cosmology, evolutionary ecology, and biogeography as well. And Peter Turchin's cliodynamics finally brought historical science to history itself.
The point of Currie's book is to explore what kinds of evidence allow scientific conclusions to be drawn when manipulative experiments are impossible because of the expanses of time and space that would be involved. Reviewing the many tactics used by scientists who are thus constrained, Currie concludes that these investigators are "methodological omnivores" who must be far more creative than laboratory scientists.
In the politically manipulated SARS-2 environment, RCTs are most easily mounted by well-funded agencies, and such agencies are precisely those most likely to have been bribed in some fashion to disfavor early treatment of Covid. So, in terms of the feasibility of doing valid manipulative experiments like RCTs, the politicization of early treatment is analogous to the space-time constraints of the historical sciences. If the experimental manipulations are manipulated dishonestly, and we know that they have been, then that is a constraint on acquiring knowledge.
Being methodologically omnivorous and accepting the epistemic utility of observational studies is a rational response to corrupted RCTs.
"methodological omnivores". Love it.
Just found out that Scott Alexander has always been that way (or at least since 2016):
Kary Mullis is kind of cheating since he was not technically a psychedelicist. He was a biochemist in the completely unrelated field of bacterial iron transport molecules. But he did try LSD in 1966 back when it was still a legal research chemical. In fact he tried 1000 micrograms of it, one of the biggest doses I’ve ever heard of someone taking. Like the others, Mullis was a brilliant scientist – he won the Nobel Prize in Chemistry for inventing the polymerase chain reaction. Like the others, Mullis got really weird fast. He is a global warming denialist, HIV/AIDS denialist, and ozone hole denialist; on the other hand, he does believe in the efficacy of astrology. He also believes he has contacted extraterrestrials in the form of a fluorescent green raccoon, and “founded a business with the intent to sell pieces of jewelry containing the amplified DNA of deceased famous people like Elvis Presley”.
https://slatestarcodex.com/2016/04/28/why-were-early-psychedelicists-so-weird/?
And then read this: https://joomi.substack.com/p/remembering-kary-mullis?
Deeply unsettling.
I'm simply a housewife with a passion for the welfare of mankind. The fact that ivermectin has very few side effects and is a proven and essential medicine makes this entire fiasco so ridiculous! Obviously there is so much deep and malicious activity going on behind the scenes.