You may have heard that a new NIH-funded Randomized Controlled Trial was published recently. This one most definitely proves ivermectin doesn’t work for COVID-19—at least if the headlines are to be believed:
Naturally, since this Substack isn’t called “Let The Press Do Your Research,” we’re going to be diving deeper than the headlines. A lot deeper.
Quick Observations
Before we discuss the trial’s conclusions, the following context will help you:
The dosing is described as being approximately 0.4 mg/kg, when in fact almost every patient was underdosed. This diagram is a visual representation of what is described in the protocol:
Very few patients were correctly dosed, with the vast majority being underdosed to some extent. If you weighed 68 or 116 kg (149 or 256 lb), you’d be getting 76% of the nominal dose. If you weighed 51 or 131 kg (112 or 288 lb), you’d be getting 67% of the nominal dose. And the further above 131 kg your weight goes, the lower the effective dose you’re getting (even as your risk from COVID-19 climbs).
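To make the mechanism concrete, here is a minimal sketch of how fixed-dose weight bands produce exactly this pattern. The band table in the sketch is made up purely for illustration (the real bands are the ones in the protocol diagram above), but the shape of the problem is the same:

```python
# Minimal sketch, assuming a HYPOTHETICAL band table (not the actual ACTIV-6 bands):
# with fixed-dose weight bands, only patients at the very bottom of a band receive
# the full 0.4 mg/kg; everyone else gets less, and the deficit keeps growing above
# the top band.
NOMINAL_MG_PER_KG = 0.4

# (lower bound of band in kg, fixed dose in mg) -- illustrative values only
BANDS = [(50, 20), (65, 26), (80, 32), (100, 40)]  # heavier patients still get 40 mg

def fixed_dose_mg(weight_kg: float) -> float:
    """Dose assigned by the highest band whose lower bound the patient reaches."""
    dose = BANDS[0][1]
    for lower_bound, mg in BANDS:
        if weight_kg >= lower_bound:
            dose = mg
    return dose

def fraction_of_nominal(weight_kg: float) -> float:
    """Actual mg/kg received, as a fraction of the nominal 0.4 mg/kg."""
    return fixed_dose_mg(weight_kg) / (NOMINAL_MG_PER_KG * weight_kg)

for w in (50, 64, 79, 99, 120, 150):
    print(f"{w:>3} kg -> {fraction_of_nominal(w):.0%} of the nominal dose")
```

Whatever the exact band boundaries, the effect is the same: the nominal 0.4 mg/kg is a ceiling that only a handful of body weights actually reach.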
A few more details that I will expand on in a future installment:
The drugs were sent to the patients by mail, which meant some patients took them 13 or 14 days after symptom onset.
The authors never saw the vast majority of the patients, and every step of the process was done remotely. They call this a “distributed” trial.
There is no reporting of adherence in the trial, so we don’t even know how many of the patients took how many of the doses they were given.
There is no per-protocol analysis, so we don’t know how well the drug did in the patients who actually took all the doses.
And still, the results are actually strongly positive for ivermectin, if you strip back the bias.
So, What Did They Find?
Let’s start with the trial’s findings:
The authors write that there was “a posterior probability of benefit of .91.” This is another way of saying that they found a 91% probability that ivermectin is superior to placebo at shortening time to recovery. Now, remember, this is in the USA. Worms can’t be the reason the authors got the result they did.
They also say that, “This did not meet the prespecified threshold of posterior probability greater than .95.” When we look at clinicaltrials.gov to see what they prespecified, however, we see a primary endpoint that is completely different from the one in the paper. The only change registered after the end of the ivermectin arm was that the timeframe went from 14 to 28 days.
And the authors didn’t just change the primary endpoint: the endpoints registered on clinicaltrials.gov (Hospitalizations, Deaths, Symptoms by day 14) aren’t even reported in the paper.
So what happened? When we watch this presentation, the authors tell us that they simply didn’t have enough events in their primary endpoints to produce a meaningful result. Their mild and moderate COVID-19 patients were doing a lot better than expected, so they had to choose a different endpoint. In other words, much like the Vallejos and Lopez-Medina trials, the original design of the trial was hopelessly underpowered, which means that given the sample size, patient population, and time period during which the trial was run, COVID-19 simply didn’t affect the patients enough to determine if ivermectin worked.
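For intuition on what “hopelessly underpowered” means in practice, here is a toy simulation with entirely made-up numbers (not the trial’s actual design assumptions, and a generic frequentist test rather than the Bayesian model they used): when only around 1% of placebo patients end up hospitalized, even a halving of hospitalizations is detected only a minority of the time.

```python
# Toy power simulation (made-up numbers, NOT ACTIV-6's design assumptions):
# with ~1% of placebo patients hospitalized, even a 50% relative reduction is
# detected only a minority of the time by a conventional test.
import random
from math import sqrt, erf

random.seed(0)

def power(n_per_arm=800, p_control=0.01, relative_reduction=0.5, alpha=0.05, sims=2000):
    """Fraction of simulated trials where a one-sided two-proportion z-test reaches p < alpha."""
    p_treat = p_control * (1 - relative_reduction)
    hits = 0
    for _ in range(sims):
        c = sum(random.random() < p_control for _ in range(n_per_arm))  # control-arm events
        t = sum(random.random() < p_treat for _ in range(n_per_arm))    # treated-arm events
        p_pool = (c + t) / (2 * n_per_arm)
        se = sqrt(2 * p_pool * (1 - p_pool) / n_per_arm) or 1e-9        # avoid div-by-zero with 0 events
        z = ((c - t) / n_per_arm) / se
        p_value = 1 - 0.5 * (1 + erf(z / sqrt(2)))                      # one-sided normal tail
        hits += p_value < alpha
    return hits / sims

print(power())  # comes out around 0.3 -- nowhere near the conventional 80% power target
```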
Normally—in scientific terms—this means that the experiment failed to properly test the hypothesis, and we need to go back to the drawing board. Not for these scientists, though: they decided to rescue the study by changing their primary endpoint to measure the speed of recovery.
What we are shown, alongside the published paper, is the 4th version of the trial protocol, dated 20 December 2021. That’s six months after the first patient was included and 46 days before the last. What’s more, it misreports the original endpoint, showing it as being measured 28 days after enrollment, whereas the 1st version of the protocol, as well as clinicaltrials.gov, reports it as being measured 14 days after enrollment. This is not explained, nor do we have access to a version history that would let us understand the discrepancy.
What’s more, the statistical analysis plan shared alongside the trial, which contains many key details, is dated after the end of the trial, opening up the possibility that post-hoc decisions were made.
There’s a more fundamental problem, however: the statistical language of posterior probability is Bayesian. There is no Bayesian concept of meeting some prespecified threshold as a way to “prove” “statistical significance.”
In other words, the conclusion of the paper is they found a 91% probability that ivermectin helps with improving time to recovery.
The benefit of Bayesian statistics is that they mean what they sound like. There are no hidden traps like there are with p-values or “crossing the 1-line.” They work far better with human intuitive reasoning about probability than the alternative.
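To make that concrete, here is a minimal sketch of what a “posterior probability of benefit” is under the simplest possible normal model. The effect estimate and standard error below are made-up numbers for illustration, not the trial’s:

```python
# Minimal sketch of a "posterior probability of benefit" under a normal-normal model.
# The effect estimate and standard error are ILLUSTRATIVE, not the trial's numbers.
from statistics import NormalDist

def posterior_prob_benefit(effect_hat, se, prior_mean=0.0, prior_sd=10.0):
    """P(true effect > 0 | data) after a conjugate normal update.

    effect_hat : observed effect (say, a log hazard ratio for recovery; > 0 = benefit)
    se         : standard error of that estimate
    prior      : normal prior on the true effect (a wide prior means "let the data speak")
    """
    prior_prec, data_prec = 1 / prior_sd**2, 1 / se**2
    post_var = 1 / (prior_prec + data_prec)
    post_mean = post_var * (prior_prec * prior_mean + data_prec * effect_hat)
    return 1 - NormalDist(post_mean, post_var**0.5).cdf(0.0)

# The output is a plain probability statement: given the data (and the prior), this is
# the chance the true effect favors the drug. No significance ritual is hiding inside it.
print(posterior_prob_benefit(effect_hat=0.07, se=0.05))  # ~0.92 with these illustrative numbers
```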
If the authors wanted to set some pre-specified target so that they can declare the study reached some sort of high-confidence result, that’s up to them. However, they went on to write that, “These findings do not support the use of ivermectin in outpatients with mild to moderate COVID-19.”
The preprint also included this sentence in its results section, which didn’t make it to the final paper:
Ivermectin at 400 µg/kg was safe and without serious adverse events as compared with placebo (ivermectin [n=10]; placebo [n=9]).
Can you imagine a doctor telling you “This drug has a 91% chance of helping, and is very safe, so I can’t really recommend it”? That is essentially what this paper is saying—and that’s if we accept the entire design and analysis approach as flawless.
Bonus points to the trial for using Bayesian statistics, minus points for misrepresenting them. By the way, all this is with what they call a “skeptical prior,” which is almost like stuffing the ballot box with a few negative votes to start with. Yes, it’s a good tool if you want to use it for a sensitivity analysis, or even for steelmanning a strong result. In this case, it helped take the final result from 93% down to 91%:
In their defense, the authors use the skeptical prior as primary for all the repurposed generic drugs they test in the ACTIV-6 framework. No, wait, that’s not in their defense, never mind…
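For the mechanics of it, here is a rough sketch of what a skeptical prior does: it is a prior centered on “no effect”, so it drags the posterior toward zero. The numbers are illustrative, chosen only so the two outputs land near the 93% and 91% mentioned above; they are not the trial’s actual inputs:

```python
# Rough sketch of a "skeptical" (zero-centered) prior pulling the posterior toward no effect.
# Inputs are ILLUSTRATIVE, chosen to land near 93% / 91%; they are not the trial's numbers.
from statistics import NormalDist

def prob_benefit(effect_hat, se, prior_sd):
    """P(effect > 0) after a normal update with a zero-centered prior of width prior_sd."""
    precision = 1 / prior_sd**2 + 1 / se**2
    post_mean = (effect_hat / se**2) / precision
    return 1 - NormalDist(post_mean, (1 / precision) ** 0.5).cdf(0.0)

effect_hat, se = 0.074, 0.05                        # hypothetical observed benefit and its SE
print(prob_benefit(effect_hat, se, prior_sd=10))    # ~0.93 with an essentially flat prior
print(prob_benefit(effect_hat, se, prior_sd=0.11))  # ~0.91 with a skeptical, zero-centered prior
```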
Then we get to the presentation by the principal investigator, where she waves her mouse over values for day 7 and day 14 on the WHO clinical progression scale endpoint.
Here’s the slide itself:
What this shows is that, based on the WHO’s clinical progression scale, on day 14 (the pre-specified time period) the authors found a 27% improvement in favor of ivermectin, with a 98% probability of efficacy. This exceeds their 95% threshold, even while using the skeptical prior.
I would expect a scientifically neutral trial—since they were not able to use their primary endpoint—to promote this secondary endpoint and declare that it is strong evidence in favor of ivermectin.
Instead, listen to what the investigator said, because you won’t believe me if I quote her:
Basically, she is pointing to posterior P values of 0.97 and 0.98 and calling them “not significant”, which has no meaning in that context. The authors do something similar in the infographic attached to the paper:
See in the bottom right how they coin the term “posterior P value”, and then describe it as “not significant”. The academic reader will interpret this as “the p-value of this seemingly positive result is .91, which is not significant”, since it is far away from the traditional threshold of 0.05.
The better a hypothesis does in a Bayesian analysis, the worse it looks in this mangled, upside-down world of posterior p-value non-significance.
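Spelled out, the inversion looks like this (a trivial sketch, no trial data involved): the closer the posterior probability of benefit gets to 1, the further the number drifts from the p < 0.05 ritual a reader instinctively applies to anything labeled “P”.

```python
# The inversion in miniature: stronger Bayesian evidence of benefit (probability closer
# to 1) looks "less significant" to anyone who misreads the number as a frequentist p-value.
for prob_of_benefit in (0.80, 0.91, 0.98, 0.999):
    verdict = "significant" if prob_of_benefit < 0.05 else "not significant"
    print(f"P(benefit) = {prob_of_benefit:.3f} -> misread as a p-value: {verdict}")
```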
What we are seeing here is that when the Bayesian values can be twisted to look like they don’t cross some threshold, then they are presented as “not supporting” the use of ivermectin, even though that’s not how Bayesian probability works. When the Bayesian results do cross the threshold, we fall back to some sort of frequentist shell game that apparently concludes it isn’t “statistically significant.”
If you’re going to do that anyway, what is the point of using Bayesian statistics for the publication? What value do the pre-determined thresholds have if the commitment to them is not consistent? Why even put out a paper?
And somehow, it gets worse:
Apparently, the authors included 107 patients, split evenly between ivermectin and placebo, who had no symptoms on day 1 of their participation in the study. I interpret this to mean that while they did have a positive test and symptoms at some point, by the time they were included they had no signs of illness left. This group substantially skews the result against ivermectin, as we can see from their results.
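To see why, here is a toy arithmetic sketch with hypothetical numbers: patients who are already recovered contribute zero days of “time to recovery” to both arms, which mechanically shrinks any real between-arm difference toward zero.

```python
# Toy arithmetic sketch (hypothetical numbers): already-recovered patients add ~0 days
# to both arms, so they dilute any real time-to-recovery benefit toward the null.
def diluted_difference(mean_placebo_days, mean_treated_days, n_sympt, n_recovered):
    """Between-arm difference in mean recovery time after adding already-recovered patients."""
    dilution = n_sympt / (n_sympt + n_recovered)
    return (mean_placebo_days - mean_treated_days) * dilution

# hypothetical: 500 symptomatic patients per arm and a true 1.5-day benefit
print(diluted_difference(10.0, 8.5, n_sympt=500, n_recovered=0))   # 1.5 days
print(diluted_difference(10.0, 8.5, n_sympt=500, n_recovered=50))  # ~1.36 days -- the benefit shrinks
```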
A Fork in the Road
To summarize what we’ve learned so far, the authors of this study found part-way through the trial that their original endpoint did not have sufficient statistical power to produce a useful result. At that point, they were faced with a choice of which of the secondary endpoints to promote to primary. Given the circumstances, this needed to be the least questionable choice available.
The first option was an endpoint based on the standardized WHO clinical progression scale. On day 14, the time period matching the one prespecified for the original endpoint, this endpoint showed a 27% improvement, with a 98% probability of efficacy, which is above the pre-specified 95% threshold.
These results are in spite of over 50 patients in each group having no symptoms at the time of inclusion, and in spite of the skeptical prior, which makes them even more remarkable.
Instead, they chose to take a strange turn, inventing a new, self-reported primary outcome based on when the patients felt they no longer had any symptoms. It is clear that patients self-reported symptoms for a very long time: 28 days after inclusion (and therefore, on average, 34 days after first symptoms), 20% of patients still reported not having recovered:
Even so, the authors still found 91% probability of efficacy, with the no-symptom patients included, and the skeptical prior in place.
And yet, somehow, the conclusion of the paper is that….
These findings do not support the use of ivermectin in patients with mild to moderate COVID-19.
Somehow, ACTIV-6 makes me appreciate the attention to detail and loving honesty of the TOGETHER trial team. But even going past that, I think ACTIV-6 may be the strongest evidence of ivermectin’s efficacy, because the people who ran the trial clearly were not intending to arrive at that result. And yet, when we peel back the layers of obfuscation, we see it.
More on this in future installments.