The Potemkin Argument, Part 10: The Ballad of Lopez-Medina
Among the critiques of various studies, Scott highlights two studies as “bigger and more professional” that “don’t show good results for ivermectin.” The first one of these is Lopez-Medina et al., so let’s see what we’ve got with that one.
Lopez-Medina et al.
Lopez-Medina et al: Colombian RCT. 200 patients took ivermectin, another 200 took placebo.
We’d be remiss not to mention that the trial took place in the city of Cali, Colombia, between July and December 2020, and had 476 participants.
You might note that 200 + 200 does not add up to 476—and you’d be correct. Hold on to that observation; it will come in handy soon enough.
They originally worried the placebo might taste different than real ivermectin, then solved this by replacing it with a different placebo, which is a pretty high level of conscientiousness.
So, we have a major change in placebo composition mid-trial. Apparently, for Scott, this indicates “a pretty high level of conscientiousness.” I’m noting a sudden change in tone—from discarding studies for any reason—to an instinctive defense of this one. And yet, I’ve not seen such a blatant issue in any other trial.
Scott does not mention another—far more serious—event during the execution of this trial. From the paper:
On October 20, 2020, the lead pharmacist observed that a labeling error had occurred between September 29 and October 15, 2020, resulting in all patients receiving ivermectin and none receiving placebo during this time frame.
They apparently messed up and gave 38 placebo patients the treatment, so the investigators had to throw away two weeks of data—covering about 76 patients. “A pretty high level of conscientiousness,” indeed.
To my mind, once you get something that wrong, you simply abort the trial and move on. Perhaps you report the results to avoid publication bias, but nothing more. Instead, Lopez-Medina and team kept going as normal.
I suppose we’ll have to trust them that nothing else went wrong.
Primary outcome was originally percent of patients whose symptoms worsened by two points, as rated on a complicated symptom scale when a researcher asked them over the phone. Halfway through the study, they realized nobody was worsening that much, so they changed the primary outcome to time until symptoms got better, as measured by the scale.
I’m noting a complete lack of interest in why their expectation of how many patients would deteriorate was so far off. Apparently, they expected 18% of placebo patients to deteriorate; instead, only 5% did. How come so many fewer patients deteriorated, you ask? It’s hard to know. Perhaps they recruited exceptionally healthy patients. Perhaps it was the near-universal BCG vaccination.
Or perhaps it was something else. You see, a city-wide program of ivermectin early treatment for COVID-19 was initiated just as the Lopez-Medina trial was starting.
Interest in ivermectin (which is available over the counter without prescription in Colombia) was going through the roof—consumption ran at more than 10x normal rates—and it produced the kinds of search patterns we’ve seen elsewhere:
To everyone's surprise, when the results came out, the adverse event rates in the placebo and treatment groups were near-identical, and very, very suggestive of ivermectin use. Did I say everyone's? I meant no one's.
This list of symptoms, by the way, is described in the study protocol as "symptoms that have historically been reported in subjects receiving ivermectin.” Given the relatively high dose in this trial, it's kinda strange that they appear pretty much evenly across groups.
In fact, ivermectin use was so rampant, the consent forms had to tell patients they were going to be taking some strange-sounding drug—D11AX22—in order to prevent them from dropping out of the trial. As the lead investigator put it in his response:
The need to use D11AX22 rather than ivermectin in the ICF arose from the extensive use of ivermectin in the city of Cali during the study period, extensive recommendations from some political and medical leaders to use it against COVID-19, and the fact that the initial placebo had a different taste from ivermectin.
Note that this had not been mentioned at all in the paper; it was only raised because someone noticed it in an old version of the protocol.
You might think that this covers all the problems with the change of primary endpoint, but, unfortunately, there’s more. As we’ve discussed in the past, the size of a trial relates to the number of patients required to have a high likelihood of achieving statistical significance on the primary endpoint—if the hypothesis is correct. You’d expect that a change in primary endpoint would lead to a change in the optimal study size, especially given the fact that the investigators were aware of the substantial lack of deterioration in the placebo group.
Nope. Apparently—with the new endpoint—the number of patients required was identical to the one they required initially, no change needed. What a coincidence! Even though the study population was so low-risk that only 5% of placebo patients ever deteriorated, and even though ivermectin treatment did reduce median duration of symptoms by two whole days, there just weren’t enough patients to make a “statistically significant” difference.
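To see why an unchanged sample size is suspicious, here is a minimal sketch using the standard two-proportion sample-size formula. The 18% and 5% deterioration rates come from the discussion above; the assumed 50% relative risk reduction, and the conventional 80% power at two-sided alpha of 0.05, are my own illustrative choices:

```python
import math

# Standard normal quantiles for two-sided alpha = 0.05 and 80% power.
Z_ALPHA = 1.96   # z for alpha/2 = 0.025
Z_BETA = 0.84    # z for power = 0.80

def n_per_group(p1: float, p2: float) -> int:
    """Patients required per arm to detect event rates p1 vs. p2."""
    p_bar = (p1 + p2) / 2
    numerator = (Z_ALPHA * math.sqrt(2 * p_bar * (1 - p_bar))
                 + Z_BETA * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(numerator / (p1 - p2) ** 2)

# Planning assumption: 18% of placebo patients deteriorate, and the
# treatment halves that risk (the 50% reduction is illustrative).
n_planned = n_per_group(0.18, 0.09)

# Observed reality: only about 5% of placebo patients deteriorated.
n_needed = n_per_group(0.05, 0.025)

print(n_planned, n_needed)  # the second figure is roughly 4x the first
```

Under these illustrative assumptions, the observed 5% deterioration rate would call for roughly four times as many patients per arm as the original plan, which is why an unchanged sample-size requirement is hard to credit.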
When the principal investigator was asked to justify the decision, he referred to the same literature that had produced their initial, now-falsified expectation of 18% deterioration, making no mention of the fact that the placebo group somehow seemed to be doing exceedingly well in this context.
Anyway, back to Scott:
In the ivermectin group, symptoms improved that much after 10 days; in the placebo group, after 12, p = 0.53. By the end of the study, symptoms had improved in 82% of ivermectin users and 79% of controls, also insignificant. 4 patients in the ivermectin group needed to be hospitalized compared to 6 in the placebo group, again insignificant.
This study is bigger than most of the other RCTs, and more polished in terms of how many spelling errors, photographs of computer screens, etc, it contains. It was published in JAMA, one of the most prestigious US medical journals, as opposed to the crappy nth-tier journals most of the others have been in.
Look. They may have messed up their statistical power calculations, continually changed their protocol, and mixed up the placebo with the treatment for a couple of weeks, but at least the English in the paper is excellent, the charts are super crisp, and this prestigious journal (that has been shown to have direct financial benefit from its relationship to pharmaceutical companies) is telling me it’s good, so what can possibly go wrong?
Forgive the sarcasm, but sometimes I’m confused about which way the causality is supposed to flow. If we assume that studies are high-quality because they appear in the likes of the Journal of the American Medical Association, and we assume that the journals are high-quality because of the studies that appear in them, we have made a near-impenetrable epistemic bubble.
At least we can expect such a high-impact journal to foster debate, right? And yet, JAMA refused to publish a letter signed by over 160 doctors voicing concerns.
Let’s keep going:
When people say things like “sure, a lot of small studies show good results for ivermectin, but the bigger and more professional trials don’t”, this is one of the two big professional trials they’re talking about.
Ivermectin proponents make some good arguments against it. In order to get as big as it did, Lopez-Medina had to compromise on rigor. Its outcome is how people self-score their symptoms on a hokey scale in a phone interview, instead of viral load or PCR results or anything like that. Still, this is basically what we want, right? In the end, we want people to feel better and less sick, not to get good scores on PCR tests.
Actually, the endpoint was clearance of all symptoms, meaning that if you felt better and less sick—but had a lingering cough, as is usual with COVID-19—then as far as this trial was concerned, there was no difference in outcome.
Also, it changed its primary outcome halfway through; isn’t that bad? I think maybe not; the reason we want a preregistered primary outcome is so that you don’t change halfway through to whatever outcome shows the results you want. The researchers in this study did a good job explaining why they changed their outcome, the change makes sense, and their original outcome would also have shown ivermectin not working (albeit less accurately and effectively).
Ah, the return of “shown ivermectin not working” to mean “not reached statistical significance.” Neither outcome reached statistical significance, but the original outcome hinted at a much higher point estimate of efficacy. Not a good look for an investigator who wants to appear unbiased.
By the way, if “deterioration by 2 degrees on an 8-point scale” is not producing enough statistical power, the obvious change is to “deterioration by 1 degree on an 8-point scale.” Not to an endpoint that gives the appearance of wanting to produce a specific result.
I don’t know of any evidence that they knew (or suspected) final results when switching to this new outcome, and it seems like the most reasonable new outcome to switch to.
The evidence that they knew (or suspected) what the final results would be is that they changed their endpoint because they were not observing sufficient deterioration. This indicates some level of data visibility by the investigators. Otherwise, how would they know to change the endpoint?
Finally, their original placebo tasted different from ivermectin (though they switched halfway through). This is one of the few studies where I actually care about placebo, because people are self-rating their symptoms. But realistically most of these people don’t know what ivermectin is supposed to taste like.
As covered above, not only did people know what ivermectin tasted like, they actually had quite likely taken it. Mind you, this is ivermectin in liquid form, not quite a taste you can avoid by swallowing a pill quickly.
What’s more, one of my findings during the research for this article is that while the paper states that patients needed to have not taken ivermectin for five days in order to be eligible for randomization, the clinicaltrials.gov pre-registration of the trial didn't say five days... it said 48 hours without ivermectin was enough, in contradiction to what was published.
To make matters worse, the registration was updated to agree with the paper only after trial enrollment had ended, raising serious concerns. This is visible only to someone who looks into the protocol history. Despite intense scrutiny of this trial by many others, I’m the first to bring this to light, a year and a half after the trial ended.
A request: if any Colombians or other Spanish speakers can help find the original trial protocol—hosted somewhere that indicates it was uploaded before or during the trial (like one of the local ethics boards mentioned in the paper)—that would be incredibly helpful in figuring out this 48 hours vs. five days question.
Let me break down the significance of this: if the exclusion criterion was as seen in the registration, a patient could get COVID-19 symptoms, use ivermectin for two days, skip one day of dosing, and be randomized into this trial the next day to take “D11AX22” or placebo. This would not constitute a protocol violation.
Also, they did a re-analysis and found there was no difference between the people who got the old placebo and the new one.
By “no difference,” I assume Scott means “no statistically significant difference in results.” But I can’t find any such analysis in the paper or appendix. Perhaps I missed it. But also, it would be a very weak kind of evidence, given that the whole trial did not reach significance. Obviously, there would have to be extreme differences for any particular subset to reach significance with far, far fewer patients. Now that would be strong evidence, and—shockingly—it might have sort of happened.
A third-party group—given access to the data—figured out that more than one person per household was allowed to take part in the study after the placebo changed.
This means that two family members could have gotten sick, gotten into the trial, and somehow ended up one in treatment and one in placebo. Did they pool their drugs? Were they able to figure out that they were taking different things? The investigators won't tell us what was in the new placebo, only that it “tasted similar.”
"What was in the placebo," is not really the kind of question you want to be asking in these trials. There should be a straightforward answer.
Regardless, these investigators indeed calculated the Risk Ratio (RR) for the different periods:
They found that the RR of the saline/dextrose placebo period was 0.72, but not significant. The RR of the updated placebo period was 0.92, but not significant. I don’t know that I’d call a 20% shift “no difference.” But it gets better: they calculated what would happen if they compared the discarded patients from the two-week period where the placebo was actually treatment against the placebo group of the rest of the trial.
The result? A statistically significant improvement of 56% (RR 0.44, p=0.033). Make of it what you will.
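For readers who want to check this kind of arithmetic themselves, here is a minimal Python sketch of a risk ratio with a 95% confidence interval (the standard log-transform method). The 4/200 vs. 6/200 hospitalization counts are the ones quoted from the trial earlier; everything else is generic:

```python
import math

def risk_ratio(events_a: int, n_a: int, events_b: int, n_b: int):
    """Risk ratio of group A vs. group B, with a 95% CI (log method)."""
    rr = (events_a / n_a) / (events_b / n_b)
    # Standard error of log(RR).
    se = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lo = math.exp(math.log(rr) - 1.96 * se)
    hi = math.exp(math.log(rr) + 1.96 * se)
    return rr, (lo, hi)

# Hospitalizations: 4/200 on ivermectin vs. 6/200 on placebo.
rr, (lo, hi) = risk_ratio(4, 200, 6, 200)
print(round(rr, 2), round(lo, 2), round(hi, 2))
# A CI that straddles 1.0 is what "not statistically significant" means here.
```

The same function can be applied to any of the period subsets discussed above, given their underlying event counts.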
Concerns of Bias
The reason the change in endpoint—as well as the seeming lack of statistical power—rubs people the wrong way is that it interacts badly with the various other events that went on in and around the trial. For example, a Colombian left-populist, ex-rebel opposition leader (now president-elect) was promoting ivermectin as a COVID-19 cure in 2020. I'm going to guess the establishment stayed neutral and focused on doing science. Or maybe they did what they did everywhere else?
The other factor we have to delve into is the conflicts of interest present here. As Scott wrote elsewhere in this essay, "...ivermectin boosterism and vaccine denialism are closely linked." While that may be true, the mirror image also seems to hold: "...vaccine boosterism and ivermectin denialism are closely linked."
At the same time that Dr. Lopez-Medina's organization was running this trial, it was also running the local Janssen vaccine trial for J&J.
In fact, Dr. Eduardo Lopez-Medina himself is a pediatric infectious disease specialist with a focus on vaccines. Here's a dengue vaccine candidate he worked on:
So you won't be shocked to see Sanofi, Janssen, GSK, Gilead, and others in the COI (conflicts of interest) statement, will you?
Not surprisingly, Dr. Lopez-Medina was involved in two more vaccine trials in 2021: one for Clover Biopharmaceuticals' SCB-2019 protein subunit vaccine, and also the Solidarity Vaccines trial for the WHO. Both are registered here.
With that backdrop, the paper needlessly makes this claim:
preliminary reports of other RCTs of ivermectin as treatment for COVID-19 with positive results have not yet been published in peer-reviewed journals
Why would an independent investigator downplay the state of the evidence, omitting any mention of the eight peer-reviewed RCTs with positive effects published prior to the paper?
That said, conflict of interest arguments are still the least persuasive to me, as they are uncomfortably close to mind-reading: I wouldn’t make up my mind about a trial solely based on their presence. For all we know, all this could mean nothing.
Still, the change of endpoint, as well as the lack of concern for maintaining statistical power—both before and after the change—is consistent with an investigator who either doesn’t care whether the trial can reach significance even in principle, or who is actively pursuing a result that can be framed negatively. Given that the trial was self-funded by the investigator’s institution, the “don’t care” option becomes less persuasive.
Scott Comes at Ivmmeta
At this point in the essay, Scott clearly feels he has built up enough ammunition to reveal his thoughts about ivmmeta, which he has been treating respectfully up to this point. Let’s see what he has to say:
I’m making a big deal of this because ivmmeta.com - the really impressive meta-analysis site I’ve been going off of - puts a special warning letter underneath their discussion of this study, urging us not to trust it.
I hope readers now have some understanding why ivmmeta was concerned about this study.
They don’t do this for any of the other ones we’ve addressed so far - not the one by the guy whose other studies were all frauds […]
I don’t believe we’ve seen a “guy whose other studies were all frauds” thus far. Charitably, Scott might be talking about Hector Carvallo, who has been implicated in bad research practices in a different study. We’ll be discussing that study in the next part of this series, which should make clear why I don’t think throwing around words like “fraud” is warranted. Regardless, the suggestion that every study by the same author should carry a warning is quite expansive, and it is a position that not even the “fraud hunters” group has articulated.
[…] not the one where 50% of 21 people had headaches, […]
This must be about the Babalola et al. study. As discussed, Scott is propagating a misunderstanding by Gideon Meyerowitz-Katz.
[…] not the unrandomized one where the groups were completely different before the experiment started, […]
I suspect Scott is talking about Elalfy et al., yet another of the studies he completely misconstrued, accusing it of being surreptitiously non-randomized, when the trial clearly stated its non-randomized design in five different places.
[…] not even the one by the guy accused of crimes against humanity.
This must be referring to Cadegiani, whose section Scott has had to severely correct as its contents were false to the point of libel.
Only this one. This makes me a lot less charitable to ivmmeta than I would otherwise be; I think it’s hard to choose this particular warning letter strategy out of well-intentioned commitment to truth.
As you can see, it’s hard to defend the position that the four other studies, each of which Scott considers deserving of a big fat warning, rise to the level of concern that Lopez-Medina et al. merits.
This is a good demonstration of why I called this series “the Potemkin argument.” On the basis of previous false portrayals, Scott builds a false conclusion: that ivmmeta cannot be trusted.
And yet, all these studies are excluded by ivmmeta in its exclusion analysis, as ivmmeta does not consider their results to be of high quality. Do I think the editorial angle of ivmmeta is fully neutral? No. But the fact that they don’t plaster their pages with critiques by Gideon Meyerowitz-Katz and Nick Mark—that don’t stand up to scrutiny—is not the reason.
One more thing to keep in mind is that ivmmeta is very open to feedback. Often, I’ll send them anonymous feedback through the form on their page. Sometimes, this is about a pro-ivermectin claim not being correct. They remove it within hours. Other times, I let them know of some new observation I came up with, and on occasion they do nothing, even if it is pro-ivermectin. As a rule, when that happens, it’s because they have found a flaw in my argument before I did. I’ve also heard from definitely anti-ivermectin folks that ivmmeta has corrected things they pointed out, too. The model of ivmmeta as a propaganda outlet is hard to make fit with these observations.
On the other hand, Scott has declined to read this sequence of essays and correct the dozens of errors pointed out here. I do hope he changes his mind, and if you know him personally, I hope you will encourage him to at least give them a read. Knowing this, as well as the quality of ivmmeta’s content when compared to Scott’s essay, calling ivmmeta out for a lack of even-handedness is over the top.
I recommend that everyone who has objections about specific things on the ivmmeta website try this: send them a message with some specific issue you found and see what happens. In my experience, the more factual and the more specific one is, the more likely they are to incorporate the feedback. If you don’t see it incorporated in a day or two, let me know. I’d be interested in hearing your experiences either way.
In this light, I have to consider the loss of trust in ivmmeta that Scott declares to be entirely a construct of his own misunderstanding, seemingly cultivated by approaching the situation only through people like Gideon Meyerowitz-Katz. I really do wonder how he squares distrusting ivmmeta on suspicion of motivated reasoning with trusting Gideon Meyerowitz-Katz, who has literally declared his motivated reasoning since 2020—before the vast majority of the evidence was in:
They just really don’t like this big study that shows ivermectin doesn’t work.
I guess this sentence should stand as a reminder to all: avoid ascribing intentions to others. You might be wrong. Just as Scott was here.
But let’s let him continue:
Also, the warning itself irritates me, and includes paragraphs like:
RCTs have a fundamental bias against finding an effect for interventions that are widely available — patients that believe they need treatment are more likely to decline participation and take the intervention [Yeh], i.e., RCTs are more likely to enroll low-risk participants that do not need treatment to recover (this does not apply to the typical pharmaceutical trial of a new drug that is otherwise unavailable). This trial was run in a community where ivermectin was available OTC and very widely known and used.
Nobody else worries about this, and there are a million biases that non-randomized studies have that would be super-relevant when discussing those, but somehow when they’re pro-ivermectin the site forgets to be this thorough.
Scott fails to understand the problem being raised in the paragraph he’s quoting. Ivmmeta makes a specific point about the structural issues that arise specifically when testing off-patent repurposed drugs, especially ones available over the counter in the area where the study is taking place. When he says “nobody else worries about this,” he must be thinking of investigators of proprietary drugs, which are generally not available to patients outside the trial setting. It’s hard to see why they would worry about this. If you’re testing molnupiravir, you’re not exactly worried that people might decline to enroll, or might drop out, because they can just get it at a pharmacy without a prescription instead.
I think a better pro-ivermectin response to this study is to point out that all the trends support ivermectin. Symptoms took 10 days to resolve in the ivermectin group vs. 12 in placebo; 4 ivermectin patients were hospitalized vs. 6 placebo patients, etc. Just say that this was an unusually noisy trial because of the self-report methodology, and you’re confident that these small differences will add up to significance when you put them into a meta-analysis.
Here’s the thing. If it’s a bad study, it’s a bad study. We’re not supposed to be playing “arguments as soldiers” here, and we’re definitely not supposed to be playing "studies as soldiers,” either. When you see a community put up a principled stand against a badly executed trial, the response “hey, it kinda helps you out, you should take it,” doesn’t exactly inspire confidence that the scientific process is what’s being followed here.
The noted switch from hair-trigger trial-evaluator to this ultra-charitable mode is quite jarring. Scott omits several critical elements that give people pause about the Lopez-Medina trial, and downplays others.
What’s more, on the basis of his work up to this point, he uses the Lopez-Medina review to attack ivmmeta for bias. I am pretty sure that—like everyone in this discussion—they have some biases. But the accusation that this is warping their content or results must be made with enough support to stick or not be levied at all. As it stands, that accusation says more about the process Scott used to come to his conclusion than it says about ivmmeta.
While I don’t appreciate the narrative that the “fraud hunters” have been building on the basis of their findings, I nevertheless accept those findings and believe they should be thanked for their work on those. If we are to make sense of this landscape, I sincerely believe we’re better off accepting people’s biases and making the most of what evidence and analysis they provide, rather than trying to discredit and get them out of the conversation for infractions real or imagined.
Is ivmmeta perfect? No. Is it the worst actor in this whole debate? While many have tried to make that case, I’ve seen nobody make it convincingly. If anyone has principled critiques of ivmmeta, please let me know.