The Problem with the TOGETHER Trial
The most sophisticated trial of early treatments for COVID-19 doesn't quite add up.
This article is part of a series on the TOGETHER trial. More articles from this series here.
The TOGETHER trial has been hailed as a methodological marvel, and proof that ivermectin can’t possibly help against COVID-19.
Despite the trial’s sophisticated design, however, I’ve come to believe that its execution was flawed and its conclusions illusory, and this article will demonstrate why, using materials released by the trial investigators themselves.
While that doesn’t imply that ivermectin does work, I believe any fair-minded reader—regardless of their views on any particular molecule—who reviews the evidence as released by the trial investigators and analyzed here will agree that the results cannot be taken at face value.
The TOGETHER trial is an adaptive platform trial that has produced numerous publications over the last year. Its most recent paper has sparked an online effort, spanning a wide range of fields of expertise, to reconstruct the study’s context from the various documents and findings, and to answer the many questions that the trial’s outputs have created. Since there is no single authoritative source of truth on the sequence of events that took place, we are forced to reassemble Humpty Dumpty from fragments, including the following documents:
Hydroxychloroquine and Lopinavir/Ritonavir publication (JAMA)
As tends to happen when an online community descends upon a particular artifact, many concerns have been raised, some more valid than others. When it comes to the TOGETHER trial however, there has been a distinct signal of concern. According to one site cataloguing the online effort to understand the trial, there are currently 43 distinct concerns that have been raised about the trial, most of them with real validity:
It’s easy to dismiss this litany of issues as a Gish Gallop. And indeed, it’s implausible that 43 or more distinct issues would exist in a single manuscript. So perhaps some will dismiss them as noise. This article will demonstrate that most of the issues listed are not actually distinct, but deeply related to a central failure of the protocol of the TOGETHER trial.
An Algorithm’s Tale
To get there, however, we have to zoom out, starting from a bird’s eye view. The TOGETHER trial included studies of a number of medicines, with different arms starting and ending at different times. For the period of particular interest for this analysis, between January and August 2021, the main arms of the study (metformin, low-dose ivermectin, high-dose ivermectin, fluvoxamine, and placebo) come together to assemble a rough timeline of events which can be summarized as follows:
The best way to explain what went wrong is to walk through the main events of the trial from the vantage point of the algorithm tasked with patient randomization and allocation. And while some have said that I am speaking out of turn when it comes to biology, algorithms are in fact well within my field of academic expertise, so I may have been in the right place at the right time to notice something others missed.
We’ll use the enrollment graph, shared by the principal investigator of the study on August 6, 2021, to help us navigate:
The data looks hard to extract, but with the application of a grid we can infer enrollment numbers for each arm by week—numbers which are corroborated by the published papers of the TOGETHER trial, and which tell quite the tale.
The allocation for the 2021 experiments begins on January 15, 2021. The patients start trickling in, and the first patient is allocated to metformin. Over the next few weeks, the pace of enrollment picks up. Four arms are open to patients: fluvoxamine, ivermectin (low dose), metformin, and placebo. The algorithm’s job is to place the patients on different arms, while maintaining the relative balance between the arms in terms of overall patient count. The algorithm also tries to make sure the many sites involved in the study get a roughly equal number of patients in every arm. In technical parlance, the approach is called block randomization, stratified by site, with variable patient set sizes. What this means is that the algorithm will create blocks of patients from each site, making sure to have enough patients in each block such that it can evenly—but randomly—assign them to all the arms that are open to enrollment at any given time. For example, a block could contain eight patients, with two patients being assigned to each of the four arms. The size of the block itself is variable, within some multiple of the minimum possible block size, which helps defend against certain kinds of bias.
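The mechanics described above can be sketched roughly as follows. This is a minimal illustration of block randomization with variable block sizes, not the trial’s actual implementation; the arm names come from the text, everything else is assumption:

```python
import random

def block_randomize(arms, block_multiples=(1, 2)):
    """Permuted-block randomization with variable block sizes.

    Each block holds an equal number of slots per arm; the block size is
    a random multiple of the minimum size (one slot per arm), which is
    what makes the block size "variable".
    """
    while True:
        k = random.choice(block_multiples)   # slots per arm in this block
        block = list(arms) * k
        random.shuffle(block)                # random order within the block
        yield from block

# Stratification by site: each site gets its own independent stream.
arms = ["fluvoxamine", "ivermectin-low", "metformin", "placebo"]
# Fixing the multiple at 1 for the demo: each block of four contains every arm once.
site_stream = block_randomize(arms, block_multiples=(1,))
first_block = [next(site_stream) for _ in range(4)]
```

Within any completed block the arms are exactly balanced, while the shuffle keeps individual assignments unpredictable; the variable multiple makes it harder for staff to infer the final assignments of a block from the ones already made.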
By February 15th, barely four weeks into the trial, a new revision of the trial protocol was created. By then, 19 patients had been assigned to the ivermectin arm. The new protocol describes some big changes in how the trial works:
The low-dose ivermectin arm is to be stopped, replaced with a higher dose ivermectin arm, administering more than three times the dose.
The dosing will operate in a radically different way: instead of having all patients in every arm—placebo or not—take a treatment twice a day for 10 days, each treatment will follow its own schedule (3-day, 10-day, etc). Meanwhile, the placebo arm patients will be classified internally into smaller groups, each following a placebo schedule that mimics an active arm in the study (3-day placebo, 10-day placebo, etc.)
Vaccination against SARS-CoV-2 changes from being one of the inclusion criteria to being an exclusion criterion.
Two new sites are added to the trial, increasing the total number from three to five.
The primary endpoint is changed from >12 hours of ER observation to >6 hours of ER observation.
The first change is by far the most impactful, and by way of explanation for this change, the NEJM ivermectin paper writes:
On the basis of feedback from advocacy groups, we modified the protocol to specify three days of administration of ivermectin.
It should be noted that the authors of the study include Dr. Craig Rayner, who has made a forceful argument that ivermectin—at the typical doses prescribed, or higher—cannot reach the plasma concentration levels that would have an effect on COVID, as per the in vitro results. Whether or not that argument is exaggerated or invalid, if this understanding of blood plasma levels was available to the TOGETHER team, it’s unclear why they would start the trial with a dose that was significantly lower than the consensus recommendation at the time, and for which absolutely no support of effectiveness existed.
It should be noted that the original dosing for the TOGETHER ivermectin trial, while being reported as 1 dose of 400mcg/kg, actually stopped scaling the dose at 60kg. For a trial with a median BMI of 30, and given the average height of Brazilian males, we would expect the average male weight to be far closer to 90kg, which would make the effective dose for men closer to 267mcg/kg. And the higher the weight (and therefore the risk from COVID-19), the lower the per-kg dose the patient received. I’ve not been able to find any justification in research literature for limiting the total dose for heavier patients. It appears to be an innovation of the TOGETHER trial.
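The arithmetic behind that ~267mcg/kg figure is simple; the 60kg cap and the 400mcg/kg label come from the reported dosing above:

```python
def effective_dose_mcg_per_kg(weight_kg, nominal=400, cap_kg=60):
    """Per-kg dose actually received when the total dose stops scaling at cap_kg."""
    return nominal * min(weight_kg, cap_kg) / weight_kg

effective_dose_mcg_per_kg(60)  # 400.0: at or below the cap, the label is accurate
effective_dose_mcg_per_kg(90)  # ~266.7: a 90kg patient receives two-thirds of the label
```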
So it definitely was correct to increase the dosing, and this corrected protocol was submitted for approval on February 19, 2021, to CONEP, the Brazilian ethics board.
In adaptive trials, a board named the Data and Safety Monitoring Committee (DSMC) or similar is tasked with making decisions on such big changes. We would expect this committee to have reviewed the data and other feedback, halted the low-dose ivermectin arm, and decided on the new protocol to submit to local authorities. The actual sequence of events does not seem to match this expectation. While CONEP was reviewing the new protocol, patients continued to be allocated to the low-dose ivermectin arm, still receiving a one-day treatment protocol. This raises questions on two fronts. First, if the trial investigators decided to alter the dose because they agreed the lower dose would be ineffective, assigning further patients to it seems like a breach of ethical standards. Second, the data gathered would not be of much use, given that, as per the application to CONEP, the arm was to be halted. This leaves us with the uncomfortable dilemma of having to wonder whether the DSMC decided to breach the trust of the trial participants and continue assigning them to a condemned arm of the study, or whether the investigators decided to revise the protocol themselves, bypassing the DSMC altogether.
It’s not clear why, but around March 4th—three weeks and 59 patients later—the low-dose ivermectin arm was halted, with the CONEP decision still pending. At that point, the algorithm would have to alter its parameters of operation. Instead of creating blocks that can cover three treatments and a placebo arm, it would have to do the same for the two remaining treatments (fluvoxamine and metformin) and a placebo arm. For the next two weeks and change, the allocation to those three arms increased, since no patients were allocated to the now-halted low-dose ivermectin arm.
According to CONEP, the authorization to put the new protocol in place was granted on March 15th. The authors of the ivermectin paper—mistakenly, by the looks of it—report that this approval was given on March 21st. Regardless, the new protocol is registered on clinicaltrials.gov on March 21st, and the high-dose ivermectin arm starts getting patients assigned to it, starting March 23rd.
The next few weeks will be the worst of the pandemic in the country, and especially in Belo Horizonte, where all the active sites are located.
As soon as the high-dose ivermectin arm starts, the algorithm assigns patients aggressively to that arm, placing many fewer than expected to the placebo arm, as well as to the fluvoxamine and metformin arms. In the week of March 22nd, 57% (83/146) of patients are assigned to ivermectin. The following week, it’s 41% (37/90) of patients.
The 10 most critical weeks of the trial can be seen below in percentage form, sourced from the same data the investigators shared. We’d expect the arms to be roughly balanced from week to week, but instead we see wild fluctuations.
This diagram can be read as 5 distinct week-pairs, where allocations keep changing:
Row 1-2: The two last weeks of the low-dose ivermectin arm
Row 3-4: The two week gap between ivermectin arms
Row 5-6: The two first weeks of the high-dose ivermectin arm
Row 7-8: The two weeks after the termination of the metformin arm
Row 9-10: The first two weeks where balance is restored
Under the 1:1:1:1 randomization scheme, we’d expect allocation to hover around 25% (or 33% when there are only three arms to work with), so why did the algorithm malfunction so spectacularly?
The discontinuity, and the eventual catch-up, becomes clear when we look at 2021 recruitment to different arms cumulatively.
My hypothesis is that the allocation algorithm was configured to “reactivate” the original ivermectin arm, instead of having the original arm terminated and a new high-dose one created. Seeing the ivermectin arm become active again, the algorithm operated as if the blocks that had been created in previous weeks—to cater for a 3-arm trial—were essentially incomplete. So, as suitable patients came in, it disproportionately allocated them to the ivermectin arm, backfilling the 3-arm blocks and turning them into 4-arm blocks. The two new sites that were added with the new protocol were unaffected, having new blocks created from scratch—which helps explain why the allocation to the ivermectin treatment arm that week was not 100%, but merely 57%.
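The hypothesized behaviour can be sketched in a few lines. This is purely illustrative: we do not know the trial’s actual implementation, and the block count, backfill policy, and patient numbers below are all made up to show the qualitative effect:

```python
import random

random.seed(1)

# Hypothetical sketch of the backfilling behaviour. During the two-week
# gap, blocks were built for three arms; once the ivermectin arm
# "reactivates", each of those blocks looks like an incomplete 4-arm
# block whose missing slots are all ivermectin.
three_arms = ["fluvoxamine", "metformin", "placebo"]
pending = ["ivermectin"] * 20   # assumed count of backfill slots

def next_assignments(n):
    """Serve pending backfill slots first, then fresh, balanced 4-arm blocks."""
    out, fresh = [], []
    for _ in range(n):
        if pending:
            out.append(pending.pop())
        else:
            if not fresh:
                fresh = three_arms + ["ivermectin"]
                random.shuffle(fresh)
            out.append(fresh.pop())
    return out

week1 = next_assignments(40)
week1.count("ivermectin") / len(week1)   # 0.625, far above the expected 0.25
```

With 20 pending slots and 40 new patients, 25 of 40 land on ivermectin even though every fresh block is perfectly balanced—the same qualitative pattern visible in the enrollment data.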
From the point of view of the algorithm, it was fixing an imbalance in the state of the trial’s data model. In the real world, however, this created a huge problem for blinding, randomization, and control. The entire point of a double-blind RCT—touted as the gold standard in medical research—is to be double-blind, randomized, and placebo-controlled. The algorithmic misfiring of matching placebo patients recruited under one protocol with treatment patients recruited later, under a different protocol, compromised each of these key properties. Let’s walk through the issues in detail:
Blinding: Knowing that the patients in a specific week are disproportionately in the treatment arm of a specific medication opens the door to the many biases a double-blind RCT is supposed to control for. This blinding failure compounds the issue introduced by the new protocol when it split the control group into different dosing subgroups. Anyone seeing a patient go on a 3-day regimen (patients were grouped by letter according to their regimen) would know that the patient had a higher-than-random chance of being on ivermectin. The combination of these two issues means that the organizers of the trial would have been able to—especially in these first few weeks—tell with high confidence (>75%) that a 3-day-regimen patient was an ivermectin treatment patient. Further, since 3-day patients were assigned different letter groups depending on whether they were taking the placebo or the treatment, observing the relative frequencies of the letter groups—which in those first few weeks were even more lopsided than usual—would allow the nominally blinded staff to know exactly who was on the ivermectin treatment, who was on the ivermectin-like 3-day placebo, and who was assigned to other arms.
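To make the unblinding arithmetic concrete: only the 83 ivermectin assignments in the week of March 22nd come from the source data; the size of the 3-day placebo subgroup that week is a made-up figure for illustration. An observer who knows a patient is on a 3-day regimen can compute:

```python
ivm_treated_that_week = 83     # week of March 22nd, from the enrollment data
placebo_3day_that_week = 10    # hypothetical subgroup size, for illustration only

p_treatment_given_3day = ivm_treated_that_week / (
    ivm_treated_that_week + placebo_3day_that_week
)
# With these numbers, roughly 0.89: a 3-day regimen almost certainly means ivermectin.
```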
Randomization: It goes without saying, but a 1:1:1:1 randomization scheme—even when using blocks and stratification—should not result in the allocation anomaly visible to the naked eye in the Recruitment over Time diagram below. The features of the randomization algorithm should result in some fluctuations, which is probably why it was coded in such a way as to recover, but to put it plainly, this isn’t supposed to happen:
Looking at the same data without the stacking makes this even more plain. The weeks around March and April 2021 stand out like a sore thumb:
Control: What emerges from the papers published by the group—and is consistent with the block backfilling hypothesis—is that the new high-dose ivermectin patients were matched with placebo patients who had been treated in the weeks before March 23rd, and selected under a different protocol. The authors explicitly deny this in their paper, but there is no other way to reconcile the data they have released so far. Besides the obvious matter of violating the core principles of RCTs, there are two massive confounders we can ascertain:
First, there is a deeply relevant protocol change: the original protocol treated vaccination for SARS-CoV-2 as an inclusion criterion, classifying those patients as high-risk. The protocol enacted on March 21st treated vaccination for SARS-CoV-2 as an exclusion criterion, meaning that being vaccinated would rule a patient out of joining the trial at all, regardless of other factors. Comparing possibly-vaccinated placebo patients with definitely-unvaccinated treatment patients is nobody’s idea of a placebo-controlled trial.
In addition, the Gamma variant wave was on the rise during the first period, and at its absolute peak during the second period, giving us another reason why the two patient groups are vastly different from each other:
Using local data from the city of Belo Horizonte—the vicinity of which contains all of the study sites active at the time of the allocation discontinuity—we can see that the period from which the placebo group was preferentially allocated could well have an expected CFR that is half the expected CFR for patients recruited in the immediately following period, in which patients were over-allocated to the high-dose ivermectin arm. And since the patients in the trial were explicitly selected to be high-risk, we can expect the CFR among them to be substantially higher, and the difference even more impactful. In fact, by combining the results for the placebo arm as reported in the metformin, fluvoxamine, and ivermectin papers, we can infer the mortality in the period between the start of the high-dose ivermectin placebo group and the end of the metformin placebo group. Indeed, it seems that mortality rose dramatically—from 2.6% to 5.6%—before settling back down to 3.1%. I have intentionally left the date of the start of the high-dose ivermectin placebo group undefined in the diagram below, as that date is essentially the bone of contention between me and the authors of the TOGETHER study. However, we don’t need to define the start date to see the drastic change in baselines around the period of interest.
Note that if—as the authors claim—the interim period begins on March 23rd, they would have to have recruited 126 patients to placebo in two weeks, when in the preceding 2.5 months they had recruited only 77. One more reason why the claim that all enrollment for the high-dose ivermectin trial started on March 23rd stretches credulity. Back to mortality: in the published results we are told that there were 21 deaths among patients treated with high-dose ivermectin, versus 25 deaths among patients treated with placebo. If the difference in CFR and vaccination status is obscuring an additional 2-4 deaths—an additional 50-100% improvement in favor of ivermectin treatment over placebo—the final results would be vastly different. And while this is in terms of mortality, we can expect similar or even more pronounced differences to be reflected in the hospitalization and extended ER observation events—the study’s primary endpoints. You can see more detail on the placebo offsetting in part 1 of my previous essay on this topic.
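The sensitivity of the headline result to a handful of obscured placebo deaths is easy to check. In this sketch, the 21 and 25 are the published death counts; treating the two arms as equal in size lets the risk ratio reduce to a simple ratio of deaths, and the “extra” deaths are the hypothesized confounding:

```python
ivm_deaths, placebo_deaths = 21, 25   # as published

def death_ratio(extra_placebo_deaths=0):
    # With equal arm sizes, the risk ratio reduces to the ratio of death counts.
    return ivm_deaths / (placebo_deaths + extra_placebo_deaths)

round(death_ratio(0), 2)  # 0.84 on the published numbers
round(death_ratio(2), 2)  # 0.78 if two placebo deaths were obscured
round(death_ratio(4), 2)  # 0.72 if four were
```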
Even if we assume the artifacts we see are the result of algorithmic misconfiguration, and that the researchers were operating in good faith, it was incumbent upon them to be open about this issue as soon as it was noticed. Instead, their ensuing actions and communications have been focused on obscuring and denying any issue at all. Consequently, irrespective of the original intentions (or carelessness), the lack of disclosure—followed by a publicity campaign touting the results of the study as definitive and final—renders a complete assumption of good faith nonviable.
The Mysterious S.A.P.
To make matters even more confusing, the protocol enacted on March 21, 2021, refers to a Statistical Analysis Plan (SAP), but the first such document appears to be dated March 27th. Perhaps six days is not such a big delay, but the final two signatures on this document appear to have been placed by April 8th. Did the document come into force on March 27th? Or perhaps on March 29th, when principal investigators Dr. Mills and Dr. Reis signed? Or on April 8th, when Dr. Harari and Dr. Ruton signed on behalf of Cytel—the company responsible for executing the plan?
One may ask why the exact date matters. Well, it turns out that this SAP defines a different randomization algorithm than the one defined in the protocol that came into force on March 21st. The key part is this:
Eligible participants will be randomized at an equal allocation ratio to study experimental intervention(s) or placebo. Individual randomization will be stratified by clinical site, by age (<50 years vs. >=50 years), and time from onset of symptoms (<120 hours vs. >=120 hours).
This adds two more stratification factors (age and time from onset of symptoms), and removes any mention of variable-size patient sets. Is this an important change? It’s unclear, but given that the issue seems to revolve around exactly this algorithm, layering a serious change in its parameters could shift results further, depending on the specifics of its implementation, the underlying data, and the approach used to transition. What is clear is that the algorithm would be changed yet again on June 22nd, with protocol version 3.0 simplifying it to the following:
The randomization will be stratified by clinical site, by age (<50 years vs. >=50 years).
It is this last randomization algorithm that is described in the fluvoxamine and ivermectin papers, even though it was only active for a small part of the trials at the end. Even more strangely, it is the algorithm that is described in the metformin paper, a paper describing results in a period that did not overlap with this protocol at all.
The Show Must Go On
Back to our algorithm: by April 3rd, the metformin arm is stopped for futility by decision of the DSMC. Despite this, in the week starting on April 5th, the disproportionate allocation to ivermectin treatment continues: 44% of patients (43/96) are assigned to the ivermectin treatment arm, instead of the 33% we’d expect. The disparity, however, is decreasing, and over the next few weeks it evens out. By April 12th, equilibrium has returned. The fluvoxamine arm has 242 patients, and the placebo arm 238. Interestingly, if we add the patients in the discontinued low-dose ivermectin arm to the patients in the new high-dose ivermectin arm, the total comes to 241. The hypothesis that the algorithm treated the two disjointed ivermectin arms as a single continuous entity would predict exactly this state, and it’s unclear what alternative hypothesis could fit these observations better.
By July 5th, the fluvoxamine arm has reached 683 patients. While there are conflicting reports from the principal investigator as to the planned size of the trials, the TOGETHER team claims that the aim was always to reach 681 patients. For unclear reasons, the fluvoxamine arm is not terminated until the DSMC meets on August 5th. In that meeting, they determine that the ivermectin high-dose trial has reached its planned size. They decide to end both the ivermectin and fluvoxamine trials, the latter of which at that point had reached 742 patients, an overshoot of 9%. It’s important for trials to stop at their pre-planned sizes, because if one is given the ability to choose an arbitrary stopping point, there is a very high probability of generating a false positive result, intentionally or not.
On August 6th, a slide deck is released that creates ripples around the world: “Fluvoxamine works, ivermectin does not.” The headlines in numerous major newspapers were definitive, and the quotes scathing. Dr. Mills was quoted in the LA Times as saying:
Among the 1,500 patients in the study, he said, ivermectin showed “no effect whatsoever” on the trial’s outcome goals — whether patients required extended observation in the emergency room or hospitalization.
“In our specific trial,” he said, “We do not see the treatment benefit that a lot of the advocates believe should have been” seen.
From there, the fluvoxamine and ivermectin stories diverge. On August 23rd, a preprint of the fluvoxamine paper is released, and the final form of the paper appears in the Lancet on October 27, 2021. The results are a bombshell: on mortality especially, fluvoxamine appears to have reduced the risk of death by 32%, and in the per-protocol analysis—focusing only on the patients who followed at least 80% of their treatment—it appears to protect from death by an astounding 91%. Fluvoxamine has had much momentum since. Strangely though, policy at the WHO and NIH has not significantly shifted, despite other fluvoxamine studies also showing positive results.
Who Monitors the Monitors?
In the meantime, over at the Gates Open Research portal, the TOGETHER protocol was undergoing an open peer review. Two invited reviewers from Denmark had several issues with the protocol, but the one that stands out is the following:
Data Monitoring and Safety Committee:
Why is this committee not independent? Professor Thorlund is Vice President of the contract research organization (CRO, i.e CYTEL), employee of the CRO, professor at the sponsoring university, author of the TOGETHER protocol, and member of the DMSC for the trial. Dr Haggstrom also seems connected to the CRO? And why is this committee not blinded?
Dr. Mills responded to this issue thusly:
We have corrected the membership of our DSMC and clarified that Professor Thorlund serves as the non-voting chair of the DSMC.
The lack of removal of both conflicted members of the DSMC did not sit well with the reviewers. They withheld their full approval of the protocol with the following comment:
Kristian Thorlund, still having ‘several hats on,’ seems only to lose his voting rights in the Data and Safety Monitoring Committee and is still appointed chair of the committee. We do not find this move sufficient to remove our worries as well as the worries any other outsiders may have.
It turns out that the concerns identified were but the tip of the iceberg. Mills and Thorlund go way back, having over 100 publications together. They are co-founders of MTEK Sciences—the startup that in the earliest versions of the TOGETHER landing page took credit for the trial itself—and named Mills and Thorlund as co-leads. MTEK was acquired by Cytel in 2019, which is the company conducting the analysis of this trial. Dr Haggstrom, another former MTEK and then Cytel employee, is also a member of the DSMC. Two more members of the DSMC have extensive shared publishing history with Dr. Mills, principal investigator of the trial. For more about the issues with the DSMC and its lack of independence—including specific evidence—see “part 3” of my prior essay.
Given the pivotal role the DSMC played in the conduct of the trial, including the decisions to stop certain arms at certain times (or not), its lack of independence—despite claims to the contrary—puts a real question mark over the results of the trial altogether.
The ivermectin study does not appear until later. Much later. On March 18th, 2022—more than six months after the preliminary results were released—an “exclusive” dropped in the Wall Street Journal:
Dr. Mills provided the paper with the following quote:
“There was no indication that ivermectin is clinically useful.”
We still didn’t have a paper or preprint, but this as-yet-unsubstantiated preview delivered by the WSJ triggered a second round of breathless coverage. As soon as the ruckus died down, the paper finally appeared in the NEJM on March 30th, 2022, triggering a third round of headlines. Here’s what the NY Times had to say:
“There’s really no sign of any benefit,” said Dr. David Boulware, one of the authors of the paper. With such definitive statements, I reached for the published paper hoping to get some answers to my questions.
First and foremost, the paper clearly states:
The evaluation that is reported here involved patients who had been randomly assigned to receive either ivermectin or placebo between March 23, 2021, and August 6, 2021.
This is a key claim, and it directly contradicts what we concluded above: that placebo patients were recruited earlier than March 23, 2021. The data published in the ivermectin, fluvoxamine (preprint), fluvoxamine (final), and metformin papers, as well as the enrollment chart presented on August 6th, all make sense only if this quote is false.
Conversely, if this quote is true, several things the authors published in those other sources must be false. I’m not aware of a way to reconcile the two.
As a sanity check, let’s demonstrate the key issue another way. According to the ivermectin trial paper, the final number of patients in the high-dose ivermectin arm was 679, and the number of patients in the aborted low-dose arm was 77. These two numbers add up to 756. The fluvoxamine trial covers the period between January 20th and August 5th, 2021, starting when the low-dose ivermectin arm started and ending one day before the high-dose ivermectin arm ended. It reports 756 placebo patients. This means that despite the two-week gap—clearly visible in the enrollment chart—all placebo patients used for the fluvoxamine trial were also used for the ivermectin trials, even though 152 placebo patients had been recruited before March 23rd, when the high-dose ivermectin arm started. Assuming 77 of those placebo patients were allocated to match the 77 low-dose ivermectin patients, that still leaves 75 placebo patients enrolled before March 23rd unaccounted for. If the placebo patients assigned to fluvoxamine in the two-week break between ivermectin arms had not been matched to the high-dose ivermectin patients, we would have seen 75 additional placebo patients in the fluvoxamine arm—and they’re nowhere to be seen.
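The accounting above can be written out as a handful of checks; all counts are taken from the papers and the enrollment chart as described:

```python
high_dose_ivm = 679          # ivermectin paper
low_dose_ivm = 77            # aborted low-dose arm
fluvoxamine_placebo = 756    # placebo count in the fluvoxamine paper

# Every fluvoxamine placebo patient is matched by an ivermectin patient:
assert high_dose_ivm + low_dose_ivm == fluvoxamine_placebo

placebo_before_march_23 = 152   # from the enrollment chart
# If 77 of those matched the low-dose arm, the remainder have no home
# under the authors' claimed March 23rd start date:
unaccounted = placebo_before_march_23 - low_dose_ivm
assert unaccounted == 75
```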
The offset placebo patients are not the only issue with the trial. In the paper’s discussion section, the authors state:
We ensured that trial participants did not have a history of ivermectin use for the treatment of Covid-19 by means of extensive screening of potential participants about this issue.
This is quite the curious sentence. First, an exclusion for ivermectin use is not present in any of the protocols, in the pre-registrations, or in any discussion prior to this paper. Second, the qualifier “for the treatment of Covid-19” implies that use of ivermectin for other reasons was not cause for exclusion. Given that the questionnaire filled in for each patient did not have a dedicated question on ivermectin use, we can only assume the information came from the concomitant medications section, which includes a field for the “indication” for which each medication was used. So if a questionnaire listed ivermectin as a concomitant medication with Covid-19 as the indication, perhaps the patient was excluded, either at the time or after the fact. However, given that this trial has had trouble collecting data on things as basic as age and time since symptom onset, it’s hard to believe that the concomitant medications and their specific indications were filled in correctly for all patients, especially when this was not called out as an explicit exclusion criterion.
There is additional cause for concern, given Dr. Mills’ response to a question about this issue:
At the time we did our study, IVM was not particularly popular for use in Minas Gerais. Even if some patients did access IVM, the fact that it is blinded should still maintain balance.
First, this argument proves too much. If blinding makes this not an issue, why have exclusion criteria at all? Second, he states IVM was not particularly popular, but reports at the time from the local press say the exact opposite:
Given this contradiction, we can only assume that whatever filter the investigators applied to the study participants did not surface the volume of use that this reporting indicates—a discrepancy that requires some explanation.
When Is a High Dose Not a High Dose?
One more thing worth noting is that, despite the trial authors’ claims to the contrary, the updated dosing was still not up to the level that ivermectin advocates were recommending at the time, especially for those most at risk. Dr. David Boulware, one of the authors on the ivermectin paper, stated that the TOGETHER dosing was compatible with the FLCCC protocol at the time the trial was designed:
The FLCCC I-MASK+ outpatient protocol in February 2021 was version 9. The cumulative dose of the two protocols is indeed quite close at lower weights:
However, due to an arbitrary 90kg limit on dose scaling, the TOGETHER ivermectin protocol actually falls severely short for patients with a high BMI. In fact, given that about half the patients in the trial had a BMI over 30, and given the average height of Brazilian men, it’s quite likely that almost half of them were under-dosed, as well as up to a third of women. Also, given that high BMI is a major comorbidity for COVID-19, it is quite a big oversight to under-dose those who are most at risk. This could have contributed to further obscuring a benefit from ivermectin.
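The shortfall grows with weight. Under the 90kg scaling limit described above, the fraction by which a patient’s per-kg dose falls below the 400mcg/kg label reduces to a one-liner (the weights sampled below are arbitrary illustrations):

```python
def shortfall(weight_kg, cap_kg=90):
    """Fraction by which the per-kg dose falls below the label once weight_kg > cap_kg."""
    return max(0.0, 1 - cap_kg / weight_kg)

[round(shortfall(w), 2) for w in (90, 100, 110, 120)]
# → [0.0, 0.1, 0.18, 0.25]
```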
All this analysis doesn’t even take into account the fact that the FLCCC instructed patients to take their doses with or right after a meal, whereas TOGETHER instructed dosing on an empty stomach. Estimates vary, but the TOGETHER team itself reports in the supplemental appendix of the ivermectin publication (Table S4) that taking ivermectin with a meal can increase blood plasma concentrations by anywhere from 18% to 157%.
Nor does it take into account that, with a median time since symptom onset of five days (plus an additional day of delay before receiving the first dose of treatment), most of the patient population would not even have qualified to participate in the trials of other antivirals such as molnupiravir and Paxlovid, since it is well understood that antivirals work best when given early. A paper discussing the early use of remdesivir contained this extremely useful diagram, explaining intuitively the importance of early administration:
Despite all this, looking into the appendix, we see the kind of figure the other TOGETHER papers feature in the main body: a visualization of the Bayesian analysis the paper lends itself so well to.
The authors clearly tell us that in the intention-to-treat and modified intention-to-treat populations, the event rate was lower in the ivermectin group than in the placebo group, with probabilities of 79.4% and 81.4% respectively. It’s worth remembering that the interpretation of Bayesian probabilities and intervals matches people’s intuitive reading much more closely, and is free of the considerations of “statistical significance” that bog down the frequentist equivalents.
It’s worth noting that the per-protocol analysis shows a lower probability of superiority for ivermectin at 63.4%. This is not because the ivermectin population does worse, but because somehow the 3-day per-protocol placebo group appears to outperform all the other placebo groups significantly, raising questions about potential confounders.
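For readers who want to see what a “probability of superiority” means mechanically, here is a minimal sketch. It is emphatically not the trial’s actual model (the TOGETHER statistical analysis plan has its own priors and stopping rules); it simply places flat Beta(1, 1) priors on each arm’s event rate, plugs in the headline primary-outcome counts reported in the paper (100 of 679 on ivermectin versus 111 of 679 on placebo), and asks how often the ivermectin event rate comes out lower:

```python
import random

def prob_superiority(events_t, n_t, events_c, n_c, draws=100_000, seed=0):
    """Monte Carlo estimate of P(event rate is lower on treatment),
    with independent flat Beta(1, 1) priors on each arm's event rate."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        # Posterior for a binomial rate under a Beta(1, 1) prior is
        # Beta(1 + events, 1 + non-events); sample one rate per arm.
        p_t = rng.betavariate(1 + events_t, 1 + n_t - events_t)
        p_c = rng.betavariate(1 + events_c, 1 + n_c - events_c)
        wins += p_t < p_c
    return wins / draws

# Headline primary-outcome counts from the ivermectin paper:
print(prob_superiority(100, 679, 111, 679))
```

Even this crude flat-prior version lands in the same ballpark as the 79.4% the paper reports for the intention-to-treat population, which is the point: the probability of superiority is a direct statement about the event rates, not a significance-test artifact.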
Despite all this, the paper’s conclusion, and in particular its categorical nature, was in conflict with the Bayesian statistics it used to present its findings:
Treatment with ivermectin did not result in a lower incidence of medical admission to a hospital due to progression of Covid-19 or of prolonged emergency department observation among outpatients with an early diagnosis of Covid-19.
And while the breathless headlines continued, with scathing quotes from the various authors, the principal investigator has indicated that he sides with the Bayesian analysis rather than with his own paper’s conclusions or his statements to the press. For instance, in an email chain including entrepreneur Steve Kirsch, Dr. Mills wrote:
In particular, there was a 17% reduction in hospitalizations that would be significant if more patients were added. I really don’t view our study as negative and, also in that talk, you will hear me retract previous statements where I had been previously negative. I think if we had continued randomizing a few hundred more patients, it would have likely been significant.
This comment is particularly shocking in light of the fact that the fluvoxamine trial was allowed to overrun by 9%, whereas the ivermectin trial was stopped two patients before the predetermined limit. What can we make of all this? It seems that the paper presents us with parallel realities: everyone has enough clues to believe what they want. The answer, of course, would be to have access to the underlying data so as to confirm or refute hypotheses like the one presented in this essay. It may be that a number of serious errors crept into the various manuscripts, giving the impression that the placebo arm was offset, and a look at the data would reveal that impression to be false. There’s only one way to find out.
The original protocol, and every subsequent version, claims that individual patient data would be made available upon request to interested researchers as soon as the protocol was terminated. The fluvoxamine publication in October contained the following statement:
Data from the TOGETHER trial will be made available following publication of this manuscript to interested investigators through the International COVID-19 Data Alliance after accreditation and approval by the TOGETHER trial principal investigators (EJM and GR).
As of a few days ago, the data had still not been made available. PI Mills suggests that the group’s statisticians are occupied, but perhaps soon the data will be open to applicants who wish to “propose an analysis.”
7) We are providing all data sets to ICODA. Applicants can then propose an analysis to ICODA and, if they approve it, can get access to the data. It’s not a minor issue to prepare CDISC data sets and we have been swamped with getting the FDA EUA submission on lambda such that our statisticians are occupied with that. The data sets will be available soon.
This sounds quite far from the original “upon request,” and quite late for October’s “following the publication of this manuscript.” In addition, ICODA shares funding sources with the TOGETHER trial, so it can’t truly be seen as independent. In light of the irregularities—and assuming this is some sort of misunderstanding—the investigators can help clear things up by allowing third parties to audit the results of the trial, as they had promised to do. As of the publication of this article, my hypotheses about what we will see are abundantly clear, so even if they are concerned about a potential fishing expedition, disproving a detailed set of concerns should insulate them from later critiques, to some extent.
Long Story Short
All in all, it really seems that something went terribly wrong in this trial. A significant part of the placebo group used for the high-dose ivermectin comparison appears to have been populated with patients from the prior period, who were selected under a different set of inclusion/exclusion criteria (under which vaccination was grounds for inclusion rather than exclusion). This, in turn, created a perturbation in the data such that the fluvoxamine and metformin arms were not assigned many patients during the most lethal phase of the pandemic in Brazil, bringing their reported results into question as well. Especially in the case of fluvoxamine, it also remains unclear why the trial was extended for a month after it reached its pre-planned number of patients.
There is an unfortunate set of unexplained events around this trial, and the authors and investigators do need to engage the issues raised by many since the original infographic was released in August 2021. In brief, here’s a distillation of a subset of what has people concerned:
The seeming reallocation of placebo patients recruited earlier and under the original protocol to be compared with patients recruited later and under a revised protocol creates issues for the claim that the trial was randomized, blind, and placebo-controlled. Compromising one of these would be concerning; compromising all three should be disqualifying.
The lack of independence of the DSMC, which was staffed with people who had a stake in the success of the trial and a deep prior history with Dr. Mills.
The original decision to revise the ivermectin dose has not been adequately explained or documented, including the decision to continue recruiting patients despite the pending application to the Brazilian ethics board to alter the trial. The lack of release of the findings from that study, even if brief, also complicates matters further.
The discontinuity between the reported approval date by the Brazilian ethics board of March 15th, and the one claimed by the authors as March 21st.
The decision to change SARS-CoV-2 vaccination from an inclusion criterion to an exclusion criterion in the March 21st protocol.
The two alterations in the randomization algorithm, and the lack of clarity about when exactly the changes took effect.
The extension of the fluvoxamine arm beyond its planned size.
The lack of acknowledgement that the absence of ivermectin from the exclusion criteria is problematic, and the lack of detail about what exactly was done to remedy the issue.
The lack of acknowledgement of the effect of the Gamma variant on the trial in general and on the over-allocation to the ivermectin treatment arm in particular.
Each of these issues raises legitimate concerns about the quality of the trial’s results, but taken together, they paint a picture of a truly catastrophic scheduling anomaly around March 2021, and of a number of factors that amplified its effect on the results far beyond any acceptable range. In fact, given the already considerable length of this essay, I have not gone into several other key issues with the trial, most of which happen to push the reported finding on ivermectin’s effectiveness in the negative direction.
As we try to make sense of the anomalies, we’re forced to confront the dilemma of whether this is the result of lack of competence or its presence. Given that many of the authors of the TOGETHER papers are deeply involved in the evolution of the concept of an adaptive platform trial, and have much experience designing clinical trials for large pharmaceutical companies, neither conclusion is without serious implications.
The medical use of pharmaceutical products, as the root word pharmakon denotes, implies the knowledge of how to cure, but also the knowledge of how to poison. After all, it is the dose that makes the difference. Likewise, with the experts involved in the TOGETHER trial, we must know whether the result we saw is a case of expertise applied towards non-obvious ends, or whether, in the midst of a raging pandemic, honest mistakes were made.
The razor that will distinguish the intertwined hypotheses is the response of the TOGETHER team to the publication of this analysis and others like it. I sincerely hope they respond with more transparency, openness to third party analysis of the data, and a willingness to acknowledge the things that did not go as expected. In contrast, doubling down and making it even harder to get answers will tip the balance in favor of those who believe that whatever mistakes were made, were not, in fact, honest.
This article is part of a series on the TOGETHER trial. More articles from this series here.
For private feedback, press inquiries, and other requests relevant to this article, my Twitter DMs are open, you can reach me at @alexandrosm.
I would like to thank Michelle Paquette, Mary Re, Isyah Deranukul, Eva Tallaksen, Phil Harper, Bonnie Hawthorne, Bret Swanson, Travis Smith, and Tom Beakbane for their feedback and contributions. This essay is the result of a collaboration that includes more people than I could possibly name here, or even know the names of, but without whom it would not have been possible.