11 Comments
Apr 4 · edited Apr 4 · Liked by Michael Weissman

This post is the most compelling assessment I've seen of the situation to date. Thank you for writing this up and sharing your analysis.

I've seen the full arguments from both sides of the Rootclaim debate. Regardless of how people think the debate played out with the information available at the time, it seems undeniable that studies and key information have emerged since then which point overwhelmingly to the wet market having extremely low odds of being the origin of the virus, as pointed out in your analysis. That would dramatically lower the probability assigned to the most important point in support of zoonosis. Many commenters don't seem to be aware of how recent some of the key points of evidence are (some as recent as March 2024); these either were not available or may not have been fully understood at the time of the debate.

For those interested, here's a brief list of some recent information that updated me towards lab leak. This is for the sake of explaining my thoughts to others, but is in no way all-encompassing. Michael does a far superior job of explaining these in great depth.

- Study published March 5th, 2024 finding intermediate sequences between Lineage A and B (https://doi.org/10.1093/ve/veae020). This research shows that Lineage B very likely came from Lineage A. All cases in the market were Lineage B, but none were Lineage A. In short, the research shows that a single spillover is much more likely than a double-spillover Zoonotic event. The double-spillover theory is a foundational argument of the ZW theory that Peter Miller and others use. This is a massive blow to the probability that the wet market was the origin of the virus, to the point where it now seems extremely *unlikely* that the wet market was the origin.

- Wuhan's share of the wildlife trade is significantly smaller than its share of the population, which significantly lowers the probability of a ZW origin in the Bayesian calculations that Peter Miller and others use.

- Although the DEFUSE proposal leaked in 2021, more recent drafts were discovered in 2024 which contain what appears to be damning evidence. New information included their approach using restriction enzymes (BsaI/BsmBI), which matches precisely what Bruttel et al. (2022) identified as the assembly process that would create exactly this virus, years before this DEFUSE draft leak was even public. Michael describes how unlikely this would be if the origin was zoonotic. The DEFUSE budget leak confirms that they were purchasing these enzymes. Additionally, the new documents contained draft comments that were not available in the original leaked proposal. Among many other things, the comments show that the research work was actually planned to be done at the WIV at BSL-2 levels for cost reduction, but they edited the final document to "BSL-3" because they thought "US researchers will likely freak out" if they knew this research was being done in lower-safety BSL-2 labs. The researchers seemed to think the distinction didn't matter for their research and that it was bureaucratic red tape slowing them down, so they fudged the proposal to hide this. Considering BSL-2 labs are not sufficiently designed to contain airborne disease (whereas BSL-3 labs are), this does not seem to be an insignificant point in this whole debate.

Peter Miller claimed that because the DEFUSE proposal was rejected, the proposed work never happened (or at least had an exceedingly low likelihood of occurring), which led to the proposal essentially being dismissed as evidence, at least when it came to evaluating the arguments and probabilities. However, many research scientists conduct research long before they apply for the grant for that research. It is also often the case that they apply for the grant while in the middle of conducting research. This is confirmed by countless research scientists online, and it's just how research is often done due to the difficulty and delays of receiving funding. Peter's conclusion that the DEFUSE research did not happen because it was rejected is so at odds with how scientists conduct research that at best I would consider Peter ignorant of this point, and at worst it would seem Peter intentionally manipulated this point and tried to get the DEFUSE proposal dismissed as evidence because it would dramatically undermine his argument and potentially change the entire outcome of the debate.

There's more, but I'll wrap this up for now. My key takeaway is that Michael's approach and evidence seem more updated, complete, and compelling than anything else available online on the topic to date. It seems especially relevant and a better analysis than either side of the Rootclaim debate, and I believe more people should be aware of this post and read it in full before drawing conclusions.

author

Yes, nice summary of the newer evidence.

I think that the HSM market story has taken serious hits, negating the importance of the badly flawed Worobey/Pekar papers. If you'd asked 5 years ago, people warning of zoonosis wouldn't have put an especially high fraction of their priors on market spillovers, since there are also farms, transport, etc. So I think the big factors assigned by some to HSM are nonsense but removing them doesn't do much to reduce the overall ZW priors.

The biggest change has been from finding the DEFUSE restriction enzyme plans. Before, most people (including me) had not considered that pattern as usable evidence because it was unclear how much ex post facto choice of what to look for was involved, i.e. the old multiple comparisons problem. After Emily Kopp published the DEFUSE plans that were stunningly close to the pattern Bruttel et al. had noticed, that factor became worth taking very seriously.

Mar 24 · Liked by Michael Weissman

Thank you so much for this work.

I have one major suggestion, which is more discussion of how to interpret the numbers coming out. In particular, I was bothered by this analysis for a long time because the conclusion seemed too certain given the number of missing pieces. Intuitively, it seems like there should be some factor in there which has no information about the conclusion but vastly increases the uncertainty. Some or possibly all of this comes from me wanting to interpret the final numbers as representing degrees of certainty instead of subjective probabilities, but I doubt I'm the only one with that confusion.

I think this could be discussed in a few places - in the introduction, at the end when combining priors with evidence, and once at the point where you discuss the vast gap between RaTG13 and SC2. Any origin theory requires some sequences in this gap, and finding them could decide the case either way depending on whether they show up in wildlife or in a lab notebook. Maybe there's a way to evaluate counterfactual evidence formally but just discussing the significance of the missing evidence would help. For everybody's sanity there should be a clear distinction between "100:1 odds it was a lab leak" and "available evidence favors lab leak by 100:1".
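To make that last distinction concrete, here is a minimal Python sketch of Bayes' rule in odds form. The numbers are made up for illustration (the 1:50 prior in particular is an assumption, not a value from the post), and `posterior_odds` is a hypothetical helper name.

```python
from fractions import Fraction


def posterior_odds(prior_odds, likelihood_ratio):
    """Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio."""
    return prior_odds * likelihood_ratio


# "Available evidence favors lab leak by 100:1" is a statement about the likelihood ratio.
evidence_ratio = Fraction(100, 1)

# Two hypothetical readers with different priors (illustrative values only).
skeptical_prior = Fraction(1, 50)
even_prior = Fraction(1, 1)

# "N:1 odds it was a lab leak" is a statement about the posterior, which depends on the prior.
print(posterior_odds(skeptical_prior, evidence_ratio))
print(posterior_odds(even_prior, evidence_ratio))
```

The same 100:1 evidence leaves the first reader at 2:1 and the second at 100:1, which is why the two phrasings should not be used interchangeably.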

There is a second small issue relating to the Bruttel et al paper and the seeming confirmation from the DEFUSE draft. I think it was Alina Chan who pointed out that the BsaI/BsmBI construction showed up in an earlier Baric paper, and that Bruttel et al probably read that paper. I didn't see any followup on that. Your wording seems to suggest that Bruttel et al predicted the choice of enzymes from looking at the genome, but that may be too strong. You didn't use it as evidence but it might be worth tweaking the text.

And a really small point: I would take the part about "quantifying friggin' likely" out of the title. Most people won't get the joke, and you are so consistent about taking the high road in the rest of the document.

I can't thank you enough for this work: I really hope that you keep on this. Having it in a decent journal would be a huge step forward.

author

Good points. Taken in reverse:

I'm sluggishly starting (with some possible coauthors) to convert this to peer-review-ready format. There will be no "friggin'" in the title. Meanwhile, I already gave the P.O. authors all the good lines because they spoke in such clear ordinary English.

I guess I should mention the preceding Baric paper. It is relevant to P(Bruttel guessing the right enzymes|ZW), but that doesn't enter directly in my calculations. To the extent that it affects P(pattern|LL), which does enter, it increases it and raises the odds. I'd thought about using it but was lazy and at some point the odds get extreme.

That brings me to the big philosophical point. I do discuss it in a narrow sense at the specific point about the sequence gap. "Although fully knowing what sequences were in Wuhan labs would be almost equivalent to answering the origins question, our current estimate of what’s there would mostly just be based on the other evidence leaning toward LL, ZL, or ZW, augmented a bit by a highly subjective sense of how forthright people are likely to be. We don’t want to either double-count our other evidence or introduce especially subjective terms."

I think this "what about missing evidence?" problem comes up in every Bayesian estimate. I've worried about it a lot, as have others. It sort of nags at the back of the mind. One can usually imagine various types of evidence that would, if available, outweigh everything that we have our hands on. If you tried to change odds based on pure ignorance of the non-evidence, all odds would approach 1/1. That would not only be wrong, it would be logically inconsistent. E.g. it would give odds (A or B) vs. C = 1 if A and B are lumped, but 2/1 if viewed as separate outcomes.
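The lumping inconsistency can be sketched in a few lines of Python. This is only an illustration of the argument above, assuming "pure ignorance" means giving every listed outcome equal weight; `ignorance_odds` and the outcome labels are hypothetical.

```python
from fractions import Fraction


def ignorance_odds(cells, left, right):
    """Odds of `left` vs `right` when every cell of the partition gets equal weight."""
    weight = Fraction(1, len(cells))
    p_left = sum(weight for cell in cells if cell <= left)
    p_right = sum(weight for cell in cells if cell <= right)
    return p_left / p_right


a_or_b = frozenset({"A", "B"})
c = frozenset({"C"})

# Same question, two ways of listing the outcomes.
lumped = [frozenset({"A", "B"}), frozenset({"C"})]
split = [frozenset({"A"}), frozenset({"B"}), frozenset({"C"})]

print(ignorance_odds(lumped, a_or_b, c))  # A and B treated as one outcome
print(ignorance_odds(split, a_or_b, c))   # A and B treated as separate outcomes
```

The first call gives odds of 1 and the second gives 2, even though nothing about the world changed between them, only the bookkeeping.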

So how to deal with the sense that somehow the odds based on what we know can't be as extreme as what they seem to be? For people calling election results or horse races, they have enough cases to allow calibration. Do the ones they call 2/1 give about 2/1 in results? Etc. For pandemics, we fortunately don't have enough to calibrate.
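For the election or horse-race case, a calibration check might look roughly like the sketch below. The record of calls is invented purely for illustration, and `observed_odds_by_call` is a hypothetical helper, not anything used in the post.

```python
from collections import defaultdict
from fractions import Fraction


def observed_odds_by_call(calls):
    """Group forecasts by the odds that were stated and report the odds actually observed.

    `calls` is a list of (stated_odds, happened) pairs. Returns a dict mapping each
    stated odds value to hits/misses, or None when there were no misses to divide by.
    """
    tally = defaultdict(lambda: [0, 0])  # stated odds -> [hits, misses]
    for stated_odds, happened in calls:
        tally[stated_odds][0 if happened else 1] += 1
    return {
        stated: (Fraction(hits, misses) if misses else None)
        for stated, (hits, misses) in tally.items()
    }


# Invented record: six calls at 2/1 odds and four calls at even odds.
calls = (
    [(Fraction(2), True)] * 4 + [(Fraction(2), False)] * 2
    + [(Fraction(1), True)] * 1 + [(Fraction(1), False)] * 3
)

for stated, observed in observed_odds_by_call(calls).items():
    print(f"called {stated}/1, observed {observed}")
```

In this toy record the 2/1 calls come in at 2/1, while the even-odds calls come in at 1/3. A real check needs many more cases per bucket, which is exactly what pandemics don't provide.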

If I were reading this blog and not writing it, I'd just take it in attenuated form. Once people start seeing things one way, whether by priors or by the first evidence they see, it starts to get hard to see other sides. So I think a reasonable reader who doesn't want to spend a huge amount of time tracking down individual factors would tend to discount the odds beyond the hierarchical and robust discounts I've used. But I can't think of a logically coherent way of saying that the odds aren't what they are because someday we may know them better.

Mar 24 · Liked by Michael Weissman

Thanks for the explanation - discounting the odds was my first approach, and when I re-read the document looking specifically for this question I did see that you gestured towards it in the section about the sequence gap. Maybe your pedagogy is working as intended.

I guess what I'm thinking of is to estimate odds for a counterfactual like "nearby ancestral genome including FCS shows up in a wildlife sample around Wuhan" which most people would consider dispositive, and see if the method agrees. This seems like it would be a useful sanity check if it doesn't taint the analysis.

Happily, for a peer-reviewed version you can assume more expert readers.

author

I did sort of quickly run through a negative control when I saw someone claim that a paper using very different (non-Bayesian) methods might have indicated a possible lab origin of MERS. I think the method here comes down heavily for MERS being natural.

If a close ancestor with an FCS etc. had shown up anywhere, even Yunnan, then I wouldn't even bother looking at LL. Given the possible routes to Wuhan, ZL would still be very much in the running.


"I’m definitely not endorsing the cruel opposition to strenuous public health measures that seems to have become associated with skepticism about the zoonotic account."

Well, when we talk of scientific proof, it would be wise to look at what the evidence was and is for the efficacy of all these stringent public health measures, and look at the possibility that the cure was far worse than the disease. There are some highly competent people like Jay Bhattacharya and Martin Kulldorff who warned that this would be the case. They are very experienced Public Health experts. And I think that Sweden has proved that it was so.

Apparently you're not allowed to speak out about these matters, without being called 'cruel'. Disappointing. Very disappointing. Doesn't make your story more credible, to say the least.

author

Have you read Bhattacharya's early work in this area? E.g. the Santa Clara infection fatality rate paper? Failure to use standard confidence intervals for false positives gave an error of up to a factor of infinity in the estimate. Improper recruitment methods (not acknowledged in the paper) heavily slanted the results. Etc. So they came up with "we estimate a local infection fatality rate of 0.17%." The *population* fatality rate in the US is more than twice that.

So I absolutely stand by the position that this is a serious disease. That's now personally reinforced by some fairly bad sequelae to a mild case. Air filtering and more use of masks are fully justified.

So Bhattacharya and I disagree intensely on a crucial issue (public health measures) but agree on another (origins). I don't think that judging arguments by whether someone is on your team's side or not is a good way of getting at the truth.

May 7·edited May 7

There is a steep gradient in the IFR in terms of age, noting that the IFR for children and adolescents is lower than for influenza. I have never heard mathematicians and physicists complain about that. Old people die, I am sorry for you, and they have been doing so for a very long time. I should know, because I have accompanied many neurological patients who were old and tired of their days to a comfortable death. And really, the IFR for SARS-CoV-2 is up to twice as high as for influenza. And yes, I have seen quite a few people die from influenza, including young people, as well as children.

Also, in terms of facemasks, you are completely uninformed. Every clinician has long known that the protection of a surgical mask is virtually zero with aerogenically transmitted viruses. We do it in the hospital as an "incantation ritual," but almost every physician knows it is utter nonsense. That was the consensus before 2020, and has been demonstrated in many studies.

Two of the best people in evidence-based medicine, Carl Heneghan and Tom Jefferson, conducted a systematic review of the effect of facemasks. I know there has been a lot of criticism of that, but that was mainly for political purposes, because they did nothing different than they did before in writing their reviews. But somewhere politicians and policy makers had decided that facemasks must be effective, and so this did not come in handy.

https://www.cebm.net/covid-19/covid-19-masks-on-or-off/

I find it very striking that people from the hard sciences in particular were so uncritical of the measures taken. I'm not even talking about the environmental impact of a measure that was based on nothing more than dressing down the public, and making critics immediately visible and scapegoating them. This had nothing, but nothing to do with public health. This had everything to do with indoctrination and enforcing certain behavior.

Then again. What does this whole discussion have to do with the origin of SARS-CoV-2? What is the point of a derogatory comment about people who have made very justified and very well-reasoned criticisms of the measures taken? Which, when it comes to preventing mortality, have also had a very marginal effect.

Such a comment has no place in this piece at all. With a swing of the pen, very experienced and well-informed doctors and scientists are dismissed here as cruel. Really, I find it outrageous.
