The canonical account of the origins of SARS-CoV-2 (SC2) is that it spilled over to humans from wildlife in the Huanan Seafood Market (HSM). The two papers on which this view is founded were published together in Science: Worobey et al. 2022 and Pekar et al. 2022. With the discovery that Pekar included massive mathematical errors whose correction reverses its main 2-spillover conclusion, the market spillover case rests almost exclusively on the Worobey results, especially those based on the locations of cases around Wuhan. Although Stoyan and Chiu have shown that the spatial location arguments were unconventional and unpersuasive, and I have shown that the Worobey statistics indicated that the cases used were unrepresentative, the qualitative impression has remained that the clustering of cases near HSM implied that HSM was highly likely to be the spillover site. Several well-known attempts to express that likelihood quantitatively obtained Bayes factors of roughly 1000 favoring an HSM spill from the Worobey location data. Now noted economist Andrew Levin* has analyzed the case location data using standard statistical methods for describing infectious disease spread and, most importantly, including the case timing as well as the position. Levin finds overwhelming odds against the simple Worobey model. He finds that even an enhanced, more realistic version of the Worobey model is inferior to a lab-leak model in fitting the spatiotemporal case pattern. This result reverses the last significant argument for the HSM-spill model.
Here I’ll try to give a brief summary of Levin’s results. I’ll especially try to present the overall context of how they fit in with other analyses and how the HSM account fits into the broader possibility of some sort of zoonotic origin. To clarify notation, Levin calls the hypothesis of a market-related spillover “Z” (corresponding to what I called “ZWM”) and the hypothesis that the virus came from research “A”, which covers a broader class of research leaks than my DEFUSE-related “LL”.
Levin includes a modest Bayes factor (2.3) favoring A just from the outbreak occurring in China, a feature whose probability is less than one both for Z and A. This factor is not particularly large or controversial, though it is omitted from most Bayesian analyses.
Levin’s Bayes factor from the outbreak occurring in Wuhan rather than elsewhere in China is ~20 favoring A, which superficially appears to be a lot less than the factor of ~100 that I (and also some who think zoonosis is more likely) use for the zoonotic hypothesis not limited to market versions. The reason is that Levin’s lab leak hypothesis is not specialized to DEFUSE-related work but includes possible leaks of any pathogenic coronavirus from any lab, so he gets P(Wuhan|A) ≈ 0.2. That broader lab leak hypothesis would of course have a prior probability larger by the inverse of that 0.2 factor, so when it is combined with systematically estimated priors the posterior odds are essentially unaffected by the choice of the broader hypothesis. Levin’s Bayes factor of 20 is explicitly a lower limit, since it uses an unrealistically high estimate of Wuhan’s share of China’s relevant wildlife trade, too high by a factor of 10 or more according to the data I discuss.
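The cancellation between the smaller Bayes factor and the larger prior can be checked with a few lines of arithmetic. The sketch below is only illustrative: the Bayes factor of ~20 and P(Wuhan|A) ≈ 0.2 come from the discussion above, while the function name, the placeholder prior odds, and the assumption that a narrow single-Wuhan-lab hypothesis has P(Wuhan) = 1 are introduced here for illustration and are not Levin’s code or numbers.

```python
def posterior_odds(prior_odds, p_wuhan_given_lab, p_wuhan_given_z):
    """Odds (lab : Z) after conditioning on the outbreak starting in Wuhan."""
    bayes_factor = p_wuhan_given_lab / p_wuhan_given_z
    return prior_odds * bayes_factor

# From the text: P(Wuhan|A) ~ 0.2 with a Bayes factor of ~20,
# which together imply P(Wuhan|Z) ~ 0.01.
P_WUHAN_GIVEN_A = 0.2
P_WUHAN_GIVEN_Z = P_WUHAN_GIVEN_A / 20.0

# Assumption: a narrow hypothesis about a specific Wuhan lab puts the
# outbreak in Wuhan by construction.
P_WUHAN_GIVEN_NARROW = 1.0

# Placeholder prior odds for the narrow hypothesis (illustrative only).
narrow_prior_odds = 0.01
# The broad hypothesis covers labs anywhere, so its prior is larger by 1/0.2.
broad_prior_odds = narrow_prior_odds / P_WUHAN_GIVEN_A

broad = posterior_odds(broad_prior_odds, P_WUHAN_GIVEN_A, P_WUHAN_GIVEN_Z)
narrow = posterior_odds(narrow_prior_odds, P_WUHAN_GIVEN_NARROW, P_WUHAN_GIVEN_Z)

print(f"broad hypothesis:  posterior odds = {broad:.3f}")
print(f"narrow hypothesis: posterior odds = {narrow:.3f}")
```

Whatever placeholder prior is chosen, the two posterior odds agree, because the factor of 0.2 lost in the likelihood is regained in the prior.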
Levin’s most interesting new analysis concerns the case location data. There is general agreement that the numerous early cases linked to HSM show that it had been a major spreading site. The tested linked cases, however, do not include any of the more ancestral sequences, which were found only in unlinked cases. Worobey argued that the clustering of many unlinked cases close to HSM indicated that HSM was the original spillover site from which all the cases descended. The key argument was that an unrealistic toy model of cases uniformly distributed throughout the population could be confidently rejected, leaving an HSM origin as favored. Since the toy model tested had never been anyone’s alternative to the HSM account, the unusual reasoning was invalid, as Stoyan and Chiu pointed out.
An actual previously proposed alternative theory, based on accounts of how cases were detected, was that the clustering of the unlinked cases near HSM arose from enhanced detection probability for patients living near HSM. I showed that the latter theory at least gave the correct sign of the difference between the median distances from HSM to unlinked vs. to linked cases, while the Worobey model gave the wrong sign. Nevertheless, that contrast could have also arisen from special features of the HSM neighborhood. Such features were omitted from the original Worobey model and could provide an alternative explanation of the clustering, regardless of whether HSM was the starting point or not.
What Levin has done instead is to set aside the possibility that the reported unlinked cases are highly non-representative and instead to take seriously the possibility raised by Débarre and Worobey of special properties of the HSM neighborhood. He assumes, like Worobey, that the reported cases adequately represent the larger set of all cases, i.e. he too leaves out any dependence of the probability of reporting unlinked cases on their proximity to HSM. He incorporates the possibility of different spread properties in different neighborhoods into the disease spread model in a standard way, by including a dependence of spreading probability on population density. Crucially, Levin includes the timing of the cases, not just the locations. The combined time-location data are obviously far more informative than the locations alone.
Worobey had claimed “COVID-19 cases in December 2019 were associated with the Huanan market in a manner unrelated to Wuhan population density or demographic patterns.” By comparing how well the location-time data fit that density-independent version of Z with how well they fit a similar model of HSM-originating spread that includes dependence on population density, Levin finds “The spatiotemporal analysis indicates that such a conclusion is completely untenable, with odds of about 560 million to 1 against that particular version of hypothesis Z.”
The problem then becomes one of comparing the more realistic version of the HSM Z model with a model in which many unlinked cases spread from sources across the Yangtze River from HSM, i.e. from the locations of the main Wuhan virology labs. Comparing models with the same number of adjustable parameters, Levin finds “the conditional odds in favor of hypothesis A relative to hypothesis Z are about 27:1.” Thus, contrary to the central finding on which the widely accepted HSM account is based, the case time-place data disfavor the HSM spillover account.
There may be one important caveat for that case location factor. If I understand correctly, Levin’s model for A includes no factor for how likely it would be for an HSM-centered cluster to start up around the same time that the earliest non-market cases also show up. Some crude theoretical models have claimed that would be highly unlikely. Empirically, since outbreaks in several East Asian cities, including Beijing, first showed up as wet-market clusters, the probability cannot be very low. It is possible that inclusion of a factor along these lines for A could approximately cancel the factor Levin obtains from how poorly the unlinked spread fits Z compared to A.
To summarize this factor, if the clustering of unlinked cases is due to proximity ascertainment bias, it contains no useful information about the source. If, as Worobey and Levin assume, the reported case data are fully representative of the full set of cases, then Levin has shown that population structure is essential to understanding the space-time pattern, contrary to the Worobey claim. The data say that the reported unlinked cases include a cluster very tightly centered on HSM as well as a set of cases across the river that show no relation to HSM. To what extent the tight cluster comes from ascertainment bias and to what extent it comes from the high population density near the Hankou train station/HSM neighborhood is uncertain.
Levin then looks at the locations within HSM of the swabs that tested positive or negative for SC2 RNA and mtDNA of suspected potential hosts, in particular raccoon dogs. Based on the swab analysis done by Bloom, there was already good qualitative reason to believe that these swab data disfavored a wildlife origin of the SC2 RNA. Unlike several actual animal coronaviruses, SC2 showed no positive correlation with mtDNA of any plausible non-human host. Levin quantifies that to obtain a Bayes factor of 3.2 disfavoring raccoon dog sources of the RNA. Levin argues that other potential hosts are even less likely.
Levin then considers the internal workplace locations of the cases among HSM vendors. Here, as in previous reports, he observes that there is no tendency for the vendors in or close to wildlife stalls to be at enhanced risk. Again using conventional spatiotemporal modeling, he finds a Bayes factor of 12 disfavoring the hypothesis that SC2 spread from raccoon dog stalls in favor of the hypothesis that it just spread between vendors who were near to each other. I suspect this factor may be overestimated because his version of the Z hypothesis omits vendor-to-vendor spread that might occur after initial spread from wildlife.
Combining all Levin’s spatiotemporal Bayes factors gives a net factor of 15,000 favoring A over Z. To obtain net odds this needs to be combined with what Levin calls prior odds but would actually include both genuine priors and any Bayes factors based on observations outside the spatiotemporal ones included in Levin’s analysis.
The genuine priors for Levin’s broadly defined A should be roughly five times larger than those used elsewhere for comparison of single-lab probability vs. all-China zoonosis probability. Omitted observations that would be outside Levin’s analysis include sequence features that are rare under Z but likely under A. The most striking of these is the restriction enzyme site pattern noticed by Bruttel et al. that is rare in natural coronaviruses but characteristic of laboratory chimeras and then closely linked to DEFUSE plans by the draft proposals that were FOIA’d by Kopp. The famous apparently recent furin cleavage site insert with unusual coding (under Z) would also provide another factor favoring A since it fits DEFUSE plans and is unfamiliar in related viruses regardless of their hosts. Another factor specifically affecting the market version is that all the sequenced market-linked cases were from a lineage downstream from the most ancestral versions, expected if HSM is downstream but not if it’s the spillover site. Thus when Levin suggests “priors” of very roughly 1/1, within about a factor of 100, he is not being eccentric but just acknowledging that other Bayes factors favoring A have been found, shifting the net odds typically found before using his factors. Considering those other factors and Levin’s use of a broadly defined A, his suggested “priors” are distinctly conservative.
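To make concrete how the net spatiotemporal factor combines with the suggested “priors”, here is a minimal arithmetic sketch. The net Bayes factor of 15,000 and the prior range of roughly 1/1 within a factor of 100 are taken from the text above; the function names and the choice of three sample points are illustrative assumptions, not part of Levin’s analysis.

```python
def posterior_odds(prior_odds, bayes_factor):
    """Posterior odds (A : Z) = prior odds times Bayes factor."""
    return prior_odds * bayes_factor

def odds_to_probability(odds):
    """Convert odds in favor of A into a probability of A (A vs. Z only)."""
    return odds / (1.0 + odds)

NET_BAYES_FACTOR = 15_000.0  # net spatiotemporal factor quoted in the text

# Sample points spanning "about 1/1, within about a factor of 100".
prior_odds_samples = [1 / 100, 1.0, 100.0]

results = []
for prior in prior_odds_samples:
    odds = posterior_odds(prior, NET_BAYES_FACTOR)
    results.append((prior, odds, odds_to_probability(odds)))
    print(f"prior odds {prior:>6g} -> posterior odds {odds:>12,.0f} "
          f"(P(A) = {odds_to_probability(odds):.6f})")
```

Even at the low end of that prior range the posterior odds still favor A, though this two-hypothesis calculation ignores zoonotic accounts other than Z.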
Any set of specific models needs to be taken with a grain of salt. The standard methods that Levin uses are not written on stone tablets as being correct. They could overestimate or, equally likely, underestimate the Bayes factors compared to what the best possible models would give. Levin has tried variants of the models without finding any important changes, so the results should be fairly robust. Still, we’ve seen one factor, based on the Wuhan origin, where a more carefully estimated likelihood would substantially increase the odds favoring A. For another, based on the distribution of times and stall positions of HSM vendor cases, I suspect that using a more flexible Z model would reduce Levin’s A-favoring factor. Overall, taking into account more generic uncertainties tends to pull the net odds towards one simply because there is more room for the smaller probability to go up than for the larger one to go up.
What is most striking about Levin’s results is that the location-time pairs for the home addresses of unlinked cases point rather strongly away from the HSM story, contrary to the common lore that they point extremely strongly toward it. This basic result seems robust since it flows mainly just from including times together with locations. Thus Levin’s analysis leaves essentially nothing standing of the core case for the HSM picture. Combined with other likelihood factors and reasonable priors, these results come close to just ruling out the standard HSM picture.
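The pull toward even odds can be illustrated with a toy calculation. The sketch below assumes, purely for illustration, that the nominal odds are equally likely to be too high or too low by the same multiplicative factor, and averages the resulting probabilities; the function names, the nominal odds of 1000, and the spread factor of 10 are hypothetical choices, not values from Levin’s analysis.

```python
def odds_to_prob(odds):
    """Convert odds into a probability."""
    return odds / (1.0 + odds)

def averaged_odds(nominal_odds, spread_factor):
    """Odds after averaging the probability over two equally likely
    scenarios: nominal odds multiplied or divided by spread_factor."""
    p_high = odds_to_prob(nominal_odds * spread_factor)
    p_low = odds_to_prob(nominal_odds / spread_factor)
    p_mean = 0.5 * (p_high + p_low)
    return p_mean / (1.0 - p_mean)

# Hypothetical example: nominal odds of 1000:1, uncertain by a factor of 10
# in either direction (symmetric on a log scale).
nominal = 1000.0
effective = averaged_odds(nominal, 10.0)
print(f"nominal odds {nominal:.0f}:1 -> effective odds about {effective:.0f}:1")
```

The upward shift (from 1000:1 to 10,000:1) barely changes the probability of the disfavored hypothesis, while the downward shift (to 100:1) raises it tenfold, so the average lands much closer to even odds than the nominal value.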
Nevertheless, some caution is needed in extending that conclusion to the broader set of zoonotic pictures. Even if the HSM raccoon dog account were absolutely ruled out, a variety of unlikely-sounding but not impossible other zoonotic pictures would remain. The smallest modification might include a trickle of some other animals that were not reported in the usual monitoring and perhaps sold elsewhere. Another possibility might be that an FCS-free version of SC2 directly transmitted from bats was somehow able to propagate in people for a while near the relevant bat sources (around Laos) and then picked up an FCS via template switching with the host or even a bacterium before arriving in Wuhan. Alternatively, there could have been an intermediate southern host for a similar account with the host not having anything to do with the wildlife trade in Wuhan. Although the leak of a product of DEFUSE-style research is now by far the most probable account, it might make sense for researchers who wish to explore zoonotic accounts to produce roughly quantifiable alternatives to the now strongly disfavored HSM account.
Explicit problems with the coding and mathematical logic of the companion paper to Worobey, Pekar et al. 2022, have been described on PubPeer and now more completely in an arXiv paper, for which I’ve written a less technical explanation. Correcting those problems also reverses the main conclusion of that paper. Regardless of possible ways of rescuing some zoonotic account, extreme problems with the current leading publications on this topic in leading journals are now clearly established.
*Levin is not new to this type of work: his analysis of the age-specific infection fatality rate for COVID-19 has been cited by 1000+ scholarly works, and his previous work on the statistical analysis of panel data has ~20,000 Google Scholar citations.
Levin does make some disturbing mistakes in his introduction. I understand that he is an economist, but this is sloppy because it is easy to check. WIV1 and WIV16 were not found in the old Mojiang copper mine, but a long way away.
"The most closely related bat viral strains (WIV1 and WIV16) exhibit similarity of about 96% to the SARS-CoV genome; those strains were cultured from samples collected at an abandoned mine in Yunnan province, about 1500km from Guangzhou."