29 Comments

fantastic work!

I agree with almost everything, just a few comments:

- I'd put some weight on the earliest genome, which was sequenced at Sangon in the presence of Vero DNA. Both the company and Vero cell lines are mentioned in DEFUSE. If it had come from a human, it would likely have been a less mixed sample. It was also certainly never published, which in itself is suspicious.

IMO WIV panic-sequenced all their RaTG13-like / FCS insertion project samples after they got the first SARS2 sequence, to check whether it was them. This would also explain the CHO DNA (often used for spike characterisation).

- There are 2 more important pieces of evidence you do not discuss:

1: human-optimized SARS2 spike expression vectors found in 2019 patient samples

2: our endonuclease preprint, specifically the high concentration of synonymous mutations in restriction sites used by WIV researchers in 2017.

this talk may help:

https://youtu.be/EuuY94tsbls?si=IVu6DXPxMDxhNT98

let me know if you'd like to discuss this.

Author · Sep 5, 2023 · edited Sep 5, 2023

I don't want to censor anybody, especially one of my rare cohort of fellow Harvard people who did time in a penitentiary. Nonetheless, I'm deleting a series of very long Comments from reader "Harvard2TheBigHouse" because they wandered off topic into extremely naive remarks about quantum mechanics, etc. (Next week I'll post something about quantum mechanics!) I don't want this substack to be a woo forum.

His key relevant point was that he believes that SC2 came from live-attenuated-vaccine research, which I would consider to be a subset of LL. He gave a link to an early paper on that: https://onlinelibrary.wiley.com/doi/10.1002/bies.202100017.

He also believes that HIV came from similar research, a topic about which I know nothing.

Readers who wish to follow up on his thoughts may go to the substack under that name.

Author · Sep 3, 2023 · edited Sep 3, 2023

Tentative not-ready-for-prime-time updates in response to Valentin. Here's a glance at sausage-making.

Valentin's collaborator Alex Washburne says that of the 10 relevant synthetic viruses they found, 8 used methods that left the restriction sites in (https://alexwasburne.substack.com/p/the-synthetic-origin-theory-of-sars). I found that one or two of the 24 natural sequence clusters they present land in the pattern region of the synthetic sequences.

But there's another issue. There exists a pair of suitable restriction enzymes that gives just the right synthetic segment pattern for SC2. But aren't there several other possible combinations of suitable restriction enzymes? We need to compare the ~80% chance that the right pattern could be found under LL with the probability that the right pattern could be found under ZW, counting all plausible sets of restriction enzymes that might be used for synthesis. Are there 3 such combinations? 6? Somebody in the business should know the answer.

I hate to recommend machine learning for anything, but it might make sense to use an ML method to distinguish the patterns of the synthetic sequences from the many others, then use the results to give odds.

***

Sangon has the most intriguing data. I should have mentioned that DEFUSE specified that lab as where some of their sequencing would be done, and that the cells included not just VERO but also hamster cells, which are likewise standard for lab culture. There were some weird features indicating that these cells had been pretty messed up by some virus. Here's a tentative update. I hope more knowledgeable people can review it.

3 of 13 mutations looked ancestral. It doesn't much matter whether there could have been misreads, because the probability of a random misread looking ancestral isn't much different from the probability I mentioned of early SC2 mutations looking ancestral. I get

P(3of13|ZW)=~1/70.

What about P(3of13|LL)? Here I don't really have a clue. You expect more ancestral nt's in a recent ancestral line, but how many? The simplest way I can express my cluelessness would be to assign equal probabilities to any number of ancestral nt's from 0 to 13. Then P(3of13|LL)=1/14. This would give a likelihood ratio of 70/14=5, or logit = 1.6. Maybe ±1.
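
As a quick check of the arithmetic above, here is a minimal Python sketch. The variable names are illustrative only, and the uniform spread over 0 to 13 ancestral nt's is the placeholder assumption stated above, not a measured quantity.

```python
import math
from fractions import Fraction

# P(3of13|ZW), taken directly from the estimate above.
p_zw = Fraction(1, 70)

# P(3of13|LL) under the "clueless" assumption: every count of
# ancestral nt's from 0 to 13 is equally likely, so 14 outcomes.
n_mutations = 13
p_ll = Fraction(1, n_mutations + 1)

# Likelihood ratio favoring LL over ZW, and its natural log (the logit).
likelihood_ratio = p_ll / p_zw
logit = math.log(float(likelihood_ratio))

print(f"P(3of13|LL) = {p_ll}")
print(f"likelihood ratio = {likelihood_ratio}")
print(f"logit = {logit:.1f}")
```

Changing `p_ll` to a different guess shows how sensitive the logit is to that assumption, which is roughly what the ±1 is meant to cover.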

***

I am still too ignorant of the roles of plasmids to take even a first look at that possible update.


Since you won't answer on X, let's try again here. You quote Bob Garry: "Do the alignment of the spikes at the amino acid level -- it's stunning."

He was aligning SARS2 with RaTG13, which the WIV uploaded on January 24th. So why did Shi and WIV publish the 96% match exposing the furin cleavage site, which kickstarted engineering rumors?

https://virological.org/t/tackling-rumors-of-a-suspicious-origin-of-ncov2019/384
