Peter Miller
@tgof137
A few people asked me about Nod's preprint about 2 spillovers.

He makes two arguments criticizing Pekar et al. 2022.

A proper analysis of his 1st argument actually points in the opposite direction and strengthens Pekar's conclusions.

His 2nd argument is not well defined.
🧵
Peter Miller
@tgof137
Nod's preprint is here:
arxiv.org/pdf/2502.20076

Let me walk you through his mistakes.
Peter Miller
@tgof137
The outbreak in Wuhan is unusual.

Normally, when a single case of Covid starts an outbreak, it starts a single polytomy. We've observed this happening again and again, around the world.

In Wuhan, there are 2 polytomies. Pekar theorized that was from 2 spillovers.
Thread image
Peter Miller
@tgof137
Because there are only 2 mutations separating the 2 lineages, this could also happen by chance, from 1 spillover.

Pekar's model said that was a rare thing to occur, with 3% odds from a single introduction.
Thread image
Peter Miller
@tgof137
(The odds are actually even lower than that, because the genetic clock is reversed: lineage A spilled over after lineage B. So it's probably closer to 0.3%, but let's ignore that for now.)
Peter Miller
@tgof137
How often do you get 2 polytomies from 2 spillovers?

The simplest calculation is to simulate 2 epidemics and count how often both form basal polytomies.

That gets you to a Bayes factor of somewhere around 4x, in favor of 2 spillovers, not 1:

Peter Miller
@tgof137
Nod points out an additional constraint that Pekar missed.

Pekar required that the 2 polytomies be roughly balanced, with each making up 30-70% of the total genomes.

Pekar required this for the 1 introduction case but did not require it for the 2 introduction case.
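
To make the 30-70% constraint concrete, here is a minimal sketch of the balance check. The function name `is_balanced` and its parameters are my own illustration, not taken from Pekar's or Nod's code.

```python
def is_balanced(count_a: int, count_b: int,
                low: float = 0.30, high: float = 0.70) -> bool:
    """Return True if each lineage makes up between `low` and `high`
    of the combined genomes (hypothetical helper, for illustration only)."""
    total = count_a + count_b
    if total == 0:
        return False  # no genomes at all: nothing to call balanced
    share_a = count_a / total
    share_b = count_b / total
    # Check both shares explicitly so the test also works for asymmetric windows.
    return low <= share_a <= high and low <= share_b <= high


print(is_balanced(40, 60))  # a 40/60 split falls inside the window
print(is_balanced(10, 90))  # a 10/90 split does not
```

The point of Nod's objection is that a filter like this has to be applied to both the 1-introduction and the 2-introduction simulations, or the comparison is not like for like.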
Peter Miller
@tgof137
If you look at the start time of all possible simulated epidemics, some grow really slowly and some grow quickly.

If you randomly pick 2 points from that curve, the two aren't always going to be close to each other.

Peter Miller
@tgof137
At this point, you should probably stop and think about what we're actually modeling here.

We're saying that it is possible for an epidemic that starts some random place in Wuhan to grow very slowly before it takes off.
Peter Miller
@tgof137
Maybe it starts in late September or early October, goes from person to person for a few weeks, and then takes off when it hits Huanan market.

That might represent something like, "one infected person from Yunnan visits Wuhan and starts the pandemic":

Peter Miller
@tgof137
What we're actually trying to model here is "2 spillovers at Huanan market". Everyone agrees the market is a reasonably good place for Covid to spread.

Some lab leakers think the market is the perfect place for Covid to spread, better than any other.
Peter Miller
@tgof137
That part is debatable; I don't know how much better or worse it is than any other crowded building.

But we all agree it's a crowded building, so an introduction there quickly starts growing.
Peter Miller
@tgof137
No one thinks that a spillover at the market is just going to start in September, bounce around from one person to one person for 2+ months, and then suddenly start growing quickly.
Peter Miller
@tgof137
Pekar's code simulates transmission across a social network. Some nodes in that network are well connected, others are poorly connected. The slowly growing epidemics start at a poorly connected node and jump along such nodes for a while before hitting a well connected node.
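
A toy illustration of that effect, assuming a made-up network: a chain of poorly connected nodes leading to one well connected hub. This is not Pekar's network or code, and it unrealistically assumes every contact gets infected each generation. All names here (`build_toy_network`, `generations_until`, the node labels) are hypothetical.

```python
from collections import defaultdict


def build_toy_network(chain_length=6, hub_contacts=50):
    """Chain of low-degree nodes (chain0 ... chainN-1) attached to a hub
    that has many contacts. Purely illustrative topology."""
    graph = defaultdict(set)

    def connect(u, v):
        graph[u].add(v)
        graph[v].add(u)

    for i in range(chain_length - 1):
        connect(f"chain{i}", f"chain{i + 1}")
    connect(f"chain{chain_length - 1}", "hub")
    for j in range(hub_contacts):
        connect("hub", f"contact{j}")
    return graph


def generations_until(graph, start, target_cases):
    """Count generations until at least `target_cases` nodes are infected,
    assuming each infected node infects all of its neighbors every generation.
    Returns None if the target is never reached."""
    infected = {start}
    frontier = {start}
    generation = 0
    while len(infected) < target_cases and frontier:
        # Next generation: uninfected neighbors of the current frontier.
        frontier = {nbr for node in frontier for nbr in graph[node]} - infected
        infected |= frontier
        generation += 1
    return generation if len(infected) >= target_cases else None


network = build_toy_network()
print(generations_until(network, "hub", 20))     # starts at the well connected node
print(generations_until(network, "chain0", 20))  # starts at the far end of the chain
```

In this sketch the hub start reaches 20 cases in 1 generation, while the chain start needs 7, because the infection has to walk the whole chain before it reaches the hub.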
Peter Miller
@tgof137
If you simulate 2 spillovers at the Huanan market as two introductions into random places in that social network, then it's not a model that represents reality.
Peter Miller
@tgof137
I ran some numbers here, with my own code. Suppose I simulate 2 spillovers, with lineage A and lineage B, at the same time.

Suppose those are completely independent of each other, i.e. they happen in different locations.
Peter Miller
@tgof137
I counted only the cases where both spillovers did not go extinct.

I found it's about 9% odds that the two will become balanced at 30/70 or closer.

If I just divide 9% by 3%, I get a Bayes factor of 3.

So, by that logic, Nod did find something: he reduced ~4 down to ~3.
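
A rough sketch of that counting procedure, assuming a very simple branching process in place of a real epidemic simulation. This is not my actual code or Pekar's model, and the parameters (`generations`, `contacts`, `p_transmit`) are arbitrary, so it will not reproduce the 9% figure. It only shows the logic: simulate two independent introductions, keep the runs where neither goes extinct, count how often the sizes land within 30/70, then divide by the single-introduction odds.

```python
import random


def simulate_epidemic(rng, generations=6, contacts=4, p_transmit=0.5):
    """Toy branching process: each active case makes `contacts` contacts and
    infects each with probability `p_transmit`.
    Returns (cumulative cases, still active at the end)."""
    active = 1
    total = 1
    for _gen in range(generations):
        new_cases = sum(
            1 for _contact in range(active * contacts)
            if rng.random() < p_transmit
        )
        total += new_cases
        active = new_cases
        if active == 0:
            break  # this introduction went extinct
    return total, active > 0


def balanced_fraction(trials=1000, seed=0, low=0.30, high=0.70):
    """Among trials where both introductions survive, the fraction in which
    lineage A's share of all cases lies within [low, high]."""
    rng = random.Random(seed)
    survived = 0
    balanced = 0
    for _trial in range(trials):
        size_a, alive_a = simulate_epidemic(rng)
        size_b, alive_b = simulate_epidemic(rng)
        if not (alive_a and alive_b):
            continue  # count only cases where neither spillover went extinct
        survived += 1
        share_a = size_a / (size_a + size_b)
        if low <= share_a <= high:
            balanced += 1
    return balanced / survived if survived else float("nan")


def bayes_factor(p_data_given_two, p_data_given_one):
    """Ratio of the two likelihoods."""
    return p_data_given_two / p_data_given_one


print(f"toy balanced fraction: {balanced_fraction():.2f}")
# Using the numbers from the thread (9% and 3%), not the toy output:
print(f"Bayes factor: {bayes_factor(0.09, 0.03):.1f}")
```

The toy fraction depends entirely on the made-up parameters. The only number carried over from the thread is the 9% / 3% division at the end.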
Peter Miller
@tgof137
But, if we suppose a market outbreak is faster than a random introduction somewhere in Wuhan, then we actually want to put some constraint on that.

How should we quantify that?
Peter Miller
@tgof137
One approach would be to say a market spillover has no mutations before the basal polytomy forms.

Those epidemics happen a little faster than average, but they can still take a few generations to get going. Here's an example, with TMRCA marked by the orange line.
Thread image
Peter Miller
@tgof137
Another approach would be to count epidemics where the TMRCA happens right after the first case.

These aren't so uncommon in general (I think it's ~33% of all simulations), and might be a good representation of what an introduction looks like in a high-spread environment.
Thread image