A few people asked me about Nod's preprint about 2 spillovers. He makes two arguments criticizing Pekar et al 2022. A proper analysis of his 1st argument actually points in the opposite direction and strengthens Pekar's conclusions. His 2nd argument is not well defined. 🧵

Nod's preprint is here: <a target="_blank" href="https://arxiv.org/pdf/2502.20076" color="blue">arxiv.org/pdf/2502.20076</a> Let me walk you through his mistakes.

The outbreak in Wuhan is unusual. Normally, when a single case of Covid starts an outbreak, it starts a single polytomy. We've observed this happening again and again, around the world. In Wuhan, there are 2 polytomies. Pekar theorized that was from 2 spillovers.


Because there are only 2 mutations separating the 2 lineages, this could also happen by chance, from 1 spillover. Pekar's model said that was a rare thing to occur, with 3% odds from a single introduction.


(The odds are actually even lower than that, because the genetic clock is reversed: lineage A spilled over after lineage B. So it's probably closer to 0.3%, but let's ignore that for now.)

How often do you get 2 polytomies from 2 spillovers? The simplest calculation would be to simulate 2 epidemics and count how often both form basal polytomies. That gets you to a Bayes factor of somewhere around 4X, in favor of 2 spillovers, not 1: <a target="_blank" href="https://x.com/tgof137/status/1772420536909394410" color="blue">x.com/tgof137/status…</a>

Nod points out an additional constraint that Pekar missed. Pekar required that the 2 polytomies be roughly balanced, with each making up 30-70% of the total genomes. Pekar required this for the 1 introduction case but did not require it for the 2 introduction case.
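The balance constraint is easy to state as a check. This is a minimal sketch, assuming "balanced" means each lineage's share of the sampled genomes falls between 30% and 70%; the function name and defaults are my own illustration, not taken from Pekar's or Nod's code.

```python
def is_balanced(n_a: int, n_b: int, lo: float = 0.30, hi: float = 0.70) -> bool:
    """Return True if lineage A's share of all genomes lies in [lo, hi].

    Because lo + hi == 1 with the defaults, lineage B's share is then
    automatically inside the same window.
    """
    total = n_a + n_b
    if total == 0:
        return False  # no genomes sampled, nothing to compare
    share_a = n_a / total
    return lo <= share_a <= hi


print(is_balanced(40, 60))  # a 40/60 split passes the check
print(is_balanced(10, 90))  # a 10/90 split does not
```

The point of Nod's objection is that this same check has to be applied under both hypotheses, not just the 1 introduction case.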

If you look at the start time of all possible simulated epidemics, some grow really slowly and some grow quickly. If you randomly pick 2 points from that curve, the two aren't always going to be close to each other. <a target="_blank" href="https://x.com/tgof137/status/1772417301561708753" color="blue">x.com/tgof137/status…</a>

At this point, you should probably stop and think about what we're actually modeling here. We're saying that it is possible for an epidemic that starts some random place in Wuhan to grow very slowly before it takes off.

Maybe it starts in late September or early October, goes from person to person for a few weeks, and then takes off when it hits Huanan market. That might represent something like, "one infected person from Yunnan visits Wuhan and starts the pandemic": <a target="_blank" href="https://x.com/tgof137/status/1772418932885852210" color="blue">x.com/tgof137/status…</a>

What we're actually trying to model here is "2 spillovers at Huanan market". Everyone agrees the market is a reasonably good place for Covid to spread. Some lab leakers think the market is the perfect place for Covid to spread, better than any other.

That part is debatable; I don't know how much better or worse it is than any other crowded building. But we all agree it's a crowded building, so an introduction there quickly starts growing.

No one thinks that a spillover at the market is just going to start in September, bounce around from one person to one person for 2+ months, and then suddenly start growing quickly.

Pekar's code simulates transmission across a social network. Some nodes in that network are well connected, others are poorly connected. The slowly growing epidemics start at a poorly connected node and jump along such nodes for a while before hitting a well connected node.
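To make the "poorly connected node" effect concrete, here is a toy sketch. It is not Pekar's code and not a realistic contact network: it assumes a short chain of poorly connected nodes attached to one fully connected hub, and it assumes every contact transmits, so the spread is deterministic. All names and sizes are hypothetical.

```python
def build_toy_network(chain_len: int, hub_size: int) -> dict:
    """Chain nodes 0..chain_len-1, then a fully connected hub of hub_size nodes."""
    n = chain_len + hub_size
    graph = {i: set() for i in range(n)}

    def add_edge(u, v):
        graph[u].add(v)
        graph[v].add(u)

    # Poorly connected part: each chain node only touches its neighbours.
    for i in range(chain_len - 1):
        add_edge(i, i + 1)
    # Well connected part: every hub node touches every other hub node.
    for u in range(chain_len, n):
        for v in range(u + 1, n):
            add_edge(u, v)
    # One bridge from the end of the chain into the hub.
    if chain_len > 0 and hub_size > 0:
        add_edge(chain_len - 1, chain_len)
    return graph


def infected_per_generation(graph: dict, seed: int) -> list:
    """Cumulative infected count after each generation (every contact transmits)."""
    infected = {seed}
    frontier = [seed]
    counts = [1]
    while frontier:
        new_cases = []
        for u in frontier:
            for v in sorted(graph[u]):
                if v not in infected:
                    infected.add(v)
                    new_cases.append(v)
        if new_cases:
            counts.append(len(infected))
        frontier = new_cases
    return counts


network = build_toy_network(chain_len=5, hub_size=10)
print(infected_per_generation(network, seed=0))  # starts at the far end of the chain
print(infected_per_generation(network, seed=5))  # starts inside the hub
```

Seeded at the end of the chain, the toy epidemic adds one case per generation until it reaches the hub; seeded in the hub, most of the network is infected in a single generation. Where you drop the first case decides how long the slow phase lasts.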

If you simulate 2 spillovers at the Huanan market as two introductions into random places in that social network, then it's not a model that represents reality.

I ran some numbers here, with my own code. Suppose I simulate 2 spillovers, with lineage A and lineage B, at the same time. Suppose those are completely independent of each other, i.e. they happen in different locations.
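A toy version of that setup might look like the sketch below. To be clear, this is not my actual simulation code and not Pekar's: it assumes a simple branching process with Poisson-distributed offspring, an R0 of 2, and a fixed number of generations, all of which are illustrative choices. The two lineages share nothing except their start time.

```python
import math
import random


def poisson(rng: random.Random, lam: float) -> int:
    """Sample a Poisson(lam) count with Knuth's method (fine for small lam)."""
    threshold = math.exp(-lam)
    k = 0
    p = 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def simulate_lineage(rng: random.Random, r0: float, generations: int):
    """Return (cumulative cases, went_extinct) for one introduction."""
    current = 1
    total = 1
    for _ in range(generations):
        if current == 0:
            break
        # Each current case infects a Poisson(r0) number of new cases.
        current = sum(poisson(rng, r0) for _ in range(current))
        total += current
    return total, current == 0


def simulate_two_spillovers(seed: int, r0: float = 2.0, generations: int = 8) -> dict:
    """Run lineage A and lineage B independently, starting at the same time."""
    rng = random.Random(seed)
    size_a, extinct_a = simulate_lineage(rng, r0, generations)
    size_b, extinct_b = simulate_lineage(rng, r0, generations)
    return {
        "A": size_a,
        "B": size_b,
        "both_survived": not (extinct_a or extinct_b),
    }


print(simulate_two_spillovers(seed=1))
```

Repeating this over many seeds and keeping only the runs where both lineages survive gives a distribution of A/B sizes. The toy numbers should not be expected to match the percentages in this thread, since the offspring distribution and parameters are placeholders.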

I counted only the cases where both spillovers did not go extinct. I found it's about 9% odds that the two will become balanced at 30/70 or closer. If I just divide 9% by 3%, I get a Bayes factor of 3. So, by that logic, Nod did find something: he reduced ~4 down to ~3.
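The arithmetic is just a ratio of two probabilities. A minimal sketch, using the 9% and 3% figures from this thread; the function name is my own.

```python
def bayes_factor(p_data_given_two: float, p_data_given_one: float) -> float:
    """Ratio of how well each hypothesis predicts the observed 2 balanced polytomies."""
    if p_data_given_one <= 0:
        raise ValueError("probability under the 1 introduction case must be positive")
    return p_data_given_two / p_data_given_one


# 9%: balanced outcome given 2 independent spillovers (both surviving).
# 3%: balanced 2-polytomy outcome given 1 introduction, per Pekar's model.
bf = bayes_factor(0.09, 0.03)
print(round(bf, 2))
```

A value above 1 favors 2 spillovers; a value below 1 would favor a single introduction.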

But, if we suppose a market outbreak is faster than a random introduction somewhere in Wuhan, then we actually want to put some constraint on that. How should we quantify that?

One approach would be to say a market spillover has no mutations before the basal polytomy forms. Those epidemics happen a little faster than average, but they can still take a few generations to get going. Here's an example, with TMRCA marked by the orange line.


Another approach would be to count epidemics where the TMRCA happens right after the first case. These aren't so uncommon in general (I think it's ~33% of all simulations), and might be a good representation of what an introduction looks like in a high-spread environment.
