Thread: Multi-omics sounds cool, until you actually try it. Here are the nuances.
1/
You’ve got RNA-seq.
Methylation.
Proteomics.
Time to “integrate” the data.
But how? And why?
Let’s break it down.

2/
Multi-omic integration sounds powerful.
But it’s not magic.
If you don’t ask the right question first, the answer won’t matter.
3/
Start here:
Do you want shared programs across omics?
Or unique signals from each modality?
That choice decides your method.
4/
Unsupervised goal? Try MOFA2.
Want to predict disease or treatment? DIABLO is your friend.
Graph models? Great, but only if they outperform the simpler options.
5/
Real-life example:
Chronic kidney disease study used both MOFA2 + DIABLO.
Why? Different tools, complementary insights.
Paper: ncbi.nlm.nih.gov/pmc/articles/P…
Another new preprint, for a different disease: medrxiv.org/content/10.110…
6/
Here’s what makes multi-omics hard:
Your matrix is incomplete.
RNA-seq for 200 samples.
Proteomics for 150.
Methylation for 180.
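A quick way to see the gap before integrating is to count how many samples have every modality. This is a minimal sketch: the sample IDs and the way they overlap are made up, and only the per-modality counts (200 / 150 / 180) come from the numbers above.

```python
# Hypothetical sample IDs; only the counts (200 / 150 / 180) mirror the thread.
rna = {f"S{i:03d}" for i in range(0, 200)}     # 200 samples
prot = {f"S{i:03d}" for i in range(50, 200)}   # 150 samples
meth = {f"S{i:03d}" for i in range(0, 180)}    # 180 samples

modalities = {"rna": rna, "proteomics": prot, "methylation": meth}


def overlap_report(modalities):
    """Return the samples measured in every modality, plus how many
    samples each modality would lose under complete-case filtering."""
    complete = set.intersection(*modalities.values())
    dropped = {name: len(ids) - len(complete) for name, ids in modalities.items()}
    return complete, dropped


complete, dropped = overlap_report(modalities)
print(f"complete cases: {len(complete)}")
for name, n in dropped.items():
    print(f"{name}: {n} samples dropped")
```

With this made-up overlap, complete-case filtering keeps fewer samples than the smallest modality has. Whether to filter, impute, or use a method that tolerates missing views depends on the question you started with.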
7/
You can’t just “merge” them.
Naive concatenation drowns real signal.
Or worse—creates phantom clusters driven by batch noise.
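To make "drowns" concrete, here is a toy simulation on pure noise. The block sizes and scales are invented for illustration: a large, high-variance "RNA" block is glued to a small "protein" block, and we check how much of the total variance each one owns. Any variance-driven method such as PCA on the concatenated matrix would mostly see the bigger block.

```python
import random
from statistics import pvariance

random.seed(0)
N_SAMPLES = 30  # hypothetical cohort size


def simulate_block(n_features, scale):
    """Rows are samples, columns are features drawn from N(0, scale^2)."""
    return [[random.gauss(0.0, scale) for _ in range(n_features)]
            for _ in range(N_SAMPLES)]


def total_variance(block):
    """Sum of per-feature (column) variances."""
    return sum(pvariance(col) for col in zip(*block))


rna = simulate_block(n_features=500, scale=3.0)   # many features, large scale
prot = simulate_block(n_features=20, scale=1.0)   # few features, small scale

var_rna = total_variance(rna)
var_prot = total_variance(prot)
share_rna = var_rna / (var_rna + var_prot)
print(f"RNA share of total variance after naive concatenation: {share_rna:.3f}")
```

Nothing here is biological signal. The imbalance comes purely from feature count and scale, which is exactly why the merged matrix can mislead.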
8/
Each modality is different:
scATAC-seq is sparse
Proteomics is noisy
RNA-seq has 20K+ features
Methylation can range from ~50K regions to over 9 million CpG sites
9/
Good methods normalize each modality, learn weights, or regularize smartly.
MOFA2, DIABLO, and weighted PCA all do this.
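As a rough illustration of the "normalize each modality, then weight" idea, the sketch below z-scores every feature within its block and then divides each block by the square root of its feature count, so every block contributes the same total variance. This is a simple block-scaling scheme chosen for the example, not what MOFA2 or DIABLO do internally, and the data are simulated.

```python
import math
import random
from statistics import mean, pstdev, pvariance


def zscore_columns(block):
    """Standardize each feature (column) to mean 0, variance 1."""
    cols = []
    for col in zip(*block):
        mu, sd = mean(col), pstdev(col)
        cols.append([(x - mu) / sd if sd > 0 else 0.0 for x in col])
    return [list(row) for row in zip(*cols)]


def scale_block(block):
    """Divide by sqrt(number of features) so the block's total variance is 1."""
    w = 1.0 / math.sqrt(len(block[0]))
    return [[x * w for x in row] for row in block]


def total_variance(block):
    """Sum of per-feature (column) variances."""
    return sum(pvariance(col) for col in zip(*block))


random.seed(1)
# Hypothetical blocks measured on the same 25 samples.
rna = [[random.gauss(5.0, 3.0) for _ in range(200)] for _ in range(25)]
prot = [[random.gauss(0.0, 1.0) for _ in range(10)] for _ in range(25)]

rna_s = scale_block(zscore_columns(rna))
prot_s = scale_block(zscore_columns(prot))

# Concatenate features only after each block has been balanced.
combined = [r + p for r, p in zip(rna_s, prot_s)]
print(total_variance(rna_s), total_variance(prot_s), len(combined[0]))
```

After scaling, the 200-feature block and the 10-feature block carry equal weight in the combined matrix. Dedicated tools learn or regularize these weights instead of fixing them by hand.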
10/
Want to see how it fails?
Check this post:
divingintogeneticsandgenomics.com/post/python-vi…
Spatial + gene expression integration went sideways without normalization.
11/
Math is nice.
But biology matters more.
If you can’t map your result back to a gene, CpG, or protein, what’s the point?
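Mapping back can be as simple as ranking named features by their weight on a factor. The sketch below assumes a generic dictionary of loadings for one factor. The feature names and weights are placeholders, not output from any real tool.

```python
def top_features(loadings, k=3):
    """Return the k features with the largest absolute weight on a factor.

    loadings: dict mapping a feature name to its weight for one factor.
    """
    ranked = sorted(loadings.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return ranked[:k]


# Placeholder names and weights, prefixed by modality for readability.
factor1 = {
    "rna:GENE_A": 0.82,
    "rna:GENE_B": -0.10,
    "cpg:cg_example_1": -0.67,
    "protein:PROT_X": 0.45,
    "protein:PROT_Y": 0.05,
}

for name, weight in top_features(factor1):
    print(f"{name}\t{weight:+.2f}")
```

Keeping the modality in the feature name makes it obvious whether a factor is shared across omics or driven by a single one.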
12/
These methods uncover correlations, not causes.
Interpret carefully.
Validate everything.
13/
Use known pathways.
Run orthogonal experiments.
Generalize across cohorts.
Don’t trust the output blindly.
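One common sanity check against known pathways is an over-representation test: do the top features of a factor hit a pathway more often than chance? Below is a minimal sketch using the one-sided hypergeometric tail. All the counts in the example call are invented, and a real analysis would also correct for multiple testing across pathways.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_pathway, n_selected, n_overlap):
    """P(X >= n_overlap) when n_selected genes are drawn without replacement
    from n_background genes, of which n_pathway belong to the pathway."""
    upper = min(n_pathway, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Invented counts: 20,000 background genes, a 100-gene pathway,
# 200 top-loading genes, 10 of which fall in the pathway.
p = hypergeom_enrichment_p(20000, 100, 200, 10)
print(f"enrichment p-value: {p:.2e}")
```

A small p-value here still only says the overlap is unlikely by chance. It does not say the pathway drives the phenotype, which is where the orthogonal experiments come in.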
14/
Resources:
Tools list: github.com/mikelove/aweso…
Tool review: ncbi.nlm.nih.gov/pmc/articles/P…
Overview: frontlinegenomics.com/a-guide-to-mul…
15/
Key takeaways:
Start with the question
Pick tools based on your goal
Normalize per modality
Validate everything
Biology > black boxes
Multi-omics is messy. But it’s worth it—if you know what you’re doing.
I hope you've found this post helpful.
Follow me for more.
Subscribe to my FREE newsletter chatomics to learn bioinformatics divingintogeneticsandgenomics.ck.page/profile
