Sai Zhang

@saizhang0

1,333
Followers
1,581
Following
15
Media
933
Statuses

computational biology and machine learning. assistant professor at @UF . tsinghua (phd) → stanford (postdoc with @SnyderShot ) → ufl (pi). views are my own.

Gainesville, FL
Joined January 2019
Pinned Tweet
@saizhang0
Sai Zhang
5 months
We have single-cell versions of eQTL and fine-mapping, but how about PRS? I am delighted to share our latest work on scPRS, a geometric deep learning model constructing single-cell-resolved PRS leveraging scATAC data to enhance disease prediction and biological discovery. 1/n
@biorxivpreprint
bioRxiv
5 months
Deconvolution of polygenic risk score in single cells unravels cellular and molecular heterogeneity of complex human diseases #bioRxiv
0
4
12
2
8
51
@saizhang0
Sai Zhang
10 months
Someone is running jobs on 128 GPU nodes (~1000 A100) during holidays. Is this a normal PhD life?
98
59
1K
@saizhang0
Sai Zhang
10 months
@70_dbz We have user limit, but I assume this is an exception (a Santa gift?)😂
1
0
54
@saizhang0
Sai Zhang
2 years
It's my first day as a #newPI @UF @UFPHHP @UFMedicine . Very excited to start this new position! I'm really grateful to all my mentors, colleagues and friends who helped me along this journey. (1/n)
18
2
48
@saizhang0
Sai Zhang
10 months
@vitaliikl Yes thank @NVIDIAAI for the investment.
1
0
47
@saizhang0
Sai Zhang
10 months
As this tweet is unexpectedly getting popular, I want to disclose that this @UF HPC is HiPerGator with tremendous investments (1,120 A100s) from @nvidia @NVIDIAAI . Literally, NVIDIA is making UF researchers much busier than ever.
@saizhang0
Sai Zhang
10 months
Someone is running jobs on 128 GPU nodes (~1000 A100) during holidays. Is this a normal PhD life?
98
59
1K
1
4
44
@saizhang0
Sai Zhang
2 years
Last meeting with Mike @SnyderShot before I leave - besides project wrap-up, also got a lot of useful advice for being a good PI.
2
0
39
@saizhang0
Sai Zhang
1 year
We have two postdoc openings (until filled): (1) general computational genomics & machine learning and (2) ALS genomics funded by @mndassoc . Please DM me if you are interested in either of them, and help spread it if you know someone would be interested. Application links:👇
2
19
28
@saizhang0
Sai Zhang
8 months
Can't agree more. Bio ML is not Kaggle but science. ML is just another tool to decode bio system, so the best model is the one working best, not the largest or fanciest. In many cases a simpler model works better - trust me the real informative bio data is less than you thought.
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
8 months
To all budding compbio & ML folks interested in bio: Don't just only run behind the latest ML model hype train. The greatest long run impact will come by really assimilating prior bio/compbio literature with the goal of really understanding strategies for how to model biology. 1/
8
129
781
1
0
28
@saizhang0
Sai Zhang
3 years
Finally out. We developed machine learning method RefMap to discover the genomic basis of ALS. Our gene findings put distal axon dysfunction upstream of motor neuron degeneration. (1/n)
7
8
28
@saizhang0
Sai Zhang
10 months
Super excited that our work on a GNN-based polygenic risk score (PRS-Net) was accepted by #RECOMB2024 . This is my first last-author paper! Preprint coming soon. Looking forward to visiting Boston and connecting with old and new friends! @RECOMBconf
@jmuiuc
Jian Ma
10 months
The list of #RECOMB2024 Accepted Papers is now live: . A huge congrats to all the authors! @RECOMBconf
1
19
81
1
1
25
@saizhang0
Sai Zhang
11 months
I usually don't comment on politics, but this is what's happening at UF, and I see signs it will be expanded to other states. This is an extremely risky action that will definitely harm US sci&tech in the long run.
@NewsfromScience
News from Science
11 months
A new state law is thwarting faculty at Florida’s public universities who want to hire Chinese graduate students and postdocs to work in their labs.
5
12
11
1
9
27
@saizhang0
Sai Zhang
2 years
Officially online @CellSystemsCP . Our first example of combining genetics and single cell multiomics to dissect cell heterogeneity and better map genomic causes of complex diseases, here severe COVID-19. Work from @SnyderShot and Phil Tsao
3
6
19
@saizhang0
Sai Zhang
2 years
Thanks @CellSystemsCP @GaalBernadett for featuring our paper ( @SnyderShot @JohnathanCK1 @TheSnyderLab ) on the cover! Really enjoyed the review process!
@CellSystemsCP
Cell Systems
2 years
Our August issue is online! On the cover: Natural killer (NK) cells (pale blue) attacking SARS-CoV-2 viruses (red) with machine learning methodology signified by patterned numerals.
1
0
2
0
2
18
@saizhang0
Sai Zhang
2 years
A warm welcome from @SnyderShot and me to @JohnathanCK1 visiting Stanford! A lot of fun will be happening.
1
0
20
@saizhang0
Sai Zhang
3 years
Check out our new preprint. Our machine learning-powered analyses reveal a cell-type-specific genetic landscape of severe COVID-19, and place NK cells upstream in the pathogenesis. Great collaboration with @JohnathanCK1 , MVP and GEN-COVID. Work from Philip Tsao and @SnyderShot
@medrxivpreprint
medRxiv
3 years
Common and rare variant analyses combined with single-cell multiomics reveal cell-type-specific molecular mechanisms of COVID-19 severity #medRxiv
0
2
1
0
9
12
@saizhang0
Sai Zhang
8 months
Here is the preprint . We are still working on a journal version based on RECOMB review comments. Stay tuned!
@saizhang0
Sai Zhang
10 months
Super excited that our work on a GNN-based polygenic risk score (PRS-Net) was accepted by #RECOMB2024 . This is my first last-author paper! Preprint coming soon. Looking forward to visiting Boston and connecting with old and new friends! @RECOMBconf
1
1
25
0
1
12
@saizhang0
Sai Zhang
10 months
An update: this job won’t terminate until Jan 2nd, so let’s calculate 💰💰💰 now
@saizhang0
Sai Zhang
10 months
Someone is running jobs on 128 GPU nodes (~1000 A100) during holidays. Is this a normal PhD life?
98
59
1K
4
0
11
@saizhang0
Sai Zhang
2 years
Officially accepted by @CellSystemsCP
@saizhang0
Sai Zhang
3 years
Check out our new preprint. Our machine learning-powered analyses reveal a cell-type-specific genetic landscape of severe COVID-19, and place NK cells upstream in the pathogenesis. Great collaboration with @JohnathanCK1 , MVP and GEN-COVID. Work from Philip Tsao and @SnyderShot
0
9
12
0
3
11
@saizhang0
Sai Zhang
10 months
@BoWang87 Camel, mamba🧐 I think you can rename your Publication page as zoo of papers.
0
0
6
@saizhang0
Sai Zhang
2 years
A super insightful review!
@drjosephpowell
Joseph Powell
2 years
Check out our perspectives article on the intersection of single-cell genomics and human (population) genetics. Great working on this with the team: @AnnaSECuomo , @aparnanathan , @soumya_boston , @dgmacarthur
1
10
47
0
0
6
@saizhang0
Sai Zhang
1 year
Really cutting-edge material!
@BoWang87
Bo Wang
1 year
Interested in LLMs for genomic research but don't know where to start? looking for a review/survey to get started in this field? 👇👇😀 I am very excited to share that our review paper titled "To Transformers and Beyond: Large Language Models for the Genome" is now available as
15
288
1K
0
0
7
@saizhang0
Sai Zhang
3 years
Excited to give the talk at CCBR. Looking forward to it!
0
0
7
@saizhang0
Sai Zhang
3 years
Please check out our @NeuroCellPress cover story here! Thanks for the awesome design from @not_akevin . Thanks a lot for the preview from @haiyuanyu
@NeuroCellPress
Neuron
3 years
Check out our latest issue: Cover highlights a novel mapping approach to discover risk-altering variants of disease such as ALS. Find the paper from Snyder's team here: with a preview of this work:
0
2
17
0
3
7
@saizhang0
Sai Zhang
3 years
Will give a Highlight talk about our Neuron paper. Have been 5ys since I attended RECOMB last time (how time flies!). Really looking forward to it and meeting old and new friends in SD!
@recomb2022
RECOMB 2022
3 years
Program for #RECOMB2022 , featuring 7 distinguished keynote speakers (Regina Barzilay, Howard Chang, John Chodera, Lenore Cowen, John Marioni, Bing Ren and Wenyi Wang), and covering a broad range of topics in computational biology, is now available!
0
18
37
0
0
7
@saizhang0
Sai Zhang
8 months
I anticipate versatile applications of PRS-Net by the community with enhanced prediction and biological discovery 🚀
@biorxiv_genetic
bioRxiv Genetics
8 months
PRS-Net: Interpretable polygenic risk scores via geometric learning #biorxiv_genetic
0
0
4
1
0
6
@saizhang0
Sai Zhang
10 months
I am imagining one more weakness comment in the future - it is questionable if the applicant was able to recruit diverse trainees in the lab because of new FL laws.
@LabMoehle
Mark Moehle Lab
10 months
Grant unfortunately not discussed. This was at least comic relief when reading the summary statement. I guess connecting flights aren't a thing for the best scientists. Should have pushed for a GNV to SFO flight in my offer letter.
91
33
604
0
1
6
@saizhang0
Sai Zhang
10 months
@jmuiuc @RECOMBconf Also a remarkable PC Chair for handling all submissions efficiently!
0
0
4
@saizhang0
Sai Zhang
3 years
The best way to celebrate an accept is to accelerate the next project. Well this is research
0
0
6
@saizhang0
Sai Zhang
11 months
@jmuiuc Qual vs number is like a strategic difference, but aiming high is riskier especially for junior people. However I do believe it brings benefit in the long run. The problem is if the field appreciates more qual over number or vice versa?
0
0
4
@saizhang0
Sai Zhang
2 years
This approach was previously used to study the genetics of ALS. Please see our Neuron paper
@SnyderShot
Michael Snyder, PhD
2 years
Our new RefMap ML method combined with single cell data identifies over 1000 genes responsible for COVID severity and defines the cells types underlying its heritability (NK and T cells). Just got the cover:
2
18
122
0
2
5
@saizhang0
Sai Zhang
11 months
My last tweet is about quality vs quantity. Well here are almost 40 papers and over 30 talks per year. Where did the time go for research? Thanks to ChatGPT?
0
0
5
@saizhang0
Sai Zhang
2 years
Very excited to be involved in this incredible work! Using network analysis, we identified genes and pathways underpinning CHD and we have experimental validation as well. Congrats to @jingjingSF @SnyderShot @Joseph_C_Wu @TheSnyderLab @StanfordCVI
2
0
5
@saizhang0
Sai Zhang
2 years
@UF @UFPHHP @UFMedicine Lastly, I really want to thank @SnyderShot for his enormous support of my research - so proud of being a member of @TheSnyderLab ! (3/n)
0
0
4
@saizhang0
Sai Zhang
4 years
Excited to share our ( @SnyderShot ) latest work on ALS genetics. An awesome collaboration with Dr. Johnathan Cooper-Knock and Dr. Pamela J. Shaw from the University of Sheffield and many other great scientists in this field. 1/5
2
0
4
@saizhang0
Sai Zhang
2 years
@UF @UFPHHP @UFMedicine Our group is actively recruiting postdocs and graduate students. So if you are interested in ML for biomedicine, please DM me. The job ad is coming soon :)
0
2
4
@saizhang0
Sai Zhang
3 years
@jmuiuc A few opinions: Biology is becoming compbio by itself these days. CS/ML should be a skill for a biologist, like math for physicists. It is not the background or training but the scientific question we are asking that matters.
0
0
3
@saizhang0
Sai Zhang
11 months
@anshulkundaje We need a whole review or perspective on it!
0
0
2
@saizhang0
Sai Zhang
9 months
@WenhuChen High risk high reward topic
0
0
3
@saizhang0
Sai Zhang
2 years
Excuse me?
@Columbia
Columbia University
2 years
“Asian American students who have earned admission to Harvard are smart, promising, and have no doubt worked very hard. But in ways . . . they may have also benefited from their racial status long before they applied,” writes sociology Professor @JLeeSoc .
2K
166
405
0
0
3
@saizhang0
Sai Zhang
2 years
@morris_lab @kenjikamimoto68 Congrats! We have used CellOracle in several projects and it is really an elegant and powerful method.
0
0
3
@saizhang0
Sai Zhang
3 years
Looks like a really smart idea
@bloodgenes
Vijay Sankaran
3 years
Are you interested in how we can learn more human biology by integrating #singlecell genomics and #GWAS ? Please check out our preprint: #V2F mapping at single-cell resolution through network propagation Led by @fulong_yu ! A short 🧵 (1/n)
5
79
298
0
0
3
@saizhang0
Sai Zhang
4 years
A very interesting result and very happy journey with @JohnathanCK1 . More about ALS rare variant is coming :D
@CellReports
Cell Reports
4 years
Rare variant burden analysis within enhancers identifies CAV1 as an ALS risk gene
0
3
10
0
1
3
@saizhang0
Sai Zhang
1 year
@anshulkundaje @biorxivpreprint same problem - have to read biorxiv using my phone?
0
0
2
@saizhang0
Sai Zhang
3 years
@anshulkundaje @martinjzhang This is great work. Also you may want to check this out for scATAC:
0
0
3
@saizhang0
Sai Zhang
3 years
Will a 40-page response make the reviewers happy? I really hope so...
0
0
3
@saizhang0
Sai Zhang
2 years
@EpiEllie @BUSPH Well I got Mike with no surprise @SnyderShot
0
0
3
@saizhang0
Sai Zhang
3 years
This reminds me of All models are wrong (has own math assumptions & we can always get a "better" one by changing priors) so we inevitably need follow-up experimental validations. I am tending to believe that math/stats models are tools giving us candidates rather than truth.
@bpasaniuc
Bogdan Pasaniuc
3 years
@doctorveera lots of strong statements without much support in this preprint. The basic summary is: let's change the null of TWAS and then be surprised that under the new null it has inflated type 1 error rate...
1
0
16
0
0
3
@saizhang0
Sai Zhang
9 months
@jengreitz @Nature @kanghelenyihua @Dr_RajatGupta Very impressive work! Congratulations!
0
0
3
@saizhang0
Sai Zhang
10 months
@m_gitz @UF @nvidia @NVIDIAAI I expect this one will be great research given the great resources, but also expect it to be fast leaving space for others like me🤣
0
0
0
@saizhang0
Sai Zhang
11 months
Very insightful papers! I have been curious why all of these sequence models are always applied variant by variant ignoring the true variant background in individuals - sometimes could be very complex in long range. Now here is why it is suboptimal.
@sara_mostafavi
Sara Mostafavi
11 months
Our paper (with @chikinlab and @LXandR_ ) benchmarking sequence-based DL models for personal gene expression prediction is out: A co-submission from Nilah Ioannidis' group, showing these results are consistent across data and models
4
44
162
0
0
2
@saizhang0
Sai Zhang
2 years
@anshulkundaje @UF @UFPHHP @UFMedicine Thanks Anshul. Hope we will have the chance to collaborate in the future!
0
0
2
@saizhang0
Sai Zhang
10 months
@SashaGusevPosts Also, I am always curious if these models were "overfitted" given the relatively homogeneous training input (ref DNA) compared to other tasks. If the overfitting is an issue, then the model may be easily incorrectly interpreted, leading to the poor perf in personal settings?
1
0
2
@saizhang0
Sai Zhang
2 years
Bad effect of n=1?
@leonidkruglyak
Leonid Kruglyak
2 years
Shen et al. measured fitness by comparing growth of mutant libraries with a single WT strain. Crucially, the libraries did not contain WT sequences created in parallel with the mutations, and as a result did not control for gene-specific background effects on fitness.
1
4
42
0
0
1
@saizhang0
Sai Zhang
5 years
@jure Video available?
0
0
2
@saizhang0
Sai Zhang
1 year
@jmuiuc @UCLA_CGSI Very insightful! We urgently need well defined benchmarking to demo the “power” of LLM rather than a fancy concept but marginally outperforming nonLLMs or just compared with trivial methods.
0
0
2
@saizhang0
Sai Zhang
2 years
@doctorveera Totally agreed and really nice summary of “counter-examples”!
0
0
1
@saizhang0
Sai Zhang
2 years
Also to all other authors! @jingjingSF and I came up with the idea before the pandemic which made my first work on pediatric disease - how time flies
0
0
2
@saizhang0
Sai Zhang
10 months
@SashaGusevPosts So it looks like feeding more variations to the model is the key, especially in an end-to-end setting.
0
0
2
@saizhang0
Sai Zhang
11 months
@jmuiuc A very interesting/important question. I think in most cases the number is negatively related to quality. But I have no doubt there are super talented people publishing many high-quality papers at a fast pace.
0
0
1
@saizhang0
Sai Zhang
3 years
Awesome opinions. I have been considering combining both jobs: predicting molecular profiles from sequences plus cell-type-specific factors (minimally/partially measured) for cross-cell-type prediction.
@jmschreiber91
Jacob Schreiber
3 years
What's the point of comp bio models that can only make predictions for experiments that have already been performed (e.g. DeepSEA, Basset, Enformer, BPNet, etc)? In Rit's/my latest short review on ML in comp bio, we discuss! 1/8
3
29
102
1
0
2
@saizhang0
Sai Zhang
1 year
@Xiaojie_Qiu Congrats Xiaojie!
1
0
1
@saizhang0
Sai Zhang
1 year
@jmschreiber91 @nomad421 @BioMickWatson @pashadag It is not "free" if someone uses it for green card application.
1
0
1
@saizhang0
Sai Zhang
2 years
@UF @UFPHHP @UFMedicine I want to thank @jingjingSF @JohnathanCK1 etc for years of collaborations which got me a good taste of science. I want to thank Dr. Jianyang Zeng who brought me into the field of compbio from pure CS. (2/n)
0
0
2
@saizhang0
Sai Zhang
9 months
@penggaos Congrats!!! You deserve it boy!!!
1
0
2
@saizhang0
Sai Zhang
8 months
@BoWang87 @NatureBiotech Congrats Bo! This is the issue we recently got in our own data. Definitely will try your method.
0
0
2
@saizhang0
Sai Zhang
4 years
Hope I could get one like this next time🥺
@segal_eran
Eran Segal
4 years
Definitely didn't see this one coming
6
2
181
0
0
2
@saizhang0
Sai Zhang
3 years
@SchoeneggerPhil It is time for the editor to take action.
1
0
1
@saizhang0
Sai Zhang
1 year
@BoWang87 Congratulations!
0
0
1
@saizhang0
Sai Zhang
3 years
@XiuweiZhang Not a convincing reason to reject a paper. DL may be sensitive to hyperparameters such as the structure of the network since it is DL. But every reviewer of a DL paper is looking for a section describing how those parameters were tuned using like CV. This looks to be inevitable.
0
0
1
@saizhang0
Sai Zhang
1 year
@zhou_jingtian @arcinstitute Congratulations Jingtian!!!
0
0
1
@saizhang0
Sai Zhang
10 months
@anshulkundaje @SashaGusevPosts Yes I ignored the difficulty in the second part. Then we need to sequence more individual ATAC+RNA to catch the variations. More work to do.
0
0
1
@saizhang0
Sai Zhang
8 months
@BoWang87 @patricksmalone Similar issues to DNA/RNA LLMs: I am always curious if they really learned something interesting using solely seq info. More advanced structure engineering is needed to integrate domain knowledge for a better solution.
0
0
1
@saizhang0
Sai Zhang
3 years
So the problem is not the data, it is how we analyze the data and validate our findings.
0
0
1
@saizhang0
Sai Zhang
2 years
@grahamserwin @SnyderShot Thanks Graham! Congrats on your Nature paper!
0
0
1
@saizhang0
Sai Zhang
2 years
@UF @UFPHHP @UFMedicine We will continue the research on compbio and precision medicine - developing novel ML models to integrate genetics with single-cell genomics and clinical data to decode complex diseases. A lot of fun will be there! (4/n)
0
0
1
@saizhang0
Sai Zhang
3 years
I am excited to see that more and more efforts are now being made in combining functional genomics (e.g. single cell) with GWAS to boost the causal discovery and interpretation. Really like this work
0
1
1
@saizhang0
Sai Zhang
10 months
@SashaGusevPosts In Enformer MPRA experiments, a SNP-by-SNP strategy (similar to QTL) was taken. I wonder if together inputting multiple personal SNPs would have disrupted the model prediction because all of these models were trained on reference. May need to test it by masking certain SNPs.
2
0
0
@saizhang0
Sai Zhang
2 years
@jmuiuc @aaas Big Congrats!
0
0
1
@saizhang0
Sai Zhang
3 years
@jmschreiber91 Please check out the father of this series of studies - polyphen2
1
0
1
@saizhang0
Sai Zhang
3 years
@KVBortle @CDB_Illinois @MCB_Illinois @CancerCenterIL Big Congrats Kevin! Although we didn’t get the chance to work together hopefully we will in the future. Looking forward to your future work!
1
0
1
@saizhang0
Sai Zhang
2 years
This looks awesome! Eager to get a draft!
@jkpritch
Jonathan Pritchard
2 years
Here's the current table of contents (Part 4 not started yet). I hope to present a unified treatment of popgen data and theory, human history, and human trait genetics. I'll be open to sharing drafts with people soon (ping me if you have an interest).
9
18
186
0
0
1
@saizhang0
Sai Zhang
4 years
We profiled the transcriptome and epigenome (ATAC, histone ChIP and Hi-C) of iPSC-derived motor neurons (MNs), and developed a Bayesian network (RefMap) to integrate this functional genomics data with ALS genetics for risk gene discovery. 2/5
1
1
1
@saizhang0
Sai Zhang
3 years
@anshulkundaje @RoxanaDaneshjou There is a team at Stanford Med to help build software applications but forgot where I saw this info.
0
0
1