We develop and apply machine learning and statistical methods to study the genomics of complex diseases, with a particular interest in psychiatric disorders. We are especially interested in developing models for combining association evidence across multiple types of genomic data, such as gene expression and genotype data, and modeling prior biological pathways and networks for disentangling spurious from meaningful correlations.
Computational Biology, Regulatory Networks, Genetics of Complex Traits, Psychiatric Genetics, Machine Learning in Computational Biology
The production of diverse types of high-dimensional biological data has increased tremendously in the last decade, presenting novel opportunities to develop and apply computational and machine learning approaches to understand the genetics of human diseases. However, the high dimensionality of these data, whereby up to millions of diverse and heterogeneous “features” are measured in a single experiment, coupled with the prevalence of systematic confounding factors, presents significant challenges in identifying bona fide associations that are informative of causal molecular events in disease. My research interest lies in designing tailored computational models for integrating multiple types of high-dimensional “omics” data, with the ultimate goal of separating meaningful molecular correlations from spurious ones for common diseases such as psychiatric disorders.
Mostafavi S, Battle A, Zhu X, Potash JB, Weissman MM, Shi J, Beckman K, Haudenschild C, McCormick C, Mei R, Gameroff MJ, Gindes H, Adams P, Goes FS, Mondimore FM, Mackinnon DF, Notes L, Schweizer B, Furman D, Montgomery SB, Urban AE, Koller D, Levinson DF. Type I interferon signaling genes in recurrent major depression: increased expression detected by whole-blood RNA sequencing. Molecular Psychiatry. 2014.
Battle A, Mostafavi S, Zhu X, Potash JB, Weissman MM, McCormick C, Haudenschild CD, Beckman KB, Shi J, Mei R, Urban AE, Montgomery SB, Levinson DF, Koller D. Characterizing the genetic basis of transcriptome diversity through RNA-sequencing of 922 individuals. Genome Research. 2014.
Raj T, Rothamel K, Mostafavi S, Ye C, Lee MN, Replogle JM, Feng T, Lee M, Asinovski N, Frohlich I, Imboywa S, Von Korff A, Okada Y, Patsopoulos NA, Davis S, McCabe C, Paik HI, Srivastava GP, Raychaudhuri S, Hafler DA, Koller D, Regev A, Hacohen N, Mathis D, Benoist C, Stranger BE, De Jager PL. Polarization of the effects of autoimmune and neurodegenerative risk alleles in leukocytes. Science. 2014.
Mostafavi S, Battle A, Zhu X, Urban AE, Levinson D, Montgomery SB, Koller D. Normalizing RNA-sequencing data by modeling hidden covariates with prior knowledge. PLoS One. 2013.
Mostafavi S, Goldenberg A, Morris Q. Labeling nodes using three degrees of propagation. PLoS One. 2012.
Mostafavi S, Morris Q. Combining many interaction networks to predict gene function and analyze gene lists. Proteomics. 2012 (Review article).
Mostafavi S, Morris Q. Fast integration of heterogeneous data sources for predicting gene function with limited annotation. Bioinformatics. 2010.