Research
The Storey Lab’s current research efforts are in the following areas:
- Extending fundamental population genetics models and quantities — such as allele frequencies, Hardy-Weinberg equilibrium, FST, admixture, ancestry, and kinship — to genome-wide genotype data making minimal assumptions about population structure
- Improving the accuracy and applicability of the polygenic trait model in population-based studies for determining the genetic basis of complex traits, including association studies, genome-wide heritability, and polygenic risk scores
- Developing causal inference methods and study designs for population-based genetics studies of complex traits
- Latent variable decompositions of high-dimensional data, with an emphasis on admixture, population structure, and exponential family distributions
- Latent variable modeling in the context of high-dimensional significance testing — such as surrogate variable analysis and cross-dimensional inference
- Significance testing on high-dimensional regression models
- Models and estimators for false discovery rates and q-values
Publications
- Chen, D. and J. D. Storey (2024). “Coancestry superposed on admixed populations yields measures of relatedness at individual-level resolution”. bioRxiv doi:10.1101/2024.12.29.630632. https://doi.org/10.1101/2024.12.29.630632.
- Tang, Y., I. Cabreros, and J. D. Storey (2024). “Identifying causal genotype-phenotype relationships for population-sampled parent-child trios”. bioRxiv doi:10.1101/2024.12.10.627752. https://doi.org/10.1101/2024.12.10.627752.
- Ochoa, A. and J. D. Storey (2021). “Estimating FST and kinship for arbitrary population structures”. PLoS Genetics 17.1. First published online 2020-11-02, p. e1009241. https://doi.org/10.1371/journal.pgen.1009241.
- Graim, K., et al. (2021). “Modeling molecular development of breast cancer in canine mammary tumors”. Genome Research 31. First published online 2020-12-23, pp. 337-347. https://doi.org/10.1101/gr.256388.119.
- Bass, A. J. and J. D. Storey (2021). “The optimal discovery procedure for significance analysis of general gene expression studies”. Bioinformatics 37.3. First published online 2020-08-20, pp. 367-374. https://doi.org/10.1093/bioinformatics/btaa707.
- Chen, X., D. G. Robinson, and J. D. Storey (2021). “The functional false discovery rate with applications to genomics”. Biostatistics 22.1. First published online 2019-05-28, pp. 68-81. https://dx.doi.org/10.1093/biostatistics/kxz010.
- Bass, A. J., D. G. Robinson, and J. D. Storey (2019). “Determining sufficient sequencing depth in RNA-Seq differential expression studies”. bioRxiv doi:10.1101/635623. https://dx.doi.org/10.1101/635623.
- Bass, A. J. and J. D. Storey (2019). “The optimal discovery procedure for significance analysis of general gene expression studies”. bioRxiv doi:10.1101/571992. https://dx.doi.org/10.1101/571992.
- Ochoa, A. and J. D. Storey (2019). “New kinship and FST estimates reveal higher levels of differentiation in the global human population”. bioRxiv doi:10.1101/653279. https://dx.doi.org/10.1101/653279.
- Ochoa, A. and J. D. Storey (2019). “FST and kinship for arbitrary population structures II: Method-of-moments estimators”. bioRxiv doi:10.1101/083923. https://dx.doi.org/10.1101/083923.
- Ochoa, A. and J. D. Storey (2019). “FST and kinship for arbitrary population structures I: Generalized definitions”. bioRxiv doi:10.1101/083915. https://dx.doi.org/10.1101/083915.
- Cabreros, I. and J. D. Storey (2019). “A likelihood-free estimator of population structure bridging admixture models and principal components analysis”. Genetics 212.4, pp. 1009-1029. https://dx.doi.org/10.1534/genetics.119.302159.
- Hao, W. and J. D. Storey (2019). “Extending tests of Hardy-Weinberg equilibrium to structured populations”. Genetics 213.3, pp. 759-770. https://dx.doi.org/10.1534/genetics.119.302370.
- Cabreros, I. and J. D. Storey (2019). “Causal models on probability spaces”. arXiv 1907.01672. https://arxiv.org/abs/1907.01672.
- Cabreros, I. and J. D. Storey (2019). “A likelihood-free estimator of population structure bridging admixture models and principal components analysis”. bioRxiv doi:10.1101/240812. https://dx.doi.org/10.1101/240812.
- Painter, H. J., N. C. Chung, A. Sebastian, I. Albert, J. D. Storey, and M. Llinás (2018). “Genome-wide real-time in vivo transcriptional dynamics during Plasmodium falciparum blood-stage development”. Nature Communications 9.1, p. 2656. http://dx.doi.org/10.1038/s41467-018-04966-3.
- Painter, H. J., N. C. Chung, A. Sebastian, I. Albert, J. D. Storey, and M. Llinás (2018). “Genome-wide real-time in vivo transcriptional dynamics during Plasmodium falciparum blood-stage development”. bioRxiv doi:10.1101/265975. http://dx.doi.org/10.1101/265975.
- Hao, W. and J. D. Storey (2017). “Extending tests of Hardy-Weinberg equilibrium to structured populations”. bioRxiv doi:10.1101/240804. https://dx.doi.org/10.1101/240804.
- Cabreros, I. and J. D. Storey (2017). “A nonparametric estimator of population structure unifying admixture models and principal components analysis”. bioRxiv doi:10.1101/240812. https://dx.doi.org/10.1101/240812.
- Chen, X., D. G. Robinson, and J. D. Storey (2017). “The functional false discovery rate with applications to genomics”. bioRxiv doi:10.1101/241133. https://dx.doi.org/10.1101/241133.
- Hackett, S. R. and J. D. Storey (2017). “Mixed membership martial arts: Data-driven analysis of winning martial arts styles”. MIT Sloan Sports Analytics Conference. http://www.sloansportsconference.com/wp-content/uploads/2017/02/1575.pdf.
- Ochoa, A. and J. D. Storey (2016). “FST and kinship for arbitrary population structures II: Method of moments estimators”. bioRxiv doi:10.1101/083923. https://dx.doi.org/10.1101/083923.
- Ochoa, A. and J. D. Storey (2016). “FST and kinship for arbitrary population structures I: Generalized definitions”. bioRxiv doi:10.1101/083915. https://dx.doi.org/10.1101/083915.
- Gopalan, P., W. Hao, D. M. Blei, and J. D. Storey (2016). “Scaling probabilistic models of genetic variation to millions of humans”. Nature Genetics 48.12, pp. 1587-1590. https://dx.doi.org/10.1038/ng.3710.
- Hackett, S. R., V. R. T. Zanotelli, W. Xu, J. Goya, J. O. Park, D. H. Perlman, et al. (2016). “Systems-level analysis of mechanisms regulating yeast metabolic flux”. Science 354.6311, pp. aaf2786-aaf2786. https://dx.doi.org/10.1126/science.aaf2786.
- Hao, W., M. Song, and J. D. Storey (2015). “Probabilistic models of genetic variation in structured populations applied to global human studies”. Bioinformatics 32.5, pp. 713-721. https://dx.doi.org/10.1093/bioinformatics/btv641.
- Song, M., W. Hao, and J. D. Storey (2015). “Testing for genetic associations in arbitrarily structured populations”. Nature Genetics 47.5, pp. 550-554. https://dx.doi.org/10.1038/ng.3244.
- Robinson, D. G., J. Y. Wang, and J. D. Storey (2015). “A nested parallel experiment demonstrates differences in intensity-dependence between RNA-seq and microarrays”. Nucleic Acids Research, p. gkv636. https://dx.doi.org/10.1093/nar/gkv636.
- Ochoa, A., J. D. Storey, M. Llinás, and M. Singh (2015). “Beyond the E-Value: Stratified statistics for protein domain prediction”. PLOS Computational Biology 11.11, p. e1004509. https://dx.doi.org/10.1371/journal.pcbi.1004509.
- Gopalan, P., W. Hao, D. M. Blei, and J. D. Storey (2015). “Scaling probabilistic models of genetic variation to millions of humans”. bioRxiv doi:10.1101/013227. https://dx.doi.org/10.1101/013227.
- Chen, X. and J. D. Storey (2015). “Consistent estimation of low-dimensional latent structure in high-dimensional data”. arXiv 1510.03497. https://arxiv.org/abs/1510.03497.
- Song, M., W. Hao, and J. D. Storey (2015). “Testing for genetic associations in arbitrarily structured populations”. bioRxiv doi:10.1101/012682. https://dx.doi.org/10.1101/012682.
- Robinson, D. G. and J. D. Storey (2014). “subSeq: Determining appropriate sequencing depth through efficient read subsampling”. Bioinformatics 30.23, pp. 3424-3426. https://dx.doi.org/10.1093/bioinformatics/btu552.
- Chung, N. C. and J. D. Storey (2014). “Statistical significance of variables driving systematic variation in high-dimensional data”. Bioinformatics 31.4, pp. 545-554. https://dx.doi.org/10.1093/bioinformatics/btu674.
- Marstrand, T. T. and J. D. Storey (2014). “Identifying and mapping cell-type-specific chromatin programming of gene expression”. Proceedings of the National Academy of Sciences 111.6, pp. E645-E654. https://dx.doi.org/10.1073/pnas.1312523111.
- Kim, J., N. Ghasemzadeh, D. J. Eapen, N. Chung, J. D. Storey, A. A. Quyyumi, et al. (2014). “Gene expression profiles associated with acute myocardial infarction and risk of cardiovascular death”. Genome Medicine 6.5, p. 40. https://dx.doi.org/10.1186/gm560.
- Ochoa, A., J. D. Storey, M. Llinás, and M. Singh (2014). “Beyond the E-value: Stratified statistics for protein domain prediction”. arXiv 1409.6384. https://arxiv.org/abs/1409.6384.
- Robinson, D. G., J. Wang, and J. D. Storey (2014). “A nested parallel experiment demonstrates differences in intensity-dependence between RNA-seq and microarrays”. bioRxiv doi:10.1101/013342. https://dx.doi.org/10.1101/013342.
- Robinson, D. G., W. Chen, J. D. Storey, and D. Gresham (2013). “Design and analysis of bar-seq experiments”. G3: Genes | Genomes | Genetics 4.1, pp. 11-18. https://dx.doi.org/10.1534/g3.113.008565.
- Jaffe, A. E., J. D. Storey, H. Ji, and J. T. Leek (2013). “Gene set bagging for estimating the probability a statistically significant result will replicate”. BMC Bioinformatics 14.1, p. 360. https://dx.doi.org/10.1186/1471-2105-14-360.
- Jaffe, A. E., J. D. Storey, H. Ji, and J. T. Leek (2013). “Gene set bagging for estimating replicability of gene set analyses”. arXiv 1301.3933. https://arxiv.org/abs/1301.3933.
- Hao, W., M. Song, and J. D. Storey (2013). “Probabilistic models of genetic variation in structured populations applied to global human studies”. arXiv 1312.2041. https://arxiv.org/abs/1312.2041.
- Chung, N. C. and J. D. Storey (2013). “Statistical significance of variables driving systematic variation”. arXiv 1308.6013. https://arxiv.org/abs/1308.6013.
- Leek, J. T., W. E. Johnson, H. S. Parker, A. E. Jaffe, and J. D. Storey (2012). “The sva package for removing batch effects and other unwanted variation in high-throughput experiments”. Bioinformatics 28.6, pp. 882-883. https://dx.doi.org/10.1093/bioinformatics/bts034.
- Desai, K. H. and J. D. Storey (2012). “Cross-dimensional inference of dependent high-dimensional data”. Journal of the American Statistical Association 107.497, pp. 135-151. https://dx.doi.org/10.1080/01621459.2011.645777.
- Marstrand, T. T. and J. D. Storey (2011). “Identifying and mapping cell-type specific chromatin programming of gene expression”. arXiv 1210.3313. https://arxiv.org/abs/1210.3313.
- Xu, W., J. Seok, M. N. Mindrinos, A. C. Schweitzer, H. Jiang, J. Wilhelmy, et al. (2011). “Human transcriptome array for high-throughput clinical studies”. Proceedings of the National Academy of Sciences 108.9, pp. 3707-3712. https://dx.doi.org/10.1073/pnas.1019753108.
- Xiao, W., M. N. Mindrinos, J. Seok, J. Cuschieri, A. G. Cuenca, H. Gao, et al. (2011). “A genomic storm in critically injured humans”. The Journal of Experimental Medicine 208.13, pp. 2581-2590. https://dx.doi.org/10.1084/jem.20111354.
- Storey, J. D. (2011). “False discovery rate”. In: International Encyclopedia of Statistical Science. Ed. by M. Lovric. Springer Nature, pp. 504-508. https://dx.doi.org/10.1007/978-3-642-04898-2_248.
- Leek, J. T. and J. D. Storey (2011). “The joint null criterion for multiple hypothesis tests”. Statistical Applications in Genetics and Molecular Biology 10.1, pp. 2361-2373. https://dx.doi.org/10.2202/1544-6115.1673.
- Kanodia, J. S., Y. Kim, R. Tomer, Z. Khan, K. Chung, J. D. Storey, et al. (2011). “A computational statistics approach for estimating the spatial range of morphogen gradients”. Development 138.22, pp. 4867-4874. https://dx.doi.org/10.1242/dev.071571.
- Desai, K. H., C. S. Tan, J. T. Leek, R. V. Maier, R. G. Tompkins, and J. D. Storey (2011). “Dissecting inflammatory complications in critically injured patients by within-patient gene expression changes: A longitudinal clinical genomics study”. PLoS Medicine 8.9, p. e1001093. https://dx.doi.org/10.1371/journal.pmed.1001093.
- Woo, S., J. T. Leek, and J. D. Storey (2010). “A computationally efficient modular optimal discovery procedure”. Bioinformatics 27.4, pp. 509-515. https://dx.doi.org/10.1093/bioinformatics/btq701.
- Gresham, D., V. M. Boer, A. Caudy, N. Ziv, N. J. Brandt, J. D. Storey, et al. (2010). “System-level analysis of genes and functions affecting survival during nutrient starvation in Saccharomyces cerevisiae”. Genetics 187.1, pp. 299-317. https://dx.doi.org/10.1534/genetics.110.120766.
- Mecham, B. H., P. S. Nelson, and J. D. Storey (2010). “Supervised normalization of microarrays”. Bioinformatics 26.10, pp. 1308-1315. https://dx.doi.org/10.1093/bioinformatics/btq118.
- Kruglyak, L. and J. D. Storey (2009). “Cause and express”. Nature Biotechnology 27.6, pp. 544-545. https://dx.doi.org/10.1038/nbt0609-544.
- Käll, L., J. D. Storey, and W. S. Noble (2009). “QVALITY: Non-parametric estimation of q-values and posterior error probabilities”. Bioinformatics 25.7, pp. 964-966. https://dx.doi.org/10.1093/bioinformatics/btp021.
- Schadt, E. E., C. Molony, E. Chudin, K. Hao, X. Yang, P. Y. Lum, et al. (2008). “Mapping the genetic architecture of gene expression in human liver”. PLoS Biology 6.5. Ed. by G. Abecassis, p. e107. https://dx.doi.org/10.1371/journal.pbio.0060107.
- Leek, J. T. and J. D. Storey (2008). “A general framework for multiple testing dependence”. Proceedings of the National Academy of Sciences 105.48, pp. 18718-18723. https://dx.doi.org/10.1073/pnas.0808709105.
- Käll, L., J. D. Storey, and W. S. Noble (2008). “Non-parametric estimation of posterior error probabilities associated with peptides identified by tandem mass spectrometry”. Bioinformatics 24.16, pp. i42-i48. https://dx.doi.org/10.1093/bioinformatics/btn294.
- Käll, L., J. D. Storey, M. J. MacCoss, and W. S. Noble (2008). “Posterior error probabilities and false discovery rates: two sides of the same coin”. Journal of Proteome Research 7.1, pp. 40-44. https://dx.doi.org/10.1021/pr700739d.
- Käll, L., J. D. Storey, M. J. MacCoss, and W. S. Noble (2008). “Assigning significance to peptides identified by tandem mass spectrometry using decoy databases”. Journal of Proteome Research 7.1, pp. 29-34. https://dx.doi.org/10.1021/pr700600n.
- Idaghdour, Y., J. D. Storey, S. J. Jadallah, and G. Gibson (2008). “A genome-wide gene expression signature of environmental geography in leukocytes of Moroccan Amazighs”. PLoS Genetics 4.4. Ed. by G. Barsh, p. e1000052. https://dx.doi.org/10.1371/journal.pgen.1000052.
- Hao, K., E. E. Schadt, and J. D. Storey (2008). “Calibrating the performance of SNP arrays for whole-genome association studies”. PLoS Genetics 4.6. Ed. by G. R. Abecasis, p. e1000109. https://dx.doi.org/10.1371/journal.pgen.1000109.
- Chen, L. S. and J. D. Storey (2008). “Eigen-$R^2$ for dissecting variation in high-dimensional studies”. Bioinformatics 24.19, pp. 2260-2262. https://dx.doi.org/10.1093/bioinformatics/btn411.
- Biswas, S., J. D. Storey, and J. M. Akey (2008). “Mapping gene expression quantitative trait loci by singular value decomposition and independent component analysis”. BMC Bioinformatics 9.1, p. 244. https://dx.doi.org/10.1186/1471-2105-9-244.
- Storey, J. D., J. Madeoy, J. L. Strout, M. Wurfel, J. Ronald, and J. M. Akey (2007). “Gene-Expression Variation Within and Among Human Populations”. The American Journal of Human Genetics 80.3, pp. 502-509. https://dx.doi.org/10.1086/512017.
- Storey, J. D. (2007). “The optimal discovery procedure: a new approach to simultaneous significance testing”. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 69.3, pp. 347-368. https://dx.doi.org/10.1111/j.1467-9868.2007.005592.x.
- Leek, J. T. and J. D. Storey (2007). “Capturing heterogeneity in gene expression studies by surrogate variable analysis”. PLoS Genetics 3.9, p. e161. https://dx.doi.org/10.1371/journal.pgen.0030161.
- Dabney, A. R. and J. D. Storey (2007). “Optimality driven nearest centroid classification from genomic data”. PLoS ONE 2.10. Ed. by J. Zhu, p. e1002. https://dx.doi.org/10.1371/journal.pone.0001002.
- Dabney, A. R. and J. D. Storey (2007). “Normalization of two-channel microarrays accounting for experimental design and intensity-dependent relationships”. Genome Biology 8.3, p. R44. https://dx.doi.org/10.1186/gb-2007-8-3-r44.
- Chen, L. S., F. Emmert-Streib, and J. D. Storey (2007). “Harnessing naturally randomized transcription to infer regulatory relationships among genes”. Genome Biology 8.10, p. R219. https://dx.doi.org/10.1186/gb-2007-8-10-r219.
- Akey, J. M., S. Biswas, J. T. Leek, and J. D. Storey (2007). “On the design and analysis of gene expression studies in human populations”. Nature Genetics 39.7, pp. 807-808. https://dx.doi.org/10.1038/ng0707-807.
- Storey, J. D., J. Y. Dai, and J. T. Leek (2006). “The optimal discovery procedure for large-scale significance testing, with applications to comparative microarray experiments”. Biostatistics 8.2, pp. 414-432. https://dx.doi.org/10.1093/biostatistics/kxl019.
- Dabney, A. R. and J. D. Storey (2006). “A new approach to intensity-dependent normalization of two-channel microarrays”. Biostatistics 8.1, pp. 128-139. https://dx.doi.org/10.1093/biostatistics/kxj038.
- Dabney, A. R. and J. D. Storey (2006). “A reanalysis of a published Affymetrix GeneChip control dataset”. Genome Biology 7.3, p. 401. https://dx.doi.org/10.1186/gb-2006-7-3-401.
- Chen, L. (2006). “Relaxed significance criteria for linkage analysis”. Genetics 173.4, pp. 2371-2381. https://dx.doi.org/10.1534/genetics.105.052506.
- Leek, J. T., E. Monsen, A. R. Dabney, and J. D. Storey (2005). “Edge: extraction and analysis of differential gene expression”. Bioinformatics 22.4, pp. 507-508. https://dx.doi.org/10.1093/bioinformatics/btk005.
- Storey, J. D., W. Xiao, J. T. Leek, R. G. Tompkins, and R. W. Davis (2005). “Significance analysis of time course microarray experiments”. Proceedings of the National Academy of Sciences 102.36, pp. 12837-12842. https://dx.doi.org/10.1073/pnas.0504609102.
- Storey, J. D., J. M. Akey, and L. Kruglyak (2005). “Multiple locus linkage analysis of genomewide expression in yeast”. PLoS Biology 3.8, p. e267. https://dx.doi.org/10.1371/journal.pbio.0030267.
- Brem, R. B., J. D. Storey, J. Whittle, and L. Kruglyak (2005). “Genetic interactions between polymorphisms that affect gene expression in yeast”. Nature 436.7051, pp. 701-703. https://dx.doi.org/10.1038/nature03865.
- Storey, J. D., J. E. Taylor, and D. Siegmund (2004). “Strong control, conservative point estimation and simultaneous conservative consistency of false discovery rates: A unified approach”. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 66.1, pp. 187-205. https://dx.doi.org/10.1111/j.1467-9868.2004.00439.x.
- Vaszar, L. T., T. Nishimura, J. D. Storey, G. Zhao, D. Qiu, J. L. Faul, et al. (2004). “Longitudinal transcriptional analysis of developing neointimal vascular occlusion and pulmonary hypertension in rats”. Physiological Genomics 17.2, pp. 150-156. https://dx.doi.org/10.1152/physiolgenomics.00198.2003.
- Storey, J. D. and R. Tibshirani (2003). “Statistical significance for genomewide studies”. Proceedings of the National Academy of Sciences 100.16, pp. 9440-9445. https://dx.doi.org/10.1073/pnas.1530509100.
- Storey, J. D. and R. Tibshirani (2003). “Statistical methods for identifying differentially expressed genes in DNA microarrays”. In: Functional Genomics: Methods and Protocols. Ed. by M. Kaufmann and C. Klinger. Springer Nature, pp. 149-158. https://dx.doi.org/10.1385/1-59259-364-X:149.
- Storey, J. D. and R. Tibshirani (2003). “SAM thresholding and false discovery rates for detecting differential gene expression in DNA microarrays”. In: The Analysis of Gene Expression Data: Methods and Software. Ed. by G. Parmigiani, E. S. Garrett, R. A. Irizarry and S. L. Zeger. Springer New York, pp. 272-290. https://dx.doi.org/10.1007/b97411.
- Storey, J. D. (2003). “The positive false discovery rate: A Bayesian interpretation and the q-value”. Annals of Statistics 31.6, pp. 2013-2035. https://dx.doi.org/10.1214/aos/1074290335.
- Arava, Y., Y. Wang, J. D. Storey, C. L. Liu, P. O. Brown, and D. Herschlag (2003). “Genome-wide analysis of mRNA translation profiles in Saccharomyces cerevisiae”. Proceedings of the National Academy of Sciences 100.7, pp. 3889-3894. https://dx.doi.org/10.1073/pnas.0635171100.
- Wang, Y., C. L. Liu, J. D. Storey, R. J. Tibshirani, D. Herschlag, and P. O. Brown (2002). “Precision and functional specificity in mRNA decay”. Proceedings of the National Academy of Sciences 99.9, pp. 5860-5865. https://dx.doi.org/10.1073/pnas.092538799.
- Storey, J. D. (2002). “False discovery rates: Theory and applications to DNA microarrays”. PhD Dissertation. Stanford University. https://searchworks.stanford.edu/view/5417184.
- Clement, K. (2002). “In vivo regulation of human skeletal muscle gene expression by thyroid hormone”. Genome Research 12.2, pp. 281-291. https://dx.doi.org/10.1101/gr.207702.
- Storey, J. D. (2002). “A direct approach to false discovery rates”. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 64.3, pp. 479-498. https://dx.doi.org/10.1111/1467-9868.00346.
- Efron, B., R. Tibshirani, J. D. Storey, and V. Tusher (2001). “Empirical Bayes analysis of a microarray experiment”. Journal of the American Statistical Association 96.456, pp. 1151-1160. https://dx.doi.org/10.1198/016214501753382129.
- Storey, J. D. and R. Tibshirani (2001). “Estimating the positive false discovery rate under dependence, with applications to DNA microarrays”. Technical Report 2001-28. Department of Statistics, Stanford University. http://storeylab.org/doc/2001-28.pdf.
- Efron, B., J. D. Storey, and R. Tibshirani (2001). “Microarrays, empirical Bayes methods, and false discovery rates”. Technical Report Bio-217. Department of Statistics, Stanford University. http://storeylab.org/doc/BIO217.pdf.
- Storey, J. D. (2001). “A new approach to false discovery rates and multiple hypothesis testing”. Technical Report 2001-18. Department of Statistics, Stanford University. http://storeylab.org/doc/2001-18.pdf.
- Efron, B., R. Tibshirani, J. D. Storey, and V. Tusher (2001). “Empirical Bayes analysis of a microarray experiment”. Technical Report Bio-216. Department of Statistics, Stanford University. http://storeylab.org/doc/BIO216.pdf.
- Storey, J. D. (2001). “The false discovery rate: A Bayesian interpretation and the q-value”. Technical Report 2001-12. Department of Statistics, Stanford University. http://storeylab.org/doc/2001-12.pdf.
- Gilbert, C. L., J. D. Kolesar, C. A. Reiter, and J. D. Storey (2001). “Function digraphs of quadratic maps modulo p”. Fibonacci Quarterly 39.1, pp. 32-49.
- Storey, J. D. and D. Siegmund (2001). “Approximate p-values for local sequence alignments: Numerical studies”. Journal of Computational Biology 8.5, pp. 549-556. https://dx.doi.org/10.1089/106652701753216530.
Lab Members
John D. Storey
John is the William R. Harman ’63 and Mary-Love Harman Professor in Genomics, primarily appointed in the Lewis-Sigler Institute for Integrative Genomics at Princeton University. He has associated faculty appointments in Applied and Computational Mathematics, the Center for Statistics and Machine Learning, Computer Science, Molecular Biology, Operations Research and Financial Engineering, and the Princeton Institute for Computational Science and Engineering (PICSciE). He is the director (with Josh Akey) of the NHGRI Quantitative and Computational Biology Graduate Training Program at Princeton University. For more information, check out a brief biography, his personal website, and his GitHub, Google Scholar, and LinkedIn profiles.
Danfeng Chen
Danfeng is a PhD student in Quantitative and Computational Biology at Princeton University. She received her BS in Biological Sciences from Fudan University and her MS in Computational Biology and Quantitative Genetics from Harvard University.
Danfeng’s research is focused on new theory and methods for heritability in genome-wide association studies, new estimation methods for coancestry in structured populations, and developing surrogate variable analysis for sequencing-based molecular profiling, such as RNA-seq. For more information, check out Danfeng’s GitHub profile.
Olivia Harringmeyer
Olivia is a Lewis-Sigler Scholar at Princeton University working with the Akey and Storey labs. Olivia received her undergraduate degree in mathematics from Williams College. She then received a PhD in evolutionary biology from Harvard University. She studies structural genomic variation, focusing on chromosomal inversions and their role in evolution. For more information, check out her Google Scholar profile.
Yushi Tang
Yushi is a PhD student in Quantitative and Computational Biology at Princeton University. He received his BS from Peking University, double-majoring in Environmental Statistics and Economics, and his MS in Computational Biology and Quantitative Genetics from Harvard University.
Yushi is working primarily in two areas. The first is in understanding the role of ancestral population as a conditional variable when performing genome-wide association studies and heritability estimation. The second is developing theory and methods for causal inference in population-based genome-wide genetics studies. For more information, check out his personal website and his GitHub, Google Scholar, and LinkedIn profiles.
Alvin Zhang
Alvin is a PhD student in Quantitative and Computational Biology at Princeton University. He received his BS from the University of Chicago, double-majoring in Biology and Statistics.
Alvin is interested in multivariate models of genetic variation that improve our ability to genetically dissect complex traits and to form statistically interpretable polygenic risk scores.
Lab Alumni
Name | Recent Position |
---|---|
Andrew Bass | Postdoctoral Associate, Emory University |
Carles Boix | Postdoctoral Associate, MIT |
Irineo Cabreros | Staff Editor of Statistical Modeling at The New York Times |
Lin Chen | Associate Professor, University of Chicago |
Xiongzhi Chen | Assistant Professor, Washington State University |
Neo Chung | Assistant Professor, University of Warsaw, Poland |
Alan Dabney | Associate Professor, Associate Department Head for Teaching Excellence, Texas A&M University |
Keyur Desai | Associate Director, Translational Medicine Statistics, Bristol-Myers Squibb |
Frank Emmert-Streib | Professor, Tampere University of Technology, Finland |
Sean Hackett | Director of Data Science, Calico |
Wei Hao | Machine Learning Engineer, Sisu |
Yu-Han Hsu | Computational Scientist, Broad Institute |
Jeffrey Leek | Chief Data Officer, Vice President, and J. Orin Edson Foundation Chair of Biostatistics, Fred Hutchinson Cancer Center |
Troels Marstrand | Founder and CTO, Revea |
Trevor Martin | Founder and CEO, Mammoth Biosciences |
Brigham Mecham | Founder and CEO, Trialomics |
Emily Nelson | Senior Consultant, Headstorm |
Alejandro Ochoa | Assistant Professor, Duke University |
Narayanan Raghupathy | Principal Scientist, Bristol-Myers Squibb |
David Robinson | Director of Data Science, Heap |
Dipen Sangurdekar | Vice President, Head of Data Science, KSQ Therapeutics |
Lincoln Smith | Founder, Analytic Managed Services |
Minsun Song | Associate Professor, Sookmyung Women’s University, South Korea |
Chuen-Seng Tan | Associate Professor, National University of Singapore |
Hannah Steele | Founding Partner and Portfolio Manager, Lorica Asset Management |
Mayisha Sultana | Commercial Strategist, Wellinks |
Sarah Urbut | Cardiology Fellow, MGH and Broad Institute |
Sangsoon Woo | Senior Biostatistician, NanoString |
Software
biobroom. This R package converts standard objects constructed by bioinformatics packages, especially those in Bioconductor, to the tidy data format. Bioconductor / GitHub
bnpsd. This R package simulates admixed populations via simulated allele frequencies and genotypes from the BN-PSD (“Balding-Nichols Pritchard-Stephens-Donnelly”) admixture model. This model enables the simulation of complex population structures, ideal for illustrating challenges in kinship coefficient and FST estimation. CRAN / GitHub
dnamix. This Fortran program calculates likelihood ratios as they pertain to mixed DNA samples encountered in forensic science. GitHub
edge. This R package implements methods for carrying out differential expression analyses of genome-wide gene expression studies. Bioconductor / GitHub
eigenR2. This R package calculates a high-dimensional version of the classic $R^2$ statistic. GitHub
gcatest. This R package implements the genotype-conditional association test (GCATest), which is an association test for genome wide association studies that controls for population structure under a general class of trait models. Bioconductor / GitHub
jackstraw. This R package calculates a significance test for association between variables and their estimated latent variables. Latent variables may be estimated by principal component analysis (PCA), logistic factor analysis (LFA), and other techniques. CRAN / GitHub
lfa. This R package implements logistic factor analysis, which is a PCA analogue on Binomial data via estimation of latent structure in the natural parameter space. Bioconductor / GitHub
popkin. The popkin (“population kinship”) R package estimates the kinship matrix of individuals and FST from their biallelic genotypes. Our estimation framework is the first to be practically unbiased under arbitrary population structures. CRAN / GitHub
qvalue. This R package takes a list of p-values resulting from the simultaneous testing of many hypotheses and estimates their q-values and local FDR values. Bioconductor / GitHub
snm. This R package performs a supervised normalization of microarray data that takes into account the study design. Bioconductor
subSeq. This R package performs subsampling of high throughput sequencing count data for use in experimental design and analysis. Bioconductor / GitHub
sva. This R package performs surrogate variable analysis to account for sources of systematic variation not included in the study design model. Bioconductor / GitHub
terastructure. This C++ program fits the admixture model on tera-sample-sized data sets (~$10^{12}$ observed genotypes). This package provides a scalable, multi-threaded implementation that can be run on a single computer. GitHub
trigger. This R package provides tools for the statistical analysis of genetics of gene expression data. Bioconductor / GitHub
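As a rough illustration of the kind of calculation the qvalue entry above describes (p-values in, q-values out), here is a minimal Python sketch. It is not the package's code or interface: the function names `estimate_pi0` and `qvalues` and the fixed threshold `lam=0.5` are assumptions made for this example, and the sketch leaves out the package's tuning-parameter selection and local FDR estimates.

```python
def estimate_pi0(pvalues, lam=0.5):
    """Estimate the proportion of true null hypotheses.

    Uses the fraction of p-values above a fixed threshold `lam`
    (a hypothetical default chosen for this sketch).
    """
    m = len(pvalues)
    above = sum(1 for p in pvalues if p > lam)
    return min(1.0, above / (m * (1.0 - lam)))


def qvalues(pvalues, lam=0.5):
    """Return q-value estimates in the same order as the input p-values."""
    m = len(pvalues)
    pi0 = estimate_pi0(pvalues, lam)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, so each q-value is the minimum
    # estimated FDR over all thresholds at or above that p-value.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        fdr = pi0 * m * pvalues[i] / rank
        running_min = min(running_min, fdr)
        q[i] = running_min
    return q


example_p = [0.001, 0.01, 0.02, 0.04, 0.3, 0.6, 0.8, 0.9]
example_pi0 = estimate_pi0(example_p)
example_q = qvalues(example_p)
```

Calling a test significant when its q-value is at most 0.05 would select the first three hypotheses in `example_p`. If no p-value exceeds `lam`, this sketch estimates the null proportion as zero and every q-value collapses to zero, which is one reason a fixed threshold is only a starting point.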
Contact & Links
Contact
Carl Icahn Labs
Princeton University
Princeton NJ 08544, USA
JDS Email
Faculty Assistant: Lee Morgan
Phone: +1.609.258.0859
Email: lmorgan@princeton.edu
Office: 257 Carl Icahn Labs
Links
Storey Lab GitHub
JDS on Google Scholar
JDS on ORCID (may be out of date)