Research

Images of published research figures

The Storey Lab’s current research efforts are in the following areas:

Extending fundamental population genetics models and quantities — such as allele frequencies, Hardy-Weinberg equilibrium, F_ST, admixture, ancestry, and kinship — to genome-wide genotype data making minimal assumptions about population structure
Improving the accuracy and applicability of the polygenic trait model in population-based studies for determining the genetic basis of complex traits, including association studies, genome-wide heritability, and polygenic risk scores
Developing causal inference methods and study designs for population-based genetics studies of complex traits
Latent variable decompositions of high-dimensional data, with an emphasis on admixture, population structure, and exponential family distributions
Latent variable modeling in the context of high-dimensional significance testing — such as surrogate variable analysis and cross-dimensional inference
Significance testing on high-dimensional regression models
Models and estimators for false discovery rates and q-values

Publications

Tang, Y., I. Cabreros, and J. D. Storey (2026). “Identifying causal genotype-phenotype relationships for population-sampled parent-child trios”. Genetic Epidemiology 50: e70027. https://doi.org/10.1002/gepi.70027
Chen, D. and J. D. Storey (2025). “Coancestry superposed on admixed populations yields measures of relatedness at individual-level resolution”. PLoS Computational Biology 21(12): e1013848. https://doi.org/10.1371/journal.pcbi.1013848
Tang, Y. and J. D. Storey (2025). “A generalized test of genotype-phenotype causality in population-sampled nuclear families”. bioRxiv doi:10.64898/2025.12.29.696865. https://doi.org/10.64898/2025.12.29.696865
Storey, J. D. (2025). “False discovery rate”. In: International Encyclopedia of Statistical Science, 2nd ed. Ed. by M. Lovric. Springer Berlin, Heidelberg. https://link.springer.com/rwe/10.1007/978-3-662-69359-9_229
Chen, D. and J. D. Storey (2024). “Coancestry superposed on admixed populations yields measures of relatedness at individual-level resolution”. bioRxiv doi:10.1101/2024.12.29.630632. https://doi.org/10.1101/2024.12.29.630632
Tang, Y., I. Cabreros, and J. D. Storey (2024). “Identifying causal genotype-phenotype relationships for population-sampled parent-child trios”. bioRxiv doi:10.1101/2024.12.10.627752. https://doi.org/10.1101/2024.12.10.627752.
Ochoa, A. and J. D. Storey (2021). “Estimating F_ST and kinship for arbitrary population structures”. PLoS Genetics 17.1: e1009241. https://doi.org/10.1371/journal.pgen.1009241.
Graim, K., et al. (2021). “Modeling molecular development of breast cancer in canine mammary tumors”. Genome Research 31: 337-347. https://doi.org/10.1101/gr.256388.119.
Bass, A. J. and J. D. Storey (2021). “The optimal discovery procedure for significance analysis of general gene expression studies”. Bioinformatics 37.3, pp. 367-374. https://doi.org/10.1093/bioinformatics/btaa707.
Chen, X., D. G. Robinson, and J. D. Storey (2021). “The functional false discovery rate with applications to genomics”. Biostatistics 22.1, pp. 68-81. https://dx.doi.org/10.1093/biostatistics/kxz010.
Bass, A. J., D. G. Robinson, and J. D. Storey (2019). “Determining sufficient sequencing depth in RNA-Seq differential expression studies”. bioRxiv doi:10.1101/635623. https://dx.doi.org/10.1101/635623.
Bass, A. J. and J. D. Storey (2019). “The optimal discovery procedure for significance analysis of general gene expression studies”. bioRxiv doi:10.1101/571992. https://dx.doi.org/10.1101/571992.
Ochoa, A. and J. D. Storey (2019). “New kinship and F_ST estimates reveal higher levels of differentiation in the global human population”. bioRxiv doi:10.1101/653279. https://dx.doi.org/10.1101/653279.
Ochoa, A. and J. D. Storey (2019). “F_ST and kinship for arbitrary population structures II: Method-of-moments estimators”. bioRxiv doi:10.1101/083923. https://dx.doi.org/10.1101/083923.
Ochoa, A. and J. D. Storey (2019). “F_ST and kinship for arbitrary population structures I: Generalized definitions”. bioRxiv doi:10.1101/083915. https://dx.doi.org/10.1101/083915.
Cabreros, I. and J. D. Storey (2019). “A likelihood-free estimator of population structure bridging admixture models and principal components analysis”. Genetics 212.4, pp. 1009-1029. https://dx.doi.org/10.1534/genetics.119.302159.
Hao, W. and J. D. Storey (2019). “Extending tests of Hardy-Weinberg equilibrium to structured populations”. Genetics 213.3, pp. 759-770. https://dx.doi.org/10.1534/genetics.119.302370.
Cabreros, I. and J. D. Storey (2019). “Causal models on probability spaces”. arXiv 1907.01672. https://arxiv.org/abs/1907.01672.
Cabreros, I. and J. D. Storey (2019). “A likelihood-free estimator of population structure bridging admixture models and principal components analysis”. bioRxiv doi:10.1101/240812. https://dx.doi.org/10.1101/240812.
Painter, H. J., N. C. Chung, A. Sebastian, I. Albert, J. D. Storey, and M. Llinás (2018). “Genome-wide real-time in vivo transcriptional dynamics during Plasmodium falciparum blood-stage development”. Nature Communications 9.1, p. 2656. http://dx.doi.org/10.1038/s41467-018-04966-3.
Painter, H. J., N. C. Chung, A. Sebastian, I. Albert, J. D. Storey, and M. Llinás (2018). “Genome-wide real-time in vivo transcriptional dynamics during Plasmodium falciparum blood-stage development”. bioRxiv doi:10.1101/265975. http://dx.doi.org/10.1101/265975.
Hao, W. and J. D. Storey (2017). “Extending tests of Hardy-Weinberg equilibrium to structured populations”. bioRxiv doi:10.1101/240804. https://dx.doi.org/10.1101/240804.
Cabreros, I. and J. D. Storey (2017). “A nonparametric estimator of population structure unifying admixture models and principal components analysis”. bioRxiv doi:10.1101/240812. https://dx.doi.org/10.1101/240812.
Chen, X., D. G. Robinson, and J. D. Storey (2017). “The functional false discovery rate with applications to genomics”. bioRxiv doi:10.1101/241133. https://dx.doi.org/10.1101/241133.
Hackett, S. R. and J. D. Storey (2017). “Mixed membership martial arts: Data-driven analysis of winning martial arts styles”. MIT Sloan Sports Analytics Conference. http://www.sloansportsconference.com/wp-content/uploads/2017/02/1575.pdf.
Ochoa, A. and J. D. Storey (2016). “F_ST and kinship for arbitrary population structures II: Method of moments estimators”. bioRxiv doi:10.1101/083923. https://dx.doi.org/10.1101/083923.
Ochoa, A. and J. D. Storey (2016). “F_ST and kinship for arbitrary population structures I: Generalized definitions”. bioRxiv doi:10.1101/083915. https://dx.doi.org/10.1101/083915.
Gopalan, P., W. Hao, D. M. Blei, and J. D. Storey (2016). “Scaling probabilistic models of genetic variation to millions of humans”. Nature Genetics 48.12, pp. 1587-1590. https://dx.doi.org/10.1038/ng.3710.
Hackett, S. R., V. R. T. Zanotelli, W. Xu, J. Goya, J. O. Park, D. H. Perlman, et al. (2016). “Systems-level analysis of mechanisms regulating yeast metabolic flux”. Science 354.6311, pp. aaf2786-aaf2786. https://dx.doi.org/10.1126/science.aaf2786.
Hao, W., M. Song, and J. D. Storey (2015). “Probabilistic models of genetic variation in structured populations applied to global human studies”. Bioinformatics 32.5, pp. 713-721. https://dx.doi.org/10.1093/bioinformatics/btv641.
Song, M., W. Hao, and J. D. Storey (2015). “Testing for genetic associations in arbitrarily structured populations”. Nature Genetics 47.5, pp. 550-554. https://dx.doi.org/10.1038/ng.3244.
Robinson, D. G., J. Y. Wang, and J. D. Storey (2015). “A nested parallel experiment demonstrates differences in intensity-dependence between RNA-seq and microarrays”. Nucleic Acids Research, p. gkv636. https://dx.doi.org/10.1093/nar/gkv636.
Ochoa, A., J. D. Storey, M. Llinás, and M. Singh (2015). “Beyond the E-Value: Stratified statistics for protein domain prediction”. PLOS Computational Biology 11.11, p. e1004509. https://dx.doi.org/10.1371/journal.pcbi.1004509.
Gopalan, P., W. Hao, D. M. Blei, and J. D. Storey (2015). “Scaling probabilistic models of genetic variation to millions of humans”. bioRxiv doi:10.1101/013227. https://dx.doi.org/10.1101/013227.
Chen, X. and J. D. Storey (2015). “Consistent estimation of low-dimensional latent structure in high-dimensional data”. arXiv 1510.03497. https://arxiv.org/abs/1510.03497.
Song, M., W. Hao, and J. D. Storey (2015). “Testing for genetic associations in arbitrarily structured populations”. bioRxiv doi:10.1101/012682. https://dx.doi.org/10.1101/012682.
Robinson, D. G. and J. D. Storey (2014). “subSeq: Determining appropriate sequencing depth through efficient read subsampling”. Bioinformatics 30.23, pp. 3424-3426. https://dx.doi.org/10.1093/bioinformatics/btu552.
Chung, N. C. and J. D. Storey (2014). “Statistical significance of variables driving systematic variation in high-dimensional data”. Bioinformatics 31.4, pp. 545-554. https://dx.doi.org/10.1093/bioinformatics/btu674.
Marstrand, T. T. and J. D. Storey (2014). “Identifying and mapping cell-type-specific chromatin programming of gene expression”. Proceedings of the National Academy of Sciences 111.6, pp. E645-E654. https://dx.doi.org/10.1073/pnas.1312523111.
Kim, J., N. Ghasemzadeh, D. J. Eapen, N. Chung, J. D. Storey, A. A. Quyyumi, et al. (2014). “Gene expression profiles associated with acute myocardial infarction and risk of cardiovascular death”. Genome Medicine 6.5, p. 40. https://dx.doi.org/10.1186/gm560.
Ochoa, A., J. D. Storey, M. Llinás, and M. Singh (2014). “Beyond the E-value: Stratified statistics for protein domain prediction”. arXiv 1409.6384. https://arxiv.org/abs/1409.6384.
Robinson, D. G., J. Wang, and J. D. Storey (2014). “A nested parallel experiment demonstrates differences in intensity-dependence between RNA-seq and microarrays”. bioRxiv doi:10.1101/013342. https://dx.doi.org/10.1101/013342.
Robinson, D. G., W. Chen, J. D. Storey, and D. Gresham (2013). “Design and analysis of bar-seq experiments”. G3: Genes|Genomes|Genetics 4.1, pp. 11-18. https://dx.doi.org/10.1534/g3.113.008565.
Jaffe, A. E., J. D. Storey, H. Ji, and J. T. Leek (2013). “Gene set bagging for estimating the probability a statistically significant result will replicate”. BMC Bioinformatics 14.1, p. 360. https://dx.doi.org/10.1186/1471-2105-14-360.
Jaffe, A. E., J. D. Storey, H. Ji, and J. T. Leek (2013). “Gene set bagging for estimating replicability of gene set analyses”. arXiv 1301.3933. https://arxiv.org/abs/1301.3933.
Hao, W., M. Song, and J. D. Storey (2013). “Probabilistic models of genetic variation in structured populations applied to global human studies”. arXiv 1312.2041. https://arxiv.org/abs/1312.2041.
Chung, N. C. and J. D. Storey (2013). “Statistical significance of variables driving systematic variation”. arXiv 1308.6013. https://arxiv.org/abs/1308.6013.
Leek, J. T., W. E. Johnson, H. S. Parker, A. E. Jaffe, and J. D. Storey (2012). “The sva package for removing batch effects and other unwanted variation in high-throughput experiments”. Bioinformatics 28.6, pp. 882-883. https://dx.doi.org/10.1093/bioinformatics/bts034.
Desai, K. H. and J. D. Storey (2012). “Cross-dimensional inference of dependent high-dimensional data”. Journal of the American Statistical Association 107.497, pp. 135-151. https://dx.doi.org/10.1080/01621459.2011.645777.
Marstrand, T. T. and J. D. Storey (2011). “Identifying and mapping cell-type specific chromatin programming of gene expression”. arXiv 1210.3313. https://arxiv.org/abs/1210.3313.
Xu, W., J. Seok, M. N. Mindrinos, A. C. Schweitzer, H. Jiang, J. Wilhelmy, et al. (2011). “Human transcriptome array for high-throughput clinical studies”. Proceedings of the National Academy of Sciences 108.9, pp. 3707-3712. https://dx.doi.org/10.1073/pnas.1019753108.
Xiao, W., M. N. Mindrinos, J. Seok, J. Cuschieri, A. G. Cuenca, H. Gao, et al. (2011). “A genomic storm in critically injured humans”. The Journal of Experimental Medicine 208.13, pp. 2581-2590. https://dx.doi.org/10.1084/jem.20111354.
Storey, J. D. (2011). “False discovery rate”. In: International Encyclopedia of Statistical Science. Ed. by M. Lovric. Springer Nature, pp. 504-508. https://dx.doi.org/10.1007/978-3-642-04898-2_248.
Leek, J. T. and J. D. Storey (2011). “The joint null criterion for multiple hypothesis tests”. Statistical Applications in Genetics and Molecular Biology 10.1, pp. 2361-2373. https://dx.doi.org/10.2202/1544-6115.1673.
Kanodia, J. S., Y. Kim, R. Tomer, Z. Khan, K. Chung, J. D. Storey, et al. (2011). “A computational statistics approach for estimating the spatial range of morphogen gradients”. Development 138.22, pp. 4867-4874. https://dx.doi.org/10.1242/dev.071571.
Desai, K. H., C. S. Tan, J. T. Leek, R. V. Maier, R. G. Tompkins, and J. D. Storey (2011). “Dissecting inflammatory complications in critically injured patients by within-patient gene expression changes: A longitudinal clinical genomics study”. PLoS Medicine 8.9, p. e1001093. https://dx.doi.org/10.1371/journal.pmed.1001093.
Woo, S., J. T. Leek, and J. D. Storey (2010). “A computationally efficient modular optimal discovery procedure”. Bioinformatics 27.4, pp. 509-515. https://dx.doi.org/10.1093/bioinformatics/btq701.
Gresham, D., V. M. Boer, A. Caudy, N. Ziv, N. J. Brandt, J. D. Storey, et al. (2010). “System-level analysis of genes and functions affecting survival during nutrient starvation in Saccharomyces cerevisiae”. Genetics 187.1, pp. 299-317. https://dx.doi.org/10.1534/genetics.110.120766.
Mecham, B. H., P. S. Nelson, and J. D. Storey (2010). “Supervised normalization of microarrays”. Bioinformatics 26.10, pp. 1308-1315. https://dx.doi.org/10.1093/bioinformatics/btq118.
Kruglyak, L. and J. D. Storey (2009). “Cause and express”. Nature Biotechnology 27.6, pp. 544-545. https://dx.doi.org/10.1038/nbt0609-544.
Käll, L., J. D. Storey, and W. S. Noble (2009). “QVALITY: Non-parametric estimation of q-values and posterior error probabilities”. Bioinformatics 25.7, pp. 964-966. https://dx.doi.org/10.1093/bioinformatics/btp021.
Schadt, E. E., C. Molony, E. Chudin, K. Hao, X. Yang, P. Y. Lum, et al. (2008). “Mapping the genetic architecture of gene expression in human liver”. PLoS Biology 6.5. Ed. by G. Abecassis, p. e107. https://dx.doi.org/10.1371/journal.pbio.0060107.
Leek, J. T. and J. D. Storey (2008). “A general framework for multiple testing dependence”. Proceedings of the National Academy of Sciences 105.48, pp. 18718-18723. https://dx.doi.org/10.1073/pnas.0808709105.
Käll, L., J. D. Storey, and W. S. Noble (2008). “Non-parametric estimation of posterior error probabilities associated with peptides identified by tandem mass spectrometry”. Bioinformatics 24.16, pp. i42-i48. https://dx.doi.org/10.1093/bioinformatics/btn294.
Käll, L., J. D. Storey, M. J. MacCoss, and W. S. Noble (2008). “Posterior error probabilities and false discovery rates: two sides of the same coin”. Journal of Proteome Research 7.1, pp. 40-44. https://dx.doi.org/10.1021/pr700739d.
Käll, L., J. D. Storey, M. J. MacCoss, and W. S. Noble (2008). “Assigning significance to peptides identified by tandem mass spectrometry using decoy databases”. Journal of Proteome Research 7.1, pp. 29-34. https://dx.doi.org/10.1021/pr700600n.
Idaghdour, Y., J. D. Storey, S. J. Jadallah, and G. Gibson (2008). “A genome-wide gene expression signature of environmental geography in leukocytes of Moroccan Amazighs”. PLoS Genetics 4.4. Ed. by G. Barsh, p. e1000052. https://dx.doi.org/10.1371/journal.pgen.1000052.
Hao, K., E. E. Schadt, and J. D. Storey (2008). “Calibrating the performance of SNP arrays for whole-genome association studies”. PLoS Genetics 4.6. Ed. by G. R. Abecasis, p. e1000109. https://dx.doi.org/10.1371/journal.pgen.1000109.
Chen, L. S. and J. D. Storey (2008). “Eigen-$R^2$ for dissecting variation in high-dimensional studies”. Bioinformatics 24.19, pp. 2260-2262. https://dx.doi.org/10.1093/bioinformatics/btn411.
Biswas, S., J. D. Storey, and J. M. Akey (2008). “Mapping gene expression quantitative trait loci by singular value decomposition and independent component analysis”. BMC Bioinformatics 9.1, p. 244. https://dx.doi.org/10.1186/1471-2105-9-244.
Storey, J. D., J. Madeoy, J. L. Strout, M. Wurfel, J. Ronald, and J. M. Akey (2007). “Gene-Expression Variation Within and Among Human Populations”. The American Journal of Human Genetics 80.3, pp. 502-509. https://dx.doi.org/10.1086/512017.
Storey, J. D. (2007). “The optimal discovery procedure: a new approach to simultaneous significance testing”. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 69.3, pp. 347-368. https://dx.doi.org/10.1111/j.1467-9868.2007.005592.x.
Leek, J. T. and J. D. Storey (2007). “Capturing heterogeneity in gene expression studies by surrogate variable analysis”. PLoS Genetics 3.9, p. e161. https://dx.doi.org/10.1371/journal.pgen.0030161.
Dabney, A. R. and J. D. Storey (2007). “Optimality driven nearest centroid classification from genomic data”. PLoS ONE 2.10. Ed. by J. Zhu, p. e1002. https://dx.doi.org/10.1371/journal.pone.0001002.
Dabney, A. R. and J. D. Storey (2007). “Normalization of two-channel microarrays accounting for experimental design and intensity-dependent relationships”. Genome Biology 8.3, p. R44. https://dx.doi.org/10.1186/gb-2007-8-3-r44.
Chen, L. S., F. Emmert-Streib, and J. D. Storey (2007). “Harnessing naturally randomized transcription to infer regulatory relationships among genes”. Genome Biology 8.10, p. R219. https://dx.doi.org/10.1186/gb-2007-8-10-r219.
Akey, J. M., S. Biswas, J. T. Leek, and J. D. Storey (2007). “On the design and analysis of gene expression studies in human populations”. Nature Genetics 39.7, pp. 807-808. https://dx.doi.org/10.1038/ng0707-807.
Storey, J. D., J. Y. Dai, and J. T. Leek (2006). “The optimal discovery procedure for large-scale significance testing, with applications to comparative microarray experiments”. Biostatistics 8.2, pp. 414-432. https://dx.doi.org/10.1093/biostatistics/kxl019.
Dabney, A. R. and J. D. Storey (2006). “A new approach to intensity-dependent normalization of two-channel microarrays”. Biostatistics 8.1, pp. 128-139. https://dx.doi.org/10.1093/biostatistics/kxj038.
Dabney, A. R. and J. D. Storey (2006). “A reanalysis of a published Affymetrix GeneChip control dataset”. Genome Biology 7.3, p. 401. https://dx.doi.org/10.1186/gb-2006-7-3-401.
Chen, L. (2006). “Relaxed significance criteria for linkage analysis”. Genetics 173.4, pp. 2371-2381. https://dx.doi.org/10.1534/genetics.105.052506.
Leek, J. T., E. Monsen, A. R. Dabney, and J. D. Storey (2005). “Edge: extraction and analysis of differential gene expression”. Bioinformatics 22.4, pp. 507-508. https://dx.doi.org/10.1093/bioinformatics/btk005.
Storey, J. D., W. Xiao, J. T. Leek, R. G. Tompkins, and R. W. Davis (2005). “Significance analysis of time course microarray experiments”. Proceedings of the National Academy of Sciences 102.36, pp. 12837-12842. https://dx.doi.org/10.1073/pnas.0504609102.
Storey, J. D., J. M. Akey, and L. Kruglyak (2005). “Multiple locus linkage analysis of genomewide expression in yeast”. PLoS Biology 3.8, p. e267. https://dx.doi.org/10.1371/journal.pbio.0030267.
Brem, R. B., J. D. Storey, J. Whittle, and L. Kruglyak (2005). “Genetic interactions between polymorphisms that affect gene expression in yeast”. Nature 436.7051, pp. 701-703. https://dx.doi.org/10.1038/nature03865.
Storey, J. D., J. E. Taylor, and D. Siegmund (2004). “Strong control, conservative point estimation and simultaneous conservative consistency of false discovery rates: A unified approach”. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 66.1, pp. 187-205. https://dx.doi.org/10.1111/j.1467-9868.2004.00439.x.
Vaszar, L. T., T. Nishimura, J. D. Storey, G. Zhao, D. Qiu, J. L. Faul, et al. (2004). “Longitudinal transcriptional analysis of developing neointimal vascular occlusion and pulmonary hypertension in rats”. Physiological Genomics 17.2, pp. 150-156. https://dx.doi.org/10.1152/physiolgenomics.00198.2003.
Storey, J. D. and R. Tibshirani (2003). “Statistical significance for genomewide studies”. Proceedings of the National Academy of Sciences 100.16, pp. 9440-9445. https://dx.doi.org/10.1073/pnas.1530509100.
Storey, J. D. and R. Tibshirani (2003). “Statistical methods for identifying differentially expressed genes in DNA microarrays”. In: Functional Genomics: Methods and Protocols. Ed. by M. Kaufmann and C. Klinger. Springer Nature, pp. 149-158. https://dx.doi.org/10.1385/1-59259-364-X:149.
Storey, J. D. and R. Tibshirani (2003). “SAM thresholding and false discovery rates for detecting differential gene expression in DNA microarrays”. In: The Analysis of Gene Expression Data: Methods and Software. Ed. by G. Parmigiani, E. S. Garrett, R. A. Irizarry and S. L. Zeger. Springer New York, pp. 272-290. https://dx.doi.org/10.1007/b97411.
Storey, J. D. (2003). “The positive false discovery rate: A Bayesian interpretation and the q-value”. Annals of Statistics 31.6, pp. 2013-2035. https://dx.doi.org/10.1214/aos/1074290335.
Arava, Y., Y. Wang, J. D. Storey, C. L. Liu, P. O. Brown, and D. Herschlag (2003). “Genome-wide analysis of mRNA translation profiles in Saccharomyces cerevisiae”. Proceedings of the National Academy of Sciences 100.7, pp. 3889-3894. https://dx.doi.org/10.1073/pnas.0635171100.
Wang, Y., C. L. Liu, J. D. Storey, R. J. Tibshirani, D. Herschlag, and P. O. Brown (2002). “Precision and functional specificity in mRNA decay”. Proceedings of the National Academy of Sciences 99.9, pp. 5860-5865. https://dx.doi.org/10.1073/pnas.092538799.
Storey, J. D. (2002). “False discovery rates: Theory and applications to DNA microarrays”. PhD Dissertation. Stanford University. https://searchworks.stanford.edu/view/5417184.
Clement, K. (2002). “In vivo regulation of human skeletal muscle gene expression by thyroid hormone”. Genome Research 12.2, pp. 281-291. https://dx.doi.org/10.1101/gr.207702.
Storey, J. D. (2002). “A direct approach to false discovery rates”. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 64.3, pp. 479-498. https://dx.doi.org/10.1111/1467-9868.00346.
Efron, B., R. Tibshirani, J. D. Storey, and V. Tusher (2001). “Empirical Bayes analysis of a microarray experiment”. Journal of the American Statistical Association 96.456, pp. 1151-1160. https://dx.doi.org/10.1198/016214501753382129.
Storey, J. D. and R. Tibshirani (2001). “Estimating the positive false discovery rate under dependence, with applications to DNA microarrays”. Technical Report 2001-28. Department of Statistics, Stanford University. http://storeylab.org/doc/2001-28.pdf.
Efron, B., J. D. Storey, and R. Tibshirani (2001). “Microarrays, empirical Bayes methods, and false discovery rates”. Technical Report Bio-217. Department of Statistics, Stanford University. http://storeylab.org/doc/BIO217.pdf.
Storey, J. D. (2001). “A new approach to false discovery rates and multiple hypothesis testing”. Technical Report 2001-18. Department of Statistics, Stanford University. http://storeylab.org/doc/2001-18.pdf.
Efron, B., R. Tibshirani, J. D. Storey, and V. Tusher (2001). “Empirical Bayes analysis of a microarray experiment”. Technical Report Bio-216. Department of Statistics, Stanford University. http://storeylab.org/doc/BIO216.pdf.
Storey, J. D. (2001). “The false discovery rate: A Bayesian interpretation and the q-value”. Technical Report 2001-12. Department of Statistics, Stanford University. http://storeylab.org/doc/2001-12.pdf.
Gilbert, C. L., J. D. Kolesar, C. A. Reiter, and J. D. Storey (2001). “Function digraphs of quadratic maps modulo p”. Fibonacci Quarterly 39.1, pp. 32-49.
Storey, J. D. and D. Siegmund (2001). “Approximate p-values for local sequence alignments: Numerical studies”. Journal of Computational Biology 8.5, pp. 549-556. https://dx.doi.org/10.1089/106652701753216530.

Lab Members

John D. Storey

Photo of John D. Storey

John is the William R. Harman ’63 and Mary-Love Harman Professor in Genomics, primarily appointed in the Lewis-Sigler Institute for Integrative Genomics at Princeton University. He has associated faculty appointments in Applied and Computational Mathematics, the Center for Statistics and Machine Learning, Computer Science, Molecular Biology, Operations Research and Financial Engineering, and the Princeton Institute for Computational Science and Engineering (PICSciE). He is the director (with Josh Akey) of the NHGRI Quantitative and Computational Biology Graduate Training Program at Princeton University. For more information, check out a brief biography, his personal website, and his GitHub, Google Scholar, and LinkedIn profiles.

Danfeng Chen

Photo of Danfeng Chen

Danfeng is a PhD student in Quantitative and Computational Biology at Princeton University. She received her BS in Biological Sciences from Fudan University and her MS in Computational Biology and Quantitative Genetics from Harvard University.

Danfeng’s research is focused on new theory and methods for heritability in genome-wide association studies, new estimation methods for coancestry in structured populations, and developing surrogate variable analysis for sequencing-based molecular profiling, such as RNA-seq. For more information, check out Danfeng’s GitHub profile.

Olivia Harringmeyer

Photo of Olivia Harringmeyer

Olivia is a Lewis-Sigler Scholar at Princeton University working with the Akey and Storey labs. Olivia received her undergraduate degree in mathematics from Williams College. She then received a PhD in evolutionary biology from Harvard University. She studies structural genomic variation, focusing on chromosomal inversions and their role in evolution. For more information, check out her Google Scholar profile.

Yushi Tang

Photo of Yushi Tang

Yushi is a postdoctoral research associate at Princeton University. He received his BS from Peking University, double-majoring in Environmental Statistics and Economics; his MS in Computational Biology and Quantitative Genetics from Harvard University; and his PhD in Quantitative and Computational Biology from Princeton University.

Yushi works primarily in two areas. The first is understanding the role of ancestral population as a conditional variable when performing genome-wide association studies and heritability estimation. The second is developing theory and methods for causal inference in population-based genome-wide genetics studies. For more information, check out his personal website and his GitHub, Google Scholar, and LinkedIn profiles.

Alvin Zhang

Photo of Alvin Zhang

Alvin is a PhD student in Quantitative and Computational Biology at Princeton University. He received his BS from the University of Chicago, double-majoring in Biology and Statistics.

Alvin is interested in multivariate models of genetic variation that improve our ability to genetically dissect complex traits and to form statistically interpretable polygenic risk scores.

Lab Alumni

Name	Recent Position
Andrew Bass	Postdoctoral Associate, Emory University
Carles Boix	Postdoctoral Associate, MIT
Irineo Cabreros	Staff Editor of Statistical Modeling at The New York Times
Lin Chen	Associate Professor, University of Chicago
Xiongzhi Chen	Assistant Professor, Washington State University
Neo Chung	Assistant Professor, University of Warsaw, Poland
Alan Dabney	Associate Professor, Associate Department Head for Teaching Excellence, Texas A&M University
Keyur Desai	Associate Director, Translational Medicine Statistics, Bristol-Myers Squibb
Frank Emmert-Streib	Professor, Tampere University of Technology, Finland
Sean Hackett	Director of Data Science, Calico
Wei Hao	Machine Learning Engineer, Sisu
Yu-Han Hsu	Computational Scientist, Broad Institute
Jeffrey Leek	Chief Data Officer, Vice President, and J. Orin Edson Foundation Chair of Biostatistics, Fred Hutchinson Cancer Center
Troels Marstrand	Founder and CTO, Revea
Trevor Martin	Founder and CEO, Mammoth Biosciences
Brigham Mecham	Founder and CEO, Trialomics
Emily Nelson	Senior Consultant, Headstorm
Alejandro Ochoa	Assistant Professor, Duke University
Narayanan Raghupathy	Principal Scientist, Bristol-Myers Squibb
David Robinson	Director of Data Science, Heap
Dipen Sangurdekar	Vice President, Head of Data Science, KSQ Therapeutics
Lincoln Smith	Founder, Analytic Managed Services
Minsun Song	Associate Professor, Sookmyung Women’s University, South Korea
Chuen-Seng Tan	Associate Professor, National University of Singapore
Hannah Steele	Founding Partner and Portfolio Manager, Lorica Asset Management
Mayisha Sultana	Commercial Strategist, Wellinks
Sarah Urbut	Cardiology Fellow, MGH and Broad Institute
Sangsoon Woo	Senior Biostatistician, NanoString

Software

biobroom. This R package converts standard objects constructed by bioinformatics packages, especially those in Bioconductor, to the tidy data format. Bioconductor / GitHub

bnpsd. This R package simulates admixed populations via simulated allele frequencies and genotypes from the BN-PSD (“Balding-Nichols Pritchard-Stephens-Donnelly”) admixture model. This model enables the simulation of complex population structures, ideal for illustrating challenges in kinship coefficient and F_ST estimation. CRAN / GitHub

dnamix. This Fortran program calculates likelihood ratios as they pertain to mixed DNA samples encountered in forensic science. GitHub

edge. This R package implements methods for carrying out differential expression analyses of genome-wide gene expression studies. Bioconductor / GitHub

eigenR2. This R package calculates a high-dimensional version of the classic R² statistic. GitHub

gcatest. This R package implements the genotype-conditional association test (GCATest), which is an association test for genome-wide association studies that controls for population structure under a general class of trait models. Bioconductor / GitHub

geneticTMT. This R package implements the Transmission Mean Test (TMT) and the Transmission Disequilibrium Test (TDT) to infer causal genotype-phenotype relationships for population-sampled nuclear families. GitHub

jackstraw. This R package performs a significance test for association between variables and their estimated latent variables. Latent variables may be estimated by principal component analysis (PCA), logistic factor analysis (LFA), and other techniques. CRAN / GitHub

lfa. This R package implements logistic factor analysis, which is a PCA analogue on Binomial data via estimation of latent structure in the natural parameter space. Bioconductor / GitHub

popkin. The popkin (“population kinship”) R package estimates the kinship matrix of individuals and F_ST from their biallelic genotypes. Our estimation framework is the first to be practically unbiased under arbitrary population structures. CRAN / GitHub

qvalue. This R package takes a list of p-values resulting from the simultaneous testing of many hypotheses and estimates their q-values and local FDR values. Bioconductor / GitHub

snm. This R package performs a supervised normalization of microarray data that takes into account the study design. Bioconductor

subSeq. This R package performs subsampling of high throughput sequencing count data for use in experimental design and analysis. Bioconductor / GitHub

superadmixture. This R package implements functions for inferring the population coancestry and simulating genotypes according to the super admixture framework. GitHub

sva. This R package performs surrogate variable analysis to account for sources of systematic variation not included in the study design model. Bioconductor / GitHub

terastructure. This C++ program fits the admixture model on tera-sample-sized data sets (~ 10¹² observed genotypes). This package provides a scalable, multi-threaded implementation that can be run on a single computer. GitHub

trigger. This R package provides tools for the statistical analysis of genetics of gene expression data. Bioconductor / GitHub
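
The q-value estimation described in the qvalue entry above can be illustrated with a minimal Python sketch. It is not the qvalue package's code or interface: the function names `estimate_pi0` and `qvalues` are hypothetical, the proportion of true nulls is estimated at a single fixed tuning value `lam` (an assumption made for brevity), and local FDR estimation is omitted.

```python
def estimate_pi0(pvalues, lam=0.5):
    """Estimate the proportion of true null hypotheses.

    Assumption for this sketch: one fixed tuning value `lam`.
    P-values above `lam` are treated as coming mostly from nulls.
    """
    m = len(pvalues)
    if m == 0:
        raise ValueError("need at least one p-value")
    if not 0.0 <= lam < 1.0:
        raise ValueError("lam must lie in [0, 1)")
    pi0 = sum(1 for p in pvalues if p > lam) / (m * (1.0 - lam))
    return min(1.0, pi0)  # a proportion cannot exceed 1


def qvalues(pvalues, lam=0.5):
    """Return q-values in the same order as the input p-values."""
    m = len(pvalues)
    pi0 = estimate_pi0(pvalues, lam)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, so each q-value is the
    # minimum estimated FDR over all thresholds at or above it.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        fdr_at_threshold = pi0 * m * pvalues[i] / rank
        running_min = min(running_min, fdr_at_threshold)
        q[i] = running_min
    return q


if __name__ == "__main__":
    example = [0.001, 0.01, 0.02, 0.6, 0.9]
    for p, q in zip(example, qvalues(example)):
        print(f"p = {p:.3f}  q = {q:.4f}")
```

Taking the running minimum keeps the q-values monotone in the p-values, so a hypothesis with a smaller p-value never receives a larger q-value.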

Contact & Links

Contact

Carl Icahn Labs
Princeton University
Princeton NJ 08544, USA

JDS Email

Faculty Assistant: Laura Hoffman
Phone: +1.609.258.8607
Email: lh7396@princeton.edu
Office: 147 Carl Icahn Labs

Links

Storey Lab GitHub
JDS on Google Scholar
JDS on ORCID (may be out of date)