Lab Files
Helpful Research Tips
Below are some research tips shared by our lab members:
Getting started on Seadragon
Seadragon tips for PC
Genetics jargon dictionary
Reading List
Please see our lab meeting reading list below. Some notes are available on papers that have been read.
- An Owner’s Guide to the Human Genome: an introduction to human population genetics, variation and disease. Jonathan Pritchard. Book Chapters [Siyi Ch1] [Siyi Ch2] [Siyi Ch3]
- Higher criticism for detecting sparse heterogeneous mixtures. Donoho & Jin. Annals of Statistics. 2004.
- Principal components analysis corrects for stratification in genome-wide association studies. Price et al. Nature Genetics. 2006.
- Microarrays, Empirical Bayes and the Two-Groups Model. Efron. Statistical Science. 2008.
- Rare-Variant Association Testing for Sequencing Data with the Sequence Kernel Association Test. Wu et al. American Journal of Human Genetics. 2011.
- Optimal tests for rare variant effects in sequencing association studies. Lee et al. Biostatistics. 2012.
- A general framework for estimating the relative pathogenicity of human genetic variants. Kircher et al. Nature Genetics. 2014.
- Replicability analysis for genome-wide association studies. Heller & Yekutieli. Annals of Applied Statistics. 2014.
- Efficient Bayesian mixed-model analysis increases association power in large cohorts. Loh et al. Nature Genetics. 2015.
- LD Score regression distinguishes confounding from polygenicity in genome-wide association studies. Bulik-Sullivan et al. Nature Genetics. 2015.
- A gene-based association method for mapping traits using reference transcriptome data. Gamazon et al. Nature Genetics. 2015.
- Integrative approaches for large-scale transcriptome-wide association studies. Gusev et al. Nature Genetics. 2016.
- Large-scale association analysis identifies new lung cancer susceptibility loci and heterogeneity in genetic susceptibility across histological subtypes. McKay et al. Nature Genetics. 2017.
- Testing for Gene–Environment Interaction Under Exposure Misspecification. Sun et al. Biometrics. 2017.
- 10 Years of GWAS Discovery: Biology, Function, and Translation. Visscher et al. American Journal of Human Genetics. 2017.
- Efficiently controlling for case-control imbalance and sample relatedness in large-scale genetic association studies. Zhou et al. Nature Genetics. 2018.
- Multi-trait analysis of genome-wide association summary statistics using MTAG. Turley et al. Nature Genetics. 2018.
- ACAT: A Fast and Powerful p Value Combination Method for Rare-Variant Analysis in Sequencing Studies. Liu et al. American Journal of Human Genetics. 2019.
- Operating characteristics of the rank-based inverse normal transformation for quantitative trait analysis in genome-wide association studies. McCaw et al. Biometrics. 2019.
- Powerful gene set analysis in GWAS with the Generalized Berk-Jones statistic. Sun et al. PLoS Genetics. 2019.
- A simple new approach to variable selection in regression, with application to genetic fine mapping. Wang et al. Journal of the Royal Statistical Society Series B. 2020.
- Dynamic incorporation of multiple in silico functional annotations empowers rare variant association analysis of large whole-genome sequencing studies at scale. Li et al. Nature Genetics. 2020.
- A Fast and Accurate Method for Genome-Wide Time-to-Event Data Analysis and Its Application to UK Biobank. Bi et al. American Journal of Human Genetics. 2020.
- Genetic Variant Set-Based Tests Using the Generalized Berk–Jones Statistic With Application to a Genome-Wide Association Study of Breast Cancer. Sun et al. JASA. 2020.
- Integration of multiomic annotation data to prioritize and characterize inflammation and immune‐related risk variants in squamous cell lung cancer. Sun et al. Genetic Epidemiology. 2020.
- Computationally efficient whole-genome regression for quantitative and binary traits. Mbatchou et al. Nature Genetics. 2021.
- Inference for set-based effects in genetic association studies with interval-censored outcomes. Sun et al. Biometrics. 2022.
- FAVOR: functional annotation of variants online resource and annotator for variation across the human genome. Zhou et al. Nucleic Acids Research. 2022.
- Efficient and accurate frailty model approach for genome-wide survival association analysis in large-scale biobanks. Dey et al. Nature Communications. 2022.
- Powerful, scalable and resource-efficient meta-analysis of rare variant associations in large whole genome sequencing studies. Li et al. Nature Genetics. 2022.
- A Minimax Optimal Ridge-Type Set Test for Global Hypothesis With Applications in Whole Genome Sequencing Association Studies. Liu et al. JASA. 2022.
- Large-Scale Hypothesis Testing for Causal Mediation Effects with Applications in Genome-wide Epigenetic Studies. Liu et al. JASA. 2022.
- A multiple-testing procedure for high-dimensional mediation hypotheses. Dai et al. JASA. 2022.
- 15 years of GWAS discovery: Realizing the promise. Abdellaoui et al. American Journal of Human Genetics. 2023.
- Accurate and efficient estimation of local heritability using summary statistics and the linkage disequilibrium matrix. Li et al. Nature Communications. 2023.
- A new method for multiancestry polygenic prediction improves performance across diverse populations. Zhang et al. Nature Genetics. 2023.
- Testing a Large Number of Composite Null Hypotheses Using Conditionally Symmetric Multidimensional Gaussian Mixtures in Genome-Wide Studies. Sun et al. JASA. 2024. [Jae-Woo]
- Fast and scalable ensemble learning method for versatile polygenic risk prediction. Chen et al. PNAS. 2024.
- Synthetic surrogates improve power for genome-wide association studies of partially missing phenotypes in population biobanks. McCaw et al. Nature Genetics. 2024.
- Semi-supervised machine learning method for predicting homogeneous ancestry groups to assess Hardy-Weinberg equilibrium in diverse whole-genome sequencing studies. Shyr et al. American Journal of Human Genetics. 2024.
- Ensemble methods for testing a global null. Liu et al. Journal of the Royal Statistical Society Series B. 2024.