Research
Summary
Below is a brief summary of first-author, last-author, and other statistical methodology publications.
For a full list of publications, including all collaborative works, please see the CV.
*Indicates equal contribution as first authors.
#Indicates work done by trainee.
Identification of rare and weak omic risk factors in disease
This work offers researchers more powerful methods to identify omic features that perturb disease risk across a variety of phenotypes and omic platforms. Such risk factors often have rare and weak effects that are difficult to detect.
Sun R, Carroll RJ, Christiani DC, Lin X. Testing for gene-environment interaction under exposure misspecification. Biometrics. 2018 Jun 01; 74(2):653-662. PMID: 29120492.
Sun R, Hui S, Bader GD, Lin X, Kraft P. Powerful gene set analysis in GWAS with the Generalized Berk-Jones statistic. PLoS Genet. 2019 Mar 01; 15(3):e1007530. PMID: 30875371.
Gaynor SM*, Sun R*, Lin X, Quackenbush J. Identification of differentially expressed gene sets using the Generalized Berk-Jones statistic. Bioinformatics. 2019 Nov 01; 35(22):4568-4576. PMID: 31062858.
Sun R, Lin X. Genetic Variant Set-Based Tests Using the Generalized Berk-Jones Statistic with Application to a Genome-Wide Association Study of Breast Cancer. J Am Stat Assoc. 2020 Jan 01; 115(531):1079-1091. PMID: 33041403.
Sun R, Wang Z, Claus Henn B, Su L, Lu Q, Lin X, Wright RO, Bellinger DC, Kile M, Mazumdar M, Tellez-Rojo MM, Schnaas L, Christiani DC. Identification of novel loci associated with infant cognitive ability. Mol Psychiatry. 2020 Nov 01; 25(11):3010-3019. PMID: 30120420.
Sun R, Shi A, Lin X. Differences in set-based tests for sparse alternatives when testing sets of outcomes compared to sets of explanatory factors in genetic association studies. Biostatistics. 2024 Jan 01; 25(1):171-187. PMID: 36000269.
Li Z, Li X, Zhou H, Gaynor SM, Selvaraj MS, Arapoglou T, Quick C, Liu Y, Chen H, Sun R, Dey R, Arnett DK, Auer PL, Bielak LF, Bis JC, Blackwell T, Blangero J, Boerwinkle E, Bowden DW, Brody JA, Cade BE, Conomos MP, Correa A, Cupples LA, Curran JE, de Vries PS, Duggirala R, Franceschini N, Freedman BI, Göring HH, Guo X, Kalyani RR, Kooperberg C, Kral BG, Lange LA, Lin BM, Manichaikul AW, Manning AK, Martin LW, Mathias R, Meigs JB, Mitchell BD, Montasser ME, Morrison AC, Naseri T, O’Connell JR, Palmer ND, Peyser PA, Psaty BM, Raffield LM. A framework for detecting noncoding rare-variant associations of large-scale whole-genome sequencing studies. Nat Methods. 2022 Dec 01; PMID: 36303018.
Li X, Quick C, Zhou H, Gaynor SM, Liu Y, Chen H, Selvaraj MS, Sun R, Dey R, Arnett DK, Bielak LF, Bis JC, Blangero J, Boerwinkle E, Bowden DW, Brody JA, Cade BE, Correa A, Cupples LA, Curran JE, de Vries PS, Duggirala R, Freedman BI, Göring HH, Guo X, Haessler J, Kalyani RR, Kooperberg C, Kral BG, Lange LA, Manichaikul AW, Martin LW, McGarvey ST, Mitchell BD, Montasser ME, Morrison AC, Naseri T, O’Connell JR, Palmer ND, Peyser PA, Psaty BM, Raffield LM, Redline S, Reiner AP, Reupena MS, Rice K, Rich SS, Sitlani CM, Smith JA, Menon VK. Powerful, scalable and resource-efficient meta-analysis of rare variant associations in large whole genome sequencing studies. Nat Genet. 2023 Jan; 55(1):154-164. PMID: 36564505.
McCaw ZR, Gaynor SM, Sun R, Lin X. Leveraging a surrogate outcome to improve inference on a partially missing target outcome. Biometrics. 2023 Jun 01; 79(2):1472-1484. PMID: 35218565.
Identification of disease risk factors for time-to-event outcomes
These tools offer the ability to associate high-dimensional omic data with the interval-censored time-to-event outcomes that are often found in massive modern genetic compendiums such as the UK Biobank. The interval-censored outcomes are common and arise from the dependence of these compendiums on periodic health questionnaires, which are a cost-effective approach for collecting large amounts of health data on hundreds of thousands of subjects.
Zhu L, Tong X, Cai D, Li Y, Sun R, Srivastava DK, Hudson MM. Maximum likelihood estimation for the proportional odds model with mixed interval-censored failure time data. J Appl Stat. 2020 Jan 01; :1-17. PMID: 34349336.
Sun R, Zhu L, Li Y, Yasui Y, Robison L. Inference for Set-Based Effects in Genetic Association Studies with Interval Censored Outcomes. Biometrics. 2023 Jun 01; 79(2):1573-1585. PMID: 35165890.
Sun R, Sun D, Zhu L, Sun J. Regression analysis of general mixed recurrent event data. Lifetime Data Anal. 2023 Oct 01; 29(4):807-822. PMID: 37438585.
Choi J#, Xu Z#, Sun R. Variance-components tests for genetic association with multiple interval-censored outcomes. Stat Med. 2024 Jun 15; 43(13):2560-2574. PMID: 38636557.
Xu Z#, Choi J#, Sun R. Set-based tests for genetic association studies with interval-censored competing risks outcomes. Statistics in Biosciences. 2024+ (in press).
Integration of high-dimensional omic data to characterize biological processes
This work describes techniques to integrate high-dimensional omic data in interpretable models that move beyond association analysis and explain the biological pathways involved. For instance, instead of only associating genetic variants with disease, we can explain how those genetic variants influence disease risk by regulating expression of certain risk genes.
Li X, Li Z, Zhou H, Gaynor SM, Liu Y, Chen H, Sun R, Dey R, Arnett DK, Aslibekyan S, Ballantyne CM, Bielak LF, Blangero J, Boerwinkle E, Bowden DW, Broome JG, Conomos MP, Correa A, Cupples LA, Curran JE, Freedman BI, Guo X, Hindy G, Irvin MR, Kardia SLR, Kathiresan S, Khan AT, Kooperberg CL, Laurie CC, Liu XS, Mahaney MC, Manichaikul AW, Martin LW, Mathias RA, McGarvey ST, Mitchell BD, Montasser ME, Moore JE, Morrison AC, O’Connell JR, Palmer ND, Pampana A, Peralta JM, Peyser PA, Psaty BM, Redline S, Rice KM, Rich SS, Smith JA, Tiwari HK, Tsai MY, Vasan RS, Wang FF, Weeks DE, Weng Z, Wilson JG, Yanek LR, (TOPMed) Consortium NTFPM, Working Group TL, Neale BM, Sunyaev SR, Abecasis GR, Rotter JI, Willer CJ, Peloso GM, Natarajan P, Lin X. Dynamic incorporation of multiple in silico functional annotations empowers rare variant association analysis of large whole-genome sequencing studies at scale. Nat Genet. 2020 Sep 01; 52(9):969-983. PMID: 32839606.
Sun R, Xu M, Li X, Gaynor S, Zhou H, Li Z, Bossé Y, Lam S, Tsao MS, Tardon A, Chen C, Doherty J, Goodman G, Bojesen SE, Landi MT, Johansson M, Field JK, Bickeboller H, Wichmann HE, Risch A, Rennert G, Arnold S, Wu X, Melander O, Brunnström H, Le Marchand L, Liu G, Andrew A, Duell E, Kiemeney LA, Shen H, Haugen A, Johansson M, Grankvist K, Caporaso N, Woll P, Dawn Teare M, Scelo G, Hong YC, Yuan JM, Lazarus P, Schabath MB, Aldrich MC, Albanes D, Mak R, Barbie D, Brennan P, Hung RJ, Amos CI, Christiani DC, Lin X. Integration of multiomic annotation data to prioritize and characterize inflammation and immune-related risk variants in squamous cell lung cancer. Genet Epidemiol. 2021 Feb 01; 45(1):99-114. PMID: 32924180.
Li X, Yung G, Zhou H, Sun R, Li Z, Hou K, Zhang MJ, Liu Y, Arapoglou T, Wang C, Ionita-Laza I, Lin X. A multi-dimensional integrative scoring framework for predicting functional variants in the human genome. Am J Hum Genet. 2022 Mar 01; 109(3):446-456. PMID: 35216679.
Zhu H#, Choi J#, Kui N#, Yang T, Wei P, Li D, Sun R. Identification of pancreatic cancer germline risk variants with effects that are modified by smoking. JCO Precis Oncol. 2024 Mar; 8:e2300355. PMID: 38564682.
High-dimensional inference for composite null hypotheses
Composite null hypotheses occur when we are interested in a set of individual null hypotheses, and we want to know whether all individual nulls in the set should be rejected simultaneously. This statistical challenge has many applications in translational genetics studies such as mediation analysis, pleiotropy analysis, and replication analysis.
Sun R, McCaw Z, Lin X. Testing a large number of composite null hypotheses using conditionally symmetric multidimensional Gaussian mixtures in genome-wide studies. Journal of the American Statistical Association. 2024+ (accepted). https://arxiv.org/abs/2309.12584.
Describing more robust methods for analysis of clinical trial data
Modern clinical trials aim to answer highly heterogeneous biological questions and produce many varied, unique datasets. However, the statistical analysis of such data remains highly uniform. This work is aimed at clinicians and describes robust options that may be more appropriate than conventional methods for handling common challenges such as stratification or study discontinuation.
Kim DH, Li X, Bian S, Wei LJ, Sun R. Utility of Restricted Mean Survival Time for Analyzing Time to Nursing Home Placement Among Patients With Dementia. JAMA Netw Open. 2021 Jan 28; 4(1):e2034745. PMID: 33507253.
Sun R*, McCaw Z*, Tian L, Uno H, Hong F, Kim DH, Wei LJ. Moving beyond conventional stratified analysis to assess the treatment effect in a comparative oncology study. J Immunother Cancer. 2021 Nov 01; 9(11) PMID: 34799398.
Huang B*, Sun R*, Claggett B, Tian L, Ludmir EB, Wei LJ. Handling Informative Premature Treatment or Study Discontinuation for Assessing Between-Group Differences in a Comparative Oncology Trial. JAMA Oncol. 2022 Oct 01; 8(10):1502-1503. PMID: 35980612.
Translational omics research in cancer
New omics modalities are increasingly available to assess cancer patient prognosis. However, standard approaches to analyze such data may not be appropriate for all settings. This work produces bespoke models that allow translational researchers to pose novel biological hypotheses. For example, we introduce an empirical Bayes model that estimates the probability of false negative KRAS readings that arise in circulating tumor DNA (ctDNA). Such readings are highly impactful in colorectal cancer treatment.
Yam C*, Abuhadra N*, Sun R*, Adrada BE, Ding QQ, White JB, Ravenberg EE, Clayborn AR, Valero V, Tripathy D, Damodaran S, Arun BK, Litton JK, Ueno NT, Murthy RK, Lim B, Baez L, Li X, Buzdar AU, Hortobagyi GN, Thompson AM, Mittendorf EA, Rauch GM, Candelaria RP, Huo L, Moulder SL, Chang JT. Molecular Characterization and Prospective Evaluation of Pathologic Response and Outcomes with Neoadjuvant Therapy in Metaplastic Triple-Negative Breast Cancer. Clin Cancer Res. 2022 Jul 01; 28(13):2878-2889. PMID: 35507014.
Parseghian CM*, Sun R*, Woods M*, Napolitano S*, Lee HM, Alshenaifi J, Willis J, Nunez S, Raghav KP, Morris VK, Shen JP, Eluri M, Sorokin A, Kanikarla P, Vilar E, Rehn M, Ang A, Troiani T, Kopetz S. Resistance Mechanisms to Anti-EGFR Therapy in RAS/RAF Wildtype Colorectal Cancer Vary by Regimen and Line of Therapy. J Clin Oncol. 2023 Jan 20; 41(3):101200JCO2201423. PMID: 36351210.
Napolitano S, Parikh AR, Henry J, Parseghian CM, Willis J, Raghav KP, Morris VK, Johnson B, Kee BK, Dasari AN, Overman MJ, Luthra R, Drusbosky LM, Corcoran RB, Kopetz S, Sun R. Novel Clinical Tool to Estimate Risk of False-Negative KRAS Mutations in Circulating Tumor DNA Testing. JCO Precis Oncol. 2023 Sep; 7:e2300228. PMID: 37824798.
Correspondence in high-impact clinical journals
As noted previously, despite the heterogeneous nature of modern clinical trials, the analysis of clinical trial data remains highly homogeneous. Our group advocates for more robust quantitative approaches through these Correspondence pieces. The overall goals are to inform policy, educate readers, and improve professional practice. For instance, in a piece responding to KEYNOTE-189, we describe how the lack of proportional hazards in the survival curves indicates that the reported hazard ratio is likely invalid; as such, it should not be used as a reference for effect sizes in designing future trials.
- Sun R, Horiguchi M, Wei LJ. Interpreting the Benefit of Trifluridine/Tipiracil in Metastatic Colorectal Cancer With Respect to Progression-Free Survival and Overall Survival. J Clin Oncol. 2018 May 01; 36(13):1378-1379. PMID: 29558278.
- Sun R, Rich MW, Wei LJ. Pembrolizumab plus Chemotherapy in Lung Cancer. N Engl J Med. 2018 Sep 13; 379(11):e18. PMID: 30211499.
- Sun R, Nie L, Huang B, Kim DH, Wei LJ. Quantifying Immunoscore performance. Lancet. 2018 Nov 03; 392(10158):1624. PMID: 30496077.
- Sun R, Wei LJ. Regional Hyperthermia With Neoadjuvant Chemotherapy for Treatment of Soft Tissue Sarcoma. JAMA Oncol. 2019 Jan 01; 5(1):112-113. PMID: 30489612.
- Sun R, Lee H, Wei LJ. Interpreting the Long-term Prognostic Value of Total Mesorectal Excision Plane Quality in Rectal Adenocarcinoma. JAMA Surg. 2019 Jan 01; 154(1):96. PMID: 30427976.
- Sun R, Zhu H, Wei LJ. Assessing the Prognostic Value of the Automated Bone Scan Index for Prostate Cancer. JAMA Oncol. 2019 Feb 01; 5(2):270. PMID: 30543368.
- Wei LJ, Sun R, Orkaby AR, Kim DH, Zhu H. Biodegradable-polymer stents versus durable-polymer stents. Lancet. 2019 May 11; 393(10184):1932-1933. PMID: 31084958.
- Sun R, Kim DH, Wei LJ. Analysis of Overall Survival Benefit of Abemaciclib Plus Fulvestrant in Hormone Receptor-Positive, ERBB2-Negative Breast Cancer. JAMA Oncol. 2020 Jul 01; 6(7):1121-1122. PMID: 32463424.
- Sun R, Messick C, Wei LJ. Two-Stage Turnbull-Cutait Pull-Through Coloanal Anastomosis for Low Rectal Cancers. JAMA Surg. 2021 Feb 01; 156(2):202-203. PMID: 33175112.
- Sun R, Tian L, Wei LJ. Evaluating Long-term Efficacy of Neoadjuvant Chemoradiotherapy Plus Surgery for the Treatment of Locally Advanced Esophageal Squamous Cell Carcinoma. JAMA Surg. 2022 May 01; 157(5):458-459. PMID: 35080625.
- Sun R, Wei LJ. Pembrolizumab in Triple-Negative Breast Cancer. N Engl J Med. 2022 Oct 13; 387(15):1435-1436. PMID: 36239657.
- Sun R, Wei LJ. Quantifying Clinical Utility of Adjuvant Abemaciclib in Patients With High-risk Early Breast Cancer Who Received Neoadjuvant Chemotherapy. JAMA Oncol. 2022 Nov 01; 8(11):1701. PMID: 36173642.
- Sun R, Wei LJ. Quantifying Clinical Utility of Enzalutamide for Overall Survival in Metastatic Hormone-Sensitive Prostate Cancer. J Clin Oncol. 2022 Dec 20; 40(36):JCO2201084. PMID: 35985008.
- Sun R, Huang B, Wei LJ. Comparing Short- and Long-Term Treatment Duration of Bevacizumab for Advanced Ovarian Cancer. J Clin Oncol. 2023 Apr 01; 41(10):1952-1953. PMID: 36763910.
- Sun R, Wei LJ. Efficacy, Safety, and Analysis Issues in a Study of Intraoperative Hyperthermic Intraperitoneal Chemotherapy for Locally Advanced Colon Cancer. JAMA Surg. 2023 Dec 13; 158(12):1357-1358. PMID: 37585200.
- Sun R, Seibert TM, Wei LJ. Predictability of Olfactory Neuroblastoma Staging Systems. JAMA Otolaryngol Head Neck Surg. 2024 Jan 11; 150(1):84-85. PMID: 37971764.
- Sun R, Liu J, Wei LJ. Assessing Predictability of Pathologic Lymph Node Regression for Recurrence and Survival in Esophageal Adenocarcinoma. J Clin Oncol. 2024 Jan 20; 42(3):366-367. PMID: 37988644.
- Sun R, Wei LJ. Is Pertuzumab Plus Trastuzumab Without Chemotherapy a Reasonable Treatment for ERBB2-Positive Metastatic Breast Cancer? JAMA Oncol. 2024 Apr 18; 10(4):537. PMID: 38329744.
- Sun R, Moraleda JM, Wei LJ. Quantification of Treatment Effect of Tislelizumab vs Sorafenib for Hepatocellular Carcinoma. JAMA Oncol. 2024 May 01; 10(5):674. PMID: 38483380.
- Sun R, Wei LJ. Benralizumab versus Mepolizumab for Eosinophilic Granulomatosis with Polyangiitis. N Engl J Med. 2024 May 30; 390(20):1939. PMID: 38810203.
- Sun R, Wei LJ. Aspirin vs Placebo as Adjuvant Therapy for Breast Cancer. JAMA. 2024+ (in press).