Developing prediction models when there are systematically missing predictors in individual patient data meta-analysis.

  • Additional Information
    • Source:
      Publisher: Wiley Blackwell; Country of Publication: England; NLM ID: 101543738; Publication Model: Print-Electronic; Cited Medium: Internet; ISSN: 1759-2887 (Electronic); Linking ISSN: 1759-2879; NLM ISO Abbreviation: Res Synth Methods; Subsets: MEDLINE
    • Publication Information:
      Publication: Chichester : Wiley Blackwell
      Original Publication: Malden, MA : John Wiley & Sons, 2010-
    • Abstract:
      Clinical prediction models are widely used in modern clinical practice. Such models are often developed using individual patient data (IPD) from a single study, but IPD are frequently available from multiple studies. This allows meta-analytical methods to be used for developing prediction models, increasing power and precision. Different studies, however, often measure different sets of predictors, which may result in systematically missing predictors, that is, predictors that are not collected by all studies. This situation poses challenges in model development. We describe various approaches that can be used to develop prediction models for continuous outcomes in such situations. We compare four approaches: a "restrict predictors" approach, where the model is developed using only predictors measured in all studies; a multiple imputation approach that ignores study-level clustering; a multiple imputation approach that accounts for study-level clustering; and a new approach that develops a prediction model in each study separately using all predictors reported, and then synthesizes all predictions in a multi-study ensemble. We explore in simulations the performance of all approaches under various scenarios. We find that imputation methods and our new method outperform the restrict predictors approach. In several scenarios, our method outperformed imputation methods, especially when there were few studies, when predictor effects were small, and when heterogeneity was large. We use a real dataset of 12 trials of psychotherapies for depression to illustrate all methods in practice, and we provide code in R.
      (© 2023 The Authors. Research Synthesis Methods published by John Wiley & Sons Ltd.)
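The multi-study ensemble idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' code (the abstract states theirs is in R). It is a Python illustration under stated assumptions: the data are simulated, the predictor names `x1` and `x2` and all helper functions are hypothetical, each study-specific model is plain least squares, and the predictions are combined by an equal-weight average. The abstract does not specify how the predictions are synthesized, so the averaging rule in particular is an assumption.

```python
import random

def fit_ols(rows, predictors):
    """Least-squares fit of y on an intercept plus the given predictors."""
    X = [[1.0] + [r[p] for p in predictors] for r in rows]
    y = [r["y"] for r in rows]
    k = len(predictors) + 1
    # Normal equations: (X'X) beta = X'y
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    c = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    # Gaussian elimination with partial pivoting
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        c[col], c[piv] = c[piv], c[col]
        for r in range(col + 1, k):
            f = A[r][col] / A[col][col]
            for j in range(col, k):
                A[r][j] -= f * A[col][j]
            c[r] -= f * c[col]
    # Back substitution
    beta = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(A[i][j] * beta[j] for j in range(i + 1, k))
        beta[i] = (c[i] - tail) / A[i][i]
    return {"predictors": predictors, "beta": beta}

def predict(model, patient):
    """Predict using only the predictors this study's model was fitted on."""
    b = model["beta"]
    return b[0] + sum(bj * patient[p] for bj, p in zip(b[1:], model["predictors"]))

def ensemble_predict(models, patient):
    """Equal-weight average of study-specific predictions (an assumption)."""
    preds = [predict(m, patient) for m in models]
    return sum(preds) / len(preds)

rng = random.Random(0)

def simulate_study(n, measured, intercept):
    """Hypothetical study: x2 is systematically missing unless listed in `measured`."""
    rows = []
    for _ in range(n):
        x1, x2 = rng.gauss(0, 1), rng.gauss(0, 1)
        row = {"y": intercept + 2.0 * x1 + 0.5 * x2 + rng.gauss(0, 0.5), "x1": x1}
        if "x2" in measured:
            row["x2"] = x2
        rows.append(row)
    return rows

# Three studies with heterogeneous intercepts; the third never measured x2.
studies = [
    simulate_study(200, ["x1", "x2"], 1.0),
    simulate_study(200, ["x1", "x2"], 1.2),
    simulate_study(200, ["x1"], 0.8),
]

# One model per study, each using every predictor that study reports.
models = [fit_ols(rows, sorted(k for k in rows[0] if k != "y")) for rows in studies]

new_patient = {"x1": 0.5, "x2": -1.0}
pred = ensemble_predict(models, new_patient)
print(round(pred, 2))
```

In this sketch no predictor is discarded and nothing is imputed: the third study contributes a model based on `x1` alone, while the other two also use `x2`. A "restrict predictors" analysis, by contrast, would drop `x2` from every study.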
    • Grant Information:
      Ambizione grant 180083 Schweizerischer Nationalfonds zur Förderung der Wissenschaftlichen Forschung
    • Contributed Indexing:
      Keywords: ensemble predictive modeling; individual patient data; meta-analysis; multilevel model; prediction research
    • Publication Date:
      Date Created: 20230209; Date Completed: 20230523; Latest Revision: 20230523
      20240104
    • Accession Number:
      10.1002/jrsm.1625
      36755407