smdi: An R Package to Perform Structural Missing Data Investigations on Partially Observed Confounders in Real-world Evidence Studies

    Basic Details

    Partially observed confounder data pose a major challenge in statistical analyses that aim to inform causal inference using electronic health records (EHRs). While analytic approaches such as imputation are available, their assumptions about the underlying missingness patterns and mechanisms must be verified. We aimed to develop a toolkit that streamlines missing data diagnostics and guides the choice of analytic approach according to whether the necessary assumptions are met. We developed the smdi (structural missing data investigations) R package based on the results of a previous simulation study that considered structural assumptions of common missing data mechanisms in EHRs.


    Janick Weberpals, Sudha R. Raman, Pamela A. Shaw, Hana Lee, Bradley G. Hammill, Sengwee Toh, John G. Connolly, Kimberly J. Dandreo, Fang Tian, Wei Liu, Jie Li, José J. Hernández-Muñoz, Robert J. Glynn, Rishi J. Desai

    Corresponding Author

    Janick Weberpals, RPh, PhD, Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School