
An Expedited Chart Review Process for Large Database Studies Using Natural Language Processing and Multi-Wave Adaptive Sampling

    Basic Details
    Type: Publication
    Description

    One way to strengthen analyses conducted with large claims databases is to validate the measurement characteristics of the code-based algorithms used to identify health outcomes or other key study parameters. The resulting metrics can be used in quantitative bias analyses to assess how robust the results of an inferential study are to potential bias from outcome misclassification. However, performing this validation through manual chart review of free-text notes from linked electronic health records requires substantial time and resources.
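To make the link between validation metrics and quantitative bias analysis concrete, the following is a minimal sketch of one textbook correction for non-differential outcome misclassification. It is not taken from the study itself: the function names, the choice of sensitivity and specificity as the bias parameters, and all counts are illustrative assumptions.

```python
def corrected_cases(observed_cases, group_size, sensitivity, specificity):
    """Back-calculate the expected number of true cases in one exposure group.

    Assumes the algorithm's sensitivity and specificity are known and
    apply equally to every group (non-differential misclassification).
    """
    denom = sensitivity + specificity - 1.0
    if denom <= 0:
        raise ValueError("sensitivity + specificity must exceed 1")
    false_positive_rate = 1.0 - specificity
    return (observed_cases - false_positive_rate * group_size) / denom


def corrected_risk_ratio(cases_exposed, n_exposed,
                         cases_unexposed, n_unexposed,
                         sensitivity, specificity):
    """Risk ratio after correcting both groups for outcome misclassification."""
    true_exposed = corrected_cases(cases_exposed, n_exposed,
                                   sensitivity, specificity)
    true_unexposed = corrected_cases(cases_unexposed, n_unexposed,
                                     sensitivity, specificity)
    if true_exposed <= 0 or true_unexposed <= 0:
        raise ValueError("bias parameters are incompatible with observed counts")
    return (true_exposed / n_exposed) / (true_unexposed / n_unexposed)


# Hypothetical counts: 30 of 1000 exposed and 20 of 1000 unexposed patients
# are flagged by the claims-based algorithm.
observed_rr = (30 / 1000) / (20 / 1000)
adjusted_rr = corrected_risk_ratio(30, 1000, 20, 1000,
                                   sensitivity=0.80, specificity=0.99)
print(f"observed RR = {observed_rr:.2f}, bias-adjusted RR = {adjusted_rr:.2f}")
```

In this made-up example, imperfect specificity adds the same number of false positives to both groups, which pulls the observed risk ratio toward the null. Repeating the correction over a range of plausible sensitivity and specificity values shows how much the conclusion depends on the algorithm's measured performance.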

    We describe an expedited process for validating code-based algorithms that gains efficiency through two distinct mechanisms: 1) natural language processing (NLP) to reduce the time human reviewers spend on each chart, and 2) a multi-wave adaptive sampling approach with pre-defined criteria for stopping the validation study once performance characteristics have been estimated with sufficient precision. We illustrate this process in a case study that validates the performance of a claims-based outcome algorithm for intentional self-harm in patients with obesity.
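The multi-wave stopping idea can be sketched in a few lines. The abstract does not state the actual stopping criteria, so the sketch below assumes one plausible rule: review charts in fixed-size waves and stop once the 95% Wilson confidence interval for the positive predictive value (PPV) is narrower than a pre-specified width. The names `run_waves` and `review_chart`, the wave size, and the width threshold are all hypothetical.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score confidence interval for a binomial proportion."""
    if n == 0:
        return (0.0, 1.0)
    p = successes / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return (max(0.0, center - half), min(1.0, center + half))


def run_waves(review_chart, chart_ids, wave_size=50, max_width=0.15,
              max_waves=10):
    """Review charts wave by wave until the PPV estimate is precise enough.

    review_chart: callable returning True when the reviewer confirms the
        algorithm-flagged outcome (stands in for the human/NLP review step).
    chart_ids: algorithm-flagged charts, assumed to be in random order.
    """
    confirmed = reviewed = waves = 0
    low, high = 0.0, 1.0
    while waves < max_waves:
        batch = chart_ids[reviewed:reviewed + wave_size]
        if not batch:
            break  # no charts left to sample
        waves += 1
        confirmed += sum(1 for chart_id in batch if review_chart(chart_id))
        reviewed += len(batch)
        low, high = wilson_interval(confirmed, reviewed)
        if high - low <= max_width:
            break  # pre-defined precision criterion met
    ppv = confirmed / reviewed if reviewed else float("nan")
    return {"waves": waves, "reviewed": reviewed, "ppv": ppv,
            "ci": (low, high)}


# Toy reviewer that confirms four out of every five flagged charts.
result = run_waves(lambda chart_id: chart_id % 5 != 0, list(range(1000)))
print(result)
```

The point of the design is that the number of charts reviewed is not fixed in advance: a study stops early when the estimate is already precise, and continues only when the interval is still too wide to be useful.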

    Author(s)

    Shirley V. Wang, Georg Hahn, Sushama Kattinakere Sreedhara, Mufaddal Mahesri, Haritha S. Pillai, Rajendra Aldis, Joyce Lii, Sarah K. Dutcher, Rhoda Eniafe, Jamal T. Jones, Keewan Kim, Jiwei He, Hana Lee, Sengwee Toh, Rishi J. Desai, Jie Yang

    Corresponding Author

    Shirley V. Wang; Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA.

    Email: swang1@bwh.harvard.edu