Details
One way to strengthen analyses of large claims databases is to validate the measurement characteristics of the code-based algorithms used to identify health outcomes or other key study parameters. The resulting performance metrics can be used in quantitative bias analyses to assess how robust the results of an inferential study are to potential bias from outcome misclassification. However, performing this validation through manual chart review of free-text notes from linked electronic health records requires extensive time and resources.
We describe an expedited process for validating code-based algorithms that gains efficiency through two distinct mechanisms: 1) use of natural language processing (NLP) to reduce the time human reviewers spend on each chart, and 2) a multi-wave adaptive sampling approach with pre-defined criteria for stopping the validation study once performance characteristics have been estimated with sufficient precision. We illustrate this process in a case study that validates the performance of a claims-based outcome algorithm for intentional self-harm in patients with obesity.
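To make the second mechanism concrete, the following is a minimal sketch of a multi-wave stopping rule, not the procedure used in the case study. It assumes the performance characteristic of interest is the positive predictive value (PPV), that precision is judged by the width of a 95% Wilson score interval, and that charts are reviewed in fixed-size waves. The function names, the wave size, the width threshold, and the `review_wave` callback (which stands in for chart review, NLP-assisted or otherwise) are all hypothetical choices made for illustration.

```python
import math
from typing import Callable, Dict, Tuple


def wilson_interval(confirmed: int, reviewed: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (here, the PPV)."""
    if reviewed == 0:
        return (0.0, 1.0)  # no information yet
    p = confirmed / reviewed
    denom = 1 + z**2 / reviewed
    centre = (p + z**2 / (2 * reviewed)) / denom
    half = z * math.sqrt(p * (1 - p) / reviewed + z**2 / (4 * reviewed**2)) / denom
    return (max(0.0, centre - half), min(1.0, centre + half))


def validate_in_waves(
    review_wave: Callable[[int], int],  # hypothetical: reviews n charts, returns number confirmed
    wave_size: int = 50,                # assumed wave size
    max_waves: int = 10,                # assumed cap on the number of waves
    max_ci_width: float = 0.20,         # assumed pre-defined precision criterion
) -> Dict[str, object]:
    """Review charts wave by wave and stop once the PPV interval is narrow enough."""
    if max_waves < 1:
        raise ValueError("max_waves must be at least 1")
    confirmed = 0
    reviewed = 0
    for wave in range(1, max_waves + 1):
        confirmed += review_wave(wave_size)
        reviewed += wave_size
        low, high = wilson_interval(confirmed, reviewed)
        if high - low <= max_ci_width:
            break  # stopping criterion met; no further waves are sampled
    return {
        "waves": wave,
        "reviewed": reviewed,
        "ppv": confirmed / reviewed,
        "ci": (low, high),
    }


if __name__ == "__main__":
    # Toy stand-in for chart review: 40 of every 50 algorithm-flagged cases are confirmed.
    result = validate_in_waves(lambda n: int(0.8 * n))
    print(result)
```

In this toy run the interval after the first wave of 50 charts is still wider than 0.20, so a second wave is drawn, and the study stops after 100 charts. A real application would fix the criterion before any charts are reviewed and would typically track several characteristics (for example, sensitivity as well as PPV) rather than one.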