A General Framework for Developing Computable Clinical Phenotype Algorithms

    Basic Details

    This framework guides developers in creating computable algorithms that identify patients with specific clinical conditions using machine learning and natural language processing methods. It draws on two sources: experience in national consortia, and projects that developed computable algorithms for anaphylaxis, acute pancreatitis, and COVID-19 disease. It sets out principles, strategies, and guidelines intended to streamline the development process. The framework is organized into five stages: fitness-for-purpose assessment, manual chart review, feature engineering (predictor variables), model development, and model evaluation and reporting. All stages address "scalable development" approaches, which improve efficiency by avoiding high-risk/low-reward efforts, minimizing the use of costly or rare expertise, and optimizing methods and algorithms for reusability. The framework is supported by the Sentinel Innovation Center and aims to normalize and streamline the development process for the Sentinel Active Risk Identification and Analysis (ARIA) system.


    David S. Carrell, James S. Floyd, Susan Gruber, Brian L. Hazlehurst, Patrick J. Heagerty, Jennifer L. Nelson, Brian D. Williamson, Robert Ball

    Corresponding Author

    David S. Carrell; Kaiser Permanente Washington Health Research Institute, Seattle, WA