Sentinel has converted the Transformed Medicaid Statistical Information System (T-MSIS) Analytic Files (TAF) Research Identifiable Files (RIF) to SCDM v8.0.0. These files contain Medicaid and Children’s Health Insurance Program (CHIP) data housed in the Centers for Medicare and Medicaid Services (CMS) Virtual Research Data Center (VRDC). Duke University Department of Population Health Sciences (DPHS) serves as the Sentinel Data Partner, accessing the source data in the CMS VRDC, transforming it into a Sentinel Common Data Model (SCDM) compliant database, executing queries, and returning results to the Sentinel Operations Center (SOC). Three components are made available to the public: (1) Program Specifications; (2) Code Pack; and (3) User Guide.
1. Program Specifications: Describes the required extraction, transformation, and loading (ETL) processes and mappings specific to Medicaid/CHIP source data from 2014 through 2018. This document consists of the following sections:
- Medicaid/CHIP Source Data: This section describes the content, structure, and update schedule of the 100% Medicaid/CHIP data stored in the VRDC
- VRDC Environment: This section describes the relevant particulars of the VRDC computing environment
- ETL Specifications: This section describes the different types of information required before starting a new ETL and the different build types that need to be supported
- Source Data Mapping: This section describes the table-specific and field-specific mappings necessary to transform the Medicaid/CHIP data into SCDM-compliant intermediate tables
- Final Tables: This section describes the process of combining intermediate tables to create the final tables that will be used for analyses
Except as related to implementation of the Medicaid/CHIP data, the specification document does not otherwise discuss the rationale or content of the SCDM.
As guiding principles, the processes and programs created to accomplish this ETL should be flexible and extensible. This includes attributes such as the ability to handle different kinds of ETLs (e.g., incremental build v. full rebuild), the ability to create intermediate files that can be easily reused in a subsequent ETL, and the ability to easily add new Medicaid/CHIP data sources into the process.
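The distinction between an incremental build and a full rebuild can be sketched as follows. This is a minimal, language-neutral illustration in Python, not code from the SAS code pack; the names `BuildType`, `years_to_process`, and `built_years` are hypothetical, and the assumption that intermediate tables are tracked per source year is made only for the example.

```python
from enum import Enum


class BuildType(Enum):
    """Hypothetical labels for the two build types named above."""
    FULL_REBUILD = "full"
    INCREMENTAL = "incremental"


def years_to_process(build_type, source_years, built_years):
    """Return the source years whose intermediate tables must be (re)built.

    source_years: years of source data available for this ETL
    built_years:  years that already have reusable intermediate tables
    """
    if build_type is BuildType.FULL_REBUILD:
        # A full rebuild ignores existing intermediates and redoes every year.
        return sorted(source_years)
    # An incremental build reuses existing intermediates and adds only new years.
    return sorted(set(source_years) - set(built_years))


if __name__ == "__main__":
    available = range(2014, 2019)  # 2014 through 2018, as in the source data
    print(years_to_process(BuildType.INCREMENTAL, available, [2014, 2015, 2016]))
    print(years_to_process(BuildType.FULL_REBUILD, available, [2014, 2015, 2016]))
```

Keeping the decision in one small function is one way to let the same pipeline serve both build types while reusing intermediate files from an earlier ETL.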
2. Code Pack: Includes the following features:
- All parameters relating to each type of source file accessed (e.g., Demographic and Eligibility (DE) files, Pharmacy (RX) files, and Inpatient (IP), Long-Term Care (LT), and Other Services (OT) claims files).
- All parameters relating to the use of source data that have already been transformed into SCDM-formatted data files, and/or how to bypass their availability.
- Establishment of SAS data libraries (i.e., LIBNAMEs) for source, intermediate and permanent files.
- Highlights of code that is unique to the CMS VRDC environment (such as security settings, remote submits, and standard data libraries) and that may not be applicable to public users of the code pack.
- Sequencing of any program execution.
- A list of included programs and macros to serve as a “packing list,” so that users of the code pack can be sure that their pack is complete.
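A completeness check against the packing list could look like the sketch below. It is an illustration in Python rather than part of the SAS code pack, and it assumes the packing list is available as a plain list of file names; the function names and the example program names are hypothetical.

```python
from pathlib import Path


def missing_from_pack(packing_list, present_files):
    """Return packing-list entries absent from present_files, in list order."""
    present = set(present_files)
    return [name for name in packing_list if name not in present]


def files_in_directory(directory):
    """Return the names of regular files directly inside directory."""
    return [p.name for p in Path(directory).iterdir() if p.is_file()]


if __name__ == "__main__":
    # Hypothetical program names; the real packing list ships with the code pack.
    packing_list = ["setup.sas", "etl_demographic.sas", "etl_dispensing.sas"]
    missing = missing_from_pack(packing_list, files_in_directory("."))
    print("Missing:", missing if missing else "none")
```

The comparison here is an exact, case-sensitive match on file names; a user whose file system changes the case of names would need to normalize both sides first.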
3. User Guide: Includes information on how to use the code pack, along with guidance for researchers who may have different source files and/or programming environments available. The target audience for this document is researchers who wish to create SCDM-compliant tables using Medicaid/CHIP data. While the programs in the Code Pack documented above are specific to the processing of the Medicaid and CHIP data within the VRDC, we anticipate that the mapping information, specifically, will be of use to all researchers.
The associated files on this site are for Sentinel TAF RIF (CMS Medicaid and CHIP source files) ETL version one utilizing SCDM version 8.0.0, approved by Sentinel in June 2022.
- SAS version 9.4 or later
- The content on this page is technical and intended for use by scientists, analysts, and programmers in various areas of expertise.
- This SAS program package uses the Centers for Medicare and Medicaid Services (CMS) 100% Transformed Medicaid Statistical Information System (T-MSIS) Analytic Files Research Identifiable File (TAF RIF) source data from 2014 through 2018. The SAS program package was designed for execution within CMS’s Virtual Research Data Center (VRDC) environment administered by the Research Data Assistance Center (ResDAC) with the following technical resources:
- VRDC access provisioned with 32 GB of RAM
- SAS version 9.4 or later
- Sufficient disk storage resources for source datasets, SCDM datasets, WORK data library space, and results of program packages
- A SAS Grid of multiple computers enabling simultaneous processing
- Source data files obtained from CMS by other researchers may have different file names, different partitioning schemes (e.g., annual RIF), different samples of the data, and possibly different variables and/or variable names. Users are responsible for making any adjustments to the SAS program package to be compatible with source data they receive from CMS for implementation in their technical environment.
- There is no mechanism for technical support by Duke University, the Sentinel Operations Center, ResDAC, CMS, or the U.S. Food and Drug Administration (FDA) for use of this SAS program package.
- The SAS program package is distributed “as is” and with no warranties of any kind, whether express or implied, including and without limitation, any warranty of merchantability or fitness for a particular purpose.
- In no event shall any individual, the Duke University Department of Population Health Sciences, the Sentinel Operations Center located at Harvard Pilgrim Health Care Institute, or the FDA be liable for any damages whatsoever relating to the use, misuse, or inability to use this SAS program package (including, without limitation, damages for loss of profits or revenue, business interruption, loss of information, or any other loss).
- The information contained on this website is provided as part of FDA's commitment to place knowledge acquired from the Sentinel System in the public domain.
Bradley Hammill, DrPH; Department of Population Health Sciences, Duke University School of Medicine, Durham, NC
Judith C. Maro, PhD; Department of Population Medicine, Harvard Pilgrim Health Care Institute and Harvard Medical School, Boston, MA
Sarah Dutcher, PhD, MS; David Money, RPh., PMH; Efe Eworuke, PhD, MSc; Office of Surveillance and Epidemiology, Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, MD
Patricia Bright, MSPH, PhD; Jamila Mwidau, RN, BSN, MPH; Sanae Cherkaoui, MS, MPH, CPH; Office of Surveillance and Epidemiology, Center for Drug Evaluation and Research, US Food and Drug Administration, Silver Spring, MD
Steven J. Lippmann, PhD; Michael Stagner; Jessica E. Pritchard, PhD; Pratap Adhikari, MS; Department of Population Health Sciences, Duke University School of Medicine, Durham, NC
Christine Halbig, MPH; Laura Shockro; Katie Shapiro; Daniel Kiernan; Alexander Mai; Department of Population Medicine, Harvard Pilgrim Health Care Institute and Harvard Medical School, Boston, MA
Robert Rosofsky, MA; Health Information Systems Consulting LLC, Milton, MA