
Medicare Claims Synthetic Public Use Files in Sentinel Common Data Model Format: Datasets

    Basic Details
    Date Posted

    Medicare Claims Synthetic Public Use Files (SynPUFs) were created to allow interested parties to gain familiarity with Medicare claims data while protecting beneficiary privacy. These files are intended to promote development of software and applications that utilize files in this format, train researchers on the use and complexities of Centers for Medicare and Medicaid Services (CMS) claims, and support safe data mining innovations. The SynPUFs were created by combining randomized information from multiple unique beneficiaries and changing variable values. This randomization and combining of beneficiary information ensures privacy of health information.

    Sentinel uses a distributed data approach in which Data Partners maintain physical and operational control over electronic data in their existing environments. The distributed approach is achieved by using a standardized data structure referred to as the Sentinel Common Data Model (SCDM). Sentinel’s Cohort Identification and Descriptive Analysis (CIDA) tool is a set of SAS macros that allows users to select a cohort of interest. The CIDA tool specifically reads data that is structured in the SCDM.

    The Sentinel Operations Center (SOC) has transformed the CMS SynPUFs into the SCDM format as part of an ongoing effort to make Sentinel resources available to external investigators, with the goal of creating a community of investigators who can understand, utilize, and contribute to the Sentinel enterprise.

    This page contains:

    • SCDM-formatted SynPUFs datasets in the form of 20 subsamples and their related data element tables: death, demographic, diagnosis, dispensing, encounter, enrollment, and procedure, provided in zipped files.
    • Descriptive statistics of each SynPUFs SCDM subsample and a corresponding data dictionary.
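
    As a rough illustration of working with one downloaded subsample, the sketch below checks whether a zipped file contains all seven tables listed above. The helper name missing_tables and the assumption that each file name inside the archive contains its table name are hypothetical; the actual file names and formats in the Sentinel downloads may differ.

```python
import io
import zipfile

# Table names listed on this page for each SCDM-formatted SynPUFs subsample.
SCDM_TABLES = ("death", "demographic", "diagnosis", "dispensing",
               "encounter", "enrollment", "procedure")


def missing_tables(archive, tables=SCDM_TABLES):
    """Return the expected table names that match no member of the zip archive.

    Assumption (not from the source page): each member's file name contains
    its table name, compared case-insensitively.
    `archive` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(archive) as zf:
        names = [name.lower() for name in zf.namelist()]
    return [t for t in tables if not any(t in name for name in names)]


if __name__ == "__main__":
    # Build a small in-memory archive with made-up member names for the demo.
    demo = io.BytesIO()
    with zipfile.ZipFile(demo, "w") as zf:
        for table in SCDM_TABLES:
            zf.writestr(f"subsample_1/{table}.dat", b"")
    demo.seek(0)
    print(missing_tables(demo))
```

    An empty result means every expected table name matched at least one file in the archive; any names returned point to tables that may be missing or named differently.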

    SOC will continue to maintain SCDM 7.0.0-formatted SynPUFs datasets. These are available upon request by e-mailing

    Refer to the Medicare Claims Synthetic Public Use Files in Sentinel Common Data Model Format: User Documentation and Example Routine Querying Package page for user documentation, technical specifications, an example routine querying package, and a SynPUFs demonstration report.

    Time Period
    January 1, 2008 - December 31, 2010
    Population / Cohort
    Individuals 18 years of age or older
    Data Source(s)
    SCDM-formatted SynPUFs