    Data Acquisition

    The NIH and other organizations make the results of supported research available to the public, enabling data reuse, increasing transparency, and facilitating the reproducibility of research results.

    IRB approval is required for some requests. Open-access data can be browsed online or downloaded without prior permission, whereas controlled-access data can be obtained only after the requestor has been authorized.

    Specific information about the common sources can be found below.

    The database of Genotypes and Phenotypes (dbGaP)

    The database of Genotypes and Phenotypes (dbGaP) was developed by NIH’s National Center for Biotechnology Information (NCBI) to archive and distribute the data and results from studies that have investigated the interaction of genotype and phenotype in humans. The high-level workflow is below:
    1. The PI will:
    2. NCBI will notify the Sponsored Program Services (SPS) Signing Official (SO) of the request.
    3. SPS will secure signature(s) on the NCBI application and the Data Use Certification (DUC) Agreement.
    4. The SO will approve the request in the NCBI system.
    Use of the data set is authorized for one year and must be renewed annually. Renewals also follow the above process. Please contact Sponsored Program Services with any questions about this process.

    NIDDK Specimen and Data Repository

    National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) Central Repository

    The NIDDK Central Repository enables scientists to test new hypotheses without the need to collect any new data or biospecimens and provides the opportunity to pool data across several studies to increase the power of statistical analyses. In addition, most NIDDK-funded studies are collecting genetic biospecimens and carrying out high-throughput genotyping, making it possible for other scientists to use Central Repository resources to match genotypes to phenotypes and to perform informative genetic analyses.

    Data access is controlled:

    Summary-level data are open.

    Credentialed users must apply for access to individual-level data.

    Data Request Instructions

    National Institute of Mental Health Data Archive (NDA)

    The National Institute of Mental Health Data Archive (NDA) makes available human subjects data collected from hundreds of research projects across many scientific domains. NDA provides infrastructure for sharing research data, tools, methods, and analyses, enabling collaborative science and discovery. Data access is mixed: summary data are available to all, while de-identified human subjects data, harmonized to a common standard, are available to qualified researchers.
    NDA access portal

    International Bio-repositories (UK Biobank)

    UK Biobank is a uniquely powerful biomedical database that can be accessed globally by approved researchers. It holds de-identified data from half a million UK Biobank participants, enabling new discoveries to improve public health.

    The UK Biobank Data Showcase summarizes all the information gathered by UK Biobank on its 500,000 participants and is available to explore. The showcase contains background information on how these data were collected and notes about future collections.

    Whether you are preparing an application or already accessing UK Biobank data, keep up to date by checking the notes and additional resources provided with the categories and data-fields. The user guide can be found here:

    National Institute of Mental Health NRGR (NIMH Repository and Genomics Resource)

    The National Institute of Mental Health Repository and Genomics Resource (NRGR) allows users to search phenotypic and genetic data collections by disorder or study, including stem-cell data.

    Principal Investigators (PIs) can request access for three years. Once the access request expires, the PI may not retain any raw data files. To maintain access to distributions, the PI must submit a renewal request.

    Allow at least four weeks for NIMH to review your application. For questions regarding the application process, please contact the NIMH Genomics Resources Support Team at

    The NIH also provides a searchable list of more than 100 affiliated repositories containing scientific data that can be accessed by researchers, and more than 25 containing genomic data. The list provides links to information about how to request access to the datasets.