    Data Collection and Repositories

    NIH requires that data be shared as soon as possible, and no later than the publication of findings or the end of the award, whichever comes first. Researchers must specify a repository in their Data Management and Sharing Plan (DMSP), and may amend the plan to reflect any later change of repository. To select a repository for sharing data, follow these steps:

    1. Determine whether your funding Institute or Center mandates the use of a specific repository. Use this tool to look up your Institute or Center and the policies and procedures associated with its mandated repositories: NIH Institute and Center Data Sharing Policies
    2. If your Institute or Center does not mandate a specific repository, use the National Library of Medicine Search tool to locate an appropriate generalist or domain-specific repository. This page offers a searchable list of more than 100 repositories, filterable by Institute or Center, subject area, keyword, and more, with built-in links to repository sites.

    If you are unsure which repository to use, consult your NIH Program Official or ask colleagues doing similar research; they may already have used a repository or have insights that can help.
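
    The sharing deadline stated earlier in this section ("publication of findings or end of award, whichever comes first") can be checked with a short calculation. The sketch below is only an illustration: the function name and the example dates are hypothetical, and it assumes each milestone is known as a single calendar date.

```python
from datetime import date
from typing import Optional


def sharing_deadline(award_end: date, publication: Optional[date] = None) -> date:
    """Return the latest date by which data must be shared.

    Rule from the text: no later than the publication of findings or the
    end of the award, whichever comes first.
    """
    if publication is None:
        # No publication date is known yet, so the award end date applies.
        return award_end
    return min(publication, award_end)


# Hypothetical dates, for illustration only.
print(sharing_deadline(date(2026, 6, 30), date(2025, 11, 15)))  # 2025-11-15
print(sharing_deadline(date(2026, 6, 30)))                      # 2026-06-30
```

    Because NIH asks for sharing "as soon as possible", the returned date is an upper bound, not a target.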

    Data collection and storage options at UConn Health

    Data should be housed in a secure location prior to sharing; consult your department head, department administrator, or colleagues about available options.

    The UConn Health High Performance Computing Facility provides a variety of high-performance computing, cloud computing, and customized servers and services to the UConn Health research community and its collaborators. The facility offers access to over 300 TeraFLOPS of compute power, 10,000 processor cores, and over 14 PB of storage, housed in a secure, state-of-the-art data center supported 24/7 by a dedicated professional staff. Presently, these resources are available to UConn researchers free of charge.

    In addition, the following tools can be used by researchers for data collection.

    REDCap is an electronic data collection tool with a user-friendly interface that allows researchers to build and manage online surveys and databases. All REDCap projects are subject to fees: $150 per project for the first year and $75 for each year thereafter.
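
    For budgeting purposes, the REDCap fee schedule above works out to a simple formula. The sketch below is a rough estimate only: the function name is hypothetical, and it assumes fees are charged per whole project year and that the quoted rates stay unchanged over the life of the project.

```python
FIRST_YEAR_FEE = 150  # USD per project, first year (from the fee schedule above)
RENEWAL_FEE = 75      # USD per project, each later year


def redcap_project_cost(years: int) -> int:
    """Estimate the total REDCap fee for one project over a number of years."""
    if years < 1:
        raise ValueError("years must be at least 1")
    return FIRST_YEAR_FEE + RENEWAL_FEE * (years - 1)


print(redcap_project_cost(1))  # 150
print(redcap_project_cost(5))  # 450
```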

    LabMaC (Lab Management Core) provides a service to centralize all lab management tasks, including the metadata management required by NIH's mandate. While LabMaC meets NIH's requirements, its goal is to facilitate rigorous biomedical research and enhance reproducibility. The videos below provide more information about LabMaC.

    Introduction to LabMaC (08:29 mins)

    LabMaC Information Session - April 3rd, 2024 (41:56 mins)

    Data Sharing: Considerations for Selecting a Repository

    Repositories often have their own data organization, formatting, and metadata requirements, and these standards differ from one repository to another. Choosing a repository early lets you collect and document your data in a way that aligns with its standards, increasing the likelihood that your data will be accepted and usable by others.

    Many funding agencies and journals require researchers to deposit data in specific repositories as a condition of funding or publication. By selecting a repository in advance, you can ensure that your research complies with these policies and avoid delays in publishing or securing funding.


    Desirable Characteristics of Data Repositories:
     
    • Unique persistent identifiers
    • Long-term sustainability
    • Curation and quality assurance
    • Broad and measured reuse
    • Free and easy access
    • Metadata
    • Confidentiality
    • Retention policy
    • Clear user guidance
    • Security and integrity
    • Provenance (origin, source, background)

    Generalist Repositories

    Many NIH Institutes and Centers are associated with domain-specific repositories. When researchers cannot locate a repository for their discipline or type of data, a generalist repository can be a useful place to share data.

    The links below provide detailed information about select generalist repositories from FAIRsharing.org, a curated, informative, and educational resource on data and metadata standards, inter-related databases, and data policies. Additional search options are available on the site for locating other resources.


    figshare is a subject-agnostic repository for many different types of digital objects, and it is free for researchers to use. Data can be submitted to the central figshare repository, or to institutional repositories that run the figshare software locally, e.g. at universities and publishers.

    Dryad Digital Repository is an open-source, community-led data curation, publishing, and preservation platform for CC0 publicly available research data. Dryad has a long-term data preservation strategy with storage in the US and EU. Costs are covered by institutional, publisher, and funder members; otherwise, authors pay a one-time fee of $120 to cover the cost of curation and preservation.

    Harvard Dataverse Repository is a research data repository running on the open-source Dataverse software. The repository is fully open to the public, allows upload and browsing of data from all fields of research, and is free for all researchers worldwide.

    The Open Science Framework (OSF) is a free, open repository and platform that enables collaboration and supports the entire research lifecycle: planning, execution, reporting, archiving, and discovery. It is 100% free to researchers, open source, and intended for use in all domain areas.

    Mendeley Data is a multidisciplinary, free-to-use open repository specialized for research data. Files of any format can be uploaded and shared with the research community following the FAIR data principles, up to a maximum of 10 GB per dataset.

    Synapse is a collaborative research platform that allows individuals and teams to share, track, describe, and discuss their data, analyses, and other content in projects. Synapse also provides mechanisms for adding and retrieving data, analyses, and their respective descriptions.

    Vivli Data Repository provides a global data-sharing and analytics platform serving all elements of the international research community. Focused on sharing individual participant-level data from completed clinical trials, Vivli provides managed access to human subject clinical research data.

    Zenodo is a generalist research data repository built and developed by OpenAIRE and CERN. Zenodo helps researchers receive credit by making research results citable; citation information is also passed to DataCite and on to scholarly aggregators. Restricted and closed content is also supported, and the repository is free for researchers for datasets under 50 GB.