DNA HIVE

DNA HIVE is an environment for enhancing collaboration capabilities with the Food and Drug Administration (FDA). We provide a platform for easy and secure data transfer for collaborative research projects between outside entities and FDA. DNA HIVE also hosts the most recent version of Codon Usage Tables (HIVE-CUTs), a new database for accessing codon usage information. The HIVE-CUTs databases are available to the public and regularly updated.

Access HIVE-CUTs

An account is not required to use HIVE-CUTs. To request an account, or to log in if you already have credentials, use the links below:

Request Account | Log In

Research Data Transfer

DNA HIVE provides a platform for easy and secure research-only data transfer for collaborative projects between you and FDA. Collaborators with an account can log in and upload data. All uploaded data can be managed through a file-explorer-like interface from any computer with internet access.

To start the process, create a temporary account by emailing the HIVE Team at HIVE@fda.hhs.gov or by clicking ‘Request Account’ below and filling out the form.

Request Account | PDF Tutorial

HIVE-CUTs

HIVE Codon Usage Tables (HIVE-CUTs) is a collaborative set of projects between Dr. Kimchi-Sarfaty's research group at the FDA and FDA-HIVE. Currently, it consists of CoCoPUTs, TissueCoCoPUTs, CancerCoCoPUTs, SARS-CoV-2 CoCoPUTs and EmbryoCoCoPUTs, which are databases for accessing codon, codon pair and dinucleotide usage information for species, human tissues, human primary tumors, viral lineages and genes, and various mouse strains, respectively.

Until recently, codon, codon pair and dinucleotide usage resources were very limited and outdated. Given the massive recent growth of the GenBank, RefSeq, TCGA and GISAID databases, we have created new codon, codon pair and dinucleotide usage tables that use the most up-to-date sequence information and can be downloaded by the user. These tables have several applications, including recombinant protein engineering, gene therapy design, vaccine development, individualized therapies, and genetic evolution studies.

Figure: HIVE-CUTs platform, previewing tabs for the genomic codon pair observed/expected ratio heatmap, the liver codon usage table, and junction dinucleotide frequencies.

Codon and codon pair usage tables (CoCoPUTs) contain genome-wide data of all genomes with published sequences in GenBank.

Tissue-specific codon and codon pair usage tables (TissueCoCoPUTs) contain transcriptome-derived data from 52 human tissues.

Tumor-specific codon and codon pair usage tables (CancerCoCoPUTs) are derived from genomic codon usage information and primary tumor-specific transcriptomic data. The tables presented here represent 32 human primary tumor types / subtypes and their respective normal tissues.

SARS-CoV-2 codon and codon pair usage tables (SARS-CoV-2 CoCoPUTs) contain data from 3,201 lineage representatives that are split into 12 genes. Mutational data and ensemble free energy (EFE) data are also provided for this database. All SARS-CoV-2 data was processed from GISAID sequences.

Embryo codon and codon pair usage tables (EmbryoCoCoPUTs) contain transcriptomic weighted data spanning 4 mouse strains, 16 tissue categories, and 13 embryonic stages.

CoCoPUTs, TissueCoCoPUTs, CancerCoCoPUTs, SARS-CoV-2 CoCoPUTs and EmbryoCoCoPUTs feature graph and heat map visualizations for investigating codon and codon pair usage, as well as dinucleotide and junction dinucleotide frequencies, with side-by-side comparisons between organisms, human tissues, viral lineages/genes or embryonic strains/tissues/stages. Other tools, such as phylogenetic trees and downloadable raw data, are also available. In addition, GC% content and effective number of codons (ENC) are provided for all databases.
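For readers unfamiliar with the quantities these tables report, the following Python sketch shows how codon counts, codon pair counts, and GC% could be derived from a single coding sequence. It is an illustration only and not the HIVE-CUTs pipeline: the function names and the toy sequence are hypothetical, frequencies are expressed per thousand codons as a common convention, and ENC and observed/expected codon pair ratios are not computed here.

```python
from collections import Counter


def codon_counts(cds: str) -> Counter:
    """Count in-frame codons in a coding sequence (hypothetical helper)."""
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return Counter(seq[i:i + 3] for i in range(0, usable, 3))


def codon_pair_counts(cds: str) -> Counter:
    """Count adjacent in-frame codon pairs, stepping one codon at a time."""
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    return Counter(seq[i:i + 6] for i in range(0, usable - 3, 3))


def per_thousand(counts: Counter) -> dict:
    """Convert raw counts to frequencies per 1000 observations."""
    total = sum(counts.values())
    return {k: 1000 * n / total for k, n in counts.items()} if total else {}


def gc_percent(cds: str) -> float:
    """Percentage of G and C bases in the sequence."""
    seq = cds.upper()
    return 100 * sum(seq.count(b) for b in "GC") / len(seq) if seq else 0.0


# Made-up toy sequence (ATG GCC GCC AAA TAA), not taken from any database.
example = "ATGGCCGCCAAATAA"
counts = codon_counts(example)
print(per_thousand(counts))
print(codon_pair_counts(example))
print(round(gc_percent(example), 2))
```

A genome-wide or transcriptome-weighted table would aggregate such counts over many coding sequences, which this single-sequence sketch does not attempt.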

The HIVE-CUTs databases are available to the public and regularly updated. Notice: HIVE-CUTs works in Edge, Firefox, Chrome, and Safari.

CoCoPUTs | TissueCoCoPUTs | CancerCoCoPUTs | SARS-CoV-2 CoCoPUTs | EmbryoCoCoPUTs

Video Tutorial

Database info: Access more information about the database

Publications:

A New and Updated Resource for codon usage tables

Codon and Codon-Pair Usage Tables (CoCoPUTs): Facilitating Genetic Variation Analyses and Recombinant Gene Design

TissueCoCoPUTs: Novel Human Tissue-Specific Codon and Codon-Pair Usage Tables Based on Differential Tissue Gene Expression

CancerCoCoPUTs: Distinct signatures of codon and codon pair usage in 32 primary tumor types in the novel database CancerCoCoPUTs for cancer-specific codon usage

The Kazusa codon usage database, CoCoPUTs, and the value of up-to-date codon usage statistics

Wikipedia: Codon Usage Bias | Synonymous Mutations

For questions about HIVE-CUTs, please contact Dr. Chava Kimchi-Sarfaty: Chava.Kimchi-Sarfaty@fda.hhs.gov.

About HIVE:

DNA HIVE

There are currently several HIVE instances in both the public and private domains. DNA HIVE is the public instance at the U.S. Food and Drug Administration (FDA). It serves as a platform for easy and secure data transfer for collaborative research projects between outside entities and FDA, but it lacks the capability to run computational analysis. To request an account, fill out the account request form. Once the account is set up, please review the File Uploading Tutorial. GWU HIVE is another instance, hosted at The George Washington University (GWU), which supports bioinformatics analysis and lets any scientist around the world access its resources through the web portal.

FDA HIVE

Compute

Store

Share

The High-performance Integrated Virtual Environment (HIVE) at the Food and Drug Administration (FDA) consists of a high-performance computing cluster with petabyte-scale, high-availability storage; a sophisticated web-based genomics analysis platform; support for machine learning in Python and R; and a team of expert bioinformaticians, computer scientists, and software developers. The HIVE team works with FDA researchers and reviewers to perform complex data analysis on Next-Generation Sequencing (NGS) experiments, add custom analytics and pipelines to the HIVE platform, train and support FDA researchers and reviewers, and assist with big data transfer and storage. HIVE is maintained and operated by the Center for Biologics Evaluation and Research (CBER) but supports projects across multiple centers in FDA. Current projects include genome-wide association studies, gene expression analysis, predictive models using machine learning, and microbiome diversity analysis. Additionally, HIVE supports the review of regulatory submissions with NGS protocols or data. BioCompute is an FDA-funded project to establish a framework for community-based development of standards for harmonization of High-throughput Sequencing (HTS), standardization of data formats, promotion of interoperability, and bioinformatics verification protocols. Thus, HIVE supports FDA's overall goals and objectives in areas where information technology requires specialized bioinformatics knowledge and supercomputer-strength computational power.

HIVE Contact Information

Please contact the HIVE team at HIVE@fda.hhs.gov if you have any questions regarding HIVE or need help with access, data transfer, or any account-related issue.