Related Projects

The Consortium is involved in and hosts data for additional ventures whose goals run parallel to those of the ENCODE project:


The Genomics of Gene Regulation (GGR) was an NIH-funded project whose goal was to construct more accurate predictions of gene activity from genomic data. The project completed this work in 2018. The ENCODE Data Coordination Center (DCC) also served as the DCC for the GGR project, so production labs submitted metadata and data directly to the ENCODE Portal as they became available.
GGR data on the ENCODE Portal


The modENCODE consortium identified various elements in the genomes of C. elegans and Drosophila, including transcriptomes, transcription factor binding sites, histone modifications, and the occupancy of histone variants and histone-modifying factors. The consortium completed this work in 2012.
modENCODE data on the ENCODE Portal


The model organism encyclopedia of regulatory networks (modERN) project focuses on identifying additional transcription factor binding sites in C. elegans and Drosophila. The goal of the project is to comprehensively map the binding sites for as many factors as possible.
This project is ongoing.
modERN data on the ENCODE Portal


The Roadmap Epigenomics Mapping Consortium was an NIH-funded project, comparable to ENCODE, that set out to identify DNA methylation, histone modifications, chromatin accessibility, and small RNA transcripts in primary human tissues. The consortium completed this work in 2013. The Roadmap metadata has been fully incorporated into the ENCODE Portal, including the standardization of terms and ontologies. The experimental metadata can now be searched and explored alongside the ENCODE data, and data from both projects are incorporated into the Encyclopedia. Where possible, raw data was imported to the ENCODE Portal and reprocessed using the ENCODE uniform processing pipelines against hg19 and GRCh38. However, raw data for a subset of Roadmap experiments is housed at dbGaP. For these experiments, only de-identified processed data is available to download from the ENCODE Portal. Access to the raw data can be requested at the Costello (phs000791) and Bernstein (phs000610) project pages at dbGaP.
Roadmap experiments on the ENCODE Portal
Roadmap reference epigenomes on the ENCODE Portal (these can be linked to the Roadmap EID via the aliases, e.g. E001 --> roadmap-epigenomics:E001)
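
The alias convention noted above (E001 --> roadmap-epigenomics:E001) can be sketched in a few lines of Python. This is only an illustration: the function name roadmap_alias is hypothetical, and the check that an EID is the letter "E" followed by three digits is an assumption based on the single example given here, not a rule stated by the Portal.

```python
import re

# Alias namespace taken from the example above (E001 --> roadmap-epigenomics:E001).
ALIAS_PREFIX = "roadmap-epigenomics"

# Assumption: a Roadmap EID is "E" followed by exactly three digits, as in E001.
_EID_PATTERN = re.compile(r"E\d{3}")


def roadmap_alias(eid: str) -> str:
    """Build the alias string for a Roadmap EID (hypothetical helper)."""
    normalized = eid.strip().upper()
    if not _EID_PATTERN.fullmatch(normalized):
        raise ValueError(f"not a recognized Roadmap EID: {eid!r}")
    return f"{ALIAS_PREFIX}:{normalized}"


if __name__ == "__main__":
    print(roadmap_alias("E001"))
```

The resulting string is the alias to look for on a Roadmap reference epigenome record.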


The ENCODE Portal also hosts certain collections of data, most of which can be found in the "Data" drop-down menu in the header:


Within ENCODE, the Encyclopedia of RNA Elements (ENCORE) project aims to map all RNA elements recognized by the RNA-binding proteins (RBPs) encoded in the human genome by applying various assays to the HepG2 and K562 cell lines.
ENCORE data


As part of the NIH Common Fund's Enhancing GTEx (eGTEx) project, the ENCODE and Genotype-Tissue Expression (GTEx) consortia have collaborated to deeply profile overlapping tissues from the same four donors.
ENTEx data generated by ENCODE
ENTEx data generated by GTEx (coming soon)


As a member of the International Human Epigenome Consortium (IHEC), ENCODE compiles project data from a single cell or tissue type as Reference Epigenomes, following the IHEC Reference Epigenome standards. ENCODE also works with other IHEC members to determine metadata standards for the IHEC portal, which collects and displays data from all IHEC projects, and to align assay standards and processing pipelines.
ENCODE Reference Epigenomes
IHEC data portal


The ENCODE consortium has coordinated with the Personal Genome Project (PGP) to deeply profile the epigenomic landscape of various cell types from the PGP donor hu43860C.
PGP data on the ENCODE Portal


RegulomeDB is a tool designed to assess non-coding variants based on functional genomic data. RegulomeDB integrates ENCODE data collected by various assays, including ChIP-seq and DNase-seq. It also incorporates DNA binding motifs and quantitative trait loci (QTLs). RegulomeDB aims to help researchers hypothesize functional roles for variants derived from genomic sequencing or GWAS. As ENCODE releases new data from both classic and emerging technologies, RegulomeDB will periodically update both its algorithm and the underlying data.
RegulomeDB portal


The ENCODE consortium has coordinated with the Southeast Stem Cell Consortium (SESCC) to deeply profile the epigenomic state of various cell types differentiated from the human cell line H9.
SESCC data on the ENCODE Portal


The NIH Common Fund's 4D Nucleome program (4DN) aims to understand the principles behind the three-dimensional organization of the nucleus in space and time (the 4th dimension), the role nuclear organization plays in gene expression and cellular function, and how changes in the nuclear organization affect normal development as well as various diseases. 4DN and ENCODE share a common database framework, SnoVault, and collaborate on data standards, uniform processing pipelines, and other software components.
4DN portal

Alternate providers of ENCODE data

ENCODE data is also available through other genomics portals. These resources may include a subset of the entire ENCODE corpus based on the purpose of the resource.

Visualize ENCODE data with other genomic annotations:

Data repositories for ENCODE data:

  • GEO: Processed data are deposited at GEO.
  • SRA and ENA: Raw sequence data are deposited in the sequence read archives.