Project Overview

About the ENCODE Consortium

The Encyclopedia of DNA Elements (ENCODE) Consortium is an international collaboration of research groups funded by the National Human Genome Research Institute (NHGRI). The goal of ENCODE is to build a comprehensive parts list of functional elements in the human genome, including elements that act at the protein and RNA levels, and regulatory elements that control cells and circumstances in which a gene is active.

ENCODE investigators employ a variety of assays and methods to identify functional elements. The discovery and annotation of gene elements is accomplished primarily by sequencing a diverse range of RNA sources, comparative genomics, integrative bioinformatic methods, and human curation. Regulatory elements are typically investigated through DNA hypersensitivity assays, assays of DNA methylation, and immunoprecipitation (IP) of proteins that interact with DNA and RNA, i.e., modified histones, transcription factors, chromatin regulators, and RNA-binding proteins, followed by sequencing.

Current consortium members

The current ENCODE Consortium is composed primarily of scientists funded under Requests for Application (RFAs) released by NHGRI in 2011. NHGRI provides a list of current ENCODE projects and participants. The ENCODE External Consultants Panel oversees the activities of the Consortium and provides advice and feedback on the Consortium's goals, progress and membership.

The consortium is open to any investigator willing to abide by the criteria for participation established for the ENCODE Project by NHGRI and has brought in additional participants when appropriate. Those interested in applying for membership to the ENCODE Consortium should review the ENCODE Project criteria for participation and contact the ENCODE program directors at NHGRI, Elise Feingold, Ph.D., Mike Pazin, Ph.D., or Daniel Gilchrist, Ph.D. 



Previous consortium members

The production scale-effort of the ENCODE project began in 2007. The data generated by projects and participants in the production scale were submitted, processed, and released by the UCSC Genome Browser.

All data generated by the ENCODE Consortium is also available at and can be easily browsed and searched. Details of data organization and release prior to 2014 are found at the UCSC Genome Browser ENCODE data release log.

Pilot phase

In September 2003, the NHGRI introduced the ENCODE project to facilitate the identification and analysis of the complete set of functional elements in the human genome sequence. During the initial pilot and technology development phases of the project, 44 regions—approximately 1% of the human genome—were targeted for analysis using a variety of experimental and computational methods with the aim of assembling a comprehensive encyclopedia of the functional elements in these regions, showing their identity and precise location. The pilot project established protocols for scaling up to full-genome coverage and produced a wealth of data, elucidating elements such as protein-coding genes, transcription units, protein binding sites, conserved DNA elements, features of chromatin assembly and modification, and single nucleotide polymorphisms.

The data generated by projects and participants in the pilot phase were submitted, processed, and released by the UCSC Genome Browser. The data from the pilot phase are still accessible from the pilot phase repository at UCSC.

Related Projects

The consortium is involved in additional ventures whose goals run parallel to those of the ENCODE project. 

modENCODE: The modENCODE consortium identified various elements in the genomes of C. elegans and Drosophila, including transcriptomes, transcription factor binding sites, histone-modifications and the occupancy of histone variants and histone modifying-factors. The consortium completed this work in 2012.

modERN: The model organism encyclopedia of regulatory networks (modERN) project focuses on identification of additional transcription factor binding sites in C. elegans and Drosophila. The goal of the project is to comprehensively map the binding sites for as many factors as possible. This project is ongoing.

REMC: The Roadmap Epigenomics Mapping Consortium (ROADMAP) identified DNA methylation, histone modifications, chromatin accessibility, and small RNA transcripts in primary human tissues.

GGR: The Genomics of Gene Regulation is a newly funded NIH project that will be hosting its data at this site.