RAMPAGE (RNA Annotation and Mapping of Promoters for the Analysis of Gene Expression) is a sequencing approach designed to identify transcription start sites (TSSs) at base-pair resolution, quantify their expression, and characterize their transcripts. The assay uses direct cDNA evidence to link specific genes and their regulatory TSSs [1].
Updated June 2017
The RAMPAGE pipeline was developed as part of the ENCODE Uniform Processing Pipelines series. The full RAMPAGE pipeline code is freely available on GitHub and can be run on DNAnexus (link requires account creation) at their current pricing.
The ENCODE-developed pipeline for RAMPAGE assays is also used for the analysis of CAGE (Cap Analysis Gene Expression), and can process libraries generated using rRNA-depleted total RNA >200 nucleotides in size. The CAGE method is intended to provide information on the 5' end of mRNA and, by extension, on TSSs; RAMPAGE is an improvement on the CAGE method [1].
View the current instance of this pipeline
|File format||Information contained in file||Description||Notes|
|fastq||reads||Paired-end, g-zipped DNA-sequencing reads||Reads must meet the criteria outlined under the Uniform Processing Pipeline Restrictions.|
|fastq||control reads||A Bismark-transformed, Bowtie-indexed genome|
|tar||genome index||G-zipped STAR genome index|
|View RAMPAGE library structure overview.|
Information contained in file
Produced by mapping reads to the genome
|bigWig||signal||Signals are generated both for unique reads and for unique+multimapping reads.||If the data are stranded, unique and unique+multimapping signals are produced for each strand (minus and plus). If the data are unstranded, signals are created without regard to strand.|
|bed tss_peak, bigBed tss_peak, gff||transcription start sites (TSSs)||Raw peak files for each replicate|
|bed idr_peak, bigBed idr_peak||consensus transcription start sites (TSSs)||IDR comparison of TSSs generated from individual replicates.||Irreproducible Discovery Rate (IDR): compares two peak files (bed), typically from a pair of replicates of the same experiment, allowing validation of the experimental methods and reducing noise in the final results.|
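The signal row of the table above implies a fixed set of bigWig tracks per replicate: four for stranded data and two for unstranded data. The sketch below only enumerates those combinations. The function name and the track labels are hypothetical and do not reflect the pipeline's actual file naming.

```python
def expected_signal_tracks(stranded: bool) -> list[str]:
    """List the signal tracks described in the outputs table.

    Labels are illustrative only; the real pipeline's file names are
    not specified in this document.
    """
    read_sets = ["unique", "unique+multimapping"]
    if stranded:
        # Stranded data: one track per read set and per strand.
        strands = ["minus", "plus"]
        return [f"{reads} reads, {strand} strand"
                for reads in read_sets
                for strand in strands]
    # Unstranded data: one track per read set, strand ignored.
    return [f"{reads} reads, unstranded" for reads in read_sets]


if __name__ == "__main__":
    for track in expected_signal_tracks(stranded=True):
        print(track)
```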
Quality control metrics are also generated, comparing two TSS quantification files and calculating the Mean Absolute Deviation and correlations.
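As a rough illustration of this kind of comparison, the sketch below computes Pearson and Spearman correlations and a mean absolute deviation for two replicate quantification vectors. It is not the pipeline's QC code. In particular, the deviation is computed here on log2 ratios with a pseudocount, which is an assumption; the document does not give the pipeline's exact definition. The replicate values are made up for the example.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(vx * vy)


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def mean_abs_deviation_of_log_ratios(x, y, pseudocount=1.0):
    """Mean absolute deviation of per-TSS log2 ratios around their mean.

    The log-ratio form and the pseudocount are assumptions made for
    this sketch, not the pipeline's documented definition.
    """
    ratios = [math.log2((a + pseudocount) / (b + pseudocount))
              for a, b in zip(x, y)]
    mean = sum(ratios) / len(ratios)
    return sum(abs(r - mean) for r in ratios) / len(ratios)


if __name__ == "__main__":
    # Made-up expression values for five TSSs in two replicates.
    rep1 = [10.0, 200.0, 35.0, 0.0, 80.0]
    rep2 = [12.0, 180.0, 40.0, 1.0, 75.0]
    print("Pearson: ", pearson(rep1, rep2))
    print("Spearman:", spearman(rep1, rep2))
    print("MAD:     ", mean_abs_deviation_of_log_ratios(rep1, rep2))
```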
These pipelines require both assembly information for the species of interest and a gene reference. Each of the main programs (TopHat, STAR, and RSEM) creates an index for use in subsequent steps. More information on the use of RSEM is available here.
Links and Publications
1. Batut, Philippe et al. “High-fidelity promoter profiling reveals widespread alternative promoter usage and transposon-driven developmental gene expression.” Genome Research 23.1 (2013): 169–180. PMC. Web. 9 Feb. 2016.
Uniform Processing Pipeline Restrictions
- Sequencing must be paired-end.
- The read length should be a minimum of 50 base pairs.
- All Illumina platforms are supported for use in the uniform pipeline; colorspace (SOLiD) reads are not supported.
- Barcodes and spike-ins, if present in the fastq, must be indicated.
- Each RAMPAGE or CAGE experiment must have a corresponding RNA-seq experiment as a control.
- Library insert size range must be indicated.
- Alignment files are mapped to either the GRCh38 or mm10 sequences.
- Gene and transcript quantification files are annotated to either GENCODE V24 or M4.
- For IDR comparison, the experiment must have two and only two replicates.
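The restrictions above can be read as a pre-submission checklist. A minimal sketch of such a check follows, assuming experiment metadata is available as a plain dictionary. The key names (`paired_end`, `read_length`, and so on) are hypothetical and are not ENCODE metadata fields; barcode and spike-in declarations are left out because the list does not say how they are recorded.

```python
SUPPORTED_ASSEMBLIES = {"GRCh38", "mm10"}
MIN_READ_LENGTH = 50  # base pairs


def check_restrictions(exp: dict) -> list[str]:
    """Return a list of restriction violations (empty if none found).

    All dictionary keys are illustrative, not real ENCODE field names.
    """
    problems = []
    if not exp.get("paired_end"):
        problems.append("sequencing must be paired-end")
    if exp.get("read_length", 0) < MIN_READ_LENGTH:
        problems.append(f"read length below {MIN_READ_LENGTH} bp")
    if exp.get("colorspace"):
        problems.append("colorspace (SOLiD) reads are not supported")
    if not exp.get("has_rnaseq_control"):
        problems.append("missing corresponding RNA-seq control experiment")
    if exp.get("insert_size_range") is None:
        problems.append("library insert size range not indicated")
    if exp.get("assembly") not in SUPPORTED_ASSEMBLIES:
        problems.append("assembly must be GRCh38 or mm10")
    if exp.get("run_idr") and exp.get("replicates") != 2:
        problems.append("IDR comparison needs exactly two replicates")
    return problems


if __name__ == "__main__":
    example = {
        "paired_end": True,
        "read_length": 101,
        "colorspace": False,
        "has_rnaseq_control": True,
        "insert_size_range": (200, 500),
        "assembly": "GRCh38",
        "run_idr": True,
        "replicates": 2,
    }
    print(check_restrictions(example) or "no violations found")
```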
Current Standards (RAMPAGE)
Experimental guidelines for RAMPAGE experiments can be found here.
- Experiments should have at least two replicates.
- Each replicate should have 20 million aligned reads. Older projects aimed for 10 million aligned reads.
- Each RAMPAGE experiment should have a corresponding RNA-seq experiment as a control.
- Replicate concordance: the gene-level quantification should have a Spearman correlation of >0.9 between isogenic replicates and >0.8 between anisogenic replicates.
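The numeric standards above lend themselves to a simple automated check. The sketch below assumes that the aligned read counts and the Spearman correlation have already been computed elsewhere; the `Replicate` class and the function name are hypothetical, and only the thresholds come from the list above.

```python
from dataclasses import dataclass

MIN_ALIGNED_READS = 20_000_000
# Spearman thresholds from the standards; both are strict (">").
MIN_SPEARMAN = {"isogenic": 0.9, "anisogenic": 0.8}


@dataclass
class Replicate:
    name: str
    aligned_reads: int


def check_standards(replicates, spearman_rho, replicate_type):
    """Return a list of unmet standards (empty if all are met)."""
    problems = []
    if len(replicates) < 2:
        problems.append("fewer than two replicates")
    for rep in replicates:
        if rep.aligned_reads < MIN_ALIGNED_READS:
            problems.append(f"{rep.name}: fewer than 20 million aligned reads")
    threshold = MIN_SPEARMAN[replicate_type]
    if not spearman_rho > threshold:
        problems.append(
            f"Spearman correlation {spearman_rho} is not above "
            f"{threshold} for {replicate_type} replicates"
        )
    return problems


if __name__ == "__main__":
    reps = [Replicate("rep1", 24_500_000), Replicate("rep2", 18_200_000)]
    for problem in check_standards(reps, 0.93, "isogenic"):
        print(problem)
```

Passing the replicate type explicitly keeps the two thresholds separate, so an anisogenic pair is not held to the stricter isogenic cutoff.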