ENCODE Software
All software used or developed by the ENCODE Consortium
- sQTLseekeR — sQTLseekeR is a package to detect splicing QTLs (sQTLs), which are variants associated with changes in the splicing pattern of a gene. In sQTLseekeR, splicing patterns are modeled by the relative expression of the transcripts of a gene. The most recent version of sQTLseekeR can be used to detect genetic variants associated with any multivariate phenotype. Software type: variant annotation
- AnnotBoost — AnnotBoost is a gradient boosting-based framework to impute and denoise Mendelian disease-derived pathogenicity scores to improve their informativeness for common disease, as described in our manuscript “Improving the informativeness of Mendelian disease-derived pathogenicity scores for common disease”. Software type: variant annotation
- vcf2diploid — Creates phased diploid genomes from a VCF file by integrating its variants into a reference genome. Software type: variant annotation
- LongRanger — Long Ranger is a set of analysis pipelines that process Chromium sequencing output to align reads and to call and phase SNPs, indels, and structural variants. These pipelines combine Chromium-specific algorithms with widely used components such as BWA, Freebayes, and GATK. Output is delivered in standard BAM, VCF, and BEDPE formats that are augmented with long-range information. Software type: aligner, variant annotation
- GATK — The Genome Analysis Toolkit, or GATK, is a software package for analysis of high-throughput sequencing data, developed by the Data Science and Data Engineering group at the Broad Institute. The toolkit offers a wide variety of tools, with a primary focus on variant discovery and genotyping as well as a strong emphasis on data quality assurance. Its robust architecture, powerful processing engine and high-performance computing features make it capable of taking on projects of any size. Software type: variant annotation
- UES (Uncovering Enrichment through Simulation) — The UES (Uncovering Enrichment through Simulation) algorithm was written to help interpret results from genome-wide association studies (GWAS) using publicly available datasets. Software type: variant annotation
- WASP — WASP is a software package for two related tasks: (1) correcting allelic bias in mapped sequencing reads and (2) identifying molecular quantitative trait loci (QTLs) using next-generation sequencing data (e.g. gene expression QTLs or histone mark QTLs). The WASP mapper works with any read mapping pipeline that outputs BAM or SAM format. WASP identifies molecular QTLs using a statistical test that combines information about the total depth and allelic imbalance of mapped reads. WASP can call QTLs with very small sample sizes (as few as 10) compared to traditional QTL mapping approaches. Software type: aligner, variant annotation
- RegulomeDB — Identifies DNA features and regulatory elements in non-coding regions of the human genome. One can enter dbSNP IDs, BED files, VCF files, or GFF3 files. A score is returned assessing the evidence for regulatory potential. Clicking on the score reveals the data supporting the inference, by data type and cell type. One can also click on hyperlinks to see the SNP or the region in the UCSC browser, Ensembl browser, and dbSNP. Software type: database, variant annotation
- HaploReg — Explores annotations of the noncoding genome at variants on haplotype blocks, such as candidate regulatory SNPs at disease-associated loci. Under the Set Options tab, set the Browse ENCODE button to "on" and select an LD threshold and reference population. Under the Build Query tab, enter a SNP (rsXXXXX), a set of SNPs, or a genomic region, or select a GWAS from the drop-down menu. HaploReg returns SNPs in LD with the query SNPs and their frequency in 4 populations from 1000 Genomes Phase 1. It also reports the evidence ENCODE has found for regulatory protein binding (mouse over to see the protein names), chromatin structure (mouse over to see the cell types with DNase hypersensitivity), the chromatin state of the region (the chromatin state can predict an enhancer or promoter), and putative transcription factor binding motifs that are altered by the variant. Clicking on the SNP name hyperlink reveals further details, including cell type metadata and the mechanism of disruption/creation of TF binding regulatory motifs (showing the PWM matched and its alignment to the local sequence context). SNPs are also intersected with cross-species conserved elements, chromatin states from the Roadmap Epigenomics Consortium, and lead eQTLs from the GTEx Project browser. Software type: database, variant annotation
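
The vcf2diploid entry above describes building a phased diploid genome by integrating variants into a reference. The idea can be illustrated with a minimal Python sketch. This is not vcf2diploid's implementation or interface: the function name `build_haplotypes`, the tuple layout for variants, and the restriction to single-nucleotide variants are all assumptions made for illustration, and real VCF parsing is left out.

```python
from typing import List, Tuple

# Hypothetical record layout (not a real VCF parser):
# (1-based position, REF allele, ALT allele, phased genotype such as "0|1")
Variant = Tuple[int, str, str, str]


def build_haplotypes(reference: str, variants: List[Variant]) -> Tuple[str, str]:
    """Apply phased SNVs to a reference and return (haplotype_1, haplotype_2)."""
    haplotypes = [list(reference), list(reference)]
    for pos, ref, alt, genotype in variants:
        if "|" not in genotype:
            continue  # unphased call: it cannot be assigned to one haplotype
        if len(ref) != 1 or len(alt) != 1:
            continue  # indels shift coordinates and are out of scope here
        index = pos - 1  # VCF positions are 1-based
        if reference[index] != ref:
            raise ValueError(f"REF allele mismatch at position {pos}")
        # The left side of "a|b" goes to haplotype 1, the right side to haplotype 2.
        for haplotype, allele in zip(haplotypes, genotype.split("|")):
            if allele == "1":
                haplotype[index] = alt
    return "".join(haplotypes[0]), "".join(haplotypes[1])


reference = "ACGTACGT"
variants = [
    (2, "C", "T", "0|1"),  # ALT on haplotype 2 only
    (5, "A", "G", "1|1"),  # ALT on both haplotypes
    (7, "G", "A", "1|0"),  # ALT on haplotype 1 only
]
hap1, hap2 = build_haplotypes(reference, variants)
print(hap1)  # ACGTGCAT
print(hap2)  # ATGTGCGT
```

The sketch skips unphased genotypes and indels so that every position in each haplotype still lines up with the reference; a tool that handles insertions and deletions also has to track the coordinate shift between the reference and each haplotype.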