ENCODE Software
All software used or developed by the ENCODE Consortium
- GraphReg — GraphReg (Chromatin interaction aware gene regulatory modeling with graph attention networks) is a graph neural network-based gene regulation model that integrates DNA sequence, 1D epigenomic data (such as chromatin accessibility and histone modifications), and 3D chromatin conformation data (such as Hi-C, HiChIP, Micro-C, HiCAR) to predict gene expression.
- HiCDCPlus — The HiCDCPlus package provides methods to determine significant and differential chromatin interactions using a negative binomial generalized linear model, as well as implementations of TopDom to call topologically associating domains (TADs) and of the Juicer eigenvector to find A/B compartments. This vignette explains the use of the package and demonstrates typical workflows on Hi-C and HiChIP data.
- scPOST — Simulation of single-cell datasets for power analyses that estimate power to detect cell state frequency shifts between conditions (e.g. an expansion of a cell state in disease vs. healthy), as described in our manuscript “Maximizing statistical power to detect clinically associated cell states with scPOST”. Software type: other
- cdr3-QTL — We tested associations between HLA genotypes and TCR-CDR3 amino acid compositions. We treated the amino acid composition of CDR3 as a quantitative trait, and tested its association with HLA genotype; we call this CDR3 quantitative trait loci analysis (cdr3-QTL), as described in our manuscript “HLA autoimmune risk alleles restrict the hypervariable region of T cell receptors”. Software type: other
- Imperio — This software includes (i) DeepBoost, a gradient boosting method for constructing boosted deep learning annotations by integrating deep learning allelic-effect annotations with fine-mapped SNPs; (ii) tools to combine these deep learning annotations with SNP-to-gene (S2G) linking strategies and relevant gene sets; and (iii) Imperio, a method for integrating deep learning annotations with S2G strategies to predict gene expression in whole blood and construct allelic-effect annotations based on changes in predicted expression. Applications of these three approaches to blood-related traits are described in our manuscript “Integrative approaches to improve the informativeness of deep learning models for human complex diseases”. Software type: other
- GSSG — GSSG consists of tools to generate enhancer-driven and master-regulator gene scores in blood, and to combine these gene scores with distal and proximal SNP-to-gene (S2G) linking strategies to construct SNP annotations for blood-related traits, as described in our manuscript “Unique contribution of enhancer-driven and master-regulator genes to autoimmune disease revealed using functionally informed SNP-to-gene linking strategies”. Software type: other
- AnnotBoost — AnnotBoost is a gradient boosting-based framework to impute and denoise Mendelian disease-derived pathogenicity scores to improve their informativeness for common disease, as described in our manuscript “Improving the informativeness of Mendelian disease-derived pathogenicity scores for common disease”. Software type: variant annotation
- mountainClimber — mountainClimber is a method for de novo identification of alternative transcript start sites and polyadenylation sites in RNA-seq data. Software type: transcript identification
- Mediated Expression Score Regression (MESC) — MESC is a method for quantifying genetic effects on disease mediated by assayed gene expression levels (Yao et al. 2020 Nat Genet). Software type: quantification
- Stratified LD fourth moments (S-LD4M) — This software implements our Stratified LD 4th moments regression (S-LD4M) method for estimating polygenicity across allele frequencies and functional categories, as described in our manuscript “Polygenicity of complex traits is explained by negative selection”. Software type: quantification
- Ascertained Sequentially Markovian Coalescent (ASMC) — ASMC is a method for inferring pairwise coalescence times, implicating regions under negative selection that are enriched for disease heritability (Palamara et al. 2018 Nat Genet). Software type: other
- Signed LD profile (SLDP) regression — Signed LD profile regression is a method for identifying genome-wide directional effects of signed functional annotations on diseases and complex traits, as described in our manuscript “Detecting genome-wide directional effects of transcription factor binding on polygenic disease risk”. Software type: other