Software (released): A very efficient implementation of the Maximal Overlap Discrete Wavelet Transform (MODWT). See D. B. Percival and A. T. Walden (2000), Wavelet Methods for Time Series Analysis. Cambridge, England: Cambridge University Press. This is not the usual discrete wavelet transform found in, for example, gsl, but an extended set of algorithms designed to overcome some problems with the usual discrete wavelet transform.
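To illustrate how the MODWT differs from the usual decimated transform, here is a minimal sketch of the pyramid algorithm with the Haar filter. It is not the code of the package listed above; the function name `modwt_haar` is hypothetical, and the Haar filter and circular boundary handling are assumptions chosen for brevity. Unlike the ordinary transform, every level keeps as many coefficients as the input has samples, and the input length need not be a power of two.

```python
def modwt_haar(x, levels):
    """Haar MODWT via the pyramid algorithm (illustrative sketch only).

    Returns (details, smooth): one list of wavelet coefficients per level,
    plus the scaling coefficients of the final level. Nothing is
    downsampled, so every list has len(x) entries.
    """
    n = len(x)
    h = (0.5, -0.5)  # MODWT Haar wavelet (high-pass) filter
    g = (0.5, 0.5)   # MODWT Haar scaling (low-pass) filter
    v = [float(value) for value in x]
    details = []
    for j in range(1, levels + 1):
        step = 2 ** (j - 1)  # filter taps are spaced 2^(j-1) samples apart
        # Circular filtering: indices wrap around the end of the series.
        w = [h[0] * v[t] + h[1] * v[(t - step) % n] for t in range(n)]
        v = [g[0] * v[t] + g[1] * v[(t - step) % n] for t in range(n)]
        details.append(w)
    return details, v


print(modwt_haar([1.0, 3.0], 1))  # ([[-1.0, 1.0]], [2.0, 2.0])
```

The transform preserves energy: the sum of squares of the input equals the sum of squares of all wavelet coefficients plus those of the final scaling coefficients.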
Software (released): The snow package provides support for simple parallel computing on a network of workstations using R. A master R process calls makeCluster to start a cluster of worker processes; the master process then uses functions such as clusterCall and clusterApply to execute R code on the worker processes and collect and return the results on the master. This framework supports many forms of "embarrassingly parallel" computations. Software type: other
Software (released): This is a focused collection of makefiles and scripts for analyzing Illumina sample data. While it probably can't be used directly by someone outside our lab, it might be a useful reference for others. Software type: filtering
Software (released): An R package for screening SNPs for their potential to enhance or disrupt transcription factor binding sites. atSNP accepts as input either SNP ids or the actual coordinates of the SNPs and the alternative alleles. It uses ENCODE motifs and JASPAR motifs to evaluate the regulatory potential of the SNPs; however, it also allows a user-specified set of transcription factor binding sites in the form of position-specific matrices. For each SNP, it outputs the significance of the match to each position-specific matrix with both the reference and the alternative allele, as well as the significance of the change in these match scores. atSNP also provides easy visualization of the SNP impact on the binding site by composite logo plots.
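The idea of comparing motif matches for the reference and alternative alleles can be sketched as follows. This is not atSNP's method or API: the function names, the toy three-column matrix, and the uniform background of 0.25 are all assumptions for illustration, and the sketch reports only raw log-odds scores, not the significance values that atSNP computes.

```python
import math


def best_log_odds(seq, pwm, background=0.25):
    """Best log2-odds score of a position-specific matrix over all windows of seq."""
    width = len(pwm)
    best = -math.inf
    for start in range(len(seq) - width + 1):
        score = sum(
            math.log2(pwm[i][seq[start + i]] / background) for i in range(width)
        )
        best = max(best, score)
    return best


def snp_score_change(ref_seq, snp_index, alt_allele, pwm):
    """Return (ref_score, alt_score, alt_score - ref_score) for one SNP."""
    alt_seq = ref_seq[:snp_index] + alt_allele + ref_seq[snp_index + 1:]
    ref_score = best_log_odds(ref_seq, pwm)
    alt_score = best_log_odds(alt_seq, pwm)
    return ref_score, alt_score, alt_score - ref_score


# Hypothetical toy matrix that prefers the sequence "TGA".
# Probabilities are kept nonzero so the logarithm is always defined.
TOY_PWM = [
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
]

ref, alt, delta = snp_score_change("CCTGACC", 3, "C", TOY_PWM)
print(f"ref={ref:.2f} alt={alt:.2f} change={delta:.2f}")
```

A negative change suggests the alternative allele weakens the match; a positive change suggests it strengthens or creates one.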
Software (released): The bigWig format is for display of dense, continuous data that will be displayed in the Genome Browser as a graph. BigWig files are created initially from wiggle (wig) type files, using the program wigToBigWig. The resulting bigWig files are in an indexed binary format. The main advantage of the bigWig files is that only the portions of the files needed to display a particular region are transferred to UCSC, so for large data sets bigWig is considerably faster than regular wiggle files. Software type: file format conversion
Software (released): Using an intuitive interface, you can 1) identify DNaseI-hypersensitive sites (DHS) within a genomic region of interest, 2) predict the target gene for a DHS of interest, 3) predict the DHS that regulate a gene of interest, 4) identify clusters of similarly regulated DHS that may have related functions, 5) identify enriched motifs for transcription factors that may bind in these similarly regulated DHS, and 6) identify DHS that contain a DNA sequence motif for a transcription factor of interest. The Regulatory Elements Database provides access to roughly 2.8 million DNaseI-hypersensitive sites and their signal in 112 human samples, as well as Affymetrix microarray expression data for the same cell types. Software type: database
Software (released): A database that uncovers the molecular basis of TF binding in the human genome based on regulatory motif analysis of all Transcription Factors (TFs) grouped by family. This allows browsing of all known motifs for each factor, curated from TRANSFAC, JASPAR, and Protein Binding Microarray (PBM) experiments, and their enrichment and instances within corresponding TF binding experiments. It also provides a list of novel regulatory motifs discovered by systematic application of several motif discovery tools (including MEME, MDscan, Weeder, AlignACE) and evaluated based on their enrichment relative to control motifs within TF-bound regions. ENCODE-motifs also provides a genome-wide map of regulatory motif instances in the human genome for both known and novel motifs. Software type: database
Software (released): A wiki-style resource that organizes all the information associated with each transcription factor (TF), including the ChIP-seq peaks, discovered motifs, TF-TF interactions, and the chromatin features (histone modification patterns, DNase I cleavage, and nucleosome positioning) around the ChIP-seq peaks. The resource will be updated as the project proceeds. The Factorbook display of this information is transcription factor anchored and dynamic. Software type: database
Software (released): PIQ is a computational method that models the magnitude and shape of genome-wide DNase profiles to facilitate the identification of transcription factor (TF) binding sites. The input of PIQ is one or more DNase-seq experiments, the genome sequence of the organism assayed, and a list of motifs represented as position weight matrices (PWMs) that describe candidate TF binding sites. PIQ uses machine learning methods to normalize input DNase-seq data and then predicts TF binding by detecting both the shape and magnitude of DNase profiles specific to each TF. The output of PIQ is the probability of occupancy for each candidate binding site in the genome, along with aggregate TF-specific scores (e.g. metrics for TF-specific chromatin opening). Software type: database
Software (released): Identifies DNA features and regulatory elements in non-coding regions of the human genome. One can enter dbSNP IDs, BED files, VCF files, or GFF3 files. A score is returned assessing the evidence for regulatory potential. Clicking on the score reveals the data supporting the inference, by data type and cell type. One can also click on hyperlinks to see the SNP or the region in the UCSC browser, ENSEMBL browser, and dbSNP. Software type: database, variant annotation
Software (released): Explores annotations of the noncoding genome at variants on haplotype blocks, such as candidate regulatory SNPs at disease-associated loci. Under the Set Options tab, set the Browse ENCODE button to "on" and select an LD threshold and reference population. Under the Build Query tab, enter a SNP (rsXXXXX), a set of SNPs, or a genomic region, or select a GWAS from the drop-down menu. HaploReg returns SNPs in LD with the query SNPs and their frequency in 4 populations from 1000 Genomes Phase 1. It also reports the evidence ENCODE has found for regulatory protein binding (mouse over to see the protein names), chromatin structure (mouse over to see the cell types with DNase hypersensitivity), the chromatin state of the region (the chromatin state can predict an enhancer or promoter), and putative transcription factor binding motifs that are altered by the variant. Clicking on the SNP name hyperlink reveals further details, including cell type metadata and the mechanism of disruption/creation of TF binding regulatory motifs (showing the PWM matched and its alignment to the local sequence context). SNPs are also intersected with cross-species conserved elements, chromatin states from the Roadmap Epigenomics Consortium, and lead eQTLs from the GTEx Project browser. Software type: database, variant annotation
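For readers unfamiliar with LD thresholds, the sketch below computes the standard r-squared measure of linkage disequilibrium between two biallelic SNPs from phased haplotypes. It is only an illustration of the quantity being thresholded, not HaploReg's code; the function name `ld_r_squared` and the 0/1 allele coding are assumptions.

```python
def ld_r_squared(haplotypes):
    """r^2 between two biallelic SNPs.

    haplotypes: list of (allele_at_snp1, allele_at_snp2) pairs, each
    allele coded 0 or 1, one pair per phased chromosome.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n   # frequency of allele 1 at SNP 1
    p_b = sum(b for _, b in haplotypes) / n   # frequency of allele 1 at SNP 2
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                      # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when either SNP is monomorphic")
    return d * d / denom


# Alleles always travel together: perfect LD.
print(ld_r_squared([(1, 1), (1, 1), (0, 0), (0, 0)]))  # 1.0
```

A SNP pair would pass an LD threshold of, say, 0.8 only if its r-squared in the chosen reference population is at least 0.8.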