Software (released): Bowtie 2 is an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences. It is particularly good at aligning reads from about 50 characters up to hundreds or thousands of characters, and at aligning to relatively long (e.g. mammalian) genomes. Bowtie 2 indexes the genome with an FM Index to keep its memory footprint small: for the human genome, its memory footprint is typically around 3.2 GB. Bowtie 2 supports gapped, local, and paired-end alignment modes.
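To make the FM Index idea concrete, here is a minimal Python sketch of exact-match counting by backward search over a Burrows-Wheeler transform. It is a toy illustration, not Bowtie 2's implementation: the function names `build_fm_index` and `count_occurrences` are hypothetical, the suffix array is built naively, and the occurrence table is stored uncompressed, so none of the memory savings mentioned above apply here.

```python
def build_fm_index(text):
    """Build a toy FM index: first-column offsets, occurrence table, BWT length."""
    text += "$"  # sentinel; assumed absent from text and smaller than every other symbol
    # Naive suffix array: fine for a demo, far too slow for a genome.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # The BWT is the character preceding each sorted suffix (wrapping at position 0).
    bwt = "".join(text[i - 1] for i in sa)
    alphabet = sorted(set(text))
    # first[c] = number of characters in text that sort strictly before c.
    first, total = {}, 0
    for c in alphabet:
        first[c] = total
        total += text.count(c)
    # occ[c][k] = number of occurrences of c in bwt[:k].
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for k, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][k + 1] = occ[c][k] + (1 if ch == c else 0)
    return first, occ, len(bwt)


def count_occurrences(pattern, index):
    """Count exact occurrences of pattern using backward search."""
    first, occ, n = index
    lo, hi = 0, n  # current half-open range of suffix-array rows
    for c in reversed(pattern):
        if c not in first:
            return 0
        lo = first[c] + occ[c][lo]
        hi = first[c] + occ[c][hi]
        if lo >= hi:
            return 0
    return hi - lo


index = build_fm_index("ACGTACGTAC")
print(count_occurrences("ACG", index))
```

The search touches the index once per pattern character, regardless of the reference length, which is the property that makes this family of indexes attractive for read alignment.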
Software (released): RSEM is a software package for estimating gene and isoform expression levels from RNA-Seq data. RSEM has provided valuable guidance for the cost-efficient design of quantification experiments with RNA-Seq, which is currently relatively expensive. Software type: transcript identification
Software (released): mfinder is a software tool for network motif detection. Network motifs are defined as basic interaction patterns that recur throughout biological networks much more often than in random networks. To detect network motifs, mfinder implements two methods: full enumeration of subgraphs, and sampling of subgraphs to estimate subgraph concentrations. mfinder generates random networks using the switching method, the stubs method, and the "Go with the winners" algorithm.
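The following Python sketch illustrates two of the ideas above on a small directed graph: full enumeration of one 3-node pattern (the feed-forward loop, chosen here only as an example) and degree-preserving randomization by edge switching. The function names are hypothetical and this is not mfinder's code or interface; it assumes the input edge list has no duplicate edges.

```python
import random
from itertools import permutations


def count_feed_forward_loops(edges):
    """Count induced feed-forward loops (a->b, b->c, a->c and no reverse edges)."""
    edge_set = set(edges)
    nodes = sorted({n for e in edges for n in e})
    count = 0
    for a, b, c in permutations(nodes, 3):
        forward = (a, b) in edge_set and (b, c) in edge_set and (a, c) in edge_set
        reverse = (b, a) in edge_set or (c, b) in edge_set or (c, a) in edge_set
        if forward and not reverse:
            count += 1
    return count


def switch_randomize(edges, n_switches, rng):
    """Randomize a directed graph by swapping the targets of random edge pairs.

    Every node keeps its in-degree and out-degree.
    """
    edges = list(edges)
    edge_set = set(edges)
    for _ in range(n_switches):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        # Proposed swap: a->d and c->b. Reject self-loops and duplicate edges.
        if a == d or c == b or (a, d) in edge_set or (c, b) in edge_set:
            continue
        edge_set -= {(a, b), (c, d)}
        edge_set |= {(a, d), (c, b)}
        edges[i], edges[j] = (a, d), (c, b)
    return edges


edges = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5), (3, 5)]
observed = count_feed_forward_loops(edges)
randomized = switch_randomize(edges, 100, random.Random(0))
print(observed, count_feed_forward_loops(randomized))
```

Comparing the observed count with counts from many randomized networks is what lets a pattern be called over-represented; the enumeration here is cubic in the number of nodes, which is why sampling becomes attractive for larger networks.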
Software (released): The lumi package in R provides an integrated solution for Illumina microarray data analysis. It includes functions for Illumina BeadStudio (GenomeStudio) data input, quality control, BeadArray-specific variance stabilization, normalization, and gene annotation at the probe level. It also includes functions for processing Illumina methylation microarrays, especially Illumina Infinium methylation microarrays.
Software (released): KING is a rapid algorithm for relationship inference using high-throughput genotype data typical of GWAS, and it allows for the presence of unknown population substructure. The relationship of any pair of individuals can be precisely inferred by robust estimation of their kinship coefficient, independent of sample composition or population structure (sample invariance). KING performs properly even under extreme population stratification, while algorithms assuming a homogeneous population give systematically biased results. KING performs relationship inference on millions of pairs of individuals on the order of minutes.
Software (released): The hive plot is a visualization method for drawing networks. Nodes are mapped to and positioned on radially distributed linear axes. Edges are drawn as curved links. Hive plots can give a quantitative understanding of important aspects of a network's structure. Hive plots can also manage the visual complexity arising from a large number of edges and expose both trends and outlier patterns in a network structure.
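As a rough illustration of the layout step, the Python sketch below converts an (axis, position) assignment for each node into Cartesian coordinates on evenly spaced radial axes. The function name `hive_layout`, the radius parameters, and the idea of passing a normalized position in [0, 1] are assumptions made for this example; how nodes are assigned to axes and ordered along them (for instance by node type and degree) is left to the caller.

```python
import math


def hive_layout(nodes, n_axes=3, inner_radius=1.0, outer_radius=10.0):
    """Map nodes onto radial axes.

    nodes: dict of name -> (axis_index, position), with position in [0, 1].
    Returns a dict of name -> (x, y).
    """
    coords = {}
    for name, (axis, pos) in nodes.items():
        # Axes are spread evenly around the circle, starting on the positive x-axis.
        angle = 2 * math.pi * axis / n_axes
        # Position along the axis is interpolated between the two radii.
        r = inner_radius + pos * (outer_radius - inner_radius)
        coords[name] = (r * math.cos(angle), r * math.sin(angle))
    return coords


layout = hive_layout({"geneA": (0, 0.0), "geneB": (1, 0.5), "geneC": (2, 1.0)})
for name, (x, y) in layout.items():
    print(name, round(x, 2), round(y, 2))
```

Because each node's place is determined by its own attributes rather than by a force simulation, two networks laid out with the same rules can be compared directly.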
Software (released): GREAT assigns biological meaning to a set of non-coding genomic regions by analyzing the annotations of the nearby genes. Thus, it is particularly useful in studying cis functions of sets of non-coding genomic regions. Cis-regulatory regions can be identified by both experimental methods (e.g., ChIP-seq) and computational methods (e.g., comparative genomics).
Software (released): GOrilla is a web-based application that identifies enriched GO terms in ranked lists of genes, without requiring the user to provide explicit target and background sets. It also employs a flexible threshold statistical approach to discover GO terms that are significantly enriched at the top of a ranked gene list. Building on a complete theoretical characterization of the underlying distribution, GOrilla computes an exact p-value for the observed enrichment, taking threshold multiple testing into account without the need for simulations. The output of the enrichment analysis is visualized as a hierarchical structure, providing a clear view of the relations between enriched GO terms.
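The flexible-threshold idea can be sketched in Python as a minimum over hypergeometric tail probabilities taken at every possible cutoff of the ranked list. The code below computes only that minimum score and the cutoff where it occurs; it does not compute the exact, multiple-testing-corrected p-value described above, and the function names are hypothetical rather than part of GOrilla.

```python
from math import comb


def hypergeom_tail(N, B, n, b):
    """P(X >= b) when drawing n of N genes, B of which carry the GO term."""
    total = comb(N, n)
    hits = sum(comb(B, k) * comb(N - B, n - k) for k in range(b, min(n, B) + 1))
    return hits / total


def min_hypergeom_score(labels):
    """Scan every cutoff of a ranked 0/1 list and return (best score, best cutoff).

    labels[i] is 1 if the gene at rank i is annotated with the term, else 0.
    """
    N, B = len(labels), sum(labels)
    best, best_n, b = 1.0, 0, 0
    for n, label in enumerate(labels, start=1):
        b += label  # annotated genes seen in the top n
        p = hypergeom_tail(N, B, n, b)
        if p < best:
            best, best_n = p, n
    return best, best_n


ranked = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
print(min_hypergeom_score(ranked))
```

Because the cutoff is chosen after looking at the data, the raw minimum is optimistic; correcting for that choice is exactly what the exact p-value computation addresses.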
Software (released): GERP identifies constrained elements in multiple alignments by quantifying substitution deficits. These deficits represent substitutions that would have occurred if the element were neutral DNA, but did not occur because the element has been under functional constraint. These deficits, or rejected substitutions, are a natural measure of constraint that reflects the strength of past purifying selection on the element. GERP estimates constraint for each alignment column; elements are identified as excess aggregations of constrained columns. A false-positive rate (which is user-settable) is calculated using 'shuffled' alignments in which the order of columns is randomized.
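A heavily simplified Python sketch of this workflow follows. It assumes that expected (neutral) and observed substitution counts per alignment column are already available, defines rejected substitutions as their difference, and calls an "element" any run of consecutive columns above a threshold. The thresholds, the run-based element definition, and the function names are assumptions for illustration; GERP's actual element-finding procedure and statistics are more involved.

```python
import random


def rejected_substitutions(expected, observed):
    """Per-column deficit: expected neutral substitutions minus observed ones."""
    return [e - o for e, o in zip(expected, observed)]


def find_elements(rs, min_rs=1.0, min_len=3):
    """Return half-open (start, end) runs of >= min_len columns with RS >= min_rs."""
    elements, start = [], None
    # A trailing sentinel closes any run that reaches the last column.
    for i, score in enumerate(list(rs) + [float("-inf")]):
        if score >= min_rs:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                elements.append((start, i))
            start = None
    return elements


def shuffled_element_columns(rs, n_shuffles, rng, min_rs=1.0, min_len=3):
    """Average number of columns called as elements after shuffling column order."""
    total = 0
    for _ in range(n_shuffles):
        shuffled = list(rs)
        rng.shuffle(shuffled)
        called = find_elements(shuffled, min_rs, min_len)
        total += sum(end - start for start, end in called)
    return total / n_shuffles


expected = [2.0] * 8
observed = [2.0, 0.5, 0.2, 0.1, 1.9, 2.0, 0.3, 0.4]
rs = rejected_substitutions(expected, observed)
print(find_elements(rs))
print(shuffled_element_columns(rs, 1000, random.Random(0)))
```

Shuffling keeps the per-column scores but destroys their spatial clustering, so elements that still appear in shuffled data give a sense of how many calls arise by chance.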
Software (released): DAVID is able to extract biological features and meanings associated with large gene lists. DAVID is able to handle any type of gene list, no matter which genomic platform or software package generated it. DAVID systematically maps a large number of interesting genes in a list to the associated biological annotation (e.g., gene ontology terms), and then statistically highlights the most overrepresented (enriched) biological annotation out of thousands of linked terms and contents.
Software (released): Cluster 3.0 is an implementation of k-means clustering, hierarchical clustering, and self-organizing maps in a single multi-purpose open-source library of C routines, callable from other C and C++ programs. This library is an improved version of Michael Eisen's well-known Cluster program for Windows, Mac OS X, and Linux/Unix. Additionally, Python and Perl interfaces to the C Clustering Library are implemented to combine the flexibility of a scripting language with the speed of C.
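For readers unfamiliar with the first of these methods, the sketch below is a plain-Python version of the standard k-means iteration (Lloyd's algorithm) with squared Euclidean distance. It does not use or reproduce the C Clustering Library's interface; the function name `kmeans` and its parameters are hypothetical.

```python
import random


def kmeans(points, k, rng, n_iter=100):
    """Cluster a list of equal-length tuples into k groups.

    Returns (assignment, centroids), where assignment[i] is the cluster of points[i].
    """
    centroids = rng.sample(points, k)  # start from k distinct data points
    assignment = None
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_assignment = [
            min(
                range(k),
                key=lambda c: sum((p - q) ** 2 for p, q in zip(pt, centroids[c])),
            )
            for pt in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [pt for pt, a in zip(points, assignment) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return assignment, centroids


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centers = kmeans(points, 2, random.Random(0))
print(labels)
```

The result depends on the random starting centroids, so practical tools typically repeat the procedure several times and keep the best solution.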
Software (released): Circos is a software package for visualizing data and information. It visualizes data in a circular layout for exploring relationships between objects or positions. Circos creates publication-quality infographics and illustrations with a high data-to-ink ratio, layered data and symmetries.
Software (released): Bowtie is an ultrafast, memory-efficient short read aligner. It aligns short DNA sequences (reads) to the human genome at a rate of over 25 million 35-bp reads per hour. Bowtie indexes the genome with a Burrows-Wheeler index to keep its memory footprint small: typically about 2.2 GB for the human genome (2.9 GB for paired-end).
Software (released): Collectively, the bedtools utilities are a Swiss-army knife of tools for a wide range of genomics analysis tasks. The most widely used tools enable genome arithmetic: that is, set theory on the genome. For example, bedtools allows one to intersect, merge, count, complement, and shuffle genomic intervals from multiple files in widely used genomic file formats such as BAM, BED, GFF/GTF, and VCF. Software type: file format conversion
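To show what "set theory on the genome" means in practice, the Python sketch below implements two of the operations named above, merge and intersect, on in-memory lists of (chromosome, start, end) tuples with BED-style half-open coordinates. It is an illustration of the concepts only, not the bedtools command line or its algorithms; the function names are hypothetical and the intersect uses a simple quadratic scan.

```python
def merge_intervals(intervals):
    """Merge overlapping or directly adjacent intervals on the same chromosome."""
    merged = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Extend the previous interval instead of starting a new one.
            merged[-1] = (chrom, merged[-1][1], max(merged[-1][2], end))
        else:
            merged.append((chrom, start, end))
    return merged


def intersect_intervals(a, b):
    """Return the overlapping portion of every pair of intervals from a and b."""
    out = []
    for chrom_a, start_a, end_a in a:
        for chrom_b, start_b, end_b in b:
            if chrom_a != chrom_b:
                continue
            start, end = max(start_a, start_b), min(end_a, end_b)
            if start < end:  # half-open coordinates: touching ends do not overlap
                out.append((chrom_a, start, end))
    return out


peaks = [("chr1", 10, 20), ("chr1", 15, 30), ("chr1", 40, 50), ("chr2", 5, 8)]
genes = [("chr1", 25, 45), ("chr2", 10, 30)]
print(merge_intervals(peaks))
print(intersect_intervals(merge_intervals(peaks), genes))
```

Real interval files are large, so production tools rely on sorted input or indexed data structures rather than comparing every pair.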
Software (released): ANNOVAR is an efficient software tool that uses up-to-date information to functionally annotate genetic variants detected from diverse genomes (including human genome hg18, hg19, as well as mouse, worm, fly, yeast and many others). Given a list of variants with chromosome, start position, end position, reference nucleotide and observed nucleotides, ANNOVAR can perform: (i) Gene-based annotation: identify whether SNPs or CNVs cause protein coding changes and the amino acids that are affected. (ii) Region-based annotation: identify variants in specific genomic regions, for example, conserved regions among 44 species, predicted transcription factor binding sites, segmental duplication regions, GWAS hits, database of genomic variants, DNase I hypersensitivity sites, ENCODE H3K4Me1/H3K4Me3/H3K27Ac/CTCF sites, ChIP-Seq peaks, RNA-Seq peaks, or many other annotations on genomic intervals. (iii) Filter-based annotation: identify variants that are reported in dbSNP, identify the subset of common SNPs (MAF>1%) in the 1000 Genomes Project, identify the subset of non-synonymous SNPs with SIFT score>0.05, find intergenic variants with GERP++ score>2, or many other annotations on specific mutations.
Software (released): GEM is a Java software package for analyzing genome-wide ChIP-seq/ChIP-exo data. GEM can decompose single observed peaks into multiple binding events, determine binding event location at high spatial resolution, and discover explanatory DNA sequence motifs with an integrated model of ChIP reads and proximal DNA sequences. GEM is able to process single-end or paired-end data and can be run in single-condition mode or multi-condition mode. GEM will be used in the ENCODE 3 uniform peak calling pipeline. Software type: peak caller