Distinct chromatin features characterize different classes of repeat sequences in Drosophila melanogaster.

Krassovsky K, Henikoff S.
BMC genomics. 2014;15:105.
Abstract
BACKGROUND: Repeat sequences are abundant in eukaryotic genomes but many are excluded from genome assemblies. In Drosophila melanogaster classical studies of repeat content suggested variability between individuals, but they lacked the precision of modern high throughput sequencing technologies. Genome-wide profiling of chromatin features such as histone tail modifications and DNA-binding proteins relies on alignment to the reference genome and hence excludes highly repetitive sequences. RESULTS: By analyzing repeat libraries, sequence complexity and k-mer counts we determined the abundances of different D. melanogaster repeat classes in flies in two public datasets, DGRP and modENCODE. We found that larval DNA was depleted of all repeat classes relative to adult and embryonic DNA, as expected from the known depletion of repeat-rich pericentromeric regions during polytenization of larval tissues. By applying a method that is independent of alignment to the genome assembly, we found that satellite repeats associate with distinct H3 tail modifications, such as H3K9me2 and H3K9me3 for short repeats and H3K9me1 for 359 bp repeats. Short AT-rich repeats however are depleted of nucleosomes and hence all histone modifications and associated chromatin proteins. CONCLUSIONS: The total repeat content and association of repeat sequences with chromatin modifications can be determined despite repeats being excluded from genome assemblies, revealing unexpected distinctions in chromatin features based on sequence composition.
Consortium data used in this publication
modENCODE Fly RNA, histone modification ChIP, TF ChIP, Table 2, 4, Figure 3, 6, 7, 8