Satellite DNA consists of very large arrays of tandemly repeating, non-coding DNA. It is the main component of functional centromeres and forms the main structural constituent of heterochromatin.
The name "satellite DNA" refers to the phenomenon that repetitions of a short DNA sequence tend to produce a different frequency of the bases adenine, cytosine, guanine and thymine, and thus a different density from bulk DNA, such that they form a second or "satellite" band when genomic DNA is separated on a density gradient. [2] Sequences with a higher ratio of A+T have a lower density, while those with a higher ratio of G+C have a higher density, than the bulk of the genomic DNA.
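The link between repeat composition and band position can be illustrated with a minimal sketch that compares the G+C fraction of tandem arrays with that of mixed-composition DNA. The sequences below are hypothetical toy examples, not real genomic data, and the sketch only computes base composition; it does not model the density gradient itself.

```python
def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in seq) / len(seq)


# Hypothetical toy sequences (assumptions for illustration only).
bulk = "ATGCGTACCTGAAGTC" * 10        # mixed-composition "bulk" DNA
at_rich_satellite = "AATAT" * 40      # tandem array of an A+T-rich unit
gc_rich_satellite = "CCGCG" * 40      # tandem array of a G+C-rich unit

for name, seq in [
    ("bulk", bulk),
    ("AT-rich", at_rich_satellite),
    ("GC-rich", gc_rich_satellite),
]:
    print(f"{name:8s} GC fraction = {gc_fraction(seq):.2f}")
```

Because a tandem array repeats the composition of its unit, even a short unit with skewed composition shifts the G+C fraction of the whole block away from that of the bulk.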
Satellite DNA families in humans
Satellite DNA, together with minisatellite and microsatellite DNA, constitutes the tandem repeats. [3]
The major satellite DNA families in humans are called:
Satellite family | Size of repeat unit (bp) | Location in human chromosomes |
---|---|---|
α (alphoid DNA) | 170 | All chromosomes |
β | 68 | Centromeres of chromosomes 1, 9, 13, 14, 15, 21, 22 and Y |
Satellite 1 | 25–48 | Centromeres and other regions in the heterochromatin of most chromosomes |
Satellite 2 | 5 | Most chromosomes |
Satellite 3 | 5 | Most chromosomes |
Length
A repeated pattern can range from 1 base pair long (a mononucleotide repeat) to several thousand base pairs long, [5] and the total size of a satellite DNA block can be several megabases without interruption. Long repeat units have been described that contain domains of shorter repeated segments and mononucleotides (1–5 bp), arranged in clusters of microsatellites, in which the differences between individual copies of the longer repeat units were clustered. [5] Most satellite DNA is localized to the telomeric or centromeric region of the chromosome. The nucleotide sequence of the repeats is fairly well conserved across species. However, variation in the length of the repeat is common. For example, minisatellite DNA is a short region (1–5 kb) of repeating elements with a unit length of more than 9 nucleotides, whereas microsatellites are considered to have a unit length of 1–8 nucleotides. [6] The difference in how many repeats are present in a region (the length of the region) is the basis for DNA fingerprinting.
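The two ideas in this section, classifying a repeat by its unit length and comparing individuals by repeat count, can be sketched as follows. This is a simplified illustration: the function names and the two alleles are hypothetical, the thresholds are the ones quoted in the text (which leaves a 9-nucleotide unit unassigned), and real fingerprinting compares many loci rather than one.

```python
import re


def classify_repeat_unit(unit_length: int) -> str:
    """Classify a tandem repeat by unit length, using the thresholds in the text."""
    if unit_length < 1:
        raise ValueError("unit length must be at least 1")
    if unit_length <= 8:
        return "microsatellite"
    if unit_length > 9:
        return "minisatellite"
    return "unassigned"  # the quoted ranges do not cover a 9-nucleotide unit


def longest_tandem_run(seq: str, unit: str) -> int:
    """Return the largest number of back-to-back copies of `unit` in `seq`."""
    seq, unit = seq.upper(), unit.upper()
    runs = re.findall(f"(?:{re.escape(unit)})+", seq)
    return max((len(run) // len(unit) for run in runs), default=0)


# Hypothetical alleles of one locus: same flanks, different repeat counts.
allele_a = "GGATCC" + "CA" * 12 + "TTGAGT"
allele_b = "GGATCC" + "CA" * 15 + "TTGAGT"

print(classify_repeat_unit(len("CA")))      # unit of 2 nt
print(longest_tandem_run(allele_a, "CA"))   # copies in allele A
print(longest_tandem_run(allele_b, "CA"))   # copies in allele B
```

The two alleles share their flanking sequence and differ only in the number of unit copies, which is the kind of length difference that fingerprinting methods detect.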
Origin
Microsatellites are thought to have originated by polymerase slippage during DNA replication. This comes from the observation that microsatellite alleles are usually length polymorphic; specifically, the length differences observed between microsatellite alleles are generally multiples of the repeat unit length.
Microsatellite expansion (trinucleotide repeat expansion) is often found in transcription units. The base pair repetition often disrupts proper protein synthesis, leading to diseases such as myotonic dystrophy.
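The observation about allele lengths amounts to a simple divisibility check. The sketch below is illustrative only: the function name and the allele lengths are hypothetical, and a whole-unit difference is consistent with slippage but does not by itself prove it.

```python
from typing import Optional


def repeat_unit_steps(len_a: int, len_b: int, unit_length: int) -> Optional[int]:
    """Return the allele length difference in whole repeat units.

    Returns None when the difference is not a multiple of the unit length.
    """
    if unit_length < 1:
        raise ValueError("unit length must be at least 1")
    quotient, remainder = divmod(abs(len_a - len_b), unit_length)
    return quotient if remainder == 0 else None


# Hypothetical allele lengths (bp) at a dinucleotide-repeat locus.
print(repeat_unit_steps(142, 146, 2))  # difference of 4 bp, two units
print(repeat_unit_steps(142, 150, 2))  # difference of 8 bp, four units
print(repeat_unit_steps(142, 145, 2))  # difference of 3 bp, not a whole unit
```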
Structure
Satellite DNA adopts higher-order three-dimensional structures in eukaryotic organisms. This was demonstrated in the land crab Gecarcinus lateralis, whose genome contains 3% of a GC-rich satellite band consisting of a ~2100 base pair (bp) "repeat unit" sequence motif called RU. [9] [10] The RU was arranged in long tandem arrays with approximately 16,000 copies per genome. Several RU sequences were cloned and sequenced to reveal conserved regions of conventional DNA sequence spanning more than 550 bp, interspersed with five "divergent domains" within each copy of RU.
Four divergent domains consisted of microsatellite repeats that were biased in base composition, with purines on one strand and pyrimidines on the other. Some contained mononucleotide repeats of C:G base pairs approximately 20 bp in length. These strand-biased domains ranged in length from about 20 bp to more than 250 bp. The most prevalent repeated sequences in the embedded microsatellite regions were CT:AG, CCT:AGG, CCCT:AGGG, and CGCAC:GTGCG. [11] [12] [5] These repeating sequences were shown to adopt altered structures, including triple-stranded DNA, Z-DNA, stem-loops and others, under superhelical stress. [11] [12] [5]
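The notation used for these motifs (for example CT:AG) pairs a repeat on one strand with its reverse complement on the other. A minimal sketch of that relationship, and of strand bias measured as the purine fraction of each strand, is given below; the helper names are assumptions made for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def purine_fraction(seq: str) -> float:
    """Return the fraction of purines (A or G) in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "AG" for base in seq) / len(seq)


# Repeat motifs listed in the text, read on one strand.
for motif in ["CT", "CCT", "CCCT", "CGCAC"]:
    partner = reverse_complement(motif)
    array, partner_array = motif * 10, partner * 10
    print(
        f"{motif}:{partner}  "
        f"purines {purine_fraction(array):.1f} vs {purine_fraction(partner_array):.1f}"
    )
```

For CT, CCT and CCCT one strand is entirely pyrimidine and the other entirely purine, which is the strand bias described above; CGCAC:GTGCG mixes both base classes on each strand.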
Between the strand-biased microsatellite repeats and the C:G mononucleotide repeats, all sequence variants retained one or two base pairs in which an A (purine) interrupted the pyrimidine-rich strand and a T (pyrimidine) interrupted the purine-rich strand. This sequence feature appeared between the microsatellite repeats and the C:G mononucleotides in all four strand-biased divergent domains. These interruptions in compositional bias adopted highly distorted conformations, as shown by their response to nuclease enzymes, presumably because of steric effects of the larger (bicyclic) purines protruding into the complementary strand of smaller (monocyclic) pyrimidine rings. The sequence TTAA:TTAA was found in the longest such domain of RU, which produced the strongest response to nucleases of all. That particular strand-biased divergent domain was subcloned and its altered helical structure studied in more detail. [11]
A fifth divergent domain in the RU sequence was characterized by variations of a symmetrical DNA sequence motif of alternating purines and pyrimidines, shown to adopt a left-handed Z-DNA/stem-loop structure under superhelical stress. The conserved symmetrical Z-DNA was abbreviated Z4Z5NZ15NZ5Z4, where Z represents alternating purine/pyrimidine sequences. A stem-loop structure was centered in the Z15 element on the highly conserved palindromic sequence CGCACGTGCG:CGCACGTGCG and was flanked by extended palindromic Z-DNA sequences over a 35 bp region. Many RU variants showed deletions of at least 10 bp outside the Z4Z5NZ15NZ5Z4 structural element, whereas others contained additional Z-DNA sequences extending the alternating purine and pyrimidine domain to more than 50 bp. [13]
One extended RU sequence (EXT) showed six tandem copies of a 142 bp amplified (AMPL) sequence motif, inserted in a region bordered by inverted repeats, where most copies contained just one AMPL sequence element. There were no nuclease-sensitive altered structures or significant sequence divergence in the relatively conventional AMPL sequence. A truncated RU sequence (TRU), 327 bp shorter than most clones, arose from a single base change that created a second EcoRI restriction site in TRU. [9]
Another crab, the hermit crab Pagurus pollicaris, was shown to contain a family of AT-rich satellites with inverted repeat structures that made up 30% of the entire genome. Another cryptic satellite from the same crab, with the sequence CCTA:TAGG, [14] [15] was found inserted into some of the palindromes.
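Two sequence properties named in this description can be checked directly: whether a sequence alternates strictly between purines and pyrimidines, and whether it is palindromic in the DNA sense (identical to its own reverse complement). The sketch below uses hypothetical helper names and tests only these sequence properties; it says nothing about whether a sequence actually forms Z-DNA or a stem-loop.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
PURINES = frozenset("AG")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(seq: str) -> bool:
    """True if the sequence equals its own reverse complement."""
    return seq.upper() == reverse_complement(seq)


def is_alternating_purine_pyrimidine(seq: str) -> bool:
    """True if every adjacent pair of bases switches between purine and pyrimidine."""
    seq = seq.upper()
    return all((a in PURINES) != (b in PURINES) for a, b in zip(seq, seq[1:]))


core = "CGCACGTGCG"  # palindromic sequence quoted in the text
print(is_palindromic(core), is_alternating_purine_pyrimidine(core))

other = "TTAA"       # palindromic, but not alternating purine/pyrimidine
print(is_palindromic(other), is_alternating_purine_pyrimidine(other))
```

The CGCACGTGCG core satisfies both properties, whereas TTAA is palindromic without alternating, so the two checks are independent of each other.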