TF family statistics

Divergence

Phylogenetic subdivisions

Three subdivisions of the TF family are distinguishable, indicated as TF1, TF2 and the L1spa group. The six differences distinguishing TF1 and TF2 are concentrated between positions 6845..6955. This almost certainly reflects a small patch of gene-converted sequence. Five of the six differences are reversions to an ancestral state found higher up on the tree, consistent with a gene conversion from some older sequence. The tree shown is obtained by forcing these 6 variants to convert simultaneously and fixing the direction by making the ancestral state the one found in spretus. Sequences were placed according to their content of this converted region, with disagreeing positions left to be explained as coincidences or recombinants.
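The placement rule described above (assign each sequence by its content of the converted region, and set aside the disagreeing positions) can be sketched roughly as follows. This is only an illustration: the site positions and bases in DIAGNOSTIC_SITES are hypothetical placeholders inside the 6845..6955 window, not the actual variants, and the majority-vote rule is an assumption about how "content of the converted region" might be scored.

```python
from collections import Counter

# Hypothetical diagnostic sites: alignment position -> (TF1 base, TF2 base).
# Positions and bases are placeholders for illustration, not the real variants.
DIAGNOSTIC_SITES = {
    6845: ("A", "G"),
    6871: ("C", "T"),
    6890: ("G", "A"),
    6912: ("T", "C"),
    6930: ("A", "C"),
    6955: ("G", "T"),
}


def classify(bases_at_sites):
    """Assign a sequence to TF1 or TF2 by majority of the diagnostic sites.

    bases_at_sites maps alignment position -> observed base.
    Returns (label, disagreeing_positions); the disagreeing positions are
    the ones left to be explained as coincidences or recombinants.
    """
    votes = Counter()
    calls = {}
    for pos, (tf1_base, tf2_base) in DIAGNOSTIC_SITES.items():
        base = bases_at_sites.get(pos)
        if base == tf1_base:
            calls[pos] = "TF1"
        elif base == tf2_base:
            calls[pos] = "TF2"
        else:
            calls[pos] = None  # missing data or a third state
        if calls[pos] is not None:
            votes[calls[pos]] += 1

    if votes["TF1"] == votes["TF2"]:
        # No majority: report every site so the sequence can be inspected by hand.
        return "unresolved", sorted(calls)

    label = "TF1" if votes["TF1"] > votes["TF2"] else "TF2"
    disagreeing = sorted(pos for pos, call in calls.items() if call != label)
    return label, disagreeing


example = {6845: "A", 6871: "C", 6890: "G", 6912: "T", 6930: "A", 6955: "T"}
label, odd_sites = classify(example)
```

A sequence that matches one subdivision at most sites but the other at one or two would come back with those positions listed, which is where a recombinant such as the ones discussed below would first show up.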

Conversion patterns

Recombinants

In addition to AF081111 and AF081104 discussed above, AC4405LR is a recombinant, at about position 5704, with a full-length A2 element, such that the A2 portion is on the 5' side. The TSD was disrupted, so this is an ectopic exchange.

Nonsynonymous:synonymous tabulation

The substitutions taken as a whole have a nonsynonymous:synonymous ratio consistent with selection for function of the transposon.
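As a rough illustration of what such a tabulation involves, the sketch below counts codon differences between a reference and a derived coding sequence as synonymous or nonsynonymous under the standard genetic code. It is not the method used for the numbers here: the function name tabulate and the input sequences are hypothetical, codons with more than one change are simply set aside as "complex", and a raw count like this still has to be compared against the numbers of synonymous and nonsynonymous sites before drawing conclusions about selection.

```python
from itertools import product

# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def tabulate(reference, derived):
    """Count single-base codon changes as synonymous or nonsynonymous.

    Both inputs must be in-frame, ungapped, equal-length coding sequences.
    Codons that differ at more than one position are counted as 'complex'
    because the order of the changes is ambiguous.
    """
    if len(reference) != len(derived) or len(reference) % 3 != 0:
        raise ValueError("sequences must be aligned, in frame and of equal length")

    counts = {"synonymous": 0, "nonsynonymous": 0, "complex": 0}
    for i in range(0, len(reference), 3):
        ref_codon = reference[i:i + 3].upper()
        der_codon = derived[i:i + 3].upper()
        n_diff = sum(a != b for a, b in zip(ref_codon, der_codon))
        if n_diff == 0:
            continue
        if n_diff > 1:
            counts["complex"] += 1
        elif CODON_TABLE[ref_codon] == CODON_TABLE[der_codon]:
            counts["synonymous"] += 1
        else:
            counts["nonsynonymous"] += 1
    return counts


# Toy input: AAA->AAG keeps lysine, GAT->GTT changes aspartate to valine.
toy_counts = tabulate("ATGAAACTGGAT", "ATGAAGCTGGTT")
```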

The lineage before the burst

We have conducted directed hybridization and cloning experiments to identify sequences joining the upper part of the lineage, but everything we found maps to TFnode1, TFnode2 or the L1spa group. The current estimate is that fewer than 20 copies in the genome could map to the upper part of this lineage. Naas et al., 1998 reported 2000-3000 copies in the burst itself. We have found the immediate precursors to the burst in the M. spicilegus strain PANCEVO. We therefore believe that L1 copies from earlier in the TF lineage are absent from M. domesticus because TF was not introduced into M. domesticus until a very recent transfer from M. spicilegus (or its sister species macedonicus).

Promoter arrays

The promoters are F-type arrays. m2 and higher elements are very similar. The 5' half of m1 is very similar to m2 and above, whereas the 3' half of m1 is quite distinctive. Even with a lot of gapping to increase similarity of the divergent m1 portion, 25 of 109 positions are substituted. This is enough divergence to date the original array expansion to the order of 5 Mya. The upstream monomers were apparently homogenized recently by a reexpansion. The pattern of homogenization suggests that the expansion initiated when a cDNA from a template with a 1 1/2 monomer block slipped and reprimed in the middle of m1. There is much evidence of multiple reexpansions and shifting of sequence patterns from one monomer position to another. See the discussion of variation in the TF promoter by DeBerardinis et al., 1999.
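The arithmetic behind turning 25 substitutions in 109 positions into a date can be sketched as follows. The Jukes-Cantor correction is a standard way to adjust an observed proportion of differences for multiple hits, but its use here is an assumption, as is the two-lineage formula t = d / (2r). The substitution rate is deliberately left as a required parameter, because the rate behind the 5 Mya figure is not stated above.

```python
import math


def jukes_cantor_distance(n_substituted, n_compared):
    """Jukes-Cantor corrected distance (substitutions per site)."""
    p = n_substituted / n_compared  # observed proportion of differing sites
    if p >= 0.75:
        raise ValueError("p >= 0.75: the Jukes-Cantor correction is undefined")
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)


def divergence_time_years(distance, rate_per_site_per_year):
    """Time since two copies separated, assuming both accumulate
    substitutions at the same neutral rate (t = d / 2r).

    rate_per_site_per_year is an assumption the caller must supply.
    """
    return distance / (2.0 * rate_per_site_per_year)


p_observed = 25 / 109                       # about 0.23
d_corrected = jukes_cantor_distance(25, 109)  # about 0.27
```

Gapped positions are ignored in this sketch, so the result depends on how the divergent 3' half of m1 is aligned.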

Sequence Sources