Results and Discussion

AMEL and ARHGAP6

The AMEL gene in eutherians, opossums and African clawed toads resides in a large intron of ARHGAP6 in the opposite orientation (fig. 1). By contrast, the chicken and fugu genome databases show no trace of AMEL. As mentioned earlier, AMEL is a member of the SCPP family and all the members except AMEL are clustered in one chromosome [Kawasaki and Weiss, 2003, 2006]. It is therefore likely that AMEL too originally arose in that cluster and happened to transpose into an intron of ARHGAP6, thereby making a nested gene structure. There are two possible explanations for the presence or absence of AMEL in different tetrapod genomes. One invokes a single transposition in the ancestral lineage of tetrapods. The absence of AMEL in chicken and possibly in toothless turtles can then be explained by secondary loss of the gene. The disappearance of chicken AMEL is consistent with the absence of exon 1 and intron 1 of chicken ARHGAP6 (fig. 1). Alternatively, AMEL might have transposed twice independently: once in the amphibian lineage and once in the ancestral mammalian lineage. Although there is then no need to invoke loss of AMEL in birds, this alternative becomes much less parsimonious than the first when we explain (1) why the genomic position of mammalian and amphibian AMELs is the same and (2) how AMEL came to exist in reptiles, e.g. caimans [Toyosawa et al., 1998]. Hence, it is concluded that all tetrapod AMELs in ARHGAP6 derive from a single transposition.

Fig. 2. Evolutionary stratum and a relatively recent X-chromosomal inversion encompassing AMELX. Strata 3 and 4 began to be formed in the ancestral lineage of eutherians, about 100 MYA, and in the ancestral lineage of simian primates (New and Old World monkeys), >50 MYA, respectively. The gametologous ARHGAP6 on the human Y chromosome has since disappeared. The inversion of 3- to 4-Mb harboring AMELX is X-chromosome specific and must have occurred after the formation of stratum 4. p = The nucleotide differences per site between X-Y homologous regions. [Figure panels: >50 MYA, ancient PAR with AMELX and AMELY; <50 MYA, new PAR; at present, present PAR with AMELX inverted on the X chromosome.]

Even after sex chromosomes evolved independently in some reptiles, birds and mammals, AMEL and ARHGAP6 remained autosomal. It is only in eutherian mammals that a pair of homologous autosomes carrying these and all other linked genes was added or translocated to the telomeric end of the pseudoautosomal region (PAR) of the sex chromosomes. As a consequence, the subsequent evolution of eutherian AMEL became intimately related to the evolution of the sex chromosomes.

AMEL Differentiation in the Eutherian Sex Chromosomes

The original mammalian sex chromosomes arose from a pair of homologous autosomes >200 MYA, before the divergence of monotherians [Ohno, 1967; Graves, 2002]. One possible cause for this sex-chromosomal differentiation is that two or more genes that determine complex sex characters evolved in one chromosome and that homologous recombination among these genes was inhibited [Nei, 1969; see also a later discussion]. The earliest inhibition of homologous recombination appears to be responsible for forming the oldest so-called 'evolutionary stratum 1' manifested in the long arm of the X chromosome [Lahn and Page, 1999]. Later, after the divergence between eutherians and metatherians, a pair of homologous autosomes that harbored AMEL fused with the original mammalian sex chromosomes and became the short arms [Graves, 2002]. The proximal part of the short arms adjacent to stratum 2, which was deposited near the centromeric region by the time of the chromosomal fusion, was then subjected to recombination inhibition in the stem lineage of eutherians >100 MYA. However, the distal part of the short arms was permitted to recombine until the emergence of simian primates (New and Old World monkeys), about 50 MYA [Martin, 1993; Takahata, 2001]. Thus, during this period of about 50 million years, the proximal part (stratum 3) accumulated substantial sequence differences, yet the distal part was still allelic or constituted the ancient PAR in which the X and Y chromosomes could pair and recombine in meiosis (fig. 2). The junction between these proximal and distal parts is marked by transposon medium reiterated frequency repeat 5 (MER5) within intron 2 of AMELX and is regarded as an ancient pseudoautosomal boundary [Iwase et al., 2001, 2003]. The phylogenetic analysis of eutherian AMELX and AMELY genes shows that the 5' region (upstream from MER5) differentiated before the eutherian radiation, while the 3' region (downstream from MER5) differentiated independently within individual eutherian orders (fig. 3). In primates, differentiation of the 3' region occurred after the divergence between prosimians and simian primates, but before the split between New and Old World monkeys. Since exons 1 and 2 in the 5' region are largely untranslated, it is natural that the phylogenetic relationship in the 3' region is identical to the one previously inferred from the amino acid or intron 3 sequences [Huang et al., 1997; Toyosawa et al., 1998].

Fig. 3. Phylogenetic relationships of mammalian AMELX and AMELY sequences rooted by house shrew and opossum AMELs [accession numbers: AB287298 and AB287299]. Not only the coding but also intron sequences were used wherever they can be aligned. Open diamonds stand for differentiation points between gametologous AMELX and AMELY by recombination inhibition. a The 5' region located in evolutionary stratum 3 [Lahn and Page, 1999], including partial intron sequences (483 bp). b The 3' region located in evolutionary stratum 4, including partial intron sequences (770 bp). a, b The number near a node stands for the bootstrap value in 1,000 replications. [Figure: phylogenetic trees, panels a and b; tip labels are the X- and Y-linked amelogenin sequences of human, chimpanzee, squirrel monkey, ring-tailed lemur, cattle, pig and horse, plus house shrew and opossum AMEL; scale bars, 0.05.]

Fig. 4. The ratio (f) of per site nonsynonymous to synonymous substitutions along branches in the neighbor-joining tree of eutherian AMELXs and AMELYs with opossum AMEL [Hu et al., 1996, accession number: AB287299] as an outgroup. The exon 3, 5 and 6 sequences are used for computing per site synonymous substitutions, whereas only the exon 6 sequences are used for computing per site nonsynonymous substitutions. The ratio f is then calculated as the ratio of the per site nonsynonymous substitutions in exon 6 to the per site synonymous substitutions in exons 3, 5 and 6. The total number of synonymous sites is 132.8 in the three exons, and the number of nonsynonymous sites is 268.3 in exon 6. The symbol g means no synonymous substitutions. Significance levels of f < 1 are indicated by asterisks (* 0.01 < p < 0.05, ** p < 0.01). [Figure: neighbor-joining tree with an f value on each branch; tip labels include primate, rodent (rat, golden hamster, guinea pig), goat, cattle, pig and horse X- and Y-linked amelogenin sequences and opossum AMEL; scale bar, 0.02.]

Differentiation of AMELX and AMELY is likely a result, rather than a cause, of recombination inhibition in the short arms. It was argued that SRY (sex-determining region Y) and RBMY (RNA-binding motif Y) are candidate genes for recombination inhibition [Iwase et al., 2003]. In this respect, it is interesting to note the presence of nine gene families in the human Y ampliconic region or massive repeat units [Skaletsky et al., 2003]. These gene families, including RBMY, are expressed exclusively or predominantly in testes and many of them are implicated in spermatogenesis or sperm production. The families originated either from proto-XY gene pairs in the original mammalian sex chromosomes or from retroposition or transposition of autosomal genes. The emergence of such genes as CDY (chromodomain Y) and VCY (variable charge Y) coincides with formation of stratum 3, about 100 MYA, and stratum 4, about 50 MYA, respectively [Bhowmick et al., 2006]. For these reasons, it is tempting to speculate that these ampliconic genes or male-specific genes in the human Y amplicons have somehow been involved in stepwise differentiation of the eutherian sex chromosomes as well as in that of AMELX and AMELY.

Selection for and against AMELX and AMELY

As mentioned above, AMELY genes may not be under strong functional constraint or may even be on the way to becoming dead genes or pseudogenes. To examine this possibility, we estimated synonymous (bS) and nonsynonymous (bN) substitutions that have accumulated in individual branches of the AMELX and AMELY gene tree (fig. 4). The ratio (f) of bN/bS is an indicator of selective pressure for nonsynonymous substitutions relative to synonymous substitutions, both of which have accumulated for the same period of evolutionary time. The value of f ranges from 0 to 1 under the neutral theory of molecular evolution [Kimura, 1983]. Since the neutral theory assumes negligible roles of positive selection at the molecular level, the smaller the f value, the stronger the negative pressure against nonsynonymous substitutions. Note that, although the neutral mutation rate per se may differ between the X- and Y-linked genes [Ebersberger et al., 2002], the f value is independent of the mutation rate. It is therefore sensible to compare f values of various genes irrespective of their chromosomal locations. On the other hand, if positive selection operates for nonsynonymous substitutions, the f value may become >1. However, since all nonsynonymous sites in a gene are unlikely to be subjected to positive selection, the observation of f > 1 averaged over the nonsynonymous sites in a given gene tends to be a very conservative criterion for detecting positive selection.

Table 1. Polymorphism at the human AMELX and AMELY loci and the average nucleotide differences per site (p distances) from the chimpanzee ortholog

| | 5' region (stratum 3), AMELX | 5' region (stratum 3), AMELY | 3' region (stratum 4), AMELX | 3' region (stratum 4), AMELY | Both regions, AMELX | Both regions, AMELY |
|---|---|---|---|---|---|---|
| Nucleotide sites, bp | 2,571 | 3,229 | 3,942 | 3,398 | 6,513 | 6,627 |
| Segregating sites | 10 | 0 | 6 | 6 | 16 | 6 |
| Haplotypes | 12 | 1 | 9 | 8 | 18 | 8 |
| π, % [Nei and Li, 1979] | 0.064 | 0 | 0.055 | 0.022 | 0.058 | 0.012 |
| θ, % [Watterson, 1975] | 0.089 | 0 | 0.035 | 0.042 | 0.056 | 0.022 |
| D [Tajima, 1989] | -0.831 | 0 | 0.491 | -1.381 | 0.121 | -1.381 |
| p distances, % | 0.86 | 1.24 | 0.69 | 1.27 | 0.76 | 1.25 |

The sample size of AMELX and AMELY is 45 and 18, respectively. 1 Not significant (p > 0.1).

Unexpectedly, along the common ancestral lineage leading to humans and chimpanzees, f becomes >1 in both AMELX and AMELY (fig. 4). This enhanced rate of nonsynonymous substitutions is also visible in human AMELY. Similarly, in the ancestral lineage of Ruminantia (cattle and goats) or in that of Perissodactyla (horses), f > 1 is found before the divergence between their gametologous AMELX and AMELY. Although these f values are subject to large sampling errors and are not significantly greater than 1 in most of the cases, it is suggested that positive selection operated in particular lineages of eutherian amelogenin genes. The remaining lineages show conservation of amelogenin genes at the amino acid level. In particular, AMELY is highly conserved in cattle and horses. On the other hand, AMELX in rodents exhibits relatively high f values. This observation raises two possibilities: relaxation of functional constraint and positive selection for some nonsynonymous substitutions. In the absence of AMELY in rodents, the latter possibility appears more likely than the former. In any case, there is no indication of preferential deterioration of AMELY at the amino acid level. Rather, like AMELX, existing AMELY genes have experienced positive selection, followed by negative selection.
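The ratio f used in this section can be illustrated with a minimal sketch. It assumes that f is simply the per-site nonsynonymous count divided by the per-site synonymous count, using the site numbers given in the legend of fig. 4 (268.3 nonsynonymous sites in exon 6 and 132.8 synonymous sites in exons 3, 5 and 6). The function names and the substitution counts passed to them are hypothetical, and any multiple-hit correction the original analysis may have applied is not reproduced here.

```python
import math

# Site counts taken from the legend of fig. 4.
NONSYN_SITES_EXON6 = 268.3
SYN_SITES_EXONS_3_5_6 = 132.8


def per_site(substitutions: float, sites: float) -> float:
    """Number of substitutions per site (no multiple-hit correction)."""
    if sites <= 0:
        raise ValueError("number of sites must be positive")
    return substitutions / sites


def f_ratio(n_nonsyn: float, n_syn: float) -> float:
    """Ratio f = bN / bS for one branch of the gene tree.

    Returns infinity when nonsynonymous changes occur without any
    synonymous change, and NaN when the branch carries no change at all.
    """
    b_n = per_site(n_nonsyn, NONSYN_SITES_EXON6)
    b_s = per_site(n_syn, SYN_SITES_EXONS_3_5_6)
    if b_s == 0:
        return math.inf if b_n > 0 else math.nan
    return b_n / b_s


# Hypothetical branch with 4 nonsynonymous and 1 synonymous substitution.
example_f = f_ratio(4, 1)
print(f"f = {example_f:.2f}")
```

Because the two site counts differ by about a factor of two, a branch needs roughly twice as many nonsynonymous as synonymous changes before f exceeds 1 under these assumptions.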

Polymorphism of Human AMELX and AMELY

We examined the DNA sequences of 45 AMELX genes (each about 6.5 kb) and 18 AMELY genes (each about 6.6 kb) for a worldwide sample taken from the human population (table 1). As expected, almost all observed polymorphic or segregating sites are due to single nucleotide substitutions and occur in introns 1 and 2. Exceptionally, two substitutions are found in the coding region. One is a synonymous substitution in exon 6 of AMELX that is shared by different ethnic groups. The other is a nonsense mutation in exon 5 of AMELY that is represented by a single Asian male (Ami) in the Y chromosome sample. In addition, there is only one insertion/deletion polymorphism (4 bp) in intron 1 that is represented by a single chromosome in Druze. Thus, although human AMELX and AMELY experienced positive selection in the past, they are well conserved in the present-day human population.
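As an illustration of how one of the summary statistics in table 1 is obtained, the sketch below computes Watterson's estimator per site from the number of segregating sites, the sample size and the sequence length. The function names are hypothetical; the input values (10 segregating sites, 45 sequences, 2,571 bp) are those listed in table 1 for the 5' region of AMELX.

```python
def harmonic_number(n: int) -> float:
    """Sum of 1/i for i = 1..n."""
    return sum(1.0 / i for i in range(1, n + 1))


def watterson_theta(segregating_sites: int, sample_size: int,
                    sequence_length: int) -> float:
    """Watterson's estimator per site: S / (a_n * L),
    where a_n is the sum of 1/i for i = 1..n-1."""
    if sample_size < 2:
        raise ValueError("at least two sequences are required")
    if sequence_length <= 0:
        raise ValueError("sequence length must be positive")
    a_n = harmonic_number(sample_size - 1)
    return segregating_sites / (a_n * sequence_length)


# AMELX, 5' region (table 1): S = 10, n = 45, L = 2,571 bp.
theta_amelx_5prime = watterson_theta(10, 45, 2571)
print(f"{100 * theta_amelx_5prime:.3f}%")  # prints 0.089%
```

The result agrees with the tabulated value of 0.089% for this region.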

The nucleotide differences per site (p) between human and chimpanzee orthologs are uniformly distributed over the 5' and 3' regions (table 1), suggesting no region-specific, differential mutation rates. However, the overall p-distances are significantly greater in the comparison of AMELY (1.25%) than of AMELX (0.76%). These values are in agreement with previous estimates if high occurrences of C to T mutations at CpG sites are excluded [Ebersberger et al., 2002; Jobling et al., 2004]. The relatively large p value for AMELY supports the male-driven hypothesis of molecular evolution [Miyata et al., 1987]. Provided that the sex ratio is 1, AMELY evolves with the male mutation rate r_m, whereas one third of AMELX in a population evolves with r_m and two thirds with the female mutation rate r_f. Thus, we have

$$p(Y) = 2t\,r_m \tag{1a}$$

$$p(X) = 2t\,(r_m + 2r_f)/3 \tag{1b}$$

in which p(Y) = 1.25%, p(X) = 0.76% and t is the divergence time between humans and chimpanzees. Eliminating t in (1), we obtain the ratio of male to female mutation rates (α = r_m/r_f) of about 2.6.
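The elimination step can be written out as a short derivation. It assumes that the X-linked relation implied by the one-third/two-thirds weighting described above is p(X) = 2t(r_m + 2r_f)/3; the symbol α denotes the male-to-female mutation rate ratio r_m/r_f. Dividing the Y-linked distance by the X-linked distance cancels the divergence time:

$$\frac{p(Y)}{p(X)} = \frac{3\,r_m}{r_m + 2\,r_f} = \frac{3\alpha}{\alpha + 2} \quad\Longrightarrow\quad \alpha = \frac{2\,p(Y)}{3\,p(X) - p(Y)}$$

With the rounded distances quoted in the text (1.25% and 0.76%), this expression evaluates to roughly 2.4, of the same order as the reported estimate; the remaining gap may reflect rounding of the tabulated distances.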

The extent of polymorphism measured by nucleotide diversity π and θ is also more or less homogeneous over the 5' and 3' regions. The theoretical formula of π [Nei, 1987] suggests that the ratio of X chromosomal to autosomal π is given by

$$(2 + \alpha)/\{2(1 + \alpha)\}. \tag{2}$$

Formula (2) takes into account the differences in both population sizes and mutation rates between autosomal and X-linked genes. If α = 2.6, the ratio becomes 0.64. Even if the α value is as large as suggested by other studies [see Jobling et al., 2004 for review], the expected ratio must be >0.5. A typical value of π for human autosomes is as low as 0.088% [Yu et al., 2002] and implies a relatively small effective size in the human demographic history [Takahata, 1993]. With 0.088% for autosomal π, the expected π value for human X-linked genes ranges from 0.044 to 0.056% and is in agreement with the observed value in the 5' and 3' regions (table 1).

In our sample of human AMELY, there is no segregating site in the 5' region, but six in the 3' region. However, since the difference in the extent of polymorphism between the 5' and 3' regions is not statistically significant, we compare the expected π value with the observed 0.012% over the two regions. The ratio of Y chromosomal to autosomal π is given by

$$\alpha/\{2(1 + \alpha)\}. \tag{3}$$

This ratio becomes 0.36 for α = 2.6 and 0.5 for larger α. Thus, the π value in formula (3) ranges from 0.032 to 0.044%. Although the observed 0.012% is below this range and may be lowered by either positive or negative selection at completely linked sites, there is no significant difference between these expected and observed values.
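The expected ranges quoted above can be checked numerically. The sketch below assumes that the X-to-autosome ratio has the form (2 + α)/{2(1 + α)}, which reproduces the quoted values of 0.64 at α = 2.6 and 0.5 in the limit of large α, and takes the Y-to-autosome ratio α/{2(1 + α)} from formula (3). The function and variable names are hypothetical.

```python
def x_to_autosome_ratio(alpha: float) -> float:
    """Expected ratio of X-linked to autosomal nucleotide diversity."""
    return (2 + alpha) / (2 * (1 + alpha))


def y_to_autosome_ratio(alpha: float) -> float:
    """Expected ratio of Y-linked to autosomal nucleotide diversity."""
    return alpha / (2 * (1 + alpha))


AUTOSOMAL_PI_PERCENT = 0.088  # typical autosomal value quoted in the text
ALPHA = 2.6                   # male-to-female mutation rate ratio from the text

x_ratio = x_to_autosome_ratio(ALPHA)
y_ratio = y_to_autosome_ratio(ALPHA)

# Both ratios approach 0.5 as alpha grows, which gives the other end of each range.
limit_pi = 0.5 * AUTOSOMAL_PI_PERCENT
expected_x_pi = AUTOSOMAL_PI_PERCENT * x_ratio
expected_y_pi = AUTOSOMAL_PI_PERCENT * y_ratio

print(f"X/autosome ratio: {x_ratio:.2f}; expected X-linked pi: "
      f"{limit_pi:.3f} to {expected_x_pi:.3f}%")
print(f"Y/autosome ratio: {y_ratio:.2f}; expected Y-linked pi: "
      f"{expected_y_pi:.3f} to {limit_pi:.3f}%")
```

The X-linked ratio decreases toward 0.5 with increasing α while the Y-linked ratio increases toward it, which is why 0.044% is the lower bound for the X chromosome and the upper bound for the Y chromosome.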

LD and Ancient Pseudoautosomal Boundary

When carrying out a human population survey of AMELX and AMELY, we hypothesized that molecular mechanisms responsible for making the ancient pseudoautosomal boundary within the amelogenin gene may still somehow affect patterns and levels of the present-day polymorphism. We first examined LD at pairs of polymorphic sites in a specified region. We measured nonrandom association at such a pair of sites by r^2 [Hill and Robertson, 1968] or its absolute square root |r|. These values cannot be large when the segregating sites under study are not intermediate in frequency in a sample. We ignored this fact and took the average over all pairs of polymorphic sites. Indeed, the average |r| value for AMELY becomes as small as 0.19 despite the absence of recombination. On the other hand, the average value for AMELX is 0.15 in the 5' region, 0.37 in the 3' region and 0.21 in the entire region. The |r| value is larger in the 3' region than in the 5' region, reflecting some excess of rare-frequency segregating sites in the 5' region (D < 0 in table 1). Clearly, it is necessary to examine LD in a large sample as well as on a large chromosomal scale, because formation of an evolutionary stratum is a chromosome-wide phenomenon. To this end, we used the HapMap data encompassing X-linked ARHGAP6 of 570 kb length.
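The averaging of |r| over pairs of polymorphic sites can be sketched as follows. The sketch assumes phased haplotypes with biallelic sites coded 0/1 and computes r as the correlation D / sqrt(pA(1 - pA)pB(1 - pB)) for each pair; the function names and the four example haplotypes are hypothetical and are not taken from the AMELX or AMELY data.

```python
from itertools import combinations
from math import sqrt


def r_between_sites(haplotypes, i, j):
    """Correlation r between biallelic sites i and j (alleles coded 0/1)."""
    n = len(haplotypes)
    p_a = sum(h[i] for h in haplotypes) / n          # allele frequency at site i
    p_b = sum(h[j] for h in haplotypes) / n          # allele frequency at site j
    p_ab = sum(h[i] * h[j] for h in haplotypes) / n  # frequency of the 1-1 haplotype
    d = p_ab - p_a * p_b                             # linkage disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic in the sample")
    return d / sqrt(denom)


def mean_abs_r(haplotypes):
    """Average |r| over all pairs of sites."""
    n_sites = len(haplotypes[0])
    pairs = list(combinations(range(n_sites), 2))
    return sum(abs(r_between_sites(haplotypes, i, j)) for i, j in pairs) / len(pairs)


# Hypothetical sample: sites 0 and 1 are at intermediate frequency and fully
# associated; site 2 is a rare variant carried by a single haplotype.
example_haplotypes = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 0)]
average_abs_r = mean_abs_r(example_haplotypes)
print(f"mean |r| = {average_abs_r:.3f}")
```

In this toy sample no recombinant haplotypes are present, yet the pairs involving the rare variant have |r| well below 1, which illustrates why the average can be small when segregating sites are not intermediate in frequency.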

The LD analysis for the African population (fig. 5) shows the presence of a strong LD block in the large intron of ARHGAP6. The block is largest within the surrounding region of 3 Mb length and can be taken as evidence for the presence of a recombination cold spot in ARHGAP6. Since almost the same pattern is obtained in the European and Asian populations (data not shown), the phenomenon does not seem to stem from human demography, but from genomic causes. We presume that when stratum 3 was formed 100 MYA, the cold spot in ARHGAP6 already existed in the eutherian sex chromosomes. We then hypothesize that this cold spot was used to determine the proximal end of the ancient PAR that consists of the current PAR and stratum 4 (fig. 2). However, it is to be noted that the boundary of this ancient PAR occurs within AMELX and does not exactly correspond to the distal end of the cold spot (fig. 5): actually, the cold spot is included in the centromeric end of the ancient PAR. Nonetheless, we may argue that the cold spot was fortuitously involved in the determination of the ancient pseudoautosomal boundary.

Fig. 5. LD map in the region of 10.6-13.6 Mb from the short arm end of the human X chromosome that includes AMELX and ARHGAP6. The top bar indicates positions of genotyped SNPs. Solid triangles indicate strong LD blocks defined as in Gabriel et al. [2002]. The lower panel is a magnification of a region surrounding ARHGAP6.

Curiously, the gene orientation or the centromere-telomere polarity of AMELX and ARHGAP6 in humans, chimpanzees and rhesus monkeys is reversed compared with that of AMELY and deteriorated ARHGAP6 in their Y chromosomes, as if the 5' region of AMELX disrupts the continuity of stratum 4 (fig. 1, 2). This may result from a small X chromosomal inversion of 3 to 4 Mb length [Ross et al., 2005]. Indeed, the gene orientation of AMELX in cattle in the National Center for Biotechnology Information database is just opposite to that of humans, chimpanzees and rhesus monkeys. It thus appears that the inversion occurred after the eutherian radiation, <100 MYA, but before the divergence between hominoids and Old World monkeys, >30 MYA. More precisely, since the extent of sequence differentiation in the 3' region between human AMELX and AMELY is the same as that of stratum 4, we may conclude that the inversion should not predate the formation of stratum 4 (fig. 2). In other words, the inversion that occurred during the time period from 30 to 50 MYA is unlikely to be the cause of recombination inhibition for stratum 4.
