AMEL and the Evolutionary Origin of EMP Genes

The current knowledge on the relationships and evolutionary origin of EMPs was acquired in several steps, and this study represents the last (but not least) one. This story can be briefly reconstructed as follows.

In 2001, Delgado et al. showed a high sequence similarity in the 5' region (exon 2, which mainly encodes the signal peptide) of AMEL, SPARC, and SPARCL1, suggestive of a common origin of this region after duplication. Using a molecular-clock method to estimate SPARC/SPARCL1 divergence, these authors proposed that AMEL exon 2 was created >600 MYA (i.e. at the end of the Precambrian). This meant that AMEL could have been present before the origin of vertebrates, 530 MYA [Shu et al., 1999, 2003], and before the first evidence of mineralized elements in euconodonts, 500 MYA [Sansom et al., 1992, 1994; Janvier, 1996].

Two years later, taking advantage of the availability of the sequenced human genome and gene mapping, Kawasaki and Weiss [2003] convincingly demonstrated that (i) EMPs comprise a subfamily, (ii) the EMP, milk casein, and salivary protein families are grouped together in a cluster on chromosome 4, forming a larger family, and (iii) this family also contains the SIBLING gene cluster, which is located at another locus on the same chromosome. The SCPP family was now a fact.

Another chapter was added to the story when SPARCL1 was proposed to be the common ancestor of SCPP genes on the basis of its location, adjacent to the SIBLING cluster on chromosome 4, and of the structure of its N-terminal region [Kawasaki et al., 2004]. Therefore, although SPARC remains at the origin of the mineralizing protein gene story, it was SPARCL1 that gave rise to the SCPP gene ancestor. SPARC is present in both protostomes and deuterostomes⁶, where it influences cell behavior and interactions with the extracellular matrix rather than being involved in the generation of mineralized tissues. Several rounds of duplication, followed by sub- and/or neofunctionalization, have occurred and led to the current diversity of this family. Using a molecular-clock method, the divergence of SPARC and SPARCL1 was found to coincide with or postdate the divergence of cartilaginous fishes (estimated at 528 ± 56 MYA using molecular dating [Kumar and Hedges, 1998]). This led to the conclusion that the SCPP genes probably emerged after this date [Kawasaki et al., 2004]. This dating is more recent than the >600 MYA previously calculated by Delgado et al. [2001].

Taken together, these findings suggest that AMEL is more distantly related to SPARC and/or SPARCL1 than previously believed, and that at least five duplication events took place from SPARC to AMEL [Sire et al., 2006]:

SPARC → SPARCL1 → SCPP ancestor →

Below, we briefly review the current scenario for EMP gene relationships, which was established in the course of studies dealing with AMEL origins [Sire et al., 2005, 2006]. The previously published dataset is completed by additional information on AMTN and ODAM (fig. 1, 2), with the aim of clarifying the relationships of all ameloblast-secreted SCPP proteins.

The Evolutionary Origin of AMEL

This study was performed in three steps:

Step 1: Evolutionary Analysis of AMEL Sequences in


A total of 80 AMEL sequences (from mammals, reptiles, and amphibians) were compiled (published sequences, sequences retrieved from the databases, and new sequences; see Sire et al. [2006] for the species list). The sequences were aligned as described above for AMTN and ODAM, and a putative AMEL ancestral sequence was calculated using PAUP 4.0. The conserved versus variable regions were determined and used for the next step.

6 Protostomes and deuterostomes: the two main divisions of Bilateria, mostly comprising animals with bilateral symmetry and three germ layers (endoderm, mesoderm, and ectoderm).

Fig. 5. Phylogenetic analysis (distance analysis with maximum likelihood using neighbor-joining method) of the five ameloblast-expressed SCPP genes (AMEL, AMBN, AMTN, ODAM, and ENAM) based on the 5' region (288 bp) of their putative ancestral sequences. The ancestral sequence of SPARCL1, the probable ancestor of SCPP genes, was used to root the tree. Bootstrap values are indicated (1,000 replicates).

Step 2: Search for Sequence Similarity in Databases

A PSI-BLAST search (National Center for Biotechnology Information) for statistically significant similar peptides was performed in GenBank [Sire et al., 2006]. The well-conserved regions of the putative ancestral AMEL were used, i.e. the N-terminal region: exon 2 (signal peptide), exon 3, exon 5, and the beginning of exon 6. Sequence similarities were detected with AMBN, then with ENAM and, finally, with SPARCL1. It is noteworthy that the first non-AMEL sequence found using PSI-BLAST was crocodile AMBN, indicating that the latter is closer to ancestral AMEL than mammalian AMBN is. This would mean that crocodile AMBN has retained more of the ancestral state, and could have evolved at a slower rate than mammalian AMBN after the reptile/mammal divergence. At that time (July 2004), neither AMTN nor ODAM sequences were available in databases [Sire et al., 2005].

Step 3: Sequence Analysis

The putative ancestral sequences of AMEL, AMBN, ENAM, and SPARCL1 were calculated as described above for AMTN and ODAM. The dataset comprised AMEL sequences, 30 AMBN, 28 ENAM, and 20 SPARCL1 (entire and partial sequences), and those obtained here from 10 AMTN and 10 ODAM (fig. 1, 2). Only the N-terminal region of SPARCL1 was used because EMPs and the other SCPPs are thought to be derived from this region [Kawasaki et al., 2004]. The N-terminal regions of these putative ancestral sequences were aligned to the same region of AMEL (i.e. the first 62 residues, from exon 2 to the TRAP proteolytic site at the beginning of exon 6) with CLUSTALX and hand-checked using Se-Al 2.0. The phylogenetic analysis was performed using maximum-likelihood distances with the neighbor-joining method in PAUP 4.0, and the tree was rooted on SPARCL1, since this is the probable ancestor of the SCPPs. This analysis confirms, with good statistical support, the previous finding that AMEL and AMBN are sister genes [Sire et al., 2006] (fig. 5). The two newly identified ameloblast-expressed genes, ODAM and AMTN, appear as sister genes (this is well supported statistically), and their group is the sister group of the AMEL/AMBN group. ENAM is the sister gene of the two groups AMEL/AMBN + ODAM/AMTN, and SPARCL1 is the sister gene of the three. However, the relationships of ENAM and SPARCL1 are not strongly supported by our bootstrap analysis. This phylogenetic analysis indicates that AMEL/AMBN and ODAM/AMTN have a common ancestor, which probably arose from a duplication of the ENAM ancestor, itself derived from a copy of the SPARCL1 ancestor.
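The branching order described in this paragraph can be written down explicitly, which makes the sister-group statements easy to check. The following is only a minimal sketch: the nested-tuple encoding and the helper names `leaves` and `sister` are illustrative assumptions, not part of the original analysis, and the topology carries none of the branch lengths or bootstrap values of fig. 5.

```python
# Topology as described in the text, rooted on SPARCL1.
# Equivalent Newick string: (SPARCL1,(ENAM,((AMEL,AMBN),(ODAM,AMTN))));
# The encoding and helper names below are hypothetical, for illustration only.
TREE = ("SPARCL1", ("ENAM", (("AMEL", "AMBN"), ("ODAM", "AMTN"))))


def leaves(node):
    """Return the set of gene names found under a node."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaves(child) for child in node))


def sister(node, target):
    """Return the leaf set of the sister clade of `target`, or None if absent.

    `target` is a set of gene names that must form a clade in the tree.
    """
    if isinstance(node, str):
        return None
    left, right = node
    if leaves(left) == target:
        return leaves(right)
    if leaves(right) == target:
        return leaves(left)
    return sister(left, target) or sister(right, target)


if __name__ == "__main__":
    print(sorted(sister(TREE, {"AMEL"})))          # sister gene of AMEL
    print(sorted(sister(TREE, {"ODAM", "AMTN"})))  # sister group of ODAM/AMTN
    print(sorted(sister(TREE, {"ENAM"})))          # sister group of ENAM
```

Encoding only the topology keeps the sketch honest about what the text states; the weakly supported nodes (ENAM and SPARCL1) are represented the same way as the well-supported ones and would need support values attached before any inference is drawn from them.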

This phylogeny reflects our still limited knowledge of ameloblast-expressed genes and must be interpreted with caution. Indeed, even though a large number of sequences were used, most of them are from mammals and, among these, from eutherians only. Only a few AMEL and AMBN sequences are available in reptiles and amphibians, and no ENAM, AMTN, or ODAM sequences are known in these lineages. This lack of data in non-mammalian lineages does not allow representative putative ancestral sequences to be obtained at the amniote and tetrapod levels. This means that the phylogenetic signal (i.e. gene relationships) is probably reduced by (i) the long evolutionary period (hundreds of millions of years) that separates each gene from its closest relative, (ii) the different evolutionary rate of each gene in each lineage, and (iii) the rapid divergence of some gene regions in relation to their specific functions. This phylogeny will become more accurate in the near future, when more ameloblast-expressed SCPP gene sequences are known in reptiles and amphibians. Nevertheless, the present analysis supports the AMBN/AMEL relationship and the hypothesis that both genes derive from ENAM. It furthermore indicates that ODAM and AMTN could also be derived from ENAM. This implies that an additional duplication event occurred between ENAM and the other ameloblast-expressed SCPP genes (fig. 6).

Fig. 6. Current probable scenario for the origin and evolution of SCPP genes and, in particular, of ameloblast-expressed genes (AMEL/AMBN, AMTN/ODAM, and ENAM). Early in deuterostome evolution, SPARC duplicated into SPARCL1. During successive rounds of genome and gene duplication, SPARCL1 and its descendants were copied several times on the same chromosome, giving rise to two clusters: the ameloblast-expressed/milk/saliva protein gene cluster and the bone/dentin protein gene cluster (SIBLINGs). The ENAM ancestor duplicated from an SCPP ancestor and one ENAM copy was duplicated again, giving rise to the ancestors of AMBN/AMEL and of AMTN/ODAM. After its duplication from AMBN, AMEL was translocated to another chromosome. [Diagram labels: vertebrate SPARC ancestor; SPARCL1 ancestor; SIBLING ancestor; SCPP ancestor.]

A preliminary, schematic scenario for SCPP evolution and for the place of the ameloblast-secreted actors (to which AMTN and ODAM are now added) can be drawn, but the story is far from complete (fig. 6). In particular, the relationships between SPARCL1 and the two gene clusters (SIBLINGs and enamel-milk-saliva protein genes), and among the SIBLINGs, are not established. In contrast, within the salivary SCPPs, histatins 1 and 3 derive from statherin duplication, and the latter was created from a copy of a milk casein ancestor (CSN1S2) [Kawasaki and Weiss, 2003]. The evolutionary story of salivary SCPPs is relatively recent (they are known in some eutherians only), while the origin of milk caseins is more ancient in mammalian evolution. Indeed, α-, β- and κ-caseins have been identified in the milk of metatherians (marsupials) [Ginger et al., 1999; Stasiuk et al., 2000]. Milk casein family members are also evolutionarily related and, given their structural similarity with EMP genes, the ancestral Ca-sensitive casein gene was probably derived from the duplication of an EMP [Kawasaki and Weiss, 2003], which remains to be found (fig. 6).

In summary, depending on the branches of the tree, SCPP relationships are either strongly or weakly supported. Strong relationships are: SPARC/SPARCL1; STATH/HTHs; CSN/STATH/HTHs; AMEL/AMBN; and AMEL/AMBN/ENAM. In contrast, there are (i) no clear relationships established within the SIBLING cluster, or between this cluster and SPARCL1; (ii) no clearly identified connection between CSNs and EMPs; (iii) weak relationships (owing to the lack of non-mammalian sequences) for ODAM/AMTN and ENAM/ODAM/AMTN; and (iv) no clear relationship between the ameloblast-expressed genes (AMEL/AMBN, ODAM/AMTN, and ENAM) and SPARCL1.

Sequencing these SCPP genes in non-mammalian species, i.e. reptiles (crocodiles, lizards, and snakes) and amphibians (salamanders, caecilians, and frogs), will help to improve our knowledge of the relationships within the family.

Dating of AMBN/AMEL Duplication

Now that AMEL and AMBN are clearly established as sister genes, the last questions are: was the ancestral gene AMBN or AMEL, and is it possible to date this duplication event? The strongest support for AMBN ancestry comes indirectly from the location of AMEL on the sex chromosomes. Indeed, it is difficult to imagine that an AMEL copy (that would have become AMBN) was translocated by mere chance to the chromosome housing the other SCPP genes, and close to ENAM, their close relative. In contrast, the close location of AMBN and ENAM on the same autosomal chromosome (fig. 4) strongly supports the view that AMBN was created from a copy of ENAM and, as a consequence, that AMEL originated from a duplication of the ancestral AMBN and was then translocated to another chromosome. One could argue that AMEL translocation occurred after its duplication from the ENAM ancestor and that the copy remained close to ENAM and differentiated into AMBN. This scenario cannot be maintained, since the similarities found in gene organization (fig. 3) and in amino acid pattern indicate that AMBN is closer to ENAM than AMEL is. Therefore, AMBN is the 'mother' of AMEL and not the opposite.

[Fig. 7 panel labels: a, tree tips AMBN and AMEL for human, mouse, crocodile, and Xenopus; axis: evolutionary distance. b, axes: million years versus evolutionary distance; regression line y = 874.03x, R2 = 0.7.]

Fig. 7. a Linearized tree obtained from the phylogenetic analysis of AMBN and AMEL sequences in human, mouse, crocodile, and Xenopus. The calibration times used are: human/mouse, 90 MYA; human/crocodile, 310 MYA; human/Xenopus, 360 MYA [Hedges, 2002]. b Linear regression of time (y) versus distance (x). Each calibration point has two evolutionary distances, for AMBN and for AMEL. The duplication time of AMBN/AMEL can be estimated by entering the evolutionary distance of the duplication into this linear equation, i.e. it occurred >600 MYA.

In summary, all ameloblast-expressed genes are phylogenetically related, and ENAM could be the ancestor of all of them. AMEL, which codes for the major protein of the forming enamel matrix in mammals (90% of the protein content), is the youngest EMP gene. This strongly suggests that AMEL divergence after AMBN duplication was an important innovation for enamel, at least in mammals. To date, the relationships of EMP genes with SPARCL1 are difficult to establish and more data are needed to test the hypothesis of SPARCL1 ancestry.

The availability of AMEL and AMBN sequences in various mammalian species, in a crocodile, and in an amphibian (Xenopus) made it possible to attempt a molecular dating of the AMBN/AMEL duplication. A phylogenetic tree was inferred from the amino acid sequences using the neighbor-joining method (fig. 7a). From the phylogeny, it is apparent that the duplication event occurred much earlier than speciation events such as the mammal/amphibian split or the mammal/reptile split, at roughly twice their age. To give an approximate estimate of when this duplication event occurred, we used the molecular dating technique developed by Gu et al. [2002], calibrated by the fossil record: primate/rodent split (around 90 MYA), mammal/reptile split (310 MYA), and amniote/amphibian split (360 MYA) [Hedges, 2002]. Our results are as follows.

1 If the amniote/amphibian split is used alone, the date of duplication (T) = 627 MYA.

2 If the mammal/reptile split is used alone: T = 896 MYA.

3 If the primate/rodent split is used alone: T = 480 MYA.

4 If all three calibrations are used: T = 682 MYA.

This is a molecular dating of a gene duplication, so it should be compared with other molecular dating profiles [Gu et al., 2002]. Here, (2) and (3) are unreliable because the human-mouse and human-crocodile distances differ considerably between the AMBN and AMEL genes. In contrast, (1) is largely reliable and (4) takes the average, and both give similar results, i.e. the AMBN/AMEL duplication occurred >600 MYA (fig. 7b). This result confirms the previous dating of AMEL origins to the Precambrian period [Delgado et al., 2001]. A major peak of genome and gene duplication occurred around 700-500 MYA [Gu et al., 2002]. Therefore, like many developmental genes, EMPs were duplicated during this period, which preceded vertebrate diversification and skeletal mineralization.
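The calibration logic behind such estimates can be illustrated with a short sketch. It assumes a strict linear clock through the origin (time = slope × distance), which is the form of the regression line reported in fig. 7b; it is not the method of Gu et al. [2002] itself. The calibration times are those quoted in the text, whereas the distances in `CALIBRATIONS` are hypothetical placeholders and the function names are illustrative only.

```python
# Calibration times (MYA) are those quoted in the text [Hedges, 2002].
# The evolutionary distances are HYPOTHETICAL placeholders, not study values.
CALIBRATIONS = [
    ("primate/rodent", 90.0, 0.10),
    ("mammal/reptile", 310.0, 0.36),
    ("amniote/amphibian", 360.0, 0.41),
]

# Slope of the regression line reported in fig. 7b (y = 874.03x).
FIG7B_SLOPE = 874.03


def slope_through_origin(points):
    """Least-squares slope k for the model time = k * distance (no intercept)."""
    numerator = sum(time * dist for _, time, dist in points)
    denominator = sum(dist * dist for _, _, dist in points)
    return numerator / denominator


def estimate_time(distance, slope):
    """Convert an evolutionary distance into a date (MYA) under a linear clock."""
    return slope * distance


def distance_for_time(time_mya, slope=FIG7B_SLOPE):
    """Invert the clock: distance that corresponds to a given date (MYA)."""
    return time_mya / slope


if __name__ == "__main__":
    k = slope_through_origin(CALIBRATIONS)
    print(f"slope from placeholder calibrations: {k:.1f} MYA per unit distance")
    print(f"distance implied by 600 MYA on the fig. 7b line: "
          f"{distance_for_time(600.0):.2f}")
```

On the fig. 7b line, a duplication date of 600 MYA corresponds to an evolutionary distance of about 0.69. Fitting on a single calibration point instead of all three simply changes the slope, which is one way to see why the four estimates listed above differ.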

In summary, two independent molecular datings of EMP origins (SPARC/SPARCL1 divergence date: Delgado et al. [2001]; AMBN/AMEL duplication date: this study) indicate that the genes encoding EMPs were created by several rounds of duplication that occurred >600 MYA, i.e. before the currently accepted dates for the appearance of the first vertebrates in the fossil record. In contrast, the molecular dating of SPARC/SPARCL1 divergence proposed by Kawasaki et al. [2004] supports an emergence of EMPs after the divergence of cartilaginous fish (approximately 500 MYA [Kumar and Hedges, 1998]). Knowing the divergence date of SPARC/SPARCL1 is important, as SPARCL1 is considered the probable ancestor of SCPPs. However, the apparently different evolutionary rates of SPARC and SPARCL1 in various taxa, together with the fact that various gene regions were compared within each species or each clade, do not allow an accurate prediction of the divergence date. Indeed, these two paralogs share a well-conserved C-terminal region which is not easy to differentiate from one gene to the next in the vertebrate species examined. In contrast, their N-terminal region is not only extremely different between the two genes but also, when compared across species, difficult to align owing to a large number of sequence variations. Nevertheless, the N-terminal region of SPARCL1 is considered the probable ancestor of SCPPs. The divergence date of AMBN/AMEL seems to be more reliable because the relationships of these two genes are now well established. Also, the presence of enamel-like tissues in early vertebrates indicates that the divergence of SCPP genes might have preceded the origin of vertebrate tissue mineralization.

It is important to realize the following.

(i) The molecular dating of the AMBN/AMEL duplication does not indicate that these molecules were present in forming enamel 600 MYA. After the duplication, several tens of millions of years were probably necessary before one copy acquired its new function (new gene structure and new expression). This divergence could have occurred before, during, or after the vertebrate diversification, which the fossil record places in the Cambrian. Moreover, genetic evidence suggests that most animal phyla evolved tens of millions of years before they started to leave fossil evidence, although this is debated by paleontologists. Given the lack of a temporal association between the birth of a gene (e.g. AMEL 600 MYA) and the advent of mineralized 'teeth' >50-100 million years later, confidence in the assigned dating should be tempered.

(ii) Tissue mineralization could not have occurred if the necessary tools were not already present. This implies that EMPs could have had other functions before the first enamel/enameloid tissues mineralized and before EMPs were recruited for mineralization later in vertebrate evolution. This novel trait (mineralization) therefore probably evolved by employing already existing materials.
