Mutations involving a small number of nucleotides, from point mutations to small deletions or insertions, account for 90% of all mutations in the LDLR gene, while the remainder are major rearrangements due to unequal recombination between the 30 Alu sequences identified throughout the gene (Hobbs et al. 1990). To date, more than 1400 point mutations and small deletions or insertions associated with FH have been reported in the LDLR gene (http://www.ucl.ac.uk/fh and www.umd.be/LDLR/).
The UMD-LDLR database (www.umd.be/LDLR/) currently includes 1404 point mutations, small deletions or insertions and mutations affecting splicing (intronic mutations) in the LDLR gene reported in the literature. It cannot accommodate mutations in the UTR and promoter regions, or large deletions, insertions or indels. In addition, two mutations that affect the same allele are entered as two different records linked by the same sample ID. If the same mutation has been reported in apparently unrelated patients (for example, c.1A>C (p.Met1Leu), identified in Spanish (Chaves et al. 2001), British (Day et al. 1997) and Dutch patients (Fouchier et al. 2005)), separate entries were made for each patient as recurrent mutations, in the absence of haplotypes demonstrating a common ancestor.
Among these 1404 small DNA variations of the LDLR gene, 58.5% are missense mutations, 21.7% are small deletions or insertions, 10.4% are nonsense mutations and 9.4% are splice site mutations. A large majority of these small DNA variations are single nucleotide substitutions (76.6%, 1076/1404), of which 75.1% are missense, 13.6% nonsense and 11.3% splice site mutations.
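The proportions above can be re-derived from the raw counts. The following is a minimal Python sketch, assuming the category counts quoted elsewhere in this review (821 missense, 305 small deletions or insertions, 146 nonsense and 132 splice site mutations); the names `COUNTS` and `percentages` are illustrative only.

```python
# Counts per category as quoted in this review (total: 1404 small DNA variations).
TOTAL = 1404
COUNTS = {
    "missense": 821,
    "small deletion/insertion": 305,
    "nonsense": 146,
    "splice site": 132,
}


def percentages(counts, total):
    """Return each category's share of the total, in percent, to one decimal."""
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


shares = percentages(COUNTS, TOTAL)
for name, share in shares.items():
    print(f"{name}: {share}%")
```

The four categories sum to the 1404 records, so the shares add up to 100%.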
Missense mutations are the most numerous of the small DNA variations (58.5%, 821/1404) reported in the LDLR gene in association with Familial Hypercholesterolemia (FH). Like the other small DNA variations in the LDLR gene, missense mutations are widely distributed throughout the whole sequence of the gene (Figure 2). Therefore, no real mutation hot spot can be defined, which supports the need to scan the whole gene sequence to identify FH-causing mutations in diagnostic procedures.
The CpG dinucleotide has been shown to be a hot spot for mutations in humans because it can undergo oxidative deamination of 5-methyl cytosine (Krawczak et al. 1998). The LDLR gene sequence includes 123 CpG dinucleotides, accounting for 4.8% of the coding sequence. This ratio is similar to the mean percentage of CpG (3.7%) in the coding sequence of a large number of genes involved in human diseases and localised on autosomes (Cooper and Krawczak 1990). In the LDLR gene, the only substitutions occurring at CpG dinucleotides are missense mutations, and they account for 4.8% (46/954) of all single nucleotide variations. Interestingly, the percentage of substitutions occurring at CpG dinucleotides in the LDLR gene (4.8%) is significantly lower than the mean observed for disease-causing mutations in other genes (37%) (Cooper and Krawczak 1990). There is no explanation, to date, for this observation.
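As an illustration of how a CpG proportion of this kind can be computed, here is a minimal Python sketch. It assumes the proportion is the number of CpG dinucleotides divided by the sequence length in nucleotides, which reproduces the quoted 4.8% when 123 CpG are counted in a coding sequence of 860 codons; the function name `cpg_stats` and the toy sequence are hypothetical and not taken from the LDLR sequence.

```python
def cpg_stats(seq):
    """Return (number of CpG dinucleotides, CpG count / sequence length)."""
    seq = seq.upper()
    # "CG" cannot overlap with itself, so str.count finds every occurrence.
    n_cpg = seq.count("CG")
    return n_cpg, n_cpg / len(seq)


# Toy sequence for illustration only (not part of LDLR).
toy = "ATGCGTACGCGA"
n, fraction = cpg_stats(toy)
print(n, fraction)

# Figures quoted in the text: 123 CpG in a coding sequence of 860 codons.
ldlr_cpg_percent = round(100 * 123 / (860 * 3), 1)
print(ldlr_cpg_percent)
```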
In the LDL receptor protein, the most numerous amino acids are aspartate (8.7%), serine (8.1%), leucine (7.7%), cysteine (7.3%), glycine (7.2%) and valine (6.7%). The least represented amino acids are methionine (1.3%), tyrosine (2.0%), histidine (2.2%), tryptophan (2.3%) and phenylalanine (3.0%). This distribution of amino acids is consistent with the one reported for human proteins in general, with the exception of cysteine, which is less abundant (3%) in human proteins in general (Lewin 1990). The LDL receptor is known to be a cysteine-rich protein in which disulphide bonds between two cysteines are essential for ensuring the correct folding of 10 major modules necessary for protein activity (Russell et al. 1989, Kurniawan et al. 2001).
The number of mutations affecting an amino acid is not always related to its frequency in the protein. Cysteine, tryptophan and aspartate are more frequently affected than other residues, indicating that they play an essential role in protein activity. Substitutions affect 57 (90%) of the 63 cysteines of the LDL receptor, 43 (57%) of the 75 aspartates and 12 (60%) of the 20 tryptophans. Cysteines are involved in the folding of the ligand binding and EGF-like domains. Aspartates are also highly conserved residues of the repeated modules of the LDL binding domain. Their negative charges are involved in bonds with positively charged residues of the apo B and apo E ligands. Apart from its hydrophobicity, tryptophan does not have a structural or functional role as manifest as that of a cysteine or a charged residue. However, along with methionine, tryptophan is the only amino acid encoded by a single codon, which probably explains the "more mutable" trait observed here.
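Two of the statements above are easy to check numerically: the single-codon status of methionine and tryptophan, and the proportion of residues affected. The sketch below assumes the standard genetic code written in the usual TCAG order; the variable names are illustrative only.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Amino acids (one-letter codes) that are encoded by exactly one codon.
codons_per_residue = Counter(CODON_TABLE.values())
single_codon = {aa for aa, n in codons_per_residue.items() if n == 1}
print(single_codon)

# (affected residues, total residues) in the LDL receptor, as quoted in the text.
affected = {"Cys": (57, 63), "Asp": (43, 75), "Trp": (12, 20)}
affected_percent = {
    aa: round(100 * hit / total) for aa, (hit, total) in affected.items()
}
print(affected_percent)
```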
Figure 2. Distribution of point mutations within the LDL receptor gene (LDLR).
A proportion of the disease-causing substitutions (missense and nonsense mutations), ~25%, have been shown to alter functional splicing signals within exons, such as exonic splicing enhancers (ESE), to create an alternative splice site within exons that is used preferentially, or to induce the loss of the consensus exonic splice site (Cartegni et al. 2002, Sterne-Weiler et al. 2011). Within the LDLR gene, 28.4% of the reported missense mutations are predicted to alter functional splicing signals. The missense mutation c.2140G>C (p.Glu714Gln), which was predicted to be benign by four prediction tools for substitutions (Polyphen*, SIFT*, Pmut* and SNPs3D*), was predicted to cause the loss of the intron 14 donor splice site by both the NetGene2* and NNSPLICE* prediction tools for splice site mutations (Marduel et al. 2010). It is clear, however, that mRNA analyses are necessary to support these predictions, as performed for a small number of exonic substitutions. The conservative amino acid substitution c.2389G>T (p.V776L), which would be unlikely to affect LDL receptor function, concerns the last nucleotide of exon 16 and causes exon 16 skipping (Bourbon et al. 2009). These missense mutations would therefore be likely to exert their major pathological effects on splicing rather than through an alteration in the amino acid sequence of the LDL receptor. This is reinforced by the observation of several silent substitutions associated with the clinical phenotype of familial hypercholesterolemia. The silent mutation p.Leu605Leu (c.1813C>T) was predicted to create a new donor splice site AGGT at position 1813 in exon 12. The use of this new donor site would lead to the substitution of leucine 605 by a threonine, the deletion of 11 amino acids (from alanine 606 to aspartate 616), a frameshift and the appearance of a premature termination 49 codons further on (Marduel et al. 2010). The variant c.621C>T (p.Gly207Gly) was found to be associated with altered splicing. The nucleotide change leading to p.Gly207Gly resulted in the generation of a new 3'-splice donor site in exon 4 of the LDL receptor gene. Splicing of this alternate splice site leads to an in-frame 75-base pair deletion in a stable mRNA of exon 4 and nonsense-mediated mRNA decay (Defesche et al. 2008).
The silent mutation p.Arg406Arg, which also introduces a new splice site, causes a deletion of 31 bp in the LDLR mRNA sequence and introduces a premature termination 4 codons further on (Bourbon et al. 2007).
* Tools for in silico prediction of protein function.
3.2. Frameshift mutations
Among the 1404 small DNA variations of the LDLR gene, a total of 305 (21.7%) are small deletions or insertions, including 261 (85.6%) independent mutations leading to a frameshift and 55 (14.4%) in-frame deletions or insertions. This proportion of in-frame small deletions or insertions is consistent with observations made for other disease-causing genes (Cooper, Antonarakis and Krawczak 1995). The frameshift mutations are due to either a small deletion (176/261; 12.5% of all small DNA variations) or an insertion/duplication (85/261; 6.0% of all small DNA variations) of a few nucleotides (from 1 to 49 for deletions, from 1 to 23 for insertions). The sequence context analysis provides evidence that a repeated motif flanking the frameshift event could be involved in the aetiology of the mutation in 48.0% of the deletional events and in 29.2% of the insertional events.
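The distinction between frameshift and in-frame events comes down to whether the net change in length is a multiple of three nucleotides. A minimal Python sketch of that rule follows; the function name `is_frameshift` is hypothetical, and the rule ignores secondary effects such as a new stop codon created at the junction.

```python
def is_frameshift(deleted_bp=0, inserted_bp=0):
    """Return True if the net length change is not a multiple of three."""
    return (inserted_bp - deleted_bp) % 3 != 0


# Size extremes quoted in the text for frameshift deletions and insertions.
print(is_frameshift(deleted_bp=1))    # single-nucleotide deletion
print(is_frameshift(deleted_bp=49))   # largest small deletion reported
print(is_frameshift(inserted_bp=23))  # largest small insertion reported
print(is_frameshift(deleted_bp=3))    # loss of exactly one codon
```

The same rule accounts for the splicing examples discussed earlier: a 75-bp deletion keeps the reading frame, whereas a 31-bp deletion shifts it.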
More than half of the frameshift mutations involve a single nucleotide: 58.5% (103/176) among deletions and 56.5% (48/85) among insertions. In half of the deletion cases and in half of the insertion cases, the single nucleotide deletion/insertion occurs within runs of 2 to 7 identical bases. Runs of identical bases are known to cause deletions/insertions according to the slipped mispairing mechanism occurring at DNA replication (Ball et al. 2005).
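Whether a single-nucleotide event lies within a run of identical bases can be checked directly on the reference sequence. The following Python sketch is one possible way to do it, assuming 0-based positions and an unambiguous ACGT alphabet; the function names and the toy sequence are hypothetical.

```python
import re


def homopolymer_runs(seq, min_len=2):
    """Return (start, base, length) for each run of identical bases."""
    runs = []
    # ([ACGT])\1+ matches a base followed by one or more copies of itself.
    for match in re.finditer(r"([ACGT])\1+", seq.upper()):
        length = match.end() - match.start()
        if length >= min_len:
            runs.append((match.start(), match.group(1), length))
    return runs


def in_run(seq, pos, min_len=2):
    """Return True if the 0-based position lies inside a homopolymer run."""
    return any(
        start <= pos < start + length
        for start, _, length in homopolymer_runs(seq, min_len)
    )


# Toy sequence for illustration only (not part of LDLR).
toy = "ACGGGTTAC"
print(homopolymer_runs(toy))
print(in_run(toy, 3), in_run(toy, 0))
```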
Deletions involving larger sequences (from 2 to 49 bp) can be divided into three different types: (1) One of the repeated flanking sequences is included in the deletion, which is also explained by the slipped mispairing mechanism occurring at DNA replication (Ball et al. 2005); (2) The repeated sequences flanking the deletion are not included in the frameshift mutation, which is explained by homologous recombination between palindromic or symmetric repeated sequences (Cooper 1995); (3) Parts of the flanking repeated sequences are included in the deletion. To date, no molecular mechanism has been identified to explain such deletional events.
Insertions involving larger sequences (from 2 to 23 bp) can be explained by the same mechanisms as described for deletions, and can be divided into two different types: (1) The inserted sequence is a duplication; (2) The inserted sequence is new within the LDLR gene sequence. This latter observation raises the hypothesis that insertions very probably do not occur at random but instead tend to create repeated sequences that were not present in the original gene sequence. A consensus sequence, GTAAGT, was frequently identified flanking small deletions or insertions (Ball et al. 2005). In the LDLR gene sequence, this consensus is present at the 3' end of exon 4 at position c.681-687. Among the 96 deletions (in frame and frameshift) in the LDLR gene, 11 (11.5%) are at this position, pointing to a discrete hot spot for insertions, as observed in Figure 2 and in accordance with previous reports (Kotze et al. 1996).
3.3. Nonsense mutations
Nonsense mutations represent 10.4% (146/1404) of the small DNA variations in the LDLR gene, and 13.6% (146/1076) of the FH-causing substitutions.
Among the 860 codons of the LDLR gene sequence, 253 potential stop codons (codons that can be turned into a stop codon with only one substitution) were identified (29.4%) and were not equally distributed throughout the whole gene. In exons 2 to 8, more than 33% of the protein codons are potential stop codons, while less than 21% of the protein codons are potential stop codons in exons 9, 10, 13, 15 and 16. Among these 253 potential stop codons, 93 of them (36.8%) are affected by a mutational event.
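The definition of a potential stop codon given above (a sense codon that a single substitution can turn into a stop codon) translates directly into code. The sketch below assumes the three standard stop codons TAA, TAG and TGA; the function names and the toy coding sequence are hypothetical.

```python
from itertools import product

BASES = "TCAG"
STOPS = {"TAA", "TAG", "TGA"}


def is_potential_stop(codon):
    """True if a sense codon becomes a stop codon after one substitution."""
    codon = codon.upper()
    if codon in STOPS:
        return False
    for i, base in enumerate(codon):
        for new in BASES:
            if new != base and codon[:i] + new + codon[i + 1:] in STOPS:
                return True
    return False


def potential_stop_count(cds):
    """Return (potential stop codons, sense codons) for a coding sequence."""
    usable = len(cds) - len(cds) % 3  # ignore an incomplete trailing codon
    codons = [cds[i:i + 3].upper() for i in range(0, usable, 3)]
    sense = [c for c in codons if c not in STOPS]
    return sum(is_potential_stop(c) for c in sense), len(sense)


# Number of sense codons in the genetic code that are one step from a stop.
all_codons = ["".join(c) for c in product(BASES, repeat=3)]
n_potential = sum(is_potential_stop(c) for c in all_codons)
print(n_potential)

# Toy coding sequence for illustration only (not part of LDLR).
print(potential_stop_count("ATGTGGCAAGGGTAA"))
```

Applied to a full coding sequence, `potential_stop_count` would give the kind of per-gene or per-exon proportion discussed here (253 of 860 codons, or 29.4%, for LDLR according to the text).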
The number of mutations affecting potential stop codons is not always related to their frequency in each exon. Potential stop codons are more frequently affected by mutation in exons 3, 9, 10 and 14, with 57.1%, 50.0%, 46.2% and 53.3% respectively of potential stop codons in each exon carrying a mutational event. Conversely, in exons 1, 12, 13 and 17, 16.7%, 18.2%, 20.0% and 26.7% respectively of the potential stop codons are affected by a mutational event.
Among the 1404 small DNA variations of the LDLR gene, a total of 132 (9.4%) are splice site mutations and, among the 1076 single nucleotide FH-causing substitutions, 122 (11.3%) are intronic. From the analysis of a large number of genes, a mean proportion of 15% for splice site mutations among disease-causing DNA substitutions has been estimated (Krawczak et al. 2007). The expected frequency of splice site substitutions within the LDLR gene is 9% (Cooper and Krawczak 1990). The number of FH-causing splice site substitutions observed in this wide review of the literature (9.5%) is thus consistent with the expected value for the LDLR gene.
Among the 132 splice site mutations of the LDLR gene, 14 (10.6%) are mid-intronic mutations situated more than 10 bp from the intron/exon junctions. More than half of the intronic mutational events in the LDLR gene (55.3%, 73/132) affect the two canonical, highly conserved "AG" and "GT" dinucleotides of the acceptor and donor splice sites respectively. In accordance with the analysis of a large number of disease-causing mutations in different genes (Krawczak et al. 1992), intronic mutations in the LDLR gene affecting a donor splice site are more frequent (65.1%, 86/132) than mutations affecting an acceptor splice site (36.4%, 48/132).