COMMENTARY

Can the impact of human genetic variations be predicted?

Yuval Itan(a,1) and Jean-Laurent Casanova(a,b,1)

(a) St. Giles Laboratory of Human Genetics of Infectious Diseases, The Rockefeller University, New York, NY 10065; and (b) Howard Hughes Medical Institute, New York, NY

Compared with an arbitrary reference, the protein-coding sequence of any human genome contains about 20,000 single-nucleotide variants, most of which are heterozygous, and far fewer variants of other types. When their frequency is >1% in a given population, variants are designated as common. The others are rare, including some that appear to be private to the individual or kindred studied. In each individual, at most two variations can underlie a monogenic disorder. It has thus been hoped that computational methods would be able to prioritize these variants and point to a handful of candidate culprits in the exome of any patient, not only for diagnostic purposes but also, more ambitiously, for research purposes (1). This vision culminated in the utopia, or dystopia, of genetic medicine relying only on a dry laboratory to draw conclusions from genome sequencing (2).

To work, such methods should show both a low false-positive (FP) prediction rate, to filter out irrelevant variants (the smaller the size of the haystack the better), and a low false-negative (FN) rate, to avoid filtering out the disease-causing mutations (the needle must stay in the haystack). Many "variant-level" approaches predict the biochemical impact of variants. Examples include sorting intolerant from tolerant (SIFT), which is based on protein sequence homology (3); polymorphism phenotyping v2 (PolyPhen-2), which is based on a combination of sequence conservation and biochemical properties of proteins and was trained on a set of known disease-causing mutations (4); and combined annotation-dependent depletion (CADD), which combines existing variant-level methods (including SIFT and PolyPhen-2) with an analysis of the impact of actual versus simulated human variants (5).

In an important study, Miosge et al. show experimentally that current prediction software generates a low rate of FNs but a high rate of FPs (6). Clinical genetic studies had occasionally shown that computational methods can lack the sensitivity or specificity to predict the impact of variations (7). In a comprehensive study, Miosge et al. (6) studied homozygous mice after mutagenesis with N-ethyl-N-nitrosourea (ENU). They focused on 30 missense mutations in 23 immunity genes, each found in a single mouse substrain (8). These genes were selected because other mice, homozygous for null mutations in these genes, have a clear immunological phenotype.
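To make the FP and FN vocabulary above concrete, here is a minimal sketch, in Python, of how a variant-effect predictor could be scored against variants of experimentally known effect. All variant identifiers and labels are hypothetical, invented purely for illustration.

```python
# Minimal sketch: scoring a variant-effect predictor against known labels.
# All variant IDs and labels below are hypothetical illustrations.

known_deleterious = {"varA", "varB", "varC"}           # experimentally validated
known_benign = {"varD", "varE", "varF", "varG"}        # experimentally validated

predicted_damaging = {"varA", "varB", "varD", "varE"}  # predictor output

# False negatives: disease-causing variants the predictor filtered out
# (the needle falls out of the haystack).
fn = known_deleterious - predicted_damaging
fn_rate = len(fn) / len(known_deleterious)

# False positives: benign variants the predictor retained
# (the haystack stays too large).
fp = known_benign & predicted_damaging
fp_rate = len(fp) / len(known_benign)

print(f"FN rate: {fn_rate:.2f}  FP rate: {fp_rate:.2f}")
```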

Among the 30 missense mutations, 20 were predicted to be damaging by the majority of six methods, including SIFT, PolyPhen-2, and CADD. The authors (6) found that only about four (20%) of the variants predicted to be damaging actually caused the in vivo phenotype of null alleles. None of the variants predicted to be benign showed a phenotype. This study thus reveals a gap between the performance of current variant-level methods in silico and experimental phenotypes in mice in vivo, with a lack of FNs but an excess of FPs (9, 10).

One may argue, however, that the missense mutations predicted to be damaging could actually be hypomorphic or hypermorphic, and therefore not mimic the phenotype of null alleles. The authors (6) thus went one step further and measured in vitro the biochemical impact of all 2,314 possible missense mutations in human TP53, somatic null mutations of which are common in human cancer. They show that only 4% of the variants predicted to be benign were actually deleterious, but that 40% of the variants predicted to be deleterious had little or no biochemical impact. Prediction software thus generates a low rate of FNs but a high rate of FPs for TP53.

The few other human genes that have been computationally and experimentally tested suggest that the conclusions drawn from the TP53 experiments may be generalizable (11, 12). It would be particularly interesting to test genes that are known to harbor both loss-of-function and gain-of-function lesions, and various shades of them, each corresponding to a distinctive phenotype. Perhaps in a not-too-distant future, all nonsynonymous mutations in all coding genes will be tested, each in appropriate cell types, taking advantage of induced pluripotent stem cell (iPSC) and CRISPR-Cas9 technologies. However, even under this optimistic scenario, one should not forget that an in vitro experiment tests only one of the many functions of a protein, and not under the conditions seen in vivo. For example, one may argue that some TP53 mutations that were neutral in the aforementioned assay may be pathogenic in vivo, driving malignancy or another phenotype.
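The asymmetry in the TP53 figures can be made concrete with a small calculation. Turning the two reported conditional rates into counts requires the split between predicted-benign and predicted-deleterious variants, which is not given here, so the 50/50 split below is purely a hypothetical assumption for illustration.

```python
# Illustrative only: the predicted-benign / predicted-deleterious split is
# HYPOTHETICAL (assumed 50/50); only the two conditional rates are reported (6).

total = 2314                       # all possible TP53 missense mutations
predicted_deleterious = 1157       # hypothetical split, for illustration
predicted_benign = total - predicted_deleterious

neutral_among_pred_deleterious = 0.40   # predicted deleterious, little/no impact
deleterious_among_pred_benign = 0.04    # predicted benign, actually deleterious

false_positives = round(predicted_deleterious * neutral_among_pred_deleterious)
false_negatives = round(predicted_benign * deleterious_among_pred_benign)
print(false_positives, false_negatives)  # 463 vs 46 under this assumed split
```

Under this assumed split, FPs would outnumber FNs by roughly an order of magnitude, which is the qualitative point of the study.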


Meanwhile, although the prediction software should reasonably aim at reducing the number of FPs, it is probably more important to ensure stringency regarding the number of FNs: the worst-case scenario is to miss the causal mutation. The current methods appear to be satisfactory in that regard. To ensure that the FN rate is minimal, ideally null, one can test any prediction method against a set of known disease-causing mutations from curated databases, such as the Human Gene Mutation Database (13), HumDiv/HumVar (4), and ClinVar (14). As a matter of fact, several variant-level methods used these databases as training sets to differentiate deleterious from neutral alleles (for example, PolyPhen-2 and HumDiv/HumVar). Of course, it is difficult to devise software that would reduce both FPs and FNs, because rare, nonconservative mutations of highly conserved residues will often receive high prediction scores, whether they are FPs or disease-causing mutations.
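As a minimal sketch of such a benchmark, assuming hypothetical precomputed predictor scores (the variant identifiers, scores, and cutoff below are all invented), the FN rate could be estimated as follows. Note the caveat above: a predictor trained on the same curated database will yield an optimistic estimate.

```python
# Minimal sketch: estimating a predictor's FN rate against curated
# disease-causing mutations (e.g., drawn from ClinVar or HGMD).
# Variant IDs, scores, and the cutoff are hypothetical placeholders.

scores = {"VAR1": 0.97, "VAR2": 0.88, "VAR3": 0.12, "VAR4": 0.91, "VAR5": 0.76}
cutoff = 0.50  # hypothetical decision threshold: >= cutoff means "damaging"

# FNs: known pathogenic variants the predictor calls benign.
missed = [v for v, s in scores.items() if s < cutoff]
fn_rate = len(missed) / len(scores)
print(f"Estimated FN rate on curated pathogenic variants: {fn_rate:.2f}")
```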

Miosge et al. (6) understandably focus on missense mutations, which are the most abundant coding nonsynonymous variations and currently pose the greatest prediction difficulties. Popular variant-level prediction methods, such as PolyPhen-2 and SIFT (3, 4), were designed to predict the impact of missense variants. However, it is also important to discuss the predictive value of variant-level approaches for other classes of variants. Up to 63,541 of the 135,953 (46.74%) Human Gene Mutation Database-curated disease-causing mutations are not missense (13). These variations are thought to have stronger functional consequences, because most are nonsense, start-loss, stop-loss, or splicing mutations, or insertions and deletions that can be small or large in scale. However, one should be careful in predicting a deleterious impact of apparently severe mutations, as even a premature stop codon is not synonymous (so to speak) with a null allele. Indeed, there can be downstream reinitiation of translation if the stop is sufficiently upstream, or there can be stop codon read-through, depending on the sequence in the vicinity. Or, if the stop is sufficiently downstream, the truncated protein may function well. Moreover, genes can display multiple isoforms, and entire coding exons may be partly or totally redundant, making stop codons anywhere inconsequential for some or all protein functions. In addition, exons carrying a frameshift may functionally rescue a splice variant that is normally out-of-frame (15). Finally, stop mutations in certain proteins may even paradoxically be gain-of-function (16). Conversely, apparently synonymous mutations in coding exons can be loss-of-function, not necessarily by interfering with the splicing process, and should not be dismissed blindly (17).

Moreover, whole-genome sequencing not only improves the coverage of exome sequencing, but also provides an opportunity to discover disease-causing mutations elsewhere in the genome (18, 19). The CADD method filled these blanks by predicting the impact of most categories of genome variants (5). These predictions must, however, be taken with even greater caution than those of missense mutations. In particular, much remains to be learned about the biology of variations in non-protein-coding genes and the intergenic space.

Fortunately, variant-level approaches are not the only computational tools available to predict the clinical impact of a variant. One can also take into consideration the properties of the genes themselves, in "gene-level" approaches. The human gene connectome (20), which measures the biological distance between genes, is helpful for selecting candidate genes. Other methods assess the population genetic properties of the genes, thereby estimating their relevance to human disease. The residual variation intolerance score (21) ranks human genes by their deviation from the genome-wide average number of nonsynonymous mutations found in genes with a similar global mutational burden. It showed that known Mendelian disease-causing genes are less tolerant of coding variations than other genes. The de novo mutation excess method (22) compares the rate and nature of potential and observed de novo mutations per human gene. These two methods have distinct algorithms and complementary proposed uses. The methods nevertheless partly overlap, because the information generated in both cases is inspired by studies of purifying selection.

In the future, combining existing methods and tailoring new methods may also optimize prediction. For example, a method that would detect (and filter out) genes likely to harbor FPs would neatly complement existing methods, which are optimized to detect (and select) genes likely to harbor true positives. Importantly, a combination of variant-level and gene-level approaches is synergistic: a variant predicted to be damaging in a poorly polymorphic protein is likely to have the strongest phenotypic impact, whereas a variant predicted to be benign in a protein riddled with variations is unlikely to have a phenotypic impact (21).
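To illustrate this synergy, here is a minimal sketch that combines a CADD-like variant-level score with an RVIS-like gene-level intolerance percentile. The cutoffs (15.0 and 25.0) and the example inputs are hypothetical placeholders, not calibrated values.

```python
# Minimal sketch of the variant-level x gene-level synergy described above.
# Thresholds and inputs are hypothetical placeholders, not calibrated values.

def prioritize(variant_score: float, gene_intolerance_percentile: float) -> str:
    """Rank a candidate by combining a variant-level damage score
    (e.g., a CADD-like scaled score) with a gene-level intolerance
    rank (e.g., an RVIS-like percentile, where low means intolerant)."""
    damaging = variant_score >= 15.0                  # hypothetical cutoff
    intolerant_gene = gene_intolerance_percentile <= 25.0
    if damaging and intolerant_gene:
        return "strong candidate"    # damaging variant, poorly polymorphic gene
    if not damaging and not intolerant_gene:
        return "unlikely candidate"  # benign-looking variant, variation-tolerant gene
    return "uncertain"               # discordant signals: test experimentally

print(prioritize(22.4, 10.0))  # hypothetical inputs -> "strong candidate"
```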

During the in silico search for candidate variations in single patients, small series of patients, or larger populations, both variant-level and gene-level approaches complement other processes. Linkage analysis is beneficial in multiplex or consanguineous kindreds, and genetic homogeneity can be of invaluable help when two or more kindreds are studied. The comparison of the frequency of variants in a set of genes in patients versus the general population (or specific ethnic groups) has also more recently shown utility (23). More generally, knowledge of the prevalence of the disease and estimates of clinical penetrance and genetic homogeneity are essential to make the best use of the prediction software, as they directly govern the expected frequency of the candidate mutant alleles (9, 10).

Equally important, the mode of inheritance is a key factor when determining the relevance of a genotype for a phenotype. Regardless of the variant- and gene-level prediction scores, homozygous and compound heterozygous variants are stronger candidates for a recessive trait than any heterozygous variant. A recessive model with homozygous variations should be favored in patients born to consanguineous parents. In sporadic cases, de novo mutations are more likely to be causative than other heterozygous variants under a model of high clinical penetrance.

It is often asserted that genome-wide approaches, in patients or populations, are unbiased and hypothesis-free. This is true from a physiological perspective, but not entirely correct, as a good genetic hypothesis is key to a successful endeavor. The mode of inheritance, the level of genetic heterogeneity, and clinical penetrance are the three key hypotheses that will lead to success or failure, whether for diagnosis or research, and in the latter case whether for single-patient or large-population genetic studies. Variant- and gene-level software will never compensate for a faulty genetic hypothesis.
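As a sketch of how such a genetic hypothesis constrains the search space, the following hypothetical filter keeps only genotypes consistent with the hypothesized mode of inheritance. The data structure and field names are invented for illustration.

```python
# Minimal sketch of genotype filtering under an explicit genetic hypothesis
# (mode of inheritance). Field names and records are hypothetical.

def candidate_genotypes(variants, model, sporadic=False):
    """Keep genotypes consistent with the hypothesized mode of inheritance."""
    keep = []
    for v in variants:
        if model == "recessive":
            # Homozygous or compound heterozygous genotypes are stronger
            # candidates for a recessive trait than any heterozygous variant;
            # homozygosity is favored in consanguineous kindreds.
            if v["zygosity"] in ("homozygous", "compound_heterozygous"):
                keep.append(v)
        elif model == "dominant":
            # In sporadic cases, de novo mutations are the strongest
            # candidates under a model of high clinical penetrance.
            if v["zygosity"] == "heterozygous" and (v["de_novo"] or not sporadic):
                keep.append(v)
    return keep

# Hypothetical usage:
variants = [
    {"id": "v1", "zygosity": "homozygous", "de_novo": False},
    {"id": "v2", "zygosity": "heterozygous", "de_novo": True},
]
print(candidate_genotypes(variants, model="recessive"))               # keeps v1
print(candidate_genotypes(variants, model="dominant", sporadic=True)) # keeps v2
```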

Miosge et al. (6) assess the predictive power of variant-level methods for 30 missense mutations in 23 mouse immunity genes in vivo and for 2,314 missense mutations in human TP53 in vitro. This massive achievement suggests that current methods overestimate the impact of mutations. One can hope that variant- and gene-level approaches will improve with time. However, this study reminds us that it will remain necessary to validate any computational prediction experimentally, by functional assays (9, 10). Candidate mutant alleles must be tested individually, as must cells carrying the biallelic genotypes. With the advent of iPSC and CRISPR-Cas9 technologies, decisive experiments can now be conducted with human cells in vitro in fields other than hematology and immunology (10, 11). In this process, broad and profound knowledge of physiological and pathological processes is essential, not only to select candidate genes and variations, but also to design ways to test them experimentally. Experiments remain necessary for each novel candidate mutation, and when a known disease-causing mutation is found in a patient with a different phenotype.

In designing such experiments, the aim is to discover the genotype underlying a given phenotype. In anyone's genome, there are multiple genotypes that are responsible for a variety of phenotypes. Establishing a causal relationship requires bridging a candidate genotype and the relevant phenotype with an experimentally proven thread of causes and consequences. This relies on a series of intermediate phenotypes, at the molecular, cellular, and organismal levels, connected via mechanisms that best ascertain the causal relationship between the variant(s) and the disease.

1. Goldstein DB, et al. (2013) Sequencing studies in human genetics: Design and interpretation. Nat Rev Genet 14(7):460–470.
2. Service RF (2013) Biology's dry future. Science 342(6155):186–189.
3. Kumar P, Henikoff S, Ng PC (2009) Predicting the effects of coding non-synonymous variants on protein function using the SIFT algorithm. Nat Protoc 4(7):1073–1081.
4. Adzhubei IA, et al. (2010) A method and server for predicting damaging missense mutations. Nat Methods 7(4):248–249.
5. Kircher M, et al. (2014) A general framework for estimating the relative pathogenicity of human genetic variants. Nat Genet 46(3):310–315.
6. Miosge LA, et al. (2015) Comparison of predicted and actual consequences of missense mutations. Proc Natl Acad Sci USA 112:E5189–E5198.
7. Jordan DM, et al.; Task Force for Neonatal Genomics (2015) Identification of cis-suppression of human disease mutations by comparative genomics. Nature 524(7564):225–229.
8. Andrews TD, et al. (2012) Massively parallel sequencing of the mouse exome to accurately identify rare, induced mutations: An immediate source for thousands of new mouse models. Open Biol 2(5):120061.
9. Chakravarti A, Clark AG, Mootha VK (2013) Distilling pathophysiology from complex disease genetics. Cell 155(1):21–26.
10. Casanova JL, Conley ME, Seligman SJ, Abel L, Notarangelo LD (2014) Guidelines for genetic studies in single patients: Lessons from primary immunodeficiencies. J Exp Med 211(11):2137–2149.
11. Findlay GM, Boyle EA, Hause RJ, Klein JC, Shendure J (2014) Saturation editing of genomic regions by multiplex homology-directed repair. Nature 513(7516):120–123.
12. Starita LM, et al. (2015) Massively parallel functional analysis of BRCA1 RING domain variants. Genetics 200(2):413–422.
13. Stenson PD, et al. (2014) The Human Gene Mutation Database: Building a comprehensive mutation repository for clinical and molecular genetics, diagnostic testing and personalized genomic medicine. Hum Genet 133(1):1–9.
14. Landrum MJ, et al. (2014) ClinVar: Public archive of relationships among sequence variation and human phenotype. Nucleic Acids Res 42(Database issue):D980–D985.
15. Bolze A, et al. (2012) A mild form of SLC29A3 disorder: A frameshift deletion leads to the paradoxical translation of an otherwise noncoding mRNA splice variant. PLoS One 7(1):e29708.
16. Boisson B, Quartier P, Casanova JL (2015) Immunological loss-of-function due to genetic gain-of-function in humans: Autosomal dominance of the third kind. Curr Opin Immunol 32:90–105.
17. Sauna ZE, Kimchi-Sarfaty C (2011) Understanding the contribution of synonymous mutations to human disease. Nat Rev Genet 12(10):683–691.
18. Ng SB, et al. (2009) Targeted capture and massively parallel sequencing of 12 human exomes. Nature 461(7261):272–276.
19. Belkadi A, et al. (2015) Whole-genome sequencing is more powerful than whole-exome sequencing for detecting exome variants. Proc Natl Acad Sci USA 112(17):5473–5478.
20. Itan Y, et al. (2013) The human gene connectome as a map of short cuts for morbid allele discovery. Proc Natl Acad Sci USA 110(14):5558–5563.
21. Petrovski S, Wang Q, Heinzen EL, Allen AS, Goldstein DB (2013) Genic intolerance to functional variation and the interpretation of personal genomes. PLoS Genet 9(8):e1003709.
22. Samocha KE, et al. (2014) A framework for the interpretation of de novo mutation in human disease. Nat Genet 46(9):944–950.
23. Bolze A, et al. (2013) Ribosomal protein SA haploinsufficiency in humans with isolated congenital asplenia. Science 340(6135):976–978.

Author contributions: Y.I. and J.-L.C. wrote the paper.

The authors declare no conflict of interest.

See companion article on page E5189.

1 To whom correspondence may be addressed. Email: [email protected] or [email protected].


