Forensic Science International: Genetics 9 (2014) 111–117



Evaluation of the IrisPlex DNA-based eye color prediction assay in a United States population

Gina M. Dembinski, Christine J. Picard *

Department of Biology and Forensic and Investigative Sciences Program, Indiana University-Purdue University Indianapolis, 723 W. Michigan Street, Indianapolis, IN 46202, USA

ARTICLE INFO

Article history: Received 25 April 2013; received in revised form 27 August 2013; accepted 4 December 2013

Keywords: SNP; DNA phenotyping; Forensic; Eye color prediction; IrisPlex

ABSTRACT

DNA phenotyping is a rapidly developing area of research in forensic biology. Externally visible characteristics (EVCs) can be determined from genotype data, specifically from single nucleotide polymorphisms (SNPs). These SNPs are chosen for their association with genes related to the phenotype of interest, with known examples in eye, hair, and skin color traits. DNA phenotyping has forensic importance when unknown biological samples from a crime scene do not produce a criminal database hit; a phenotypic profile of the sample can then be used to develop investigational leads. IrisPlex, an eye color prediction assay, has previously shown high prediction rates for blue and brown eye color in a Dutch European population. The objective of this work was to evaluate its utility in a North American population. We evaluated the six SNPs included in the IrisPlex assay in a population sample collected from a USA college campus. We used a quantitative method of eye color classification based on the red, green, and blue (RGB) color components of digital photographs of the eye taken from each study volunteer, so that each eye was placed in one of three eye color categories: brown, intermediate, or blue. Objective color classification was shown to correlate with basic human visual determination, making it a feasible option for use in future prediction assay development. Using these samples and various models, the maximum prediction accuracies of the IrisPlex system after allele frequency adjustment were 58% and 95% for brown and blue eye color predictions, respectively, and 11% for intermediate eye colors. Future developments should include the incorporation of additional informative SNPs, specifically ones related to the intermediate eye color, and we recommend the use of a Bayesian approach as a prediction model because likelihood ratios can be determined for reporting purposes. © 2013 Elsevier Ireland Ltd. All rights reserved.

* Corresponding author at: 723 W. Michigan Street, SL 306, Indianapolis, IN 46202, USA. Tel.: +1 317 278 1050. E-mail addresses: [email protected] (G.M. Dembinski), [email protected] (C.J. Picard).
1872-4973/$ – see front matter © 2013 Elsevier Ireland Ltd. All rights reserved. http://dx.doi.org/10.1016/j.fsigen.2013.12.003

1. Introduction

One of the rapidly developing areas in forensic biology is the ability to predict externally visible characteristics (EVCs) based solely on DNA-based genetic information, also known as DNA phenotyping [1]. In DNA phenotyping, single nucleotide polymorphism (SNP) markers found in and around genes associated with EVCs are genotyped for the prediction of certain traits [2]. For example, sex is an EVC that can be accurately predicted by analysis of the amelogenin marker [2]. In 2001, Grimes et al. [3] published the first example of a DNA-based phenotyping test using the MC1R gene for the prediction of the red hair phenotype [4]. The EVCs that show the most promise for successful development in forensic prediction tests are skin, hair, and eye color; they are among the most visible phenotypic traits [5] and have a small number of markers that account for a large proportion of the variation [3].

The sensitivity of conventional short tandem repeat (STR) testing has improved to the point that DNA profiles are now routinely obtained from very little biological material, including touch samples [6]. However, a limitation of DNA evidence arises when a DNA profile from a crime scene fails to match any individual in a DNA database and thus results in a 'dead end' for the investigation. FBI statistics showed that the number of DNA profiles in databases increased exponentially from 2001 to 2006, yet hits increased only linearly, which leads to a growing discrepancy between unmatched DNA profiles and hits [3]. As a complement to conventional STR profiling, DNA phenotyping can be used as an investigational tool, not just for criminal casework, but also for cases pertaining to missing persons or mass disasters [1]. For example, the information from a DNA phenotype profile will either corroborate or negate eyewitness statements [1]. This has been demonstrated in a criminal investigation in which a genetic testing company used 71 ancestry informative markers (AIMs) to negate eyewitness testimony [7]. The markers suggested the contributor was predominantly of African descent, whereas eyewitness testimony had suggested a Caucasian suspect. A month later, the suspect was arrested and has since been convicted of the charges [7]. Once developed for forensic applications, these predictions may help in developing plausible leads for investigations, especially in cases where the genetic information may be the only witness.

The full genetic determination of these external traits is still being explored; however, many studies have observed associations between SNPs and genes that contribute to variation in pigmentation, specifically eye, hair, and skin color [4,5,8–16]. Common variants associated with normal pigmentation in humans have been identified via genome-wide association studies (GWAS), resulting in six informative genes: MC1R, OCA2, SLC24A5, MATP (SLC45A2), ASIP, and TYR [5]. For eye color, three SNPs have significant melanin effects in human melanocytes: rs12913832 (HERC2), rs16891982 (SLC45A2), and rs1426654 (SLC24A5) [17]. Though these traits are expected to be useful in future forensic investigations, more research in this area is necessary because they are highly polymorphic and polygenic [10]. One SNP, rs12913832, lies in a highly conserved intronic region of the HERC2 gene, upstream of the OCA2 promoter on chromosome 15 [16]. This SNP, in conjunction with SNPs in the OCA2 gene, has the strongest association with iris color, especially in predicting blue eye color [16]. However, no single gene could be used to make a reliable iris color inference, which suggests intergenic complexity in iris color determination [9].

Previously, the IrisPlex assay of six SNPs was used to make eye color predictions in three color categories: brown, blue, or intermediate [1]. In that work, based on a Dutch European population, the prediction accuracy was high for blue and brown eye color (91.6% and 87.5%, respectively) using a multinomial logistic regression model [1]. However, no intermediate eye color phenotypes were included in that study. The objective of the present work was to test the model (under the described parameters [1]) in a North American sample population. When it was determined that the predictive power of the model was not ideal (see Section 3), we tested additional models for eye color prediction using the same six SNPs, including a modification of the regression model based on the North American minor allele frequencies, and a Bayesian network model [18].

2. Materials and methods

2.1. Sample collection

A buccal swab and a digital photograph of the right eye were taken from each of 200 anonymous volunteers, who were asked to remove any corrective lenses (Indiana University IRB Approval Protocol # 1111007371). A Canon PowerShot digital camera (Canon Inc., Tokyo, Japan) was used with macro mode, ISO 80, and flash settings. A light box was built for photo collection to ensure equal distance and lighting conditions for all photos.

2.2. DNA extraction and quantitation

DNA was extracted by a modified organic extraction. Briefly, swabs were incubated in 1.5 mL tubes at 65 °C for a minimum of 8 h in 500 µL lysis buffer (Invitrogen, Carlsbad, CA) with 50 µL proteinase K (Qiagen, Germantown, MD). Following lysis, the buffer was removed from the swabs with the use of DNA IQ™ spin baskets (Promega Corporation, Madison, WI) and discarded. Then, 500 µL phenol (Thermo Fisher Scientific Inc., Waltham, MA) was added and the tube was centrifuged at 13,000 rpm for 1 min. The aqueous layer was removed to a new tube, 500 µL phenol:chloroform:isoamyl alcohol (25:24:1) (Thermo Fisher Scientific Inc.) was added, and the tube was centrifuged at 13,000 rpm for 1 min. The aqueous layer was removed and placed into a new tube, to which 500 µL of cold 95% ethanol (Thermo Fisher Scientific Inc.) and 25 µL of 0.2 M NaCl (Thermo Fisher Scientific Inc.) were added. The tubes were centrifuged at 4 °C at 13,000 rpm for 15 min. The supernatant was discarded and the pellet was washed with 500 µL of cold 70% ethanol, followed by centrifugation at 4 °C at 13,000 rpm for 5 min. The supernatant was removed and the sample was allowed to air dry. The sample was re-suspended in 50 µL of TE buffer (Thermo Fisher Scientific Inc.) and stored at −20 °C until further use. DNA quantitation was performed according to the manufacturer's specifications using the Quantifiler® Human DNA Quantification kit (Life Technologies Corp., Carlsbad, CA) on a 7300 Real Time PCR System (Life Technologies Corp.).

2.3. SNP amplification and genotyping

The primers for the six SNPs are described in Walsh et al. [1]. However, the SNPs were amplified in two multiplex reactions that were later pooled: one with four of the IrisPlex SNP primer sets (HERC2, SLC45A2, TYR, IRF4) and one with the remaining two (SLC24A4 and OCA2). For each multiplex reaction, 1 ng of DNA was amplified in a 12 µL reaction with 6 µL of AmpliTaq Gold 360 Master Mix (Life Technologies Corp.), including 0.5 µL GC Enhancer, and 2.5 µM of each primer. PCR was performed using the same parameters as in Walsh et al. [23] on a Mastercycler Pro thermal cycler (Eppendorf, Hauppauge, NY). The PCR products were purified using USB ExoSAP-IT® (Affymetrix, Santa Clara, CA).

The purified PCR products were used in the single-base extension (SBE) reaction, with 1 µL of total pooled PCR product and 2 µL of SNaPshot reaction mix in a reaction volume of 5 µL, using the SNaPshot® Multiplex kit (Life Technologies Corp.). The SBE reaction was run on a Mastercycler Pro (Eppendorf) following the same SBE conditions as Walsh et al. [1]. SBE products were then purified using shrimp alkaline phosphatase (SAP, Takara, Berkeley, CA). For capillary electrophoresis, 1 µL of purified SNaPshot product was run on an ABI 3500 Genetic Analyzer (Life Technologies Corp.) following the standard protocol of the SNaPshot® Multiplex kit. Data analysis was performed using GeneMarker v2.20 software (SoftGenetics, State College, PA). For sensitivity, a threshold of 200 rfu was set for peak intensities, and a heterozygote peak height ratio (PHR) of 0.40 was used for allele designation.

2.4. Iris color determination and measurement

Eye color was determined both subjectively and objectively. The subjective determination was basic human visual identification: a set of the digital photos was given to 5 individual examiners, who were asked to classify each photo as blue, brown, or intermediate, where intermediate was any color other than blue or brown. Every digital photo was evaluated once by each examiner. To resolve any disagreement among the 5 examiners, the consensus eye color was taken as the subjectively determined color.

A quantitative eye color determination was made using the iris melanin index (IMI), a numerical value derived from ratios of measured color reflectance values [7]. This method involved determining the red, green, and blue (RGB) color and luminosity (brightness) components of the iris from each digital photo, calculating the reflectance ratios red/green, red/blue, and green/blue, averaging the luminosity, and summing the values into a single number. The iris was digitally extracted from each photo, and the RGB components and luminosity of the whole iris were measured using Adobe Photoshop® Elements 10 (Adobe Systems Inc., San Jose, CA). A ratio of these components, determined by the histogram function, along with averages of color scale and luminosity, expresses the color as an IMI value. The IMI ranges for each eye color category (Table 1) were set for concordance with the visually determined classifications, chosen so that confusion between adjacent color ranges was minimized.

Table 1
Percentage (%) of samples determined for each eye color category. The IMI values were calculated for each sample, and the IMI ranges were set to give the fewest misclassifications when compared with the visual determinations.

Eye color       Visually determined (%)   IMI value    IMI determined (%)
Brown           34                        1.25–1.65    36.5
Intermediate    26                        1.66–2.32    22
Blue            40                        2.33–3.20    41.5

The RGB components were converted to xy color coordinates using the OpenRGB software program (Logicol, Trieste, Italy), with the F7 fluorescent illuminant and a 10° observation angle used in the conversion factors, allowing for point comparisons and graphical representations of each color category (see Section 3). The converted coordinates were grouped statistically by discriminant analysis (DA) using XLSTAT (Addinsoft, Paris, France) in Microsoft Excel (Microsoft, Redmond, WA).

To determine whether our sample population was representative of the larger US population, a chi-squared (χ²) test was performed for each eye color category to detect statistically significant deviations in eye color frequency from a larger US sample population (State of Indiana).

2.5. Eye color prediction models

Eye color prediction was done using the multinomial logistic regression model used by Walsh et al. [1]. This model classifies subjects categorically based on population minor allele frequencies and calculates, for each individual, the probability of each color category: brown, intermediate, or blue [19]. The color category with the highest probability above a user-set threshold is the predicted color. Two thresholds were used to evaluate the predictions, 0.5 and 0.7, the same thresholds used in the original study [1]. No other thresholds were considered at this time. The original model was based on data from a Dutch population that included 3804 individuals [19] and was evaluated using a test sample of 40 individuals [1].
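The thresholded multinomial prediction described above can be sketched as follows. This is a minimal illustration, not the published model: it assumes the usual multinomial logistic form with brown as the reference category and minor allele counts (0, 1, or 2) as predictors, and every coefficient value, the example genotype, and the function names are hypothetical placeholders rather than the parameters of Liu et al. [19] or Walsh et al. [1].

```python
import math


def category_probabilities(minor_allele_counts, alpha, beta):
    """Multinomial logistic probabilities with brown as the reference category.

    minor_allele_counts: minor allele count (0, 1, or 2) for each SNP.
    alpha, beta: intercepts and per-SNP coefficients for the two
    non-reference categories (hypothetical values in this sketch).
    """
    scores = {
        cat: alpha[cat] + sum(b * x for b, x in zip(beta[cat], minor_allele_counts))
        for cat in ("blue", "intermediate")
    }
    denom = 1.0 + sum(math.exp(s) for s in scores.values())
    probs = {cat: math.exp(s) / denom for cat, s in scores.items()}
    probs["brown"] = 1.0 / denom
    return probs


def call_eye_color(probs, threshold):
    """Return the most probable category, or 'inconclusive' if it is not above the threshold."""
    best = max(probs, key=probs.get)
    return best if probs[best] > threshold else "inconclusive"


# Hypothetical placeholder coefficients for six SNPs (NOT published values).
alpha = {"blue": -2.0, "intermediate": -1.0}
beta = {
    "blue": [2.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    "intermediate": [0.5, 0.0, 0.0, 0.0, 0.0, 0.0],
}
genotype = [2, 0, 1, 0, 0, 1]  # hypothetical minor allele counts

probs = category_probabilities(genotype, alpha, beta)
for threshold in (0.5, 0.7):
    print(threshold, call_eye_color(probs, threshold))
```

Raising the threshold can only turn a call into an inconclusive result, never change it to a different color, which is why a stricter threshold trades correct and incorrect predictions for inconclusive ones.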
Given the poor results using the above model (see Section 3), minor allele frequencies were calculated from 100 random samples (training set), and an adjusted multinomial regression model was developed and tested with the remaining 100 samples (verification set) using MATLAB® 2012a (The MathWorks Inc., Natick, MA). An alternative prediction model based on Bayesian network (BN) analysis, also using minor allele frequencies as described by Pośpiech et al. [18], was tested using the Hugin Lite 7.6 software program (Hugin Expert A/S, Aalborg, Denmark). This model gives a probability for every eye color category based on the a priori odds of each eye color. Two sets of a priori odds were evaluated: equal odds for all three color categories, and odds based on the known eye color distribution deemed representative of our sample set (Indiana Bureau of Motor Vehicles database).

The six SNPs used in the IrisPlex assay were initially evaluated by Liu et al. [19], where the area under the receiver operating characteristic (ROC) curve was used to evaluate overall prediction model performance. An ROC curve is a graphical plot with the false positive rate on the x axis and the true positive rate on the y axis. To compare the performance of each model (e.g. its ability to classify correctly), the area under the curve (AUC), sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were determined for each of our models. In evaluating ROC curves, an AUC value of 0.5 indicates prediction no better than chance, while an AUC close to 1 indicates near perfect prediction accuracy. Note that the AUC reflects both true positives (e.g., correctly predicting blue for blue samples) and true negatives (e.g., correctly predicting non-blue for non-blue samples). Sensitivity is the true positive rate: the number of true positives out of all actual positives (true positives plus false negatives). Specificity is the true negative rate: the number of true negatives out of all actual negatives (true negatives plus false positives). The PPV is the number of true positives out of the total number of positive predictions (which includes false positives), and the NPV is the number of true negatives out of the total number of negative predictions (which includes false negatives).

3. Results

3.1. Eye color determination

The iris color in each digital photo was subjectively and objectively determined for all 200 samples. For the subjective determination, all 5 examiners classified samples in the same color category for 68% of the samples. There was an average of 28% disagreement for samples where one or two examiners differed in classifying the sample. An IMI scale was determined after digital analysis and set for the highest agreement with the visual determinations (Table 1). Values were classified as brown if they fell in the range 1.25–1.65, intermediate in the range 1.66–2.32, and blue in the range 2.33–3.20. Hazel and green colors are considered to be in the intermediate range.
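As an illustration of how the IMI ranges in Table 1 translate into a classification rule, a minimal sketch is given below. The function name is hypothetical, and rounding the IMI to two decimals before comparison is an assumption made here because the published ranges are stated to two decimals; values outside all ranges are simply left unclassified.

```python
# IMI ranges as listed in Table 1 (inclusive bounds).
IMI_RANGES = (
    ("brown", 1.25, 1.65),
    ("intermediate", 1.66, 2.32),
    ("blue", 2.33, 3.20),
)


def classify_imi(imi):
    """Map an IMI value to an eye color category, or None if out of range.

    Assumption: the value is rounded to two decimals so that it cannot
    fall into the gap between adjacent published ranges.
    """
    value = round(imi, 2)
    for color, low, high in IMI_RANGES:
        if low <= value <= high:
            return color
    return None


for imi in (1.40, 2.00, 2.33, 3.50):
    print(imi, classify_imi(imi))
```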
Discriminant analysis was performed using the xy color coordinates to show the statistical separation of the color categories; overlap of the 95% confidence ellipses is seen only between intermediate and either brown or blue (Fig. 1). The IMI classes were used as the determined color for the sample set. For 22 of the 200 samples (supplemental Table S1), the objective IMI classification and the subjective human visual determination did not place the sample in the same color category. All of these mismatched classifications were between intermediate and either brown or blue. Supplementary material related to this article can be found, in the online version, at doi:10.1016/j.fsigen.2013.12.003.

Fig. 1. Discriminant analysis plot of the first two canonical variates (F1 and F2) describing the sample population based on color coordinates. It shows the statistical grouping of the 200 samples with 100% variation based on xy color coordinates of each color category.


Table 2
Eye color distribution among sample population and larger scale United States sample population (approximate percentages, %).

                  Collected samples (%)   State of Indiana (%)   P value
Brown             34                      43                     >0.50
Blue              40                      34                     >0.30
Intermediate      26                      23                     >0.50
Sample size (N)   200                     7,115,106

To determine if the 200 samples were a representative sample of the Indiana population, we collected data from the Indiana Bureau of Motor Vehicles (D. Rosebrough, Indiana BMV, personal communication). There was no significant difference between the frequency distributions of our collected sample and that collected by the BMV (Table 2), although there was a higher number of observed blue-eyed individuals (p > 0.10).

3.2. Multinomial logistic regression model

The IrisPlex genotypes for all individuals were determined (Table S1) and used as the input for the prediction models. The prediction model used by Walsh et al. [1] calculates probabilities for each of the three color categories based on multinomial logistic regression using the formulas in Liu et al. [19]. Two different parameter sets were used for prediction evaluation: the Walsh et al. parameters [1] and an adjusted set based on our sample allele frequency data. Two cut-off probability thresholds, 0.5 and 0.7, were chosen as discussed by Walsh et al. [1] for evaluating the accuracy of prediction. The IMI classification, not the visual determination, was used as the eye color for each sample. Comparing the two methods of classification, the quantitative method did lead to more correct predictions, especially at the 0.5 threshold, where 4 and 2 more samples were correctly predicted under the IrisPlex and adjusted parameters, respectively, than with the visual determination method.

Using the original frequencies, the correct eye color prediction rate was 95% and 76% for blue and brown eye colors, respectively, at the 0.7 threshold, and 95% and 88% for blue and brown eye colors, respectively, at the 0.5 threshold (Table 3). The intermediate color did not yield any true positive predictions at either threshold. Using the adjusted parameters (based on our training set), the correct prediction rates for the verification set (n = 100) were 93% and 42% for blue and brown eye colors, respectively, at the 0.7 threshold, and 95% and 58% for blue and brown eye colors, respectively, at the 0.5 threshold. For the intermediate color the rate was 11% at the 0.7 threshold and 19% at the 0.5 threshold (Table 3). With the adjusted parameters, compared with the IrisPlex parameters, the number of correct predictions decreased for the brown eye color and increased for the intermediate color; blue eye color predictions were similar for either model. The adjusted parameters did not yield more correct predictions than those of Walsh et al.; importantly, however, they produced fewer incorrect predictions and more inconclusive predictions, i.e., those in which no color category reached a probability above the 0.5 threshold (Fig. 2).

Table 3
The correct prediction rates (%) by color category of the verification set (N = 100) were evaluated with the IrisPlex regression parameters and the adjusted regression parameters. The verification set was then evaluated using the Bayesian network with either set of a priori odds.

Parameters             Threshold   Brown (%)   Intermediate (%)   Blue (%)
MLR: IrisPlex          0.5         88          0                  95
                       0.7         76          0                  95
MLR: adjusted          0.5         58          19                 95
                       0.7         42          11                 93
Bayesian: equal odds   0.5         55          20                 95
                       0.7         55          20                 80
Bayesian: adjusted     0.5         67          30                 98
                       0.7         55          15                 98

Equal odds = 0.33 each eye color category, adjusted odds = 0.33 brown, 0.44 blue, 0.17 intermediate.

Fig. 2. The frequency of overall correct, incorrect, and inconclusive eye color predictions using the multinomial regression model. (a) Predictions under IrisPlex parameters at the 0.5 threshold, (b) predictions under adjusted parameters at the 0.5 threshold, (c) predictions under IrisPlex parameters at the 0.7 threshold, and (d) predictions under adjusted parameters at the 0.7 threshold.

The adjusted parameters model resulted in the lowest number of incorrect predictions (i.e., the lowest error rate), with error rates of 3% (blue eye color) and 9% (brown eye color) at both probability thresholds. For the intermediate color prediction, the error rates were 59% at the 0.5 threshold and 48% at the 0.7 threshold. The AUCs were determined for our samples using both parameter sets (Table 4). Our samples evaluated with the IrisPlex parameters show an AUC of 0.97 for blue, 0.84 for intermediate, and 0.95 for brown. The AUCs improved when our samples were evaluated with the adjusted frequency parameters, with values of 0.97 for blue, 0.89 for intermediate, and 0.97 for brown. With the adjusted parameters, the sensitivity improved for brown and intermediate predictions, and the positive predictive values also improved, with PPVs of 93% for blue, 73% for intermediate, and 89% for brown (Table 5).

Table 4
AUC values of each prediction model evaluating the training set (N = 100). AUC reflects model performance (ability to make accurate predictions). A higher AUC value indicates better model performance.

Prediction model          Blue   Intermediate   Brown
Liu et al. [19]           0.91   0.73           0.93
IrisPlex parameters [1]   0.97   0.84           0.95
Adjusted parameters       0.97   0.89           0.97
Bayesian: equal odds      0.97   0.88           0.96
Bayesian: adjusted        0.97   0.86           0.96

3.3. Bayesian network analysis

Statistical analysis was performed using a Bayesian network prediction model as described by Pośpiech et al. [18]. The predictions were evaluated with two a priori probability scenarios: one adjusted to the eye color frequencies of our training set, and one assuming no previous knowledge of population frequencies and thus equal odds for each color. Table 3 shows the positive prediction rates for each eye color category, and Fig. 3 shows a summary of the overall number of predictions for both prior probability sets. The AUC for the Bayesian model was also determined as a measure of its prediction accuracy (Table 4). The AUC with both sets of a priori probabilities was 0.96 for brown eye color and 0.97 for blue eye color. For the intermediate color, the AUC for a priori equal odds (equal probability) was 0.88, whereas with adjusted frequencies it was slightly lower at 0.86. A summary of these AUC values can be seen in Table 4.
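The influence of the two a priori probability sets can be illustrated with a plain application of Bayes' rule. This is only a sketch under stated assumptions: it is not the Hugin Bayesian network of Pośpiech et al. [18], the likelihood values P(genotype profile | eye color) are invented for illustration, and the priors are the values printed in the footnote of Table 3, renormalized inside the function because they do not sum exactly to 1.

```python
def posterior(prior, likelihood):
    """Posterior probability of each eye color via Bayes' rule.

    prior: a priori probability per category (renormalized implicitly).
    likelihood: P(observed genotype profile | category); hypothetical here.
    """
    unnormalized = {color: prior[color] * likelihood[color] for color in prior}
    total = sum(unnormalized.values())
    return {color: value / total for color, value in unnormalized.items()}


# Priors as printed in the Table 3 footnote.
equal_prior = {"brown": 0.33, "blue": 0.33, "intermediate": 0.33}
adjusted_prior = {"brown": 0.33, "blue": 0.44, "intermediate": 0.17}

# Hypothetical likelihoods for one genotype profile (illustrative only).
likelihood = {"brown": 0.10, "blue": 0.60, "intermediate": 0.30}

post_equal = posterior(equal_prior, likelihood)
post_adjusted = posterior(adjusted_prior, likelihood)
print(post_equal)
print(post_adjusted)
```

With these made-up numbers, the same genotype profile falls below a 0.7 threshold for blue under equal odds but above it under the adjusted odds, which shows how the choice of prior alone can move a sample between an inconclusive result and a call.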
The positive predictive value (PPV) for equal odds was 91% for blue, 94% for brown, and 65% for intermediate; for adjusted frequency odds, the PPV was 91% for blue, 80% for brown, and 54% for intermediate (Table 5).

4. Discussion

The 200 samples were shown to be representative of a larger United States population when compared to the collected Indiana driver statistics. It is important to note, however, that because eye color is self-reported for driving records, some subjective discrepancy might be present. Quantitative color classification led to an increase in accurate predictions in the tested models. A recent GWAS used hue and saturation values to quantify eye color as a more systematic, objective approach than categorical classification, and as a result additional candidate eye color SNPs were discovered [11]. Visual determinations cannot be disregarded, however, as they are the basis for eyewitness testimony and the practical manner of classification in forensic investigations; thus it is essential that objective eye color classification correlates with visual determinations.


Table 5
Prediction model performance test characteristics (%) of both regression and Bayesian parameter sets after analysis of the training set (N = 100).

Model                  Test characteristic   Blue   Intermediate   Brown
MLR: IrisPlex          Sensitivity           91     41             85
                       Specificity           95     93             84
                       PPV                   93     54             77
                       NPV                   93     89             90
MLR: adjusted          Sensitivity           86     65             85
                       Specificity           95     95             93
                       PPV                   93     73             89
                       NPV                   90     93             91
Bayesian: equal odds   Sensitivity           91     76             77
                       Specificity           93     92             97
                       PPV                   91     65             94
                       NPV                   93     95             87
Bayesian: adjusted     Sensitivity           93     41             82
                       Specificity           93     93             87
                       PPV                   91     54             80
                       NPV                   95     89             88

PPV = positive prediction value (correctly predicted positives); NPV = negative prediction value (correctly predicted negatives).
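The test characteristics in Table 5 follow from one-vs-rest confusion counts for each eye color. The sketch below shows that calculation on a small invented data set; the function name and sample labels are hypothetical, and treating any prediction other than the target color (including an inconclusive call) as a negative is an assumption of this sketch, not something stated by the authors.

```python
def one_vs_rest_metrics(actual, predicted, positive):
    """Sensitivity, specificity, PPV, and NPV for one color versus all others."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif a == positive:
            fn += 1
        else:
            tn += 1

    def ratio(numerator, denominator):
        # Undefined ratios (no cases in the denominator) are reported as None.
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Invented example labels (illustrative only).
actual = ["blue", "blue", "blue", "brown", "brown", "brown", "intermediate", "intermediate"]
predicted = ["blue", "blue", "brown", "brown", "brown", "brown", "blue", "intermediate"]

blue_metrics = one_vs_rest_metrics(actual, predicted, "blue")
brown_metrics = one_vs_rest_metrics(actual, predicted, "brown")
print(blue_metrics)
print(brown_metrics)
```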

In the original data, Walsh et al. had reasonable prediction accuracies of 91.6% and 56% for blue and brown eye colors, respectively, at the 0.7 threshold, and 91.6% and 87.5% for blue and brown eye colors, respectively, at the 0.5 threshold [1]. However, their data did not include any samples in the intermediate color category. When evaluating the models, all models showed higher negative predictive values than positive predictive values, except for the brown color in the equal odds Bayesian model. The AUC is relatively high for all models as well, because it considers both the true negative and true positive predictions, which also translates into the measures of sensitivity and specificity of the model (Table 5). With this model, the specificity is higher than the sensitivity, which can indicate higher false negative rates as compared to false positive rates. Negative rates are useful, as true negative predictions are important for exclusionary purposes in forensic investigations. But true positive predictions for inferring an unknown individual's phenotype are a major goal of this model, and with the model as it currently stands, positive predictions are not yet sufficient for acceptably accurate inferences. Our sample set had more inconclusive results than the IrisPlex samples. This is likely due to the greater number of intermediate samples and the hypothesized population admixture that is inherent in our samples. Most incorrect and inconclusive predictions were found in individuals heterozygous at the HERC2 SNP rs12913832 (30% of our individuals) rather than in homozygous individuals. This HERC2 SNP alone should explain most of the differences in phenotype expression between blue and brown eye color [16,20]. Also in our sample set, 3 individuals who were homozygous for the blue-associated CC genotype at rs12913832 had brown eyes, and 3 individuals who were homozygous for the brown-associated TT genotype had blue eyes.
Additional data support this hypothesis for the failure of prediction rates [21]. This additional study looked at 60 samples from individuals with European-Asian background and found that, with higher levels of admixture, the predictions were less accurate [21]. Also, in an evaluation of IrisPlex across Europe, which included 3840 individuals from seven other European countries and adjustments to the regression model parameters, some blue-associated alleles were seen at some of the SNPs in brown-eyed Europeans [22]. In the developmental validation study of IrisPlex, subsets from the Human Genome Diversity Cell Line Panel (HGDP-CEPH), a large DNA database comprising many populations, were used to show prediction accuracy applied to several populations [23]; however, the eye color phenotypes were not available for the typed samples. Therefore, even if a sample was predicted with a probability of 90% or greater for a certain eye color category, there is no way to know whether it is a true or false positive, as the actual phenotype is unknown. Our study looked at samples with known phenotypes, which offers strong empirical support that predictions are not always correct even given a high probability in any one color category.

Fig. 3. The frequency of overall correct, incorrect, and inconclusive eye color predictions using the Bayesian model. (a) Predictions under equal odds at the 0.5 threshold, (b) predictions under adjusted frequency odds at the 0.5 threshold, (c) predictions under equal odds at the 0.7 threshold, and (d) predictions under adjusted frequency odds at the 0.7 threshold.

Although overall AUC measurements do not indicate optimal prediction performance for any single model tested, the Bayesian model may be more useful because likelihood ratios can be calculated from the Bayesian probabilities using prior odds, as described by Pośpiech [18]. Likelihood ratios are useful in reporting forensic interpretations and may also prove to be appropriate for reporting phenotype inference if used in future prediction models. Recently, Ruiz et al. reported success in using a modified online Bayesian classifier application, Snipper, to give such likelihood-based eye color predictions based on SNP allelic frequencies [24]. Still, as with the regression model, positive prediction inferences were not shown to be at an acceptable level for forensic application in a North American population, especially for the intermediate color category.
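One way to obtain a likelihood ratio from a Bayesian posterior probability and its prior is the odds form of Bayes' theorem, in which the likelihood ratio equals the posterior odds divided by the prior odds. The sketch below illustrates that relationship only; it is an assumption that this matches the calculation intended in [18], and the function name and the example probabilities are hypothetical.

```python
def likelihood_ratio(prior_prob, posterior_prob):
    """Likelihood ratio for one category versus all others.

    Uses the odds form of Bayes' theorem:
    posterior odds = likelihood ratio * prior odds.
    Both probabilities must lie strictly between 0 and 1.
    """
    prior_odds = prior_prob / (1.0 - prior_prob)
    posterior_odds = posterior_prob / (1.0 - posterior_prob)
    return posterior_odds / prior_odds


# Hypothetical example: prior probability 0.5 for blue, posterior 0.8.
lr_blue = likelihood_ratio(0.5, 0.8)
print(lr_blue)
```

A ratio above 1 means the genotype data support the category more than its alternatives, and a ratio of exactly 1 means the data did not change the prior belief.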

5. Conclusions

Eye color variation is a highly polygenic trait, as confirmed by GWAS in individuals of European descent [11], and IrisPlex has been shown to be useful for eye color predictions in European populations [1,19,22,25]. However, the IrisPlex assay is shown here to be only moderately predictive of eye color in a representative US population. The United States is a highly admixed population compared with European populations. Objective color quantitation is comparable in accuracy to visual color determination and may be more effective for the classification of eye color. If a classification scheme such as the IMI scale is used, it may be possible to establish a reference database of group-specific SNP profiles associated with each eye color category. Though considered in some studies without much success [18], a further breakdown of the intermediate colors into separate categories should be explored (e.g. green and hazel instead of intermediate). Other studies suggest finding candidate genes for the determination of certain iris patterns [26]. Additional SNPs will need to be incorporated if this tool is to be used as a DNA-based eye color prediction assay; recently, three additional SNPs associated especially with the intermediate color were discovered and may be informative [11]. Combining SNPs into one assay for inferring hair, skin, and eye color simultaneously would be ideal for developing a phenotypic profile of multiple traits from DNA profiles. Recently, Walsh et al. published a combined hair and eye color assay called HIrisPlex [27]; however, its utility in a North American population remains to be determined. The Bayesian network model should be considered as an optimal prediction method (over a multinomial regression model) because of its better predictions and because likelihood ratios can be calculated from the results, which are more easily reported by scientists when statistically interpreting forensic phenotype profiles.

References

[1] S. Walsh, F. Liu, K.N. Ballantyne, M. van Oven, O. Lao, M. Kayser, IrisPlex: a sensitive DNA tool for accurate prediction of blue and brown eye colour in the absence of ancestry information, Forensic Sci. Int. Genet. 5 (2011) 170–180.
[2] M. Kayser, P. Schneider, DNA-based prediction of human externally visible characteristics in forensics: motivations, scientific challenges, and ethical considerations, Forensic Sci. Int. Genet. 3 (2009) 154–161.
[3] R.K. Valenzuela, M.S. Henderson, M.H. Walsh, N.A. Garrison, J.T. Kelch, O. Cohen-Barak, D.T. Erickson, F. John Meaney, J. Bruce Walsh, K.C. Cheng, S. Ito, K. Wakamatsu, T. Frudakis, M. Thomas, M.H. Brilliant, Predicting phenotype from genotype: normal pigmentation, J. Forensic Sci. 55 (2010) 315–322.
[4] G. Tully, Genotype versus phenotype: human pigmentation, Forensic Sci. Int. Genet. 1 (2007) 105–110.
[5] P. Sulem, D.F. Gudbjartsson, S.N. Stacey, A. Helgason, T. Rafnar, K.P. Magnusson, A. Manolescu, A. Karason, A. Palsson, G. Thorleifsson, M. Jakobsdottir, S. Steinberg, S. Pálsson, F. Jonasson, B. Sigurgeirsson, K. Thorisdottir, R. Ragnarsson, K.R. Benediktsdottir, K.K. Aben, L.A. Kiemeney, J.H. Olafsson, J. Gulcher, A. Kong, U. Thorsteinsdottir, K. Stefansson, Genetic determinants of hair, eye and skin pigmentation in Europeans, Nat. Genet. 39 (2007) 1443–1452.
[6] S. Aditya, A.K. Sharma, C.N. Bhattacharyya, K. Chaudhuri, Generating STR profile from touch DNA, J. Forensic Legal Med. 18 (2011) 295–298.
[7] T. Frudakis (Ed.), Molecular Photofitting: Predicting Ancestry and Phenotype from DNA, 1st ed., Academic Press, Burlington, MA, 2008.
[8] R.A. Sturm, Molecular genetics of human pigmentation diversity, Hum. Mol. Genet. 18 (2009) R9–R17.
[9] T. Frudakis, M. Thomas, Z. Gaskin, K. Venkateswarlu, K.S. Chandra, S. Ginjupalli, S. Gunturi, S. Natrajan, V.K. Ponnuswamy, K.N. Ponnuswamy, Sequences associated with human iris pigmentation, Genetics 165 (2003) 2071–2083.
[10] E. Pośpiech, J. Draus-Barini, T. Kupiec, A. Wojas-Pelc, W. Branicki, Gene–gene interactions contribute to eye colour variation in humans, J. Hum. Genet. 56 (2011) 447–455.
[11] F. Liu, A. Wollstein, P.G. Hysi, G.A. Ankra-Badu, T.D. Spector, D. Park, G. Zhu, M. Larsson, D.L. Duffy, G.W. Montgomery, D.A. Mackey, S. Walsh, O. Lao, A. Hofman, F. Rivadeneira, J.R. Vingerling, A.G. Uitterlinden, N.G. Martin, C.J. Hammond, M. Kayser, Digital quantification of human eye color highlights genetic association of three new loci, PLoS Genet. 6 (2010) e1000934.
[12] W. Branicki, U. Brudnik, A. Wojas-Pelc, Interactions between HERC2, OCA2 and MC1R may influence human pigmentation phenotype, Ann. Hum. Genet. 73 (2009) 160–170.
[13] T. Frudakis, T. Terravainen, M. Thomas, Multilocus OCA2 genotypes specify human iris colors, Hum. Genet. 122 (2007) 311–326.
[14] J.L. Han, P. Kraft, H. Nan, Q. Guo, C. Chen, A. Qureshi, S.E. Hankinson, F.B. Hu, D.L. Duffy, Z.Z. Zhao, N.G. Martin, G.W. Montgomery, N.K. Hayward, G. Thomas, R.N. Hoover, S. Chanock, D.J. Hunter, A genome-wide association study identifies novel alleles associated with hair color and skin pigmentation, PLoS Genet. 4 (2008) e1000074.
[15] J. Mengel-From, C. Børsting, J.J. Sanchez, H. Eiberg, N. Morling, Human eye colour and HERC2, OCA2, and MATP, Forensic Sci. Int. Genet. 4 (2010) 323–328.
[16] H. Eiberg, J. Troelsen, M. Nielsen, A. Mikkelsen, J. Mengel-From, K.W. Kjaer, L. Hansen, Blue eye color in humans may be caused by a perfectly associated founder mutation in a regulatory element located within the HERC2 gene inhibiting OCA2 expression, Hum. Genet. 123 (2008) 177–187.
[17] A. Pneuman, Z.M. Budimlija, T. Caragine, M. Prinz, E. Wurmbach, Verification of eye and skin color predictors in various populations, Legal Med. 14 (2012) 78–83.
[18] E. Pośpiech, J. Draus-Barini, T. Kupiec, A. Wojas-Pelc, W. Branicki, Prediction of eye color from genetic data using Bayesian approach, J. Forensic Sci. 57 (2012) 880–886.
[19] F. Liu, K. van Duijn, J. Vingerling, A. Hofman, A. Uitterlinden, A. Janssens, M. Kayser, Eye color and the prediction of complex phenotypes from genotypes, Curr. Biol. 19 (2009) R3–R192.
[20] R.A. Sturm, D.L. Duffy, Z.Z. Zhao, F.P.N. Leite, M.S. Stark, N.K. Hayward, N.G. Martin, G.W. Montgomery, A single SNP in an evolutionary conserved region within intron 86 of the HERC2 gene determines human blue-brown eye color, Am. J. Hum. Genet. 82 (2008) 424–431.
[21] P.R. Prestes, R.J. Mitchell, R. Daniel, K.N. Ballantyne, R.A.H. van Oorschot, Evaluation of the IrisPlex system in admixed individuals, Forensic Sci. Int. Genet. Suppl. Ser. 3 (2011) e283–e284.
[22] S. Walsh, A. Wollstein, F. Liu, U. Chakravarthy, M. Rahu, J.H. Seland, G. Soubrane, L. Tomazzoli, F. Topouzis, J.R. Vingerling, J. Vioque, A.E. Fletcher, K.N. Ballantyne, M. Kayser, DNA-based eye colour prediction across Europe with the IrisPlex system, Forensic Sci. Int. Genet. 6 (2012) 330–340.
[23] S. Walsh, A. Lindenbergh, S.S. Zuniga, T. Sijen, P. de Knijff, M. Kayser, K.N. Ballantyne, Developmental validation of the IrisPlex system: determination of blue and brown iris colour for forensic intelligence, Forensic Sci. Int. Genet. 5 (2011) 464–471.
[24] Y. Ruiz, C. Phillips, A. Gomez-Tato, J. Alvarez-Dios, M. Casares de Cal, R. Cruz, O. Maroñas, J. Söchtig, M. Fondevila, M.J. Rodriguez-Cid, Á. Carracedo, M.V. Lareu, Further development of forensic eye color predictive tests, Forensic Sci. Int. Genet. 7 (2013) 28–40.
[25] J. Purps, M. Geppert, M. Nagy, L. Roewer, Evaluation of the IrisPlex eye colour prediction tool in a German population sample, Forensic Sci. Int. Genet. Suppl. Ser. 3 (2011) e202–e203.
[26] R. Sturm, M. Larsson, Genetics of human iris colour and patterns, Pigment Cell Melanoma Res. 22 (2009) 544–562.
[27] S. Walsh, F. Liu, A. Wollstein, L. Kovatsi, A. Ralf, A. Kosiniak-Kamysz, W. Branicki, M. Kayser, The HIrisPlex system for simultaneous prediction of hair and eye colour from DNA, Forensic Sci. Int. Genet. 7 (2013) 98–115.
