Forensic Science International: Genetics 11 (2014) 52–55


Forensic Population Genetics – Original Research

Combining autosomal and Y chromosome match probabilities using coalescent theory

John Buckleton a,*, Steven Myers b

a ESR Ltd., Private Bag 92021, Auckland 1142, New Zealand
b California Department of Justice, Jan Bashinski DNA Laboratory, Richmond, CA 94804, United States

ARTICLE INFO

Article history: Received 11 November 2013; Received in revised form 10 February 2014; Accepted 14 February 2014

Keywords: DNA interpretation; Y chromosome; Match probability

ABSTRACT

Walsh et al. [1] outlined a method for adjusting autosomal coancestry values, $\theta_A$, to take account of the existence of a Y chromosome match, giving $\theta_{A|Y}$. The framework established by Walsh et al. is flexible and allows an investigation of some real world effects such as family structure. It also allows the effect of a Y chromosome match to be placed within the construct of existing casework practice. Most notable is the ability to deal with an assigned value for the autosomal coancestry coefficient and the fact that most casework statistics report a value for unrelated individuals unless case circumstances suggest differently. The values of $\theta_{A|Y}$ are not much larger than $\theta_A$ and a coherent argument could be made that any adjustment is unnecessary.

© 2014 Elsevier Ireland Ltd. All rights reserved.

* Corresponding author. Tel.: +64 98153 904; fax: +64 98496046. E-mail address: [email protected] (J. Buckleton).
http://dx.doi.org/10.1016/j.fsigen.2014.02.009
1872-4973/© 2014 Elsevier Ireland Ltd. All rights reserved.

1. Introduction

The Y chromosome is inherited as an intact copy of genetic material from the father [2] and is an identical copy of the father's Y chromosome except for mutation. In forensic Y chromosome analysis a sample is usually analysed at $l$ STR loci, each with a mutation rate $\mu_l$. The same sample may also have been typed at a number of autosomal loci. If the sample matches a person of interest at both the autosomal and the Y chromosomal loci, there is interest in assessing the combined weight of the autosomal and Y chromosome matches.

Match probabilities for autosomal loci are estimated using the allele probabilities and the coancestry coefficient $\theta_A$ [3]. Walsh et al. [1] outlined a method for adjusting autosomal coancestry values to take account of the existence of a Y chromosome match. In that paper Walsh et al. gave a method for taking account of the pre-existing autosomal substructure by adjusting the value of $N_M$, the number of males in the effective breeding population. This approach confronts the user with rather small values of $N_M$.

The autosomal coancestry coefficient $\theta_A$ is usually assigned in what is believed to be a conservative manner. This assignment is often informed by studies of the differences between populations thought to be diverging primarily by drift. It is almost never assigned by reference to theoretical equilibrium values that take account of the effective population size and mutation rates. Whilst informed by these studies, the assigned value is usually a round number believed to be near the top of the plausible range and not the actual estimate. Since this is an assigned value rather than an estimated value, we cannot expect it to have a direct relationship with values derived theoretically from the effective population size and mutation rates.

Many theoretical approaches start from the assumption that every individual has one male and one female parent randomly selected from the population. This seemingly innocuous assumption ignores the family structure present throughout much of human history. The presence of family structure increases the information that a match at one locus conveys about other loci.

In this work we extend the Walsh et al. model, which provides a very flexible and intuitive framework for investigating the effect of a Y chromosome match on the autosomal coancestry coefficient. We move the assigned autosomal coancestry coefficient to a more intuitive position in the equation. Also, following a suggestion from Bruce Weir (personal communication), we incorporate mutation at the autosomal loci.

2. Method

Following Walsh et al. [1] we model the time to the most recent common Y ancestor, TMRCA, as having a geometric distribution with parameter $\lambda = 1/N_M$, where $N_M$ is the effective breeding population of males. Also following Walsh et al., the posterior probability that the TMRCA is $t$ generations, given the matching Y profiles and the Y STR mutation rates $\mu_l$, is

$$\frac{(1-\lambda)^{t-1}\,\lambda\,\prod_l (1-\mu_l)^{2t}}{\sum_{t=1}^{\infty} (1-\lambda)^{t-1}\,\lambda\,\prod_l (1-\mu_l)^{2t}}.$$

Writing $\nu_Y = \prod_l (1-\mu_l)^2$ gives

$$\frac{(1-\lambda)^{t-1}\,\lambda\,\nu_Y^t}{\sum_{t=1}^{\infty} (1-\lambda)^{t-1}\,\lambda\,\nu_Y^t}.$$

This particular ancestor (MRCYA) will contribute two identical by descent (IBD) alleles to two individuals in generation $t = 0$ with probability

$$\frac{1+F_A}{2^{2t+1}},$$

where $F_A$ is Wright's F statistic [3] and is assumed to be equal to the assigned background coancestry coefficient $\theta_A$. To still be IBD $t$ generations later the alleles must not have mutated, which has probability $(1-\mu_A)^{2t} = \nu_A^t$, where $\mu_A$ is the mutation rate at the autosomal locus under consideration. The two alleles are not IBD copies from this particular ancestor with probability $1 - 2/2^{2t+1}$ and are then assumed to be IBD with probability $\theta_A$. No adjustment is made to this pair of alleles for mutation, nor for further increases in the IBD probability with $t$ (see Appendix A). This suggests

$$\theta_{A|Y} = \sum_{t=1}^{\infty} \frac{(1-\lambda)^{t-1}\,\nu_Y^t}{\sum_{t=1}^{\infty} (1-\lambda)^{t-1}\,\nu_Y^t} \left[ \nu_A^t\,\frac{1+F_A}{2^{2t+1}} + \left(1 - \frac{2}{2^{2t+1}}\right)\theta_A \right] \qquad (1)$$

where $\theta_{A|Y}$ is the adjusted value for the autosomal locus given the observation of a Y haplotype match. Eq. (1) represents very little change from the equations outlined by Walsh et al. It differs in that mutation is considered for the autosomal locus and that the background coancestry coefficients, $F_A$ and $\theta_A$, appear explicitly rather than via modification of the $\lambda$ value.

3. Potential adjustments to this approach

The discussion so far has focussed on the autosomal contributions of the Y chromosome common ancestor. Eq. (1) therefore assumes that the relationship identified by the TMRCYA is half-sibling ($t = 1$), half cousin ($t = 2$), half second cousin ($t = 3$) and so on. In most cases, two people share a common ancestor $t$ generations ago because they are full rather than half siblings, full rather than half cousins, and so on. This suggests

$$\theta_{A|Y} = \sum_{t=1}^{\infty} \frac{(1-\lambda)^{t-1}\,\nu_Y^t}{\sum_{t=1}^{\infty} (1-\lambda)^{t-1}\,\nu_Y^t} \left[ (1+x)\,\nu_A^t\,\frac{1+F_A}{2^{2t+1}} + \left(1 - \frac{2(1+x)}{2^{2t+1}}\right)\theta_A \right] \qquad (2)$$

where $x$ represents the fraction of relatives that are full as opposed to half relatives. Full and half relatives are often referred to as bilineal (related on both the mother's and the father's sides) and unilineal (related on one side only), respectively. See Appendix B for a derivation of this adjustment.

The modified $\theta_A$ value given in Eq. (1) includes a contribution from first and second order relatives, half siblings and half cousins in this case. In routine forensic work these relationships are usually assigned a separate match probability. To implement this approach the sum in Eq. (2) is taken from $t = 3$ to infinity.

As observed in Table 1, the increase in $\theta_{A|Y}$ is modest and fairly insensitive to $N_M$ when the contribution of close relatives is excluded. Example multi-locus autosomal match probabilities are provided in Table 2. The maximum cumulative difference across all loci was less than a factor of 2.2, a change of limited practical import given the overall low probability estimates. This factor was for the most common alleles. The factor would be larger for rare alleles but the overall match probability would be smaller. Y haplotype matches based upon less discriminating multiplexes would reduce the $\theta_{A|Y}$ effect on autosomal match probabilities.

The match probabilities for siblings and first cousins also contain a $\theta_{A|Y}$. There is no effect on $\theta_{A|Y}$ for a Y chromosome match for two brothers, nor for two cousins whose parents included brothers. For cousins whose parents included sisters or a brother-sister pair there is an effect. Formulae appear in Appendix C.

Table 1
Modified $\theta_{A|Y}$ values applying Eq. (2) summed from $t = 3$ to infinity to various commercial multiplexes, $N_M$, and $\theta_A$. The proportion of full siblings ($x$), 0.88, assumed an average of two children per family [4] and 12% of children having half-siblings [5]. The $\mu_A$ was 0.0025. $F_A$ was assumed to equal $\theta_A$. The duplicated locus DYS385 was counted as one locus for $l$. Each multiplex column gives the number of loci $l$ and the average Y STR mutation rate $\mu_{ave}$ [6–8].

| $\theta_A$ | $N_M$ | PPY ($l$ = 11, $\mu_{ave}$ = 0.00210) | Yfiler ($l$ = 16, $\mu_{ave}$ = 0.00256) | PowerPlex Y23 ($l$ = 22, $\mu_{ave}$ = 0.00354) | Yfiler Plus ($l$ = 25, $\mu_{ave}$ = 0.00566) |
|---|---|---|---|---|---|
| 0.001 | 100 | 0.00204 | 0.00264 | 0.00380 | 0.00551 |
| 0.001 | 1000 | 0.00188 | 0.00249 | 0.00367 | 0.00540 |
| 0.001 | 10,000 | 0.00186 | 0.00248 | 0.00366 | 0.00539 |
| 0.001 | 100,000 | 0.00186 | 0.00247 | 0.00365 | 0.00539 |
| 0.01 | 100 | 0.01103 | 0.01163 | 0.01278 | 0.01447 |
| 0.01 | 1000 | 0.01087 | 0.01148 | 0.01264 | 0.01436 |
| 0.01 | 10,000 | 0.01085 | 0.01146 | 0.01263 | 0.01435 |
| 0.01 | 100,000 | 0.01085 | 0.01146 | 0.01263 | 0.01435 |
| 0.03 | 100 | 0.03100 | 0.03159 | 0.03272 | 0.03438 |
| 0.03 | 1000 | 0.03085 | 0.03145 | 0.03259 | 0.03427 |
| 0.03 | 10,000 | 0.03083 | 0.03143 | 0.03258 | 0.03426 |
| 0.03 | 100,000 | 0.03083 | 0.03143 | 0.03257 | 0.03426 |

Table 2
Autosomal match probabilities calculated applying $\theta_A$ and $\theta_{A|Y}$. $\theta_{A|Y}$ assumes matching Yfiler Plus haplotypes and $N_M$ = 10,000. Match probabilities were calculated for profiles that were either homozygous or heterozygous at all fifteen Identifiler loci. These profiles were composed of the most common set of alleles in the New Zealand Caucasian database (personal communication Jo-Anne Bright).

| Autosomal match probability | No Y information, $\theta_A$ = 0.01 | Yfiler Plus match, $\theta_{A \mid Y}$ = 0.01435 | No Y information, $\theta_A$ = 0.03 | Yfiler Plus match, $\theta_{A \mid Y}$ = 0.03426 |
|---|---|---|---|---|
| $P(AA \mid AA)$ | 1 in $3.7 \times 10^{15}$ | 1 in $1.7 \times 10^{15}$ | 1 in $1.4 \times 10^{14}$ | 1 in $7.5 \times 10^{13}$ |
| $P(AB \mid AB)$ | 1 in $1.5 \times 10^{13}$ | 1 in $1.2 \times 10^{13}$ | 1 in $6.1 \times 10^{12}$ | 1 in $5.1 \times 10^{12}$ |
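The calculation behind Table 1 can be sketched numerically. The following Python is a minimal illustration rather than the authors' implementation, and it rests on several assumptions not stated in the text: every Y locus is given the average mutation rate listed for the multiplex, the infinite sum is truncated at a large `t_max`, and the posterior TMRCA weights are renormalised over the retained range $t \ge 3$. The function and parameter names are hypothetical.

```python
"""Sketch of Eq. (2): autosomal coancestry adjusted for a Y haplotype match."""


def posterior_tmrca_weights(n_males, n_loci, mu_y, t_min=1, t_max=2000):
    """Posterior weights for TMRCA = t_min..t_max given a Y haplotype match.

    Each weight is proportional to (1 - lambda)^(t-1) * lambda * nu_Y^t and is
    renormalised over the truncated range (an assumption of this sketch).
    """
    lam = 1.0 / n_males
    # Assumption: every Y locus shares the average rate mu_y, so
    # nu_Y = prod_l (1 - mu_l)^2 collapses to (1 - mu_y)^(2 * n_loci).
    nu_y = (1.0 - mu_y) ** (2 * n_loci)
    ratio = (1.0 - lam) * nu_y  # successive weights differ by this factor
    raw = [ratio ** (t - t_min) for t in range(t_min, t_max + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def theta_a_given_y(theta_a, n_males, n_loci, mu_y,
                    mu_a=0.0025, x=0.88, t_min=3, t_max=2000):
    """Evaluate Eq. (2) with F_A set equal to theta_A, summing from t_min."""
    f_a = theta_a
    nu_a = (1.0 - mu_a) ** 2
    weights = posterior_tmrca_weights(n_males, n_loci, mu_y, t_min, t_max)
    result = 0.0
    for offset, weight in enumerate(weights):
        t = t_min + offset
        # (1 + x) / 2^(2t+1), written with 0.25**t to avoid float overflow.
        share = (1.0 + x) * 0.5 * 0.25 ** t
        ibd = share * (1.0 + f_a) * nu_a ** t + (1.0 - 2.0 * share) * theta_a
        result += weight * ibd
    return result


# Multiplex name -> (number of loci l, average Y STR mutation rate), from Table 1.
MULTIPLEXES = {
    "PPY": (11, 0.00210),
    "Yfiler": (16, 0.00256),
    "PowerPlex Y23": (22, 0.00354),
    "Yfiler Plus": (25, 0.00566),
}

for name, (n_loci, mu_y) in MULTIPLEXES.items():
    value = theta_a_given_y(theta_a=0.01, n_males=10_000,
                            n_loci=n_loci, mu_y=mu_y)
    print(f"{name:14s} theta_A|Y = {value:.5f}")
```

Under these assumptions the sketch gives values close to the corresponding Table 1 cells (for example, Table 1 lists 0.01435 for Yfiler Plus with $\theta_A$ = 0.01 and $N_M$ = 10,000). Small differences in the last digit are possible because a single average rate stands in for the per-locus mutation rates.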

4. Discussion

The framework established by Walsh et al. is flexible and allows an investigation of some real world effects such as family structure. It also allows the effect of a Y chromosome match to be placed within the construct of existing casework practice. Most notable is the ability to deal with an assigned value for the autosomal coancestry coefficient and the fact that most casework statistics report a value for unrelated individuals unless case circumstances suggest differently. The values of $\theta_{A|Y}$ are not much larger than $\theta_A$ and a coherent argument could be made that any adjustment is unnecessary.

Acknowledgements

This work was supported in part by grant 2011-DN-BX-K541 from the US National Institute of Justice. Points of view in this document are those of the authors and do not necessarily represent the official position or policies of the U.S. Department of Justice.


Appendix A

The alleles drawn $t$ generations ago that are not from the MRCYA may be copies of the same allele with probability $1/(2(N-1))$, where $N$ is the effective population size (male and female), or may be IBD from the background with probability

$$\left(1 - \frac{1}{2(N-1)}\right)\theta_t.$$

This suggests

$$\theta_{A|Y} = \sum_{t=1}^{\infty} \frac{(1-\lambda)^{t-1}\,\nu_Y^t}{\sum_{t=1}^{\infty} (1-\lambda)^{t-1}\,\nu_Y^t} \left[ \nu_A^t\,\frac{1+F_A}{2^{2t+1}} + \nu_A^t \left(1 - \frac{2}{2^{2t+1}}\right) \left( \frac{1}{2(N-1)} + \left(1 - \frac{1}{2(N-1)}\right)\theta_t \right) \right],$$

which we have approximated as Eq. (1) by assuming

$$\nu_A^t \left[ \frac{1}{2(N-1)} + \left(1 - \frac{1}{2(N-1)}\right)\theta_t \right] = \theta_A,$$

and which is the same form as the assumption behind the classical drift mutation equilibrium equation

$$\nu_A^t \left[ \frac{1}{2N} + \left(1 - \frac{1}{2N}\right)\theta_t \right] = \theta_A.$$

Appendix B

Consider a simple pedigree such as

[Figure: pedigree diagram, not reproduced]

The same allele will be passed to the individuals in generation 0 ½ of the time and different alleles ½ of the time. Two alleles, one selected at random from each of these two individuals, will both be copies from the common ancestor ¼ of the time. The alleles in the common ancestor may also be IBD with probability $F_A$. This suggests that the contribution to the coancestry coefficient of a common ancestor one generation ago is $\frac{1+F_A}{8}\,\nu_A$, with the term $\nu_A$ for the probability that neither allele has mutated. This will be reduced by a factor of $\nu_A/2^2$ for each subsequent generation. The alleles may also be IBD even if not traced back to this common ancestor, with probability $\theta_A$. The coancestry coefficient for two individuals who share a common ancestor $t$ generations ago is therefore

$$\frac{1+F_A}{8}\,\frac{1}{2^{2(t-1)}}\,\nu_A^t + \left(1 - \frac{2}{8}\,\frac{1}{2^{2(t-1)}}\right)\theta_A = \frac{1+F_A}{2^{2t+1}}\,\nu_A^t + \left(1 - \frac{2}{2^{2t+1}}\right)\theta_A.$$

The equivalent formula for individuals sharing two common ancestors is

$$2\,\frac{1+F_A}{2^{2t+1}}\,\nu_A^t + \left(1 - \frac{4}{2^{2t+1}}\right)\theta_A.$$

For two individuals who are both separated from their common ancestor(s) by $t$ generations, one proportion ($x$) will share two common ancestors while the remaining ($1 - x$) will share only one common ancestor. The overall probability that two alleles would be IBD at $t$ generations can then be derived as

$$(1+x)\,\frac{1+F_A}{2^{2t+1}}\,\nu_A^t + \left(1 - \frac{2(1+x)}{2^{2t+1}}\right)\theta_A.$$

Appendix C

Taking one allele from each of two cousins, ¼ of the time both alleles are IBD from the pedigree, and ¾ of the time one is from within the pedigree and one from without, or both are from without. For matching homozygote $aa$ genotypes, both alleles IBD from within the pedigree means that if one cousin is $aa$, one allele of the other cousin is specified as $a$. We require the allele from outside the pedigree to also be $a$. Similarly, for matching heterozygote $ab$ genotypes, both alleles IBD from within the pedigree means that if one cousin is $ab$, one allele of the other cousin is specified as $a$ or $b$, with the remaining allele from outside the pedigree being $b$ or $a$, respectively. In both cases, the Y chromosome match suggests a RCA so we should use $\theta_{A|Y}$. When both alleles are from outside the pedigree, the Y chromosome match again suggests a RCA so we should use $\theta_{A|Y}$. The match probabilities are

$$\frac{1}{4}\,\nu_A^2\,\frac{2\theta_{A|Y} + (1-\theta_{A|Y})\,p_a}{1+\theta_{A|Y}} + \frac{3}{4}\,\frac{\left(2\theta_{A|Y} + (1-\theta_{A|Y})\,p_a\right)\left(3\theta_{A|Y} + (1-\theta_{A|Y})\,p_a\right)}{(1+\theta_{A|Y})(1+2\theta_{A|Y})}$$

for matching homozygous genotypes, and

$$\frac{1}{8}\,\nu_A^2\,\frac{2\theta_{A|Y} + (1-\theta_{A|Y})(p_a + p_b)}{1+\theta_{A|Y}} + \frac{3}{4}\,\frac{2\left(\theta_{A|Y} + (1-\theta_{A|Y})\,p_a\right)\left(\theta_{A|Y} + (1-\theta_{A|Y})\,p_b\right)}{(1+\theta_{A|Y})(1+2\theta_{A|Y})}$$

for matching heterozygous genotypes.

References

[1] B. Walsh, A.J. Redd, M.F. Hammer, Joint match probabilities for Y chromosomal and autosomal markers, Forensic Sci. Int. 174 (2008) 234–238.
[2] L. Roewer, Y Chromosome STR typing in crime casework, Forensic Sci. Med. Pathol. 5 (2009) 77–84.
[3] S. Wright, The genetical structure of populations, Ann. Eugenics 15 (1951) 323–354.
[4] The World Factbook, New Zealand and United States estimates: 2.06 children born/woman, 2014, https://www.cia.gov/library/publications/the-world-factbook/fields/2127.html.
[5] R.M. Kreider, J.M. Fields, Children's coresidence with half siblings, U.S. Census Bureau, in: Annual Meeting of the Population Association of America, Dallas, TX, April 15–17, 2010, http://www.census.gov/hhes/socdemo/children/data/sipp/children_coresidenceHalfsibsposter.pdf.

[6] K.N. Ballantyne, M. Goedbloed, R. Fang, O. Schaap, O. Lao, A. Wollstein, et al., Mutability of Y-chromosomal microsatellites: rates, characteristics, molecular bases, and forensic implications, Am. J. Hum. Genet. 87 (2010) 341–353.
[7] M. Goedbloed, M. Vermeulen, R.N. Fang, M. Lembring, A. Wollstein, K. Ballantyne, et al., Comprehensive mutation analysis of 17 Y-chromosomal short tandem repeat polymorphisms included in the AmpFlSTR® Yfiler® PCR amplification kit, Int. J. Legal Med. 123 (2009) 471–482.
[8] M. Vermeulen, A. Wollstein, K. van der Gaag, O. Lao, Y. Xue, Q. Wang, et al., Improving global and regional resolution of male lineage differentiation by simple single-copy Y-chromosomal short tandem repeat polymorphisms, Forensic Sci. Int. Genet. 3 (2009) 205–213.
