J. Anim. Breed. Genet. ISSN 0931-2668

ORIGINAL ARTICLE

Expected influence of linkage disequilibrium on genetic variance caused by dominance and epistasis on quantitative traits

W.G. Hill1 & A. Mäki-Tanila2

1 Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
2 Department of Agricultural Sciences, University of Helsinki, Helsinki, Finland

Keywords
Covariance of relatives; genotypic variance; genetic interaction; linkage.

Correspondence
W.G. Hill, Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, EH9 3JT, UK. Tel: +44 131 650 5705; Fax: +44 131 650 6564; E-mail: [email protected]

Received: 22 November 2014; accepted: 22 January 2015
Edited by Frank W. Nicholas

Summary

Linkage disequilibrium (LD) influences the genetic variation in a quantitative trait contributed by two or more loci, with positive LD increasing the variance. The magnitude of LD also affects the relative magnitude of dominance and epistatic variation. We quantify the extent of the non-additive variance expected within populations, deriving analytical expressions for simple models and using numerical simulation in finite populations more generally. As LD generates non-independence among loci, a simple partition into additive, dominance and epistatic components is not possible, so we merely distinguish between additive and non-additive components based on comparing covariances among close relatives, such as full sibs, half sibs and offspring–parent. As tight linkage is needed to yield substantial LD in outbred populations, we ignore recombination in the generation used to estimate components, so the analysis is analogous to that of a multi-allelic model. The expected magnitude of the non-additive variance is generally increased, but not greatly so, by the LD in outbred populations. Thus, as found in previous studies for unlinked loci, independent of the type and strength of gene interaction, the epistatic variance contributes little to the total.

doi:10.1111/jbg.12140
© 2015 Blackwell Verlag GmbH • J. Anim. Breed. Genet. 132 (2015) 176–186

Introduction

In outbred populations, the non-additive genetic variance due to dominance and epistasis is difficult to estimate for quantitative traits: extensive data sets are needed which include large families of closely related individuals. Even so, the estimates of epistatic and dominance variance are typically confounded with each other, and with common environmental effects of family members. Evidence for dominance through its effect on the mean is widespread, but estimates of dominance variance are less common, particularly in animals. They have now become available in livestock through analyses of large data sets fitting dominance relationship matrices (e.g. Misztal 1997) using simple algorithms ignoring inbreeding (cf. Smith & Mäki-Tanila 1990), while a genomic approach for estimating additive and dominance effects would not require consideration of inbreeding (e.g. Toro & Varona 2010). Estimation of dominance and epistatic effects using genomewide association studies (GWAS) has low power because multiple markers and/or nonlinear models have to be fitted. GWAS have, however, been undertaken to estimate dominance effects in human populations (Zhu et al. 2015) and epistatic effects in human (Hemani et al. 2014), laboratory (Huang et al. 2012) and livestock populations (Ali et al. 2015). There is evidence for both, but there are risks of artefacts in testing for epistasis through partial linkage disequilibrium between pairs of markers (Wood et al. 2011; 2014). Some direct evidence of epistatic interactions for pairs of loci comes from laboratory studies, for example analyses of F2 cross-derived populations (Mackay 2014), but these do not necessarily imply much epistatic variance in outbred populations. Further, those estimates that have been obtained free of confounding effects such as common environmental variance indicate that the magnitude of both the dominance and the epistatic variance in outbred populations is generally low.

Indeed, theory indicates that the proportion of the genetic variation contributed by dominance and epistatic variance is likely to be small in outbred populations (Hill et al. 2008; Mäki-Tanila & Hill 2014a) even if most genes do not act additively. This is basically because the epistatic variance depends on products of heterozygosity at multiple loci and the dominance variance on squared heterozygosity at individual loci. Therefore, unless heterozygosity is high, such loci would contribute relatively less to non-additive variance than to additive variance. There are some philosophical differences in views on biological complexity and how to handle it, or in matching the perceptions on functional versus statistical epistasis. To some extent, the widespread perception that dominance and epistatic variances are large arises because a distinction is not always made between, on the one hand, the existence of within-locus interaction (heterozygote differing from homozygote mean) and interaction among pairs or more of loci and, on the other, dominance and epistatic variance. We discussed some of these points recently (Mäki-Tanila & Hill 2014a) and will not elaborate further here, save to note that there are differences in view.

During his long and productive research career, John James generally utilized additive models, but on at least one occasion made an excursion to the worlds of epistasis and dominance. Mueller & James (1983) investigated the ‘Effect on linkage disequilibrium of selection for a quantitative character with epistasis’. Their analysis followed up the work of Griffing (1960), who had concluded that additive × additive (AA) variance would contribute to directional selection, but its effects on response would asymptote and then be lost if selection is subsequently relaxed.
Griffing assumed unlinked loci and did not explicitly use the concept of linkage (gametic) disequilibrium (LD) in explaining the erosion of gains due to interaction effects. Bulmer (1971) showed that, in an additive model, directional selection produces negative LD through reduction in phenotypic variance of selected individuals and hence reduces genetic variance among the parents and among families in the next generation; in the epistatic model, he further showed by comparing parent–offspring and grandparent–grandoffspring regression how the gains under epistasis are quickly lost as LD breaks down, unless there is tight linkage (Bulmer 1980). Mueller & James (1983) found that synergistic AA interaction (their A + AA model) would lead to positive LD among pairs of loci. Taking together these calculations and simulations over multiple generations and including dominance interactions, they conclude, however, ‘The results suggest that so long as epistatic effects are not large relative to additive effects, and the proportion of pairs of loci which show epistasis is not very high, the predominant effect of linkage disequilibrium will be to reduce the rate of selection response.’

Mueller and James’ study is nicely argued but, judging by the few citations to it, has been mostly ignored. It was, however, picked up recently by Hansen (2013), who argues strongly that epistasis is important in the evolutionary context, albeit we have criticized his conclusions (Mäki-Tanila & Hill 2014a). In finite populations, even if there is no epistasis or epistatic effects on fitness, negative LD can be generated by selection, most simply exemplified by the case where, in the initial population, the haplotype with potentially highest fitness is absent (Hill & Robertson 1966).

The existing theory on magnitudes of dominance and epistatic variance expected in natural populations or under weak selection applies primarily to loci in linkage equilibrium (LE), but we cannot rule out the potential LD among linked loci. Evidence is not yet clear on the distribution of epistatic interactions across the genome (Wei et al. 2014), but if there is a preponderance of cis-acting effects, the epistatic loci may be closely linked, acting like alleles or haplotypes within a complex gene and therefore potentially in high LD. Further, it is likely that genes that have large effect, notably through influence on flux in biochemical pathways, have both dominant and epistatic effects, based on the arguments of Kacser & Burns (1973), although they may not generate much non-additive variance (Keightley 1989; Hill et al. 2008).
Both selection (on additive or epistatic effects) and finite population size can generate LD. If the LD among individual pairs of loci is to be high, however, the loci must be very tightly linked if LD is not to be lost by recurrent recombination. If generated by drift sampling, the recombination fraction must be less than approximately the inverse of effective population size; or if generated by selection, less than the product of heterozygosities at the loci and the product of single locus effects and/or the interaction term as fractions of the phenotypic standard deviation (Hill & Robertson 1966; Bulmer 1971; Mueller & James 1983). Therefore, if we wish to discuss quantities such as covariances among relatives and consider closely related individuals (i.e. one/two generations apart), it seems sufficient to assume there is no recombination over these transmissions, that is haplotypes are transmitted intact.

The frequencies of alleles at loci in LD are correlated. Hence, the variance contributed by a pair of loci A1 and A2 together may be unequivocal, but in an analysis, the variance contributed by A2 depends on whether A1 is fitted first or not at all. In the classical partition of variance into additive and non-additive components of Fisher (1918), Cockerham (1954), Kempthorne (1954) and others, linkage equilibrium (LE) is assumed. Account can be taken of linkage among loci (Cockerham 1956; Schnell 1963; Weir & Cockerham 1977; Lynch & Walsh 1998), such that the statistical model in an analysis of variance has proportional frequencies in subclasses (gene and genotype frequencies) and therefore the main effects contributed by individual loci are uncorrelated, but this requires LE. In the presence of LD, however, such an orthogonal partition is not possible. Therefore, we consider it unfruitful to attribute variance to individual loci if they are in LD. The totals (or the variance utilizable by selection) contributed by the set of linked loci are more tangible, however. Hence, we take a pragmatic view (following e.g. Avery & Hill 1979) and analyse just potentially estimable quantities, the genotypic variance and covariances among relatives such as offspring and parent, full sibs and half sibs. We investigate here the extent to which LD influences the magnitude of non-additive variance and its impact on our previous conclusions in which LD was ignored.
Although analysis of multilocus models with dominance is straightforward (Comstock & Robinson 1952; Avery & Hill 1979), we restrict the derivation of explicit formulae for epistatic variance to only AA effects at pairs of loci, in part because the AA component alone comprises much of the epistatic variance for loci in LE (Mäki-Tanila & Hill 2014b), but mainly because the algebra quickly becomes unmanageable with more loci or dominance; for those cases we compute the variances numerically. To obtain information on the impact of drift sampling in finite populations on the expected magnitude of epistatic variance from linked loci, we use Monte Carlo simulation. We consider both epistatic and dominance variance, in part because these usually cannot be unequivocally distinguished in the estimates of covariance of relatives in random-mating populations.

We develop results given previously for genes in LE, but make a number of assumptions. We concentrate essentially on variance components rather than selection responses because the latter depend on the former. More importantly, we mainly assume LD has been generated by drift rather than selection, not least because defining multilocus models, selection criteria and the snapshot when the population is analysed produces more alternative scenarios than we have space to cover.

Analysis

Genotypic model

The model and notation are basically the same as we used previously (Mäki-Tanila & Hill 2014a), but with the addition of LD, and we give these in full here for just a pair of biallelic loci (Table 1). At loci A1 and A2, the increasing alleles A1 and A2 have frequencies p1 and p2, and the alternative alleles are a1 and a2, respectively. The linkage disequilibrium coefficient is D, and the consequent haplotype frequencies are $f_i$, i = 1, ..., 4; for example, the frequency of A1A2 is $f_1 = p_1p_2 + D$. We assume there is random mating, such that genotype frequencies at individual loci and among haplotypes are in Hardy–Weinberg proportions; for example, the frequency of genotype A1A2/A1A2 is $f_1^2$. At locus i, the increasing allele raises the trait over the alternative allele by an effect $a_i$. The additive × additive interaction effect at the haploid level for these loci is [aa]12. Dominance and other interaction effects together with genotypic values, $G_{ij}$, are shown in Table 1. We make the customary assumption that coupling and repulsion heterozygotes have the same functional value (no cis effects at the haploid genome level). The recombination fraction between the loci is c.

Covariance of first-degree relatives ignoring recombination

There are simple general formulae under this assumption (Mäki-Tanila & Hill 2014b), which basically are extensions of those for individual loci that apply with random mating and any number of loci; the multilocus haplotypes are now equivalent to multiple alleles at a single locus with additive, dominance and epistatic effects embedded in them. Let $f_i$ now refer to the frequency of a multilocus haplotype, say A1a2a3A4..., and let $f_i^2$ denote the frequency of the homozygous genotype and $2f_if_j$ that of a specific heterozygote, with genotypic values $G_{ii}$ and $G_{ij}$, respectively. The population mean is $\mu = \sum_i\sum_j f_if_jG_{ij}$, and the genotypic variance is $V_G = \sum_i\sum_j f_if_jG_{ij}^2 - \mu^2$.


Table 1 Haplotype frequencies (fi) and genotypic values (Gij) for haplotype combinations with a two-locus model

Haplotypes and their frequencies

Haplotype        A1A2        A1a2              a1A2              a1a2
Frequency (a)    f1          f2                f3                f4
                 p1p2 + D    p1(1 – p2) – D    (1 – p1)p2 – D    (1 – p1)(1 – p2) + D

Components of genotypic value without dominance

         A1A2                      A1a2                 a1A2                 a1a2
A1A2     2a1 + 2a2 + 4[aa]12 (b)   2a1 + a2 + 2[aa]12   a1 + 2a2 + 2[aa]12   a1 + a2 + [aa]12
A1a2     2a1 + a2 + 2[aa]12        2a1                  a1 + a2 + [aa]12     a1
a1A2     a1 + 2a2 + 2[aa]12        a1 + a2 + [aa]12     2a2                  a2
a1a2     a1 + a2 + [aa]12          a1                   a2                   0

Additional components of genotypic value with dominance

         A1A2              A1a2              a1A2              a1a2
A1A2     0                 d2 + 2[ad]12      d1 + 2[da]12      Δ (c)
A1a2     d2 + 2[ad]12      0                 Δ                 d1
a1A2     d1 + 2[da]12      Δ                 0                 d2
a1a2     Δ                 d1                d2                0

(a) With haplotype frequencies also expressed in terms of gene frequencies and LD term D.
(b) Factors of 2 and 4 in [aa]12, etc. arise because they are counted for each allelic pair.
(c) Δ = d1 + d2 + [ad]12 + [da]12 + [dd]12, the dominance component for double heterozygotes (written Δ here to keep it distinct from the LD coefficient D).
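As an illustration of Table 1, the following minimal Python sketch builds the four haplotype frequencies and the 4 × 4 matrix of genotypic values. The function names, the argument names and the compact encoding of the table through allele counts and heterozygosity indicators are assumptions made for this example; they are not taken from the paper.

```python
import numpy as np

# Haplotype order throughout: A1A2, A1a2, a1A2, a1a2.
def haplotype_frequencies(p1, p2, D):
    """Haplotype frequencies f1..f4 from allele frequencies and LD coefficient D."""
    f = np.array([p1 * p2 + D,
                  p1 * (1 - p2) - D,
                  (1 - p1) * p2 - D,
                  (1 - p1) * (1 - p2) + D])
    if np.any(f < -1e-12):
        raise ValueError("D lies outside the range allowed by p1 and p2")
    return f

def genotypic_values(a1, a2, aa=0.0, d1=0.0, d2=0.0, ad=0.0, da=0.0, dd=0.0):
    """Matrix G[i, j] of genotypic values for haplotype pair (i, j).

    Hypothetical encoding of Table 1: X1, X2 count increasing alleles (0, 1, 2)
    and H1, H2 flag heterozygosity at each locus.
    """
    x1 = np.array([1, 1, 0, 0])          # A1 present on each haplotype
    x2 = np.array([1, 0, 1, 0])          # A2 present on each haplotype
    X1 = x1[:, None] + x1[None, :]
    X2 = x2[:, None] + x2[None, :]
    H1 = (X1 == 1).astype(float)
    H2 = (X2 == 1).astype(float)
    return (a1 * X1 + a2 * X2 + aa * X1 * X2      # part without dominance
            + d1 * H1 + d2 * H2                   # single-locus dominance
            + ad * X1 * H2 + da * H1 * X2 + dd * H1 * H2)

f = haplotype_frequencies(0.3, 0.6, 0.05)
G = genotypic_values(1.0, 0.5, aa=0.2, d1=0.3, d2=0.4, ad=0.1, da=0.05, dd=0.02)
print(f)
print(G)
```

Coupling and repulsion double heterozygotes receive the same value in this encoding, in line with the assumption of no cis effects stated in the text.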

Let $m_i$ be the mean genotypic value of individuals comprising one specified haplotype i and the other haplotype sampled at random, so assuming random mating, $m_i = \sum_j f_jG_{ij}$. This is therefore also the expected genotypic value of half-sib offspring of a homozygous individual with genotypic value $G_{ii}$. Hence, in the absence of recombination, the offspring mean of a heterozygote individual with genotypic value $G_{ij}$ is $\tfrac{1}{2}(m_i + m_j)$, regardless of the presence of dominance and/or epistasis. The parent–offspring covariance is therefore

$$
\begin{aligned}
\mathrm{cov}_{OP} &= \tfrac{1}{2}\sum_i\sum_j f_if_j(m_i + m_j)G_{ij} - \mu^2 \\
&= \tfrac{1}{2}\sum_i f_im_i\sum_j f_jG_{ij} + \tfrac{1}{2}\sum_j f_jm_j\sum_i f_iG_{ij} - \mu^2 \\
&= \tfrac{1}{2}\Big(\sum_i f_im_i^2 + \sum_j f_jm_j^2\Big) - \mu^2 = \sum_i f_im_i^2 - \mu^2
\end{aligned}
\tag{1a}
$$

and the covariance of half sibs is

$$
\begin{aligned}
\mathrm{cov}_{HS} &= \sum_i\sum_j f_if_j\big[\tfrac{1}{2}(m_i + m_j)\big]^2 - \mu^2 \\
&= \tfrac{1}{4}\sum_i f_im_i^2 + \tfrac{1}{4}\sum_j f_jm_j^2 + \tfrac{1}{2}\sum_i f_im_i\sum_j f_jm_j - \mu^2 \\
&= \tfrac{1}{2}\Big(\sum_i f_im_i^2 - \mu^2\Big) = \tfrac{1}{2}\mathrm{cov}_{OP}.
\end{aligned}
\tag{1b}
$$

In the absence of recombination, the mean of full sib progeny of a pair of individuals with genotypic values Gij and Gkl is ¼(Gik + Gil+ Gjk + Gjl). As for a single locus, a pair of full sibs shares the same genotype with probability ¼, shares one haplotype (like offspring and parent) with probability ½ and shares no haplotype with probability ¼. Hence, the covariance of full sibs is given by

$$\mathrm{cov}_{FS} = \tfrac{1}{4}V_G + \tfrac{1}{2}\mathrm{cov}_{OP}. \tag{1c}$$

Also, for example,

$$\mathrm{cov}_{HS} = \mathrm{cov}_{FS} - \tfrac{1}{4}V_G \tag{1d}$$

$$V_G - 2\,\mathrm{cov}_{OP} = 2(V_G - 2\,\mathrm{cov}_{FS}) \tag{1e}$$

and the variance among full sibs within half-sib families, which is also the variance associated with the sire × dam interaction $V_{s\times d}$, is

$$V_{s\times d} = \mathrm{cov}_{FS} - 2\,\mathrm{cov}_{HS} = \mathrm{cov}_{FS} - \mathrm{cov}_{OP} = \tfrac{1}{4}V_G - \tfrac{1}{2}\mathrm{cov}_{OP}. \tag{1f}$$
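Equations (1a) to (1f) translate directly into a few matrix operations. The sketch below is one possible implementation under the stated assumption of no recombination; the function name and the returned dictionary keys are hypothetical. It is exercised on a single locus with dominance, where the two alleles play the role of haplotypes and the standard single-locus results $V_A = 2p(1-p)\alpha^2$ and $V_D = [2p(1-p)d]^2$ are available for comparison.

```python
import numpy as np

def relative_covariances(f, G):
    """Mean, genotypic variance and covariances of relatives, Equations (1a)-(1f).

    f : haplotype frequencies (length k); G : k x k symmetric genotypic values.
    Assumes random mating and no recombination in the transmissions considered.
    """
    f = np.asarray(f, dtype=float)
    G = np.asarray(G, dtype=float)
    mu = f @ G @ f                        # population mean
    VG = f @ (G ** 2) @ f - mu ** 2       # genotypic variance
    m = G @ f                             # m_i: one haplotype fixed, other random
    cov_op = f @ (m ** 2) - mu ** 2       # (1a)
    cov_hs = 0.5 * cov_op                 # (1b)
    cov_fs = 0.25 * VG + 0.5 * cov_op     # (1c)
    v_sxd = cov_fs - 2.0 * cov_hs         # (1f)
    return {"mu": mu, "VG": VG, "covOP": cov_op,
            "covHS": cov_hs, "covFS": cov_fs, "Vsxd": v_sxd}

# Single locus with dominance: genotypic values 2a, a + d and 0 (illustrative numbers).
p, a, d = 0.3, 1.0, 0.5
res = relative_covariances([p, 1 - p], [[2 * a, a + d], [a + d, 0.0]])
alpha = a + (1 - 2 * p) * d
VA = 2 * p * (1 - p) * alpha ** 2
VD = (2 * p * (1 - p) * d) ** 2
print(res["covOP"], 0.5 * VA)     # parent-offspring covariance versus VA/2
print(res["Vsxd"], 0.25 * VD)     # sire x dam component versus VD/4
```

Because only haplotype frequencies and genotypic values enter, the same function can be applied unchanged to multilocus haplotypes.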

Equations (1a–f) apply for any model of dominance and/or epistasis at any number of completely linked loci, regardless of the extent of LD, but can be illustrated for a single locus with dominance. Let $V_A$ and $V_D$ be the additive and dominance variances, respectively: $V_G = V_A + V_D$, $\mathrm{cov}_{OP} = \tfrac{1}{2}V_A$, $\mathrm{cov}_{HS} = \tfrac{1}{4}V_A$, $\mathrm{cov}_{FS} = \tfrac{1}{2}V_A + \tfrac{1}{4}V_D$ and $V_{s\times d} = \tfrac{1}{4}V_D$ (Falconer & Mackay 1996; Lynch & Walsh 1998). For loci in LE, these formulae also hold for multiple dominant loci. In contrast, for unlinked loci in LE, $\mathrm{cov}_{OP}$ includes $\tfrac{1}{4}V_{AA}$, $\mathrm{cov}_{HS}$ includes $\tfrac{1}{16}V_{AA}$ and $\mathrm{cov}_{FS}$ includes $\tfrac{1}{4}V_{AA} + \tfrac{1}{8}V_{AD} + \tfrac{1}{16}V_{DD}$, together with corresponding terms in three and more loci. For completely linked loci in LE, the corresponding terms in two-locus epistatic variance are $\tfrac{1}{4}V_{AA}$, $\tfrac{1}{8}V_{AA}$ and $\tfrac{3}{8}V_{AA} + \tfrac{1}{4}V_{AD} + \tfrac{1}{4}V_{DD}$, respectively (Cockerham 1956; Lynch & Walsh 1998, p. 148). In so far as there is a direct comparison, $V_{s\times d}$ comprises a quarter of all components. So we just take $e^2 = 4V_{s\times d}/V_G$ as a natural scaling to quantify in a simple way the contribution of dominance and epistatic variance to the genotypic variance in an analysis of data undertaken assuming most pairs of loci in the genome are unlinked or weakly linked, with $V_D$ being the predominant non-additive component.

Formulae including LD for specific models: dominance

Linkage disequilibrium influences both the additive and non-additive variances for loci showing dominance. For the model used here, that is two partially dominant loci with effects a1, d1 and a2, d2 (Table 1), we use results of Avery & Hill (1979). The genotypic variance is

$$V_G = \sum_i\big[2\alpha_i^2p_i(1-p_i) + 4d_i^2p_i^2(1-p_i)^2\big] + 4\alpha_1\alpha_2D + 8d_1d_2D^2 \tag{2}$$

where $\alpha_i = a_i + (1 - 2p_i)d_i$. With additive gene action, $\alpha_i = a_i$ and $d_i = 0$, and the term D appears clearly as the covariance of allele frequencies:

$$V_A = V_G = 2\sum_i a_i^2p_i(1-p_i) + 4a_1a_2D. \tag{3}$$

The covariances among first-degree relatives are, respectively,

$$\mathrm{cov}_{OP} = 2\,\mathrm{cov}_{HS} = \sum_i\alpha_i^2p_i(1-p_i) + 2\alpha_1\alpha_2D$$

$$\mathrm{cov}_{FS} = \sum_i\big[\alpha_i^2p_i(1-p_i) + d_i^2p_i^2(1-p_i)^2\big] + 2\alpha_1\alpha_2D + 2d_1d_2D^2,$$

and the scaled interaction variance, from (1f), is

$$4V_{s\times d} = \sum_i 4d_i^2p_i^2(1-p_i)^2 + 8d_1d_2D^2. \tag{4}$$

Equation (4) can also be written in terms of heterozygosities, $H_i = 2p_i(1 - p_i)$, and the correlation of gene frequencies between loci, r, as $4V_{s\times d} = d_1^2H_1^2 + d_2^2H_2^2 + 2r^2d_1d_2H_1H_2$. The above formulae can be extended to multiple loci simply by summation over all pairs of loci: for example,

$$4V_{s\times d} = \sum_i d_i^2H_i^2 + 2\sum_i\sum_{j<i} r_{ij}^2d_id_jH_iH_j.$$

Formulae including LD for specific models: additive × additive effects

This is the simplest epistatic model, in which gene effects at individual loci are additive but there can be AA interactions across loci (model in Table 1, upper part). It can be shown that

$$\mu = 2p_1a_1 + 2p_2a_2 + (4p_1p_2 + 2D)[aa]_{12}$$

$$
\begin{aligned}
V_G ={}& 2p_1(1-p_1)a_1^2 + 2p_2(1-p_2)a_2^2 + 4Da_1a_2 \\
&+ 4[2p_1(1-p_1)p_2 + D]\,a_1[aa]_{12} + 4[2p_1p_2(1-p_2) + D]\,a_2[aa]_{12} \\
&+ \big[4p_1p_2(1 + p_1 + p_2 - 3p_1p_2) + 2D(1 + 2p_1 + 2p_2 - 4p_1p_2)\big][aa]_{12}^2
\end{aligned}
\tag{5}
$$

(Mäki-Tanila & Hill 2014b, who also give (6) and (7)). In the absence of epistasis, (5) reduces to (3), that is the variance due to additive loci. The genotypic means, $m_i$, where one haplotype is designated and the other is random, depend on D, as does the overall mean, which is not so in the case of dominance. With no recombination, for example for haplotype A1A2:

$$m_1 = (1 + p_1)a_1 + (1 + p_2)a_2 + (1 + p_1 + p_2 + p_1p_2 + D)[aa]_{12}.$$

The covariance of offspring and parent is

$$
\begin{aligned}
\mathrm{cov}_{OP} ={}& p_1(1-p_1)a_1^2 + p_2(1-p_2)a_2^2 + 2Da_1a_2 \\
&+ 2[2p_1(1-p_1)p_2 + D]\,a_1[aa]_{12} + 2[2p_1p_2(1-p_2) + D]\,a_2[aa]_{12} \\
&+ \big[p_1p_2(1 + 3p_1 + 3p_2 - 7p_1p_2) + D(1 + 2p_1 + 2p_2 - 4p_1p_2) - D^2\big][aa]_{12}^2.
\end{aligned}
\tag{6}
$$

The first three terms in (6), that is those without epistasis, are one-half of those for the variances $V_A$ or $V_G$ (Equation 2), as expected for the parent–offspring covariance; but the contribution from the epistatic term is less straightforward. The scaled sire × dam interaction in a diallel analysis (using 1f, 5 and 6) is

$$4V_{s\times d} = V_G - 2\,\mathrm{cov}_{OP} = 2\big[p_1(1-p_1)p_2(1-p_2) + D^2\big][aa]_{12}^2. \tag{7}$$

Equation (7) can also be written $4V_{s\times d} = 2p_1(1 - p_1)p_2(1 - p_2)[1 + r^2][aa]_{12}^2$, and Equations (5–7) apply to the variances contributed by any other pair of loci under the same assumptions. Note that, for unlinked loci in LE (Mäki-Tanila & Hill 2014a), the epistatic variance is $V_{AA} = 4p_1(1 - p_1)p_2(1 - p_2)[aa]_{12}^2 = 4V_{s\times d}$, differing from (7) as explained above. The impact of including a little recombination in the generation when the variances are estimated, at a level concomitant with that needed to generate the LD, would be very small.
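The closed-form results for the additive × additive model can be checked numerically by enumerating the ten haplotype combinations. The sketch below does this for the mean and for the scaled interaction variance in Equation (7); the function name and the parameter values are assumptions chosen only for illustration, and recombination is ignored as in the text.

```python
import numpy as np

def aa_model_check(p1, p2, D, a1, a2, aa):
    """Compare enumerated and closed-form results for the two-locus AA model."""
    # Haplotype order: A1A2, A1a2, a1A2, a1a2.
    f = np.array([p1 * p2 + D, p1 * (1 - p2) - D,
                  (1 - p1) * p2 - D, (1 - p1) * (1 - p2) + D])
    x1 = np.array([1, 1, 0, 0])
    x2 = np.array([1, 0, 1, 0])
    X1 = x1[:, None] + x1[None, :]           # copies of A1 in each genotype
    X2 = x2[:, None] + x2[None, :]           # copies of A2 in each genotype
    G = a1 * X1 + a2 * X2 + aa * X1 * X2     # Table 1, part without dominance

    mu = f @ G @ f
    VG = f @ (G ** 2) @ f - mu ** 2
    m = G @ f
    cov_op = f @ (m ** 2) - mu ** 2          # Equation (1a)

    enumerated = VG - 2.0 * cov_op           # 4 V_sxd from Equation (1f)
    closed_form = 2.0 * (p1 * (1 - p1) * p2 * (1 - p2) + D ** 2) * aa ** 2
    mu_closed = 2 * p1 * a1 + 2 * p2 * a2 + (4 * p1 * p2 + 2 * D) * aa
    return mu, mu_closed, enumerated, closed_form

print(aa_model_check(p1=0.3, p2=0.6, D=0.05, a1=1.0, a2=0.5, aa=-0.7))
```

The linear terms in D present in both the genotypic variance and the parent–offspring covariance cancel in their difference, so that D enters the scaled interaction variance only through its square.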


Expected variances consequent on LD in segregating populations

When analysing the likely contribution of epistasis to variance, we have pointed out that, under arguably the most parsimonious population genetic model (random drift with no selection and rare recurrent mutation), the weight of the allele frequency distribution is towards frequencies near 0 or 1, so expected heterozygosity is low and, consequently, the dominance and epistatic variances are expected to comprise only a small part of the genotypic variance (Hill et al. 2008; Mäki-Tanila & Hill 2014a). Now, we consider the impact of LD and use basically the same model, except allowing also for LD generated by random drift. Two-locus distributions have been derived (Song & Song 2007), but we find the formulae intractable and use Monte Carlo simulation.

A simple haploid model was (forward) simulated, here described for two loci, but a three-locus version was developed similarly. There was assumed to be random mating and no selection, with expected haplotype frequencies after recombination specified by the reduction in D to (1 – c)D, where c is the recombination fraction. Mutation was assumed to be very rare, such as to have no impact in a segregating population. If the sample of haplotypes for the next generation became fixed at one of the loci, a mutant allele was sampled at the fixed locus. For illustration, say allele A2 was fixed at the second locus: then, an A1a2 haplotype was sampled with probability p1 and an a1a2 haplotype with probability 1 – p1. As no mutation was included when both loci were segregating, this models the case of a very low mutation rate. For the steady-state distribution, the results across generations were accumulated solely when both loci were segregating. For a specified population size, a continued iteration of 10^6 generations preceded by a ‘burn-in’ of 1000 unrecorded generations was used to construct the frequency distribution of haplotype classes.
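The simulation scheme just described can be sketched as follows. This is a minimal illustration rather than the authors' program: the function names, the order of operations within a generation (mutants introduced at fixed loci, then recombination and multinomial sampling, then recording) and the handling of the case where both loci are fixed at once are assumptions, and the example run uses far fewer generations than the study.

```python
from collections import Counter
import numpy as np

# Haplotype order: A1A2, A1a2, a1A2, a1a2 (1 = increasing allele present).
X1 = np.array([1, 1, 0, 0])
X2 = np.array([1, 0, 1, 0])

def after_recombination(freq, c):
    """Expected haplotype frequencies once D is reduced to (1 - c) D."""
    p1, p2 = freq @ X1, freq @ X2
    D = (1 - c) * (freq[0] - p1 * p2)
    new = np.array([p1 * p2 + D, p1 * (1 - p2) - D,
                    (1 - p1) * p2 - D, (1 - p1) * (1 - p2) + D])
    new = np.clip(new, 0.0, None)        # guard against rounding below zero
    return new / new.sum()

def introduce_mutant(counts, locus, rng):
    """Give one randomly chosen haplotype the missing allele at a fixed locus."""
    donor = rng.choice(4, p=counts / counts.sum())
    mutant = donor ^ (2 if locus == 1 else 1)   # flip the allele at that locus
    counts[donor] -= 1
    counts[mutant] += 1

def simulate(nh, c, generations, burn_in, seed=0):
    """Accumulate haplotype-count states over generations with both loci segregating."""
    rng = np.random.default_rng(seed)
    counts = np.array([nh, 0, 0, 0])     # assumed start: fixed for A1A2
    dist = Counter()
    for g in range(burn_in + generations):
        for locus, x in ((1, X1), (2, X2)):
            if counts @ x in (0, nh):    # locus fixed: mutation after fixation only
                introduce_mutant(counts, locus, rng)
        counts = rng.multinomial(nh, after_recombination(counts / nh, c))
        if g >= burn_in and all(0 < counts @ x < nh for x in (X1, X2)):
            dist[tuple(int(n) for n in counts)] += 1
    return dist

dist = simulate(nh=20, c=0.01, generations=2000, burn_in=100, seed=1)
print(len(dist), "distinct segregating states recorded")
```

The returned counter plays the role of the accumulated frequency distribution of haplotype classes, from which variances for any genotypic model can be computed as frequency-weighted averages.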
From the cumulative distributions so obtained, genetic variances could be evaluated for any quantitative genetic model without further simulation. To keep computing requirements and storage space manageable, a diploid population of n = 40 (haploid population size nh = 80) or exceptionally n = 20 (nh = 40) was simulated. Note that, as the distribution is asymptotically (using diffusion equation arguments) a function of Nc, this analysis has some generality. It is, however, biased towards finding more epistatic variance than would be expected in larger populations, because the distribution at individual loci becomes more extreme U shaped as N rises, and so heterozygosity and consequently epistatic variance become proportionately lower (Hill et al. 2008; Mäki-Tanila & Hill 2014a). The variance components were computed using haplotype combinations for each model of defined gene effects, and these were weighted by the frequency with which the haplotype combination occurred so as to approximate the integral over the distribution of haplotype combinations.

Results were also obtained indirectly for unlinked loci as a check on the simulation and as a reference point, because some disequilibrium is inevitable in finite populations even with free recombination. A square transition probability matrix of dimension nh + 1 for a single locus with initial frequency 1/nh was iterated until approaching fixation, and the frequency distribution accumulated over generations. The two-locus joint frequency distribution for independent loci was obtained as the product of the respective single-locus marginal frequencies. Predicted variance components were essentially the same as for simulation with c = 0.5 (if the same assumptions were made about no recombination in the last generation), and therefore are not reported.

Recurrent mutation

Because the heterozygosity is low for the neutral mutation model above and thus likely to lead to little non-additive variance, the model was also run as a check with recurrent mutation, with equal mutation rates between the alleles. This thereby increased the average heterozygosity at each locus.

Three loci

When more loci are considered, the possible number of interaction terms, including those involving dominance, rises dramatically. Our previous analysis shows that the expected proportion of epistatic variance does not rise accordingly for a higher number of unlinked loci in LE (Mäki-Tanila & Hill 2014a). To find out the respective consequences for linkage and LD, we undertook a check for three loci and the neutral mutation model. The simulation program described above was extended, again analysing one million simulated generations, with mutation occurring only after fixation. The genotypic model in Table 1 was extended to include all two- and three-locus epistatic terms (Mäki-Tanila & Hill 2014b). For three loci, there are now three two-locus LD terms and a three-locus LD term (Geiringer 1944; Hill 1974). We assumed loci were ordered A1, A2 and A3 on the chromosome; recombination fractions c12 and c23 between adjacent loci were specified, and there was assumed to be no interference. So $c_{13} = c_{12} + c_{23} - 2c_{12}c_{23}$, and $c_{123} = 1 - (1 - c_{12})(1 - c_{23}) = c_{12} + c_{23} - c_{12}c_{23}$ is the three-locus recombination coefficient.
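As a small worked example of the no-interference assumption, the sketch below computes the two derived recombination coefficients from those of the adjacent intervals. Recombination between the outer loci requires a crossover in exactly one interval, whereas the three-locus coefficient is the probability of a crossover in at least one interval. The function name is hypothetical.

```python
import math

def three_locus_recombination(c12, c23):
    """Recombination coefficients for ordered loci A1, A2, A3 without interference."""
    # Outer loci recombine when exactly one of the two intervals has a crossover.
    c13 = c12 * (1 - c23) + (1 - c12) * c23
    # The haplotype is broken up when at least one interval has a crossover.
    c123 = 1 - (1 - c12) * (1 - c23)
    return c13, c123

c13, c123 = three_locus_recombination(0.1, 0.2)
print(c13, c123)
```

With free recombination in both intervals (c12 = c23 = 0.5), this gives c13 = 0.5 and c123 = 0.75.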

More extreme LD: populations initiated from a two-way cross

Under the neutral mutation model, the allele frequencies tend to be weighted towards 0 or 1, and the allele frequencies usually differ markedly among the loci, so that |D| and $r^2$ are usually small at any generation. An alternative specifically defined scenario is one where a population is derived from a two-way cross. Then initial frequencies are 0.5 at each locus and initially $D_0$ = 0.25 or –0.25 (i.e. $r^2$ = 1). We simulated this scenario using the same program as in the neutral mutation case for pairs of loci, without burn-in or allowing any further mutation after fixation, and also simulated $D_0$ = 0 for comparison. Haplotype occurrences were accumulated over generations as previously, such that much of the weight of the distribution for allele frequencies near one-half was accumulated in early generations when historic LD was still high. Combinations of more extreme frequencies can occur only after multiple generations of recombination, with respective losses in LD, while drift generates LD.

Results

Specific gene frequency combinations

Some examples for pairs of loci are given in Figure 1 for recombination rates varying from c = 0 to 0.5. Examples used are complete dominance and solely positive or negative AA epistasis of magnitude similar to individual gene effects as a basis, and a more complex model including also AD and DD epistasis. The results for c = 0.5 serve as a reference point for previous work, although it must be recalled here that $e^2$ is a function of all potential interaction terms, including dominance, but not a simple function of individual epistatic variance components. Tight linkage has an influence on the interaction variance generated by dominance and by AA epistasis and by combinations of both types (Figure 1), increasing its proportion of the genotypic variance for all examples shown as a nonlinear function of c.

Figure 1 Examples of contribution of interaction variance (dominance and epistasis) for specified haplotype frequencies. Panels: p1 = p2 = 0.25; p1 = 0.25, p2 = 0.75; p1 = p2 = 0.5; p1 = p2 = 0.75. Horizontal axis: D; vertical axis: Vs×d/VG. The range in D is constrained by the allele frequencies. Models included are ai = di = 1 (line marking –); ai = [aa]12 = 1; ai = –[aa]12 = 1; and ai = di = [aa]12 = [ad]12 = [da]12 = [dd]12 = 1 (- -). Effects are 0 otherwise.

All the examples include complete dominance or epistatic effects of the same magnitude as those of individual gene effects and are therefore quite extreme models of interactions. If interaction effects are of smaller order than main effects, for example d = [aa] = a/2, [ad] = a/4, [dd] = a/8, the variances they contribute would accordingly be smaller. Those combinations of effects that may yield substantial interaction in the absence of LD are also those that may contribute most in the presence of LD, notably for example when the [aa] term is of opposite sign to that of the gene effects, a, such that the [aa] terms reduce the average effects of the genes. To test whether the assumption of no recombination in the last generation (rate $c_L$ = 0) was important, numerical examples (not shown) indicate that, for example, the ratio $e^2$ changes by much
