SOME ASPECTS OF A STOCHASTIC TWO LOCUS SELFING GENETIC MODEL WITH SELECTION AND COMPUTER SIMULATION
R. L. W. WELCH, S. C. SMEACH and C. P. TSOKOS
Department of Mathematics, University of South Florida, Tampa, Florida 33620 (USA)
(Received: 20 June, 1975)
SUMMARY
The present paper examines a specific genetic model as a finite Markov process, using the normal matrix approach. This model is the two locus selfing model with selection studied by Tan (1973), who used an eigenvalue approach. The properties of the process are analytically and numerically investigated and the effects of selection and crossover on the transition from a heterozygotic parent through several generations of heterozygotic progeny are assessed. These results enlarge upon Tan's work and, in addition, present two new aspects of the model. In particular, (1) the expected number of generations of heterozygotic progeny of genotype j that will descend from a heterozygotic parent of genotype i and (2) the variance of this number of generations about the mean value have not been previously considered.
SOMMAIRE
This article examines a specific genetic model as a finite Markov process, using a transition matrix. The model is the two locus model with selection studied by Tan, who used eigenvalue calculations. The properties of the process are studied analytically and numerically, and the effects of selection and crossing-over on the transition from a heterozygotic parent through several generations of heterozygotic descendants are also assessed. These results extend Tan's work and, in addition, shed light on two new aspects of the model that had not been addressed previously: (1) the expected number of generations of heterozygotic descendants of genotype j that descend from a heterozygotic parent of genotype i; (2) the variance of this number of generations about the mean value.
Int. J. Bio-Medical Computing (7) (1976). © Applied Science Publishers Ltd, England, 1976
Printed in Great Britain
INTRODUCTION
In a genetic system it is of interest to determine the expected genotypic composition of progeny generations descended from a parent population with given genetic characteristics. Such information, for example, may aid an agricultural experimenter who is trying to develop a strain of some plant with particularly desirable traits. In 1949 Fisher (1965) treated breeding systems as discrete stochastic processes in the Markovian sense. Nelder (1952) discussed the two locus model without selection. He studied the transition from one generation to another in the selfing and backcross cases, though he did not investigate the model from the standpoint of Markov chain theory. In 1969 Bosso et al. (1969) used the approach implied earlier by Fisher (1965) to present the one locus sib mating model as an example of the finite absorbing Markov chain. Through a discussion of several special cases, they illustrated some of the properties of that model with the normal matrix techniques outlined in Kemeny and Snell (1960) or Tsokos (1972). Tan (1973) has recently considered the application of Markovian theory to the two locus selfing model with selection.

Preliminary to further discussion, some remarks are in order about this latter model. As usual (Nelder, 1952; Tan, 1973) we shall assume that each genome consists of two linked loci: the first locus with alleles A and a, and the second locus with alleles B and b. Let us denote, e.g., the genotype with genes A and B on one chromosome and A and b on the other by AB/Ab. Considering all possible combinations of alleles at both loci on each chromosome of the genome, the following ten genotypes are obtained, assuming that there is no difference caused by sex:

AB/AB, Ab/Ab, aB/aB, ab/ab, AB/Ab, AB/aB, Ab/ab, aB/ab, AB/ab, Ab/aB

We interpret these as the ten states of a Markov chain. Note that the first four are homozygotic in both loci; the second four are heterozygotic in one locus, hereafter called singly heterozygotic. The last two genotypes are heterozygotic in both loci, hereafter called doubly heterozygotic.

In addition, we assume that the fitness of the loci is additive, that is, the selection affects each locus independently. Let $x_1$ represent the relative fitness of the AA pairing, let $y_1$ be that of aa, $x_2$ that of BB and $y_2$ that of bb; the heterozygous pairings Aa and Bb each have relative fitness 1. Thus the relative fitnesses of the genotypes are:

\[
\begin{array}{c|ccc}
 & BB & Bb & bb \\ \hline
AA & x_1 + x_2 & x_1 + 1 & x_1 + y_2 \\
Aa & 1 + x_2 & 1 + 1 & 1 + y_2 \\
aa & y_1 + x_2 & y_1 + 1 & y_1 + y_2
\end{array}
\]

where $x_i, y_i \ge 0$ ($i = 1, 2$). Furthermore, we include the possibility of cross-overs in the specification of the model by letting p be the probability of a cross-over occurring, where $0 \le p \le 1/2$. For further explanation of this see Elandt-Johnson (1971).
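The additive fitness rule in the table above can be sketched in a few lines of Python. This is only an illustration: the function names `locus_fitness` and `genotype_fitness`, and the encoding of a genotype as a string such as "AB/Ab", are assumptions introduced here and do not come from the original paper.

```python
def locus_fitness(allele_1, allele_2, upper, w_upper, w_lower):
    """Relative fitness of one locus; the heterozygous pairing has fitness 1."""
    if allele_1 != allele_2:
        return 1
    return w_upper if allele_1 == upper else w_lower


def genotype_fitness(genotype, x1, y1, x2, y2):
    """Additive fitness of a genotype written as 'AB/Ab' (hypothetical encoding)."""
    left, right = genotype.split("/")
    first = locus_fitness(left[0], right[0], "A", x1, y1)   # AA -> x1, aa -> y1
    second = locus_fitness(left[1], right[1], "B", x2, y2)  # BB -> x2, bb -> y2
    return first + second
```

For instance, `genotype_fitness("AB/Ab", x1, y1, x2, y2)` evaluates to $x_1 + 1$, the AA, Bb entry of the table.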
Tan’s work was concerned with determining the following stochastic properties of the model: (1) The probability that a heterozygotic parent will have progeny of a particular homozygotic genotype in the nth generation, at or before the nth generation, and in some generation (i.e. ultimately); (2) the probability that a heterozygotic parent will have homozygotic progeny of any genotype in the nth generation; (3) the expected number of generations that a heterozygotic line would continue producing heterozygotic offspring; and (4) the dispersion (variance) of the above number of generations about the mean value. It is the aim of this paper to consider some of the analytical and computational aspects of this model and to extend the previous results. First we answer two questions, not studied before, which are of interest: (5) The expected number of generations of heterozygotic offspring of genotype j descended from an initial heterozygotic line of type i; and (6) the variance of the number of generations about this mean value. Secondly, the behaviour of the model in the limit as the relative fitnesses of the homozygotic genotypes approach infinity is derived. Thirdly, it will be shown that the methods of the present investigation are computationally less complex than the eigenvalue approach used by Tan (1973). Using the eigenvalue methods of Tan, it is not only cumbersome to display most of the properties of the system algebraically, but also impractical to assess analytically the resulting functions. Moreover, previous authors have discussed only a few special cases numerically, and these were presented in a tabular form which is inadequate to describe the finer aspects of the behaviour of the system. We have chosen to do a more complete simulation of the model, using a computer, and to display the results graphically. The inherent simplicity and flexibility of the present techniques may prove to be advantageous in investigating similar systems. 
It should be noted that the answers to the genetical questions that Tan examined can very easily be deduced from the current approach utilising the normal matrix. The formulation of the model and an explanation of the mathematical procedures used are given below. The analytical and numerical results, along with the accompanying graphs, as well as the conclusions of the current study, are also presented below.
FORMULATION OF THE MODEL
The one-step probability transition matrix P which describes the behaviour of the two locus selfing model can be represented in the special form given below (Kemeny and Snell, 1960; Tan, 1973; Tsokos, 1972):

\[
P = \begin{pmatrix} I & 0 \\ G & T \end{pmatrix} \tag{1}
\]

In the present case with four homozygotic and six heterozygotic genotypes, the submatrices are as follows. I is a (4 x 4) identity matrix which represents the fact that a homozygotic parent will reproduce its own kind with probability one (and other homozygotic genotypes with probability zero). Since it is also impossible for a heterozygotic genotype to be produced from a homozygotic line, 0 is a (4 x 6) zero matrix. G is the (6 x 4) matrix which gives the probability that, in the nth generation, a heterozygotic parent of the ith genotype will produce homozygotic offspring of the jth genotype, belonging to the (n + 1)th generation. T is the (6 x 6) matrix of one-generation probabilities that a heterozygotic parent of type i will have heterozygotic offspring of type j.
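\[
G = \begin{array}{c|cccc}
 & AB/AB & Ab/Ab & aB/aB & ab/ab \\ \hline
AB/Ab & \dfrac{x_1 + x_2}{c_1} & \dfrac{x_1 + y_2}{c_1} & 0 & 0 \\
AB/aB & \dfrac{x_1 + x_2}{c_2} & 0 & \dfrac{y_1 + x_2}{c_2} & 0 \\
Ab/ab & 0 & \dfrac{x_1 + y_2}{c_3} & 0 & \dfrac{y_1 + y_2}{c_3} \\
aB/ab & 0 & 0 & \dfrac{y_1 + x_2}{c_4} & \dfrac{y_1 + y_2}{c_4} \\
AB/ab & \dfrac{q^2(x_1 + x_2)}{c_5} & \dfrac{p^2(x_1 + y_2)}{c_5} & \dfrac{p^2(y_1 + x_2)}{c_5} & \dfrac{q^2(y_1 + y_2)}{c_5} \\
Ab/aB & \dfrac{p^2(x_1 + x_2)}{c_5} & \dfrac{q^2(x_1 + y_2)}{c_5} & \dfrac{q^2(y_1 + x_2)}{c_5} & \dfrac{p^2(y_1 + y_2)}{c_5}
\end{array} \tag{2}
\]

and

\[
T = \begin{array}{c|cccccc}
 & AB/Ab & AB/aB & Ab/ab & aB/ab & AB/ab & Ab/aB \\ \hline
AB/Ab & \dfrac{2(x_1 + 1)}{c_1} & 0 & 0 & 0 & 0 & 0 \\
AB/aB & 0 & \dfrac{2(x_2 + 1)}{c_2} & 0 & 0 & 0 & 0 \\
Ab/ab & 0 & 0 & \dfrac{2(y_2 + 1)}{c_3} & 0 & 0 & 0 \\
aB/ab & 0 & 0 & 0 & \dfrac{2(y_1 + 1)}{c_4} & 0 & 0 \\
AB/ab & \dfrac{2pq(x_1 + 1)}{c_5} & \dfrac{2pq(x_2 + 1)}{c_5} & \dfrac{2pq(y_2 + 1)}{c_5} & \dfrac{2pq(y_1 + 1)}{c_5} & \dfrac{4q^2}{c_5} & \dfrac{4p^2}{c_5} \\
Ab/aB & \dfrac{2pq(x_1 + 1)}{c_5} & \dfrac{2pq(x_2 + 1)}{c_5} & \dfrac{2pq(y_2 + 1)}{c_5} & \dfrac{2pq(y_1 + 1)}{c_5} & \dfrac{4p^2}{c_5} & \dfrac{4q^2}{c_5}
\end{array} \tag{3}
\]

where $q = 1 - p$, $c_1 = 4x_1 + x_2 + y_2 + 2$, $c_2 = 4x_2 + x_1 + y_1 + 2$, $c_3 = 4y_2 + x_1 + y_1 + 2$, $c_4 = 4y_1 + x_2 + y_2 + 2$ and $c_5 = x_1 + x_2 + y_1 + y_2 + 4$.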
In genetical terms this matrix yields, for a parent of genotype i, the expected frequency (or probability) of offspring of any genotype in the next generation. To illustrate, if the parent's genotypic make-up were AB/Ab, one should expect a proportion $(x_1 + x_2)/c_1$ of the offspring to be of the AB/AB genotype, $(x_1 + y_2)/c_1$ of the Ab/Ab genotype and $2(x_1 + 1)/c_1$ of the same genotype as the parent, in any generation. It should be noted that since a singly heterozygotic genotype is unaffected by crossing-over, such a parent can only reproduce itself or generate homozygotic offspring. A doubly heterozygotic parent can produce offspring of any genotypic sort, as long as the cross-over fraction p is non-zero. The probability transition matrix P contains all the information necessary for one to examine the statistical properties of the genetic system. (For the theoretical details see Kemeny and Snell (1960), Chapter III, or Tsokos (1972), pp. 556-67.)
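As a consistency check on eqns (2) and (3), the two matrices can be built numerically and each row of (G, T) confirmed to sum to one. The sketch below is a minimal illustration, not the authors' program: the function name `build_G_T` and the example parameter values are assumptions chosen here, and exact rational arithmetic is used only so that the row sums come out exactly.

```python
from fractions import Fraction


def build_G_T(x1, y1, x2, y2, p):
    """Return (G, T) of eqns (2) and (3); pass Fractions for exact arithmetic."""
    q = 1 - p
    c1 = 4 * x1 + x2 + y2 + 2
    c2 = 4 * x2 + x1 + y1 + 2
    c3 = 4 * y2 + x1 + y1 + 2
    c4 = 4 * y1 + x2 + y2 + 2
    c5 = x1 + x2 + y1 + y2 + 4
    # Rows: AB/Ab, AB/aB, Ab/ab, aB/ab, AB/ab, Ab/aB.
    # Columns of G: AB/AB, Ab/Ab, aB/aB, ab/ab.
    G = [
        [(x1 + x2) / c1, (x1 + y2) / c1, 0, 0],
        [(x1 + x2) / c2, 0, (y1 + x2) / c2, 0],
        [0, (x1 + y2) / c3, 0, (y1 + y2) / c3],
        [0, 0, (y1 + x2) / c4, (y1 + y2) / c4],
        [q * q * (x1 + x2) / c5, p * p * (x1 + y2) / c5,
         p * p * (y1 + x2) / c5, q * q * (y1 + y2) / c5],
        [p * p * (x1 + x2) / c5, q * q * (x1 + y2) / c5,
         q * q * (y1 + x2) / c5, p * p * (y1 + y2) / c5],
    ]
    # Singly heterozygotic offspring of a doubly heterozygotic parent.
    single = [2 * p * q * (x1 + 1) / c5, 2 * p * q * (x2 + 1) / c5,
              2 * p * q * (y2 + 1) / c5, 2 * p * q * (y1 + 1) / c5]
    T = [
        [2 * (x1 + 1) / c1, 0, 0, 0, 0, 0],
        [0, 2 * (x2 + 1) / c2, 0, 0, 0, 0],
        [0, 0, 2 * (y2 + 1) / c3, 0, 0, 0],
        [0, 0, 0, 2 * (y1 + 1) / c4, 0, 0],
        single + [4 * q * q / c5, 4 * p * p / c5],
        single + [4 * p * p / c5, 4 * q * q / c5],
    ]
    return G, T


# Example values chosen only for illustration.
G, T = build_G_T(Fraction(2), Fraction(3), Fraction(1, 2), Fraction(5), Fraction(1, 4))
row_sums = [sum(g) + sum(t) for g, t in zip(G, T)]  # each entry should equal 1
```

Every heterozygotic parent must produce some genotype in the next generation, so a row sum different from one would signal a transcription error in the matrices.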
We shall continue by investigating four aspects of the genetic behaviour in detail. The number of generations of heterozygotic progeny of the jth genotype that one would expect to issue from an initial line of the ith (heterozygotic) genotype (see question (5) above) is given by the normal or fundamental matrix N of the Markov process. It is computed as:

\[
N = \sum_{n=0}^{\infty} T^n = (I - T)^{-1} \tag{4}
\]

where the T matrix is as explained above and I is the (6 x 6) identity matrix. The three additional questions of interest are studied in terms of this normal matrix. The dispersion about the mean number of generations of heterozygotic genotype j, descended from an initial heterozygotic genotype i (see question (6) above), is the variance, $V_1$, of the normal matrix, given by:

\[
\mathrm{Var}(N) = V_1 = N(2D - I) - S \tag{5}
\]

where D is a (6 x 6) diagonal matrix whose elements are the main diagonal elements of N, I is the (6 x 6) identity matrix and S is a (6 x 6) matrix whose elements are the squares of the corresponding elements of N. Summing the rows of the normal matrix yields the expected number of generations of any sort of heterozygotic progeny produced by an initial line of the ith heterozygotic genotype (see question (3) above):

\[
\psi = N\xi \tag{6}
\]

where $\psi$ is the (6 x 1) mean vector and $\xi$ is a (6 x 1) vector of ones. The dispersion about the mean number of generations of heterozygotic offspring (see question (4) above) is given by the variance of the $\psi$ vector, $\psi_1$:

\[
\mathrm{Var}(\psi) = \psi_1 = (2N - I)\psi - \psi_{sq} \tag{7}
\]

where I is the (6 x 6) identity matrix and $\psi_{sq}$ is a (6 x 1) vector whose elements are the squares of the corresponding elements of $\psi$. By letting the cross-over value p and the relative fitnesses assume specific values in the heterozygotic transition matrix, T, one can easily obtain the above information. We proceed, on this basis, in the next section to present the results of the simulation, with the variables assuming successive values over a wide range.
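Equations (4)-(7) translate directly into a short routine. The following is a minimal sketch under stated assumptions: the helper names (`identity`, `invert`, `absorbing_chain_stats`) are invented for this illustration, the inverse is a plain Gauss-Jordan elimination over exact fractions rather than anything taken from the paper, and the 2 x 2 matrix `T_toy` is a hypothetical example, not the genetic model.

```python
from fractions import Fraction


def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def invert(A):
    """Gauss-Jordan inverse of a non-singular square matrix of Fractions."""
    n = len(A)
    M = [list(row) + e for row, e in zip(A, identity(n))]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        pv = M[col][col]
        M[col] = [v / pv for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]


def absorbing_chain_stats(T):
    """Return (N, V1, psi, psi1) for a transient block T, following eqns (4)-(7)."""
    n = len(T)
    I = identity(n)
    # Eqn (4): N = (I - T)^(-1).
    N = invert([[I[i][j] - T[i][j] for j in range(n)] for i in range(n)])
    # Eqn (5): V1 = N(2D - I) - S, written element by element.
    V1 = [[N[i][j] * (2 * N[j][j] - 1) - N[i][j] ** 2 for j in range(n)]
          for i in range(n)]
    # Eqn (6): psi is the vector of row sums of N.
    psi = [sum(row) for row in N]
    # Eqn (7): psi1 = (2N - I) psi - psi_sq.
    psi1 = [sum((2 * N[i][j] - I[i][j]) * psi[j] for j in range(n)) - psi[i] ** 2
            for i in range(n)]
    return N, V1, psi, psi1


# Hypothetical two-state transient block, used only to exercise the formulas.
T_toy = [[Fraction(1, 2), Fraction(0)], [Fraction(1, 4), Fraction(1, 2)]]
N, V1, psi, psi1 = absorbing_chain_stats(T_toy)
```

For the toy block the first state simply repeats itself with probability 1/2, so its mean of 2 and variance of 2 agree with the geometric distribution, which is a quick sanity check on the element-wise form of eqn (5).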
ANALYTICAL AND NUMERICAL RESULTS
To illustrate the usefulness of the formulae stated in the previous section in answering the questions posed, we shall perform the following numerical investigations. Except in analysing the algebraic results, which may easily be displayed for the singly heterozygotic genotypes, and the limiting properties of the normal matrix, we shall consider the system under the influence of symmetric selection. That is, we set $y_1 = x_1$ and $y_2 = x_2$; this effectively reduces the number of variables so that the system is analytically tractable. We shall employ a threefold numerical scheme: (1) fix the fitnesses or relative survival rates $x_1$ and $x_2$ of the homozygous pairings AA and BB, and let the cross-over probability p ($0 \le p \le 1/2$) vary to obtain the behaviour of the system under these circumstances; (2) fix $x_1$ and p while letting $x_2$ vary; and (3) fix $x_2$ and p while letting $x_1$ vary. Because of the structure of the heterozygotic probability transition matrix T (see eqn. (3)) there is a symmetry or complementarity in the results obtained for numerical schemes (2) and (3). This will be noted in detail below.

We let the quantities $x_1$ and $x_2$ range from 0.1 to 10.0 in increments of 0.005. The first extreme ($x_1 = 0.1$ or $x_2 = 0.1$) indicates lethality in the homozygotic genotypes; if the fitnesses are decreased any further, the statistics of the system rapidly become unbounded. The latter case ($x_1 = 10.0$ or $x_2 = 10.0$) represents a relatively fast extinction of the heterozygotic genotypes. In most cases the system is essentially stable once either $x_1$ or $x_2$ reaches the value of 5.0. If, say, $x_1 = 5.0$, this means that the homozygous AA and aa pairings survive five times as often as the heterozygous Aa pairing. By the time the fitness values reach 10.0, we have certainly extracted all the useful information about the system. In general the lower the relative fitnesses (and, hence, survivability) of the homozygotic genotypes, the longer the process stays in a transient state, that is, the greater the number of generations of heterozygotic progeny we would expect to see before a homozygotic offspring is produced. For case (1), p is incremented in steps of 0.00025.

Singly heterozygotic initial genotypes

With respect to the four singly heterozygotic genotypes, AB/Ab, AB/aB, Ab/ab and aB/ab, the process may easily be analysed. As we have indicated above, these genotypes are unaffected by cross-overs in the transition from one generation to the next. Thus it is only possible for a parent of one of these genotypes to reproduce itself or to produce a homozygotic offspring. Since the first four rows of the heterozygotic transition matrix, T, representing these genotypes, are diagonal, the sum of each row is just the single diagonal element. Thus, the results obtained by considering the normal matrix, N, and its variance, $V_1$, are the same as if one had studied the $\psi$ vector (the sum of the rows of N) and its variance, $\psi_1$, for these particular genotypes. We shall begin a brief description of the numerical results presented in the accompanying graphs by discussing the cases involving a singly heterozygotic parent generation. The expected number of generations of progeny of genotype AB/Ab that would issue from a parent of that make-up is:
\[
N_{11} = \frac{1}{1 - T_{11}} = \frac{4x_1 + x_2 + y_2 + 2}{2x_1 + x_2 + y_2}
\]

$N_{11}$ is the element in the first row and first column of the normal matrix, N; $T_{11}$ occupies the corresponding position in the T matrix. This quantity is identically equal to two if $x_2 + y_2 = 2$. It is a monotonically decreasing hyperbola in $x_1$ if $x_2 + y_2 < 2$, and a monotonically increasing hyperbola if $x_2 + y_2 > 2$. In either of these two instances:

\[
\lim_{x_1 \to \infty} N_{11} = 2
\]

The variance of this quantity, which is given by $V_{1.11} = N_{11}(2N_{11} - 1) - N_{11}^2$ (the element in the first row and first column of the $V_1$ matrix), is more difficult to examine analytically, but it behaves in a similar fashion. It is clear that:

\[
\lim_{x_1 \to \infty} V_{1.11} = 2
\]

Figures 1 and 2 show this behaviour with respect to the variable $x_1$ for the case of symmetric selection ($x_1 = y_1$ and $x_2 = y_2$). The straight line depicts the special case when $x_2 + y_2 = 2$. The mean value is a monotonically decreasing hyperbola in $x_2$ or $y_2$, with:

\[
\lim_{x_2 \text{ or } y_2 \to \infty} N_{11} = 1
\]

The variance acts in a similar manner, except that:

\[
\lim_{x_2 \text{ or } y_2 \to \infty} V_{1.11} = 0
\]

(see Figs. 3 and 4). The points of intersection on the two graphs reflect the fact that if $x_2 = 1$ (that is, $x_2 + y_2 = 2$, since we hold $x_2 = y_2$), then:

\[
N_{11} = V_{1.11} = 2
\]

The expected number of generations of genotype AB/aB that would descend from a parent of the same kind is:

\[
N_{22} = \frac{1}{1 - T_{22}} = \frac{4x_2 + x_1 + y_1 + 2}{2x_2 + x_1 + y_1}
\]

The dispersion about this mean for genotype AB/aB is given by:

\[
V_{1.22} = N_{22}(2N_{22} - 1) - N_{22}^2
\]

It should be noticed that these two quantities will be the same as $N_{11}$ and $V_{1.11}$, respectively, if $x_1$ and $x_2$ are switched and $y_1$ is replaced by $y_2$. That is, the mean and variance for AB/Ab, with $x_1$ varying and $x_2 = a$ (or $x_2 + y_2 = a$ if selection is non-symmetric) held fixed, behave exactly as do the mean and variance associated with AB/aB, with $x_2$ varying and $x_1 = a$ (or $x_1 + y_1 = a$) held fixed, or vice versa. Thus, Figs. 1-4 also depict the case for initial genotype AB/aB, which is consistent with the asymptotic behaviour.
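The monotonicity and limit statements for a singly heterozygotic line follow from a one-line rearrangement, sketched here for AB/Ab. The shorthand $s = x_2 + y_2$ is introduced only for this illustration and is not notation from the original.

```latex
\[
N_{11} = \frac{4x_1 + s + 2}{2x_1 + s} = 2 + \frac{2 - s}{2x_1 + s},
\qquad
V_{1.11} = N_{11}(2N_{11} - 1) - N_{11}^2 = N_{11}(N_{11} - 1).
\]
```

The correction term $(2 - s)/(2x_1 + s)$ vanishes when $s = 2$, is positive and shrinking in $x_1$ when $s < 2$, and is negative and rising towards zero when $s > 2$, which gives the three cases described above. As $x_1 \to \infty$ the variance tends to $2(2 - 1) = 2$, and as $s \to \infty$ the mean tends to 1 and the variance to 0. The form $N_{11}(N_{11} - 1) = T_{11}/(1 - T_{11})^2$ is the variance of a geometric distribution with success probability $1 - T_{11}$, as one would expect for a state that can only repeat itself or be left for good.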
Fig. 1. Mean of genotype AB/Ab producing AB/Ab progeny, plotted against x1; curves (1)-(5): x2 = 0.5, 1.0, 1.5, 2.0, 2.5.
Fig. 2. Variance of genotype AB/Ab producing AB/Ab progeny, plotted against x1.
Fig. 3. Mean of genotype AB/Ab producing AB/Ab progeny, plotted against x2; curves (1)-(5): x1 = 2.5, 2.0, 1.5, 1.0, 0.5.
Fig. 4. Variance of genotype AB/Ab producing AB/Ab progeny, plotted against x2; curves (1)-(5): x1 = 2.5, 2.0, 1.5, 1.0, 0.5.
The mean number of generations of progeny of genotype Ab/ab from a parent of that type is:

\[
N_{33} = \frac{1}{1 - T_{33}} = \frac{4y_2 + x_1 + y_1 + 2}{2y_2 + x_1 + y_1}
\]

Its variance is given by:

\[
V_{1.33} = N_{33}(2N_{33} - 1) - N_{33}^2
\]

These two quantities behave in all respects similarly to the preceding case (for genotype AB/aB) with $x_2$ replaced by $y_2$. The two cases are identical under the influence of symmetric selection. The mean number of generations of genotype aB/ab from a like parent is:

\[
N_{44} = \frac{1}{1 - T_{44}} = \frac{4y_1 + x_2 + y_2 + 2}{2y_1 + x_2 + y_2}
\]

The variance about this mean is:

\[
V_{1.44} = N_{44}(2N_{44} - 1) - N_{44}^2
\]

The quantities behave similarly to those in the case for genotype AB/Ab with $x_1$ replaced by $y_1$; the cases coincide if the selection is symmetric in these two variables.

Doubly heterozygotic initial genotypes
With regard to the future offspring of one of the two doubly heterozygotic lines, AB/ab or Ab/aB, some general remarks may be made. Starting from an initial line of genotype AB/ab, the mean and variance of the number of generations of any of the singly heterozygotic genotypes (AB/Ab, AB/aB, Ab/ab or aB/ab) that will be produced are the same as if the initial genotype had been Ab/aB. Secondly, the mean and variance for a continuous line of genotype AB/ab, given a like parent generation, are the same as those for an Ab/aB line, given a parent of the latter kind. Likewise, the statistics for an AB/ab parent producing Ab/aB progeny are identical to those of an Ab/aB parent producing AB/ab progeny. Thirdly, the genetic system, if selection is symmetric, shows no difference with regard to genotypes AB/Ab and aB/ab or AB/aB and Ab/ab. Finally, the statistics for the number of generations of any sort of heterozygotic offspring are the same whether the initial line was AB/ab or Ab/aB. We summarise this algebraically:

\[
N_{51} = N_{61} = N_{54} = N_{64}, \qquad
N_{52} = N_{62} = N_{53} = N_{63}, \qquad
N_{55} = N_{66}, \quad N_{56} = N_{65}
\]

and

\[
V_{1.51} = V_{1.61} = V_{1.54} = V_{1.64}, \qquad
V_{1.52} = V_{1.62} = V_{1.53} = V_{1.63}, \qquad
V_{1.55} = V_{1.66}, \quad V_{1.56} = V_{1.65}
\]

where, as in the previous notation, $N_{61}$ is the element in the sixth row and first column of the normal matrix, and so forth. Also, for the row-sums in the $\psi$ vector and the variance $\psi_1$:

\[
\psi_5 = \psi_6 \quad \text{and} \quad \psi_{1.5} = \psi_{1.6}
\]
These identities should be kept in mind as we continue below to discuss the numerical and graphical results of the simulation. We now deal with the cases involving a parent with a doubly heterozygotic genetic constitution. Because of the complexity of the problem, it is more difficult to analyse the algebra in any meaningful manner. We do, however, discuss the limiting behaviour in each instance.

Singly heterozygotic progeny: The statistics of the number of generations of AB/Ab (or of aB/ab) offspring descended from an initial AB/ab or Ab/aB genotypic line are as follows. If p = 0 this event cannot occur. For $p \ne 0$, the mean number of generations is:

\[
N_{51} = \frac{T_{51}}{(1 - T_{11})(1 - T_{55} - T_{56})}
\]

From eqn. (3) we obtain the following limits:

\[
\lim_{x_1 \to \infty} N_{51} =
\begin{cases}
2pq & \text{symmetric selection} \\
4pq & \text{non-symmetric selection}
\end{cases}
\]

\[
\lim_{y_1 \to \infty} N_{51} =
\begin{cases}
2pq & \text{symmetric selection} \\
0 & \text{non-symmetric selection}
\end{cases}
\]

and

\[
\lim_{x_2 \text{ or } y_2 \to \infty} N_{51} = 0
\]

Also the variance about this mean is:

\[
V_{1.51} = N_{51}(2N_{11} - 1) - N_{51}^2
\]

from which it can be concluded that:

\[
\lim_{x_1 \to \infty} V_{1.51} =
\begin{cases}
6pq - 4p^2q^2 & \text{symmetric selection} \\
12pq - 16p^2q^2 & \text{non-symmetric selection}
\end{cases}
\]

\[
\lim_{y_1 \to \infty} V_{1.51} =
\begin{cases}
6pq - 4p^2q^2 & \text{symmetric selection} \\
0 & \text{non-symmetric selection}
\end{cases}
\]

and

\[
\lim_{x_2 \text{ or } y_2 \to \infty} V_{1.51} = 0
\]
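The closed form for $N_{51}$ can be recovered from the block structure of T in eqn. (3). The sketch below is one way to see it; the block names $\Delta$, B and C are labels introduced here for the illustration and do not appear in the original.

```latex
\[
T = \begin{pmatrix} \Delta & 0 \\ B & C \end{pmatrix},
\qquad
\Delta = \mathrm{diag}(T_{11}, T_{22}, T_{33}, T_{44}),
\qquad
C = \begin{pmatrix} T_{55} & T_{56} \\ T_{56} & T_{55} \end{pmatrix}
\]
\[
N = (I - T)^{-1} =
\begin{pmatrix}
(I - \Delta)^{-1} & 0 \\
(I - C)^{-1} B\, (I - \Delta)^{-1} & (I - C)^{-1}
\end{pmatrix},
\qquad
(I - C)^{-1} = \frac{1}{(1 - T_{55})^2 - T_{56}^2}
\begin{pmatrix} 1 - T_{55} & T_{56} \\ T_{56} & 1 - T_{55} \end{pmatrix}
\]
\[
N_{51} = \frac{(1 - T_{55})\,T_{51} + T_{56}\,T_{61}}{(1 - T_{55})^2 - T_{56}^2}
\cdot \frac{1}{1 - T_{11}}
= \frac{T_{51}}{(1 - T_{11})(1 - T_{55} - T_{56})}
\quad \text{since } T_{61} = T_{51}.
\]
```

The last step cancels the common factor $1 - T_{55} + T_{56}$. The limits then follow term by term: for non-symmetric selection with $x_1 \to \infty$, $T_{51} \to 2pq$, $1/(1 - T_{11}) \to 2$ and $T_{55} + T_{56} \to 0$, giving $4pq$; under symmetric selection $c_5$ grows like $2x_1$, so $T_{51} \to pq$ and the limit is $2pq$.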
If $x_2 < 1$, $N_{51}$ decreases with $x_1$ to a minimum and then slowly rises again to the limiting value; the minimum is farther from the origin for smaller values of p. If $x_2 \ge 1$, $N_{51}$ increases towards the asymptotic line as $x_1$ increases. The dispersion $V_{1.51}$ appears to decrease to the asymptote as $x_1$ increases, if $x_2 < 1$, and increases to this line if $x_2 \ge 1$. Both $N_{51}$ and $V_{1.51}$ decrease as $x_2$ increases. As p is held constant at successively higher values, the numbers tend to increase overall. These quantities increase rather slowly as p varies from 0 to 1/2, for fixed values of the fitnesses $x_1$ and $x_2$. Figures 5-10 illustrate this behaviour.

Fig. 5. Mean of genotype AB/ab or Ab/aB producing AB/Ab progeny, plotted against x1; p = 0.5; curves (1)-(5): x2 = 0.5, 1.0, 1.5, 2.0, 2.5.
Fig. 6. Variance of genotype AB/ab or Ab/aB producing AB/Ab progeny, plotted against x1; p = 0.5; curves (1)-(5): x2 = 0.5, 1.0, 1.5, 2.0, 2.5.
Figs. 7 and 8. Mean and variance of genotype AB/ab or Ab/aB producing AB/Ab progeny.
Fig. 9. Mean of genotype AB/ab or Ab/aB producing AB/Ab progeny, plotted against p (axis scaled as p x 10^-1); x2 = 1.0; curves (1)-(5): x1 = 2.5, 2.0, 1.5, 1.0, 0.5.
Fig. 10. Variance of genotype AB/ab or Ab/aB producing AB/Ab progeny, plotted against p (axis scaled as p x 10^-1); x2 = 1.0; curves (1)-(5): x1 = 2.5, 2.0, 1.5, 1.0, 0.5.

If a parent of genotype AB/ab or Ab/aB has successive generations of descendants of genotype AB/aB (or of Ab/ab), then the system acts as above (for progeny of type AB/Ab or aB/ab) with the roles of $x_1$ and $x_2$ reversed. To summarise the asymptotic results, we have:

\[
N_{52} = \frac{T_{52}}{(1 - T_{22})(1 - T_{55} - T_{56})}
\]

and

\[
V_{1.52} = N_{52}(2N_{22} - 1) - N_{52}^2
\]

which imply:

\[
\lim_{x_1 \text{ or } y_1 \to \infty} N_{52} = \lim_{x_1 \text{ or } y_1 \to \infty} V_{1.52} = 0
\]

\[
\lim_{x_2 \to \infty} N_{52} =
\begin{cases}
2pq & \text{symmetric selection} \\
4pq & \text{non-symmetric selection}
\end{cases}
\qquad
\lim_{y_2 \to \infty} N_{52} =
\begin{cases}
2pq & \text{symmetric selection} \\
0 & \text{non-symmetric selection}
\end{cases}
\]

\[
\lim_{x_2 \to \infty} V_{1.52} =
\begin{cases}
6pq - 4p^2q^2 & \text{symmetric selection} \\
12pq - 16p^2q^2 & \text{non-symmetric selection}
\end{cases}
\]

and

\[
\lim_{y_2 \to \infty} V_{1.52} =
\begin{cases}
6pq - 4p^2q^2 & \text{symmetric selection} \\
0 & \text{non-symmetric selection}
\end{cases}
\]
Doubly heterozygotic progeny: The mean and variance of the number of generations that a doubly heterozygotic parent will produce a given genotype of doubly heterozygotic offspring are as below. The mean number of times that genotype AB/ab or Ab/aB will reproduce itself is:

\[
N_{55} = \frac{1 - T_{55}}{(1 - T_{55})^2 - T_{56}^2}
\]

It may be noted from eqn. (3) that this is actually a function of $x_1 + x_2 + y_1 + y_2 = \alpha$. It can easily be shown that for any of the fitnesses individually, this is a monotonically decreasing function, and:

\[
\lim_{x_1, x_2, y_1 \text{ or } y_2 \to \infty} N_{55} = 1
\]

The variance about this mean, $V_{1.55} = N_{55}(2N_{55} - 1) - N_{55}^2$, also decays monotonically as any of the fitnesses increase, and:

\[
\lim_{x_1, x_2, y_1 \text{ or } y_2 \to \infty} V_{1.55} = 0
\]
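The dependence of $N_{55}$ on the fitnesses only through their sum can be made explicit. The following worked form is a sketch derived here from eqn. (3) with $c_5 = \alpha + 4$, $T_{55} = 4q^2/c_5$ and $T_{56} = 4p^2/c_5$; it is offered as a check on the stated limit, not as a formula quoted from the original.

```latex
\[
(1 - T_{55})^2 - T_{56}^2
= \frac{(\alpha + 4 - 4q^2 - 4p^2)(\alpha + 4 - 4q^2 + 4p^2)}{(\alpha + 4)^2}
= \frac{(\alpha + 8pq)(\alpha + 8p)}{(\alpha + 4)^2}
\]
\[
N_{55} = \frac{(\alpha + 8p - 4p^2)(\alpha + 4)}{(\alpha + 8pq)(\alpha + 8p)},
\qquad
N_{56} = \frac{4p^2(\alpha + 4)}{(\alpha + 8pq)(\alpha + 8p)}
\]
```

Here $4 - 4q^2 - 4p^2 = 8pq$ and $4 - 4q^2 + 4p^2 = 8p$ because $q = 1 - p$. Both numerator and denominator of $N_{55}$ are quadratic in $\alpha$ with leading coefficient one, so $N_{55} \to 1$ as $\alpha \to \infty$, and $V_{1.55} = N_{55}(N_{55} - 1) \to 0$. With p = 0 the expression reduces to $(\alpha + 4)/\alpha$, which matches $1/(1 - T_{55})$ for $T_{55} = 4/(\alpha + 4)$.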
The convergence for both is faster for higher values of p, and the process appears more stable. For the symmetric case, shown in Figs. 11 and 12, if $x_1$ or $x_2 \ge 5$, the value of the other fitness has very little influence on either the mean level or the dispersion about that mean. The statistics decrease with p, as the fitnesses are held constant (Figs. 13 and 14). The mean and the variance of the number of times that genotype AB/ab or Ab/aB will produce the other genotype (Ab/aB or AB/ab, respectively) are:

\[
N_{56} = \frac{T_{56}}{(1 - T_{55})^2 - T_{56}^2}
\]

and

\[
V_{1.56} = N_{56}(2N_{66} - 1) - N_{56}^2, \qquad \text{with } N_{66} = N_{55}
\]

These numbers, like $N_{55}$ and $V_{1.55}$, are monotonically decreasing functions of any of the fitnesses (or of $x_1 + x_2 + y_1 + y_2 = \alpha$). The limits may be derived as:

\[
\lim_{x_1, x_2, y_1 \text{ or } y_2 \to \infty} N_{56} = \lim_{x_1, x_2, y_1 \text{ or } y_2 \to \infty} V_{1.56} = 0
\]

As p ranges from 0 to 1/2, $N_{56}$ and $V_{1.56}$ increase very slightly in an almost linear fashion. Figures 15-18 illustrate the above-mentioned principles.

Heterozygotic progeny of any kind: If the process starts with a parent of type AB/ab or Ab/aB and continuously produces heterozygotic progeny, the mean and variance are given by, respectively:

\[
\psi_5 = \sum_{i=1}^{6} N_{5i}
\]

and

\[
\psi_{1.5} = 2\sum_{i=1}^{4} N_{5i}N_{ii} + (2N_{55} - 1)\psi_5 + 2N_{56}\psi_5 - \psi_5^2
\]

In the special case when p = 0, the only non-zero entry in the fifth row of the normal matrix is $N_{55}$; thus, $\psi_5 = N_{55}$ and $\psi_{1.5} = V_{1.55}$. For $p \ne 0$, we have:

\[
\lim_{x_1, x_2, y_1 \text{ or } y_2 \to \infty} \psi_5 = 1 + 4pq
\]

since the limit of the sum will be the sum of the limits of each element. Also:

\[
\lim_{x_1, x_2, y_1 \text{ or } y_2 \to \infty} \psi_{1.5} = 12pq - 16p^2q^2
\]
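The two limits for the doubly heterozygotic row can be checked numerically by evaluating the closed forms given in the text at a very large fitness. The sketch below is an illustration under stated assumptions: the function name `doubly_het_row_stats`, the choice p = 0.25 and the value 1e9 standing in for "infinity" are all arbitrary choices made here.

```python
def doubly_het_row_stats(x1, y1, x2, y2, p):
    """Return (psi5, psi1_5) for an AB/ab parent from the closed forms in the text."""
    q = 1 - p
    c5 = x1 + x2 + y1 + y2 + 4
    # N_ii = 1 / (1 - T_ii) for the singly heterozygotic genotypes (i = 1..4).
    n_ii = [
        (4 * x1 + x2 + y2 + 2) / (2 * x1 + x2 + y2),
        (4 * x2 + x1 + y1 + 2) / (2 * x2 + x1 + y1),
        (4 * y2 + x1 + y1 + 2) / (2 * y2 + x1 + y1),
        (4 * y1 + x2 + y2 + 2) / (2 * y1 + x2 + y2),
    ]
    t5 = [2 * p * q * (w + 1) / c5 for w in (x1, x2, y2, y1)]  # T_51 .. T_54
    t55, t56 = 4 * q * q / c5, 4 * p * p / c5
    # N_5i = T_5i / ((1 - T_ii)(1 - T_55 - T_56)) for i = 1..4.
    n5 = [t * n / (1 - t55 - t56) for t, n in zip(t5, n_ii)]
    det = (1 - t55) ** 2 - t56 ** 2
    n55, n56 = (1 - t55) / det, t56 / det
    psi5 = sum(n5) + n55 + n56
    psi1_5 = (2 * sum(a * b for a, b in zip(n5, n_ii))
              + (2 * n55 - 1) * psi5 + 2 * n56 * psi5 - psi5 ** 2)
    return psi5, psi1_5


p = 0.25
q = 1 - p
# Non-symmetric selection with x1 very large, and symmetric selection with x1 = y1 very large.
psi5_ns, psi1_5_ns = doubly_het_row_stats(1e9, 1.0, 1.0, 1.0, p)
psi5_sym, psi1_5_sym = doubly_het_row_stats(1e9, 1e9, 1.0, 1.0, p)
limit_mean = 1 + 4 * p * q
limit_var = 12 * p * q - 16 * (p * q) ** 2
```

Both the symmetric and the non-symmetric run should land on the same pair of limiting values, which illustrates why the limits for $\psi_5$ and $\psi_{1.5}$ carry no symmetric/non-symmetric distinction even though the individual $N_{5i}$ limits do.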
Fig. 11. Mean of genotype AB/ab or Ab/aB producing progeny of the same kind, plotted against x1.
Fig. 12. Variance of genotype AB/ab or Ab/aB producing progeny of the same kind; curves (1)-(5): x2 = 0.5, 1.0, 1.5, 2.0, 2.5.
Fig. 13. Mean of genotype AB/ab or Ab/aB producing progeny of the same kind, plotted against p (axis scaled as p x 10^-1); x2 = 1.0.
Fig. 14. Variance of genotype AB/ab or Ab/aB producing progeny of the same kind, plotted against p (axis scaled as p x 10^-1); curves (2)-(5): x1 = 1.0, 1.5, 2.0, 2.5.
Fig. 15. Mean of genotype AB/ab or Ab/aB producing progeny of the opposite kind, plotted against x1; p = 0.4.
Fig. 16. Variance of genotype AB/ab or Ab/aB producing progeny of the opposite kind, plotted against x1; p = 0.4; curves (1)-(5): x2 = 0.5, 1.0, 1.5, 2.0, 2.5.
Fig. 17. Mean of genotype AB/ab or Ab/aB producing progeny of the opposite kind, plotted against p (axis scaled as p x 10^-1); x2 = 0.5; curves (1)-(5): x1 = 0.5, 1.0, 1.5, 2.0, 2.5.
Fig. 18. Variance of genotype AB/ab or Ab/aB producing progeny of the opposite kind, plotted against p (axis scaled as p x 10^-1).
Fig. 19. Mean of genotype AB/ab or Ab/aB producing heterozygotic progeny, plotted against x1; p = 0.5.
Fig. 20. Variance of genotype AB/ab or Ab/aB producing heterozygotic progeny, plotted against x1; p = 0.5.
Fig. 21. Mean of genotype AB/ab or Ab/aB producing heterozygotic progeny, plotted against p (axis scaled as p x 10^-1).
The mean level, $\psi_5$, decays monotonically as any of the fitnesses increase. If, say, $x_1 \ge 5$, the fitness $x_2$ has little effect on the mean, especially for choices of p near 0. The traces appear more stable for values of p near 1/2. If $x_2$ (or $x_1$) < 1 and $p \ne 0$, the variance $\psi_{1.5}$ appears to decrease monotonically to the asymptote as $x_1$ (or $x_2$) increases. If $x_2$ (or $x_1$) $\ge 1$, $\psi_{1.5}$ decreases rather rapidly to a minimum below the asymptotic line, and then increases back to its final limit. This effect is somewhat enhanced by choosing p near 1/2. (See Figs. 19 and 20 for an illustration of this behaviour.) It must be pointed out that an earlier author (Tan, 1973) erroneously claimed that both of these quantities were increasing functions of $x_1$ (or $x_2$). If the fitness levels are fixed and p is allowed to vary from 0 to 1/2, the results are rather curious. The mean $\psi_5$ seems to be affected very little by changing p. The variance $\psi_{1.5}$, however, has a noticeable maximum at p near zero, when the fitness values are low. As they are held constant at successively higher levels, the maximum occurs closer to p near 1/2 (see Figs. 21-24). This seems to be consistent with what the author mentioned above claimed in this case.
Fig. 22. Variance of genotype AB/ab or Ab/aB producing heterozygotic progeny, plotted against p (axis scaled as p x 10^-1).
Fig. 23. Variance of genotype AB/ab or Ab/aB producing heterozygotic progeny, plotted against p (axis scaled as p x 10^-1); x2 = 1.0; curves (1)-(5): x1 = 0.5, 1.0, 1.5, 2.0, 2.5.
Fig. 24. Variance of genotype AB/ab or Ab/aB producing heterozygotic progeny, plotted against p (axis scaled as p x 10^-1); x2 = 1.5; curves (3)-(5): x1 = 1.5, 2.0, 2.5.

CONCLUSIONS
It has been observed, in a mathematical sense, that it makes no difference which locus on the model genome we consider to be the first and which the second. For example, AB/Ab behaves identically to AB/aB, under the influence of the respective fitnesses of each genotypic pair (AA, BB, Bb and bb on the one hand, and BB, AA, Aa and aa on the other). Furthermore, since we label the alleles arbitrarily, AB/Ab behaves as aB/ab. Therefore, in dealing with the four cases which involve the singly heterozygotic genotypes, we have found that a complete discussion of one will also suffice to describe the behaviour of the other three.

In view of the analytical and numerical results, we can draw several conclusions. If the initial breeding line is singly heterozygotic, for example AB/Ab, then one should expect the following. If the homozygous pair AA had a large selective advantage, there would be about two generations of AB/Ab progeny before complete homozygosis was achieved. This expectation should hold true, regardless of the viability of the AA pairing, if there were no selective difference in the BB and bb pairs vis-a-vis Bb. If the fitness of either BB or bb is large, then a viable homozygote should appear rapidly.

The instance in which a (doubly heterozygotic) AB/ab or Ab/aB line reaches homozygosity through AB/Ab progeny is rare, but the system would behave qualitatively as in the previous case. The incidence of crossing-over, however, will affect not only the frequency of this event, but also the value around which the system stabilises for a large AA fitness. One should expect to achieve homozygosity after one generation of an AB/ab or Ab/aB parent reproducing itself, for even a moderate selective advantage in one of the homozygotic pairs. Furthermore, the dispersion of any experimental results will be small. Reaching homozygosity through the other doubly heterozygotic genotype is much more infrequent; both gametes must have had a cross-over take place during formation for this to occur. If we consider a doubly heterozygotic line producing any sort of heterozygotic offspring, we find that this is dominated by the case just mentioned in which the parental type reproduces itself. This dominance is somewhat weakened the higher the occurrence of crossing-over, yet this proportion affects the individual cases much more significantly.

Two more general observations may be made. First, in most instances when there is a 5:1 or greater selective advantage in one of the homozygotic pairs (AA, aa, BB or bb) over its corresponding heterozygotic pair (Aa or Bb), the system is essentially stable. A small fluctuation in any of the other fitnesses will have little effect on any experimental outcome. Secondly, the statistics (mean and variance) have behaved similarly in almost every case. Thus, the probability distribution function which characterises the number of generations of heterozygotic offspring of a given type, produced by a given initial type, would seem to be restricted to a class for which the variance is proportional to the mean.
REFERENCES
BOSSO, J. A., SORARRAIN, O. M. and FAVRET, E. E. A., Applications of finite absorbent Markov chains to sib mating populations with selection, Biometrics, 25 (March, 1969), pp. 17-26.
ELANDT-JOHNSON, R. C., Probability models and statistical methods in genetics, New York, John Wiley and Sons, 1971.
FISHER, R. A., The theory of inbreeding (2nd ed.), New York, Academic Press, 1965.
KEMENY, J. G. and SNELL, J. L., Finite Markov chains, Princeton, D. Van Nostrand Company, Inc., 1960.
NELDER, J. A., Some genotypic frequencies and variance components occurring in biometrical genetics, Heredity, 6 (December, 1952), pp. 387-94.
TAN, W. Y., Applications of some finite Markov chain theories to two locus selfing model with selection, Biometrics, 29 (June, 1973), pp. 331-46.
TSOKOS, C. P., Probability distributions: An introduction to probability theory with applications, Belmont, CA, Duxbury Press, 1972.