AN ITERATIVE ALGORITHM FOR ANALYSIS OF VARIANCE

J. J. DAUDIN

Institut National Agronomique, Paris-Grignon, Paris (France)

(Received: 3 July, 1979)

SUMMARY

In this paper, an iterative algorithm is proposed for computing estimates of parameters and sums of squares in non-orthogonal multivariate analysis of variance, without inverting any matrix. It is useful in the case of a large design matrix, for it saves memory and computation time. It was first proposed by Stevens (1948) for 3 factors and is here generalised to any number of factors and interactions of any order. Convergence properties are studied: the more orthogonal the design, the faster the convergence. Several examples are provided.

SOMMAIRE

In this article, an iterative algorithm is proposed for computing parameter estimates and sums of squares in non-orthogonal analysis of variance, without any matrix inversion. This algorithm is useful in the case of a design matrix of large dimensions, since it saves both memory and computation time. It was first proposed by Stevens (1948) for the case of three factors; it is extended here to any number of factors and to interactions of arbitrary order. Convergence properties are studied: the closer the design is to orthogonality, the faster the convergence. Several examples are given.

1. INTRODUCTION

Iterative methods are useful for solving normal equations in two cases. First, when there are numerous factors with numerous levels, the design matrix is large, and an ordinary inversion needs a large memory and a long computation time. Moreover, if many models, not nested one within another, are to be fitted, there are as many matrix inversions as there are models; lack of space and of time then prevents handling such data on small computers. Second, when the matrix is ill-conditioned, iterative methods lead to an exact solution, whereas this is not the case for standard inversion methods.

Several authors have proposed iterative methods, notably Hemmerle (1974). Although his algorithm serves the same purpose, it is technically different from the algorithm studied in this paper, which was first proposed by Stevens (1948), then by Kuiper (1952) and Corsten (1958). These authors considered the two- or three-factor case without interaction. We generalise it to any model of analysis of variance and study its convergence properties. This is done in Section 2 with coordinate-free algebraic methods. In Section 3 we apply the results of Section 2 to the analysis of variance with classical notations. In Section 4 we give examples of the speed of convergence. Section 5 is devoted to a discussion of the merits and drawbacks of the method, compared with the standard ones.

2. ALGORITHM DEFINITION AND PROPERTIES

2.1. Algorithm for the projector

Let $E_1, \ldots, E_p$ be $p$ subspaces of the Euclidean space $E$, let $H = E_1 + E_2 + \cdots + E_p$, and let $P_i$ be the orthogonal projector on $E_i$. We want to obtain the orthogonal projector $P$ on $H$. Actually, we are more interested in the projection of a given element than in the projector itself.

Proposition 2.1: (i) and (ii) are equivalent:

(i) $P$ is the orthogonal projector on $H$;

(ii) $P_i \circ P = P_i$ for all $i = 1, \ldots, p$, and $\operatorname{Im} P = H$.

Proof: (i) $\Rightarrow$ (ii) is obvious.

(ii) $\Rightarrow$ (i): let $P$ and $P'$ be two applications satisfying (ii). Then

$P_i \circ (P - P') = 0, \qquad i = 1, \ldots, p,$

so that

$\operatorname{Im}(P - P') \subset \operatorname{Ker} P_i = E_i^{\perp}, \qquad i = 1, \ldots, p,$

where $E_i^{\perp}$ is the orthogonal complement of $E_i$. Since $H^{\perp} = \bigcap_i E_i^{\perp}$, this implies that $\operatorname{Im}(P - P') \subset H^{\perp}$, which, together with $\operatorname{Im}(P - P') \subset H$, implies that $P = P'$.
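As a quick numerical illustration (not part of the original paper), the characterization in Proposition 2.1 can be checked directly: take two subspaces of $R^n$ spanned by arbitrary basis matrices, build the projector on their sum from a combined basis, and verify that $P_i \circ P = P_i$ and that each $E_i$ lies inside $\operatorname{Im} P$. The basis matrices and dimensions below are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8

def projector(basis):
    """Orthogonal projector onto the column space of `basis`."""
    return basis @ np.linalg.pinv(basis)

# Two illustrative subspaces E_1, E_2 of R^n given by random bases.
B1 = rng.standard_normal((n, 3))
B2 = rng.standard_normal((n, 2))
P1, P2 = projector(B1), projector(B2)

# Projector on H = E_1 + E_2, built from the combined basis.
P = projector(np.hstack([B1, B2]))

# Proposition 2.1 (ii): P_i o P = P_i for each i, and Im P = H.
print(np.allclose(P1 @ P, P1), np.allclose(P2 @ P, P2))  # True True
print(np.allclose(P @ B1, B1), np.allclose(P @ B2, B2))  # each E_i lies inside Im P
```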


The algorithm for building $P$ consists in adjusting successively the iterated application $P^{(i)}$ to each equality $P_i \circ P = P_i$:

$P^{(1)} = P_1$

$P^{(i)} = P^{(i-1)} + P_i \circ (I - P^{(i-1)}), \qquad i = 2, \ldots, p.$
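The pages spelling out the full iterative scheme are not reproduced above, so the sketch below should be read as a minimal interpretation of the recursion: each projector is applied in turn to the current residual, the adjustment is accumulated, and the sweeps are repeated until a whole sweep no longer changes the fit. The function names, the stopping tolerance and the `group_mean_projector` helper (which plays the role of $P_i$ for one classification factor) are illustrative choices, not the paper's; no incidence matrix is ever formed.

```python
import numpy as np

def project_onto_sum(y, projectors, max_sweeps=100, tol=1e-8):
    """Iteratively approximate the projection of y onto H = E_1 + ... + E_p,
    given only the individual projectors P_i (supplied as functions).

    One sweep applies the recursion P^(i) = P^(i-1) + P_i o (I - P^(i-1)):
    each P_i is applied to the current residual and the adjustment is
    accumulated.  Sweeps are repeated until a full sweep changes the fit
    by less than `tol`.
    """
    y = np.asarray(y, dtype=float)
    comps = [np.zeros_like(y) for _ in projectors]   # running part of the fit in each E_i
    residual = y.copy()                              # y minus the current fit
    for _ in range(max_sweeps):
        largest = 0.0
        for i, P_i in enumerate(projectors):
            adj = P_i(residual)                      # adjustment enforcing P_i o P = P_i
            comps[i] += adj
            residual -= adj
            largest = max(largest, float(np.max(np.abs(adj))))
        if largest < tol:
            break
    return y - residual, comps                       # fitted projection and its decomposition


def group_mean_projector(labels):
    """Illustrative P_i for one classification factor: replace each observation
    by the mean of its group (projection onto the factor's indicator space)."""
    labels = np.asarray(labels)
    masks = [labels == g for g in np.unique(labels)]
    def P(v):
        out = np.zeros_like(v, dtype=float)
        for mask in masks:
            out[mask] = v[mask].mean()
        return out
    return P
```

For a two-factor layout one would call `project_onto_sum(y, [group_mean_projector(a), group_mean_projector(b)])`; the fitted vector is then the projection of $y$ on the model space, and the components give one decomposition of it over the factor spaces.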

It seems to be very stable over the iterations; the mean value of this sequence is denoted $\lambda$ in Table 3. Table 3 is concerned with additive models. If we consider models with all second-order interactions, convergence may be slower: for example, we obtain a precision of $10^{-3}$ on the estimated means with respectively 13 and 18 iterations for designs 5 and 6.

When we consider sums of squares, we need fewer iterations than for the estimated means, for convergence is geometric with rate $\lambda^2$ instead of $\lambda$. Therefore, if we are chiefly interested in sums of squares, the algorithm is faster. As an example of the usefulness of the algorithm, note that design 6 with all second-order interactions leads to an incidence matrix of size 226, which cannot be handled on a small-memory computer; we have, however, dealt with these data in 3 min on a computer of this kind.

Finally, note that if there are estimability problems we may detect them by using a different order of adjustment in the algorithm when stopping the iterations: an alteration of the parameters produced by this operation indicates estimability problems, and the linear functions of the parameters which remain stable are the estimable ones.
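The re-ordering check described in the last sentences can be automated. The sketch below is only one way to do so, written against the interface of the `project_onto_sum` sketch given earlier (any fitting routine that returns one component per projector will do); the helper name and the use of a random permutation are my own choices, not the paper's procedure.

```python
import numpy as np

def order_sensitivity(fit_components, y, projectors, seed=0):
    """Refit with the projectors taken in a shuffled order and report, for each
    subspace, how much its estimated component moves.  A large discrepancy
    signals a possible estimability problem; linear functions of the
    parameters that stay stable under re-ordering are the estimable ones.

    `fit_components(y, projectors)` must return one component per projector,
    e.g. the `comps` list of the earlier sketch.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(projectors))
    reference = fit_components(y, projectors)
    shuffled = fit_components(y, [projectors[i] for i in order])
    # Put the shuffled results back in the original subspace order before comparing.
    realigned = [None] * len(projectors)
    for position, i in enumerate(order):
        realigned[i] = shuffled[position]
    return [float(np.max(np.abs(a - b))) for a, b in zip(realigned, reference)]
```

With the earlier sketch one would pass `lambda y, ps: project_onto_sum(y, ps)[1]` as `fit_components`.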

5. CONCLUSIONS

Two steps may be distinguished in dealing with the analysis of variance. The first consists of the search for a model which fits the data well; at this step we only need estimated means and sums of squares. The second step is the analysis of the parameters of the model, which needs the variance matrix of the estimates. The algorithm studied here is well adapted to the first step and possesses the following merits compared with standard matrix inversion methods.

(1) The computations are faster: the number of elementary operations is between $t$ and $t^2$ for one global iteration, against $t^3/6$ for a direct inversion, where $t$ is the size of the matrix. As the number of iterations is generally between 2 and 30 for a reasonable precision, we see that the algorithm saves time, especially for large values of $t$, or if non-orthogonality is weak, for example if the design was initially orthogonal and a few experiments are missing.

(2) The algorithm allows us to save space, for the computations are easily divided into separate parts and we do not handle the incidence matrix. Therefore we may deal with large models on small-memory computers.

(3) If non-orthogonality is important, convergence may be slow and the computations may take longer than direct inversion. But in this case the matrix is ill-conditioned and the inversion does not give exact results, whereas the iterative algorithm does.

(4) The volume of computation is closely related to the difficulty of the problem: if the design is orthogonal, convergence is immediate; if the design has a marked non-orthogonality, convergence is slow. By contrast, the standard methods lead to the same computations for all models.

(5) If we only want to know the sums of squares, a small number of iterations is sufficient.

The drawbacks of the method are the following points:

(1) Only the estimated means are computed. Usual parameters under constraints of the type

$\sum_{i_1} \alpha_{i_1 i_2} = \sum_{i_2} \alpha_{i_1 i_2} = 0$

are easily computed from the estimated means (a small sketch follows this list), but parameters under constraints of the type

$\sum_{i_1} n_{i_1 i_2}\, \alpha_{i_1 i_2} = \sum_{i_2} n_{i_1 i_2}\, \alpha_{i_1 i_2} = 0$

are not directly derived from the estimated means. If these constraints are required, direct matrix inversion is a better method.

(2) The covariance matrix of the estimates is not computed by the iterative method.
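As a small illustration of the first drawback (my sketch, not code from the paper): for a two-way table of estimated cell means, the parameters under the unweighted constraints above are just centred row and column means, so they follow from the estimated means in a few lines; the weighted ($n_{i_1 i_2}$) constraints have no such direct form.

```python
import numpy as np

def two_way_parameters(cell_means):
    """Decompose an I x J table of estimated cell means m[i1, i2] into
    mu + a[i1] + b[i2] + ab[i1, i2] under the unweighted constraints
    sum_i1 a = sum_i2 b = 0 and sum_i1 ab[., i2] = sum_i2 ab[i1, .] = 0."""
    m = np.asarray(cell_means, dtype=float)
    mu = m.mean()                            # overall mean
    a = m.mean(axis=1) - mu                  # main effects of the first factor
    b = m.mean(axis=0) - mu                  # main effects of the second factor
    ab = m - mu - a[:, None] - b[None, :]    # interaction terms
    return mu, a, b, ab
```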

We think that we are not exaggerating the value of the iterative method, which may be very useful, especially for small computers.

REFERENCES

CORSTEN, Vectors, a tool in statistical regression theory. Mededelingen van de Landbouwhogeschool te Wageningen, Nederland, 58 (1958) 1-92.
DAUDIN, J. J., Etude de la liaison entre variables aléatoires. Régression sur variables qualitatives. Thèse de 3e cycle, Orsay, Paris XI, 1978.
DUBY, C. and MASSON, J. P., Etude des liaisons statistiques de test en analyse de variance non-orthogonale. Int. J. Bio-Med. Comput., 9 (1978) 45-71.
HEMMERLE, W. J., Non-orthogonal analysis of variance using iterative improvement and balanced residuals. J. Am. Statist. Assoc., 69 (1974) 772-778.
KUIPER, Variantie-analyse. Statistica, 6 (1952) 149-194.
STEVENS, Statistical analysis of a non-orthogonal tri-factorial experiment. Biometrika, 35 (1948) 346-367.
