Mutation Research, 253 (1991) 237-240 © 1991 Elsevier Science Publishers B.V. All rights reserved 0165-1161/91/$03.50


MUTENV 08807

Quantification of the predictivity of some short-term assays for carcinogenicity in rodents

Gilles Klopman a and Herbert S. Rosenkranz b

a Department of Chemistry, Case Western Reserve University, Cleveland, OH 44106 and b Department of Environmental and Occupational Health, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA 15261 (U.S.A.)

(Received 19 February 1991) (Revision received 17 July 1991) (Accepted 9 August 1991)

Keywords: Short-term assays, predictivity quantification; Predictivity, quantification, short-term assays

Summary A statistical procedure is described for assessing the predictive performance of short-term tests for carcinogenicity in which the actual number of chemicals tested is taken into consideration. The method is then applied to several widely used short-term assays.

Correspondence: Prof. Herbert S. Rosenkranz, Department of Environmental and Occupational Health, Graduate School of Public Health, University of Pittsburgh, 130 DeSoto Street, Pittsburgh, PA 15261 (U.S.A.).

Short-term tests have been valuable for predicting as well as understanding the carcinogenicity of chemicals. More recently, thanks to the data generated by the U.S. National Toxicology Program, there has been a recognition that the performance of short-term assays is at its best when the chemicals act by an electrophilic mechanism (Ashby and Tennant, 1991). The primary question then becomes how to express the performance of a short-term test as a predictor of carcinogenicity. Recently, sensitivity, specificity, positive predictivity, predictivity (in a Bayesian sense) and concordance have been used (Tennant et al., 1987; Zeiger et al., 1990; Ennever and Rosenkranz, 1989). What has been lacking, to a large extent, is a method that can be used to assess concordance and that also addresses, in a statistically acceptable fashion, the number of correct predictions that give rise to a particular concordance. Thus a concordance of 75% can result from the following ratios of correct predictions: 3/4, 39/52 and 75/100. Obviously the significance of the last is greater than that of the other two.

We describe herein a formulation (Klopman and Kolossvary, 1991) of the Chi² statistical test that appears appropriate for evaluating predictions, and we apply it to the analysis of some widely used short-term tests. We analyzed the Salmonella mutagenicity assay (STY), the induction of sister-chromatid exchanges (SCE), chromosomal aberrations (CVT) and unscheduled DNA synthesis (UDS) by comparing their results to carcinogenicity in rodents. The experimental results for STY, SCE and CVT were generated under the aegis of the U.S. National Toxicology Program (NTP) and compared to the results of NTP cancer bioassays (Tennant et al., 1987; Ashby and Tennant, 1988; Ashby et al., 1989). The UDS results and the corresponding cancer bioassay data were taken from Williams et al. (1989).

For illustrative purposes, let us define some of the quantities used for the analyses. Let the percentage of observed correct predictions (OCP) be defined as the sum of the correct predictions of carcinogenicity (TP) or lack thereof (TN) divided by the total number of predictions made. Let also the expected percentage of correct predictions (ECP) be the percent 'correct' predictions to be expected when a data base containing a fraction X of active molecules is predicted by a random number generator engineered to produce a fraction Y of actives. ECP can be calculated by the formula:

$$\mathrm{ECP} = (1 + 2XY - X - Y) \times 100 \qquad (1)$$

where X and Y values are between 0 and 1. The probability that the observed percentage of correct predictions, OCP, is due to chance is given by standard Chi² distribution function tables (Bailey, 1971).

$$\mathrm{Chi}^2 = N \cdot \mathrm{Phi}^2 \qquad (2)$$

where N is the number of molecules in the data base; Phi² measures the accuracy of the method with respect to expectations from random assignment of activity, and is equal to 0 if OCP = ECP and to 1 for a perfect fit.

$$\mathrm{Phi}^2 = \frac{TP^2}{SC1 \cdot SR1} + \frac{TN^2}{SC2 \cdot SR2} + \frac{FP^2}{SC1 \cdot SR2} + \frac{FN^2}{SC2 \cdot SR1} - 1 \qquad (3)$$

where TP are true positives, TN are true negatives, FP are false positives, FN are false negatives and SR1 = TP + FN; SR2 = FP + TN; SC1 = TP + FP and SC2 = FN + TN.

A value of Chi² of 3.84 indicates that there is a 5% probability that the observed concordance (i.e., % of correct predictions) is due to chance (confidence level 95%); a value of Chi² of 6.63 is needed to bring it down to 1% (confidence level 99%). The value of 3.84 (conf. level = 95%) could reasonably be taken as the lower limit of acceptability of predictions. Indeed, it would take only 20 computer-generated random predictions to do as well as or better than such an assay. The data for the four short-term assays are summarized in Table 1.
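As a minimal sketch of how the quantities defined above fit together, the following Python function computes OCP, ECP, Phi² and Chi² from the four cells of a 2×2 table. The function name `assay_statistics` and the dictionary keys are illustrative choices, not part of the original method description. The last term of eq. 3 is taken as a subtracted 1, which is what makes Phi² vanish for a random assignment, and the sketch assumes that no row or column total is zero.

```python
def assay_statistics(tp, fp, fn, tn):
    """Compute OCP, ECP, Phi^2 and Chi^2 (eqs. 1-3) from a 2x2 table.

    Assumes every row and column total is non-zero.
    """
    n = tp + fp + fn + tn
    sr1, sr2 = tp + fn, fp + tn      # carcinogens, non-carcinogens
    sc1, sc2 = tp + fp, fn + tn      # test positive, test negative
    x = sr1 / n                      # fraction of actives in the data base
    y = sc1 / n                      # fraction called active by the test
    ocp = 100.0 * (tp + tn) / n      # observed correct predictions, percent
    ecp = 100.0 * (1 + 2 * x * y - x - y)                      # eq. 1
    phi2 = (tp**2 / (sc1 * sr1) + tn**2 / (sc2 * sr2)
            + fp**2 / (sc1 * sr2) + fn**2 / (sc2 * sr1) - 1)   # eq. 3
    chi2 = n * phi2                                            # eq. 2
    return {"n": n, "ocp": ocp, "ecp": ecp, "phi2": phi2, "chi2": chi2}


# Salmonella (STY) counts as listed in Table 1
sty = assay_statistics(tp=93, fp=24, fn=72, tn=64)
print(f"N = {sty['n']}, OCP = {sty['ocp']:.1f}%, "
      f"ECP = {sty['ecp']:.1f}%, Chi2 = {sty['chi2']:.1f}")
```

With the STY counts, the sketch gives values close to those reported in Table 1 (Chi² near 19.5, ECP near 48.9%, OCP near 62%).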

TABLE 1
COMPARISON BETWEEN CARCINOGENICITY AND FOUR SHORT-TERM ASSAYS a

                              STY             SCE             CVT             UDS
                           Ca+    Ca-      Ca+    Ca-      Ca+    Ca-      Ca+    Ca-
Test +                      93     24      145     78      124     69       55      9
Test -                      72     64       20     10       41     19       48     18
Total                      165     88      165     88      165     88      103     27
Test +/Total              0.56   0.27     0.88   0.89     0.75   0.78     0.53   0.33
Chi²                         19.5            0.031           0.336           3.45
Expected Corr. Pred.        48.9%           61.6%           58.0%           49.6%
OCP (Obs. concordance)      62.0%           61.3%           56.5%           56.2%

Ca, carcinogenicity in rodents.
a While, for the most part, the STY, CVT and SCE tests and the corresponding carcinogenicity data were obtained under the aegis of NTP bioassay programs, the UDS data and the corresponding carcinogenicity data are taken from Williams et al. (1989). For the present computation it is assumed that a chemical listed as 'I' (inadequate evidence) is a non-carcinogen. Essentially similar conclusions are reached when the analyses are restricted to a smaller subset of NTP data obtained using stricter criteria for the acceptability of positive responses for SCE and CVT.
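The Chi² values in Table 1 can be converted into chance probabilities without printed tables. The sketch below assumes one degree of freedom, which is what the quoted thresholds (3.84 for 5%, 6.63 for 1%) correspond to for a 2×2 table; in that case the upper-tail probability reduces to the complementary error function. The helper name `chance_probability` is hypothetical.

```python
import math


def chance_probability(chi2):
    """Upper-tail probability of a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Chi2 values as listed in Table 1
table1_chi2 = {"STY": 19.5, "SCE": 0.031, "CVT": 0.336, "UDS": 3.45}

for assay, chi2 in table1_chi2.items():
    p = chance_probability(chi2)
    print(f"{assay}: Chi2 = {chi2:<6} p(chance) = {p:.2g} "
          f"confidence = {100 * (1 - p):.2f}%")
```

Under this assumption, only the STY value clears the 3.84 cut-off; the UDS value falls just short of it.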


The Chi² value for the prediction of carcinogenicity by the Salmonella mutagenicity test is 19.5, while the concordance, OCP, is 62%. This indicates that the Salmonella test results are indicative of carcinogenicity or lack thereof in a little less than 2 out of 3 cases (see also Klopman and Kolossvary, 1991; Ashby and Tennant, 1988; Ashby et al., 1989; Rosenkranz and Klopman, 1991). Random selection (ECP), though, would only have given 49% in this case, i.e., about 1 out of 2 correct answers. More important is the fact that the Chi² value places the confidence in these predictions at better than the 99.99 percentile range, which means that the computer-generated random assignment of activity would have to venture more than 10⁴ sets of guesses before it would produce such a good fit for the chemicals in the data base (N = 253).

The performance of the other tests as analyzed by this approach is much poorer. Thus, a comparison between the results of the SCE test and carcinogenicity in rodents shows a Chi² value of essentially 0 (0.031 in Table 1) but a concordance index, OCP, of 61.2%. While one may think that a concordance of 61.2% is significant, one cannot escape the fact that this Chi² value indicates a 0% confidence level. This means that 50% of the computer-generated random guesses of the carcinogenicity of the chemicals in the data set will be statistically as good and 50% will be better than the results of the SCE test. This indicates that by these criteria the test is not useful for predicting carcinogens. One may wonder why, then, there was a concordance of 61.2%. A quick examination of the data shows that this result is attributable entirely to the fact that SCE shows considerably more positive responses than negative ones, and since the data base contains a large number of carcinogens, there is naturally a large concordance. Actually, if we were to call all chemicals carcinogens, the concordance would climb to 65.2% even though the usefulness of the prediction is nil.
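The all-carcinogen case follows directly from eq. 1. As a short worked step, assume a test that calls every chemical positive (Y = 1) and take X = 0.652, the fraction of carcinogens in the data base (165 of 253):

```latex
\mathrm{ECP} = (1 + 2XY - X - Y) \times 100
             = (1 + 2X - X - 1) \times 100
             = 100\,X
             = 65.2
```

In other words, a test that is positive for everything scores a concordance equal to the prevalence of carcinogens while carrying no information about any individual chemical.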
The SCE result can be quantified by noting that the ECP index, as calculated by eq. 1, corresponding to a random call of a data base containing 65.2% carcinogens by a test set containing 88.1% SCE-inducing chemicals is 61.6%, actually somewhat better than the observed 61.2%. The reader may verify this by noting in Table 1 that, while the probability of a carcinogenic molecule being SCE-inducing is 88%, the probability for an inactive one is about the same, i.e., 89%. Thus it is clear that in a statistical sense SCE cannot differentiate between carcinogens and non-carcinogens (see also Rosenkranz et al., 1990). This does not deny that there may be an empirical, mechanistic or even causal relationship between the induction of SCE and carcinogenicity (Rosenkranz et al., 1991), just that it is not one that is significant by the present criteria.

The situation is similar with CVT. Here, the Chi² is found to be 0.33, which indicates a confidence level of about 50%. In other terms, one out of two sets of random guesses of the activity of the molecules in the data base will be as good as or better than CVT as a predictor of carcinogenicity. In addition, the concordance between the results of the CVT test and the rodent carcinogenicity is only 56.5%, actually less than the 58% that would have been obtained from a random guess of the activity of the molecules based on a distribution of 76.3% actives. Thus the reason why CVT had 'some' statistical significance is that it is worse than random. We must therefore conclude that in a statistical sense the association between response in CVT and carcinogenicity is not a compelling one, although this does not negate that there may be a mechanistic correlation between CVT and rodent carcinogenicity for certain classes of chemicals (Rosenkranz et al., 1990, 1991).

The situation for UDS is slightly better, although the data base is less rigorous and the number of compounds constituting it is smaller. Here, OCP is found to be 56.1%, which is somewhat better than the ECP of 50%. However, Chi² is only 3.44 (93% confidence level), which is slightly below the 95% confidence level that was selected as the cut-off.

The suggestion has been made that animal bioassays are not sufficiently predictive of human concerns and moreover that they are uneconomical (Lave et al., 1988).
These factors together with their almost universal availability led to the development and acceptance of surrogate short-term tests. In the last several years, however, it has become recognized that some of the tests may not be highly predictive (Tennant et al., 1987; Ashby and Tennant, 1988; Zeiger, 1987; Zeiger et al., 1990; Rosenkranz et al., 1991). This is confirmed, independently, by the present study.

Finally it may be noted that even when the best of the surrogate tests studied herein is used, i.e., Salmonella, 52.9% of the non-mutagenic chemicals are still found to be rodent carcinogens. These have been designated 'non-genotoxic' carcinogens (Ashby and Tennant, 1988) and are presumed to act through a different mechanism.

It must be noted that the analyses presented herein do not deal with the predictivity of tests for human cancers but are restricted to the prediction of carcinogenicity in rodents.

This investigation was supported by the U.S. Environmental Protection Agency (R815488) and the National Institute of Environmental Health Sciences (ES04659).

References

Ashby, J., and R.W. Tennant (1988) Chemical structure, Salmonella mutagenicity and extent of carcinogenicity as indicators of genotoxic carcinogenesis among 222 chemicals tested in rodents by the U.S. NCI/NTP, Mutation Res., 204, 17-115.
Ashby, J., and R.W. Tennant (1991) Definitive relationships among chemical structure, carcinogenicity and mutagenicity for 301 chemicals tested by the U.S. NTP, Mutation Res., 257, 229-306.
Ashby, J., R.W. Tennant, E. Zeiger and S. Stasiewicz (1989) Classification according to chemical structure, mutagenicity to Salmonella and level of carcinogenicity of a further 42 chemicals tested for carcinogenicity by the U.S. National Toxicology Program, Mutation Res., 223, 73-103.
Bailey, D.E. (1971) Probability and Statistics Models for Research, Wiley, New York.
Chankong, V., Y.Y. Haimes, H.S. Rosenkranz and J. Pet-Edwards (1985) The carcinogenicity prediction and battery selection (CPBS) method: A Bayesian approach, Mutation Res., 153, 135-166.
Ennever, F.K., and H.S. Rosenkranz (1989) Application of the carcinogenicity prediction and battery selection (CPBS) method to recent National Toxicology Program short-term test data, Environ. Mol. Mutagen., 13, 332-338.
Klopman, G., and I. Kolossvary (1990) Evaluation of quantitative structure-activity predictions, Comparison of the predictive power of an artificial intelligence system with human experts, J. Math. Chem., in press.
Klopman, G., M.R. Frierson and H.S. Rosenkranz (1990) The structural basis of the mutagenicity of chemicals in Salmonella typhimurium: The Gene-Tox Data Base, Mutation Res., 228, 1-50.
Lave, L.B., F.K. Ennever, H.S. Rosenkranz and G.S. Omenn (1988) Information value of the rodent bioassay, Nature (London), 336, 631-633.
Rosenkranz, H.S., and G. Klopman (1990a) Structural basis of carcinogenicity in rodents of genotoxicants and nongenotoxicants, Mutation Res., 228, 105-124.
Rosenkranz, H.S., and G. Klopman (1990b) The structural basis of the mutagenicity of chemicals in Salmonella typhimurium: The National Toxicology Program data base, Mutation Res., 228, 51-80.
Rosenkranz, H.S., G. Klopman, V. Chankong, J. Pet-Edwards and Y.Y. Haimes (1984) Prediction of environmental carcinogens: A strategy for the mid 1980's, Environ. Mutagen., 6, 231-258.
Rosenkranz, H.S., F.K. Ennever and G. Klopman (1990) Relationship between carcinogenicity in rodents and the induction of sister chromatid exchanges and chromosomal aberrations in Chinese hamster ovary cells, Mutagenesis, 5, 559-571.
Rosenkranz, H.S., Y.P. Zhang and G. Klopman (1991) Implications of newly recognized relationships between mutagenicity, genotoxicity and carcinogenicity of molecules, Mutation Res., in press.
Tennant, R.W., B.H. Margolin, M.D. Shelby, E. Zeiger, J.K. Haseman, J. Spalding, W. Caspary, M. Resnick, S. Stasiewicz, B. Anderson and R. Minor (1987) Prediction of chemical carcinogenicity in rodents from in vitro genotoxicity assays, Science, 236, 933-941.
Williams, G.M., H. Mori and C.A. McQueen (1989) Structure-activity relationships in the rat hepatocyte DNA-repair test for 300 chemicals, Mutation Res., 221, 263-286.
Zeiger, E. (1987) Carcinogenicity of mutagens: Predictive capability of the Salmonella mutagenesis assay for rodent carcinogenicity, Cancer Res., 47, 1287-1296.
Zeiger, E., J.K. Haseman, M.D. Shelby, B.H. Margolin and R.W. Tennant (1990) Evaluation of four in vitro genetic toxicity tests for predicting rodent carcinogenicity: Confirmation of earlier results with 41 additional chemicals, Environ. Mol. Mutagen., 16, Suppl. 18, 1-14.
