EFFICIENT SEARCH PROCEDURES FOR EXTREME POLLUTANT VALUES

DON CASEY, PETER N. NEMETZ, and DEAN UYENO*
The University of British Columbia
(Received 23 October, 1983; revised April, 1984)

Abstract. Extreme pollutant values are of great interest in water quality monitoring because of their frequent toxicological significance. The principal barrier to the detection of these values, however, is the cost of extensive and comprehensive monitoring. This paper demonstrates an efficient method to determine the maximum sample measurement from a finite set of sequential samples without explicitly testing them all. It is assumed that the process of sample measurement is distinct from collection and has higher costs. It is further assumed that the measurements have high positive autocorrelation. A methodology is presented based on a common industrial testing procedure referred to as composite sampling: the physical pooling, or compositing, of a set of sequential samples before measurement. A method known as primary first order compositing (PFOC) was found to be superior to the traditional technique of random sampling, particularly if small composite sizes are utilized.
1. Introduction
Pollution control standards frequently specify both time-dependent arithmetic pollutant means and instantaneous maximum permitted values. Because of their potential toxicological significance, knowledge about extreme values is particularly important for the purpose of enforcing water quality standards. The maximum pollutant concentration should either be an observed value or be estimated from other measurements. The principal barrier to the detection of violations is the cost of extensive and comprehensive monitoring. Due to the random component of the data, no method exists that will find the maximum concentration with certainty unless continuous monitoring is used. Any sampling method adopted must be able to identify the maximum pollutant concentration a large proportion of the time. Failing this identification, the method must be able to signal the existence of excessive pollutant levels by finding some 'large' value. In this paper, a method is developed which can estimate the maximum from a finite set of sequential samples without testing all samples. The following assumptions are used: (1) the process of collecting samples is distinct from their measurement; (2) the cost of sample measurement is high relative to that of collection; and (3) the sample measurements have high positive autocorrelation. There are many situations in water, air, and industrial process monitoring where the collection and testing of samples are distinct. An incremental change in the number of tests performed is important because the cost of testing is typically much larger than
* The authors are, respectively, operations research analyst, Vancouver, B. C.; Associate Professor and Chairman, Policy Analysis Division, Faculty of Commerce and Business Administration, University of British Columbia; and Associate Professor, Management Science Division, Faculty of Commerce, U.B.C.
Environmental Monitoring and Assessment 5 (1985) 165-176. © 1985 by D. Reidel Publishing Company.
that of sampling if laboratory analysis is required. The assumption of high positive autocorrelation is fundamental to the method developed in this paper. It permits the estimation of information about some of the pollutant samples which are collected but not tested.

2. Relevant Literature

There is very little literature about the detection of individual violations of water quality standards (Beckers et al., 1972; Beckers and Chamberlain, 1974; Casey et al., 1983; Chamberlain et al., 1974). Ward (1973) and others (Curtis, 1976; Sanders and Adrian, 1978; Sherwani and Moreau, 1975) examine the issue of sample size, while Loftis and Ward consider the variance of the sample mean under serial correlation (Loftis and Ward, 1978) and the effect of various statistical assumptions on confidence intervals (Loftis and Ward, 1980a, b). No articles were found which deal directly with the statistical and efficiency issues concerning the detection of maximum pollutant measurements. Somewhat related articles were found in the statistical literature. A number referred to the detection of defective units in a binomial population by group testing (Hwang, 1972 and 1975; Kumar and Sobel, 1971; Garey and Hwang, 1974; Graft and Roeloffs, 1974; Pfeifer and Enis, 1978). Group testing, the simultaneous testing of a group of samples, was used to determine if all units in a group were satisfactory or if at least one unit was defective. All units in the groups deemed defective would then be individually tested. This approach is applicable only if there is a clear and measurable definition of what constitutes a defective unit. In our case we could not have a preset notion of what the maximum value should be, effectively eliminating this form of screening. Furthermore, it was felt that consideration of autocorrelation in a sequence of samples would lead to a more effective sampling plan.
Several investigators have considered the problem of the estimation of the mean of a characteristic of composited samples (Reed and Rigney, 1947; Brown and Fisher, 1972). Edelman (1974) has examined estimation of means and variances under three-stage nested designs for compositing. Sobel and Tong (1976) applied group testing to the estimation of percentiles for a normal population with unknown mean and variance. We were unable to find an article related to the determination of the maximum sample measurement, a key environmental indicator.

3. Composite Methods

The procedure presented in this paper to detect the maximum comes from a class of techniques known as composite methods (Casey, 1982). Composite sampling, a common practice in water pollution monitoring, involves the physical pooling of a set of sequential samples prior to measurement. The result of this process is an arithmetic average of the samples that were composited. Assume the N samples are aggregated into sequential groups of m samples each. Then, within each group, a fixed portion of each sample is pooled
to form a total of [N/m]* composite samples which are subsequently measured. In the presence of high positive autocorrelation, the maximum concentration among the N samples will tend to be surrounded by samples with high values. As a consequence, the composite sample that contains the maximum sample value will also tend to have a relatively high measurement. This suggests that the search for the maximum can be concentrated among the individual observations that formed the composites with the highest measured levels. Only a portion of each individual sample may be pooled when forming the composites, since the remaining portion must be retained for later analysis once the maximum among the composites is identified. An unbiased estimate of the population mean may be obtained by averaging the measurements of the [N/m] composites. Moreover, if σ² is the population variance, then the variance of the estimate of the population mean will be σ²/N even though only [N/m] samples are measured (Brumelle et al., 1984).** Primary first order compositing (PFOC) consists of several steps. Initially, the composites are formed and measured, and the composite with the maximum level is identified. Then, all the samples that formed this composite are measured. The maximum of these sample measurements is the estimate of the maximum for all samples. The word 'primary' refers to the fact that only the samples that form the composite with the maximum measurement are analysed further. This represents the primary choice of samples. The term 'first order' is used because no further compositing is performed on the samples that remain; that is, the compositing procedure is applied only once. A major issue is how to group a given set of samples to form composites. Casey et al. (1982) offer theoretical and empirical evidence that all composites should be of equal size, a situation called 'balanced compositing'. This restriction is consequently followed in this paper.

4. The Data
Walden et al. (1971) have shown that specific conductivity, the ability of a given substance to conduct electric current, is closely related to pulp mill effluent toxicity. Specific conductivity data are used to measure the loss of process chemicals. Recorded data on this variable were obtained from a pulp mill in British Columbia for a period of 2009 hr (Nemetz and Drechsler, 1978). This information is stored on computer files in the form of 120,540 minute-by-minute observations. These data do not satisfy the assumption that the cost of measurement be expensive relative to sampling. This assumption, however, only specifies the situation in which the method will be cost-effective. The data showed high positive autocorrelation. The autocorrelation function was calculated for each of the 2009 hr separately and the values at each lag were then averaged. The resulting 'average autocorrelation function' appears in Figure 1. The standard error was calculated for each averaged value and the resulting 95% confidence interval is also indicated. The average first order autocorrelation is 0.869, decreasing monotonically to 0.177 by the tenth lag.

* [x] stands for the smallest integer ≥ x.
** The existence of autocorrelation implies that the observations are not statistically independent. However, the sample mean will still be an unbiased estimator and the variance can be estimated with a slight modification to the expression provided by Loftis and Ward (1980a).

Fig. 1. Average autocorrelation function, with 95% confidence interval (lag in minutes).

5. Measures of Error

The random nature of the data guarantees that no method other than exhaustive testing will always find the maximum value. Thus, a measure of this error is required. Four criteria were chosen to demonstrate the efficiency of the compositing methodology: (1) the proportion of trials in which the maximum was found (P), (2) the mean absolute range error (MARE), (3) the maximum absolute deviation (MAD), and (4) the mean square error (MSE). Each of these measures of error has distinct characteristics. Over n trials, the proportion P is the success rate in finding the actual highest measurement. However, it provides no information about the magnitude of the deviations between the estimated and actual maximums when the highest measurement is not detected.
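The averaging of per-hour autocorrelation functions described in Section 4 can be sketched as follows. This is only a sketch: the hourly data layout, the function name, and the use of the standard sample ACF estimator are our assumptions, as the paper does not specify its exact estimator.

```python
import numpy as np

def average_autocorrelation(hourly_blocks, max_lag=10):
    """Average autocorrelation function over a collection of hourly blocks.

    hourly_blocks: iterable of 1-D arrays, one block of samples per hour.
    Returns an array of length max_lag: the autocorrelation at each lag,
    computed per block and then averaged across blocks.
    """
    acfs = []
    for block in hourly_blocks:
        x = np.asarray(block, dtype=float) - np.mean(block)
        denom = np.dot(x, x)  # lag-0 sum of squares
        acf = [np.dot(x[:-lag], x[lag:]) / denom for lag in range(1, max_lag + 1)]
        acfs.append(acf)
    # Average the value at each lag over all blocks, as in the paper.
    return np.mean(acfs, axis=0)
```

A standard error at each lag (for the 95% confidence band of Figure 1) could then be taken across the per-block values before averaging.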
The mean absolute range error (MARE) is defined for a single trial as:

MARE = |Actual Maximum - Estimate of Maximum| / (Actual Maximum - Actual Minimum)

MARE measures the absolute deviation as a proportion of the maximum possible error. The maximum absolute deviation (MAD) is the largest error (actual maximum - estimated maximum) over the set of trials. It is a measure of worst-case performance and is sensitive to extreme values in the data. The mean square error (MSE) is the average of the squared differences of actual and estimated maximums. The MSE assigns greater weight to larger errors.
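The four criteria can be collected in a short sketch; the function name and array-based layout are ours, not the paper's.

```python
import numpy as np

def error_measures(actual_max, actual_min, estimates):
    """Compute P, MARE, MAD, and MSE over a set of trials.

    actual_max, actual_min: per-trial true maximum and minimum.
    estimates: per-trial estimated maximums (never exceeding the true maximum).
    """
    actual_max = np.asarray(actual_max, dtype=float)
    actual_min = np.asarray(actual_min, dtype=float)
    estimates = np.asarray(estimates, dtype=float)
    errors = actual_max - estimates
    P = np.mean(errors == 0)                              # proportion of exact finds
    MARE = np.mean(np.abs(errors) / (actual_max - actual_min))
    MAD = np.max(np.abs(errors))                          # worst case over trials
    MSE = np.mean(errors ** 2)                            # weights large errors more
    return P, MARE, MAD, MSE
```

Note that because each estimate is itself one of the sample measurements, the error `actual_max - estimate` is non-negative by construction.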
6. A Comparative Measure of Performance

The data chosen for this study exhibit high positive autocorrelation. This is typical of pollution data, as noted by several authors (Curtis, 1976; Sanders and Adrian, 1978; Loftis and Ward, 1978, 1980a, b). Random sampling is used to provide a base for comparison with primary first order compositing.

7. Experimental Plan

The data were available on a minute-by-minute basis, and each trial consisted of estimating the maximum from a set of 60 samples. This procedure generated a reasonable number of samples to handle from the standpoint of testing and storage and provided a desirably large sample size. Casey (1982) provides empirical evidence that the time between samples is irrelevant, performance being dictated by the autocorrelation between samples. With the sample size fixed, balanced composites were chosen, composed of 2, 3, 4, 5, 6, 10, 12, 15, 20, and 30 subsamples. Using these ten compositing regimes, PFOC and random sampling were applied to each of the 2009 one-hour blocks of samples and evaluated by the four measures of error.

8. Results

This section assesses the performance of primary first order compositing using the four criteria: P, MARE, MAD, and MSE.

8.1. THE PROPORTION OF TRIALS IN WHICH THE MAXIMUM WAS FOUND (P)

The performance of primary first order compositing appears in Table I. PFOC performs substantially better than random sampling for every composite size, and finds the actual maximum in at least 33% more trials than does random sampling, with the difference as high as 45%. The differences are significant at a confidence level exceeding 0.9999. There are always two distinct balanced composite sizes that result in the same number
TABLE I
Comparison of PFOC with random sampling using P and MARE

Number of tests  Composite size  PFOC P  Random P  PFOC MARE (std. error ×10⁻²)  Random MARE (std. error ×10⁻²)
32               2               0.83    0.50      0.010 (0.097)                 0.049 (0.272)
32               30              0.82    0.49      0.039 (0.276)                 0.049 (0.276)
23               3               0.75    0.38      0.022 (0.168)                 0.077 (0.346)
23               20              0.77    0.39      0.047 (0.285)                 0.074 (0.331)
19               4               0.75    0.30      0.036 (0.191)                 0.091 (0.350)
19               15              0.73    0.29      0.052 (0.299)                 0.091 (0.348)
17               5               0.72    0.28      0.033 (0.220)                 0.108 (0.391)
17               12              0.72    0.28      0.050 (0.285)                 0.104 (0.380)
16               6               0.71    0.25      0.039 (0.242)                 0.111 (0.387)
16               10              0.71    0.27      0.048 (0.278)                 0.110 (0.388)
of tests being performed. For example, composite sizes of 2 and 30 subsamples both result in a total of 32 tests being performed per trial. The greater the number of samples tested, the larger the proportion of trials in which the sample with the maximum measurement was found. The within-pair proportions are, however, very similar, with none of the differences being significant at an α level of 0.05. This suggests that the proportion of successes may be a function solely of the number of tests performed.

8.2. MEAN ABSOLUTE RANGE ERROR (MARE)
The mean absolute range error (MARE) also appears in Table I. PFOC performs substantially better than random sampling for every composite size. In the worst case, represented by a composite size of 30, the hypothesis that the mean value of the MARE for PFOC was smaller than that for random sampling would be accepted at an α level of 0.01. For all other composite sizes, the α-level is less than 0.0001. Random sampling also exhibits a somewhat larger standard error. These results indicate that even when PFOC does not find the maximum pollutant level, it does seem to identify one of the larger values. The value of the MARE for the smaller composite size within each pair is statistically less, with a significance probability exceeding 0.01. Thus, smaller composite sizes perform slightly better with respect to this statistic. The variability of the MARE is also less for the smaller composite sizes. This, too, is desirable, as the probability of large errors will be less.

8.3. MAXIMUM ABSOLUTE DEVIATION (MAD)
The value of the MAD for each composite size is presented in Table II for both PFOC and random sampling. Clearly, the MAD for PFOC is much less than for random
TABLE II
Comparison of PFOC with random sampling using MAD

Number of tests  Composite size  PFOC MAD (record no.)  Random MAD (record no.)
32               2               0.036 (449)            0.236 (418)
32               30              0.110 (1331)           0.195 (1579)
23               3               0.037 (1018)           0.237 (418)
23               20              0.110 (1331)           0.195 (1579)
19               4               0.039 (1018)           0.237 (418)
19               15              0.110 (1331)           0.238 (418)
17               5               0.055 (446)            0.237 (418)
17               12              0.085 (1142)           0.109 (1331)
16               6               0.058 (1906)           0.240 (418)
16               10              0.085 (1142)           0.195 (1579)
sampling. This implies that PFOC will always find a 'large' value and thus provide a more reliable estimate. A more interesting picture emerges when one examines the record number, or hour, in which the MAD occurred. Two records, No. 418 and No. 1579, account for all of the large absolute deviations incurred by the random sample method. These records do not, however, cause the same difficulties for PFOC. To understand why PFOC is much superior in these cases, consider the graphs of the data from these hours that appear in Figures 2 and 3. Both data series are characterized by very low initial 'pollutant' levels followed by a single extreme value which, in turn, is succeeded by moderately low values that steadily decrease. By sampling randomly, there is a large probability that the sample with the extreme value will not be selected. For instance, if 20 samples are selected from each hour, the probability of not finding both of the extremes and, hence, incurring a large error, is 1 - (1/3)² = 8/9. This probability will increase as the number of trials with an extreme value increases. For PFOC, the consequence of extreme pollutant levels is entirely opposite. The single extreme value will tend to dominate the composite that includes it, resulting in the selection of that composite and the subsequent location of the sample with the extreme measurement. In fact, the larger the extreme value, the more likely it is to be detected. This is a highly desirable property and represents a significant advantage over random sampling. In Table II it can be seen that the smaller composite sizes have the lowest MAD. This is directly related to the detection of extreme values. The smaller the composite size, the more overwhelming the effect of the extreme value on the composite measurement. Thus, the smaller composites will be more likely to detect even moderately high spikes.
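The random-sampling miss probability cited above can be checked directly; this minimal sketch assumes exact sampling of 20 of an hour's 60 samples without replacement.

```python
from math import comb

# Probability that a simple random sample of 20 out of 60 misses one
# particular sample (the hour's single extreme value): C(59,20)/C(60,20).
p_miss_one = comb(59, 20) / comb(60, 20)   # = 40/60 = 2/3

# With one extreme in each of the two problem hours, sampled independently,
# the chance of selecting both is (1/3)**2, so the chance of missing at
# least one of them (and hence incurring a large error) is:
p_large_error = 1 - (20 / 60) ** 2          # = 8/9
```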
Fig. 2. Record 418: September 18, hour 10 (specific conductivity by minute).
Fig. 3. Record 1579: November 10, hour 19 (specific conductivity by minute).
8.4. MEAN SQUARE ERROR (MSE)
The average levels and standard errors for PFOC and random sampling are presented in Table III. For every composite level, the MSE for PFOC is less than that for random sampling. This error measure also exhibits a large degree of variability for random sampling. Since the MSE is sensitive to large values, these results indicate that random sampling produces far more 'large' errors. However, the patterns of the fluctuations for both random sampling and primary first order compositing bear a strong similarity to the behavior of their respective maximum absolute deviations. This suggests that the MSE is being overwhelmed by these extreme values. To assess this impact, all the records on which a MAD was incurred (Table II) were eliminated and the MSE was recalculated. These results also appear in Table III. The initial conclusions still apply. The MSE for random sampling exceeds that for primary first order compositing at every composite size, and random sampling still exhibits much larger variability. Smaller composites tend to perform better on the MSE criterion. For the pairs 2 and 30, 3 and 20, and 4 and 15, the MSE for the smaller composite size is statistically less than that for the larger composite at an α level of 0.05. The same cannot be said for the pairs 5 and 12, and 6 and 10, however. The standard errors of the MSE also appear in Table III. In addition to the smaller composite sizes having lower average levels of MSE, the variability of the error is also less. This combination of lower mean levels of the MSE coupled with lower variability indicates that, at least with respect to this statistic, smaller balanced composite sizes should be preferred.
9. Extensions
Further research is required to determine the effect of the autocorrelation function on the performance of composite methods. This might take the form of a Monte Carlo simulation study in which data would be generated according to a broad range of autocorrelation functions. Several alternative composite methods should also be considered. Improved performance in detecting extreme values would be guaranteed if sample examination were not confined to the composite with the highest measurement. The logical extension would include an analysis of the samples from the second highest composite, and the approach could be incrementally extended to other composites with lower values. The resulting increase in performance would, of course, entail higher laboratory and related testing costs. An alternate approach might be to first isolate the samples that form the composite with the highest measurement and then apply primary first order compositing to these samples. This method is referred to as primary second order compositing because compositing is performed twice. The method could be extended by increasing the number of times compositing was undertaken or by applying primary second order compositing to more than one initial composite.
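The first extension described above, retesting the samples of the k highest composites rather than only the highest, might look as follows. This is a hypothetical sketch: the function `pfoc_top_k`, its signature, and the mean-of-samples composite model are our assumptions, not a method evaluated in the paper.

```python
import numpy as np

def pfoc_top_k(samples, m, k=2):
    """Sketch of an extended PFOC: individually retest the samples of the
    k composites with the highest measurements, at a cost of N/m + k*m tests.

    samples: 1-D array of N sequential samples (N divisible by m).
    m: composite size; k: number of top composites to examine.
    """
    N = len(samples)
    groups = samples.reshape(N // m, m)
    means = groups.mean(axis=1)          # composite measurements
    top = np.argsort(means)[-k:]         # indices of the k highest composites
    return groups[top].max()
```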
TABLE III
Comparison of PFOC with random sampling using MSE

All records (2009 hr):

Number of tests  Composite size  PFOC MSE ×10⁻⁵ (std. error)  Random MSE ×10⁻⁵ (std. error)
32               2               0.929 (0.248)                8.663 (3.605)
32               30              1.586 (0.492)                5.553 (2.162)
23               3               0.676 (0.227)                9.363 (2.564)
23               20              1.189 (0.429)                4.307 (1.115)
19               4               0.394 (0.131)                9.513 (3.639)
19               15              2.690 (0.948)                5.297 (2.924)
17               5               0.319 (0.124)                7.809 (3.517)
17               12              2.492 (0.899)                5.641 (2.165)
16               6               0.102 (0.066)                8.459 (3.655)
16               10              2.773 (0.959)                4.060 (2.018)

MAD records removed (1999 hr):

Number of tests  Composite size  PFOC MSE ×10⁻⁵ (std. error)  Random MSE ×10⁻⁵ (std. error)
32               2               0.474 (0.146)                3.333 (1.050)
32               30              0.401 (0.268)                3.209 (1.021)
23               3               0.389 (0.142)                2.707 (1.027)
23               20              0.598 (0.157)                1.849 (0.442)
19               4               0.258 (0.088)                3.038 (1.065)
19               15              0.756 (0.222)                2.088 (0.635)
17               5               0.187 (0.082)                2.034 (0.912)
17               12              0.819 (0.247)                2.311 (0.939)
16               6               0.039 (0.021)                1.728 (0.928)
16               10              0.838 (0.268)                1.770 (0.850)
10. Summary
This paper presents methods to find the maximum sample value from a set of sequential pollutant samples without incurring the high costs associated with exhaustive testing. The methods were based on a theoretical framework consisting of the following assumptions: (1) the processes of collecting and measuring samples are distinct; (2) the cost of testing a sample is significant relative to that of collection; and (3) the sample measurements are highly positively autocorrelated. The objective and assumptions entailed in this analysis are of particular relevance to water quality monitoring. The assumption of high positive autocorrelation suggested primary first order compositing (PFOC). This method first aggregates the samples into sequential groups of equal size. Then, within each group, a fixed portion of each sample is pooled to form a composite. The composite sample with the highest measurement is identified and all of the samples that formed this composite are tested. The highest value observed is the estimate of the overall maximum sample measurement. This methodology is founded on the premise that in the presence of high positive autocorrelation, the maximum sample value will tend to appear in the composite with the highest measurement. PFOC was compared to random sampling under four measures of error: the proportion of trials in which the sample with the maximum measurement was found (P), the mean absolute range error (MARE), the maximum absolute deviation (MAD), and the mean square error (MSE). The methods presented were applied to data based on specific conductivity levels recorded at a pulp mill in British Columbia. These data exhibited high positive autocorrelation. Primary first order compositing was found to be much superior to random sampling with respect to all four criteria in terms of both the level and variability of error. PFOC provided an efficient estimate of the population mean and demonstrated a significant ability to detect extreme values. 
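The PFOC procedure summarized above can be sketched in a few lines. This is a sketch under the paper's assumptions (balanced composites, a composite measurement equal to the arithmetic mean of its samples); the function and variable names are ours.

```python
import numpy as np

def pfoc_estimate_max(samples, m):
    """Primary first order compositing (PFOC).

    samples: 1-D array of N sequential sample values (N divisible by m).
    m: composite size (samples pooled per composite).
    Returns the PFOC estimate of the maximum sample value, using
    N/m composite measurements plus m individual measurements.
    """
    N = len(samples)
    assert N % m == 0, "balanced compositing requires m to divide N"
    groups = samples.reshape(N // m, m)
    # Measuring a composite yields the arithmetic mean of its samples.
    composite_measurements = groups.mean(axis=1)
    # Retest individually only the samples of the highest composite.
    best = int(np.argmax(composite_measurements))
    return groups[best].max()
```

For the study's design (N = 60, m = 6) this performs 10 + 6 = 16 measurements instead of 60; averaging `composite_measurements` would also give the unbiased estimate of the population mean noted in Section 3.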
Composite sizes representing balanced compositing occur in pairs, each of which resulted in an identical number of tests being conducted. The smaller composite size in each pair usually resulted in improved performance. Among these smaller composites, the greater the number of tests, the better the performance. One would expect this result, since with composites of size one, one would always find the maximum sample value, albeit at a much higher cost. Although not reported in this paper, there is also some evidence that PFOC is superior to random sampling even at lower levels of autocorrelation (Casey, 1982). Inherent in composite methods are other properties valuable in water pollution control. The number, and thus the cost, of tests performed is constant and known prior to laboratory analysis. Additionally, if the observed maximum is in contravention of a water quality standard, it is possible to expand the testing to a larger neighborhood of this sample so that an assessment can be made of the duration of the violation.
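The pairing of composite sizes by test count follows from simple arithmetic: balanced PFOC on N samples with composite size m performs N/m composite tests plus m individual retests, and sizes m and N/m therefore give the same total. A minimal sketch for the study's N = 60:

```python
# Total PFOC measurements for composite size m on N samples:
# N/m composite tests plus m retests of the selected composite's samples.
N = 60
tests = {m: N // m + m for m in (2, 3, 4, 5, 6, 10, 12, 15, 20, 30)}
# Paired sizes m and N/m coincide: sizes 2 and 30 both give 32 tests,
# while sizes 6 and 10 give the minimum of 16 tests.
```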
References

Beckers, C. V., Chamberlain, S. G., and Grimsrud, G. P.: 1972, Quantitative Methods for Preliminary Design of Water Quality Surveillance Systems, U.S. Environmental Protection Agency, Report No. EPA-R5-72-001.
Beckers, C. V. and Chamberlain, S. G.: 1974, Design of Cost-Effective Water Quality Surveillance Systems, U.S. Environmental Protection Agency, Report No. EPA-600/5-74-004.
Brown, G. H. and Fisher, N. I.: 1972, 'Subsampling a Mixture of Sampled Material', Technometrics 14, 663-668.
Brumelle, S., Nemetz, P. N., and Casey, D. B.: 1984, 'On Estimating Means and Variances from Composite Samples', Environmental Monitoring and Assessment 4, 81-84.
Casey, D. B.: 1982, 'Measuring Sample Maximums: An Application to Water Quality Monitoring', Unpublished Master's Thesis, Faculty of Commerce and Business Administration, University of British Columbia, Vancouver, Canada, October.
Casey, D. B., Nemetz, P. N., and Uyeno, D. H.: 1982, 'The Problems with Unbalanced Compositing', Working Paper, Faculty of Commerce and Business Administration, University of British Columbia, Vancouver, Canada, September.
Casey, D. B., Nemetz, P. N., and Uyeno, D. H.: 1983, 'Sampling Frequency for Water Quality Monitoring: Measures of Effectiveness', Water Resources Research 19, 1107-1110.
Chamberlain, S. G., Beckers, C. V., Grimsrud, G. P., and Shull, R. D.: 1974, 'Quantitative Methods of Preliminary Design of Water Quality Surveillance Systems', Water Resources Bulletin 10, 199-217.
Curtis, W. R.: 1976, 'Sampling for Water Quality', Proceedings of the 8th IMR Symposium, September 20-24, pp. 237-244.
Edelman, D. A.: 1974, 'Three-Stage Nested Designs with Composited Samples', Technometrics 16, 409-417.
Garey, M. R. and Hwang, F. K.: 1974, 'Isolating a Single Defective Using Group Testing', Journal of the American Statistical Association 69, 151-153.
Graft, L. E. and Roeloffs, R.: 1974, 'A Group-Testing Procedure in the Presence of Test Error', Journal of the American Statistical Association 69, 159-163.
Hwang, F. K.: 1972, 'A Method for Detecting All Defective Members in a Population by Group Testing', Journal of the American Statistical Association 67, 605-608.
Hwang, F. K.: 1975, 'A Generalized Binomial Group Testing Problem', Journal of the American Statistical Association 70, 923-926.
Kumar, S. and Sobel, M.: 1971, 'Finding a Single Defective in Binomial Group-Testing', Journal of the American Statistical Association 66, 824-828.
Loftis, J. C. and Ward, R. C.: 1978, 'Statistical Tradeoffs in Monitoring Network Design', Proceedings of the AWRA Symposium on Establishment of Water Quality Monitoring Programs, San Francisco, California, June 12-14.
Loftis, J. C. and Ward, R. C.: 1980a, 'Sampling Frequency Selection for Regulatory Water Quality Monitoring', Water Resources Bulletin 16, 501-507.
Loftis, J. C. and Ward, R. C.: 1980b, 'Water Quality Monitoring - Some Practical Sampling Frequency Considerations', Environmental Management 4, 521-526.
Nemetz, P. N. and Drechsler, H. D.: 1978, 'The Role of Effluent Monitoring in Environmental Control', Water, Air, and Soil Pollution 10, 477-497.
Pfeifer, C. G. and Enis, P.: 1978, 'Dorfman-Type Group Testing for a Modified Binomial Model', Journal of the American Statistical Association 73, 588-592.
Reed, J. F. and Rigney, J. A.: 1947, 'Soil Sampling from Fields of Uniform and Nonuniform Appearance and Soil Types', J. Amer. Soc. Agron. 39, 26-40.
Sanders, T. G. and Adrian, D. D.: 1978, 'Sampling Frequency for River Quality Monitoring', Water Resources Research 14, 569-576.
Sherwani, J. K. and Moreau, D. H.: 1975, Strategies for Water Quality Monitoring, Report No. 107, Water Resources Research Institute, University of North Carolina, Raleigh.
Sobel, M. and Tong, Y. L.: 1976, 'Estimation of a Normal Percentile by Grouping', Journal of the American Statistical Association 71, 189-192.
Walden, C. C., Howard, T. E., and Sheriff, W. J.: 1971, 'The Relation of Kraft Mill Operating and Process Parameters to Pollution Characteristics of the Mill Effluents', Pulp and Paper Magazine of Canada 72, T81-T87.
Ward, R. C.: 1973, Data Acquisition Systems in Water Quality Management, U.S. Environmental Protection Agency, Report No. EPA-R5-73-014.