Optimization of ambient air quality monitoring networks : (Part I).

OPTIMIZATION OF AMBIENT AIR QUALITY MONITORING NETWORKS (Part I)

PRASAD M. MODAK
Environmental Science and Engineering Group, Indian Institute of Technology, Powai, Bombay, 400 076, India

and

B. N. LOHANI
Environmental Engg Division, Asian Institute of Technology, P.O. Box 2754, Bangkok, Thailand, 10501

(Received 9 July, 1984)

Abstract. A method has been developed to obtain a joint solution to the problem of the optimum number and configuration of ambient air quality monitors, based on the principles of spatial correlation analysis and the minimum spanning tree. The interest in this case is to represent the patterns of regional air quality with a minimum overlap of information. This methodology is extended to account for the uncertainties in air quality simulations and also to incorporate the probabilities of occurrence. As an illustration of these methodologies, an example for Taipei City, Taiwan has been considered.

1. Introduction

In recent years, ambient air monitoring programs have become an important activity of urban air quality management. This has made air quality surveys more complex, requiring comprehensive planning to ensure that the prescribed objectives can be attained in the shortest possible time and at the least cost. The number and locations (configuration) of air quality monitors have an important role in achieving the monitoring objectives. Since the costs of equipment, maintenance and operating personnel are increasing dramatically, the possibility of optimizing the monitoring design is most attractive to the directors of air quality management programs. This paper addresses the development of methodologies which would identify the optimum number and configuration of monitors for effective air quality monitoring network design.

2. Number of Monitors

There are two possible ways in which the number of monitors could be determined: the analytical approach and the empirical approach. The analytical approach is basically estimation oriented, whereas the empirical approach is based on experience and professional judgement. Amongst the analytical techniques, Keagy et al. (1961) report a formula for the number of monitors based on the estimation of regional mean air quality as an objective. Formulae developed using the structure function concept (Munn, 1981) compute the network density such that the interpolation error for estimating regionwide air quality concentrations is within a stipulated bound. Amongst the empirical guidelines, perhaps the most frequently used are those developed by the US EPA (1971), WHO (1977) and Noll and Miller (1977). The US EPA (1971) and WHO (1977) guidelines are related to population (based on data in the U.S.A.), and the recommendations by Noll and Miller (1977) rely on the qualitative concept of spatial resolution. One important limitation of the empirical approaches is that they do not provide any feedback on the accrued effectiveness vs the costs incurred.

Environmental Monitoring and Assessment 5 (1985) 1-19. © 1985 by D. Reidel Publishing Company. 0167-6369/85.15.

3. Configuration of Monitors

Once the number of monitors is decided, the next task is to obtain an optimal configuration with respect to the monitoring objectives. Not all of the objectives elucidated in Table I can be quantified to suit the formulation of optimization models. It is not surprising therefore that past studies on the optimization of Air Quality Monitoring Networks (herein referred to as AQMN) have been primarily concerned with a few objectives of interest, namely:

(1) Estimation of the regional pollutant distribution (considered as equivalent to objectives 1 and 7): studies by Shannon et al. (1978), Nakamori et al. (1979), Bach and Vukovich (1981).
(2) Source-orientation of compliance with ambient air quality standards (considered as equivalent to objectives 2 and 3): studies by Seinfeld (1972), Godfrey et al. (1975), Buell (1975), Hougland and Stephens (1977).
(3) Assessment of the effects due to pollution on populations at large (considered to be equivalent to objectives 4, 5, 6, and 8): studies by Darby et al. (1974).

It should be noted that the above optimization models do not provide a joint solution to the problem of the optimum number as well as the optimum configuration of monitors. The number of monitors must first be determined either on an ad-hoc basis (empirical methods) or using the estimation based approaches; the optimum configuration is then determined by employing one of the objective-specific methods. In other words, the number of monitors is seldom treated as a decision variable while the optimum configurations are being identified.

There appear to be three important characteristics which must be considered while developing quantitative procedures for optimum AQMN design.

(1) Table I provides a summary of air quality monitoring objectives with respect to the degree of sophistication (WHO, 1977).
It is then evident that an AQMN is normally expected to meet a wide variety of objectives. The framework of the proposed methodologies must therefore be flexible enough to accommodate the interests over multiple objectives.

(2) The various objectives identified in the monitoring program must be appropriate, feasible and justifiable on a rational basis. For instance, it is now known that population exposure can only be assessed via detailed epidemiological studies and personal monitoring, and not solely on the basis of the data at fixed outdoor air quality monitoring stations. Substantial differences have been found between the air pollution levels measured at the fixed outdoor monitors and the levels to which people are actually exposed (Perkins, 1973; Goldstein, 1976; Silverman et al., 1982). It may not be appropriate then to expect that a fixed outdoor AQMN, optimized for the maximization of the population dosage product (Darby et al., 1974), would provide a rational basis for the assessment of health effects due to pollution. It may only be considered as a cursory tool (and not a sole decision criterion) to place the monitors in the populated areas of probable concern.

(3) The AQMN is designed to monitor a variety of pollutants and therefore the proposed methodologies should be able to handle the interests of a number of pollutants.

TABLE I
Sophistication of AQMN in relation to range of objectives (extracted and compiled from WHO, 1977)

Network sophistication                                   Possible range of objectives
-------------------------------------------------------  ----------------------------
Automatic instruments, continuously operated (1 hr)      All
Semi-automatic instruments: mechanized bubblers,         1, 2, 4, 5, 7, 9
  high volume samplers etc. (24 hr)
Manually operated (once a week)                          1, 5, 7, 9

Objectives:
1. Assess the spatial-temporal trends. The number of stations should be sufficiently large so as to detect the general pattern.
2. Evaluate control strategies.
3. Activate episode controls.
4. Evaluate risks to human health.
5. Evaluate risks to environmental damage.
6. Data-base for land-use planning.
7. Test dispersion models.
8. Investigate complaints.
9. Initial assessment.

Unfortunately, past studies on the optimization of the AQMN have been based only on a single objective and a single pollutant. This yields only a piecemeal approach to the AQMN design. An AQMN optimized for one objective may not be optimal with respect to another. Similarly, an AQMN optimized for one pollutant may not be optimal with respect to another. A joint consideration of multiple objectives and multiple pollutants may require certain compromises, which should be explicitly known to the Decision Maker (hereafter referred to as DM).

The objectives of this paper are therefore threefold.

(1) To provide a joint solution to the problem of the optimal number of monitors and monitor configuration, for a specified measure of effectiveness.
(2) To develop methodologies for the multi-objective optimization of AQMN.
(3) To develop methodologies for the multi-pollutant optimization of AQMN.

These methodologies should be flexible enough to consider a multi-objective, multi-pollutant optimization of the AQMN.

This paper has been organized into 3 parts. Part I presents a new formulation based on the concepts of the Spatial Correlation Analysis (SCA) and the Minimum Spanning Tree (MST), to provide a joint solution to the problem of optimum monitor density and monitor configuration. The objective of the MST is to select a minimum number of monitors such that the regionwide variance of the pollutant is explained with a minimum overlap of information.

Part II (Modak and Lohani, 1984b) develops two new methodologies as a necessary extension of the MST algorithm for the multi-objective optimization of AQMN, namely (1) the utility approach and (2) the sequential interactive compromise. Objectives such as the representation of air pollution patterns and the detection of violations of ambient air quality standards have been considered as the principal AQMN objectives.

The methodologies described in Parts I and II are primarily restricted to siting single-pollutant monitors. Part III (Modak and Lohani, 1984c) introduces a multi-pollutant consideration to the optimization problem. Two new methodologies have been proposed, which make use of (1) the index theory and (2) the Pareto optimal approach. Since these approaches are basically an extension of the utility and sequential interactive approaches, a multi-objective, multi-pollutant optimization of AQMN is possible.

4. Representation of Spatial-Temporal Patterns

It is clear that the problem of the minimum number of monitors is best handled through the objective of estimation. The objective of estimation has been interpreted herein in terms of the representation of region-wide air pollution patterns at a satisfactory spatial variance. It can be noted from Table I that the representation of spatial-temporal patterns is one of the important objectives of the AQMN. Locations not carefully sited for the pattern would fail to permit a spatial resolution of ambient air pollution for the region concerned. Predictions of mathematical models at well-placed locations would be more representative than those at locations which are arbitrarily placed in the region. It is then necessary to develop an approach which addresses the task of pattern representation in an explicit manner.

A need for the satisfactory representation of pollution patterns has been expressed by a number of research workers as well as by policy makers (Vukovich, 1976; Noll and Miller, 1977; WHO, 1977). Despite general agreement on the importance of pattern representation in AQMN design, methodologies for its incorporation in the design procedures have not yet been attempted. It is important therefore to quantify:

(1) What it means to represent a pattern.
(2) How to extract the regional patterns with a minimum number of monitors.


5. Spatial Correlation Analysis (SCA)

For an idealized random or uncorrelated domain, the selection of a monitor configuration is not difficult, since each location reports an equal amount of independent information. On the other hand, if the domain shows various degrees of dependence between locations, then advantage must be taken of such dependency. The Spatial Correlation Analysis (SCA) is a very flexible tool to analyse dependencies or overlaps in such situations. In the SCA, similarities are assessed by calculating the cross-correlation coefficient between locations. If the cross-correlation coefficient between two locations is more than some fixed value, called the cut-off correlation coefficient $C_0$, then the locations are considered to be dependent. A rationalization based on the SCA therefore depends on the choice of the cut-off correlation coefficient, and varying this choice provides good flexibility to investigate its effect on the extent of rationalization.

The concepts of network rationalization using the SCA were originally developed for the analysis of precipitation networks (Hendrick and Comer, 1970; Huff and Shipp, 1969) and for meteorological fields (Steinitz et al., 1971). Recently, the concepts of the SCA were applied by Elsom (1977), Liu and Avrin (1981) and Handscombe and Elsom (1982) for the rationalization of existing AQMN. This concept of rationalization has been extended further to some useful optimization formulations in this research.
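The dependency test described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function names `status_matrix` and `pattern_scores` are hypothetical, the input data are synthetic, and each location is assumed to count as dependent on itself (its self-correlation is 1).

```python
import numpy as np

def status_matrix(series, cutoff):
    """Flag pairs of locations whose cross-correlation exceeds the cut-off.

    series : array of shape (n_times, n_locations), one column per location.
    cutoff : the cut-off correlation coefficient C_0.
    """
    corr = np.corrcoef(series, rowvar=False)  # n_locations x n_locations
    return corr > cutoff

def pattern_scores(status):
    """Count, for every location, the locations it is dependent on.

    Assumption: the location itself is included in the count.
    """
    return status.sum(axis=1)

# Hypothetical data: locations 0-2 share one signal, location 3 is independent.
rng = np.random.default_rng(0)
shared = rng.normal(size=200)
series = np.column_stack([
    shared + 0.05 * rng.normal(size=200),
    shared + 0.05 * rng.normal(size=200),
    shared + 0.05 * rng.normal(size=200),
    rng.normal(size=200),
])
status = status_matrix(series, cutoff=0.9)
scores = pattern_scores(status)
```

Raising `cutoff` makes fewer pairs dependent, which is the flexibility referred to in the text.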

6. Development of the Minimum Spanning Tree (MST) Algorithm

If the data at two locations are related to each other by a cross-correlation of more than $C_0$, where $C_0$ is the cut-off correlation coefficient, then the population correlation coefficient could be calculated from the curves reported by David (1980). If $C_p$ is the population correlation coefficient for $C_0$, then the variance explained by one station to another is given by $(C_p)^2$. The optimization problem could therefore be considered as the identification of the optimum monitor number and configuration such that a variance of the order of $(C_p)^2$ is explained for a maximum area of the region at a minimum possible overlap.

To begin with, however, it is necessary to define terms such as overlap region, coverage area, effective gain, effective coverage, pattern scores and coverage effectiveness. Figure 1 shows the coverage areas and the overlap regions for 4 locations out of a mesh of candidate grid points. A coverage area for a location is simply the area over which a variance of the order of $(C_0)^2$ is explained. An overlap for locations i and j is then defined as the common coverage area, i.e., the area common to the coverage areas of locations i and j. The coverage area has been quantified herein in terms of the 'pattern scores'. A pattern score $N_p^i$ for the ith candidate location is defined as the number of locations correlated to the time series of concentrations (simulated at this location) above the specified cut-off correlation coefficient $C_0$. Figure 1 shows the pattern scores for the 4 locations arbitrarily selected.

Fig. 1. Representation of coverage area and overlaps for a typical grid of candidate locations. (Legend: candidate location; coverage area; pattern score; reference location; overlaps.)

Based on the above definition of pattern scores, the coverage area for location i could be defined as $M^i$, where $M^i$ is the set of locations correlated with i above the stipulated cut-off correlation coefficient $C_0$. In terms of set operations then, the overlap for locations i and j could be expressed as $(M^i \cap M^j)$. Further, the effective coverage for both these locations could be represented as $(M^i \cup M^j)$ and the effective gain on combining location j with i would be $(M^i \cup M^j - M^i \cap M^j)$. The coverage effectiveness for a combination of locations is then defined as the ratio of the effective coverage (in terms of the pattern scores) to the number of candidate monitoring locations.

Since the interest of the optimization problem is to achieve a maximum coverage effectiveness at a minimum overlap, the optimization problem could be formulated (in set notation) as

\[ \max \; (M^1 \cup M^2 \cup \dots \cup M^m) \tag{1} \]

such that, for a partial cover problem,

\[ m = m^0 \tag{2} \]

or, for a total cover problem,

\[ (M^1 \cup M^2 \cup \dots \cup M^m) = (M^1 \cup M^2 \cup \dots \cup M^N) \tag{3} \]

where
$m$ = the number of monitors,
$m^0$ = the maximum number of monitors based on the budgetary constraint,
$N$ = the number of candidate locations,
$M^i$ = the set of $N_p^i$ locations correlated with location i.

To accomplish the above objective in this study, a selection methodology has been developed on a sequential rule. This method could also be regarded as a constructive heuristic, where the basic idea is to build up (i.e. construct) a single feasible solution in a deterministic, sequential fashion. Such algorithms are also known as greedy algorithms: they do their best at each single decision step. A well-known example of such an approach is the minimum spanning tree used in network analysis problems. The proposed optimization algorithm has therefore been termed the Minimum Spanning Tree (MST) algorithm.

A description of the optimization algorithm for the solution of the above problem is as follows. To begin with, a spatial correlation matrix is constructed from the data simulated at the grid-points of interest in the region. From examination of this matrix, the location which has the maximum pattern score is selected as the first best monitoring location. This is made possible by coding another matrix, called the status matrix, which keeps track of the pattern scores for each of the candidate locations by comparing the elements of the correlation matrix with $C_0$. Next, a location is identified which has a minimum overlap (in terms of the pattern scores) with the location selected before. The extent of the overlap is identified by comparing the rows of the status matrix. This location is then selected as the next best monitor. The third location is now selected such that the overlap is minimum with respect to both of the locations selected before. This step is ensured by carrying out a set union operation between the rows of the status matrix of the selected candidate locations. This procedure of sequential selection is repeated until all the grid-points in the region are completely covered (Equation (3)), or the budgetary constraint is violated (Equation (2)). Modak (1984) should be consulted for computational details.

This algorithm could be used to identify the 'best' sequence of grid-points in a systematic manner to ensure maximization of regional coverage at a minimum overlap. Since in this case the network flow is fully connected (i.e., any location could be picked up at any decision stage), the optimality in terms of the identification of the minimum number of monitors (i.e., decision stages) is guaranteed.

The above algorithm could be used for two possible interests:

(1) It is possible that $C_0$ is not known, but the total budget is known. The optimization algorithm is then expected to provide the maximum possible $C_0$, the number of monitors and their optimal configuration for a prescribed monitoring budgetary constraint. To accomplish such an objective, the MST algorithm needs to be iterated by lowering $C_0$ as necessary while meeting the prescribed budgetary constraint.
(2) $C_0$ is known and the optimization problem is expected to provide a solution to the best monitor number and configuration, and therefore the total cost of the system.
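The sequential rule and the two uses described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation (Modak, 1984): the function names are hypothetical, the correlation matrix is invented, and ties in overlap are assumed to be broken in favour of the location that newly covers the most grid-points, since the text does not specify a tie-breaking rule. With nothing yet covered, that rule reduces to picking the largest pattern score first.

```python
import numpy as np

def greedy_cover(status, max_monitors=None):
    """Sequentially pick locations from a boolean status matrix.

    status[i, j] is True when location j is correlated with i above C_0.
    Returns the selected indices and a boolean vector of covered locations.
    """
    n = status.shape[0]
    covered = np.zeros(n, dtype=bool)  # union of the selected rows so far
    selected = []
    limit = n if max_monitors is None else max_monitors
    while not covered.all() and len(selected) < limit:
        overlap = (status & covered).sum(axis=1)  # already covered, per row
        gain = (status & ~covered).sum(axis=1)    # newly covered, per row
        candidates = [i for i in range(n)
                      if i not in selected and gain[i] > 0]
        if not candidates:
            break
        # Minimum overlap first; assumed tie-break: largest new coverage.
        best = min(candidates, key=lambda i: (overlap[i], -gain[i]))
        selected.append(best)
        covered |= status[best]
    return selected, covered

def coverage_effectiveness(covered):
    """Ratio of covered locations to the number of candidate locations."""
    return covered.sum() / covered.size

def highest_cutoff_within_budget(corr, budget, cutoffs):
    """Lower C_0 until a total cover needs no more than `budget` monitors."""
    for c in sorted(cutoffs, reverse=True):
        selected, covered = greedy_cover(corr > c)
        if covered.all() and len(selected) <= budget:
            return c, selected
    return None, []

# Hypothetical 5-location correlation matrix (not data from the paper).
corr = np.array([
    [1.00, 0.95, 0.92, 0.60, 0.50],
    [0.95, 1.00, 0.85, 0.70, 0.55],
    [0.92, 0.85, 1.00, 0.65, 0.60],
    [0.60, 0.70, 0.65, 1.00, 0.93],
    [0.50, 0.55, 0.60, 0.93, 1.00],
])
total_sel, total_cov = greedy_cover(corr > 0.9)                    # total cover
part_sel, part_cov = greedy_cover(corr > 0.9, max_monitors=1)      # partial cover
best_c, best_sel = highest_cutoff_within_budget(corr, 2, [0.8, 0.9, 0.94])
```

In this toy case a cut-off of 0.9 is fully covered by locations 0 and 3, whereas a cut-off of 0.94 would need four monitors, so a budget of two monitors settles on 0.9.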

7. Assumptions in the Minimum Spanning Tree Algorithm

Since the MST algorithm is based on the SCA, the assumptions for the SCA are directly applicable to the former. In the SCA, it is assumed that the data are normally distributed. Before proceeding with the SCA therefore, the data should be checked for normality. A log-transformation of the data should be adequate in most instances to remove skewness. Another assumption in the SCA is that there should not exist any significant temporal variation in the data so as to produce spurious autocorrelation coefficients. Examples of such situations would be the seasonal, week-end and diurnal cycles normally observed in air pollution concentrations. Such variations should therefore be removed by carrying out suitable differencing procedures.

The MST algorithm assumes that the spatial correlation coefficient decreases monotonically with distance and that the coverage areas (Figure 1) are continuous. In other words, it is assumed that the candidate locations are situated on a regular grid and that there are no significant variations in the topographical features. Another important assumption in the MST is that the relationship between two locations can be adequately described by a linear correlation coefficient. This assumption may not hold, especially for reactive pollutants such as ozone, hydrocarbons etc. It may be necessary then to use the expressions for the second or the third order correlation coefficients.

Finally, the MST algorithm requires simulations of the air pollutants at all the candidate monitoring locations. In this methodology, it is assumed that the surveys to identify the air quality of a region can be effectively designed only by utilizing idealized mathematical models. Yet another assumption is that the AQMN is also designed for the validation or the calibration of the idealized mathematical models.
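The two data-preparation steps mentioned above (a log-transformation to reduce skewness and differencing to remove a cycle) can be illustrated as follows. This is a sketch under assumptions, not a procedure taken from the paper: the function names are hypothetical, the data are synthetic, concentrations are assumed to be strictly positive, and a lag of 24 hours is assumed as one possible way of removing a diurnal cycle from hourly data.

```python
import numpy as np

def skewness(x):
    """Sample skewness (third standardized moment) of a 1-D array."""
    d = x - x.mean()
    return (d ** 3).mean() / (d ** 2).mean() ** 1.5

def remove_cycle(conc, lag=24):
    """Log-transform strictly positive data, then difference at `lag`.

    conc : array of shape (n_times, n_locations).
    Returns an array with `lag` fewer rows.
    """
    logged = np.log(conc)
    return logged[lag:] - logged[:-lag]

# Synthetic, right-skewed concentrations: the log removes most of the skew.
rng = np.random.default_rng(1)
raw = rng.lognormal(mean=0.0, sigma=1.0, size=5000)
skew_raw = skewness(raw)
skew_log = skewness(np.log(raw))

# A purely diurnal signal vanishes after lag-24 differencing.
hours = np.arange(240)
diurnal = np.exp(np.sin(2 * np.pi * hours / 24))[:, None]
residual = remove_cycle(diurnal)
```

The differenced series, rather than the raw concentrations, would then be passed to the correlation step, so that a shared daily cycle does not inflate the correlation between otherwise unrelated locations.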


8. An Example for Taipei City

For the purpose of illustration, the MST algorithm has been applied to find the best number and configuration of monitors for sulfur dioxide concentrations in Taipei City, Taiwan. Taipei is the capital city of Taiwan and is located at 25° 06' N latitude in a subtropical zone. The major sources of air pollution in Taipei City are not point sources (such as grouped industrial stacks) but traffic emissions (Chuang, 1977). An air quality monitoring program in Taipei City has been active for more than a decade. Figure 2 shows the configuration of the 11-station SO2 monitoring network, at an elevation of 10 m above ground level. SO2 concentrations are analysed by the conductometric method.

To make the MST algorithm operational, it is necessary to simulate the SO2 concentrations based on idealized mathematical models. The idealized mathematical models are those which are based on the principles of atmospheric diffusion and transformations, emission inventories and the observed meteorological parameters. Since no information on the emission inventories and meteorological parameters was available, the simulations were carried out by interpolating the data of the existing AQMN. For the purpose of interpolation, data on the monthly averaged hourly concentrations of SO2, observed in the month of January 1981, were used. Hourly concentrations of SO2 were simulated at the 34 grid points shown in Figure 2, based on the algorithm developed by Modak and Lohani (1983). Unlike the conventional inverse distance weighting approaches, this algorithm makes use of the information on the wind roses observed at the existing monitoring stations by carrying out a vector dot product between the distance and the wind vectors. Figure 3 shows the contours of SO2 at 9.00 a.m., which is the period of maximum concentration.
The simulated data on SO2 concentrations were subjected to the MST algorithm to evaluate the coverage effectiveness of the existing monitoring network and to identify the optimal configurations at different levels of the cut-off correlation coefficient $C_0$. A virtual central processor time of the order of 3 sec was required for every run on an IBM 3031. Figures 4, 5, and 6 show a graphical representation of the results. The present worth costs, presented in Figure 4, are based on the assumption that the cost of an automatic SO2 analyser is 5000 US$, the annual cost of maintenance and operation is 25000 US$ and the cost of the shelter and appurtenances is 2800 US$. Further, it is assumed that the useful life of the automatic analyser is 5 yr and the annual rate of interest is 20%.
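The cost assumptions above can be turned into a present worth figure per monitor as sketched below. The paper does not state the formula behind Figure 4, so this uses the standard uniform-series present worth factor as an assumption; salvage value and equipment replacement are ignored, and the function name is hypothetical.

```python
def present_worth_per_monitor(capital=5000.0, shelter=2800.0,
                              annual_om=25000.0, rate=0.20, life_years=5):
    """Present worth (US$) of one monitor over its useful life.

    Assumption: annual maintenance and operation costs are discounted with
    the uniform-series present worth factor (1 - (1 + i)^-n) / i.
    """
    pa_factor = (1.0 - (1.0 + rate) ** -life_years) / rate
    return capital + shelter + annual_om * pa_factor

per_monitor = present_worth_per_monitor()
# Network cost for a given number of monitors, e.g. the 11 existing stations.
network_cost = 11 * per_monitor
```

Under these assumptions the present worth is roughly 82,565 US$ per monitor, so the cost of a network scales linearly with the number of monitors returned by the MST algorithm for a given $C_0$.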

9. Discussion of Results

It can be observed from Figure 4 that, as the cross-correlation coefficient $C_0$ is increased, the number of monitors required for the total representation of the pattern also increases.

Fig. 2. Configuration of existing AQMN and grid-points used for simulation for the case of Taipei City, Taiwan. (Legend: existing monitoring stations, total 11; grid points used for simulation, total 34. Axes in kilometers; scale 1 cm = 1.16 km.)

Fig. 3. Monthly averaged contours of SO2 (in 10^-1 ppm) at 10.00 a.m., for Taipei City, January 1981. (Axes in kilometers; scale 1 cm = 1.16 km.)

