ENVIRONMENTAL RESEARCH 59, 310-317 (1992)

Screening for Lead Exposure Using a Geographic Information System

DANIEL WARTENBERG

Department of Environmental and Community Medicine, UMDNJ-Robert Wood Johnson Medical School, Piscataway, New Jersey 08854

Received June 19, 1991

Screening programs for lead overexposure typically target high-risk populations by identifying regions with common risk markers (older housing, poverty, etc.). While more useful than untargeted screening programs, targeted programs are limited by the geographic resolution of the risk-factor information. A geographic information system can make screening programs more effective and more cost-efficient by mapping cases of overexposure, identifying high-incidence neighborhoods warranting screening, and validating risk-factor-based prediction rules. © 1992 Academic Press, Inc.

INTRODUCTION

Lead exposure poses a substantial health risk to children, particularly in urban environments. To limit risk, public health officials try to identify those at greatest risk of exposure. Many different sources of lead contribute to an individual's body burden (e.g., soil, paint, dust, air, water), making it difficult to identify directly those at greatest risk (e.g., Clark et al., 1985; Charney et al., 1980). Using a state-of-the-art computer database system, I present a data-based method for quantifying the health risk imparted by the principal sources of lead exposure in a neighborhood.

Universal screening for lead overexposure is highly desirable and has been called for by the Centers for Disease Control. However, it is likely that some communities will not have sufficient resources to implement such programs, especially if screening is to be done with blood lead determinations rather than by one of the protoporphyrin methods. If communities can screen only a portion of the population at risk, a strategy is needed to maximize the efficiency and efficacy of the program by targeting high-risk and high-incidence populations.

Traditional approaches for lead exposure evaluation have employed classical screening methods (e.g., Daniel et al., 1990). Investigators select a population at risk, sample blood levels, and identify cases of overexposure. While useful in some contexts, this approach has limitations. First, only sampled children can be assessed for lead poisoning. Regional evaluation of risk is difficult because neighbors are not readily identified and geographic clusters of highly exposed individuals are likely to be missed. Second, confounder information cannot be used easily in identifying high-risk neighborhoods. Computerized databases, such as the U.S. Census files of population density, socioeconomic status, and ethnicity (to assess population demographics), local water supply information (to assess corrosivity and lead mobilization), and data on age of housing stock and closeness to local industrial sources (to assess lead sources), cannot be considered in an integrated assessment unless geographic proximity information is laboriously added for each observation or datum. In short, while traditional screening methods are potentially useful for an individual, only the most cursory assessments of a large region or population are possible.

As an alternative to traditional screening programs, I propose a regional approach using a geographic information system (GIS) for the organization and analysis of relevant data. A GIS consists of a computerized database organized by geographic elements combined with a set of analytic tools including thematic map generation, proximity analysis, map overlay comparison, and buffer zone identification software.

0013-9351/92 $5.00. Copyright © 1992 by Academic Press, Inc. All rights of reproduction in any form reserved.

DESIGNING A SCREENING PROGRAM

Screening programs, unlike other epidemiologic surveys, are designed to identify high-risk individuals rather than to test hypotheses. It does not matter whether the factors used to discriminate between high-risk and low-risk individuals are those that cause high lead exposure, only that those so identified would have had high lead exposures which then can be reduced or prevented. In New York City, screening was conducted by health care providers for their patients as well as through a door-to-door community program (Fig. 1; Daniel et al., 1990). While many cases were detected through the health care providers, the rate of detection was low, likely because those at greatest risk have least access to primary health care. The screening program was redesigned to improve the case-finding rate by targeting high-risk neighborhoods, but the case-finding rate increased only by a factor of 4, and the program remained time-consuming and expensive.

FIG. 1. Case-finding rates for lead overexposure. The ordinate is the number of lead overexposure cases found per individual screened. The abscissa denotes results for three lead screening programs (Provider, City Clinic, Door-to-Door) and a hypothetical screening program using a GIS.

CAPABILITIES OF A GEOGRAPHIC INFORMATION SYSTEM

Exposure Prediction

Ethnicity, socioeconomic status, lead-contaminated soil, water pipes with acidic water, and proximity to local industrial sources are risk factors for lead overexposure. While this information may be given heuristic consideration, it is rarely used directly or quantitatively in targeted screening programs. A GIS provides tools to integrate this information systematically for the identification of problematic regions. For example, using data from the U.S. Census and local environmental quality data, one can map relevant variables individually at local geographic scales. Buffer zones (areas within a specified geographic distance of a point, set of points, or other geographic feature), or regions of local dispersal, can be specified using the GIS, including directional components for wind-borne or downstream transport. An integrative function (e.g., sum, weighted average, logistic or discriminant function, or geographically constrained index) can then be specified to combine each of these sources into a single predictive model.
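As a minimal sketch of the buffer and integrative steps described above, the following Python fragment flags blocks whose centroids fall within a circular buffer around a point source and combines per-block risk-factor scores with a weighted sum. The block names, coordinates, weights, buffer radius, and function names are all hypothetical and chosen only for illustration; a real GIS would supply the geometry and would typically model a directional plume rather than a circle.

```python
import math

def in_buffer(point, source, radius):
    """Return True if `point` lies within `radius` of `source` (Euclidean distance)."""
    return math.dist(point, source) <= radius

def combined_risk(scores, weights):
    """Weighted sum of risk-factor scores; both arguments are dicts keyed by factor name."""
    return sum(weights[name] * value for name, value in scores.items())

# Hypothetical inputs: centroid coordinates (arbitrary units) and factor scores.
factory = (4.0, 4.0)
blocks = {
    "A": {"centroid": (3.5, 3.5), "ses": 3, "soil": 0},
    "B": {"centroid": (0.5, 0.5), "ses": 1, "soil": 3},
}
weights = {"ses": 1.0, "soil": 1.0, "air": 1.0}  # equal weights reduce to a simple sum

for name, block in blocks.items():
    # Assumption: a block inside the buffer gets an air score of 3, otherwise 0.
    air = 3 if in_buffer(block["centroid"], factory, radius=1.0) else 0
    scores = {"ses": block["ses"], "soil": block["soil"], "air": air}
    print(name, combined_risk(scores, weights))
```

Changing the entries of `weights` turns the simple sum into a weighted average-style index; a logistic or discriminant function would replace `combined_risk` entirely.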

Mapping Cases

A GIS also can be used to map individual cases of lead overexposure, enabling health professionals to detect areas showing a disproportionate number of cases. In reviewing such data, it is important to determine how many cases would be expected based on population density and possibly other risk factors (e.g., Whittemore et al., 1987; Day et al., 1988). For example, if one region has 20 times more children than another nearby region of the same geographic size, then observation of more cases in the highly populated region would be expected; the absence of such an excess would be surprising.
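The population adjustment described above can be sketched as follows. Under the simplifying assumption that every child carries the same risk, the expected number of cases in a region is proportional to its share of the population. The function name, region names, and all numbers below are hypothetical.

```python
def expected_cases(populations, total_cases):
    """Allocate total_cases to regions in proportion to population."""
    total_pop = sum(populations.values())
    return {region: total_cases * pop / total_pop
            for region, pop in populations.items()}

# Hypothetical example: one region has 20 times as many children as the other.
populations = {"dense": 2000, "sparse": 100}
observed = {"dense": 18, "sparse": 3}

expected = expected_cases(populations, sum(observed.values()))
for region in populations:
    ratio = observed[region] / expected[region]
    print(f"{region}: observed={observed[region]}, "
          f"expected={expected[region]:.1f}, ratio={ratio:.2f}")
```

A ratio well above 1 marks a region for closer review, although with counts this small a formal cluster test is needed before drawing conclusions.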

Prediction Validation

One also can use a GIS to validate the exposure prediction with incidence data. Using case and risk-factor data, one can assess the correspondence of predicted and observed cases, noting areas of disparity for further review. One can also "tune" the risk-factor-based prediction rule using incidence data and conduct cross-validation analyses.

ILLUSTRATIVE EXAMPLES

Exposure Prediction

To demonstrate the utility of a GIS-based approach for lead exposure screening, we develop a hypothetical example. A square 16-block region is under consideration for lead screening (Figs. 2-4). Resources require that the blocks be prioritized. Using data collected for other reasons and stored in a GIS, each block can be characterized in terms of socioeconomic status and lead soil contamination (Figs. 2 and 3). In real situations, data on many other variables (e.g., home age and condition, water supply lead concentration, ethnicity) should be available and considered. In addition, there is an industrial lead emitter at the edge of the region under consideration. The plume of air emissions from this facility under typical wind conditions has been modeled and included in the GIS database (Fig. 3).

FIG. 2. A hypothetical map of socioeconomic status in a neighborhood. Each box represents a subarea within the neighborhood. The small number to the upper left of each box is an index number for identification. The Roman numeral inside each box is the socioeconomic status which is scaled from 1 to 4, with 1 being the highest. There is a hypothetical factory to the upper right of the neighborhood.

The scores for each variable in each block are evaluated and listed in Table 1. The combined risk estimate is a simple sum of the scores for the individual variables. A map of these values is shown in Fig. 4. Regional priorities not congruent with any single variable are found. The risks calculated are related to the specific observations within each block. Depending on the resources available, and the number of blocks that can be screened, one could design a specific program to maximize risk scores evaluated and minimize the travel time of those conducting the screening. While some individuals at substantial risk of lead overexposure would be missed, this approach should minimize the number of such individuals within the constraints of available resources. Blocks 4 and 10 should be targeted first, followed by blocks 9 and 14, and then blocks 4-8 and 13.
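The scoring step can be sketched in a few lines of Python. The three score lists are copied from Table 1 as printed; the variable names and the `top_blocks` helper are hypothetical, and the sketch ranks blocks by score only, without attempting the travel-time minimization mentioned above.

```python
# Per-block scores as printed in Table 1 (blocks 1-16).
ses  = [0, 0, 1, 3, 0, 0, 2, 3, 1, 2, 2, 1, 1, 2, 2, 1]
soil = [0, 0, 0, 0, 3, 3, 0, 0, 3, 3, 0, 0, 3, 3, 0, 0]
air  = [0, 1, 2, 3, 1, 1, 2, 2, 1, 1, 1, 0, 0, 0, 0, 0]

# Combined risk estimate: a simple sum of the three scores.
score = {block: s + so + a
         for block, (s, so, a) in enumerate(zip(ses, soil, air), start=1)}

# Rank blocks from highest to lowest combined score (ties keep block order).
ranking = sorted(score, key=lambda b: -score[b])

def top_blocks(n):
    """Hypothetical helper: the n highest-priority blocks."""
    return ranking[:n]

print(top_blocks(2))
```

With these inputs the two highest-scoring blocks are 4 and 10, the same pair the text targets first.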

FIG. 3. A hypothetical map of soil and air contamination in the neighborhood shown in Fig. 2. The shaded boxes have contaminated soil. The plume of air contaminants is shown by the elliptical curves emanating from the factory. The curves represent contours of decreasing concentrations, the one closest to the factory representing a value of 3, the next a value of 2, and the one reaching farthest into the neighborhood a value of 1. The units are arbitrary.

Mapping Cases

Overexposure assessment, or the identification of clusters of cases of lead poisoning, is a straightforward application of disease cluster methods. In the example presented in Fig. 5, there seems to be some clustering in the vicinity of block 10. However, without population density information, one cannot test the validity of this perception. That is, it may be that the blocks in the bottom half of the figure all have substantially higher populations. Then, the observation of seven cases in blocks 9-16 would not be surprising in contrast to the two cases in blocks 1-8. Let us assume that the population is uniform and that there are 100 residents per block. Assuming a background rate of overexposure of 2 per 800 based on blocks 1-8, the probability of seeing a single block with more than 1 case is approximately 0.02 and the probability of seeing a group of 8 blocks with more than 6 cases is approximately 0.01. Using either calculation to assess the data, there is an excess worthy of further investigation.
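One way to approximate tail probabilities of this kind is with a Poisson model. The text does not say which distribution was used, so the Poisson assumption, the function name, and the variable names below are mine. The sketch yields small probabilities that support the same conclusion, though they need not reproduce the approximate figures quoted above.

```python
import math

def poisson_tail(mean, k):
    """P(X > k) for a Poisson random variable with the given mean."""
    cdf = sum(math.exp(-mean) * mean**i / math.factorial(i) for i in range(k + 1))
    return 1.0 - cdf

rate = 2 / 800            # background rate per resident, from blocks 1-8
residents_per_block = 100

# One block: probability of more than 1 case.
p_block = poisson_tail(rate * residents_per_block, 1)
# Eight blocks taken together: probability of more than 6 cases.
p_group = poisson_tail(rate * residents_per_block * 8, 6)

print(f"{p_block:.3f} {p_group:.4f}")
```

A binomial model with 100 (or 800) independent residents would give nearly the same values, because the per-resident rate is small.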

FIG. 4. A hypothetical map of lead-exposure risk score in the neighborhood shown in Fig. 2. This score combines the socioeconomic status data, and the soil and air contamination data. The risk scores range from 0 to 6 with 6 being the highest risk. See text for details.

TABLE 1
A HYPOTHETICAL REGION FOR LEAD SCREENING

Region   SES   Soil   Air   Score
1        0     0      0     0
2        0     0      1     1
3        1     0      2     3
4        3     0      3     6
5        0     3      1     4
6        0     3      1     4
7        2     0      2     4
8        3     0      2     5
9        1     3      1     5
10       2     3      1     6
11       2     0      1     3
12       1     0      0     1
13       1     3      0     4
14       2     3      0     5
15       2     0      0     2
16       1     0      0     1

Note. SES is derived from U.S. Census files (0 = high; 3 = low). Soil is derived from local soil quality files (0 = clean; 3 = contaminated). Air is derived by specifying a buffer from a point source (0 = no effect; 3 = maximum effect). Score is the sum of SES, Soil, and Air.

FIG. 5. A hypothetical map of lead-exposure cases in the neighborhood shown in Fig. 2. Each solid circle represents a case.

Prediction Validation

To validate the risk prediction rule, one can compare directly the number and location of cases (or the number and location of cases adjusted for population density and risk factors) with the predicted risk. Again, assuming that all blocks have the same population, we can compare the distribution of cases in Fig. 5 with the distribution of risk scores in Fig. 4. The results, shown in Table 2, reveal an adequate but not ideal prediction rule. In general, blocks with higher scores have more cases, but the relationship is not monotonic. While some of the variation may be ascribed to sampling variability due to the small number of cases, the rule probably could use some refinement or tuning.

Once the rule is validated, one can apply it to new areas using risk-factor data from the computerized databases. This application is where the system is most powerful, identifying target areas among populations that have not yet been studied. Ideally, the prediction and validation procedures would be used in tandem, with new data constantly being added to update the databases, revalidate and refine the prediction rule, and make new predictions for new areas.
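The comparison of cases against risk scores can be sketched as a tabulation of cases per block within each score group, followed by a check of whether the rate rises steadily with the score. The function names are hypothetical, and the per-block data below are invented for illustration only; they are not taken from Figs. 4 and 5.

```python
from collections import defaultdict

def rate_by_score(scores, cases):
    """Cases per block for each risk score; inputs are parallel per-block lists."""
    blocks = defaultdict(int)
    found = defaultdict(int)
    for s, c in zip(scores, cases):
        blocks[s] += 1
        found[s] += c
    return {s: found[s] / blocks[s] for s in sorted(blocks)}

def is_monotonic(rates):
    """True if the rate never decreases as the score increases."""
    values = list(rates.values())
    return all(a <= b for a, b in zip(values, values[1:]))

# Hypothetical per-block data, for illustration only.
scores = [0, 1, 1, 2, 2, 3, 3, 3]
cases  = [0, 0, 1, 1, 0, 0, 1, 2]

rates = rate_by_score(scores, cases)
print(rates, is_monotonic(rates))
```

Score groups where the rate dips below that of a lower score are the natural places to look when refining or tuning the rule.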

TABLE 2
A FREQUENCY DISTRIBUTION OF CASES BY RISK SCORE

Risk score   Number of blocks   Number of cases   Overexposure rate (cases per block)
0            1                  0                 0.00
1            3                  1                 0.33
2            2                  1                 0.50
3            1                  1                 0.33
4            5                  1                 0.20
5            2                  1                 0.50
6            2                  3                 1.50
Total        16                 8                 0.50

DISCUSSION AND CONCLUSIONS

In this paper, I have shown a variety of ways in which extant data can be used to assess lead poisoning. None of these methods is inextricably linked to a GIS, but none can be implemented as easily without one. While a screening program could be based on a nonspatial assessment of risk factors, implementation might be very inefficient in terms of the geographic distribution of subjects and the travel time involved in reaching them. A geographically constrained approach places a priority on geographic contiguity and compactness with very limited loss of statistical power.

ACKNOWLEDGMENT

I thank George Rhoads for helpful comments on the manuscript.

REFERENCES

Charney, E., Sayre, J., and Coulter, M. (1980). Increased lead absorption in inner city children: Where does the lead come from? Pediatrics 65, 226-231.

Clark, C. S., Bornschein, R. L., Succop, P., Que Hee, S. S., Hammond, P. B., and Peace, B. (1985). Condition and type of housing as an indicator of potential environmental lead exposure and pediatric blood lead levels. Environ. Res. 38, 46-53.

Daniel, K., Sedlis, M. H., Polk, L., Duwuona-Hammond, S., McCants, B., and Matte, T. D. (1990). Childhood lead poisoning, New York City, 1988. MMWR 39, 1-7.

Day, R., Ware, J. H., Wartenberg, D., and Zelen, M. (1988). An investigation of a reported cancer cluster in Randolph, MA. J. Clin. Epidemiol. 42, 137-150.

Whittemore, A. S., Friend, N., and Holly, E. A. (1987). A test to detect clusters of disease. Biometrika 74, 631-635.
