Cancer Causes Control (2014) 25:81–92 DOI 10.1007/s10552-013-0310-1

ORIGINAL PAPER

Using a composite index of socioeconomic status to investigate health disparities while protecting the confidentiality of cancer registry data

Mandi Yu • Zaria Tatalovich • James T. Gibson • Kathleen A. Cronin



Received: 2 July 2013 / Accepted: 11 October 2013 / Published online: 1 November 2013
© Springer Science+Business Media Dordrecht 2013

Abstract

Purpose The lack of individual socioeconomic status (SES) information in cancer registry data necessitates the use of area-based measures to investigate health disparities. Concerns about confidentiality, however, prohibit publishing patients’ residential locations at the subcounty level. We developed a census tract-based composite SES index to be released in place of individual census tracts to minimize the risk of disclosure.
Methods Two SES indices based on the measures identified in the literature were constructed using factor analysis. The analyses were repeated using the data from the 2000 decennial census and 2005–2009 American Community Survey to create the indices at two time points, which were linked to 2000–2009 Surveillance, Epidemiology, and End Results registry data to estimate incidence and survival rates.
Results The two indices performed similarly in stratifying census tracts and detecting socioeconomic gradients in cancer incidence and survival. The gradient in the incidence is positive for breast and prostate, and negative for lung cancers, in all races, although the level varies. The positive gradient in survival is more salient for regional-staged breast, colorectal, and lung cancers.
Conclusions The census tract-based SES index provides a valuable tool for monitoring the disparities in cancer burdens while avoiding potential identity disclosure. This index, divided into tertiles and quintiles, is now available to the researchers on request.

Keywords SEER · Socioeconomic index · Health disparity · Census tract · Cancer incidence · Cancer survival

Electronic supplementary material The online version of this article (doi:10.1007/s10552-013-0310-1) contains supplementary material, which is available to authorized users.

M. Yu (✉) · Z. Tatalovich · K. A. Cronin
Surveillance Research Program, Division of Cancer Control and Population Sciences, National Cancer Institute, 9609 Medical Center Drive, Rockville, MD, USA
e-mail: [email protected]

J. T. Gibson
Information Management Services, Inc., Silver Spring, MD, USA

Introduction

The National Cancer Institute’s (NCI’s) Surveillance, Epidemiology, and End Results (SEER) Program provides the data on the occurrence of cancer in the population and is an authoritative resource for clinical and epidemiologic research. During the past decade, the demand for socioeconomic status (SES) data to describe cancer health disparities has increased [1–6]. SES reflects the different aspects of social stratification and is usually measured by a person’s income, education, and occupation [7]. Cancer registries do not routinely collect this information from patients, however. An attractive alternative is to construct area-based measures of SES based on the social and economic aspects of the area in which a patient resides. These measures usually can be obtained from the data collected by the U.S. Census Bureau. For SEER data, these measures are available at the county and census tract levels. Counties are legislative areas with 100,000 persons on average and are often composed of people with diverse economic status. In comparison, census tracts are smaller areas of 1,500–8,000 persons and are designed to be homogeneous


with respect to population characteristics, economic status, and living conditions [8]; therefore, they are the preferred basis for the construction of SES measures. A disadvantage of area-based measures, however, is that an individual’s SES may not be the same as the SES of the tract in which he or she lives. Caution should be taken in interpreting the impacts of area-based SES on health outcomes as if individual SES was used. Nevertheless, these measures are useful in investigating the contextual effects of the socioeconomic environment on health outcomes. Census tract-based SES measures can be obtained either by making the census tract codes available in a SEER data file, which permits users to link individual SEER records to any geo-referred SES data, or by withholding the codes but making several linked SES measures available. With either scenario, the risk of disclosing the identities of cancer patients, and subsequently disclosing sensitive tumor and treatment information about identified patients, increases dramatically [9]. This is because it is easy to single out a SEER patient who has the same unique combination of demographics and resides in the same census tract or has the same values on the census tract SES measures as an individual in another database that contains personally identifiable information, thus leading to disclosure. The incremental risk is negligible if the number of measures in the latter scenario reduces to one. However, it seriously limits the data users’ choice of SES measures to a single measure, which despite being carefully chosen usually cannot satisfy all users’ needs. We propose an approach that protects the privacy of SEER patients while allowing the use of rich SES data by developing and releasing a census tract-based composite SES index. An SES index is typically constructed by combining information about income, poverty, education, employment, occupation, owned home, and related attributes. 
Consensus is lacking, however, as to which SES measures should be used to study the effects of SES on health [10], and the choice of measures may depend on the health outcomes of interest [11]. Two area-based composite SES indices formulated using different sets of measures by Krieger et al. [3] and Yost et al. [6] have been used to show associations with cancer incidence and mortality. Considering the rigor of their research and the extensive use of their measures in the scientific literature, we based our SES index on their work. In the rest of this article, for simplicity, we use ‘‘Krieger’’ to refer to Krieger et al. [3] and ‘‘Yost’’ for Yost et al. [6]. This paper constructs an SES index at two time points and uses the index to evaluate the relationship between SES and cancer incidence and survival using SEER data. An SES index with fewer categories generally decreases


the risk of disclosure because a continuous index released with its full specificity would uniquely identify the census tract that it comes from. We examine whether SES classes generated by different classification schemes—three or five categories—are equally effective in ranking census tracts and detecting the effects of socioeconomic gradients on cancer outcomes.

Methods

Data sources

The data used in constructing the SES index at the first time point are from populations and census tracts covered by the SEER 17 registries [Atlanta, Connecticut, Detroit, Hawaii, Iowa, New Mexico, San Francisco-Oakland (SMSA), Seattle-Puget Sound (Seattle), Utah, Los Angeles (LA), San Jose-Monterey (SJM), Rural Georgia, Alaska Native Tumor Registry, Greater California, Kentucky, Louisiana, and New Jersey] in 2000 and were obtained from the U.S. 2000 decennial census long form survey. The data at the second time point cover the same geographic regions but are for 2005–2009 and come from the American Community Survey (ACS) 5-year estimates. We chose the ACS because it replaced the long form census to provide an authoritative and detailed socioeconomic profile of the Nation after 2000. The ACS data are collected continuously using independent monthly samples and released every year. This 5-year data set combines the data collected from 2005 to 2009. It represents 5 % of the population and is the first to provide estimates for areas down to the census tract level since the 2000 census. Because of the smaller sample size, the sampling errors of the ACS estimates are larger than those of the 2000 census, which sampled 17 % of the population. However, the non-sampling errors in the ACS due to non-response, incomplete coverage, measurement, and processing are much smaller than in the 2000 census because of timely measurement, the use of an updated sampling frame, and improved quality control [12]. Since the ACS was designed to produce information on content items similar to the 2000 census, most measures at the census tract level, especially socioeconomic measures such as education, occupation, class of work, and income, are comparable across the two surveys despite small differences in question wording and recall reference period [13]. We excluded the data from the Alaska Native Tumor Registry because of additional confidentiality constraints.
We also excluded the data from the Louisiana Tumor Registry because the census tract populations were not adjusted for the population loss due to Hurricane Katrina. The impact of the hurricane on the population of the remaining SEER areas, however, is negligible because the population gains due to evacuees migrating from hurricane-impacted areas were mainly in the states of Alabama, Mississippi, and Texas, none of which is a SEER area [14]. Geocoding of SEER patients’ residences at diagnosis was done at each central registry, and the data submitted to the NCI met the SEER data quality criterion of less than 2 % of cases missing census tracts. Census tracts with missing information on one or more measures were excluded from the analysis. Because its sample size is relatively small, the ACS is more susceptible to missing data. As shown in Electronic Supplementary Material 1, 110 (0.65 % of eligible census tracts) additional census tracts were excluded from constructing Krieger’s index using the ACS estimates compared to the 2000 census. For Yost’s index, the number of additional census tracts excluded was 399 (2.59 %).

Table 1 Factor loadings for Krieger’s and Yost’s composite census tract-based SES indices constructed using the data from census 2000 and ACS 2005–2009, SEER 17 combined

| SES domain | Census tract-based SES measures | Krieger et al. [3], Census 2000 | Krieger et al. [3], ACS 2005–2009 | Yost et al. [6], Census 2000 | Yost et al. [6], ACS 2005–2009 |
| | % of common variance explained | 82.29 % | 81.02 % | 93.09 % | 91.82 % |
| Occupation | % working class | -0.133 | -0.175 | -0.082 | -0.109 |
| Unemployed | % aged ≥ 16 years who are unemployed | -0.079 | -0.043 | -0.055 | -0.037 |
| Poverty | % of persons below 150 % of poverty line | NA | NA | -0.275 | -0.233 |
| | % of persons below poverty line | -0.160 | -0.127 | NA | NA |
| Income | Median HH income | 0.168 | 0.199 | 0.462 | 0.490 |
| | % of total HH income in the area derived from interest, dividends, and net rent | 0.026 | 0.044 | NA | NA |
| Education | % aged ≥ 25 years and ≤ 12th grade of education | -0.186 | -0.182 | NA | NA |
| | % aged ≥ 25 years and ≥ 4 years of education | 0.119 | 0.187 | NA | NA |
| | Education index (weighted school years) | NA | NA | 0.089 | 0.118 |
| House | % home worth ≥ $400 k | 0.030 | 0.034 | NA | NA |
| | Median house value | NA | NA | 0.052 | 0.047 |
| | Median rent | NA | NA | 0.060 | 0.059 |
| Ownership | % home ownership | 0.040 | 0.043 | NA | NA |
| | % car ownership | 0.064 | 0.047 | NA | NA |
| | % no telephone | -0.068 | -0.032 | NA | NA |
| Living crowdedness | % of HHs w/more than one person per room | -0.044 | -0.043 | NA | NA |
| | % of HHs w/only one room | -0.020 | -0.014 | NA | NA |
| | % of HHs w/o kitchen | -0.029 | -0.025 | NA | NA |
| | % of HHs w/o private plumbing | -0.028 | -0.017 | NA | NA |
| | Total number of census tract attributes | 15 | 15 | 7 | 7 |

Alaska Native Tumor Registry and Louisiana Tumor Registry are excluded
HH household, NA not applicable

SES index development

Using the SAS PROC FACTOR procedure, we performed separate factor analyses on the measures identified by Krieger and Yost. Despite the fact that Krieger used several additional measures of living space and ownership (Table 1), both papers included the measures of the main SES factors that generally are agreed upon in the literature, such as income, poverty, unemployment, working class, education, and house value. All measures were normalized using the rank transformation [15] prior to being entered into the maximum likelihood factor analysis. For the purpose of parsimony, we selected the first factor estimated from the factor analysis to be the SES index and estimated the SES score for each census tract. We further divided the SES scores as closely as possible into tertiles and quintiles with equal populations in each class to ensure the most efficient comparisons between SES classes. We then assigned the SES classification of a census tract to cancer patients who resided in that census tract at diagnosis.

Statistical analysis

Cancer incidence

To analyze socioeconomic disparities in cancer incidence by SES class, we estimated 3-year incidence rates that were age-adjusted to 19 age groups using the 2000 U.S. standard population. We defined a case as any malignant tumor at one of the four leading sites: breast, colorectal, lung, and prostate. We excluded cases with missing information on age and SES (Electronic Supplementary Material B). Assuming that the socioeconomic rankings of tracts are stable over a 3-year period, we assigned 2000–2002 cases using a 2000 SES index and 2006–2008 cases using a 2005–2009 SES index. We also estimated similar SES gradients in cancer incidence separately for white, black, and Asian/Pacific Islander (API). (American Indian and Alaska Natives were excluded because of small sample sizes.) Because bridged-race populations [16] for census tracts, comparable to race data collected for cases, were available only for the year 2000 at the time of this writing, we assumed that populations are constant and multiplied the population estimates by three to account for the use of 3 years of cancer cases.

Cancer survival

To investigate whether socioeconomic deprivation is associated with survival and the extent to which such an association differs by race and stage of the cancer at diagnosis, we estimated 5-year cause-specific survival rates by SES class, race, and stage for these cancer sites using actuarial methods [17]. Cause of death is defined based on the SEER cause-specific death classification [17, 18]. We focused on the first primary malignant cancer cases diagnosed from 2000 to 2002 to allow sufficient follow-up time. In addition to excluding cases with missing age as in the rate analysis, we further excluded cases identified through autopsy or death certificate, cases without cause-of-death information, and cases with zero survival time [i.e., diagnosis date is the same as the date of last contact (Electronic Supplementary Material 2)].
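
As an illustration of the age adjustment described under Cancer incidence, the following minimal Python sketch computes a directly standardized rate as a weighted sum of age-specific rates. The function name `age_adjusted_rate`, the three age groups, and all counts and weights are hypothetical; the analysis itself uses 19 age groups and the 2000 U.S. standard population, whose weights are not reproduced here. Tripling the annual population to approximate person-years mirrors the constant-population assumption stated above.

```python
from typing import Sequence


def age_adjusted_rate(
    cases: Sequence[float],
    person_years: Sequence[float],
    std_weights: Sequence[float],
    per: float = 100_000.0,
) -> float:
    """Directly standardized rate: sum over age groups of w_i * (d_i / n_i)."""
    if not (len(cases) == len(person_years) == len(std_weights)):
        raise ValueError("all inputs must have one entry per age group")
    total_weight = sum(std_weights)
    if total_weight <= 0:
        raise ValueError("standard weights must sum to a positive value")
    rate = 0.0
    for d, n, w in zip(cases, person_years, std_weights):
        if n <= 0:
            raise ValueError("person-years must be positive")
        # Weight each age-specific rate by its share of the standard population.
        rate += (w / total_weight) * (d / n)
    return rate * per


# Hypothetical example with three age groups (not real SEER data).
cases_3yr = [10, 20, 30]                   # cases diagnosed over 3 years
annual_population = [1000, 1000, 1000]     # single-year population estimates
person_years = [3 * p for p in annual_population]  # constant-population assumption
std_weights = [0.5, 0.3, 0.2]              # made-up standard population shares

rate = age_adjusted_rate(cases_3yr, person_years, std_weights)
print(f"{rate:.1f} per 100,000 person-years")
```

Normalizing by the sum of the weights lets the function accept either population counts or proportions for the standard population.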

Results

Table 1 shows the amount of common variance explained by each SES index and the standardized scoring coefficients, which we used to weight each measure in estimating the SES scores. These variances can be viewed as the individual and non-redundant contribution that each


measure makes to the SES index. The SES index explains more than 80 % of common variance for Krieger’s indices and more than 90 % for Yost’s indices. For all four indices, the main SES measures, such as income, education, working class, and poverty, are the dominant contributors to the index as suggested by the large absolute coefficients. The contributions also are very similar over time for both Krieger’s and Yost’s indices. Table 2 shows the distributions of populations and census tracts by SES class for both Krieger’s and Yost’s indices at 2000 and 2005–2009. As expected, for all SEER 17 registries combined, SES classes have equal populations but not census tracts in that there tend to be more tracts in lower SES classes. Whites and APIs are more likely than other racial groups to be in higher SES classes. Because of differential population composition and socioeconomic characteristics among SEER registries, population distributions by SES vary dramatically by registry region, as shown in Fig. 1. There were more people in higher SES classes in relatively affluent regions such as San FranciscoOakland, Connecticut, Atlanta, Utah, Seattle, New Jersey, and San Jose-Monterey. In contrast, more people in Los Angeles, New Mexico, Rural Georgia, and Kentucky were assigned to lower SES classes. In addition, this distribution shifted slightly from 2000 to 2005–2009. Some registries experienced declines in SES compared to their peer registries in that more patients were assigned to a lower SES class in 2005–2009 compared to 2000. The registries with the largest declines were Detroit, Utah, and Atlanta. In comparison, persons in the Hawaii, Los Angeles, and Rural Georgia registries gained slight socioeconomic advantages. For the remaining registries, SES appears stable over time. The above patterns hold for both Krieger’s and Yost’s indices in both quintile and tertile. 
Despite the above consistency, slight differences were observed between Krieger’s and Yost’s indices: populations were ranked higher in SES by Krieger’s index than by Yost’s in more rural regions such as Iowa, Kentucky, and New Mexico (Fig. 1). Such discrepancies are further demonstrated by the scatter plots of Krieger’s SES scores versus Yost’s SES scores by census tract rurality at 2000 and 2005–2009 (Fig. 2). Census tracts with less than 75 % urban population were considered rural. At both time points, rural census tracts tend to have higher SES scores according to Krieger’s index than according to Yost’s index. Nevertheless, after the SES scores are aggregated into classes, the overall agreement between Krieger and Yost is substantial, as suggested by high Kappa values of about 0.82–0.83 for both classification schemes. More than 72 % of census tracts were assigned to the same SES quintiles, and 85 % were assigned to the same SES tertiles. For the remaining tracts assigned to different classes, the difference is mostly one class.
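
To make the agreement statistic concrete, the sketch below computes Cohen's kappa for two classifications of the same census tracts. It assumes the unweighted form of kappa, which the text does not specify, and the function name `cohen_kappa` and the six toy tracts are hypothetical.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Unweighted Cohen's kappa: (p_observed - p_expected) / (1 - p_expected)."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(a)
    # Share of tracts placed in the same class by both indices.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Agreement expected by chance from the two marginal distributions.
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[k] * count_b[k] for k in count_a.keys() | count_b.keys()) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical tertile assignments for six tracts under two indices.
krieger_class = [1, 1, 2, 2, 3, 3]
yost_class = [1, 1, 2, 3, 3, 3]
kappa = cohen_kappa(krieger_class, yost_class)
print(round(kappa, 3))
```

Because most disagreements reported above differ by only one class, a weighted kappa that gives partial credit to adjacent classes would be a natural alternative for ordered categories.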


Table 2 Distributions of census tracts and populations^a by socioeconomic classes, SEER 17 combined

| SES index | SES classification scheme | SES class | 2000 census: census tracts (total) | 2000 census: population (total) | White (row %) | Black (row %) | AIAN (row %) | API (row %) | 2005–2009 ACS: census tracts (total) | 2005–2009 ACS: population (total) |
| Krieger et al. [3] | Tertile | Low | 5,159 | 22,975,852 | 71.9 | 17.6 | 2.2 | 8.3 | 5,078 | 24,299,539 |
| | | Medium | 5,050 | 22,980,591 | 81.2 | 8.2 | 1.0 | 9.5 | 5,001 | 24,302,754 |
| | | High | 5,007 | 23,050,638 | 85.4 | 4.2 | 0.5 | 10.0 | 5,027 | 24,384,909 |
| | Quintile | Low | 3,159 | 13,795,079 | 69.7 | 20.6 | 2.5 | 7.1 | 3,101 | 14,588,106 |
| | | 2 | 2,994 | 13,807,189 | 76.1 | 12.2 | 1.6 | 10.0 | 2,973 | 14,598,333 |
| | | 3 | 3,052 | 13,796,958 | 81.7 | 8.0 | 1.0 | 9.3 | 3,009 | 14,604,083 |
| | | 4 | 3,021 | 13,796,665 | 83.3 | 6.0 | 0.7 | 10.0 | 2,968 | 14,594,009 |
| | | High | 2,990 | 13,811,191 | 86.6 | 3.2 | 0.4 | 9.9 | 3,055 | 14,602,672 |
| Yost et al. [6] | Tertile | Low | 5,381 | 23,037,424 | 74.1 | 16.8 | 2.2 | 6.9 | 5,198 | 24,089,046 |
| | | Medium | 4,968 | 23,048,510 | 80.6 | 8.9 | 1.0 | 9.5 | 4,810 | 24,088,325 |
| | | High | 4,954 | 23,114,414 | 83.7 | 4.4 | 0.5 | 11.4 | 4,896 | 24,169,135 |
| | Quintile | Low | 3,366 | 13,838,020 | 71.6 | 19.6 | 2.6 | 6.3 | 3,234 | 14,461,718 |
| | | 2 | 3,021 | 13,836,848 | 78.6 | 11.9 | 1.5 | 8.0 | 2,921 | 14,475,234 |
| | | 3 | 2,944 | 13,836,488 | 80.1 | 8.9 | 1.0 | 10.0 | 2,918 | 14,467,295 |
| | | 4 | 3,011 | 13,840,765 | 83.1 | 6.3 | 0.7 | 9.9 | 2,871 | 14,458,116 |
| | | High | 2,961 | 13,848,226 | 84.0 | 3.5 | 0.4 | 12.1 | 2,960 | 14,484,142 |

Alaska Native Tumor Registry and Louisiana Tumor Registry are excluded
^a Population data source is Woods & Poole Economics, Inc., 2000 census tract population estimates, vintage 2009

Fig. 1 Side-by-side distributions of populations by SES class (quintile and tertile) and registry for both Krieger’s and Yost’s indices at 2000 and 2005–2009 [panels: by SES quintile and by SES tertile; series: Krieger 2000, Krieger 2005–09, Yost 2000, Yost 2005–09; x-axis: overall and each registry]

Fig. 2 Scatter plots of Krieger’s SES scores versus Yost’s SES scores by census tract rurality, 2000 and 2005–2009

(see Electronic Supplementary Material 3 for details). These gradients are stable over 6 years of time, as similar gradients are observed for 2000–2002 and 2006–2008. Use of SES tertiles shows similar overall effects of neighborhood SES on cancer incidence (Electronic Supplementary Material 4). Figure 4 shows the incidence rates by Yost’s quintile for white, black, and API. The contribution of neighborhood SES to racial differences in cancer incidence varies by cancer site. For breast cancer, whites and blacks had similar overall but higher incidence rates, and API women had the lowest rate. Neighborhood SES was positively associated with the incidence of breast cancer for whites and API, but not for blacks, whose rates are similar across all SES groups. For colorectal cancer, blacks had the highest incidence rates and API had the lowest rates. These rates do not vary by SES group for all racial groups. For lung and bronchus cancers, blacks had the highest rate and API had the lowest rate. For all racial groups, neighborhood SES was negatively associated with incidence, but the degree varied. The largest SES gradient was found among blacks, followed by whites. Among API, the difference in

incidence is noticeable only when comparing the highest with the lowest SES quintiles, but not between adjacent SES groups. For prostate cancer, black men had substantially higher incidence rates, at almost 1.5 times the rates of white men and triple those of API men. The SES gradients are substantial and similar for white men and black men but smaller for API men. The results of using Krieger’s indices and Yost’s tertiles are similar to those based on Yost’s quintiles. (See Electronic Supplementary Material 5 for the results based on Krieger’s quintiles.)

Table 3 Distributions of cancer cases by cancer site, diagnosis year, type of the analysis, SES index, and race, SEER 17 combined

| Cancer site | Diagnosis year | Analysis type | SES index | Total cases (N) | White (row %) | Black (row %) | AIAN (row %) | API (row %) |
| Breast | 2000–2002 | Incidence | Krieger | 143,613 | 85.62 | 7.46 | 0.71 | 6.21 |
| | | | Yost | 143,703 | 85.60 | 7.48 | 0.71 | 6.21 |
| | | Survival | Krieger | 120,437 | 85.12 | 7.59 | 0.78 | 6.51 |
| | | | Yost | 120,517 | 85.11 | 7.60 | 0.78 | 6.52 |
| | 2006–2008 | Incidence | Krieger | 147,750 | NA | NA | NA | NA |
| | | | Yost | 146,281 | NA | NA | NA | NA |
| Colorectal | 2000–2002 | Incidence | Krieger | 102,451 | 84.18 | 8.43 | 0.75 | 6.64 |
| | | | Yost | 102,531 | 84.15 | 8.45 | 0.75 | 6.65 |
| | | Survival | Krieger | 81,202 | 83.45 | 8.61 | 0.85 | 7.08 |
| | | | Yost | 81,266 | 83.42 | 8.63 | 0.85 | 7.10 |
| | 2006–2008 | Incidence | Krieger | 98,149 | NA | NA | NA | NA |
| | | | Yost | 97,269 | NA | NA | NA | NA |
| Lung | 2000–2002 | Incidence | Krieger | 124,657 | 86.10 | 8.58 | 0.44 | 4.88 |
| | | | Yost | 124,806 | 86.06 | 8.60 | 0.44 | 4.90 |
| | | Survival | Krieger | 96,923 | 85.50 | 8.79 | 0.49 | 5.22 |
| | | | Yost | 97,046 | 85.45 | 8.82 | 0.49 | 5.23 |
| | 2006–2008 | Incidence | Krieger | 126,941 | NA | NA | NA | NA |
| | | | Yost | 125,835 | NA | NA | NA | NA |
| Prostate | 2000–2002 | Incidence | Krieger | 148,009 | 82.41 | 10.97 | 2.17 | 4.45 |
| | | | Yost | 148,128 | 82.38 | 10.99 | 2.17 | 4.46 |
| | | Survival | Krieger | 132,887 | 81.88 | 11.25 | 2.29 | 4.58 |
| | | | Yost | 132,993 | 81.85 | 11.27 | 2.29 | 4.59 |
| | 2006–2008 | Incidence | Krieger | 150,929 | NA | NA | NA | NA |
| | | | Yost | 149,233 | NA | NA | NA | NA |

Alaska Native Tumor Registry and Louisiana Tumor Registry are excluded
NA not applicable

Cause-specific survival

Figure 5 shows the 5-year cause-specific survival rates by race, SES, and stage for patients diagnosed from 2000 to 2002. Survival rates are more favorable for patients residing in high-SES areas for late-stage (both regional and distant) breast cancers, regional colorectal cancers, and localized or regional lung cancers. Survival rates for regional breast cancer are slightly more favorable for whites and API compared to blacks. In contrast, there is no racial difference in cause-specific survival for colorectal, lung, and prostate cancers. The sensitivity analysis treating cases with missing cause-of-death information as dead or censored did not change the results. For the distributions of cases by SES and stage, see Electronic Supplementary Material 6.

Discussion

This paper demonstrates an approach to incorporating area measures of SES at the census tract level that protects the confidentiality of patients. By incorporating precalculated SES indices into cancer registry data, researchers can analyze incidence and survival trends by SES category without needing to access the census tract data for any single case. These census tract SES indices are valuable for cancer surveillance because SES disparities affect cancer incidence, stage, treatment, and outcome.

Fig. 3 Age-adjusted incidence rates by cancer site, SES quintile (Krieger and Yost), and diagnosis year (2000–2002 and 2006–2008), SEER 17 combined [panels: Breast, Colorectal, Lung and Bronchus, Prostate; y-axis: incidence rate]

Fig. 4 Age-adjusted incidence rates by Yost’s SES quintile, race, and cancer site, 2000–2002, SEER 17 combined [panels: Breast, Colorectal, Lung and Bronchus, Prostate; x-axis: White, Black, API; y-axis: incidence rate]

Fig. 5 Five-year cause-specific survival rates by Yost’s SES quintile, race, and stage, 2000–2002, SEER 17 combined [rows: Breast, Colorectal, Lung, Prostate; columns: Localized, Regional, Distant; x-axis: White, Black, API]

Both Krieger and Yost indices capture the main concept of SES. There was a high level of agreement in the classification of census tracts into tertiles and quintiles using the Yost and Krieger measures, which resulted in similar interpretations when looking at incidence and survival by SES. Small differences in ranking census tracts reflected

that Yost’s index gives lower ranks than Krieger’s index in more rural areas. This may be due to people in rural areas being more likely to own a house, own a car, and live in less crowded conditions, factors that are included in the Krieger index but not in the Yost index. Although there is no consensus definition of SES, the Yost index offers relatively simpler interpretations because the measures included have the same relative meaning across geographic regions. Agreement in the estimated effect of SES is also high between SES quintiles and tertiles. Tertiles may be more appropriate when one focuses on subgroups in which the sample is limited, although quintiles are commonly used in the literature. Both indices in both classifications are available in SEER upon request. Researchers may want to consider these nuances when deciding between these indices. Higher SES is associated with higher incidence rates for breast and prostate cancer, for which screening is common in the population. Analysis of screening has consistently shown that individuals with a higher SES are more likely to receive regular screening, which can be associated with increased incidence. Higher incidence in screened populations could be attributed to earlier diagnosis of disease or to screening tests finding cancers that might never have been diagnosed without screening [19]. Survival is presented by stage to minimize the impact of lead time bias and overdiagnosis associated with screening [20]. For breast cancer, there is a trend toward higher survival in late-staged disease for higher SES categories. A number of factors could influence such survival, such as differential access to health care and treatment and comorbidities that may affect treatment. In contrast, no clear trend in survival by SES category was observed for prostate cancer. Lung cancer incidence is higher in low-SES groups, reflecting smoking patterns in the population.
Survival is higher for high-SES groups, although it is not consistent by stage and race. No clear pattern emerged in the incidence of colorectal cancer by SES category. Unlike breast and prostate cancers, screening for colorectal cancer removes precancerous polyps, thereby preventing an invasive cancer [21]. Incidence in colorectal cancer is influenced by risk factors and screening, possibly moving rates in opposite directions. The highest SES categories have higher survival than do the lowest categories for regional colorectal cancer; the pattern is not as clear for local or distant cancers. The SES index discussed in this analysis is linked to the SEER registry data. The analyses we presented in this paper illustrate how these indices could be used to answer some of the questions on cancer health disparities. Researchers can gain access to these data and further investigate the issues identified in this paper. A full discussion on cancer health disparities is beyond the scope of this study. One limitation of this analysis is that the median household income that we used to generate the SES indices


was not adjusted for the differential cost of living (COL) across the SEER 18 areas. It is possible that for people with the same income, the living condition could be higher if they live in a rural area than in an urban area because the cost of living is low. Therefore, these SES indices could have given lower ranks to census tracts where the cost of living is low. This issue is less concerning if the objective of a study is to compare SES classes within a community or a small geographic region because the cost of living tends to be more similar than it does across the country. We conducted a sensitivity analysis to evaluate the impact of the differential COL on estimating national-level cancer incidence rates by examining two alternative approaches for constructing the SES index and comparing them to the originally proposed method. In the first approach, we calculated a COL-adjusted 2000 median household income using the metropolitan statistical area (MSA) level COL index. This COL index was developed based on the 2004 two-parent, onechild family budget estimates from the Economic Policy Institute (EPI) (http://www.epi.org). Replacing the original income measure with this adjusted version, we constructed a similar Yost’s SES index for 2000 following the same procedures as described in the methods section. In the second approach, we constructed separate Yost’s SES indices for each registry; therefore, there is a relatively more consistent correspondence between income and living standards within each registry. Within each registry, census tracts were then grouped into tertiles and quintiles with the same number of populations in each. This approach allows equal representation of SES groups in every registry, therefore less sensitive to registry inclusion and exclusion criteria. 
We compared these two approaches with the original approach in estimating the 2000 incidence rate by Yost’s SES quintile for all races combined (Electronic Supplementary Material 7) and separately for white, black, and API populations (Electronic Supplementary Material 8). The three methods detected similar SES gradients for all racial groups and all four cancer sites (breast, colorectal, lung, and prostate). A possible explanation is that, although income is one of the main SES measures, incorporating other important SES measures, such as education, occupation, and employment, into the composite index mitigates the impact of differential COL. However, for the following reasons, we will not adopt the index created by the COL adjustment approach in its current form, and we leave the search for refined adjustment methods to future research. First, COL was not fully accounted for because the geographic areas for which the COL index is available are less detailed than the census tracts for which the SES index is constructed. Second, the COL index varies significantly by family size; using the index developed for any single family size could therefore be problematic because the family size distribution varies across census tracts. Third, although the methods used to estimate the family budget at EPI seem to be based on sound theoretical principles, the quality of the data has not yet been evaluated fully enough for it to inform public health decisions.

Unlike the first approach, the registry-specific SES index does not require additional data; thus, it is not subject to these data quality limitations. This index not only provides overall results similar to those of the proposed SES index but also allows researchers to conduct valid analyses at the registry level by maintaining equal populations in each SES class. However, the interpretation of the SES classes is slightly different in that the SES ranking of a census tract is relative to the remaining census tracts in the same registry. This index is especially useful if the objective of the study is to investigate SES differences within a community or region, such as a state. In contrast, the overall SES index is more appropriate for analyses at the national level. This registry-specific index is also linked to SEER data and is available upon request.

Another limitation of this analysis is that the factor analysis we used to construct the SES index does not account for the sampling errors in the SES estimates from the 2000 census and the ACS. However, by grouping the census tracts into a finite number of classes, we mitigated the impact of potential misclassification in ranking the census tracts. Extensive evaluations of alternative statistical approaches to account for the uncertainty in the SES index are beyond the scope of this paper.

Finally, this SEER database will not allow modeling of spatial relationships because the geographic location of the census tract will not be available. Census tract-level information may not accurately represent the individual characteristics of a patient and does not incorporate information on key factors that influence outcomes, such as insurance status.
Users of these data will not be able to delve further into the specific variables that may affect rates and survival because only the composite index will be available. Furthermore, individual SES scores are not released because they could be used to identify the census tracts. Instead, the indices are available as tertiles and quintiles, which provide sufficient and valuable information to begin to understand the impact of SES on the burden of disease in the United States. As of this writing, the index is available only for 2000 and 2005–2009. In the future, it will be extended to each year from 2000 onward, which will provide additional opportunities for trend analysis.

References

1. Clegg LX, Reichman M, Miller BA, Hankey B, Singh G, Lin Y, Goodman M, Lynch C, Schwartz S, Chen V, Bernstein L, Gomez S, Graff J, Lin C, Johnson N, Edwards B (2009) Impact of socioeconomic status on cancer incidence and stage at diagnosis: selected findings from the surveillance, epidemiology, and end results: National Longitudinal Mortality Study. Cancer Causes Control 20(4):417–435
2. Devesa SS, Diamond EL (1983) Socioeconomic and racial differences in lung cancer incidence. Am J Epidemiol 118(6):818–831
3. Krieger N, Chen JT, Waterman PD, Soobader M-J, Subramanian SV, Carson R (2002) Geocoding and monitoring of US socioeconomic inequalities in mortality and cancer incidence: does the choice of area-based measure and geographic level matter? Am J Epidemiol 156(5):471–482. doi:10.1093/aje/kwf068
4. Pugh H, Power C, Goldblatt P, Arber S (1991) Women’s lung cancer mortality, socio-economic status and changing smoking patterns. Soc Sci Med 32(10):1105–1110
5. Ward E, Jemal A, Cokkinides V, Singh GK, Cardinez C, Ghafoor A, Thun M (2004) Cancer disparities by race/ethnicity and socioeconomic status. CA Cancer J Clin 54(2):78–93
6. Yost K, Perkins C, Cohen R, Morris C, Wright W (2001) Socioeconomic status and breast cancer incidence in California for different race/ethnic groups. Cancer Causes Control 12(8):703–711. doi:10.1023/a:1011240019516
7. Dutton DB, Levine S (1989) Overview, methodological critique, and reformulation. In: Bunker JP, Gomby DS, Kehrer BH (eds) Pathways to health. The Henry J. Kaiser Family Foundation, Menlo Park, CA
8. U.S. Bureau of Census (1994) Geographic areas reference manual. Suitland, MD. Retrieved from http://www.census.gov/geo/www/garm.html
9. Witkowski KM (2008) Disclosure risk components of contextualized microdata: identifying unique geographic units and the implications for pinpointing survey respondents. ICPSR working paper series. University of Michigan, Ann Arbor
10. Robert SA, House JS (2000) Socioeconomic inequalities in health: an enduring sociological problem. In: Bird CE, Conrad P, Fremont AM (eds) Handbook of medical sociology. Prentice Hall, Upper Saddle River, NJ, pp 79–97
11. Gornick ME (2002) Measuring the effects of socioeconomic status on health care. In: Swift EK (ed) Guidance for the National Healthcare Disparities Report. Institute of Medicine of the National Academies, Washington, DC, pp 45–74
12. Griffin DH, Love SP, Obenski SM (2003) Can the American Community Survey replace the census long form? In: American association for public opinion research. Nashville, TN, pp 94–99
13. U.S. Bureau of Census (2013) Comparing 2009 American community survey data. Suitland, MD. Retrieved from http://www.census.gov/acs/www/guidance_for_data_users/comparing_2009/
14. Frey W, Singer A (2006) Katrina and Rita impacts on gulf coast populations: first census findings. Brookings Institution, Washington, DC
15. Conover WJ, Iman RL (1981) Rank transformations as a bridge between parametric and nonparametric statistics. Am Stat 35(3):124–129
16. Ingram DD, Parker JD, Schenker N, Weed JA, Hamilton B, Arias E, Madans JH (2003) United States Census 2000 population with bridged race categories. Vital Health Stat Ser 2 135:1–55
17. Howlader N, Noone A, Krapcho M, Neyman N, Aminou R, Waldron W, Altekruse S, Kosary C, Ruhl J, Tatalovich Z, Cho H, Mariotto A, Eisner M, Lewis D, Chen H, Feuer E, Cronin K, Edwards B (eds) (2011) SEER cancer statistics review, 1975–2008, based on November 2010 SEER data submission, posted to the SEER web site, 2011. National Cancer Institute, Bethesda, MD. http://seer.cancer.gov/csr/1975_2008/
18. Howlader N, Ries LA, Mariotto AB, Reichman ME, Ruhl J, Cronin KA (2010) Improved estimates of cancer-specific survival rates from population-based data. J Natl Cancer Inst 102(20):1584–1598
19. Kvåle R, Auvinen A, Adami H-O, Klint Å, Hernes E, Møller B, Pukkala E, Storm HH, Tryggvadottir L, Tretli S, Wahlqvist R, Weiderpass E, Bray F (2007) Interpreting trends in prostate cancer incidence and mortality in the five Nordic countries. J Natl Cancer Inst 99(24):1881–1887. doi:10.1093/jnci/djm249
20. Draisma G, Boer R, Otto SJ, van der Cruijsen IW, Damhuis RAM, Schröder FH, de Koning HJ (2003) Lead times and overdetection due to prostate-specific antigen screening: estimates from the European randomized study of screening for prostate cancer. J Natl Cancer Inst 95(12):868–878. doi:10.1093/jnci/95.12.868
21. Winawer SJ, Zauber AG, Ho MN, O’Brien MJ, Gottlieb LS, Sternberg SS, Waye JD, Schapiro M, Bond JH, Panish JF, Ackroyd F, Shike M, Kurtz RC, Hornsby-Lewis L, Gerdes H, Stewart ET (1993) Prevention of colorectal cancer by colonoscopic polypectomy. New Engl J Med 329(27). doi:10.1056/NEJM199312303292701
