Authors: Gloria L. Krahn, PhD, MPH; Willi Horner-Johnson, PhD; Trevor A. Hall, PsyD; Gale H. Roid, PhD; Elena M. Andresen, PhD; Glenn T. Fujiura, PhD; Margaret A. Nosek, PhD; Bradley J. Cardinal, PhD; Charles E. Drum, JD, PhD; Rie Suzuki, PhD; Jana J. Peterson, MPH, PhD

Affiliations: From the Oregon Health & Science University, Portland, Oregon (GLK, W-HJ, TAH, CED, RS, JJP); Warner Pacific College, Portland, Oregon (GHR); University of Florida, Gainesville (EMA); University of Illinois, Chicago (GTF); Baylor College of Medicine, Houston, Texas (MAN); and Oregon State University, Corvallis (BJC).

Correspondence: All correspondence and requests for reprints should be addressed to Gloria L. Krahn, PhD, MPH, Division of Human Development and Disability/National Center on Birth Defects and Developmental Disabilities, Centers for Disease Control and Prevention, 1600 Clifton Rd NE, MS E88, Atlanta, GA 30333.

0894-9115/14/9301-0056 American Journal of Physical Medicine & Rehabilitation. Copyright © 2013 by Lippincott Williams & Wilkins. DOI: 10.1097/PHM.0b013e3182a517e6

Functional Assessment

ORIGINAL RESEARCH ARTICLE

Development and Psychometric Assessment of the Function-Neutral Health-Related Quality of Life Measure

ABSTRACT

Krahn GL, Horner-Johnson W, Hall TA, Roid GH, Andresen EM, Fujiura GT, Nosek MA, Cardinal BJ, Drum CE, Suzuki R, Peterson JJ: Development and psychometric assessment of the function-neutral health-related quality of life measure. Am J Phys Med Rehabil 2014;93:56-74.

Objective: The aim of this study was to determine the conceptual framework, item pool, and psychometric properties of a new function-neutral measure of health-related quality-of-life (HRQOL).

Design: This is an expert panel review of existing measures of HRQOL and development of a conceptual model, core constructs, and item pool and a validation by experts in specific disabilities and in cultural competence. Items were cognitively tested, pilot tested for functional bias, field tested with a national sample of adults with various limitations, and reliability tested via repeat administration. Final item selection was based on analyses of factor structure, demographic bias, variance in likelihood of endorsement, and item-total correlation. Psychometric properties were demonstrated through differential item functioning analyses, factor analyses, correlations, and item response theory analyses.

Results: The results supported a four-domain conceptual model of HRQOL (physical health, mental health, social health, and life satisfaction and beliefs) for a 42-item HRQOL measure with an ancillary 15-item environment scale. The measure has strong internal consistency (α = 0.88-0.97), known-groups validity, and test-retest reliability (r = 0.83-0.91). Tests of convergent and divergent validity confirmed the ability of the Function-Neutral Health-Related Quality of Life (FuNHRQOL) measure to assess health while being relatively free of content assessing function.

Conclusions: A conceptually grounded four-domain, function-neutral measure of HRQOL that is appropriate for use with persons with and without various functional limitations was developed.

Key Words: Persons with Disability, Health Status, Environment, Quality-of-Life, Questionnaires, Reliability, Validity


Am. J. Phys. Med. Rehabil., Vol. 93, No. 1, January 2014. Copyright © 2013 Lippincott Williams & Wilkins. Unauthorized reproduction of this article is prohibited.

Disclosures: Gloria L. Krahn is now with the Centers for Disease Control and Prevention, Atlanta, GA. Trevor A. Hall is now with the Northwest Neurobehavioral Health, Boise, ID. Elena M. Andresen is now with the Oregon Health & Science University, Portland. Charles E. Drum is now with the University of New Hampshire, Durham. Rie Suzuki is now with the University of Michigan-Flint. Jana J. Peterson is now with the Pacific University, Forest Grove, OR. Supported by funds from the United States Department of Education, National Institute on Disability and Rehabilitation Research (NIDRR), under grant number H133B040034, principal investigator Gloria Krahn, PhD, MPH, and project officer Phillip Beatty, PhD. The findings and conclusions in this article are those of the authors and do not necessarily represent the official position of the United States Department of Education or the Centers for Disease Control and Prevention. Endorsement by the federal government should not be assumed. Presented in part at the RRTC State of the Science Conference, May 2008, Portland, OR; International Society for Quality of Life Research, October 2009, New Orleans, LA; and American Public Health Association, November 2010, Denver, CO. Financial disclosure statements have been obtained, and no conflicts of interest have been reported by the authors or by any individuals in control of the content of this article.

Self-perceived health-related quality-of-life (HRQOL) measures are well accepted and important tools in health outcomes research. They are used to reflect health status in populations, identify health disparities, serve as covariates in outcomes research, and evaluate interventions.1-5 In its 40-yr history, HRQOL research has generated numerous measurement tools.5,6 These measures are variously termed HRQOL, health status, or quality-of-life (QOL), with considerable semantic drift and difference noted over the years in the meanings of these terms. Some researchers regard the terms as interchangeable5; others regard HRQOL and health status as more narrowly focused than QOL,7 whereas still others use health status to designate functional ability.8 Instruments purported to measure HRQOL vary greatly in what they measure6,9 and typically include some content related to function along with health and QOL.10

The most popular HRQOL measures were created when function was considered to be an important component of health. Subsequent conceptualizations have emphasized the importance of distinguishing function from health to understand the relationship between these constructs11 and to examine health outcomes within the context of long-standing functional limitations.12 This shift was particularly marked in disability research, with differentiation of disability from health emerging during the 1990s and articulated most clearly in the United States Surgeon Generals’ reports13,14 of 2002 and 2005 that provide directives for how to improve the health of people with disabilities. The World Health Organization’s International Classification of Functioning, Disability and Health15 embodies a significant paradigm shift in regarding disability not as a health condition but as the result of interactions between health conditions and environmental and personal factors, impacting limitations in functioning of body structure, activity limitations, and participation restrictions. This framework recognizes the possibility that persons can be disabled as well as healthy and emphasizes the importance of environment on the disabling process.

Environments can impede or facilitate the activities and participation of people with limitations through physical barriers (e.g., stairs but no ramps), policies (e.g., requirements for employment), or attitudes (e.g., people more or less welcoming of ability differences).16 The Craig Hospital Inventory of Environmental Factors17 represents an early effort to measure how the environment enables or disables participation of persons with disabilities. Recently, there have been renewed calls for measures of the environment relative to disabilities.

The shift in conceptualizing health and function has significant import for disability and rehabilitation research. Albrecht and Devlieger18 introduced the term disability paradox to highlight their finding that a majority of people living with moderate to serious disabilities still reported having an excellent or good QOL. Similarly, although people with disabilities are four times more likely to report their health to be fair or poor, most of the population with disabilities reports their health to be excellent, very good, or good.19 These findings challenge earlier notions that disability is synonymous with poor health and diminished QOL and suggest that diversity among people with disabilities is important to understand. The primary issue is not that health is entirely independent of function but rather that the relationship of HRQOL and function is dynamic, complex, and important to study.11,12 Such research requires measures that differentiate these constructs sufficiently.

This perspective is at odds with that of those who view function as an integral aspect of HRQOL.20 Measures that rely heavily on assessment of function include the Health Utility Index21 and the Medical Outcomes Study Short Form-36 with its physical functioning domain.22 Such measures are perhaps most applicable in situations in which people lose function because of chronic health problems associated with aging. An assumed equivalence between function and HRQOL becomes more problematic when functional limitations are a stable characteristic of an individual because of congenital conditions or injuries.12,23 For example, competitive athletes with functional limitations may be healthier than nonathletic persons without such limitations. However, common measures of HRQOL that include assessment of function would yield lower scores for persons with disabilities solely because of their functional limitations.24

The problems with using current HRQOL measures in people with functional limitations are substantial.12,25 Using established criteria,26 Andresen and Meyers27 reviewed eight of the most popular HRQOL measures and concluded that none of them were free of bias with respect to people with disabilities. First, the conflation of function with HRQOL results in a “functional penalty” of unknown magnitude for persons with different types of functional limitations.12 Horner-Johnson and colleagues24 illustrated this for the Short Form-36, used with people with various functional impairments. Second, inclusion of items that are not sensitive to change (e.g., ability to walk for persons with lower limb paralysis) makes these measures less useful for assessing impact of interventions. Third, respondents with spinal cord injury are known to “accommodate” to the meaning of questions on physical functioning to differing degrees,28,29 leading to measurement error. Efforts to “enable” existing measures by substituting alternative language in functionally biased items25,30 violate requirements for using normative data,26 resulting in uncertainty about the measured constructs. Finally, the fundamental conceptualization of many measures is based on an assumption that function is an essential component of HRQOL, resulting in negative bias in both obvious and more subtle ways.9,23 As a result, there is a need for measures that assess “HRQOL” as distinct from “function” to understand the relation of HRQOL with function, change over time, and the process of adaptation to disability.23

This study’s purpose was to develop a function-neutral measure of HRQOL. The work was conducted in four stages: (1) developing a preliminary conceptual framework and definitions, (2) developing and cognitively testing an initial item pool, (3) evaluating items for functional bias and preliminary factor structure, and (4) conducting a national field test to determine psychometric properties of the new measure. Figure 1 presents the steps of these stages. The institutional review board of Oregon Health & Science University provided oversight for all human participant involvement and the required informed consent procedures.

FIGURE 1 Stages in development of the FuNHRQOL measure. PRoQOLID indicates Patient-Reported Outcome and Quality of Life Instruments Database.

STAGE 1: CONCEPTUALIZING HRQOL THAT IS FUNCTION NEUTRAL

Methods

Participants for Conceptual Development

Expert Panel on Health Measurement. An expert panel of six senior disability researchers from five different universities was recruited to participate in the multiyear process to develop a new HRQOL measure. In addition, four early-career researchers (postdoctoral fellows and new faculty) mentored by a senior expert were invited to participate. The experts were recruited for their expertise in health conceptualization and measurement, disability epidemiology, women’s health, health of people with disabilities, and personal experience living with a disability. The panelists’ combined experience represented more than 150 yrs of scholarly work and more than 450 publications.

Validity Experts on Specific Disabilities. Directors or senior staff of five relevant rehabilitation research and training centers and five national disability advocacy organizations served as validity experts in the specific disabilities of mobility, vision, hearing, mental health, and intellectual disabilities.

Measurement Expertise. The authors recruited additional ad hoc expertise on framing development of an HRQOL measure, response shift in self-report measures of HRQOL, and assessment of environments relative to disability. An expert in instrument development and national field testing for psychometric assessment conducted and provided oversight for analyses.

Procedures

The expert panel worked through conceptual and assessment issues in semiannual 2-day in-person meetings and monthly telephone calls. The role of the expert panel was to remain highly informed about the project and provide ongoing direction, with the specific tasks of agreement on the purpose of the measure and its desired characteristics, definitions for core constructs, conceptual framework (including domains and key concepts within those domains), methods for item development, and final item selection in collaboration with an expert on psychometrics for large test development. The panelists examined the literature on meaning of the terms health status, functional assessment, QOL, and HRQOL and reviewed information on existing HRQOL measures. The staff of the project conducted the tasks of organizing the expert panel meetings, organizational implementation, reviewing existing generic HRQOL measures as contained in the Patient-Reported Outcome and Quality of Life Instruments Database,31 and researching and providing additional information as needed to the expert panel. This included targeted literature searches on topics identified by the expert panel. The validity experts on specific disabilities provided feedback on the draft conceptual framework.

Results

Purpose of the Measure

The expert panel determined that the resulting measure should be a research tool useful at the group level. As such, it was intended to provide more detailed information than population HRQOL survey measures do but not the detail needed for clinical management of individuals. The measure was to be as free of functional content as practicable; be based on self-report; be generically useful for people with and without functional limitations; demonstrate acceptable reliability and validity; and consider age, sex, and racial/ethnic differences in development.

Definition of Constructs

The panel developed working definitions to serve as conceptual reference points during development of the measure. They considered other definitions but ultimately relied substantially on definitions from the World Health Organization15,32 for the following:

Health. A balanced and complete state of physical, mental, and social well-being and not merely the absence of disease or infirmity.

Function. Physical and mental activities whose performance can be directly affected by an underlying impairment or chronic health condition.

Disability. An umbrella term to cover impairments, activity limitations, and participation restrictions.

On the basis of review of the literature and existing measures, the panelists determined to use the following working definition:

Health-Related Quality-of-Life. A broad and multidimensional sense of personal well-being, particularly as it relates to one’s health.

Conceptual Framework

The panel agreed on a preliminary conceptual framework that specified four domains for HRQOL: physical, mental, social, and life satisfaction and beliefs. The first three domains are reflected in the World Health Organization definition of health. The fourth domain, variously regarded as “spiritual health” or “living one’s beliefs,” was determined by the expert panel and the validity experts to be sufficiently important to HRQOL to warrant a distinct domain. In addition, the panel regarded environment as not central to the definition of HRQOL but as an important influence on health and valuable as a potential ancillary scale. To ensure that the core meaning of each domain would be reflected in the item pool, the panel generated initial key concepts for each of the four HRQOL domains and an environment scale (Fig. 2). These key concepts were verified by the external disability experts and subsequently provided guidance for ensuring that the final item pool was sufficiently comprehensive and reflected the intended meaning of each domain.

STAGE 2: DEVELOPING AND COGNITIVELY TESTING THE INITIAL ITEM POOL

Methods

Participants

Item Pool Development. This was accomplished by the expert panel and the validity experts described in stage 1. In addition, senior members from two national centers on disability and cultural competence were recruited to review items for cultural acceptability and interpretation.

Cognitive Testing. Ten adults with mobility, mental health, or sensory disabilities and ten adults with no disability matched for sex and age with the participants with disability were recruited.

Procedures

Item Generation and Expert Screening. The authors drew upon item content from existing HRQOL measures identified through a search of the Patient-Reported Outcome and Quality of Life Instruments Database.31 The scales used in this study were those contained in the Patient-Reported Outcome and Quality of Life Instruments Database in 2006 that met the criteria of self-reported, non-condition-specific tools. This procedure ensured that this study’s meaning of HRQOL would be similar to that of available HRQOL measures aside from excluding items measuring function. The panel members independently rated all items from all measures to identify those considered important to measuring health.9 Ratings were summarized as content validity ratios, a statistical measure of consensus developed by Lawshe33 that ranges from -1.00 to +1.00. With an expert panel of ten members, a minimum content validity ratio of 0.60 is required to meet a 0.05 significance level, meaning agreement by at least eight of ten respondents. Items were then mapped to the conceptual domains. Where items did not fit into a single domain, they were tentatively assigned to new joint domains (e.g., physical/mental). The authors removed duplicates and rewrote all items into a standard format with the stem “During the last four weeks I” followed by specific item content. The 4-wk period allowed sufficient time for less frequently occurring events to be experienced while still supporting accurate recall.34 The items were examined to ensure that each asked only a single question, and language and grammar were simplified so that the measure achieved a fourth-grade reading level on the Flesch-Kincaid readability statistic.35 The content of the items was compared with the domain key concepts to identify potential gaps in content coverage that might require subsequent new item development.
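As a worked illustration of the content validity ratio described above, the following Python sketch computes Lawshe's ratio, CVR = (n_e - N/2) / (N/2), where n_e is the number of panelists who rate an item as important and N is the panel size. Only the formula and the 0.60 cutoff for a ten-member panel come from the text; the function name, the item labels, and the vote counts are hypothetical and are not study data.

```python
def content_validity_ratio(n_important: int, n_panelists: int) -> float:
    """Lawshe's content validity ratio: (n_e - N/2) / (N/2), in [-1, +1]."""
    if n_panelists <= 0:
        raise ValueError("panel size must be positive")
    if not 0 <= n_important <= n_panelists:
        raise ValueError("votes must lie between 0 and the panel size")
    half = n_panelists / 2
    return (n_important - half) / half


PANEL_SIZE = 10      # panel size stated in the text
CVR_CUTOFF = 0.60    # retention threshold stated in the text

# Hypothetical vote counts (number of panelists rating the item as
# important to measuring health); illustrative only.
hypothetical_votes = {"item A": 10, "item B": 8, "item C": 3}

retained = [
    item
    for item, votes in hypothetical_votes.items()
    if content_validity_ratio(votes, PANEL_SIZE) >= CVR_CUTOFF
]
print(retained)
```

With ten panelists, eight votes give (8 - 5) / 5 = 0.60, which is why the cutoff corresponds to agreement by at least eight of ten raters.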

FIGURE 2 Conceptual domains and key concepts of the FuNHRQOL measure and environment scale.


The expert panel members and the validity experts reviewed each item for understandability and for potential penalization for mobility, vision, hearing, communication, or mental processing limitations. The cultural experts reviewed the items for clarity of meaning (did the words have comparable meaning in other cultures?) and cultural appropriateness (would any words be offensive or have unique meaning to another cultural group?). The staff rewrote or removed problematic items. The items from the four HRQOL domains were randomly ordered into a single list, followed by the environment items. A rating-scale format with seven categories of frequency was selected: “never or almost never,” “not often,” “occasionally,” “sometimes,” “often,” “very often,” and “always or almost always,” to reflect approximately equal-interval points based on empirical studies of word scaling.36,37

Cognitive Testing. Cognitive testing was conducted in person or via telephone when preferred. The participants were presented with all items in written format and were asked to answer each item and then respond to its meaning using an adapted cognitive assessment protocol38 that asked about meaning of the item and the frame of reference used when responding.

Results

Item Generation and Expert Screening

Eighty-five non-condition-specific measures were identified in the Patient-Reported Outcome and Quality of Life Instruments Database, comprising 648 individual items that were reviewed for content essential to health (as distinct from function). Of these, 239 items with content validity ratios of 0.60 or greater were retained, indicating agreement across raters that their content was important to health. These items came from 21 measures.9 The retained items largely related to energy, illness, pain, sleep quality, mood, memory and concentration, social roles and relationships, a sense of purpose in life, and life satisfaction. Many excluded items addressed function explicitly (e.g., “I see normally,” “I hear normally,” “Did you have difficulty twisting the lid off a jar,” “walking up stairs,” and “bending, kneeling, or stooping”). Other excluded items seemed more tangential to HRQOL (e.g., “Were you always a cooperative patient?” “How dissatisfied or satisfied are you with the kind and amount of food you eat?” “When someone told you that you were looking better did you become annoyed?” “Were you satisfied with your children?”). After duplicates were eliminated, 120 of the remaining items were assigned to this study’s HRQOL conceptual domains and 24 to the environmental domain. Assignments were made independently by four staff members, with disagreements discussed to reach consensus. These assignments were temporary but were used to ensure adequate coverage of each domain.

Two or more experts identified 20 items as difficult to understand or as potentially having differing meaning across cultures and 32 items as potentially functionally biased. Most difficulties in assessments of bias related to mental processing limitations (e.g., “Were you able to learn new things?” “Was it hard to make decisions?”), and the authors consulted with staff of the Mental Health Research and Training Center in making final determinations. The validity experts provided narrative comments on individual items that were potentially problematic for reasons such as use of language (e.g., “listen” or “quiet” for use with respondents who are deaf), side effects of medication (e.g., ability to sleep, forgetfulness, energy), and physical limitations (e.g., flexibility and strength). The authors deleted seven of the problematic items, divided one item with multiple components into two, and rewrote others to reduce bias and improve clarity. The authors cross-checked the key concepts for the conceptual domains and determined that life satisfaction and beliefs and environment needed a total of six additional items to be generated to reach a minimum of 25 items in each domain. The net result was 120 HRQOL items and 25 environment items for cognitive testing. The lead author of the Craig Hospital Inventory of Environmental Factors17 provided consultation on content for the environment scale.

Cognitive Testing

The respondents reported no difficulties with most items. The authors reworded several items, deleted five for low understandability, and added one item on physical intimacy, as recommended by the respondents.

STAGE 3: EVALUATING ITEMS FOR FUNCTIONAL BIAS AND PRELIMINARY FACTOR STRUCTURE

This step empirically tested whether the items showed bias with respect to individuals with specific functional limitations and provided an initial check on unidimensionality of the domains and on item redundancy.

Participants

Independent living centers, disability advocacy organizations, community-based health and mental health clinics, and college and university disability offices assisted in recruitment. They distributed information about this study through posters, Listservs, and training events. Interested participants were instructed to telephone the project office. The authors recruited healthy adults who had only one of the following limitations: mobility, sensory (significant vision or hearing loss), or mental health (defined as being in treatment with a mental health professional for a psychiatric or emotional condition) or no limitation. For mobility limitation, the authors specifically recruited persons with spinal cord injury to maximize the likelihood of preinjury good health and minimize the likelihood that a preexisting chronic condition such as diabetes or emphysema had resulted in mobility limitation. To reduce the potential confounding of poor health with functional limitations, all participants for the functional bias testing phase needed to report their health to be excellent, very good, or good; report not having any of the major chronic conditions; and respond affirmatively to having a specific disability or none for the no-disability group. Potential participants for this phase were screened for health problems through telephone administration of a chronic conditions checklist adapted from the National Population Health Survey (Statistics Canada),39 Cycle 6 and excluded for any of the following: arthritis or rheumatism interfering with daily life; asthma requiring a trip to the emergency department in the past 12 mos; chronic bronchitis, emphysema, or chronic obstructive pulmonary disease; diabetes requiring insulin; epilepsy; heart disease; post-stroke impairment; Alzheimer disease or other dementia; current or past cancer; chronic fatigue syndrome; or multiple chemical sensitivities.
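The screening rules above amount to a simple eligibility predicate. The following Python sketch is one hypothetical encoding of it; the function name, the string labels for conditions and limitations, and the data layout are assumptions made for illustration and do not reproduce the study's telephone checklist.

```python
# Labels are hypothetical shorthand for the exclusion conditions listed in the text.
EXCLUDED_CONDITIONS = {
    "arthritis or rheumatism interfering with daily life",
    "asthma with emergency visit in past 12 months",
    "chronic bronchitis, emphysema, or COPD",
    "diabetes requiring insulin",
    "epilepsy",
    "heart disease",
    "post-stroke impairment",
    "Alzheimer disease or other dementia",
    "current or past cancer",
    "chronic fatigue syndrome",
    "multiple chemical sensitivities",
}
ACCEPTED_HEALTH_RATINGS = {"excellent", "very good", "good"}
RECRUITED_LIMITATIONS = {"mobility", "sensory", "mental health"}


def eligible_for_bias_testing(self_rated_health, conditions, limitations):
    """Return True if a respondent meets the stage 3 criteria as paraphrased here.

    Requires good-or-better self-rated health, none of the excluded chronic
    conditions, and either no limitation or exactly one recruited limitation.
    """
    if self_rated_health.strip().lower() not in ACCEPTED_HEALTH_RATINGS:
        return False
    if set(conditions) & EXCLUDED_CONDITIONS:
        return False
    reported = set(limitations)
    return len(reported) <= 1 and reported <= RECRUITED_LIMITATIONS


# Usage with made-up respondents
print(eligible_for_bias_testing("Very good", [], ["mobility"]))
print(eligible_for_bias_testing("good", ["epilepsy"], []))
```

Treating the no-limitation group as an empty set keeps the "only one limitation or none" rule in a single comparison.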

Procedures

Potential participants were screened for eligibility and then were mailed questionnaire packets with the item pool and demographic questionnaire. The respondents with substantial missing data were recontacted to obtain more complete information. Identifying information was separated from assessment information for data entry, and 20% were verified for entry accuracy. The participants were sent $20 gift cards.

Data Analyses

To allow adequate sample size for analyses, education was collapsed into high school or less compared with at least some college, race was collapsed into white compared with other races, and marital status was collapsed into married or living together compared with all other groups. Age was analyzed by analysis of variance, and χ² tests were used for all other comparisons. Because the initial analyses revealed differences between people with vision and hearing limitations, subsequent analyses were conducted separating sensory disabilities into these two groups.

SAS40 9.12 was used to calculate partial correlations to identify differential item functioning (DIF). This procedure is suggested by Reynolds41 and has been used in the development of major published tests.42 It compares one group of respondents’ scores on an item with another group’s scores on that item while controlling for the overall level of each individual on the measured domain. Tukey post hoc tests were used to identify differences between specific disability groups and the no-disability group. This resulted in the comparison of four functional limitation groups (vision, hearing, mobility, and mental health) with the no-disability group on each item while controlling for each individual’s total score on a given domain. Negative correlations indicated higher item scores for the no-disability group, implying negative DIF for the functional limitation group.

The Statistical Package for the Social Sciences43 16.0 was used to conduct analyses for an initial examination of the measure’s factor structure and the functioning of individual items. Exploratory factor analyses were conducted using both principal components analysis with varimax rotation (noncorrelated factors) and the maximum likelihood factor analysis with promax rotation (correlated factors) to explore preliminary factor structure, initial internal consistency (Cronbach’s alpha) of each domain scale, and item-total correlations of each item with remaining items in that domain.
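The partial-correlation approach to DIF can be sketched in a few lines of Python with NumPy. This is a minimal illustration under stated assumptions, not the authors' SAS procedure: the group indicator is coded 1 for a functional limitation group and 0 for the no-disability group, only the domain total score is partialled out (the study also controlled for demographic variables), and the data are simulated purely for demonstration.

```python
import numpy as np


def partial_correlation(item, group, total):
    """Correlation of item score with group membership, controlling for total.

    Both variables are regressed on the domain total score (with intercept),
    and the residuals are correlated.
    """
    item = np.asarray(item, dtype=float)
    group = np.asarray(group, dtype=float)
    total = np.asarray(total, dtype=float)
    design = np.column_stack([np.ones_like(total), total])

    def residuals(y):
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        return y - design @ coef

    return float(np.corrcoef(residuals(item), residuals(group))[0, 1])


# Simulated data (hypothetical, for illustration only)
rng = np.random.default_rng(0)
n = 400
group = rng.integers(0, 2, size=n)       # 1 = limitation group, 0 = no disability
total = rng.normal(50.0, 10.0, size=n)   # domain total score
fair_item = 0.1 * total + rng.normal(0.0, 1.0, size=n)
biased_item = fair_item - 1.5 * group    # limitation group penalized at equal totals

r_fair = partial_correlation(fair_item, group, total)
r_biased = partial_correlation(biased_item, group, total)
print(round(r_fair, 2), round(r_biased, 2))
```

Under this coding, a clearly negative partial correlation flags an item on which the limitation group scores lower than the no-disability group at the same overall domain level, matching the sign convention stated in the text.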


Results

This sample included 36 adults with no limitations, 54 with mobility limitation, 25 with vision loss, 23 with hearing loss, and 68 with mental health disability. The mean age of the participants was 43.4 yrs (SD, 13.1; range, 18–75). Demographic analyses revealed group differences on age, sex, race/ethnicity, and employment status (Table 1), and the authors controlled for these variables in the DIF analyses. The authors’ recruitment method did not allow for assessment of response rate. Preliminary factor analyses were supportive of a four- or five-factor structure. The DIF analyses identified nine items with potential functional bias, which were eliminated. Examples of items eliminated because of DIF scores included “Did you feel emotionally supported?” “Were you treated unfairly because of who you are?” “Did you have physical intimacy in your life?” “Did your physical health limit daily activities?” and “Were you in control of your health?” Fourteen items were eliminated because these were highly redundant with other items, ten items were eliminated from the mental health scale for poor factor loading, and three items were eliminated because these were flagged previously during cultural review. These results are presented in Table 2. The items retained through stage 3 for field testing included 18 physical health, 25 mental health, 27 social health, 11 life satisfaction and beliefs, and 24 environment items.

TABLE 1 Demographic characteristics of the participants in functional bias testing of HRQOL: healthy adults with specific functional limitations (N = 206)

| Characteristic | No Limitation | Mobility | Vision | Hearing | Mental Health | P |
|---|---|---|---|---|---|---|
| n | 36 | 54 | 25 | 23 | 68 | |
| Female, n (%) | 22 (61%) | 20 (37%) | 13 (52%) | 15 (65%) | 43 (63%) | <0.05ᵃ |
| Age, mean (SD) | 39.86 (1.86) | 46.31 (10.70) | 52.84 (1.94) | 39.61 (1.56) | 40.88 (1.93) | <0.05ᵇ |
| Education | | | | | | NSᵃ |
| Lower than HS | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) | 4 (6%) | |
| HS/GED | 2 (6%) | 4 (7%) | 4 (16%) | 3 (13%) | 10 (15%) | |
| Some college/associate degree | 17 (47%) | 16 (30%) | 11 (44%) | 10 (44%) | 32 (47%) | |
| College graduate | 13 (36%) | 17 (31%) | 6 (24%) | 7 (30%) | 17 (25%) | |
| Graduate degree | 4 (11%) | 17 (26%) | 4 (16%) | 3 (13%) | 5 (7%) | |
| Race | | | | | | NSᵃ |
| African American | 4 (13%) | 1 (2%) | 4 (16%) | 1 (5%) | 2 (3%) | |
| Asian | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) | 1 (2%) | |
| White | 24 (80%) | 52 (96%) | 21 (84%) | 18 (95%) | 60 (92%) | |
| Native American | 2 (7%) | 1 (2%) | 0 (0%) | 0 (0%) | 2 (3%) | |
| Ethnicity: Hispanic/Latino | 3 (10%) | 13 (24%) | 6 (24%) | 7 (37%) | 27 (42%) | <0.05ᵃ |
| Marital status | | | | | | NSᵃ |
| Married | 8 (37%) | 19 (35%) | 15 (60%) | 8 (42%) | 11 (17%) | |
| Never married | 15 (50%) | 18 (33%) | 5 (20%) | 6 (32%) | 33 (51%) | |
| Divorced | 3 (10%) | 11 (20%) | 2 (8%) | 2 (11%) | 13 (20%) | |
| Widowed | 1 (3%) | 1 (2%) | 2 (8%) | 1 (5%) | 2 (3%) | |
| Separated | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) | 2 (3%) | |
| Living together | 3 (10%) | 5 (9%) | 1 (4%) | 2 (11%) | 4 (6%) | |
| Employed | 23 (74%) | 31 (60%) | 11 (44%) | 14 (73%) | 27 (41%) | <0.05ᵃ |

ᵃ χ² test. ᵇ Analysis of variance test. GED indicates General Educational Development; HS, high school; NS, not significant.
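The retained item counts reported for stage 3 follow from subtracting the eliminations from the initial pool. The short sketch below re-derives them from the counts transcribed from Table 2; the dictionary layout and the function name `retained_count` are illustrative choices, not part of the study.

```python
# Item counts transcribed from Table 2: initial pool and eliminations by reason.
table2 = {
    "Physical": {"initial": 22, "redundant": 1, "dif": 2, "low_loading": 0, "culture": 1, "retained": 18},
    "Mental health": {"initial": 44, "redundant": 7, "dif": 1, "low_loading": 10, "culture": 1, "retained": 25},
    "Social health": {"initial": 35, "redundant": 1, "dif": 6, "low_loading": 0, "culture": 1, "retained": 27},
    "Life satisfaction and beliefs": {"initial": 16, "redundant": 5, "dif": 0, "low_loading": 0, "culture": 0, "retained": 11},
    "Environment": {"initial": 24, "redundant": 0, "dif": 0, "low_loading": 0, "culture": 0, "retained": 24},
}


def retained_count(row):
    """Initial items minus every item eliminated for any of the four reasons."""
    eliminated = row["redundant"] + row["dif"] + row["low_loading"] + row["culture"]
    return row["initial"] - eliminated


# Every domain's reported "retained" value should match the subtraction.
mismatches = [d for d, row in table2.items() if retained_count(row) != row["retained"]]

# The four HRQOL domains (everything except Environment) form the Total row.
hrqol_retained = sum(retained_count(row) for d, row in table2.items() if d != "Environment")

print(mismatches)       # domains whose reported count disagrees with the arithmetic
print(hrqol_retained)   # retained HRQOL items across the four domains
```

Under these transcribed counts the four HRQOL domains sum to the 81 retained items reported in the Total row, with the 24 environment items carried forward unchanged.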

TABLE 2 Number of items eliminated because of redundancy, functional bias, low factor loadings, and cultural concerns

| Domain | Initial Items | Redundant | DIF | Low Factor Loading | Culture Concerns | Retained |
|---|---|---|---|---|---|---|
| Physical | 22 | 1 | 2 | 0 | 1 | 18 |
| Mental health | 44 | 7 | 1 | 10 | 1 | 25 |
| Social health | 35 | 1 | 6 | 0 | 1 | 27 |
| Life satisfaction and beliefs | 16 | 5 | 0 | 0 | 0 | 11 |
| Total | 117 | 14 | 9 | 10 | 3 | 81 |
| Environment | 24 | 0 | 0 | 0 | 0 | 24 |

STAGE 4: CONDUCTING A NATIONAL FIELD TEST TO DETERMINE PSYCHOMETRIC PROPERTIES

The national field test was intended to (1) test the conceptual framework, (2) determine final item selection, (3) demonstrate the internal consistency of domain and total scales, (4) assess convergent validity, (5) demonstrate divergent validity, (6) assess known-groups validity, and (7) demonstrate test-retest reliability and correspondence with self-reported changes in health.

Methods

Participants for Reliability and Validity Assessment

State offices on disability and health, independent living centers, disability advocacy organizations, and college and university disability offices in the four regions of the United States assisted in recruitment. They distributed notices through their Listservs, health fairs, and training events. Adults with different disabilities were recruited to approximate general population demographics of the country. The sample recruitment plan was to include 175 participants from each of the four United States Census Bureau regions (Northeast, Midwest, South, and West), with approximately equal distribution by sex, approximately half older and half younger than 45 yrs, approximately half with a high school education or less, and approximately 30% of minority race/ethnicity. All participants were required to be 18 yrs or older, read English, and experience one or more disabilities but were not screened for health conditions. The authors closely monitored the demographic characteristics of the participants as recruitment proceeded to ensure that the sample distribution approximated their target distribution.

Participants for Test-Retest Reliability

For test-retest reliability assessment, the authors recruited a separate sample of adults with disabilities primarily from the Northwest region of the country. Approximately equal numbers of women and men with a range in ages were enrolled. To ensure variation in health, the authors recruited adults with disabilities who had been excluded from functional bias testing because of known health problems and recruited participants through exercise programs to reach people likely to be in good to excellent health.

Measures

Field test participants completed the following measures:

Demographic and Basic Health Questionnaire. This assessed sex, age, race, ethnicity, education, employment, marital status, disability status as measured by items from the Behavioral Risk Factor Surveillance System and the American Community Survey, and self-rated overall health (excellent, very good, good, fair, or poor). Responses to the latter item were reverse scored, such that higher scores indicated more positive self-rated health.

Function-Neutral Health-Related Quality of Life Item Pool. The Function-Neutral Health-Related Quality of Life (FuNHRQOL) pool consisted of the items retained from previous steps (81 items in four HRQOL domains and 24 environment items).

Activities of Daily Living/Instrumental Activities of Daily Living Limitations. Need for assistance with activities of daily living and instrumental activities of daily living (ADLs/IADLs) was assessed using 14 yes/no items from the National Health Interview Survey–Disability Supplement.44 Summed scores reflected the total number of items endorsed, with higher scores indicating more limitations.

Inventory of Chronic Conditions. Chronic conditions are noninfectious conditions of long duration experienced by the general population that impact health, typically require medical management, and can impact functioning. Endorsements on a list of 23 conditions drawn from the Statistics Canada survey39 were summed so that higher scores reflected more chronic conditions (see list in stage 3: functional bias testing).

Secondary Conditions Checklist. Secondary conditions were considered as those health conditions that are unique to or experienced more frequently in relation to a preexisting disability. A 17-item checklist was developed on the basis of work by Seekins and colleagues45 as adapted by others.46 Common secondary conditions were listed, and the respondents indicated the severity of each condition from 0, as not experienced, to 3, as a significant problem: too high or too low blood pressure, poor circulation, contractures, diabetes, fatigue, injuries, osteoporosis, pressure sores, alcohol or other drug overuse/abuse, muscle spasms, urinary tract infections, pneumonia, repetitive motion pain, weight problems, chronic pain, stomach problems, and constipation or bowel problems. Scores reflect the number and the severity of conditions, with higher scores indicating greater problems.

Craig Hospital Inventory of Environmental Factors, Version 3.0. The Craig Hospital Inventory of Environmental Factors17 assesses environmental barriers in five key areas: accessibility, accommodation, resource availability, social support, and equality. It measures how often respondents have encountered 25 barriers and whether the barrier was a big or little problem. Scoring includes the number and the severity of problems, with higher scores indicating greater barriers.
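The scoring rules for three of the comparison measures can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, the secondary conditions score is assumed to be the plain sum of the 17 severity ratings (the text says only that scores reflect number and severity), and the 1-to-5 coding of self-rated health is assumed, with the direction (higher is better) taken from the text.

```python
def score_adl_iadl(responses):
    """Count endorsed items among the 14 yes/no ADL/IADL items (higher = more limitations)."""
    if len(responses) != 14:
        raise ValueError("expected 14 yes/no responses")
    return sum(1 for endorsed in responses if endorsed)


def score_secondary_conditions(severities):
    """Sum 17 severity ratings, each 0 (not experienced) to 3 (significant problem).

    Assumption: a simple sum is used as the score.
    """
    if len(severities) != 17:
        raise ValueError("expected 17 severity ratings")
    if any(s not in (0, 1, 2, 3) for s in severities):
        raise ValueError("each rating must be 0, 1, 2, or 3")
    return sum(severities)


# Assumed numeric coding after reverse scoring: poor = 1 ... excellent = 5.
SELF_RATED_HEALTH = ["poor", "fair", "good", "very good", "excellent"]


def score_self_rated_health(label):
    """Return the reverse-scored value so that higher scores mean better health."""
    return SELF_RATED_HEALTH.index(label.lower()) + 1


# Hypothetical respondent, not study data.
adl_score = score_adl_iadl([True] * 5 + [False] * 9)
secondary_score = score_secondary_conditions([0, 2, 0, 0, 3, 1] + [0] * 11)
health_score = score_self_rated_health("Very good")
```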

Measures for Test-Retest Reliability

The test-retest participants completed demographic information and the final FuNHRQOL items (42 HRQOL, 15 environment) during the first administration. On the second administration, the FuNHRQOL items were repeated, accompanied by a single question asking whether the respondent had any major changes in health during the past month.

Procedures

The field-test participants received a paper survey packet containing the measures, a respondent contact information form, and a postage-paid return envelope. Data handling and verification procedures were similar to those for functional bias testing. The process of previewing responses for completeness and recontacting respondents with substantial amounts of missing data resulted in very few missing data (e.g., item response theory [IRT] analyses were conducted on 674 rather than the full 686 respondents).

The test-retest participants received the FuNHRQOL survey items with instructions to return the completed survey within 1 wk. To achieve a 4-wk test-retest interval, the authors mailed the second survey 3 wks after receipt of the first survey and again instructed the participants to return it within 1 wk. The participants were sent $10 gift cards for completing the first survey and $15 for completing the second survey. Data entry and checking procedures matched those used for the field test.

Data Analyses

Field Test. Field test analyses were conducted in three steps: (1) preliminary analyses to make the final item selection, (2) assessment of the final item set’s factor structure and scale characteristics, and (3) validity assessments.

First, for all analyses, negatively worded FuNHRQOL items were reverse scored; thus, higher scores indicated better HRQOL. Higher scores on the environment scale also indicated better (more facilitative) environments. Analyses were conducted using the Statistical Package for the Social Sciences 16.0 unless otherwise specified. Respondent data were analyzed within each of the four HRQOL domains and the environment domain. The Statistical Package for the Social Sciences 16.0 was used to calculate internal consistency (Cronbach’s alpha); item-total correlation of each item with remaining items in that domain; exploratory factor analyses using both principal components analysis with varimax rotation (noncorrelated components) and maximum likelihood factor analysis with promax rotation (correlated factors); and partial correlations to assess DIF related to race, ethnicity, or sex. IRT analysis was conducted using the WINSTEPS program47 for polytomous rating scale items. For IRT analyses, the authors scaled the measure or “difficulty” score to a center of 500 and expansion factor of 9.1, allowing alignment of the IRT scales across domains of the questionnaire. The collective findings of all analyses guided the final item selection for each domain.

Second, data from the final item set were analyzed within each of the four conceptualized HRQOL domains of the FuNHRQOL. Confirmatory factor analysis in LISREL48 8.8 was used to evaluate the fit of the four-domain conceptual model of HRQOL to the data. Model fit statistics were compared for models with two, three, and the hypothesized four factors. This approach has been used in validating the structure of measures such as the Stanford-Binet Intelligence Scale, Fifth Edition.42 The authors also created a composite scale combining the four HRQOL domains and examined internal consistency (Cronbach’s alpha) of the domains and the combined scale.

Third, mean domain and composite FuNHRQOL scores were calculated and converted to standardized scores (scaled with mean of 50; SD, 10). To test convergent validity, these scores were correlated with three assessments of health: self-rated overall health (5-point scale), number of chronic conditions, and secondary conditions score. The correlation between the environment domain of the FuNHRQOL and the Craig Hospital Inventory of Environmental Factors was also examined. For assessment of divergent validity, FuNHRQOL scores were correlated with a measure of function: total number of ADLs/IADLs. Correlations were graded26 as weak (<0.30), moderate (0.30–0.59), or strong (≥0.60). To evaluate the FuNHRQOL’s ability to distinguish between known groups, responses to the self-rated health item were recoded into two categories (excellent/very good/good, and fair/poor), and these groups were compared using t tests.

Final Item Reduction. The expert panel examined items by domain to retain those with the best measurement characteristics and eliminate items with obvious problems. They simultaneously considered domain-specific factor loadings, item-total correlations, IRT statistics, and DIF analyses of items in the pool. In addition to evaluating statistical information, they considered item content to retain items identified as important to the conceptual model while minimizing redundancy.
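Two of the item-level statistics named in the data analyses, Cronbach’s alpha and the correlation of each item with the remaining items in its domain, can be illustrated with a minimal sketch. The authors computed these in the Statistical Package for the Social Sciences; the pure-Python functions and the response matrix here are hypothetical and serve only to show the calculations.

```python
from statistics import mean, variance


def cronbach_alpha(rows):
    """Cronbach's alpha; rows is a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def corrected_item_total(rows, i):
    """Correlate item i with the sum of the remaining items in the domain."""
    item = [row[i] for row in rows]
    rest = [sum(row) - row[i] for row in rows]
    return pearson(item, rest)


# Hypothetical responses (5 respondents x 4 items, scored 1-5); not study data.
responses = [
    [4, 5, 4, 3],
    [2, 2, 3, 2],
    [5, 4, 5, 4],
    [3, 3, 2, 3],
    [1, 2, 1, 2],
]
alpha = cronbach_alpha(responses)
item_total = [corrected_item_total(responses, i) for i in range(4)]
print(round(alpha, 2), [round(r, 2) for r in item_total])
```

Excluding the item itself from the total keeps an item from inflating its own correlation, which matters most in short domains.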

Test-Retest Reliability

For test-retest reliability, intraclass correlation coefficients (ICCs) were calculated for the FuNHRQOL domains and the composite score using two-way mixed models. In addition, the sample was dichotomized according to the item on the second survey that asked whether the respondent’s health had changed substantially, and ICCs for each group were calculated. The authors also used independent samples t tests to compare amount of change in HRQOL scores (absolute value) between the groups.
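A two-way mixed ICC can be computed from the mean squares of a respondents-by-occasions table. The text does not state whether the consistency or absolute-agreement form was used, or whether single or average measures were reported, so the sketch below assumes the single-measure consistency form, often written ICC(3,1). The function name and the scores are hypothetical.

```python
def icc_consistency(scores):
    """ICC(3,1): two-way mixed effects, consistency, single measurement.

    scores: one row per respondent, one column per administration.
    Assumption: this ICC form is chosen for illustration only.
    """
    n = len(scores)
    k = len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between respondents
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between occasions
    ss_err = ss_total - ss_rows - ss_cols                     # residual

    ms_rows = ss_rows / (n - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)


# Hypothetical standardized scores at time 1 and time 2; not study data.
retest = [[50, 52], [40, 41], [60, 57], [45, 48], [55, 54]]
icc = icc_consistency(retest)
print(round(icc, 2))
```

Because the consistency form removes the occasion effect, a uniform shift in everyone’s score between administrations leaves the coefficient unchanged; an absolute-agreement ICC would penalize it.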

RESULTS

Participants

Field Test

The field test sample included 723 individuals, of whom 37 were excluded because they answered negatively to all self-reported disability identification questions, resulting in data from 686 participants for analysis. According to responses to the American Community Survey functional limitation identification items, 26% endorsed a sensory disability, 73% endorsed a physical disability, and 52% endorsed a mental disability (totaling greater than 100% because the participants could report multiple functional limitations). Characteristics of the field test sample and the test-retest samples are included in Table 3.

The Test-Retest Sample

A total of 128 people completed the measure at two time points (see Table 3 for demographic characteristics). Only 14% were employed full time, 18% were employed part time, 17% were students, 15% were retired, and 36% were otherwise not employed. With regard to health, 10% described their health as excellent; 29%, very good; 32%, good; 22%, fair; and 7%, poor.

Field Test Findings

Table 4 presents findings on multiple criteria for assessing two related items to illustrate the use of findings in making the final item selection. These are two items with similar content on physical pain. Although both had adequate factor loadings on the physical domain, the first item (pain coping) had a low item-total correlation, high IRT information-weighted fit statistic and outlier-sensitive fit statistic, and significant negative DIF score for African American respondents and was removed. Conversely, the second item (prevented participation) performed well on all measures and was retained. A similar process was used to examine findings for each item to determine the final item set for the measure. Forty-two HRQOL items (13 items in physical health, 11 in mental health, 10 in social health, and 8 in life satisfaction and beliefs), plus 15 items in the environmental scale, were retained.
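The statistical part of this screening can be sketched with the two example items and the cutoffs stated in the Table 4 footnotes: a fit range of 0.50 to 1.50 for INFIT and OUTFIT, and DIF correlations stronger than −0.12. No cutoff for item-total correlation is given in the text, so none is applied here. The class and function names are hypothetical, and the expert panel also weighed item content, which this sketch does not attempt to model.

```python
from dataclasses import dataclass
from typing import List, Optional

FIT_LOW, FIT_HIGH = 0.50, 1.50   # acceptable INFIT/OUTFIT range (Table 4 footnote)
DIF_CUTOFF = -0.12               # partial correlations below this suggest possible bias


@dataclass
class ItemStats:
    label: str
    item_total: float
    loading: float
    infit: float
    outfit: float
    dif_race: Optional[float]    # None when the DIF correlation was not significant


def flags(item: ItemStats) -> List[str]:
    """Return the statistical criteria on which an item falls outside the stated limits."""
    out = []
    if not FIT_LOW <= item.infit <= FIT_HIGH:
        out.append("INFIT")
    if not FIT_LOW <= item.outfit <= FIT_HIGH:
        out.append("OUTFIT")
    if item.dif_race is not None and item.dif_race < DIF_CUTOFF:
        out.append("DIF")
    return out


# Values transcribed from Table 4.
pain_coping = ItemStats("Did you cope with the physical pain you experienced?",
                        0.24, 0.49, 1.54, 1.96, -0.15)
pain_prevented = ItemStats("Did physical pain prevent you from doing what you wanted to do?",
                           0.69, 0.54, 0.87, 0.89, None)

for item in (pain_coping, pain_prevented):
    print(item.label, flags(item))
```

With these inputs the removed item is flagged on all three criteria and the retained item on none, matching the decisions described in the text.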

TABLE 3 Demographic characteristics of the participants for national field test and test-retest

| Characteristic | Field Test | Test-Retest |
|---|---|---|
| n | 686 | 128 |
| Female, n (%) | 397 (58.6%) | 65 (50.8%) |
| Age, mean (SD) [range] | 47.6 (15.6) [18–94] | 45.2 (13.3) [18–80] |
| Education | | |
| Lower than HS | 111 (16.6%) | 9 (7.0%) |
| HS/GED | 150 (22.4%) | 18 (14.1%) |
| Some college/associate degree | 249 (37.1%) | 48 (38.5%) |
| College graduate | 93 (13.9%) | 27 (21.1%) |
| Graduate degree | 67 (10.0%) | 26 (20.3%) |
| Race | | |
| African American | 98 (14.5%) | 1 (0.8%) |
| Asian | 13 (1.9%) | 3 (2.4%) |
| White | 477 (70.8%) | 107 (84.3%) |
| Native American | 9 (1.3%) | 3 (2.4%) |
| Other/mixed | 77 (11.4%) | 14 (11.1%) |
| Ethnicity: Hispanic/Latino | 77 (11.4%) | 8 (6.3%) |
| Limitation | | |
| Sensory | 26% | |
| Mobility | 73% | |
| Mental | 52% | |
| Self-rated health (fair/poor), n (%) | 334 (49.3%) | 49 (38.9%) |
| Chronic conditions, mean (SD) [range] | 4.49 (2.79) [0–16] | |
| Secondary conditions, mean (SD) [range] | 12.84 (8.63) [0–48] | |
| ADLs/IADLs limitations, mean (SD) [range] | 4.85 (4.43) [0–14] | |

GED indicates General Educational Development; HS, high school.

TABLE 4 Example of final item selection with data from field test respondents to mailed survey

| Physical Domain Items | Item-Total Correlation | Factor Loading (Physical) | IRT Measureᵃ | IRT INFITᵇ | IRT OUTFITᶜ | DIFᵈ for Race |
|---|---|---|---|---|---|---|
| Did you cope with the physical pain you experienced? (removed) | 0.24 | 0.49 | 496.00 | 1.54 | 1.96 | −0.15 |
| Did physical pain prevent you from doing what you wanted to do? (retained) | 0.69 | 0.54 | 500.54 | 0.87 | 0.89 | NS |

ᵃ IRT indicates item “difficulty,” with a lower IRT indicating that the item is more readily endorsed. IRT measure was centered on 500 with an expansion factor of 9.1.
ᵇ INFIT (information-weighted fit statistic) is an index of model-to-data fit. It should ideally center on 1.00 with a range from 0.50 to 1.50. Values outside this range indicate overly discriminating items or inconsistent response patterns.
ᶜ OUTFIT (outlier-sensitive fit statistic) is another index of model-to-data fit with parameters similar to those for INFIT.
ᵈ DIF is assessed by partial correlations of each item with a dummy code for historically marginalized groups vs. culturally dominant groups, whereas total score on the relevant domain is held constant. In the field test sample, correlations stronger than −0.12 (significant at P ≤ 0.01) were considered an indication of possible bias against a marginalized group.

Structure of the Measure

The fit statistics improved with progression from the simple two-factor model to a more complex model. Table 5 provides information on fit statistics for each model. Model improvement was demonstrated by the successive reduction in χ² values, root mean square error of approximation, and expected cross-validation index and increases in goodness of fit. On the basis of these indices, the best fit was the four-factor model proposed for the FuNHRQOL (physical health, mental health, social health, and life satisfaction and beliefs). Factor loadings of items on their assigned domains in the four-factor model are shown in the Appendix. The correlations between scales were the following: physical/social = 0.482, physical/mental = 0.556, physical/values = 0.510, social/mental = 0.761, social/life satisfaction and beliefs = 0.832, and mental/life satisfaction and beliefs = 0.817.
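The χ²/df ratios in Table 5 can be reproduced directly from the reported χ² values and degrees of freedom. The sketch below does only that arithmetic; the dictionary name `fit` is illustrative, and no statistics beyond those in the table are assumed.

```python
# (chi-square, degrees of freedom) for each model, transcribed from Table 5.
fit = {
    "two-factor": (3477.5, 818),
    "three-factor": (3159.4, 816),
    "four-factor": (3046.2, 813),
}

# Ratio of chi-square to df, rounded to two decimals as in the table.
ratios = {model: round(chi2 / df, 2) for model, (chi2, df) in fit.items()}

# Lower ratios indicate better fit, so the smallest ratio marks the preferred model.
best = min(ratios, key=ratios.get)

print(ratios)
print(best)
```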

TABLE 5 Fit statistics for each of three models of the dimensions measured by the FuNHRQOL measureᵃ

| Model | χ²ᵇ | dfᶜ | χ²/dfᵈ | GFIᵉ | RMSEAᶠ | ECVIᵍ |
|---|---|---|---|---|---|---|
| Two-factorʰ | 3477.5 | 818 | 4.25 | 0.72 | 0.074 | 6.83 |
| Three-factorⁱ | 3159.4 | 816 | 3.87 | 0.75 | 0.072 | 6.24 |
| Four-factorʲ | 3046.2 | 813 | 3.75 | 0.76 | 0.072 | 6.04 |

ᵃ Based on covariance matrix among 42 variables in a sample of N = 686.
ᵇ χ² statistic (lower values indicate better fit).
ᶜ Degrees of freedom.
ᵈ χ² divided by df.
ᵉ Goodness-of-fit index (higher values indicate better fit).
ᶠ Root mean square error of approximation (lower values indicate better fit).
ᵍ Expected cross-validation index (smaller values indicate greater potential for replication of the model).
ʰ Factors were (1) physical HRQOL items and (2) all other HRQOL items.
ⁱ Factors were (1) physical HRQOL, (2) social HRQOL, and (3) mental HRQOL + life satisfaction and beliefs.
ʲ Factors were (1) physical, (2) social, (3) mental, and (4) life satisfaction and beliefs.
ECVI indicates expected cross-validation index; GFI, goodness of fit; RMSEA, root mean square error of approximation.

Internal Consistency

All domains and the composite FuNHRQOL demonstrated excellent internal consistency reliability. Cronbach’s α values (0.88–0.97) are shown in Table 6.

Convergent, Divergent, and Known-Groups Validity

The correlations of each of the FuNHRQOL scales with self-rated health, chronic conditions, secondary conditions, and ADLs/IADLs are shown in Table 6. The composite FuNHRQOL and its individual domains had moderate to strong positive correlations with self-rated health. There were low to moderate negative correlations with number of chronic conditions, whereas correlations with the secondary conditions scale were negative and moderate to strong. Conversely, correlations with number of ADLs/IADLs limitations were negative but weak. The environment domain of the FuNHRQOL showed a moderately strong negative correlation (r = −0.56) with the Craig Hospital Inventory of Environmental Factors. Table 7 presents the sample’s mean standardized scores when the sample was dichotomized on self-rated health status. People with self-rated fair/poor health scored significantly lower on each of the individual domains and the combined HRQOL score of the FuNHRQOL. Effect sizes (Cohen’s d) ranged from 0.77 to 1.76.

TABLE 6 Internal consistency and convergent and divergent validity of the FuNHRQOL measure (correlationsᵃ)

| Domain | No. Items | Cronbach’s α | Self-rated Healthᵇ | No. Chronic Conditions | Secondary Conditions | ADLs/IADLs Limitations |
|---|---|---|---|---|---|---|
| Physical health | 13 | 0.97 | 0.74 | −0.51 | −0.70 | −0.25 |
| Mental health | 11 | 0.90 | 0.42 | −0.29 | −0.52 | −0.14 |
| Social health | 10 | 0.89 | 0.40 | −0.29 | −0.44 | −0.12 |
| Life satisfaction and beliefs | 8 | 0.90 | 0.41 | −0.23 | −0.41 | −0.13 |
| Composite HRQOL | 42 | 0.97 | 0.61 | −0.41 | −0.64 | −0.20 |
| Environment | 15 | 0.88 | 0.44 | −0.36 | −0.55 | −0.16 |

ᵃ All correlations significant at 0.01 or less. Pearson correlations unless otherwise noted.
ᵇ Spearman correlations because self-rated health was considered an ordinal variable.
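The grading rule given in the data analyses (weak below 0.30, moderate from 0.30 to 0.59, strong at 0.60 or above) can be applied to the Table 6 correlations as a quick check on the verbal summary. The sketch assumes that the grades refer to the absolute size of the correlation regardless of sign; the function name `grade` is hypothetical.

```python
def grade(r):
    """Grade a correlation by magnitude: weak (<0.30), moderate (0.30-0.59), strong (>=0.60)."""
    size = abs(r)  # assumption: the sign is ignored when grading strength
    if size < 0.30:
        return "weak"
    if size < 0.60:
        return "moderate"
    return "strong"


# Correlations transcribed from Table 6, in the order: physical, mental, social,
# life satisfaction and beliefs, composite, environment.
table6 = {
    "Self-rated health": [0.74, 0.42, 0.40, 0.41, 0.61, 0.44],
    "No. chronic conditions": [-0.51, -0.29, -0.29, -0.23, -0.41, -0.36],
    "Secondary conditions": [-0.70, -0.52, -0.44, -0.41, -0.64, -0.55],
    "ADLs/IADLs limitations": [-0.25, -0.14, -0.12, -0.13, -0.20, -0.16],
}

grades = {measure: [grade(r) for r in values] for measure, values in table6.items()}
for measure, labels in grades.items():
    print(measure, sorted(set(labels)))
```

Graded this way, every correlation with ADLs/IADLs limitations falls in the weak band, which is the pattern the authors interpret as evidence of divergent validity.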

Test-Retest Reliability Findings

The median time between completion of test and retest surveys was 26 days, with 99% falling between 20 and 39 days. The scores at the two times were highly correlated for the full sample (Table 8). In response to the question about a change in health status, 21% of the respondents indicated that their health had changed substantially in the time between test and retest administrations. For those who said that their health had not changed substantially, all ICCs were 0.85 or higher. Among those whose health had changed (either improved or deteriorated), reliability was more than 0.80 for physical health, life satisfaction and beliefs, and composite FuNHRQOL score, with lower test-retest ICC for mental health, social health, and environment. Differences in ICCs between the two groups were not statistically significant. When the respondents who reported a change in health were compared with those reporting stable health, the respondents endorsing a change in health had an absolute value of difference in the FuNHRQOL scores that was significantly greater on the life satisfaction and beliefs domain (P = 0.049) and approached significance for mental health (P = 0.065) and social health (P = 0.097).

TABLE 7 Known-groups validity for the FuNHRQOL measure (mean standardized score)

| Domain | Excellent/Very Good/Good Health (n = 344) | Fair/Poor Health (n = 334) | t | P | Effect Sizeᵃ |
|---|---|---|---|---|---|
| Physical health | 56.00 | 42.87 | 22.88 | <0.001 | 1.76 |
| Mental health | 53.55 | 45.87 | 10.77 | <0.001 | 0.83 |
| Social health | 53.56 | 45.96 | 10.67 | <0.001 | 0.82 |
| Life satisfaction and beliefs | 53.40 | 46.18 | 10.08 | <0.001 | 0.77 |
| Composite HRQOL | 55.09 | 44.09 | 17.09 | <0.001 | 1.32 |
| Environment | 53.57 | 45.88 | 10.76 | <0.001 | 0.83 |

ᵃ Cohen d = mean difference divided by the average of the standard deviations.

TABLE 8 Test-retest reliability of the FuNHRQOL measure: intraclass correlations of T1 and T2ᵃ

| Domain | Overall (n = 128): T1 − T2ᶜ | Overall (n = 128): ICC (95% CI) | Substantial Change in Healthᵇ (n = 27): T1 − T2 | Substantial Change in Healthᵇ (n = 27): ICC (95% CI) | No Substantial Change in Health (n = 101): T1 − T2 | No Substantial Change in Health (n = 101): ICC (95% CI) |
|---|---|---|---|---|---|---|
| Physical health | 0.05 | 0.89 (0.85–0.92) | 1.11 | 0.85 (0.71–0.93) | 0.37 | 0.87 (0.81–0.91) |
| Mental health | 0.16 | 0.83 (0.77–0.88) | 0.45 | 0.65 (0.36–0.82) | 0.05 | 0.85 (0.78–0.90) |
| Social health | 0.16 | 0.85 (0.80–0.89) | 0.04 | 0.73 (0.48–0.87) | 0.19 | 0.87 (0.81–0.91) |
| Life satisfaction and beliefs | 0.21 | 0.90 (0.86–0.93) | 0.42 | 0.83 (0.66–0.92) | 0.39 | 0.90 (0.86–0.93) |
| Composite HRQOL | 0.16 | 0.91 (0.88–0.94) | 0.34 | 0.83 (0.65–0.92) | 0.28 | 0.91 (0.87–0.94) |
| Environment | 0.18 | 0.84 (0.78–0.88) | 0.44 | 0.77 (0.55–0.89) | 0.12 | 0.85 (0.78–0.89) |

ᵃ Median days between administration = 26.
ᵇ Measured by yes/no question on retest as to whether the respondent had any major changes in health during the past month.
ᶜ Absolute value of time 1 mean minus time 2 mean.
CI indicates confidence interval; T, time.

DISCUSSION

This article describes the multiple steps to develop a conceptual framework and item pool for the FuNHRQOL measure and to demonstrate its psychometric properties with adults with various disabilities. The framework reflects HRQOL conceptualizations for the general population and specifies the domains of physical health, mental health, social health, and life satisfaction and beliefs. This framework was essential in the development of a pool of items for measuring HRQOL. By using items from existing HRQOL measures, the FuNHRQOL parallels existing generic HRQOL measures and is useful with a range of populations. Its distinction is in the development process that specifically minimized function in its measurement of HRQOL. Items for the pool demonstrated relative statistical functional neutrality, in contrast with items of the Short Form–36 that have demonstrated substantial functional penalty for people with preexisting functional limitations.24

Development of the measure was informed by the International Classification of Functioning, Disability and Health framework,15 which addresses barriers and supports in the environment. An advantage of the FuNHRQOL is inclusion of the environment scale to provide a means of measuring the respondent’s perception of the environment as supporting or hindering HRQOL.49 This scale is particularly relevant for persons with functional limitations but should be useful in other research in which social and environmental variables are important in understanding health outcomes, such as aging or culturally diverse populations.

The psychometric properties of this new HRQOL measure demonstrate relative function neutrality, which may make it more appropriate than current measures for research involving adults with preexisting functional limitations.1,11,24 This work was conducted before release of the recommendations for scale development in rehabilitation50 but closely parallels those recommended steps. The rigorous process of instrument development is reflected in its strong psychometric properties.

Measures of health status and HRQOL for the general population have historically been defined or conceptualized to include measures of functional ability. Measures of function, particularly measures that document change in function, can be informative about the health of aging adults who lose function, but these are less useful for people with long-standing and relatively stable functional limitations. Some researchers may challenge whether the current measure is truly a measure of HRQOL if function is not included in the assessment; others may challenge whether all functional items have been removed for all disability groups, and still others may challenge whether the measure should be called a health status measure rather than an HRQOL measure. The authors recognize that there are differing opinions among researchers on meaning and measurement of these constructs but maintain the value of an HRQOL measure that is relatively function neutral.

The authors believe that the current results are highly promising. The FuNHRQOL performs well on previously specified criteria.26 The conceptual model underlying the FuNHRQOL was confirmed through factor analyses, with excellent internal consistency for each scale. The FuNHRQOL correlated well with measures of chronic and secondary health conditions, affirming that it measures “health.” Scores across all domains were more highly correlated with secondary conditions than chronic conditions, suggesting that the measure will be sensitive to HRQOL differences caused by preventable health conditions more prevalent among persons with functional limitations. Encouragingly, the FuNHRQOL scores did not correlate highly with the ADLs/IADLs measures (r = −0.13 to −0.25). This confirms the relative function neutrality of the measure. What shared variance was found between the FuNHRQOL and the ADLs/IADLs is likely attributable to genuinely poorer health of some people who had health-related functional limitations. The test-retest correlations during a 4-wk period indicate high reliability. There was some support for the intent that the measure be sensitive to change, with respondents in the test-retest reliability stage who reported change in health status also showing differences on FuNHRQOL scores. More research is clearly indicated to determine its sensitivity to detecting change over time, such as determining whether changes resulting from an intervention program would be reflected on the measure.

Study Limitations

Although numerous challenges were addressed, some limitations remain. The first relates to measurement error stemming from response shift, the phenomenon of altering relative assessments of constructs because of one’s significant life experiences. For example, persons with functional limitations describe their health as excellent or very good although experiencing more days of self-assessed poor health than do their peers without disability giving the same health rating.19 The response shift phenomenon complicates the assessment of HRQOL within and across groups with diverse life experiences. With consultation from an expert in response shift, the panel explored this issue conceptually23 but was not able to develop a method to statistically reduce this source of variance.

Further, the FuNHRQOL was initially intended to be used across all disabilities, including persons with mild intellectual disabilities. However, the cognitive tasks required to provide meaningful reports of self-perceived health51 resulted in the authors not including persons with intellectual disabilities or other cognitive limitations (e.g., traumatic brain injury, dementia) in measure development. Future research is needed to modify and assess the use of the FuNHRQOL in persons with various cognitive disabilities.

Finally, the efforts and resources required to obtain a large national sample of adults with disabilities resulted in the authors using this data set both to conduct the final item selection and to establish the psychometric characteristics. Ideally, different samples would have been used for these separate steps. This may have contributed to stronger performance on psychometric characteristics than would be obtained with a separate sample.

This article provides the initial report on psychometric properties of the FuNHRQOL, and the authors suggest future focused studies. Although the measure is intended to be broadly useful for the general population, the psychometric characteristics described here are based on a sample of persons with disabilities. Performance of the FuNHRQOL with the general population requires additional research. A larger sample of respondents will be important to examine the robustness of this study’s findings among specific demographic groups (e.g., by age, race/ethnicity). The current data are based on self-reported disability status; future use with clinical samples could facilitate additional validation of the nature and the severity of the disability. Known-groups validity in this study was based on a single self-report item describing overall health; future studies should examine the ability of the FuNHRQOL to distinguish between groups on other characteristics. In addition, more data are needed regarding the sensitivity of the FuNHRQOL to health changes, including its ability to assess changes after intervention. Additional research is also needed to examine the relationship of various environments to HRQOL scores. Finally, the current length of the measure requires approximately 20 mins to complete and potentially could be reduced through computerized adaptive testing methods.52 The use of item response theory methods in the development of the FuNHRQOL should facilitate computerized adaptive testing application in the future.

CONCLUSIONS In summary, the FuNHRQOL is a 42-item measure of HRQOL with an accompanying 15-item scale for assessment of the environment. The conceptual framework and definitions rely on those of the World Health Organization, ensuring conceptualization that is useful internationally. Development of the item pool was grounded in the content of existing HRQOL measures, ensuring comparability in conceptualization of HRQOL. The measure is relatively free of functional content, making it highly suitable for persons with preexisting functional limitations but, by nature of its development, should be relevant for people without functional limitations as well. It should be particularly useful for researchers examining the relationship between health and function or researchers assessing effect of health promotion interventions. ACKNOWLEDGMENTS

The authors thank the members of the validity panel and the cultural experts for their work in responding to the items. They include representatives from five rehabilitation research and training centers (vision, deaf/hard of hearing, aging with physical disabilities, aging with developmental disabilities, and mental illness), the American Federation for the Blind, the National Arc/Self Advocates Becoming Empowered, the National Council on Independent Living, the Prevention Research Center on Deaf Health (Rochester), the National Alliance on Mental Illness, the National Center for Cultural Competence (Georgetown University), and the Center for Capacity Building on Minorities with Disabilities Research (University of Illinois–Chicago). The authors thank the consultants who generously shared their expertise in specific areas in the course of this work: Christina Bethell on development of health assessment measures, Carolyn Schwartz on response shift in self-report measures, and Gale Whiteneck on assessment of environments in the context of the disabling process. The authors thank Denise Spielman for recruitment coordination; Susan Wingenfeld, Amy Cline, and Amy Sharer for their assistance in screening participants for health conditions, managing data entry, and manuscript formatting and referencing; Martha Bose for data coding and checking; and Megan Scott for field test data management and scoring.

REFERENCES

1. Centers for Disease Control and Prevention: Measuring Healthy Days. Atlanta, GA, Centers for Disease Control and Prevention, 2000
2. Fayers P, Hays R (eds): Assessing Quality of Life in Clinical Trials: Methods and Practice, ed 2. New York, NY, Oxford University Press, 2005
3. Heidrich J, Liese AD, Lowel H, et al: Self-rated health and its relation to all-cause and cardiovascular mortality in southern Germany. Results from the MONICA Augsburg cohort study 1984–1995. Ann Epidemiol 2002;12:338–45
4. Idler EL, Russell LB, Davis D: Survival, functional limitations, and self-rated health in the NHANES I epidemiologic follow-up study, 1992. First National Health and Nutrition Examination Survey. Am J Epidemiol 2000;152:874–83
5. McHorney CA: Health status assessment methods for adults: Past accomplishments and future challenges. Annu Rev Public Health 1999;20:309–35
6. Garratt A, Schmidt L, Mackintosh A, et al: Quality of life measurement: Bibliographic study of patient assessed health outcome measures. BMJ 2002;324:1417–9
7. Burckhardt CS, Anderson KL: The Quality of Life Scale (QOLS): Reliability, validity, and utilization. Health Qual Life Outcomes 2003;1:60. doi:10.1186/1477-7525-1-60
8. Horsman J, Furlong W, Feeny D, et al: The Health Utilities Index (HUI): Concepts, measurement properties and applications. Health Qual Life Outcomes 2003;1:54
9. Hall T, Krahn GL, Horner-Johnson W, et al, and the RRTC Expert Panel on Health Measurement (2011): Examining functional content in widely used health related quality of life measures. Rehabil Psychol 2011;56:94–9
10. Wilson IB, Cleary PD: Linking clinical variables with health-related quality of life: A conceptual model of patient outcomes. JAMA 1995;273:59–65
11. Johnson RJ, Wolinsky FD: The structure of health status among older adults: Disease, disability, functional limitation, and perceived health. J Health Soc Behav 1993;34:105–21
12. Krahn GL, Fujiura G, Drum CE, et al: The dilemma of measuring perceived health status in the context of disability. Disabil Health J 2009;2:49–56
13. United States Department of Health and Human Services: The Surgeon General’s Conference Report. Closing the Gap: A National Blueprint to Improve the Health of Persons with Mental Retardation. Washington, DC, Office of the Surgeon General, USDHHS, 2002
14. United States Department of Health and Human Services: The Surgeon General’s Call to Action to Improve the Health and Wellness of Persons with Disabilities. Washington, DC, Office of the Surgeon General, USDHHS, 2005
15. World Health Organization: International Classification of Functioning, Disability and Health. Geneva, Switzerland, World Health Organization, 2001
16. CDC Public Health Grand Rounds. Where in health is disability? Public health practices to include people with disabilities. December 18, 2012. Available at: http://www.cdc.gov/about/grand-rounds/archives/2012/December2012.htm. Accessed June 24, 2013
17. Whiteneck GG, Harrison-Felix CL, Mellick DC, et al: Quantifying environmental factors: A measure of physical, attitudinal, service, productivity, and policy barriers. Arch Phys Med Rehabil 2004;85:1324–35
18. Albrecht GL, Devlieger PJ: The disability paradox: High quality of life against all odds. Soc Sci Med 1999;48:977–88
19. Drum CE, Horner-Johnson W, Krahn GL: Self-rated health and healthy days: Examining the “disability paradox”. Disabil Health J 2008;1:71–8
20. Helmes E: Function and disability or quality of life? Issues illustrated by the Osteoporosis Functional Disability Questionnaire (OFDQ). Qual Life Res 2000;9(suppl 1):755–61
21. Feeny D, Furlong W, Boyle M, et al: Multi-attribute health status classification systems: Health Utilities Index. Pharmacoeconomics 1995;7:490–502
22. Ware JE, Sherbourne CD: The MOS 36-item Short-Form Health Survey (SF-36). I. Conceptual framework and item selection. Med Care 1992;30:473–83
23. Schwartz CE, Andresen EM, Nosek MA, et al: Response shift theory: Important implications for measuring quality of life in people with disability. Arch Phys Med Rehabil 2007;88:529–36
24. Horner-Johnson W, Krahn GL, Suzuki R, et al: Differential performance of SF-36 items in healthy adults with and without functional limitations. Arch Phys Med Rehabil 2010;91:570–5
25. Meyers AR, Andresen EM: Enabling our instruments: Accommodation, universal design, and access to participation in research. Arch Phys Med Rehabil 2000;81(suppl 2):S5–9
26. Andresen EM: Criteria for assessing the tools of disability outcomes research. Arch Phys Med Rehabil 2000;81(suppl 2):S15–20
27. Andresen EM, Meyers AR: Health-related quality of life outcomes measures. Arch Phys Med Rehabil 2000;81(suppl 2):S30–45
28. Tate DG, Kalpakjian CZ, Forchheimer MB: Quality of life issues in individuals with spinal cord injury. Arch Phys Med Rehabil 2002;83(suppl 2):S18–25
29. Dudley-Javoroski S, Shields RK: Assessment of physical function and secondary complications after complete spinal cord injury. Disabil Rehabil 2006;28:103–10
30. Froehlich-Grobe K, Andresen E, Caburnay C, et al: Measuring health-related quality of life for persons with mobility impairments: An enabled version of the Short-Form 36 (SF-36E). Qual Life Res 2008;17:751–70
31. Emery MP, Perrier LL, Acquadro C: Patient-Reported Outcome and Quality of Life Instruments Database (PROQOLID): Frequently asked questions. Health Qual Life Outcomes 2005;3:12
32. Preamble to the Constitution of the World Health Organization as adopted by the International Health Conference, New York, June 19–22, 1946; signed on July 22, 1946, by the representatives of 61 states (Official Records of the World Health Organization, no. 2, p. 100) and entered into force on April 7, 1948
33. Lawshe CH: A quantitative approach to content validity. Pers Psychol 1975;28:563–75
34. Moriarty DG, Zack MM, Kobau R: The Centers for Disease Control and Prevention’s Healthy Days measures – Population tracking of perceived physical and mental health over time. Health Qual Life Outcomes 2003;1:37. Available at: http://www.hqlo.com/content/1/1/37. Accessed June 24, 2013
35. Microsoft Word 2007 [computer program]. Redmond, WA, Microsoft Corporation, 2006
36. Hakel MD: How often is often? Am Psychol 1968;23:533–4
37. Pohl NF: Scale considerations in using vague quantifiers. J Exp Educ 1981;49:235–40
38. Rapkin BD, Schwartz CE: Toward a theoretical model of quality-of-life appraisal: Implications of findings from studies of response shift. Health Qual Life Outcomes 2004;2:14
39. Statistics Canada: National Population Health Survey, Household Component. Cycle 6 (2004/2005), Questionnaire. Ottawa, Canada, Statistics Canada, 2006
40. SAS 9.12 for Windows [computer program]. Version 9.12. Cary, NC, SAS Institute Inc, 2004
41. Reynolds CR: Methods for detecting construct and predictive bias, in Berk RA (ed): Handbook of Methods for Detecting Test Bias. Baltimore, MD, Johns Hopkins University Press, 1982, p 199–227
42. Roid GH: Stanford-Binet Intelligence Scales, ed 5. Schaumburg, IL, Riverside, 2003
43. SPSS for Windows [computer program]. Version 16.0. Armonk, NY, SPSS Inc, 2008
44. National Center for Health Statistics: Disability Followback Survey (NHIS Phase II) Adult’s Questionnaire. Atlanta, GA, CDC, 1995
45. Seekins T, Smith N, McCleary T, et al: Secondary disability prevention: Involving consumers in the development of a public health surveillance instrument. J Disabil Policy Stud 1990;1:21–36
46. Cahill A, Ravelsoot C: Consortium to evaluate the efficacy of health promotion interventions for people with disabilities, in American Association on Health and Disability (ed): Advancing Health Promotion for People with Disabilities on the Public Health Agenda: Proceedings. Rockville, MD, 2003, p 106. Available at: aahd.webchoices.us/site/static/proceedings/proceedings.pdf. Accessed Aug 19, 2013
47. Linacre JM: WINSTEPS Rasch measurement computer program [computer program]. Beaverton, OR, 2007
48. Jöreskog KG, Sörbom D: LISREL 8.8 for Windows [computer program]. Lincolnwood, IL, Scientific Software International, Inc, 2006
49. Whiteneck G: Conceptual models of disability: Past, present, & future, in Field MJ, Jette AM, Martin L (eds): Workshop on Disability in America. Washington, DC, National Academies Press, 2005, p 50–66
50. Velozo CA, Seel RT, Magasi S, et al: Improving measurement methods in rehabilitation: Core concepts and recommendations for scale development. Arch Phys Med Rehabil 2012;93:S154–63
51. Fujiura GT, for the RRTC Expert Panel on Health Measurement: Self-reported health of people with intellectual disabilities. Intellect Dev Disabil 2012;40:352–69
52. Kreitzberg CB, Stocking ML, Swanson L: Computerized adaptive testing: Principles and directions. Comput Educ 1978;2:319–29


APPENDIX

Items of the FuNHRQOL Measure and the Ancillary Environment Scale

Physical

| Domain/Item | IRT |
|---|---|
| Did you have enough energy to do what you wanted? | 500.87 |
| Did you quickly recover your energy after you did things that took physical energy? | 501.44 |
| Were you physically healthy enough to do what you needed to do? | 500.26 |
| Did you experience physical pain? | 503.14 |
| Did physical pain prevent you from doing what you wanted to do? | 500.54 |
| Did you experience minor illness? | 497.95 |
| Were you free from illness? | 502.39 |
| Did you experience major illness? | 496.44 |
| Did you sleep poorly because of your physical health? | 500.24 |
| Did you wake from sleeping feeling rested? | 501.91 |
| Was your overall physical health good? | 500.61 |
| Was your overall physical health poor? | 499.25 |
| Did you have sickness that interfered with your usual activities? | 498.85 |

Factor loadings, Physical domain (12 values listed for the 13 items; item alignment not recoverable): 0.65, 0.64, 0.78, 0.63, 0.73, 0.51, 0.70, 0.67, 0.60, 0.87, 0.88, 0.76

Mental

| Domain/Item | Factor Loading | IRT |
|---|---|---|
| Did you have negative feelings? | 0.70 | 500.57 |
| Did you feel depressed? | 0.75 | 501.19 |
| Did being nervous keep you from doing your usual activities? | 0.59 | 497.21 |
| Did you feel anxious? | 0.58 | 502.05 |
| Did you feel relaxed? | 0.74 | 503.18 |
| Was your overall emotional health good? | 0.87 | 499.90 |
| Were you able to remember things you needed? | 0.59 | 497.95 |
| Were you able to make decisions about your life? | 0.58 | 495.01 |
| Did you pay attention well? | 0.67 | 498.25 |
| Were you able to learn new things? | 0.67 | 497.42 |
| Did you have hopeful feelings? | 0.73 | 499.20 |

Social

| Domain/Item | Factor Loading | IRT |
|---|---|---|
| Did you have intimacy in your life? | 0.41 | 507.24 |
| Did you feel loved? | 0.81 | 499.06 |
| Did you feel accepted? | 0.85 | 500.33 |
| Did you have chances to participate in social activities in your community? | 0.50 | 504.44 |
| Did you feel lonely? | 0.67 | 502.47 |
| Were you a good friend? | 0.52 | 496.63 |
| Could you count on someone to listen to you when you needed to talk? | 0.70 | 500.18 |
| Did you have a good time with other people? | 0.73 | 499.50 |
| Did you feel happy about relationships with others? | 0.87 | 500.61 |
| Did you feel important to your family or friends? | 0.84 | 499.89 |

Life satisfaction and beliefs

| Domain/Item | Factor Loading | IRT |
|---|---|---|
| Were you satisfied with your daily life in general? | 0.78 | 502.99 |
| Did you believe things would work out? | 0.81 | 500.76 |
| Were you happy about who you are? | 0.84 | 500.11 |
| Has there been general harmony and balance in your life? | 0.75 | 503.55 |
| Did your personal beliefs help you face difficulties? | 0.55 | 498.12 |
| Did your life have purpose? | 0.78 | 498.84 |
| Did your personal beliefs give meaning to your life? | 0.60 | 497.45 |
| Did your daily work or tasks bring you satisfaction? | 0.70 | 502.19 |


ITEMS OF THE ANCILLARY ENVIRONMENT SCALE TO THE FuNHRQOL MEASURE

| Environment Scale | IRT (a) |
|---|---|
| Did you feel safe in the places that you live and go? | 498.44 |
| Did you like where you live? | 499.69 |
| Were you satisfied with the medical care you received? | 499.19 |
| Did you have the opportunity to learn new things? | 501.94 |
| Did you have the opportunities to engage in healthy behaviors? | 501.61 |
| Did you get the health information you needed? | 500.24 |
| Did the general public like store clerks, bus drivers, workers treat you fairly? | 498.97 |
| Did you have problems with transportation? (b) | 500.61 |
| Did noise keep you from doing what you needed to do? (b) | 496.79 |
| Did the weather keep you from doing what you needed to do? (b) | 500.52 |
| Did the air quality affect your health? (b) | 498.49 |
| Did policies of governments or agencies make it hard to get service you need? (b) | 502.97 |
| Did you get where you needed to go? | 488.93 |
| Were you able to get the medical care you needed? | 497.98 |
| Did you feel that society valued you? | 503.90 |

Based on N = 674 for IRT.
(a) IRT indicates “difficulty” level of endorsing item.
(b) Indicates that item is reverse coded so higher scores indicate more positive rating.
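The reverse coding described in footnote b can be illustrated with a minimal sketch. The response format is not given in this appendix, so the 1-to-5 scale used below is purely an assumption for illustration, and the function name `reverse_code` is hypothetical rather than part of any published FuNHRQOL scoring procedure.

```python
def reverse_code(score: int, scale_min: int = 1, scale_max: int = 5) -> int:
    """Flip a response so that higher values indicate a more positive rating.

    The default 1-5 range is an assumed response scale, not one stated
    in the appendix; pass the actual scale bounds when they are known.
    """
    if not scale_min <= score <= scale_max:
        raise ValueError(f"score {score} outside {scale_min}-{scale_max}")
    # Mirror the score around the midpoint of the scale.
    return scale_min + scale_max - score


# Example: on the assumed 1-5 scale, a raw response of 4 to a negatively
# worded item (e.g., problems with transportation) is scored as 2.
flipped = reverse_code(4)
```

Mirroring around the scale midpoint leaves the middle response unchanged and preserves the spacing between response options, which is the usual reason this form is preferred over ad hoc recoding tables.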
