http://informahealthcare.com/dre ISSN 0963-8288 print/ISSN 1464-5165 online Disabil Rehabil, Early Online: 1–10 © 2015 Informa UK Ltd. DOI: 10.3109/09638288.2015.1041610

RESEARCH PAPER

Validity and reliability of a novel measure of activity performance and participation

Phil Murgatroyd1 and Leila Karimi2

1Occupational Therapist, Fixby, Huddersfield, UK and 2School of Public Health and Biosciences, La Trobe University, Melbourne, Australia

Disabil Rehabil Downloaded from informahealthcare.com by Gazi Univ. on 05/04/15 For personal use only.


Abstract

Purpose: To develop and evaluate an innovative clinician-rated measure, which produces global numerical ratings of activity performance and participation.
Method: Repeated measures study with 48 community-dwelling participants investigating clinical sensibility, comprehensiveness, practicality, inter-rater reliability, responsiveness, sensitivity and concurrent validity with Barthel Index.
Results: Important clinimetric characteristics including comprehensiveness and ease of use were rated >8/10 by clinicians. Inter-rater reliability was excellent on the summary scores (intraclass correlation of 0.95–0.98). There was good evidence that the new outcome measure distinguished between known high and low functional scoring groups, including both responsiveness to change and sensitivity at the same time point in numerous tests. Concurrent validity with the Barthel Index was fair to high (Spearman Rank Order Correlation 0.32–0.85, p > 0.05). The new measure’s summary scores were nearly twice as responsive to change compared with the Barthel Index. Other more detailed data could also be generated by the new measure.
Conclusions: The Activity Performance Measure is an innovative outcome instrument that showed good clinimetric qualities in this initial study. Some of the results were strong, given the sample size, and further trial and evaluation is appropriate.

Keywords

Activities of daily living, outcome measurement, participation, rehabilitation

History

Received 2 February 2014
Revised 19 January 2015
Accepted 13 April 2015
Published online 28 April 2015

Implications for Rehabilitation

– The Activity Performance Measure is an innovative outcome measure covering activity performance and participation.
– In an initial evaluation, it showed good clinimetric qualities including responsiveness to change, sensitivity, practicality, clinical sensibility, item coverage, inter-rater reliability and concurrent validity with the Barthel Index.
– Further trial and evaluation is appropriate.

Address for correspondence: Phil Murgatroyd, 39 Fixby Road, Fixby, Huddersfield West Yorks HD2 2JG, UK. Tel: 0770 810 6615. E-mail: [email protected]

Introduction

Illness and disability result in changes in habitual patterns of activities of daily living (ADLs), including both the range of activities and performance methods used [1]. A broad spectrum of activities is affected, from basic self-care through to complex household, social, leisure and productive activities [2]. Limitations in activity performance are associated with increased mortality, risk of moving to residential care, physical deterioration, adverse psychological responses and the disruption or ending of involvement in social roles and networks [3]. The International Classification of Functioning, Disability and Health (ICF) defines activity performance as ‘‘the execution of a task or action by an individual’’ and activity limitation as
‘‘difficulties an individual may have in executing activities’’ [4, p. 10]. Activity limitation is a common phenomenon across clinical populations, even though symptoms, signs, psychological responses and lifestyles are varied and individual. Over the past 50 years or so, there has been increasing recognition of the importance of activity performance and participation as a marker of health and disability [1,2]. Katz’s Activities of Daily Living scale [5] was an early influential measure in this field and addressed six basic self-care activities. In 1969, Lawton and Brody expanded the field to include instrumental ADLs, defined as some of the more complex activities required to live independently in the community [6]. Over the past four decades, there has been a proliferation of instruments designed to measure this area, and according to McDowell [7] there are over 100 published functional scales, some disease specific and some generic, based on different terminologies, methodologies and conceptual frameworks, for a variety of purposes and settings. This has allowed a valuable body of knowledge regarding activity

performance to be accumulated [3]. In parallel, the performance of personally meaningful activities has come to be viewed as the primary goal of transdisciplinary rehabilitation [8,9]. However, it has been known for many years that there are problems with current measurement approaches in this field [10] and this dissatisfaction persists [11–13]. One of the issues is which activities should be measured. There is no common set of ‘‘normal’’ activities to guide the choice of the activities to include in item sets [14]. This is particularly true of more complex activities such as leisure and social activities, where there is a potentially unlimited set of candidates for inclusion on item lists [10]. On the other hand, lengthy item lists are impractical to use. However, the choice of which items are measured will influence the validity of research results. For instance, instrumental ADL tools measuring domestic tasks generate higher disability ratings for men, but in many cases these ratings simply reflect traditional gender roles rather than disability [1]. The most commonly used functional tools – such as FIM [15] and Barthel Index [16] – are structured around mobility and basic self-care activities, which represent a small fraction of the ways people use their time. The selected items, while important and relatively easier to measure, may not be the most important for quality of life and have been called a poor reflection of what really matters in successful ageing [17]. In other words, they are insufficiently comprehensive and insufficiently specific to capture change [12] and can be viewed as reflecting an outdated conceptualization of disability, which does not correspond to rehabilitation practice [13]. Another problem with the focus on basic activities is a ceiling effect [18], which makes it hard to distinguish among higher functioning people, a distinction that is vital in attempts to measure the activity performance of people living in the community. 
Adequate knowledge about subtle early adaptations could be useful in focusing prevention activities, but this is not currently available from the most widely used tools [3]. In summary, current approaches to item selection result in limited representation of the incidence and prevalence of activity limitation, and hence offer limited scope in evaluating interventions to counter these problems. Scoring the activity items, once selected, is another problem area. Generally, this is done by categorizing performance within a number of constructs or dimensions [19] such as independence or difficulty. For some basic activities and mobility, independence is arguably the normal standard of human adult performance in western industrialized countries at least. But for more complex activities, independent performance is only one of a range of possible performance methods. For example, sharing tasks, paying for services and divisions of labor are commonly seen human behaviors in good and ill health. Consequently, there is a risk that independence-based scoring systems could attach a label of dependency on simple personal preference. Asking people whether they have difficulty in performing certain activities is another common approach to scoring [2]. One advantage is that it expresses the participant’s own perception of her/his situation (although this is lost when an external rater is used). However, terms such as difficulty, limitation, problem and also quantifiers (‘‘a lot’’ and ‘‘a little’’), are open to multiple interpretations and influences, whether by research participants or external raters [19]. Consequently, question wordings affect prevalence estimates considerably [20]. The subjectivity involved in these concepts also has the potential to be influenced by personality factors, psychological adaptations and environmental accommodations often seen in response to disability [10]. 
Behavioral flexibility and the tendency to modify environments to personal or group advantage have been called key human traits [21,22] and apply in both good and bad health. Qualitative
accounts of disability show clearly how people adapt both their performance methods and the situations they operate in so as to make trade-offs between a range of factors, including task difficulty, resource availability, the responses of others to disability and their own subjective feelings [23,24]. This behavioral flexibility is challenging for scoring systems predicated on concepts of dimensionality. Closely related to broadly-defined activity performance is the ICF concept of participation, defined as ‘‘involvement in a life situation’’ (p. 10) [4]. The seeming simplicity of this definition is misleading [25], and it is interpreted in multiple ways both by different rehabilitation stakeholders [26] and in the academic literature [27]. Common themes in discussions of participation are autonomy, social roles, social and community integration and choice. This variety in the coverage of participation is matched by variation in approaches to measurement, which address both objective and subjective features [28]. In terms of items, although complex community and social activities are often discussed in this context, participation measures also include more basic activities including mobility and self-care [28]. Despite activity performance being a key clinical concern, there are frequent reports of problems with measuring activity performance in both post-acute populations and community rehabilitation with the tools currently available [18,29,30]. This applies to health-related changes in activity performance and the effectiveness and efficiency of clinical interventions. Common complaints are that there is a lack of detailed knowledge regarding intervention effectiveness and that it is not possible to track service user progress throughout the various stages of an episode of care [13]. 
The methodological problems result in an inadequate understanding of effective and efficient models of care and service configurations, and hamper the evaluation of different clinical approaches [13]. As for participation, Whiteneck and Dijkers [31] suggest the conceptual and methodological difficulties may be such as to pose an obstacle to the successful use of the concept in rehabilitation research. The proliferation of scales and measurement approaches could be interpreted as a sign of the importance of measuring activity performance, and also frustration with measures that often do not capture clinically meaningful change. Some commentators have explicitly opposed the development of new functional scales, which contribute no new information, and argue instead that efforts would be better spent on establishing the validity and reliability of existing ones using psychometric methods [7]. However, others argue that there is a need for methodological advances, including tools which are simple to use and are based on a universal language of function across health conditions [12].

The ICF terminology and definitions

The ICF [4] was published by the World Health Organization in 2001 with the aim of providing ‘‘a unified and standard language and framework for the description of health and health-related states’’ (p. 5). Other goals include facilitating communication and the comparison of information across settings. The ICF is intended to describe functioning and disability, and is made up of four core components, namely body functions and structures, activities and participation, environmental factors and personal factors. The ICF defines activity performance as ‘‘the execution of a task or action by an individual’’, with activity limitations being ‘‘difficulties an individual may have in executing activities’’ (p. 10). Participation is defined as ‘‘involvement in a life situation’’ with participation restrictions defined as ‘‘problems an individual may experience in involvement in life situations’’ (p. 10). Body functions are ‘‘the physiological functions of body systems (including psychological functions)’’ while body structures are ‘‘anatomical parts of the body such as organs, limbs and their components’’. ‘‘Impairments
are problems in body function or structure’’. Environmental factors ‘‘make up the physical, social and attitudinal environment in which people live and conduct their lives’’ while personal factors are not specified. In terms of activities and participation the ICF uses a single list of items, covering a broad range of life areas or domains ranging from basic mobility to complex multi-component activities such as employment and leisure activity. Each of these domains may be characterized as either participation or activity, according to the ICF. Four options are given for separating activity from participation, namely total separation of the two, partial or total overlap, and nestling activities beneath headings of participation. The ICF offers two ways of qualifying both activities and participation, namely performance or capacity, with performance being ‘‘what an individual does in his or her current environment’’ (p. 15). Capacity refers to what a person does in a standardized environment. In addition to participation as performance, the ICF indicates that the participation concept also includes subjective perceptions of participation (‘‘the lived experience of people in the actual context in which they live’’) (p. 15). There has been considerable discussion of the distinction between the ICF activity and participation concepts, and the ICF approach has been criticized for creating conceptual and methodological confusion [28,31], and the need for conceptual clarity is stressed [32], although it has also been suggested that such a consensus may not actually be necessary [25]. Other common themes in the literature are that activity performance and participation overlap to some extent, that activity performance determines participation to some extent and that there is no agreed single way of distinguishing them [27,31]. 
Given the multi-component and broad coverage of the ICF participation concept, and the broad range of participation measures and approaches available, it has been suggested that both researchers and scale developers should specify their approaches to participation and the distinction with activity [26,28]. The general approach taken in the development of the Activity Performance Measure (APM) is based on the understanding that transactional human functioning is an enormously complex, dynamic phenomenon involving multiple components over time. Classification systems based on theoretical frameworks such as the ICF’s activity and participation are valuable depending on their explanatory and predictive power, but do not have inherent separate existences in the human organism as it interacts with people and places [33]. Until a definitive and broadly accepted conceptual division is established, a pragmatic approach accepts that activity and participation inevitably involve overlapping and parallel aspects of human living, and that contrasting measurement approaches are both necessary and appropriate. What does seem less problematic is that from a healthcare perspective, there is a need for knowledge regarding the practical details of how a broad range of tasks are completed, including the extent of any limitation, and this falls under the ICF activity performance concept in the APM. Similarly, there is a need to know about a person’s regular areas of involvement in the world, and how this narrows and expands due to illness and rehabilitation – this falls under the ICF concept of participation as performance in the APM. See www.theapm.net for a detailed account of how the APM scoring system represents the distinction between activity and participation.

Development of the Activity Performance Measure

Various information sources contributed to the development of the APM: (1) literature review, (2) review of measures in this field and (3) clinical expertise. The development process of review, reflection, discussion with interested colleagues and informal trial
was an iterative and incremental one spread over several years and diverse clinical settings before it reached the point of a formal validation project. The literature review focused on the rationale and history of activity performance measurement, theoretical considerations, critiques of measurement approaches, dilemmas and validation approaches. Charmaz’s qualitative research [24] regarding how people live with chronic disability and why they make certain choices was a key reference in developing the scoring approach. The ways participation has been conceptualized and measured were also reviewed. In the literature review, McDowell’s selection of measures with good validity and reliability evidence was taken as a starting point [7]. Google Scholar and CINAHL were searched for measures published after McDowell using the following search terms: International Classification of Functioning, Disability and Health, outcome measure, function, behavior, activities of daily living, instrumental activities of daily living, participation and functional status. References to outcome measures were followed up. Individualized approaches including Goal Attainment Scaling [11] were reviewed. In terms of item selection, the review process included the kinds of activities selected, with considerations including how common the activity was (daily–monthly–yearly?), and the ‘‘broadness’’ of the definition [e.g. posting a letter (narrow) through involvement in community activities (broad)]. The proven success of gathering what may be quite different behaviors and environments such as bathing and showering under a related broad category such as self-care was noted [10]. 
Other clinical considerations included whether the item was commonly included and addressed in clinical assessment and therapy, everyday importance versus the need to be able to cover individuality, observability versus self-report, number of items versus practicality, the range of component tasks falling under each item, and how feasible it was to gather information about items from interview.

Scoring approaches were reviewed from the perspectives of minimizing the need for subjective judgment and hence reliability problems, and also careful consideration of the hierarchies of responses as indicating (or assuming) degrees of activity limitation and participation restriction (scoring approaches based on dimensions like independence have been discussed). These were compared with qualitative literature on living with disability, personal experience of attempting to position service user behavior on rating scales, literature on non-adherence to clinical recommendations and common clinical approaches in judging how people progress in rehabilitation and reporting on this, e.g. in team meetings, family meetings and discharge reports.

This led to a number of guiding principles for an outcome measure intended for use in clinical settings:

– Assessment of actual activity performance is preferable to assessing capacity as a reflection of people’s everyday lives.
– For practicality, the information required must be readily identifiable from interview.
– Without questioning the importance of the subjective aspects of participation, everyday performance and the effect of clinical input on performance are likely to be a focus of interest in clinical settings and at health policy levels [25].
– To match contemporary rehabilitation best practice and thinking about disability, the item set should cover as broad a range of activities as possible including mobility and self-care through to community activities [13], while remaining practical to use.
– Scoring approaches based on continuums of greater or lesser difficulty, limitation, problems or proportions of a task where support is given result in reduced validity due to the degree of subjective interpretation involved [27].

– Independence is an appropriate scoring criterion for only a limited number of basic activities. For other activities, a person’s preferred or habitual performance method is the appropriate criterion of successful performance.
– Scoring systems need to be able to distinguish between people who choose not to engage in certain activities due to lifestyle preference and those whose non-participation is due to health problems. Nevertheless, a limited number of basic activities can be seen as human universals, and performance of them can be considered normal in western industrialized societies at least [34].
– Defining items broadly to gather together related activities will enhance comprehensiveness, individualization and practicality.
– Activity performance is best understood transactionally as situations or interactions involving person and environment. Including environments and performance methods in measurement could allow some representation of how humans integrate individuality, choice and preference within activity performance as well as practical considerations [35].
– It is not appropriate to use population-based norms for participation across diverse sociocultural groups [31].
– A person can still be considered to show participation in an activity, even when assisted, due to the possibility of directing and shaping aspects of it [36]. Similarly, participation can continue, even when a person adapts her or his habitual performance methods [31].
– The cessation by a person of the performance of a habitual activity is a clear and readily identifiable marker of a high degree of participation limitation in a given domain. While this does not address important aspects such as satisfaction or levels of control, it does have the advantage of showing less scope for subjective judgment.
– Measuring premorbid, admission and discharge time points, and the transitions between them, allows a contextual understanding of an individual’s performance and changes to it. Using this approach, each person functions as their own reference point when measuring both activity performance and participation.

These considerations led to the research version of the APM.

Brief description of the APM

The APM is a clinician-developed tool, which is intended to respond to the known difficulties with current approaches to measuring activity performance and participation in clinical contexts. Briefly, the research version of the APM consisted of 27 activity items, ranging from mobilizing in the home through to complex social activities. The way a person performs an activity item is rated on a 0–4 scale (if the item is relevant to the person). The item scores are combined to produce an Activity Summary and a Participation Summary score. For acute admissions, three time points are covered (premorbid, admission and discharge), allowing each person to act as their own reference point when analyzing changes over time and clinical interventions. In the case of chronic conditions, only admission and discharge are covered. The intention of the APM is to be a generic, clinician- or researcher-rated measure completed in semi-structured face-to-face interviews before and after an episode of care. The information required to complete the APM is commonly collected by holistic services and clinicians addressing activity performance. The APM is rooted in the ICF’s conceptual framework and terminology, and integrates a number of innovations including a flexible, broadly defined item set. The scoring system is structured around behavior and highly tangible aspects of how people interact with their environment rather than based on
assumptions of dimensionality. Account is also taken of habitual lifestyles and preferences with the overall aim of producing individualized yet comparable metrics. In addition to summary scores of activity performance and participation, the APM could also be used to produce more detailed information such as items of most change or changes in compensatory performance methods across time points. The conceptual framework underpinning the APM is based in part on the notion of evolved human functions, which underlie and drive individual behavioral, environmental and subjective variability and sociocultural values [21,34]. For a more detailed description of the APM and other resources, see www.theapm.net.
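The idea of reporting ‘‘items of most change’’ can be illustrated with a minimal Python sketch. It is not the APM's own procedure: the item names, the dictionary layout and the function name are hypothetical, and the text does not say which end of the 0–4 scale is the better one, so the sketch reports signed differences without interpreting them. Items that are not relevant to the person are stored as None and skipped.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical layout (an assumption, not the APM recording form):
# item name -> {time point -> rating 0-4, or None if the item is not relevant}.
Ratings = Dict[str, Dict[str, Optional[int]]]


def items_of_most_change(
    ratings: Ratings, start: str, end: str, top: int = 3
) -> List[Tuple[str, int]]:
    """Return up to `top` (item, signed change) pairs, largest absolute change first."""
    changes = []
    for item, by_time in ratings.items():
        before, after = by_time.get(start), by_time.get(end)
        if before is None or after is None:
            continue  # item not relevant, or not rated at one of the time points
        changes.append((item, after - before))
    # Largest absolute change first; ties are broken alphabetically by item name.
    changes.sort(key=lambda pair: (-abs(pair[1]), pair[0]))
    return changes[:top]


# Invented example data for one person with an acute admission.
example = {
    "bathing": {"premorbid": 4, "admission": 1, "discharge": 3},
    "shopping": {"premorbid": 4, "admission": 0, "discharge": 1},
    "gardening": {"premorbid": None, "admission": None, "discharge": None},
}
print(items_of_most_change(example, "premorbid", "admission", top=2))
print(items_of_most_change(example, "admission", "discharge", top=2))
```

Because each person is compared only with their own earlier ratings, the sketch needs no population norms, which matches the ‘‘own reference point’’ principle described above.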

Clinimetric validation approach Just as there are many options when choosing item sets or scoring approaches, so alternatives exist when considering how to evaluate a measure of activity and participation. In general, two approaches to validity and reliability testing of multi-item measurement scales can be distinguished in the literature, namely psychometric and clinimetric approaches [37]. Psychometric approaches are based on the assumption that the items in the scale are measuring a single construct or latent variable, which is unobservable directly [38]. In this approach, the items are considered to be effect indicators of the underlying latent trait of activity performance or participation. Under this approach, it is assumed that there is a hierarchical pattern in the activity items, from low to high difficulty, reflecting the latent trait. Validity is determined by the extent to which items scores achieved by participants relate to the assumed hierarchy. This has led to approaches where validation and reliability testing are based on examinations of statistical homogeneity among the items, such as item response theory. Feinstein [37] argued that many scales used in clinical settings are different in significant ways from psychometric scales, and that this influences appropriate validation approaches. He suggested that one important difference is that clinimetric scales are driven by the imperatives of dealing with multifaceted clinical situations. According to Fayers and Hand [38], many clinical scales are not based on assumptions of an underlying latent trait, but consist of ‘‘multiple items which are not expected to be homogeneous because they indicate different aspects of a complex clinical phenomenon’’ (p. 234). In other words, a clinimetric approach seeks to measure many attributes with a single instrument, rather than a single attribute with multiple items, as in the psychometric approach [39]. 
A clinical scale, including the scoring system, may include very heterogeneous environmental, psychological and physical symptoms among various other phenomena. The APM’s broadly defined activity set and scoring system can be characterized as highly heterogeneous. In any event, psychometric approaches to scale development, while widely found, are not considered the best procedure in this area. It has been suggested that clinimetric approaches may be more appropriate for activity performance and participation [31,37,38,40–43]. For example, Streiner suggests that ADLs are a composite variable and therefore not suitable for latent trait approaches [43]. Dijkers questions whether latent-trait methods are appropriate for participation [27]. Additionally, it has been suggested that psychometric methods and their stress on the importance of statistical homogeneity can reduce the ability of scales to distinguish between different groups of patients, as well as changes within patients, and therefore generate weaker evidence in clinical trials [41,42]. A clinimetric approach to validation fits well with the APM’s clinical origins, causal indicators and heterogeneous activity items and score criteria. Key validity qualities from a

clinimetric perspective include reliability, responsiveness and sensitivity, face validity, comprehensiveness, practicality and item weighting [42]. These are considered in the methodology proposed below.
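To make the notion of ‘‘statistical homogeneity’’ concrete, the following Python sketch computes Cronbach's alpha, one widely used index of how strongly the items of a scale vary together. Alpha is not named in the text and was not necessarily used by the authors; it is chosen here only as a familiar example, and the function name and data are invented. A heterogeneous clinimetric item set would be expected to give a low value without this implying that the scale is poor.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(scores: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for rows = respondents, columns = items (sample variances)."""
    n_items = len(scores[0])
    if len(scores) < 2 or n_items < 2:
        raise ValueError("need at least two respondents and two items")
    if any(len(row) != n_items for row in scores):
        raise ValueError("every respondent needs a score on every item")
    # Variance of each item across respondents.
    item_variances = [variance([row[i] for row in scores]) for i in range(n_items)]
    # Variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Invented data: items that rise and fall together versus items that do not.
homogeneous = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]
heterogeneous = [[0, 4], [4, 0], [0, 4], [4, 1]]
print(cronbach_alpha(homogeneous))
print(cronbach_alpha(heterogeneous))
```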

Design and participants

The study used a cohort design, following a convenience sample of 49 participants aged 18 years and over with occupational therapy goals admitted to four community rehabilitation centers in a large Australian city. Participants attended the centers on a sessional basis or received home-based sessions. Service users referred to occupational therapy in community rehabilitation centers appeared suitable for testing a measure of activity performance and participation in that they live in the community and have health problems likely to affect activity performance. The expectation was that the sample would be varied in terms of demographics and clinical diagnoses, giving an opportunity to carry out various analyses based on known groups. Nevertheless, the variability would not be so extreme as to make differences in activity performance scores virtually automatic; for example, relatively healthy service users, or people requiring full nursing care, were unlikely to be included. This variety would also be reflective of the population group with whom the APM is intended to be used, as recommended by Streiner and Norman [19].

Occupational therapists were considered to be appropriate as associate researchers in this project because of their expertise in assessing and treating activity changes [44]. Once management and ethical approval was gained, six occupational therapists in the service agreed to collect data for the project.

The criteria for inclusion of participants were as follows:

– Referred to occupational therapy in a community rehabilitation center.
– Occupational therapy goals identified and participant received occupational therapy.

Exclusion criteria:

– If a participant’s activity performance methods varied widely from day to day, or were changing quickly, e.g. due to rapid neurological change, or if it was impossible to get accurate overall information about a person’s activity performance.
– If the participant developed significant new health problems during the admission, died or moved outside the area before the end of therapy.
– If the only occupational therapy goal was major equipment funding.

Procedures and measures

The study was approved by the human research ethics committees of the relevant health network and university. Participants were asked for consent via a Participant Information and Consent Form. All data were permanently deidentified. Devising an approach to test the APM’s validity and reliability posed a number of challenges. The reasons for not following a psychometric approach have been discussed. Additionally, there is no clear ‘‘gold standard’’ matching the APM’s coverage. The approach chosen was to analyze the results generated by the APM in relation to demographic groups whose activity performance is known to vary. The tests included sensitivity (did the APM summary scores show variation between the known groups at one time point?) and responsiveness (did the results show differences in the amount of change as individuals progressed through clinical intervention over time?). Other aspects of validation included inter-rater reliability, practicality, clinician perceptions of the APM’s results and comprehensiveness.
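The text at this point does not say which statistics were used to quantify sensitivity and responsiveness. Purely as an illustration, the Python sketch below uses two common choices: a difference in group means at a single time point for known-groups sensitivity, and the standardized response mean (mean change divided by the standard deviation of change) for responsiveness. The function names and the example scores are invented and are not taken from the study.

```python
from statistics import mean, stdev
from typing import Sequence


def known_groups_difference(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """Difference in mean summary score between two known groups at one time point."""
    return mean(group_a) - mean(group_b)


def standardized_response_mean(before: Sequence[float], after: Sequence[float]) -> float:
    """Mean within-person change divided by the standard deviation of that change."""
    if len(before) != len(after):
        raise ValueError("each person needs a score at both time points")
    changes = [a - b for b, a in zip(before, after)]
    spread = stdev(changes)
    if spread == 0:
        raise ValueError("all changes are identical; the index is undefined")
    return mean(changes) / spread


# Invented summary scores for four people at admission and at discharge.
admission = [10, 12, 11, 13]
discharge = [14, 15, 16, 15]
print(round(standardized_response_mean(admission, discharge), 2))
```

Computing the same index for two instruments on the same participants is one way a statement such as ‘‘twice as responsive’’ could be checked, although the study's own method may differ.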

5

The community occupational therapists/associate researchers were given 90-min training and written cheat sheets on how to complete the case report form and the consent process. In addition to the rationale and a general description of the project, key points in the training included the recommended approach to the semistructured interview such as a suggested question order and style. The aim was to maximize both validity and reliability by ensuring the tangible information needed was collected, and this included cross-checking between time points to maximize recall accuracy and using probing questions when appropriate. The premorbid and admission sections of the APM were completed at admission by the treating occupational therapists. Premorbid admissions were defined as people, who had an acute hospital admission in the 12 months prior to community rehabilitation. The discharge section was completed at discharge from occupational therapy, or after 16 weeks, if the person had not been discharged by then. Demographic variables, which are known risk and protective factors for activity performance [1,45,46], were collected: age, number of medications used at admission, comorbidity, cohabiting status, use of a mobility aid and gender. Barthel Index [47] and Charlson Comorbidity Index [48] scores were derived from the clinical and research documentation. The participating therapists were asked to note how long it took to complete the APM paperwork. Additionally, in order to gather data about face validity, the case report forms included a comments section, and participating occupational therapists were asked to give written feedback after the project. The researchers were asked to provide general feedback and rate the following qualities on a scale of 0–10 (negative–positive): – Was the data collection form easy or difficult to use? – Did the APM capture clinically significant activity performance change by participants? 
– Which areas of activity performance were captured well and which not so well? – Was the range of activities clinically comprehensive? – Should additional activities be included? – Was the training sufficient and what else should be included? – What about the supporting documentation (cheat sheet)? – Any other comments. Additionally, a focus group was organized, during which the project results were fedback to the researchers and points from the feedback forms were followed up and opened up for discussion. Field notes were taken that included direct quotations when possible. The principal investigator wrote a manual detailing the scoring rules. The co-principal investigator was given additional training in how to rate the APM via the scoring manual and discussion. The principal and co-principal researchers produced item scores separately using the recording forms completed by the associate researchers. The item scores were then processed. The results were analyzed using SPSS v20 and the online calculators available at http://plaza.ufl.edu/algina/index.programs.html. With the exception of inter-rater reliability, the results of rater 1 (P.M.) are reported given there was little difference between the two raters.

Data analysis

Inter-rater reliability was tested by calculating intraclass correlations for the item scores, Activity Summary and Participation Summary scores produced by the principal and co-principal investigators. The intraclass correlation was interpreted as follows: <0.40, poor reproducibility; 0.40–0.75, fair to good reproducibility; and >0.75, excellent reproducibility [49]. A single measure of a 2-way mixed model with absolute agreement was used with a 95% confidence interval in a repeated measures multivariate analysis of variance.

Clinical sensibility, practicality and item coverage were examined via the qualitative feedback from the occupational therapists.

Responsiveness to change and sensitivity were based on the APM’s ability to detect differences between the known high and low scoring groups in Table 1.

Table 1. Known-group hypotheses used for responsiveness testing.

High scoring groups      Low scoring groups
Younger                  Older
Low comorbidity          High comorbidity
Male                     Female
Low medication use       High medication use
Acute admission          Chronic admission
No mobility aid          Uses mobility aid

Responsiveness to change, defined as the ability to detect change between admission and discharge, was tested using the robust non-pooled Cohen’s D proposed by Algina et al. [50,51] using the SD of the discharge scores [52]. This software also generated 95% confidence intervals using a bootstrap of 600 samples with replacement. The cross-sectional sensitivity testing was carried out using Algina et al.’s robust pooled Cohen’s D for independent groups [53]. Sensitivity tests were carried out at all appropriate time points. For both responsiveness and sensitivity, interpretation followed Cohen’s suggested criteria: <0.2 trivial, 0.2–0.49 small, 0.5–0.79 moderate, 0.8 and above large [52]. Significance testing of the paired and independent known-group results was carried out using non-parametric tests (the Wilcoxon and Mann–Whitney U, respectively).
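As a rough illustration of the reliability statistic named in this section, the sketch below computes the point estimate of a single-measure, absolute-agreement intraclass correlation from two-way ANOVA mean squares. It is not the SPSS routine used in the study and it omits the confidence interval; the function names and the example ratings are hypothetical, and the interpretation bands simply restate the cut-offs quoted from [49].

```python
from statistics import mean


def icc_absolute_single(ratings):
    """Single-measure, absolute-agreement ICC (point estimate only).

    `ratings` holds one row per rated subject and one column per rater.
    """
    n = len(ratings)        # subjects (or items)
    k = len(ratings[0])     # raters
    grand = mean(x for row in ratings for x in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]

    # Two-way ANOVA sums of squares: subjects, raters, residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    # Rater (column) variance stays in the denominator, so a systematic
    # offset between raters lowers the coefficient.
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def interpret_icc(icc):
    """Bands as quoted in the text: <0.40, 0.40-0.75, >0.75."""
    if icc < 0.40:
        return "poor"
    if icc <= 0.75:
        return "fair to good"
    return "excellent"


# Hypothetical scores: rater 2 is always one point above rater 1.
example = [[1, 2], [2, 3], [3, 4], [4, 5], [5, 6]]
print(round(icc_absolute_single(example), 2))  # 0.83
```

In the hypothetical example the two raters rank every subject identically, yet the coefficient is 5/6 rather than 1 because absolute agreement penalizes the constant one-point offset.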

Table 2. Participant characteristics: premorbid (n = 35), respondent n (%).

Variable                              Premorbid (n = 35)
Gender
  Male                                23 (66)
  Female                              12 (34)
Age
  Mean (SD)                           –
  Range                               –
Acute/chronic admission
  Acute                               35
  Chronic                             N/A
Mobility aid
  Yes                                 7 (20)
  No                                  28 (80)
Cohabiting status
  Lived with someone else             29 (83)
  Lived alone                         6 (17)
Other measures
  Charlson Comorbidity Index
    Mean (SD)                         –
    Range                             –
  Number of medications (a)
    Mean (SD)                         –
    Range                             –
Primary condition at admission
  Neurological                        –
  Orthopedic                          –
  Cardiac                             –
  Others                              –

Concurrent validity

The Barthel Index is a commonly used and well-respected functional measure [7]. Consequently, the relationship between Barthel Index scores and the Activity and Participation Summary scores was examined using Spearman Rank Order Correlation, given that the data were treated as ordinal. Preliminary inspections were performed to ensure no violation of the assumptions of linearity. Interpretation was as follows: 0.0–0.24, low; 0.25–0.49, fair; 0.5–0.74, moderate to good; and exceeding 0.75, good to excellent [54]. The responsiveness of the various measures was also compared. Finally, a comparison was also made with the combined APM items that correspond most closely to the Barthel items; this is not reported here.
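The rank-order correlation described above can be sketched in a few lines. This is only an illustration of the general technique (average ranks for ties, then a Pearson correlation of the ranks), not the SPSS procedure used in the study, and it does not compute p values; the function names and the example scores are hypothetical.

```python
from statistics import mean


def average_ranks(values):
    """Rank values 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def interpret_correlation(rho):
    """Bands as quoted in the text from [54]."""
    size = abs(rho)
    if size < 0.25:
        return "low"
    if size < 0.50:
        return "fair"
    if size < 0.75:
        return "moderate to good"
    return "good to excellent"


# Hypothetical scores; two participants share the top Barthel score.
barthel = [20, 20, 18, 15, 12]
activity = [100, 74, 70, 55, 40]
rho = spearman_rho(barthel, activity)
print(round(rho, 2), interpret_correlation(rho))
```

Tied scores matter for a measure with a ceiling, since several participants can share the maximum score; averaging their ranks keeps the coefficient defined in that case.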

Results

The sample consisted of 48 people (one person died during the admission). Table 2 shows the participant characteristics. The community rehabilitation centers covered areas of high and low socio-economic status. Approximately one-third of the sample were first- or second-generation migrants. Medical backgrounds were diverse across diagnostic categories and included psychological problems, with 73% (n = 35) of the referrals being from an acute background and 27% (n = 13) from a chronic background. The responsiveness and sensitivity results for the acute and chronic referrals are reported separately due to their differing clinical backgrounds and potential for gains in rehabilitation [45,55]. As for the younger acute participants (i.e. <65 years), the majority (10/13) had suffered serious neurological events with limited recovery (mainly strokes or cerebral haemorrhages). For that reason, the acute admissions were divided into three comparably sized age-based groups (0–65, n = 12; 66–77, n = 12 and 78+, n = 11), with the latter two groups used for the responsiveness and sensitivity tests.

Table 2 (continued). Participant characteristics: admission and discharge, respondent n (%).

Variable                              Admission (n = 48)    Discharge (n = 48)
Gender
  Male                                30 (63)               30 (63)
  Female                              18 (37)               18 (37)
Age
  Mean (SD)                           64.17 (17.85)
  Range                               20–93
Acute/chronic admission
  Acute                               35                    35
  Chronic                             13                    13
Mobility aid
  Yes                                 28 (58.3)             20 (41.7)
  No                                  20 (41.7)             28 (58.3)
Cohabiting status
  Lived with someone else             40 (83)
  Lived alone                         8 (17)
Other measures
  Charlson Comorbidity Index
    Mean (SD)                         5.04 (3.04)
    Range                             0–13
  Number of medications (a)
    Mean (SD)                         7 (4.22)
    Range                             0–16
Primary condition at admission
  Neurological                        22
  Orthopedic                          11
  Cardiac                             5
  Others                              10

(a) Based on 36 observations.

Table 3 contains the results of the APM summary scores. Both Activity and Participation Summaries showed decline at admission, compared with premorbid, followed by an improvement during the episode of care, but not to the premorbid level. This was a common pattern throughout the study.

Table 3. Activity and participation summary scores for sample.

                          N     Minimum   Maximum   Mean     SD
Activity summary
  Premorbid               35    43        100       93.66    10.84
  Admission               48    31        97        70.91    15.10
  Discharge               48    35        100       80.36    14.45
Participation summary
  Premorbid               35    12        27        20.66    3.17
  Admission               48    9         22        16.33    3.33
  Discharge               48    11        24        18.52    3.46

Practicality and clinical sensibility

The occupational therapists took 17 min on average to complete the data collection forms (first phase of data collection) after completing their standard clinical interviews. The raters needed 5 min on average to score the forms, giving a total of 22 min per participant. The clinicians’ perceptions of using the APM are given in Table 4. Themes in the written feedback and focus group meeting were that some clinicians said they found the APM a useful prompt to

cover a full range of activities in assessment. Managing finances, education, banking and laundry were mentioned as possible additional items, and some therapists commented that the APM did not capture change in smaller component tasks, such as bed transfers, or therapy inputs such as the use of energy conservation and compensatory upper limb techniques. No clinicians reported being unable to get the required information in clinical interview, e.g. due to cognitive or communication difficulties. The two scorers found that generally the recording forms were completed appropriately in order to score them. Nevertheless, problems were noted, including omissions and ambiguous entries. The scorers had to contact two occupational therapists to clarify entries on 7/48 forms.

Table 4. Occupational therapist perceptions of using the APM.

Characteristic                          Score (a)
Ease of use                             8.4
Clinical comprehensiveness              8.6
Quality of training                     8.2
Quality of cheat sheets                 10
Ability to capture clinical change      8.4
All scores totaled                      8.7

(a) Score range 0–10; 10 = easiest/most comprehensive/highest quality.

Inter-rater reliability

The inter-rater reliability results for the Activity and Participation Summaries are given in Table 5. These scores indicate high reproducibility, but there was more variation at the item level. In total, 81 items were rated over the three time points. Of these, 76 (94%) achieved a single-measure inter-rater reliability of >0.75, indicating excellent reproducibility. The remaining items ranged from 0.58 to 0.72 (fair to good reproducibility), with the exception of premorbid heavy housework (0.16, poor reproducibility).

Table 5. Inter-rater reliability.

                          Premorbid   Admission   Discharge
Activity summary          0.97        0.98        0.95
Participation summary     0.97        0.96        0.97

Responsiveness to change

The effect sizes of the Activity and Participation Summaries between admission and discharge for the sample as a whole were moderate (0.72 and 0.65, respectively; p < 0.001) (Table 6). In terms of validity evaluation, 10 tests of responsiveness comparing known groups between admission and discharge were carried out. The data were non-normal and unequal variances were assumed. The results are summarized in Table 7 and the 10 tests are reported individually in Supplementary Table S1.

Sensitivity

Twenty-eight sensitivity tests were calculated; they are summarized in Table 8 and reported individually in Supplementary Tables S2–S13.

Concurrent validity

Table 9 reports the correlation between the Barthel Index and APM summary scores. For the admission and discharge time points, the correlation is moderate to excellent. The lower premorbid figures may result from the smaller sample at this time point, but the correlation here is fair. The p value is <0.001 except at premorbid, when it is <0.05. However, while the correlation between the measures was satisfactory, the APM summary scores were considerably more responsive to change, as summarized in Table 10. The detailed effect size statistics for the Barthel Index are reported in Supplementary Table S1.

Discussion

The APM is a novel measure of activity performance and participation, which was evaluated in a convenience sample of 48 people who received community occupational therapy. The research version used activity items and innovative scoring approaches to provide global numerical ratings of activity performance and participation. In an initial study with a highly heterogeneous sample, the APM demonstrated good clinimetric qualities including responsiveness to change, sensitivity, practicality, clinical sensibility, item coverage, inter-rater reliability and concurrent validity with the Barthel Index.

Clinical sensibility

In terms of clinical sensibility, the occupational therapist feedback was globally positive. Inevitably, there have to be compromises between micro and macro levels, and additional items would be useful for some populations (e.g. education or managing finances). The high inter-rater reliability and perceptions of the principal investigators/scorers would indicate that the tangible information required to complete the APM can be realistically collected from participants. However, the use of a two-stage process, with one stage describing performance and the other scoring it, opened up opportunities for additional error and increased administration time. It would be appropriate to trial using the APM in a more conventional way, where a single person (or team) records and scores the information. This would also allow a reduction in the average time to complete the form of 22 min (in addition to standard clinical interviews). Another strength was that no difficulties were reported when using the APM across the socioculturally diverse groups in the study. This would correspond well with the APM’s attempt to strike a balance between universal aspects of activity performance and individualized measurement.

Inter-rater reliability

There are several possible explanations for the high inter-rater reliability scores. One is the highly behavioral and tangible nature of the scoring system, which was designed to align closely with clinician reasoning and documentation styles. Another explanation is the large number of items comprising each scale (up to 27 per time point), which is known to increase reliability [19]. The moderate to fair scores on some items were mainly due to occasional ambiguous wording by occupational therapists, particularly on complex multi-component activities, and omissions recording mobility aid use. However, in general, reliability remained good to excellent on these items too.

Responsiveness to change and sensitivity

Responsiveness to change and sensitivity are key qualities for an outcome measure and shortcomings in these areas are a recurrent theme in critiques of activity measurement tools [12]. Overall, this study gave strong initial support to the APM as a useful tool able to identify groups with known differences in activity performance, especially given the low sample numbers. As

regards responsiveness to change, the results were in line with hypotheses 8/10 times, while statistical significance was achieved in two of these tests. The only test in the study to go against the hypotheses involved males and females and was not statistically significant. The Barthel Index gave similar results for the male and female groups. The effect sizes achieved were sometimes very large. The largest in the study was the sensitivity testing of the Activity Summary between acute participants using and not using a gait aid premorbidly, with a robust Cohen’s D of 3.43 (p = 0.001). The highest figure for responsiveness (2.11, p = 0.001) was achieved using the Participation Summary between admission and discharge in the low comorbidity acute participants. These figures seem remarkable in groups of 7 and 28 for the sensitivity comparison, and 11 for responsiveness to change. However, robust Cohen’s D values of over 0.8 (large effect size) were not isolated instances (see Supplementary Material). Confidence intervals were often wide, but not always; e.g. the Participation Summary distinguished between acute and chronic referrals, and high and low scoring groups on the Charlson Comorbidity Index, without overlapping CIs.

Improved responsiveness in measuring rehabilitation outcomes is a major concern in the literature and among evidence-minded clinicians. However, given that the small sample size was an issue in the study and limits the generalizability of the results, further investigation with larger samples seems warranted to check if these levels of responsiveness and sensitivity can be reproduced.

As Table 10 indicates, the mean Activity Summary effect sizes were nearly twice as large as those obtained using the Barthel Index (0.96 versus 0.5), while the Participation Summary was also more responsive (0.87 versus 0.5). Also noteworthy is the difference in ranges, with the Barthel tracking a range of 0.61 (0.35–0.96), compared with 1.47 (0.32–1.79) for the Activity and 1.94 (0.17–2.11) for the Participation Summaries, respectively. Whereas most of the Activity and Participation Summary scores indicated the demographic groups as achieving large and moderate effect sizes during the admission, the Barthel Index equivalent scores indicated small effect sizes in six out of nine tests.

The oft-mentioned ceiling effect of the Barthel [18] is evident in Supplementary Figure S1 showing Barthel Index and Activity Summary scores at discharge for the 35 acute participants. The maximum score on the Barthel (20) corresponds with a score of 74–100 on the Activity Summary. The correlation between the measures is also evident. Similar charts are available for the other time points. Here, the APM results aligned clearly with the critiques of the Barthel Index found in the literature, although it is worth recalling that this much used measure was not designed for the kind of complex activities treated in community occupational therapy. In summary, the APM performed well on the key tests of responsiveness and sensitivity, especially given the small group sizes.

Table 6. Responsiveness of activity and participation summaries (whole sample).

                          Admission             Discharge             Unpooled robust Cohen’s D    Wilcoxon
                          Mean    SD     N      Mean    SD     N      (95% confidence interval)    test (a)
Activity summary          70.91   15.77  48     80.36   14.68  48     0.72 (0.43, 1.07)            5.19, 0.001 (b)
Participation summary     16.33   3.33   48     18.52   3.46   48     0.65 (0.35, 1.06)            5.26, 0.001 (b)

(a) Wilcoxon: standardized test statistic, p value.
(b) p value <0.001.

Table 7. Summary of responsiveness tests.

In line with hypotheses and statistically significant (a)         2
In line with hypotheses and not statistically significant (a)     6
Contrary to hypotheses and statistically significant (a)          0
Contrary to hypotheses and not statistically significant (a)      2
Total                                                             10

(a) Statistically significant: Wilcoxon test <0.05 and no overlap between robust Cohen’s D confidence intervals.

Table 8. Summary of sensitivity tests.

In line with hypotheses and statistically significant (a)         9
In line with hypotheses and not statistically significant (a)     14
Inconclusive (b)                                                  5
Total                                                             28

(a) Statistically significant: Mann–Whitney U <0.05 and robust Cohen’s D confidence interval does not contain 0.
(b) Inconclusive: effect size −0.2 to 0.2.

Table 9. Spearman rank order correlation between Barthel Index and APM-generated scores.

                            n     Spearman’s rho   Sig. (2-tailed)
Premorbid
  Participation summary     35    0.32             0.03
  Activity summary          35    0.40             0.008
Admission
  Participation summary     48    0.62             0.001
  Activity summary          48    0.85             0.001
Discharge
  Participation summary     48    0.68             0.001
  Activity summary          48    0.81             0.001

Table 10. Mean responsiveness effect sizes (unpooled robust Cohen’s D), activity and participation summaries and Barthel Index.

Score                     Mean responsiveness               Range of unpooled     Number of
                          (= unpooled robust Cohen’s D)     robust Cohen’s D      tests
Activity summary          0.96                              0.32–1.79             11
Participation summary     0.87                              0.17–2.11             11
Barthel Index             0.5                               0.35–0.96             9 (a)

(a) The lack of variability in Barthel Index scores led to statistical tests failing twice.
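The responsiveness figures in Tables 6 and 10 are standardized changes between admission and discharge. The sketch below shows the general idea only: a plain (non-robust) Cohen’s d standardized by the SD of the discharge scores, with a percentile bootstrap interval from 600 resamples as described under Data analysis. It does not reproduce the robust estimator of Algina et al. [50,51] that the study used, so its output should not be expected to match the published values; the function names, the seed and the example scores are hypothetical.

```python
import random
from statistics import mean, stdev


def cohens_d_discharge_sd(admission, discharge):
    """Mean paired change divided by the SD of the discharge scores."""
    change = mean(d - a for a, d in zip(admission, discharge))
    return change / stdev(discharge)


def bootstrap_ci(admission, discharge, n_boot=600, level=0.95, seed=0):
    """Percentile bootstrap interval, resampling participants with replacement."""
    if len(set(discharge)) < 2:
        raise ValueError("discharge scores need some variability")
    rng = random.Random(seed)
    pairs = list(zip(admission, discharge))
    estimates = []
    while len(estimates) < n_boot:
        sample = [rng.choice(pairs) for _ in pairs]
        adm, dis = zip(*sample)
        if len(set(dis)) > 1:  # skip resamples whose discharge SD would be zero
            estimates.append(cohens_d_discharge_sd(adm, dis))
    estimates.sort()
    tail = (1 - level) / 2
    lower = estimates[int(tail * n_boot)]
    upper = estimates[int((1 - tail) * n_boot) - 1]
    return lower, upper


def interpret_effect(d):
    """Cohen's bands as quoted in the text: <0.2, 0.2-0.49, 0.5-0.79, 0.8+."""
    size = abs(d)
    if size < 0.2:
        return "trivial"
    if size < 0.5:
        return "small"
    if size < 0.8:
        return "moderate"
    return "large"


# Hypothetical admission and discharge scores for eight participants.
admission = [55, 62, 70, 48, 81, 66, 73, 59]
discharge = [68, 70, 85, 60, 88, 71, 90, 64]
d = cohens_d_discharge_sd(admission, discharge)
print(round(d, 2), interpret_effect(d), bootstrap_ci(admission, discharge))
```

With samples as small as the subgroups in this study, the bootstrap interval can be wide, which is consistent with the wide confidence intervals noted in the discussion.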


Other uses

The focus of this study was on the APM’s clinimetric characteristics and especially the summary scores. However, for rehabilitation researchers the APM has the potential to generate a large range of additional data. This could include items of most and least change, counts of performance methods and changes between them, or profile scores of related activities (e.g. domestic or community activities). This opportunity to provide a less generic representation of activity and participation is referred to as important by Wade et al. [9] and would offer meaningful alternatives to summary scores, which have the potential to conceal important information [31]. Similarly, other transitions could be examined, e.g. comparing admission and discharge with premorbid status, or participant progress between services.


Weaknesses

This study was a pilot study. Weaknesses included the size of the convenience sample and its high heterogeneity. Although heterogeneity had been expected and was used as a basis for the statistical approach to validation, it nevertheless reduces generalizability. There were limited comparisons with other measures (only the Barthel Index on responsiveness). Given that sensitivity to change depends on sample and treatment effects, as well as measurement instrument characteristics [19], more head-to-head comparisons with other measures would have been informative. The focus of the validity testing was on activity limitation; head-to-head comparisons with participation measures and extended or community ADL measures would be useful. It would also be useful to compare with impairment-specific measures, including cognition and communication. As would be expected with a new measure, several areas were not examined, including test–retest reliability, minimal clinical difference and service user perceptions of the rating system and scores. It would be useful to test inter-rater reliability with two raters for single participants.

Conclusion

While this study appeared broadly supportive of future use and validation of the APM, it also identified areas where item coverage, training, administration processes and documentation can be improved. Further trial and evaluation of the APM is warranted, particularly in clinical areas where it is felt that there is insufficient knowledge of activity performance and participation change and how these areas relate to clinical inputs. In terms of ongoing validation, further concurrent, sensitivity and responsiveness testing, especially head-to-head comparisons of larger samples with extended ADL, participation and functional measures, would be a good next stage.

Acknowledgements

The authors specially thank Frances Wright, who as co-principal investigator was instrumental in the realization of this study. They also thank Dr. Priscilla Robinson, Anne Cattermole, Deidre Mahon, Ry Li, Maria Mercuri, Cheryl Ellix and Andrea Funke.

Declaration of interest

The authors report no declarations of interest.

References

1. Lynch SM, Brown JS, Taylor MG. Demographics of disability. In: Uhlenberg P, ed. International handbook of population aging. New York: Springer; 2009.
2. Albert SM, Freedman VA. Public health and aging: maximizing function and well-being, 2nd ed. New York: Springer Publishing Company LLC; 2009.
3. Altman BM. Population survey measures of functioning: strengths and weaknesses. In: Wunderlich GS, ed. Improving the measurement of late-life disability in population surveys beyond ADLs and IADLs. Washington DC: The National Academies Press; 2009:99–156.
4. World Health Organisation. International classification of functioning, disability and health. Geneva: WHO; 2001.
5. Katz S, Ford A, Moskowitz R, et al. Studies of illness in the aged. J Am Med Assoc 1963;185:914–19.
6. Lawton M, Brody E. Assessment of older people: self-maintaining and instrumental activities of daily living. Gerontologist 1969;9:179–86.
7. McDowell I. Measuring health a guide to rating scales and questionnaires. Oxford: Oxford University Press; 2006.
8. Dittmar SS, Gresham GE, eds. Functional assessment and outcome measures for the rehabilitation health professional. Gaithersburg (MD): Aspen; 1997.
9. Wade D, Smeets R, Verbun J. Research in rehabilitation medecine: methodological challenges. J Clin Epidemiol 2010;63:699–704.
10. Kovar M, Lawton M. Functional disability: activities and instrumental activities of daily living. In: Kovar M, Lawton M, eds. Annual review of geriatrics and gerontology. New York: Springer Press; 1994:57–75.
11. Turner-Stokes L. Goal attainment scaling (GAS) in rehabilitation: a practical guide Clinical Rehabilitation Online [serial on the Internet]. 2008: Available from: http://cre.sagepub.com/cgi/rapidpdf/0269215508101742v1.pdf [last accessed 8 Jul 2009].
12. Grill E, Stucki G. Scales could be developed based on simple clinical ratings of International Classification of Functioning, Disability and Health Core Set categories. J Clin Epidemiol 2009;62:891–8.
13. Jette AM, Haley SM. Contemporary measurement techniques for rehabilitation outcomes assessment. J Rehabil Med 2005;37:339–45.
14. Pearson V. Assessment in function in older adults. In: Kane RL, Kane RA, eds. Assessing older persons measures, meaning and practical applications. Melbourne: Oxford University Press; 2000:17–34.
15. Uniform Data Systems. Guide for the Uniform Data Set for Medical Rehabilitation (Adult FIM), version 5.0. Buffalo (NY): State University of New York at Buffalo; 1999.
16. Mahoney F, Barthel D. Functional evaluation: the Barthel Index. Md State Med J 1965;14:56–61.
17. Lawton M. Quality of life in chronic illness. Gerontologist 1999;45:181–3.
18. Cohen ME, Marino RJ. The tools of disability outcomes research functional status measures. Arch Phys Med Rehabil 2000;81:S21–9.
19. Streiner D, Norman G. Health measurement scales. Oxford: Oxford University Press; 2008.
20. Gill T, Robson J, Tinetti M. Difficulty and dependence: two components of the disability continuum among community-living older persons. Ann Intern Med 1998;128:96–101.
21. Tooby J, Cosmides L. The psychological foundations of culture. In: Barkow J, Tooby J, Cosmides L, eds. The adapted mind. Oxford: Oxford University Press; 2000:19–136.
22. Tattersall I. Becoming human evolution and human uniqueness. New York: Mariner Books; 1999.
23. McCuaig M, Frank G. The able self: patterns and choices in independent living for a person with cerebral palsy. Am J Occup Ther 1991;45:224–34.
24. Charmaz K. Good days, bad days. New Jersey: Rutgers University Press; 1991.
25. Reed G, Lux J, Bufka L, et al. Operationalizing the international classification of functioning, disability and health in clinical settings. Rehabil Psychol 2005;50:122–31.
26. Magasi S, Hammel J, Heinemann A, et al. Participation: a comparative analysis of multiple rehabilitation stakeholders’ perspectives. J Rehabil Med 2009;41:936–44.
27. Dijkers MP. Issues in the conceptualization and measurement of participation: an overview. Arch Phys Med Rehabil 2010;91:S5–16.
28. Magasi S, Post M. A comparative review of contemporary participation measures’ psychometric properties and content coverage. Arch Phys Med Rehabil 2010;91:S17–28.
29. Turner-Stokes L. Goal attainment scaling (GAS) in rehabilitation: a practical guide. Clin Rehabil 2009;23:362–70.
30. Smith R, Darzins P, Steel C, et al. Outcome measures in rehabilitation. Parkville: National Ageing Research Institute; 2001.
31. Whiteneck G, Dijkers M. Difficult to measure constructs: conceptual and methodological issues concerning participation and environmental factors. Arch Phys Med Rehabil 2009;90:S22–35.
32. Badley E. Enhancing the conceptual clarity of the activity and participation components of the International Classification of Functioning, Disability and Health. Soc Sci Med 2008;66:2335–45.
33. Chalmers A. What is this thing called science? Buckingham: Oxford University Press; 1999.
34. Brown D. Human universals. Philadelphia: Temple University Press; 1991.
35. American Occupational Therapy Association. Uniform terminology for occupational therapy, 3rd ed. Am J Occup Ther 1994;48:1047–59.
36. Perenboom R, Chorus A. Measuring participation according to the International Classification of Functioning, Disability and Health (ICF). Disabil Rehabil 2003;25:577–87.
37. Feinstein A. Clinimetrics. New Haven: Yale University Press; 1987.
38. Fayers P, Hand D. Causal variables, indicator variables and measurement scales: an example from quality of life. J R Stat Soc A Stat Soc 2002;165:233–52.
39. Feinstein A. Multi-item “instruments” vs Virginia Apgar’s principles of clinimetrics. Arch Intern Med 1999;159:125–8.
40. Feinstein AR, Josephy BR, Wells CK. Scientific and clinical problems in indexes of functional disability. Ann Intern Med 1986;105:413–20.
41. Fava G, Balaise C. A discussion on the role of clinimetrics and the misleading effects of psychometric theory. J Clin Epidemiol 2005;58:753–6.
42. Fava G, Tomba E, Sonino N. Clinimetrics: the science of clinical measurements. Int J Clin Pract 2011;66:11–15.
43. Streiner D. Being inconsistent about consistency: when coefficient alpha does and doesn’t matter. J Pers Assess 2003;80:217–22.
44. Molineux M. Occupation for occupational therapists. Oxford: Blackwell Publishing; 2004.
45. Australian Institute of Health and Welfare. Disability and its relationship to health conditions and other factors. Canberra: AIHW (Disability Series); 2004.
46. Guralnik J, Ferrucci L. Demography and epidemiology. In: Halter JB, Ouslander JG, Tinetti ME, et al, eds. Hazzard’s geriatric medicine and gerontology. New York: McGraw Hill Medical; 2009:45–67.
47. Collin C, Wade D, Davies S, Horne V. The Barthel ADL Index: a reliability study. Int Disabil Stud 1988;10:61–3.
48. Charlson M, Szatrowski TP, Peterson J, Gold J. Validation of a combined comorbidity index. J Clin Epidemiol 1994;47:1234–51.
49. Rosner B. Fundamentals of biostatistics. Belmont: Duxbury Press; 2005.
50. Algina J, Keselman H, Penfield R. Effect sizes and their intervals: the two-level repeated measures case. Educ Psychol Meas 2005;65:241–58.
51. Algina J, Keselman H, Penfield R. Confidence interval coverage for Cohen’s effect size statistic. Educ Psychol Meas 2006;66:945–60.
52. Grissom R, Kim J. Effect sizes for research: univariate and multivariate applications. Hoboken: Taylor and Francis; 2012.
53. Algina J, Keselman H, Penfield R. An alternative to Cohen’s standardized mean difference effect size: a robust parameter and confidence interval in the two independent groups case. Psychol Methods 2005;10:317–28.
54. Portney L, Watkins M. Foundations of clinical research: application to practice. 3rd ed. Upper Saddle River (NJ): Prentice Hall Health; 2009.
55. Australian Institute of Health and Welfare. Older Australia at a glance. Canberra: AIHW 52; 2007.

Supplementary material available online. Tables S1–S13 and Figure S1.
