Archives of Gerontology and Geriatrics 61 (2015) 149–153
Psychometric properties of the Balance Computerized Adaptive Test in residents in long-term care facilities
Wen-Shian Lu a,b,*, Bella Ya-Hui Lien b, Ching-Ling Hsieh c,d
a School of Occupational Therapy, Chung Shan Medical University, Occupational Therapy Room, Chung Shan Medical University Hospital, Taiwan
b Department of Business Administration, National Chung Cheng University, Taiwan
c School of Occupational Therapy, College of Medicine, National Taiwan University, Taiwan
d Department of Physical Medicine and Rehabilitation, National Taiwan University Hospital, Taiwan
Article info
Article history: Received 24 September 2014; Received in revised form 5 April 2015; Accepted 23 April 2015; Available online 18 May 2015
Keywords: Balance; Computerized adaptive testing; Long-term care

Abstract
Purpose: To validate the psychometric properties and efficiency of the Balance Computerized Adaptive Testing (Balance CAT) when applied to residents in long-term care (LTC) facilities.
Design and methods: A cohort study was conducted in central Taiwan. The Balance CAT, the Berg Balance Scale (BBS), and the Barthel Index (BI) were administered by a trained rater in two days to each participant who was able to follow simple instructions. Pearson’s correlation coefficient (r) was used to determine the concurrent validity of the Balance CAT. ANOVA and post hoc analysis were employed to investigate the discriminative ability of the Balance CAT. The paired t test was used to validate the efficiency.
Results: A total of 120 participants completed assessments of the Balance CAT, the BBS, and the BI. The Pearson’s r between the scores of the Balance CAT and the BBS was 0.90. Groups with different levels of dependence had significantly different mean scores of the Balance CAT. The mean IRT reliability of the Balance CAT scores was 0.93. The mean administration time of the Balance CAT was about 28% of that of the BBS, and the mean number of items used in the Balance CAT was 3.4.
Conclusions: The Balance CAT had excellent concurrent and discriminative validity, reliability, and efficiency in residents of LTC facilities. These results indicate that the Balance CAT is a sound and practical measure for assessing the balance function of residents of LTC facilities.
© 2015 Elsevier Ireland Ltd. All rights reserved.
1. Introduction
The population of the elderly worldwide is increasing continuously, a phenomenon that has led to more residents living in long-term care (LTC) facilities (Lin, Chou, Liang, Peng, & Chen, 2010). Balance, or postural control, represents the ability to maintain stability when a person needs to sustain or change body posture. This elementary body function is highly related to the mobility and function in activities of daily living of LTC residents (Crocker et al., 2013; Weening-Dijksterhuis, de Greef, Scherder, Slaets, & van der Schans, 2011). Thus, restoration or even improvement of the residents’ balance function is a very important indicator of the quality of care at LTC facilities.
* Corresponding author at: No. 110, Sec. 1, Jianguo N. Rd, Taichung City 40201, Taiwan. Tel.: +886 816424730022. E-mail address: [email protected] (W.-S. Lu).
http://dx.doi.org/10.1016/j.archger.2015.04.009
0167-4943/© 2015 Elsevier Ireland Ltd. All rights reserved.
For management of the balance function of LTC residents, it is necessary for clinicians to assess residents’ balance function periodically. The ﬁrst step is to determine a suitable tool to assess speciﬁc characteristics, which is a concern in the clinical setting in LTC facilities. Thus, a valid and reliable measure of balance function is a critical foundation for clinicians to assess and manage residents’ balance function. A measure with good validity and reliability is critical for assessing a speciﬁc characteristic of a certain population (Hobart, Lamping, & Thompson, 1996). The concurrent validity presents a level of correlation between a new measure and a gold standard measure, which is a well-known standard measure for assessing the speciﬁc construct (Mokkink et al., 2010). The discriminative validity refers to the ability of a measure to discriminate patients with various levels of ability (especially with extreme ability) or with different characteristics (e.g., normal vs. abnormal) (Kline, 1998). Reliability represents the level of variation of the measurements (e.g., random measurement error) of a measure (Hobart et al., 1996; Kline, 1998). The efﬁciency (especially
important in a clinical setting) of a measure refers to the time needed for administration in practice, or administration time. To determine whether a measure is sound and efficient for use, the above properties need to be investigated.
The Berg Balance Scale (BBS) (Berg, Wood-Dauphinee, Williams, & Maki, 1992) is commonly used to assess the balance function of residents of LTC facilities. However, the BBS, having 14 items, is time-consuming for both clinicians and residents. The computerized adaptive test of balance function (Balance CAT) was developed to assess the balance function of stroke patients reliably and efficiently (Hsueh et al., 2010). The Balance CAT has been shown to have a high association with the BBS (concurrent validity), sufficient responsiveness, and good predictive validity when administered to stroke patients (Hsueh et al., 2010; Yu, Hsueh, Hou, Wang, & Hsieh, 2012). Thus, the Balance CAT could be potentially useful for residents of LTC facilities.
However, psychometric properties may be sample dependent. In particular, the characteristics (e.g., average balance function, course of disease) of residents of LTC facilities are not the same as those of stroke patients in a clinical setting. These differences may influence the suitability of the Balance CAT when administered to residents of LTC facilities. In other words, the psychometric properties of the Balance CAT in LTC residents are still unknown. Thus, the purpose of this study was to validate the concurrent validity, discriminative validity, reliability, and efficiency of the Balance CAT in residents of LTC facilities.
2. Methods
2.1. Subjects
We recruited residents from LTC facilities in central Taiwan to participate in this study. The following criteria were set to recruit the participants: (1) stable physical and psychological condition confirmed by the senior nurse in the LTC facility, and (2) ability to follow simple instructions to complete the interview and performance testing.
Recruited residents were excluded if they presented an unstable physical or psychological condition or declined to participate in further assessments. The study protocol was approved by the Institutional Review Board of a local university. Each participant provided written informed consent.
2.2. Procedure
The BBS, the Barthel Index (BI), and the Balance CAT were administered to each participant by a single trained rater in 2 days. To control for possible bias of the order effect, the order of administration of the BBS and the Balance CAT was reversed for half of the participants. The rater was an occupational therapist and had received 8 h of training on the use of the BBS, the BI, and the Balance CAT before administering these measures formally to the participants. The scores of the mini-mental state examination (MMSE) and sociodemographic characteristics of the participants were collected from the nursing records of the nursing homes.
The Balance CAT is a computer-based instrument, and the program is installed on a web-based server (website: http://140.112.116.44/cat/). Its item bank contains 34 balance-related items, such as sitting, standing, standing on one leg, and jumping vertically with both legs (Appendix 1; Hsueh et al., 2010). The raters can use a mobile device (e.g., a cell phone) to administer the CAT via the Internet. The Balance CAT presents instructions, scoring criteria, and animations of examinees’ required movements and postures for each item. At the beginning of testing, the Balance CAT first presents a question for the rater to judge whether the patient “cannot stand”, “can stand but cannot walk”, or “can walk”. Then the CAT presents the subsequent items on the basis of the responses. For example, if the patient can stand, the subsequent item will be selected from items that require the ability to stand, such as “standing without support and with eyes closed for 10 s”, “picking up a pen on the floor”, or “marching in place”. If the patient can complete an item (e.g., standing without support and with eyes closed for 10 s), the CAT will present a more difficult item (e.g., marching in place). The assessment ends when it reaches the predefined stopping rule (i.e., IRT reliability coefficient ≥ 0.9 or maximum test length of 6 items). The total score of the Balance CAT ranges from 0 to 10, with a higher score representing better balance function.
The BBS is the most commonly used measure of the balance ability of the elderly in various contexts (Berg et al., 1992). This scale contains 14 common balance-related activities of daily living (including sitting, standing, and picking up something on the floor when standing), and each item is scored from 0 to 4. The total score ranges from 0 to 56, with a higher score representing higher balance ability. The BBS is valid and reliable when administered to the elderly (Berg et al., 1992; Holbein-Jenny, Billek-Sawhney, Beckman, & Smith, 2005).
The BI is the most widely used measure of the level of independence in activities of daily living (ADL) (Mahoney & Barthel, 1965). The BI comprises 10 items and has a total score ranging from 0 to 100. The BI has good psychometric properties for evaluating the ADL performance of stroke survivors and the aged (Holbein-Jenny et al., 2005; Hsueh, Lee, & Hsieh, 2001; Hsueh, Lin, Jeng, & Hsieh, 2002; Sainsbury, Seebass, Bansal, & Young, 2005; Stone, Ali, Auberleek, Thompsell, & Young, 1994). The BI score is useful for categorizing a person’s level of dependence. A BI score of 0–20 indicates total dependence; one of 21–60, serious dependence; one of 61–90, moderate dependence; one of 91–99, slight dependence; and a score of 100, total independence (Lin, Wang, Chen, Wu, & Portwood, 2005; Mahoney & Barthel, 1965).
The MMSE is commonly used to assess patients’ mental status. The dimensions of the MMSE include orientation, memory, attention, naming, following verbal and written commands, writing a sentence spontaneously, and copying a complex polygon. The total score of the MMSE ranges from 0 to 30, with a higher score representing better mental status. The MMSE has sufficient concurrent validity and test-retest and inter-rater reliability in mentally impaired patients and normal subjects (Crum, Anthony, Bassett, & Folstein, 1993; Folstein, Folstein, & McHugh, 1975).
2.4. Statistical analysis
2.4.1. Concurrent validity
The Pearson correlation coefficient (r) was employed for analyzing the concurrent validity between the Balance CAT and the BBS scores. Correlations were considered large for |r| ≥ 0.75, moderate for 0.40 ≤ |r| < 0.74, and small for |r| < 0.40 (McCarthy et al., 2002; Salter et al., 2005). Our hypothesis was that there would be a large association between the scores of the Balance CAT and those of the BBS.
2.4.2. Discriminative validity
We used two ways to validate the discriminative validity of the Balance CAT. First, we calculated the ratios of the participants with the highest and lowest scores of the Balance CAT to the total participants, respectively. A ratio of greater than 20% represented a ceiling or floor effect (Holmes & Shea, 1997; Wang, Zhang, McArdle, & Salthouse, 2008). We hypothesized that the ratios of participants with the highest and lowest scores of the Balance CAT to total participants would both be less than 10%.
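The ceiling and floor check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' analysis code: the function names and the example scores are hypothetical, and the "highest" and "lowest" scores are taken to be the observed extremes in the sample.

```python
from typing import Sequence, Tuple


def floor_ceiling_ratios(scores: Sequence[float]) -> Tuple[float, float]:
    """Return (floor_ratio, ceiling_ratio): the share of participants who
    obtained the lowest and the highest observed score, respectively."""
    if not scores:
        raise ValueError("scores must not be empty")
    lowest, highest = min(scores), max(scores)
    n = len(scores)
    floor_ratio = sum(s == lowest for s in scores) / n
    ceiling_ratio = sum(s == highest for s in scores) / n
    return floor_ratio, ceiling_ratio


def has_effect(ratio: float, threshold: float = 0.20) -> bool:
    """Flag a floor or ceiling effect when the ratio exceeds the threshold
    (20% is the cut-off quoted in the text)."""
    return ratio > threshold


# Hypothetical scores for illustration only (not study data).
example_scores = [0.0, 1.5, 2.0, 3.2, 4.1, 4.1, 5.0, 6.3, 7.0, 8.07]
floor_ratio, ceiling_ratio = floor_ceiling_ratios(example_scores)
print(has_effect(floor_ratio), has_effect(ceiling_ratio))
```

Comparing each ratio with a stricter threshold of 0.10 would correspond to the hypothesis stated above.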
Second, we used one-way ANOVA and a post hoc test to investigate the significance of differences in the average scores of the Balance CAT between the four groups with different levels of dependence (BI score: 0–20, 21–60, 61–90, ≥91). Our hypothesis was that the scores of the Balance CAT of the groups with lower levels of dependence would be statistically significantly higher than those of the groups with higher levels of dependence.
2.4.3. IRT reliability
The IRT reliability of a CAT’s score is related to the standard error (SE) of each measurement, and can be estimated for each measurement. The SE of each measurement was estimated based on the generalized partial credit model (GPCM). The IRT reliability was calculated according to the formula: reliability = 1 − SE² (Cella, Gershon, Lai, & Choi, 2007; Coster, Haley, Ni, Dumas, & Fragala-Pinkham, 2008; Jette, Haley, Ni, Olarsch, & Moed, 2008). The CAT reports a participant’s score and an IRT reliability for each measurement simultaneously. The mean (SD) IRT reliability was calculated.
2.4.4. Efficiency
The paired t test was used to compare the administration times of the Balance CAT and the BBS, which were administered individually. The level of significance was set at 0.05. In addition, the mean number of items used to administer the Balance CAT was calculated. We hypothesized that the time and number of items needed for administering the Balance CAT would be significantly less than those needed for administering the BBS.
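The reliability formula in Section 2.4.3 and the stopping rule described for the Balance CAT can be illustrated with a minimal Python sketch. It is not the Balance CAT software: the function names are hypothetical, the SE is assumed to be expressed on a standardized ability scale so that 1 − SE² is meaningful, and reading the stopping threshold as "at least 0.9" is an assumption.

```python
import math


def irt_reliability(se: float) -> float:
    """IRT reliability of one measurement: reliability = 1 - SE^2."""
    return 1.0 - se ** 2


def se_from_reliability(reliability: float) -> float:
    """Invert the formula: SE = sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie in [0, 1]")
    return math.sqrt(1.0 - reliability)


def should_stop(se: float, items_used: int,
                target_reliability: float = 0.9, max_items: int = 6) -> bool:
    """Stopping rule as read from the text: stop once the reliability
    reaches the target or the maximum test length has been used."""
    return irt_reliability(se) >= target_reliability or items_used >= max_items


# SE values implied by the reliabilities reported in this paper.
for rel in (0.85, 0.93, 0.97):
    print(f"reliability {rel:.2f} -> SE {se_from_reliability(rel):.2f}")
```

Under these assumptions, a measurement with SE = 0.3 already satisfies the reliability criterion, whereas one with SE = 0.5 continues until the sixth item.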
Table 1 Characteristics of the participants.
Characteristic | Mean (SD)^a or n (%)
Gender: Male, n (%) | 63 (52.5)
Age (years) | 72.0 (12.7)
Mini-mental state examination | 14.7 (8.5)
Duration of residence in LTC^b facility (months) | 35.1 (32.5)
Balance CAT^c | 4.1 (1.8)
Berg balance scale | 17.6 (16.0)
Barthel index | 46.3 (27.1)
Major diagnosis, n (%)
Stroke | 46 (38.3)
Spinal cord injury | 4 (3.3)
Parkinsonism | 6 (5.0)
Poliomyelitis | 2 (1.7)
Huntington’s disease | 1 (0.8)
Other disease related to muscle-skeletal system^d | 14 (11.7)
Dementia | 20 (16.7)
Chronic psychiatry disease^e | 6 (5.0)
Other chronic disease^f | 19 (15.8)
No disease diagnosis | 2 (1.7)
a Standard deviation.
b Long-term care.
c Balance computerized adaptive testing.
d Including: arthritis, total hip replacement, fracture, etc.
e Including: schizophrenia, alcohol abuse, paranoia, affective psychiatric disease,
f Including: hypertension, diabetes mellitus, cancer, kidney disease, lung disease,
3. Results
3.1. Sociodemographic and clinical characteristics of the participants
A total of 128 residents living in 4 LTC facilities participated in this study. Of the original 128, 8 participants failed to complete the tests. The 120 participants (63 male) who completed the tests had an average age of 72 years. The mean (SD) scores of the BBS and Balance CAT were 17.6 (16.0) and 4.1 (1.8), respectively. The mean score of 46.2 (range of 0–100) of the BI implied that the research group had various levels of ADL dependence. Further sociodemographic and clinical characteristics of the participants are shown in Table 1.
3.2. Concurrent validity
The Pearson’s r between scores of the Balance CAT and those of the BBS was 0.90 (p < 0.001). The strength of association was very high.
3.3. Discriminative validity
A total of 4 participants (3.3%) had the highest score (8.07), and 2 participants (1.7%) had the lowest score (0.0). Notably, no participant had a score of 10 (the highest possible score of the Balance CAT). Table 2 shows that the mean scores of the Balance CAT between the four groups with various levels of dependence were significantly different (F(119) = 36.2, p < 0.001). Furthermore, the post hoc comparison (Scheffé method) revealed that the groups with lower levels of dependence consistently had higher scores on the Balance CAT than did those with higher levels of dependence.
3.4. IRT reliability
The mean (SD) IRT reliability of these 120 participants was 0.93 (0.03). The IRT reliability of each score ranged from 0.85 to 0.97.
3.5. Efficiency
The mean (SD) administration time of the Balance CAT was 59 (51.2) s, which differed significantly from the mean (SD) time of 205 (125.2) s for the BBS (t = 14.0, p < 0.001). The mean (SD) number of items used to administer the Balance CAT was 3.4 (1.2).
4. Discussion
This study is the first to verify the quality of the Balance CAT when applied to residents of LTC facilities. To test the concurrent validity of the Balance CAT, we used the BBS as a criterion. The BBS is a well-known instrument with sufficient validity and reliability for assessing the balance function of the elderly (Berg et al., 1992). The high correlation between the Balance CAT and the BBS supports our hypothesis and the concurrent validity of the Balance CAT. In comparison to another study exploring the concurrent validity of the Balance CAT with the BBS when applied to patients with stroke, the correlation between the Balance CAT
Table 2 Scores of the Balance CAT of participants with various levels of dependence (BI score).
BI score group | 0–20 (n = 30) | 21–60 (n = 48) | 61–90 (n = 35) | ≥91 (n = 7)
Balance CAT, Mean (SD)*
* p < 0.001.
and the BBS was 0.88 (Pearson’s r), which is very close to our results (0.90). Thus, our results indicate that the Balance CAT is a valid measure for assessing the balance function of residents of LTC facilities. With regard to the discriminative validity of the Balance CAT, no signiﬁcant ceiling or ﬂoor effects were found, implying that the Balance CAT has discriminative ability when applied to residents of LTC facilities. Thus, our hypothesis was supported. In addition, the Balance CAT was shown to have the ability to discriminate the various levels of dependence (F(119) = 36.2, p < 0.001), which is a critical characteristic of residents of LTC facilities. The results imply that the score of the Balance CAT can be used as an indicator of the level of dependence of residents of LTC facilities. The aforementioned observations support the discriminative validity of the Balance CAT in residents of LTC facilities. The IRT reliability (or SE) of every Balance CAT measurement can be obtained because the CAT was developed according to IRT. A CAT with high mean IRT reliability means that the instrument has small random measurement error and that the measurement is more reliable. Our results showed that the IRT reliabilities of measurements of the Balance CAT ranged from 0.85 to 0.97, with a mean score of 0.93, implying that the random measurement error of individual measurement was small. Thus, the Balance CAT was reliable when applied to residents of LTC facilities. Our results might also show the potential for the Balance CAT as a responsive instrument. An instrument with smaller SE (i.e., random measurement error) of each measurement can detect change in patients more easily than another instrument with larger SE. In other words, if the SE of an instrument is large, the observed change score is very likely to result from measurement error. Thus, the Balance CAT, having high IRT reliability, has the potential to detect change in a patient. 
However, to validate the responsiveness of the Balance CAT, a longitudinal research design is still warranted.
In terms of administration efficiency, the average administration time of the Balance CAT was only 28% of that of the BBS. In addition, only 3.4 items, on average, were needed to administer the CAT. These results strongly support the efficiency of the Balance CAT. Such efficiency supports the use of the Balance CAT in LTC facilities, where clinicians are usually busy.
This study has two main limitations: a convenience sample was used, and the participants needed to be able to follow simple instructions. These limitations hamper the generalization of our results to all residents of LTC facilities. In addition, the item parameters of the Balance CAT were estimated from a cohort of patients with stroke. We could not examine the item parameters of the Balance CAT in LTC residents because we did not administer every item of the CAT to the residents. Future studies might need to validate the item parameters of the Balance CAT in LTC residents, although such a validation would consume both time and effort.
5. Conclusions
In summary, our results showed that the Balance CAT has excellent concurrent and discriminative validity, reliability, and efficiency. Thus, the Balance CAT appears to be a practical measure for assessing the balance function of residents of LTC facilities.
Conflict of interest
None of the authors has any financial or personal relationships with other people or organizations that could inappropriately influence this work.
Appendix 1
Table A1 Contents of item bank of the Balance CAT.
No. | Item^a | Testing position
1 | Sitting with trunk support for 10 s (on a chair with a backrest) | Sitting
2 | Sitting without trunk support for 10 s | Sitting
3 | Reaching for a pen on the less affected side and putting it into a chest pocket on the more affected side | Sitting
4 | Reaching for a pen on the more affected side and putting it into a chest pocket on the less affected side | Sitting
5 | Standing on only the less affected leg for 5 s | Sitting
6 | Picking up a pen on the floor (in front of the less affected leg) | Sitting
7 | Picking up a pen on the floor (centered in front of the patient) | Sitting
8 | Sitting to supine | Sitting
9 | Sitting to standing | Sitting
10 | Supine to standing | Lying
11 | Standing with support for 10 s | Standing
12 | Standing without support for 10 s | Standing
13 | Standing without support and with eyes closed for 10 s | Standing
14 | Turning the body to the more affected side to pick up a pen | Standing
15 | Picking up a pen on the floor (in front of the less affected leg) | Standing
16 | Maintaining a striding posture for 10 s (more affected leg forward, knee bent) | Standing
17 | Picking up a pen on the floor (centered in front of the patient) | Standing
18 | Standing with feet together for 10 s | Standing
19 | Maintaining a striding posture for 10 s (less affected leg forward, knee bent) | Standing
20 | Picking up a pen on the floor (in front of the more affected leg) | Standing
21 | Standing with feet together and with eyes closed for 10 s | Standing
22 | Marching in place | Standing
23 | Standing on only the less affected leg | Standing
24 | Standing heel to toe, more affected foot forward | Standing
25 | Standing to squatting | Standing
26 | Maintaining a squatting position | Standing
27 | Squatting to standing | Standing
28 | Standing to tiptoe | Standing
29 | Tapping alternate feet | Standing
30 | Jumping vertically with both legs | Standing
31 | Standing heel to toe, less affected foot forward | Standing
32 | Standing on only the more affected leg | Standing
33 | Hopping in place on the less affected foot | Standing
34 | Hopping in place on the more affected foot | Standing
a The items are generally arranged in increasing order of difficulty.
References
Berg, K. O., Wood-Dauphinee, S. L., Williams, J. I., & Maki, B. (1992). Measuring balance in the elderly: Validation of an instrument. Canadian Journal of Public Health, Revue Canadienne de Sante Publique, 83(Suppl. 2), S7–S11.
Cella, D., Gershon, R., Lai, J. S., & Choi, S. (2007). The future of outcomes measurement: Item banking, tailored short-forms, and computerized adaptive assessment. Quality of Life Research, 16(Suppl. 1), 133–141. http://dx.doi.org/10.1007/s11136-007-9204-6
Coster, W. J., Haley, S. M., Ni, P., Dumas, H. M., & Fragala-Pinkham, M. A. (2008). Assessing self-care and social function using a computer adaptive testing version of the pediatric evaluation of disability inventory. Archives of Physical Medicine and Rehabilitation, 89(4), 622–629. http://dx.doi.org/10.1016/j.apmr.2007.09.053
Crocker, T., Forster, A., Young, J., Brown, L., Ozer, S., Smith, J., et al. (2013). Physical rehabilitation for older people in long-term care. Cochrane Database Syst Rev, 2, CD004294. http://dx.doi.org/10.1002/14651858.CD004294.pub3
Crum, R. M., Anthony, J. C., Bassett, S. S., & Folstein, M. F. (1993). Population-based norms for the Mini-Mental State Examination by age and educational level. Journal of the American Medical Association, 269(18), 2386–2391.
Folstein, M. F., Folstein, S. E., & McHugh, P. R. (1975). Mini-mental state. A practical method for grading the cognitive state of patients for the clinician. Journal of Psychiatric Research, 12(3), 189–198.
Hobart, J. C., Lamping, D. L., & Thompson, A. J. (1996). Evaluating neurological outcome measures: The bare essentials. Journal of Neurology, Neurosurgery and Psychiatry, 60(2), 127–130.
Holbein-Jenny, M. A., Billek-Sawhney, B., Beckman, E., & Smith, T. (2005). Balance in personal care home residents: A comparison of the Berg Balance Scale, the Multi-Directional Reach Test, and the Activities-Specific Balance Confidence Scale. Journal of Geriatric Physical Therapy, 28(2), 48–53.
Holmes, W. C., & Shea, J. A. (1997). Performance of a new, HIV/AIDS-targeted quality of life (HAT-QoL) instrument in asymptomatic seropositive individuals. Quality of Life Research, 6(6), 561–571.
Hsueh, I. P., Chen, J. H., Wang, C. H., Chen, C. T., Sheu, C. F., Wang, W. C., et al. (2010). Development of a computerized adaptive test for assessing balance function in patients with stroke. Physical Therapy, 90(9), 1336–1344.
Hsueh, I. P., Lee, M. M., & Hsieh, C. L. (2001). Psychometric characteristics of the Barthel activities of daily living index in stroke patients. Journal of the Formosan Medical Association, 100(8), 526–532.
Hsueh, I. P., Lin, J. H., Jeng, J. S., & Hsieh, C. L. (2002). Comparison of the psychometric characteristics of the functional independence measure, 5 item Barthel index, and 10 item Barthel index in patients with stroke. Journal of Neurology, Neurosurgery and Psychiatry, 73(2), 188–190.
Jette, A. M., Haley, S. M., Ni, P., Olarsch, S., & Moed, R. (2008). Creating a computer adaptive test version of the late-life function and disability instrument. Journals of Gerontology. Series A, Biological Sciences and Medical Sciences, 63(11), 1246–1256.
Kline, P. (1998). The new psychometrics: Science, psychology, and measurement. Abingdon, Oxon: Routledge.
Lin, L. C., Wang, T. G., Chen, M. Y., Wu, S. C., & Portwood, M. J. (2005). Depressive symptoms in long-term care residents in Taiwan. Journal of Advanced Nursing, 51(1), 30–37. http://dx.doi.org/10.1111/j.1365-2648.2005.03457.x
Lin, M. H., Chou, M. Y., Liang, C. K., Peng, L. N., & Chen, L. K. (2010). Population aging and its impacts: Strategies of the health-care system in Taipei. Ageing Research Reviews, 9(Suppl. 1), S23–S27. http://dx.doi.org/10.1016/j.arr.2010.07.004
Mahoney, F. I., & Barthel, D. W. (1965). Functional evaluation: The Barthel Index. Maryland State Medical Journal, 14(2), 61–65.
McCarthy, M. L., Silberstein, C. E., Atkins, E. A., Harryman, S. E., Sponseller, P. D., & Hadley-Miller, N. A. (2002). Comparing reliability and validity of pediatric instruments for measuring health and well-being of children with spastic cerebral palsy. Developmental Medicine and Child Neurology, 44(7), 468–476.
Mokkink, L. B., Terwee, C. B., Patrick, D. L., Alonso, J., Stratford, P. W., Knol, D. L., et al. (2010). The COSMIN study reached international consensus on taxonomy, terminology, and definitions of measurement properties for health-related patient-reported outcomes. Journal of Clinical Epidemiology, 63(7), 737–745. http://dx.doi.org/10.1016/j.jclinepi.2010.02.006
Sainsbury, A., Seebass, G., Bansal, A., & Young, J. B. (2005). Reliability of the Barthel Index when used with older people. Age and Ageing, 34(3), 228–232. http://dx.doi.org/10.1093/ageing/afi063
Salter, K., Jutai, J. W., Teasell, R., Foley, N. C., Bitensky, J., & Bayley, M. (2005). Issues for selection of outcome measures in stroke rehabilitation: ICF activity. Disability and Rehabilitation, 27(6), 315–340. http://dx.doi.org/10.1080/09638280400008545
Stone, S. P., Ali, B., Auberleek, I., Thompsell, A., & Young, A. (1994). The Barthel Index in clinical practice: Use on a rehabilitation ward for elderly people. Journal of the Royal College of Physicians of London, 28(5), 419–423.
Wang, L., Zhang, Z., McArdle, J., & Salthouse, T. A. (2008). Investigating ceiling effects in longitudinal data analysis. Multivariate Behavioral Research, 43(3), 476–496. http://dx.doi.org/10.1080/00273170802285941
Weening-Dijksterhuis, E., de Greef, M. H., Scherder, E. J., Slaets, J. P., & van der Schans, C. P. (2011). Frail institutionalized older persons: A comprehensive review on physical exercise, physical fitness, activities of daily living, and quality-of-life. American Journal of Physical Medicine and Rehabilitation, 90(2), 156–168. http://dx.doi.org/10.1097/PHM.0b013e3181f703ef
Yu, W. H., Hsueh, I. P., Hou, W. H., Wang, Y. H., & Hsieh, C. L. (2012). A comparison of responsiveness and predictive validity of two balance measures in patients with stroke. Journal of Rehabilitation Medicine, 44(2), 176–180. http://dx.doi.org/10.2340/16501977-0903