Knee Surg Sports Traumatol Arthrosc DOI 10.1007/s00167-014-3283-z

KNEE

The measurement properties of the IKDC-subjective knee form

Hanna Tigerstrand Grevnerts · Caroline B. Terwee · Joanna Kvist

Received: 14 February 2014 / Accepted: 27 August 2014 © European Society of Sports Traumatology, Knee Surgery, Arthroscopy (ESSKA) 2014

Abstract

Purpose  To evaluate the methodological quality of studies reporting on the measurement properties of the International Knee Documentation Committee subjective knee form (IKDC-SKF) and to evaluate their results following the COnsensus-based Standards for the selection of health status Measurement INstruments (COSMIN) guidelines.

Methods  Systematic search of articles published about the measurement properties of the IKDC-SKF, review of the studies’ methodological quality, and synthesis of the results using the COSMIN guidelines.

Results  Twenty-six studies were identified and reviewed. There was strong evidence for good internal consistency, test–retest reliability, and responsiveness. There was moderate evidence for good content and structural validity. With the SF36 as a gold standard, the level of evidence for criterion validity was indeterminate. There was conflicting evidence for hypothesis testing and not enough evidence to evaluate measurement error and cross-cultural validity. There were no floor or ceiling effects.

Conclusions  This review shows that the IKDC-SKF is a measurement instrument with good internal consistency, test–retest reliability, content and structural validity, responsiveness and interpretability (no floor and ceiling effects). Further evaluation of measurement error, minimal important change, and hypothesis testing is recommended. The IKDC-SKF seems to be useful as a general instrument for all kinds of knee injuries, which might facilitate its clinical use in situations in which time is a factor.

Level of evidence  Systematic review, Level III.

Keywords  COSMIN · Knee injury · Outcome · Questionnaire

Electronic supplementary material  The online version of this article (doi:10.1007/s00167-014-3283-z) contains supplementary material, which is available to authorized users.

H. T. Grevnerts · J. Kvist (*)  Division of Physiotherapy, Department of Medical and Health Sciences, Linköping University, 581 83 Linköping, Sweden e-mail: [email protected]

C. B. Terwee  Department of Epidemiology and Biostatistics and the EMGO Institute for Health and Care Research, VU University Medical Centre, Amsterdam, The Netherlands

Introduction

Knee injuries and diseases such as osteoarthritis, patellofemoral pain syndrome, and ACL injury can result in the need for surgery and long-term rehabilitation and can prevent participation in sports activities. The prevalence of knee injuries has increased in children and adolescents, and athletes with a previous ACL injury are much more likely to have a new knee injury and to develop osteoarthritis [25, 34, 51, 52]. The effectiveness of treatments must be evaluated with reliable and valid instruments, meeting the requirements of different perspectives. From the patient perspective, the treatment has to result in perceived better function, including the absence of pain or other symptoms, and the treatment must restore the patient’s ability to participate in desired activities. Patient-reported outcome measures (PROMs), which are self-administered questionnaires, represent the patient’s perspective of their health and seem a valid and practical way to evaluate outcomes [6]. For a PROM to be useful, it should be well designed and appropriate for use in the population it is intended to measure. The measurement properties of the instrument should also be well documented and of high quality [10, 19]. There are several PROMs for assessing knee symptoms and injuries [26, 40] but




in order to compare outcomes of different methods, a standardization of evaluation is crucial. The International Knee Documentation Committee (IKDC) was formed in 1987 with the aim of developing a standardized international documentation system for knee surgery. The “IKDC-Standard Knee Evaluation Form” was published in 1993, and the “IKDC-subjective knee form” (IKDC-SKF) was published in 2000. The IKDC-SKF is a patient-reported outcome that measures the patient’s perception of symptoms, function, and symptom-free sports activity. This tool is described as being a knee-specific, rather than a disease-specific, PROM [16]. The questionnaire consists of 18 items and results in a total score that ranges from 0 to 100, where 100 represents no impairment and a high participation level. The IKDC-SKF was found to be difficult for children to use, so in 2011 the Pedi-IKDC was developed and tested on children aged 10–18 [20]. The measurement properties of the IKDC-SKF were initially tested for reliability, validity, and responsiveness by Irrgang et al. [16, 17]. In order for clinicians to treat patients based on evidence-based practice, the results of published research should continuously be evaluated and critically reviewed. Systematic critical evaluation of the methodology and review of the results of previous studies in terms of the measurement properties of the IKDC-SKF will make it easier for both clinicians and researchers to determine whether the IKDC-SKF is appropriate for use. This systematic review aimed to evaluate the methodological quality and results of studies reporting on the measurement properties of the IKDC-SKF following the COnsensus-based Standards for the selection of health status Measurement INstruments (COSMIN) guidelines and the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) guidelines [24, 32].

Materials and methods

A literature search of PubMed, Cinahl, and Scopus was conducted (for details please see Supplementary File 1). For PubMed, a precise search filter for searching for studies on measurement properties [45] was used with the search terms “IKDC AND knee.” For searching Cinahl and Scopus, only the terms “IKDC AND knee” were used. The literature search covered articles published until March 24, 2013, and was conducted as described in Fig. 1. Abstract screening was conducted by two independent researchers (HTG, JK) using the following eligibility criteria:

1. The aim of the study had to be to evaluate the measurement properties of the IKDC-SKF.
2. The study had to evaluate the measurement properties of the IKDC-SKF, even when the main aim of the study was to evaluate another measurement instrument, or to compare the IKDC-SKF to another measurement instrument.
3. The study had to be written in English or Swedish.

When there was doubt about whether a study should be included in our analysis, the full text article was retrieved. After independent assessment by both researchers, a consensus was achieved by discussing the conflicting viewpoints (Fig. 1).

Fig. 1  Flowchart showing the literature search process

The definitions, taxonomy, and guidelines for assessing the methodological quality of health-related patient-reported outcomes (HR-PRO) as developed by the COSMIN panel were used [32, 33] (Table 1). A detailed description of the checklist and the evaluation criteria can be found on the web site www.cosmin.nl. The COSMIN checklist has been evaluated for reliability [30] and has been used in several studies [4, 11, 46, 48, 49]. The scoring was performed independently by two researchers (HTG and JK). When there were differences, a consensus was reached after discussion or after consulting a third researcher (CBT). The scoring procedure was as follows:

• Each study was given a methodological quality score according to the COSMIN checklist for each measurement

Table 1  Definition taxonomy according to COSMIN

Reliability
  Internal consistency: Internal relationship of the items included in the (sub)scale
  Test–retest reliability: The extent to which repeated measures close to each other give the same results
  Measurement error: Systematic and random error in a patient’s score that is not attributed to true changes in the construct to be measured

Validity
  Content validity (including face validity): Relevance and comprehensiveness of the items
  Structural validity: The degree to which the score is a reflection of the dimensionality of the construct
  Hypothesis testing: Comparing the score of the measurement instrument to differences in scores of subgroups of patients or to scores of another measurement instrument
  Cross-cultural validity: Adaptation for use in other languages, including the translation process and group comparisons in another language
  Criterion validity: Comparisons with a “gold standard” tool

Responsiveness: A measure of longitudinal validity

Interpretability: Interpretation of scores, including floor and ceiling effects

property it evaluated. The score was given according to the “worst score counts” system, meaning that the lowest score in any section was the overall score given to the study. The score was “excellent,” “good,” “fair,” or “poor” [47]. In general, an item is scored as “excellent” when there is evidence that the methodological quality aspect of the study to which the item is referring is adequate. An item is scored as “good” when relevant information is not reported in an article, but it can be assumed that the quality aspect is adequate. An item is scored as “fair” if it is doubtful whether the methodological quality aspect is adequate. Finally, an item is scored as “poor” when evidence is provided that the methodological quality aspect is not adequate.
• The results for each measurement property in each study got a positive “+,” indeterminate “?,” or negative “−” score. In this scoring system, a positive score indicates good reliability (e.g., ICC > 0.70) or validity (e.g., more than 75 % of results in accordance with predefined hypotheses), an indeterminate score indicates that this could not be evaluated, and a negative score indicates poor reliability or validity [44]. Studies with “poor” scores for methodological quality in the previous step were excluded.
• The summarized level of evidence for each measurement property was then analyzed with best evidence synthesis, taking into consideration both the methodological quality of the study and the results of their tests. The overall score was as follows: +++ (strong positive), ++ (moderately positive), −−− (strong negative), + (limited findings), ? (indeterminate), +/− (conflicting findings) [10].

Results

Twenty-six studies were reviewed. Characteristics for each study are shown in Supplementary File 2. Table 2 shows

the scoring of the methodological quality of the studies and the results for the measurement properties.

Internal consistency

Eleven studies had scores of “good” for their methodological quality [9, 12, 14–16, 23, 27, 28, 35, 41]. Cronbach’s α ranged from 0.77 to 0.97 for the studies. One study [27] was given a negative score because of a Cronbach’s α value of 0.97, which was considered too high and indicative of possible redundancy (the suggested value range is 0.7–0.9). Overall, we found strong evidence that the IKDC-SKF has positive internal consistency (Table 2).

Test–retest reliability

The 12 studies reporting on reliability [9, 12–14, 16, 20, 23, 27, 35, 37, 43, 50] had scores of “good” to “poor” for methodological quality. The reasons for a “poor” score [13, 23, 35, 37] included “small sample size,” “test conditions NOT similar for both measurements,” and “no ICC, Spearman, or Pearson test.” The ICC ranged from 0.87 to 0.98, which was considered adequate (>0.70) [44]. Overall, we found strong evidence that the IKDC-SKF has positive test–retest reliability (Table 2).

Measurement error

Four studies reported on measurement error [13, 16, 27, 50]. Of these, two had scores of “good” [27, 50], one had a score of “fair” [16] due to “moderate sample size,” and one had a score of “poor” [13] due to “small sample size.” The Standard Error of Measurement (SEM) was reported to be 4.4–4.6, the Minimal Detectable Change (MDC) to be 15.6 at the 6-month follow-up and 13.7 at the 12-month


Table 2  Methodological scores and levels of evidence for each measurement property

Author/year         Type of study (a)
Agel/2009           D
Boykin/2013         Pedi
Björklund/2009      C
Briggs/2009         C
Cook/2010           D
Crawford/2007       A
Fu/2011             B
Greco/2010          A
Haverkamp/2006      B
Higgins/2007        A
Irrgang/2001        A
Irrgang/2006        A
Kocher/2011         Pedi
Kong/2012           C
Laboute/2009        C
Lertwanitch/2008    B
van Meer/2012       A
Metsavaht/2010      B
Metsavaht/2011      D
Metsavaht/2012      C
Padua/2004          B
Piontek/2012        B
Reinke/2011         D
Schmitt/2010        A
Shelbourne/2013     D
Siqueira/2012       D

[Per-study ratings (methodological quality/result) for each measurement property are not reproduced here; the cell alignment of the table could not be recovered.]

Summarized level of evidence (c)
  Internal consistency (b): +++
  Test–retest reliability (b): +++
  Measurement error (b): +
  Structural validity (b): ++
  Hypothesis testing (b): +/−
  Translation validity (b): ?
  Criterion validity (b): ?
  Responsiveness (b): +++

a  Type of study: A: aimed to investigate measurement properties of the IKDC, B: investigated the measurement properties of the IKDC while translating it into another language, C: investigated the measurement properties of another instrument and used the IKDC for comparison, D: compared the IKDC to other measurement instruments without specifically investigating measurement properties, Pedi: investigated the measurement properties of the Pedi-IKDC

b  Methodological quality assessed using the COSMIN score: E excellent, G good, F fair, P poor. Studies with a “Poor” rating were not included in the scoring of results of each measurement property. The results for each measurement property in each study were rated as follows: positive, +; indeterminate, ?; or negative, −

c  Summarized level of evidence: +++ for strong positive scoring, ++ for moderate positive scoring, −−− for strong negative scoring, + for limited findings, ? for indeterminate scoring, +/− for conflicting findings

d  Studies reporting only effect size, standardized response means (SRM), and minimal detectable change (MDC) could not be scored for results

follow-up, the individual Smallest Detectable Change (SDCind) was 12.2, and the Limits of Agreement (LoA) were −6.1 to 7.1. The SDC was smaller than the Minimal Clinically Important Difference (MCID; results reported under interpretability) [13, 17]. Because of the variety of methods used for the calculations of measurement error and because there were only two studies that reported the MCID [13, 17], we concluded that there was not enough evidence to judge the measurement error of the IKDC-SKF (Table 2).
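As an illustration of how the measurement-error quantities above relate to each other, the following minimal sketch converts an SEM into an individual-level SDC. It assumes the commonly used formulas SDC = 1.96 × √2 × SEM and SEM = SD × √(1 − ICC); the review does not state which formulas the included studies used, and the function names and the SD/ICC values in the second example are hypothetical.

```python
import math


def smallest_detectable_change(sem: float, z: float = 1.96) -> float:
    """Individual-level SDC under the assumed formula z * sqrt(2) * SEM."""
    return z * math.sqrt(2) * sem


def sem_from_icc(sd: float, icc: float) -> float:
    """SEM approximated from a pooled SD and an ICC (assumed formula)."""
    return sd * math.sqrt(1.0 - icc)


# Reported SEM of 4.4 points gives an SDC of about 12.2 points.
print(round(smallest_detectable_change(4.4), 1))  # 12.2

# Hypothetical SD and ICC, for illustration only.
print(round(sem_from_icc(sd=10.0, icc=0.91), 1))  # 3.0
```

Under these assumptions, the lower reported SEM (4.4) reproduces the reported SDCind of 12.2, which suggests, but does not confirm, that this formula was the one applied.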


Content validity

Two type A studies measured content validity [16, 50]. All items included in the IKDC-SKF were considered to be relevant for the construct to be measured, for the target population, and for the purpose of the measurement by both the medical profession [16] and by patients [50]. However, none of the studies evaluated whether the patients missed any important aspects of knee symptoms, function, or sports activity participation in the questionnaire. We


concluded that there is moderate evidence that the IKDC-SKF has good content validity.

Structural validity

The three studies that reported on structural validity all had scores of “good” for methodology [15, 16, 41]. Two studies identified one factor that accounted for >48 % of the variation [16, 41]. One study reported finding two dimensions of the IKDC-SKF [15], but the results were difficult to interpret; thus, this study received a score of “indeterminate” (Table 2). There is the most evidence for a one-factor solution, but the low degree of variation explained by the first factor indicates that the IKDC-SKF may be multifactorial.

Hypothesis testing

A total of 16 studies reported on hypothesis testing as part of construct validity [1, 3, 7–9, 14, 20–22, 27–29, 38, 42, 43, 50]. The hypotheses were primarily correlations to other PROMs, e.g., the Cincinnati score, the Lysholm score and the WOMAC score (Supplementary File 2). Of these, five studies had scores of “good” [7, 8, 20, 29, 50], and five studies had scores of “fair” [3, 9, 27, 28, 38], mainly due to “vague hypothesis” or “moderate sample size.” Six studies had scores of “poor” for the methodological quality [1, 14, 21, 22, 42, 43] due to “no description of the constructs measured by the comparator instrument,” “no measurement properties reported on comparator instrument,” “unclear what was expected from hypothesis,” or “unclear what was expected regarding correlations or differences.” A positive rating for the results was given when 75 % of the hypotheses were confirmed [44]. There were more than 75 % confirmed hypotheses in nine studies (of which two had the methodological score of “good,” four of “fair” and three of “poor”) [1, 3, 9, 14, 20, 27, 29, 42], and fewer than 75 % confirmed hypotheses in four studies (of which three had the methodological score of “good” and one of “fair”) [7, 8, 28, 50]. We concluded that there is conflicting evidence about the construct validity of the IKDC-SKF (Table 2).
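The rating rule for hypothesis testing described above can be sketched as a small function. This is only an illustration: the function name is hypothetical, the cut-off is treated as "at least 75 %" (the text uses both "75 %" and "more than 75 %"), and returning "?" when no hypotheses were tested is an assumption.

```python
def rate_hypothesis_testing(confirmed: int, tested: int,
                            threshold: float = 0.75) -> str:
    """Return '+', '-' or '?' for one study's hypothesis-testing results.

    '+' when the share of confirmed hypotheses reaches the threshold,
    '-' when it does not, '?' when nothing could be evaluated (assumption).
    """
    if tested <= 0:
        return "?"
    if not 0 <= confirmed <= tested:
        raise ValueError("confirmed must lie between 0 and tested")
    return "+" if confirmed / tested >= threshold else "-"


# Hypothetical study results, for illustration only.
print(rate_hypothesis_testing(confirmed=3, tested=4))  # +
print(rate_hypothesis_testing(confirmed=2, tested=4))  # -
```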
Cross-cultural validity, including translation validity

Six studies translated the IKDC-SKF into another language [12, 14, 23, 27, 35, 37]. One study had a score of “good” [35] and four a score of “fair” [12, 14, 23, 27] for the quality of the translation due to “sample in pre-testing not adequately described.” One study had a score of “poor” [37] due to “no backward translation made” and “no pre-testing done.” No study reported on cross-cultural validity by comparing similar subgroups of patients in different countries who answered the IKDC-SKF in different languages. Thus, the IKDC-SKF was given an indeterminate score for level of evidence for cross-cultural validity (Table 2).

Criterion validity

Twelve studies reporting on criterion validity had scores of “excellent” [50] or “good” [5, 9, 12, 15, 16, 20, 23, 27, 28, 35, 41] for methodological quality. According to COSMIN, there is no gold standard for PROs, so the study that had a score of “excellent” was downgraded to “good.” Almost all of the studies used SF36 or SF12 as the gold standard. However, the studies on Pedi-IKDC used the Child Health Questionnaire (CHQ) as the gold standard [5, 20] and one study that included young adolescents used the PedsQL [41]. SF36, CHQ, and PedsQL were considered reasonable “gold standards” for the evaluation of the criterion validity since the SF36 was used for the development of the IKDC-SKF [16]. There was a significant correlation between the IKDC-SKF and the physical component summary, physical functioning, and bodily pain domains of the SF36 for all studies. The Pedi-IKDC studies showed a significant correlation between Pedi-IKDC and the CHQ. Although there were significant correlations between IKDC-SKF and the gold standard, in all cases but one [27] the correlation was not strong enough (i.e., not ≥0.70) to show strong support for criterion validity. The criterion validity was given an indeterminate score for level of evidence (Table 2).

Responsiveness

Seven studies that reported on the responsiveness of the IKDC-SKF had scores of “good” [9, 17, 20, 50] or “fair” [1, 3, 13] for methodological quality. The score “fair” was due to “moderate sample size,” “vague hypothesis,” and “only some information about the measurement properties of the comparator instrument are described.” One study reported on correlations (≥0.50) in changes to two instruments measuring the same construct [1]. Two studies had confirmed [50] or partially confirmed [17] hypotheses. Two studies [17, 50] reported that the area under the receiver operating characteristic curve (AUC) was ≥0.70, which was considered a positive result. Four studies [3, 9, 17, 20] reported Standardized Response Means (SRMs) between 1.35 and 4.4, effect sizes of 0.76–2.11, and MDCs of 8.8–15.6. The responsiveness was scored as having a “strong positive” level of evidence due to four studies that had positive scores, two of “good” [17, 50] and two of “fair” [1, 13] methodological quality (Table 2).

Interpretability and floor and ceiling effects

Eleven studies reported on floor and ceiling effects [1, 3, 9, 14, 16, 23, 27, 28, 35, 50]. One study [14] reported normally distributed scores, one study [1] showed floor and ceiling effects (10 % each), and one study [20] showed a ceiling effect of 6 %. The other eight studies [3, 9, 16, 23, 27, 28, 35, 50] showed no floor or ceiling effects. This indicates that the IKDC-SKF has no floor or ceiling effects, since floor or ceiling effects should not exceed 15 % [44]. The MCID was reported to be 6.3 at the 6-month follow-up and 16.7 at the 12-month follow-up after articular cartilage repair [13]. Irrgang et al. [17] reported two MCIDs: 11.5 to maximize sensitivity and 20.5 to maximize specificity.
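The 15 % criterion for floor and ceiling effects can be illustrated with a short sketch. It assumes that a floor (ceiling) effect is the percentage of respondents who obtain the lowest (highest) possible score on the 0–100 scale; the function name, the parameter names, and the example scores are hypothetical.

```python
from typing import Dict, Sequence


def floor_ceiling_effects(scores: Sequence[float],
                          minimum: float = 0.0,
                          maximum: float = 100.0,
                          limit_pct: float = 15.0) -> Dict[str, object]:
    """Percentage of respondents at the scale extremes, flagged against a limit."""
    n = len(scores)
    if n == 0:
        raise ValueError("at least one score is required")
    floor_pct = 100.0 * sum(1 for s in scores if s <= minimum) / n
    ceiling_pct = 100.0 * sum(1 for s in scores if s >= maximum) / n
    return {
        "floor_pct": floor_pct,
        "ceiling_pct": ceiling_pct,
        "floor_effect": floor_pct > limit_pct,      # effect only above the limit
        "ceiling_effect": ceiling_pct > limit_pct,
    }


# Hypothetical sample: 2 of 20 respondents at the maximum score (10 %).
result = floor_ceiling_effects([100.0] * 2 + [50.0] * 18)
print(result["ceiling_pct"], result["ceiling_effect"])  # 10.0 False
```

With this reading, the 10 % and 6 % values reported above stay below the 15 % limit and would not be flagged.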

Discussion

The most important finding of the present study was that there is strong evidence that the IKDC-SKF has good internal consistency, test–retest reliability, and responsiveness, and moderate evidence that it has good content and structural validity according to the COSMIN evaluation criteria. When the SF36 was used as the gold standard, the level of evidence for criterion validity was indeterminate. There is conflicting evidence for construct validity (hypothesis testing), since there were studies that had fair or good methodological quality reporting on both positive and negative results. There is not enough evidence to judge the measurement error and cross-cultural validity of the IKDC-SKF. The reliability of the IKDC-SKF was assessed by evaluation of the internal consistency, test–retest reliability, and measurement error according to the COSMIN taxonomy [33]. The IKDC-SKF shows good interrelationships among the items. Specifically, for the IKDC-SKF, Cronbach’s α was not too high, which would have meant that some items measure the same aspect and could be removed; but Cronbach’s α was also not too low, which means that the items can be summarized in a total score. The small sample size in most of the studies that looked at test–retest reliability reduced the quality of the studies from excellent to good, fair, or poor, even though the ICC values were high. Measurement error was only reported in four studies. One can argue that the measurement error can be calculated post hoc, by using the ICC variance components. Unfortunately, the included studies did not report the variance components, so we could not calculate the SEM. According to Irrgang et al. [16], ±9 points are needed to reveal a true change in score. More studies in other patient groups are needed for better interpretation of IKDC-SKF changes in clinical studies. The IKDC-SKF is considered to be a relevant score both for medical professionals and for patients with knee disorders [16, 50]. During the development of the IKDC-SKF, questions were deleted if the patients chose not to answer them [16]. In the study by van Meer et al. [50], the patients had to grade the relevance of each question, and the patients favored the use of the IKDC-SKF over the use


of the KOOS (Knee injury and Osteoarthritis Outcome Score) for short-term outcomes after ACL reconstruction. No study has reported whether the patients actually miss any questions about knee symptoms, function, and sports activity that could be relevant to the outcome of a treatment. Therefore, the IKDC-SKF is considered to have a moderate level of evidence for content validity. Structural validity has been tested in three studies, and two of them found some evidence that the IKDC-SKF was unidimensional [16, 41]. The study by Higgins et al. [15] showed two dimensions, but the results are difficult to interpret. If the IKDC-SKF is regarded as unidimensional, the total score summarizes patient-reported knee symptoms, function and level of symptom-free sports activity. That covers different aspects of functioning according to the International Classification of Functioning, Disability and Health (ICF). Even though the IKDC-SKF has good internal consistency, it remains unclear whether these aspects should be summarized as one score and what information that score actually gives. The outcome measure should depend on the clinical or research question. The IKDC-SKF provides one score, making it easy to use in clinics and in general research settings, but other outcome measures should be added for evaluating specific aspects of functioning. There were conflicting findings for construct validity (hypothesis testing). The hypotheses that have been tested usually concerned correlations with other PROMs, for example the Cincinnati score, the Lysholm score and the WOMAC score. These studies reported conflicting results. Some studies tested hypotheses about correlations with performance-based tests, but these studies mostly aimed to validate the performance-based tests. It is difficult to interpret these correlations because neither of the instruments that are being compared can be assumed to be valid. This makes it difficult to draw conclusions about the level of evidence for hypothesis testing as a measure of construct validity. There is no evidence regarding cross-cultural validity since no studies compared similar subgroups of patients that answered the IKDC-SKF in different languages, e.g., in multiple group factor analysis or tests of Differential Item Functioning (DIF). All the studies that translated the IKDC-SKF into another language, except one [37], followed a standardized translation procedure with expert panels, forward and backward translations and pilot studies. Some of the translated IKDC-SKF forms included in this review [12, 27, 35] have been approved by the American Orthopaedic Society for Sports Medicine (AOSSM) and are published on the webpage for free use. Nonetheless, only one study had a score of “good” [35] and four had scores of “fair” for the quality of the translation, mostly due to incomplete reporting of the translation procedure as required by the


COSMIN guidelines. Most of the studies concluded that there were no difficulties in translating the IKDC-SKF and only minor cultural adaptations were needed. For example, in the Brazilian translation, the item “skiing” was replaced with “surfing.” However, it is not known whether skiing is comparable to surfing in terms of the possible strain to the knee. Due to translation and cultural differences, different populations may differ in their interpretation of questions, which may lead to differences in scores. Therefore, DIF testing between language versions is recommended [36]. When assessing criterion validity the decision was made to use the SF36 as a reasonable gold standard for the IKDC-SKF, since the SF36 was used in the developmental article for the IKDC-SKF [16]. For the Pedi-IKDC, the CHQ was used as a gold standard in the developmental article and was therefore used as gold standard for the Pedi-IKDC. None of the studies showed a high correlation of the IKDC with the gold standard. However, it is not clear whether the SF36 and CHQ should be considered adequate gold standards, since they do not measure the same construct as the IKDC-SKF. For example, the questions related to the “bodily pain” section of the SF36 address pain in general, but the IKDC-SKF specifically addresses knee pain. The IKDC-SKF also includes specific questions about other knee related symptoms, such as locking and swelling. The significant but low correlations might suggest that IKDC-SKF and SF36 measure similar, but not identical aspects, and that the differences are such that the SF36 should not be considered the gold standard for the IKDC-SKF. This supports the statement made by the COSMIN group that gold standards do not exist for PROs [31] and highlights the need for disease- or organ-specific questionnaires such as the knee-specific IKDC-SKF. It has been suggested that the IKDC-SKF can be hard for children to interpret [18]. The Pedi-IKDC was developed to minimize the risk of lower comprehension and validity for use in children. The main differences between the IKDC-SKF and the Pedi-IKDC are that some expressions that were hard for children to comprehend have been replaced (e.g., “puffy” replaced “swollen” and “get stuck in place” replaced “catch”). The two studies [5, 20] that assessed the use of the Pedi-IKDC, for children 10–18.9 years old, had good methodological quality and showed positive results regarding internal consistency, test–retest reliability and hypothesis testing. Schmitt et al. [41] used the regular IKDC-SKF in 673 children aged 6–18 years and found good internal consistency and structural validity for that population. Thus, the need for a specific children’s version of the IKDC-SKF cannot be determined from the present studies. The articles that were reviewed included subjects who were diagnosed with several different types of knee conditions, e.g., ACL rupture, patellofemoral pain

syndrome, osteoarthritis, and meniscus tears, and some studies included participants with general “knee symptoms.” Although many different knee conditions are represented in these studies, more studies are needed to cover all measurement properties in different populations and to provide support for good measurement properties of the IKDC-SKF as a knee-specific measurement instrument, which was its initial purpose [16]. Rodriguez-Merchan [39] argued that there is no PROM that is relevant to all knee conditions. The IKDC-SKF seems to be useful as a general instrument for all kinds of knee injuries, which might facilitate its clinical use in situations in which time is a factor. The use of one general instrument rather than different instruments for different knee conditions could save time and effort and make it possible to compare treatment results between different knee conditions. The strength of the present review is the systematic way to include and evaluate previous studies on measurement properties of the IKDC-SKF by using the COSMIN guidelines. The COSMIN checklist is, to our knowledge, the only one of its kind (i.e., the only tool for evaluating the quality of studies on PROMs). It has been used in several studies [4, 11, 46, 48, 49], although it is a “young” tool and has some limitations. For example, the standards for certain properties [e.g., the required number of participants to obtain a “good” (n = 50–99) or “excellent” score (n ≥ 100)] are rather high, which might result in studies with otherwise acceptable methodologies being given a low score. This problem could be solved by excluding sample size from the quality assessment and by incorporating it into the levels of evidence ratings. In that way, multiple small studies could, taken together, still provide strong evidence. This has been done in some reviews [2]. Inconsistent use and interpretation of terminology by authors in terms of the measurement properties complicate the process of evaluating, interpreting, and comparing results from different studies. For example, Crawford et al. [9] interpreted floor and ceiling effects as a measure of content validity, while according to COSMIN this should be considered an aspect of interpretability. For future studies, it would be helpful to use the COSMIN taxonomy and checklist for both the design and reporting of studies on measurement properties.
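To make the interaction between the “worst score counts” rule and the sample-size standards concrete, the following minimal sketch combines the two. The function names are hypothetical, and only the thresholds quoted above (n = 50–99 for “good,” n ≥ 100 for “excellent”) are encoded; the cut-off between “fair” and “poor” is not given in this text, so the sketch returns None below 50 rather than guessing.

```python
from typing import Optional, Sequence

# Ordering of the four COSMIN quality ratings, lowest first.
ORDER = {"poor": 0, "fair": 1, "good": 2, "excellent": 3}


def worst_score_counts(item_ratings: Sequence[str]) -> str:
    """Overall rating for a measurement property: the lowest item rating."""
    if not item_ratings:
        raise ValueError("at least one item rating is required")
    return min(item_ratings, key=ORDER.__getitem__)


def sample_size_rating(n: int) -> Optional[str]:
    """Rating implied by sample size, using only the thresholds quoted above."""
    if n >= 100:
        return "excellent"
    if n >= 50:
        return "good"
    return None  # "fair" or "poor"; the cut-off is not stated in this text


# Hypothetical study: every design item is excellent, but n = 60.
items = ["excellent", "excellent", sample_size_rating(60)]
print(worst_score_counts(items))  # good
```

The example shows how a single item, here the sample size, caps the overall rating of an otherwise well-designed study, which is the limitation discussed above.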

Conclusions

This review shows that the IKDC-SKF is a measurement instrument with good internal consistency, test–retest reliability, content and structural validity, responsiveness and interpretability (no floor and ceiling effects). The IKDC-SKF is useful as a general instrument for all kinds of knee injuries, which might facilitate its clinical use in situations

13



in which time is a factor. Other outcome measures should be added for evaluating specific aspects of functioning. Further evaluation of measurement error, minimal important change, and hypotheses testing is recommended. Conflict of interest  The authors declare that they have no conflict of interest.

References

1. Agel J, LaPrade RF (2009) Assessment of differences between the modified Cincinnati and International Knee Documentation Committee patient outcome scores: a prospective study. Am J Sports Med 37(11):2151–2157
2. Bartels B, de Groot JF, Terwee CB (2013) The six-minute walk test in chronic pediatric conditions: a systematic review of measurement properties. Phys Ther 93(4):529–541
3. Bjorklund K, Andersson L, Dalen N (2009) Validity and responsiveness of the test of athletes with knee injuries: the new criterion based functional performance test instrument. Knee Surg Sports Traumatol Arthrosc 17(5):435–445
4. Boger EJ, Demain S, Latter S (2013) Self-management: a systematic review of outcome measures adopted in self-management interventions for stroke. Disabil Rehabil 35(17):1415–1428
5. Boykin RE, McFeely ED, Shearer D, Frank JS, Harrod CC, Nasreddine AY, Kocher MS (2013) Correlation between the child health questionnaire and the International Knee Documentation Committee score in pediatric and adolescent patients with an anterior cruciate ligament tear. J Pediatr Orthop 33(2):216–220
6. Bream E, Charman SC, Clift B, Murray D, Black N (2010) Relationship between patients’ and clinicians’ assessments of health status before and after knee arthroplasty. Qual Saf Health Care 19(6):e6
7. Briggs KK, Lysholm J, Tegner Y, Rodkey WG, Kocher MS, Steadman JR (2009) The reliability, validity, and responsiveness of the Lysholm score and Tegner activity scale for anterior cruciate ligament injuries of the knee: 25 years later. Am J Sports Med 37(5):890–897
8. Cook C, Hegedus E, Hawkins R, Scovell F, Wyland D (2010) Diagnostic accuracy and association to disability of clinical test findings associated with patellofemoral pain syndrome. Physiother Can 62(1):17–24
9. Crawford K, Briggs KK, Rodkey WG, Steadman JR (2007) Reliability, validity, and responsiveness of the IKDC score for meniscus injuries of the knee. Arthroscopy 23(8):839–844
10. de Vet HC, Terwee CB, Mokkink LB, Knol DL (2011) Measurement in medicine: a practical guide. Cambridge University Press, Cambridge
11. Dobson F, Hinman RS, Hall M, Terwee CB, Roos EM, Bennell KL (2012) Measurement properties of performance-based measures to assess physical function in hip and knee osteoarthritis: a systematic review. Osteoarthritis Cartilage 20(12):1548–1562
12. Fu SN, Chan YH (2011) Translation and validation of Chinese version of International Knee Documentation Committee subjective knee form. Disabil Rehabil 33(13–14):1186–1189
13. Greco NJ, Anderson AF, Mann BJ, Cole BJ, Farr J, Nissen CW, Irrgang JJ (2010) Responsiveness of the International Knee Documentation Committee subjective knee form in comparison to the Western Ontario and McMaster Universities Osteoarthritis Index, modified Cincinnati Knee Rating System, and Short Form 36 in patients with focal articular cartilage defects. Am J Sports Med 38(5):891–902
14. Haverkamp D, Sierevelt IN, Breugem SJ, Lohuis K, Blankevoort L, van Dijk CN (2006) Translation and validation of the Dutch version of the International Knee Documentation Committee subjective knee form. Am J Sports Med 34(10):1680–1684
15. Higgins LD, Taylor MK, Park D, Ghodadra N, Marchant M, Pietrobon R, Cook C (2007) Reliability and validity of the International Knee Documentation Committee (IKDC) subjective knee form. Joint Bone Spine 74(6):594–599
16. Irrgang JJ, Anderson AF, Boland AL, Harner CD, Kurosaka M, Neyret P, Richmond JC, Shelborne KD (2001) Development and validation of the International Knee Documentation Committee subjective knee form. Am J Sports Med 29(5):600–613
17. Irrgang JJ, Anderson AF, Boland AL, Harner CD, Neyret P, Richmond JC, Shelbourne KD (2006) Responsiveness of the International Knee Documentation Committee subjective knee form. Am J Sports Med 34(10):1567–1573
18. Iversen MD, Lee B, Connell P, Andersen J, Anderson AF, Kocher MS (2010) Validity and comprehensibility of the International Knee Documentation Committee subjective knee evaluation form in children. Scand J Med Sci Sports 20(1):e87–e95
19. Kimberlin CL, Winterstein AG (2008) Validity and reliability of measurement instruments used in research. Am J Health Syst Pharm 65(23):2276–2284
20. Kocher MS, Smith JT, Iversen MD, Brustowicz K, Ogunwole O, Andersen J, Yoo WJ, McFeely ED, Anderson AF, Zurakowski D (2011) Reliability, validity, and responsiveness of a modified International Knee Documentation Committee subjective knee form (Pedi-IKDC) in children with knee disorders. Am J Sports Med 39(5):933–939
21. Kong DH, Yang SJ, Ha JK, Jang SH, Seo JG, Kim JG (2012) Validation of functional performance tests after anterior cruciate ligament reconstruction. Knee Surg Relat Res 24(1):40–45
22. Laboute E, Savalli L, Puig PL, Trouve P, Larbaigt M, Raffestin M (2010) Validity and reproducibility of the PPLP scoring scale in the follow-up of athletes after anterior cruciate ligament reconstruction. Ann Phys Rehabil Med 53(3):162–179
23. Lertwanich P, Praphruetkit T, Keyurapan E, Lamsam C, Kulthanan T (2008) Validity and reliability of Thai version of the International Knee Documentation Committee subjective knee form. J Med Assoc Thai 91(8):1218–1225
24. Liberati A, Altman DG, Tetzlaff J, Mulrow C, Gotzsche PC, Ioannidis JP, Clarke M, Devereaux PJ, Kleijnen J, Moher D (2009) The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate healthcare interventions: explanation and elaboration. BMJ 339:b2700
25. Lohmander LS, Ostenberg A, Englund M, Roos H (2004) High prevalence of knee osteoarthritis, pain, and functional limitations in female soccer players twelve years after anterior cruciate ligament injury. Arthritis Rheum 50(10):3145–3152
26. Lysholm J, Tegner Y (2007) Knee injury rating scales. Acta Orthop 78(4):445–453
27. Metsavaht L, Leporace G, Riberto M, de Mello Sposito MM, Batista LA (2010) Translation and cross-cultural adaptation of the Brazilian version of the International Knee Documentation Committee subjective knee form: validity and reproducibility. Am J Sports Med 38(9):1894–1899
28. Metsavaht L, Leporace G, de Mello Sposito MM, Riberto M, Batista LA (2011) What is the best questionnaire for monitoring the physical characteristics of patients with knee osteoarthritis in the Brazilian population? Revista Brasileira de Ortopedia 46(3):256–261
29. Metsavaht L, Leporace G, Riberto M, Sposito MM, Del Castillo LN, Oliveira LP, Batista LA (2012) Translation and cross-cultural adaptation of the lower extremity functional scale into a Brazilian Portuguese version and validation on patients with knee injuries. J Orthop Sports Phys Ther 42(11):932–939
30. Mokkink LB, Terwee CB, Gibbons E, Stratford PW, Alonso J, Patrick DL, Knol DL, Bouter LM, de Vet HC (2010) Inter-rater agreement and reliability of the COSMIN (COnsensus-based Standards for the selection of health status Measurement Instruments) checklist. BMC Med Res Methodol 10:82
31. Mokkink LB, Terwee CB, Knol DL, Stratford PW, Alonso J, Patrick DL, Bouter LM, de Vet HC (2010) The COSMIN checklist for evaluating the methodological quality of studies on measurement properties: a clarification of its content. BMC Med Res Methodol 10:22
32. Mokkink LB, Terwee CB, Patrick DL, Alonso J, Stratford PW, Knol DL, Bouter LM, de Vet HC (2010) The COSMIN checklist for assessing the methodological quality of studies on measurement properties of health status measurement instruments: an international Delphi study. Qual Life Res 19(4):539–549
33. Mokkink LB, Terwee CB, Patrick DL, Alonso J, Stratford PW, Knol DL, Bouter LM, de Vet HC (2010) The COSMIN study reached international consensus on taxonomy, terminology, and definitions of measurement properties for health-related patient-reported outcomes. J Clin Epidemiol 63(7):737–745
34. Oiestad BE, Engebretsen L, Storheim K, Risberg MA (2009) Knee osteoarthritis after anterior cruciate ligament injury: a systematic review. Am J Sports Med 37(7):1434–1443
35. Padua R, Bondi R, Ceccarelli E, Bondi L, Romanini E, Zanoli G, Campi S (2004) Italian version of the International Knee Documentation Committee subjective knee form: cross-cultural adaptation and validation. Arthroscopy 20(8):819–823
36. Petersen MA, Groenvold M, Bjorner JB, Aaronson N, Conroy T, Cull A, Fayers P, Hjermstad M, Sprangers M, Sullivan M, European Organisation for Research and Treatment of Cancer Quality of Life Group (2003) Use of differential item functioning analysis to assess the equivalence of translations of a questionnaire. Qual Life Res 12(4):373–385
37. Piontek T, Ciemniewska-Gorzela K, Naczk J, Cichy K, Szulc A (2012) Linguistic and cultural adaptation into Polish of the IKDC 2000 subjective knee evaluation form and the Lysholm scale. Pol Orthop Traumatol 77:115–119
38. Reinke EK, Spindler KP, Lorring D, Jones MH, Schmitz L, Flanigan DC, An AQ, Quiram AR, Preston E, Martin M, Schroeder B, Parker RD, Kaeding CC, Borzi L, Pedroza A, Huston LJ, Harrell FE Jr, Dunn WR (2011) Hop tests correlate with IKDC and KOOS at minimum of 2 years after primary ACL reconstruction. Knee Surg Sports Traumatol Arthrosc 19(11):1806–1816
39. Rodriguez-Merchan EC (2012) Knee instruments and rating scales designed to measure outcomes. J Orthop Traumatol 13(1):1–6
40. Roos EM, Roos HP, Lohmander LS, Ekdahl C, Beynnon BD (1998) Knee Injury and Osteoarthritis Outcome Score (KOOS) - development of a self-administered outcome measure. J Orthop Sports Phys Ther 28(2):88–96
41. Schmitt LC, Paterno MV, Huang S (2010) Validity and internal consistency of the International Knee Documentation Committee subjective knee evaluation form in children and adolescents. Am J Sports Med 38(12):2443–2447
42. Shelbourne KD, Barnes AF, Gray T (2012) Correlation of a single assessment numeric evaluation (SANE) rating with modified Cincinnati knee rating system and IKDC subjective total scores for patients after ACL reconstruction or knee arthroscopy. Am J Sports Med 40(11):2487–2491
43. Siqueira AC, Baraúna MA, Dionísio VC (2012) Functional evaluation of the knee in subjects with patellofemoral pain syndrome (PFPS): comparison between KOS and IKDC scales. Revista Brasileira de Medicina do Esporte 18(6):400–403
44. Terwee CB, Bot SD, de Boer MR, van der Windt DA, Knol DL, Dekker J, Bouter LM, de Vet HC (2007) Quality criteria were proposed for measurement properties of health status questionnaires. J Clin Epidemiol 60(1):34–42
45. Terwee CB, Jansma EP, Riphagen II, de Vet HC (2009) Development of a methodological PubMed search filter for finding studies on measurement properties of measurement instruments. Qual Life Res 18(8):1115–1123
46. Terwee CB, Schellingerhout JM, Verhagen AP, Koes BW, de Vet HC (2011) Methodological quality of studies on the measurement properties of neck pain and disability questionnaires: a systematic review. J Manipulative Physiol Ther 34(4):261–272
47. Terwee CB, Mokkink LB, Knol DL, Ostelo RW, Bouter LM, de Vet HC (2012) Rating the methodological quality in systematic reviews of studies on measurement properties: a scoring system for the COSMIN checklist. Qual Life Res 21(4):651–657
48. Tijssen M, van Cingel R, van Melick N, de Visser E (2011) Patient-Reported Outcome questionnaires for hip arthroscopy: a systematic review of the psychometric evidence. BMC Musculoskelet Disord 12:117
49. Uijen AA, Heinst CW, Schellevis FG, van den Bosch WJ, van de Laar FA, Terwee CB, Schers HJ (2012) Measurement properties of questionnaires measuring continuity of care: a systematic review. PLoS ONE 7(7):e42256
50. van Meer BL, Meuffels DE, Vissers MM, Bierma-Zeinstra SM, Verhaar JA, Terwee CB, Reijman M (2013) Knee injury and osteoarthritis outcome score or International Knee Documentation Committee subjective knee form: which questionnaire is most useful to monitor patients with an anterior cruciate ligament rupture in the short term? Arthroscopy 29(4):701–715
51. von Porat A, Roos EM, Roos H (2004) High prevalence of osteoarthritis 14 years after an anterior cruciate ligament tear in male soccer players: a study of radiographic and patient relevant outcomes. Ann Rheum Dis 63(3):269–273
52. Walden M, Hagglund M, Magnusson H, Ekstrand J (2011) Anterior cruciate ligament injury in elite football: a prospective three-cohort study. Knee Surg Sports Traumatol Arthrosc 19(1):11–19
