Journal of Autoimmunity xxx (2014) 1–4


Journal of Autoimmunity journal homepage: www.elsevier.com/locate/jautimm

Criteria for Behçet’s disease with reflections on all disease criteria

Hasan Yazici a,*, Yusuf Yazici b

a Division of Rheumatology, Department of Medicine, Cerrahpasa Medical Faculty, University of Istanbul, Istanbul, Turkey
b NYU Hospital for Joint Diseases, 333 East 38th Street, New York, NY 10016, USA

Article info

Article history: Received 7 October 2013; accepted 13 November 2013

Keywords: Behçet’s disease; Diagnostic criteria; Classification criteria; RA criteria; Bayes theorem; Circularity

Abstract

With no specific histologic, laboratory or imaging features, the diagnosis/classification of Behçet’s disease (BD) remains clinical. As such, disease criteria are needed. The International Study Group Criteria set is the most widely used. It has some limitations, especially in telling BD from Crohn’s disease. The main issue, however, which also applies to many of the other criteria sets in rheumatology, is our lack of appreciation of a list of misconceptions about the making and implementation of diagnostic/classification criteria, some examples of which are unfortunately also found in the 2010 ACR/EULAR RA criteria set. 1. The view that classification and diagnostic criteria should be different is ill advised, in that the cerebral/arithmetic basis of both is the same. 2. The default promise of diagnostic criteria to come once we formulate a classification criteria set is an extension of the previous misconception. 3. Taking pains to avoid circularity in criteria making is unwarranted, since the essence of criteria making is circular. In addition, we fail to exploit the utility of disease criteria in ruling out, rather than ruling in, the diseases we seek. Finally, we also fail to appreciate the paramount importance of the Bayesian prior (pretest) probability in formulating and implementing these disease criteria. Formulating criteria tailored to subspecialties and giving the often forgotten family history more importance in our criteria sets are some ways to improve the prior probability on which our diagnostic/classification decisions will be based. We first have to accept that probabilities are very important in our practice and research. Moreover, that acceptance must also be shared with the public, which includes our patients.

© 2014 Elsevier Ltd. All rights reserved.

1. Introduction

It can be debated whether a chapter on Behçet’s disease (BD) should really be included in a special issue devoted to diagnostic criteria in autoimmunity, since BD stands somewhat apart from the autoimmune diseases in many ways, especially when one considers the disease mechanisms [1,2]. On the other hand, BD often comes into the differential diagnosis of autoimmune diseases, and vice versa. Furthermore, some conceptual barriers and misconceptions about disease criteria that have kept, and still keep, us busy in BD seem, we suggest, to concern the bona fide autoimmune diseases equally. The need for developing such criteria in either situation surely stems from the fact that most, if not all, of the disease conditions we discuss in this issue are constructs rather than comprehensively well-defined pathologies like Heberden’s nodes, gout, or tuberculous arthritis. As James Fries aptly said years ago, “Presence of disease ‘criteria’ affirms our ignorance of the essence of disease” [3].

* Corresponding author. Tel.: +90 2163456036.

2. The International Study Group criteria (ISGC) for diagnosis of Behçet’s Disease (BD)

Although some features of BD, like the pathergy phenomenon, are peculiar, there are no clinical, histologic, laboratory (including genetic) or imaging features specific for BD. So one needs a construct to define what is meant by BD. Again in Fries’ elegant words, we need a construct in the “current conventional wisdom” [3]. Up to 1990 there had been several BD criteria sets, including those from Mason and Barnes and from O’Duffy [4,5]. All of these had been prepared ad hoc, as probably all disease criteria were at the time, being based on “eminence” rather than “evidence” [6]. As such, formal clinical decision making parameters like sensitivity, specificity and positive or negative predictive values were not available. The then newly formed International Society for Behçet’s Disease took as its first task the preparation of an evidence-based criteria set to distinguish BD from conditions that usually come into its differential diagnosis, and formed a study group to prepare a

0896-8411/$ – see front matter © 2014 Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.jaut.2014.01.014

Please cite this article in press as: Yazici H, Yazici Y, Criteria for Behçet’s disease with reflections on all disease criteria, Journal of Autoimmunity (2014), http://dx.doi.org/10.1016/j.jaut.2014.01.014


diagnostic criteria set for BD (ISGC). To this end the group collected the clinical features of 914 patients recognized as having BD from more than a dozen Behçet centers in 7 countries. In addition, 309 patients followed with diagnoses that usually come into the differential diagnosis of BD, such as rheumatoid arthritis, systemic lupus, etc., were collected; this group made up the controls. Since recurrent oral ulceration was almost always present in BD, it was decided to take this clinical symptom as the obligatory item of our criteria set. Excluding the patients who had not had oral ulcers 3 times or more within the previous year (3% of the BD group and 69% of the control group) left 890 patients in the first and 97 in the second group. Following this, a random 60% sample from the BD group was taken to make up the training group (n = 534), while the remaining 40% of the patients (n = 356) became the validation group.

The next step was to discern which set of findings in the training group helped a physician to differentiate a patient in that group from a patient in the control group. Although the actual arithmetic is obviously more involved [7], the basic tool here was the calculation of likelihood ratios (LR), which are of two kinds. The LR+ is the probability of a disease feature being present among the diseased divided by the probability of the same feature being present among the non-diseased, and is expressed as sensitivity/(1 − specificity). The LR−, on the other hand, is the probability of a disease feature being absent among the diseased divided by the probability of the same feature being absent among the non-diseased, and is expressed as (1 − sensitivity)/specificity.
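As a minimal sketch of the two likelihood ratios just defined, the Python snippet below computes LR+ and LR− from a sensitivity and a specificity. The function name and the input values (90% sensitivity, 80% specificity) are assumptions made purely for illustration; they are not taken from the ISGC data.

```python
def likelihood_ratios(sensitivity: float, specificity: float) -> tuple[float, float]:
    """Return (LR+, LR-) for a test or criteria set.

    LR+ = sensitivity / (1 - specificity)
    LR- = (1 - sensitivity) / specificity
    """
    if not (0.0 < sensitivity < 1.0 and 0.0 < specificity < 1.0):
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


# Hypothetical inputs, chosen only to illustrate the two formulas.
lr_pos, lr_neg = likelihood_ratios(sensitivity=0.90, specificity=0.80)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.3f}")
```

An LR+ well above 1 means that the presence of the feature argues for the disease, and an LR− well below 1 means that its absence argues against it.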
Turning back to the preparation of the ISGC, the expected weight of evidence (the log-LRs) [7] of each of the clinical features was calculated and corrected for the prevalence of each feature [7,8], and those features which most heavily helped a patient to be classified as BD made up the ISGC set [9], as shown in Table 1. This set, derived as described from the comparisons between the training and the control groups, had a sensitivity of 91% and a specificity of 96%. These were then re-calculated among the validation and the control groups and were found to be 95% sensitivity and 98% specificity. The final task was to compare the criteria set thus formed with the other 5 criteria sets, this time using the sensitivity and the specificity of each of these different sets of criteria among the whole group of patients with BD and the controls. It was gratifying to see that the new ISGC had a higher relative value (the sum of sensitivity and specificity) than all the previously suggested criteria sets [9]. We have on purpose given a rather detailed explanation of how the ISGC were prepared, in that the methodology we have described is more or less the same in all non-eminence based disease criteria we use today.

Table 1
ISGC set for diagnosing BD, adapted from Ref. [11].

Recurrent oral ulceration (at least 3 times within the previous year)
Plus 2 of:
  Recurrent genital ulceration
  Eye lesions (retinitis or uveitis)
  Skin lesions (erythema nodosum and/or papulopustular lesions)
  Positive pathergy reaction
Findings applicable only in absence of other clinical explanations.

We must also briefly discuss what the ISGC mean when used as a diagnostic tool, assuming (taking the mean values of the training and the validation groups) a 93% sensitivity and 97% specificity. Here we have to go back to LRs. A 93% sensitivity and a 97% specificity give an LR+ of 0.93/(1 − 0.97) = 31 and an LR− of (1 − 0.93)/0.97 = 0.07. We now go to Bayes’ formula, which says post-test odds = pretest odds × LR [10]. We also have to underline that a familiarity with this formula is essential for any meaningful discussion of criteria making. A practical piece of advice is not to forget converting probabilities to odds and vice versa when you use this formula. There are also excellent web pages that can do these calculations for you given the sensitivity, specificity and the pretest figures, e.g. [11].

Assume that in a rheumatology clinic which sees 1000 patients per year, 10 have BD (pretest probability = 1.0%). If we apply the ISGC to every patient seen in a year in this clinic, the probability that a patient fulfilling the ISGC has BD will be 23.8%. However, the interpretation of the patient who does not fulfill the criteria will be quite different. His/her probability of having BD is now only 0.07%, a very low probability which is bound to be useful diagnostically.

There were and still are worries about the ISGC set.

1. The validation group was, by definition, made up of patients with clinical features very similar to those in the training group, since the groups were separated by randomization. So it was no great surprise that the sensitivity and the specificity in the groups were very close to one another. A better approach would have been to test the findings from the training group in a totally unrelated group of patients making up the validation group. It must however be pointed out that the external validity of the ISGC set was indeed later tested among patients from other regions and its performance was quite satisfactory, in fact considerably better than that of the previous criteria schemes [12].

2. Another shortcoming was related to the issue of inflammatory bowel disease (IBD). It might, not infrequently, be impossible to tell BD from IBD, especially if the patient presents mainly with intestinal involvement. Yet gastroenterology clinics made only a limited contribution to the preparation of the ISGC set in providing patients and controls with major gastrointestinal disease.
When our group looked at this formally some years later [13], we saw that the specificity of the ISGC set did indeed need improvement when used among patients with IBD.

There surely have been attempts to improve the ISGC and make new sets of criteria [14,15]. Judged just from the standpoint of improved sensitivity and specificity, they have not performed better than the ISGC. It is also important to note that the basic methodology and the arithmetic we have attempted to explain in some detail have not been duly appreciated in many of these attempts. For example, it was reported [14] that a new international set of criteria for BD [International Criteria for BD (ICBD)], when tested in Iran, had shown a marked increase in sensitivity (from 78.1% for ISGC to 98.2% for ICBD), in contrast to what was described as only a small decrease in specificity (from 98.4% for ISGC to 95.6% for ICBD). The authors considered this rather desirable. What is not appreciated here is that this drop in specificity, in fact, causes many more patients to be misclassified when applied to the example we gave above of 10 patients with BD among 1000 patients. The marked improvement in sensitivity means that none of the 10 patients with BD in our example will be missed, but the small decrease in specificity will cause 46/990 of the patients without BD to be diagnosed as having BD. Note that in the same setting the ISGC would fail to diagnose 2/10 patients while labeling only 16/990 of the patients without BD as such. In brief, the new criteria would misclassify a total of 46 patients among the 1000 patients, while the ISGC would misclassify 18. We should not forget that specificity is much more important than sensitivity in correctly identifying rare diseases.

In a more recent attempt from Italy, a group of investigators [15] also attempted to show that the new ICBD classified BD better than the ISGC. However, they studied only 29 patients with BD. Not only was this number quite inadequate, but how they set out to define a specificity without a control group, which they did not have, was most puzzling.
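The arithmetic used in this section can be retraced with a short Python sketch. It applies Bayes’ formula in odds form to the 1000-patient clinic example (93% sensitivity, 97% specificity, 1% pretest probability) and then computes the expected misclassifications for the ISGC and the ICBD from the sensitivities and specificities quoted from Ref. [14]. The function names are hypothetical, and reporting expected counts as fractions of patients is a simplification of this sketch, not part of the original analysis.

```python
def post_test_probability(pretest_prob: float, likelihood_ratio: float) -> float:
    """Bayes' formula in odds form: post-test odds = pretest odds x LR."""
    pretest_odds = pretest_prob / (1.0 - pretest_prob)   # probability -> odds
    post_odds = pretest_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)                 # odds -> probability


def expected_misclassified(sens: float, spec: float,
                           n_diseased: int, n_healthy: int) -> tuple[float, float]:
    """Return the expected (missed cases, false positives) in a population."""
    missed = (1.0 - sens) * n_diseased
    false_pos = (1.0 - spec) * n_healthy
    return missed, false_pos


# Clinic example from the text: 10 BD patients among 1000 (pretest 1%).
sens, spec, pretest = 0.93, 0.97, 0.01
lr_pos = sens / (1.0 - spec)
lr_neg = (1.0 - sens) / spec
print(f"criteria fulfilled:     {post_test_probability(pretest, lr_pos):.1%}")
print(f"criteria not fulfilled: {post_test_probability(pretest, lr_neg):.2%}")

# Sensitivity and specificity as quoted in the text from Ref. [14].
for name, s, sp in [("ISGC", 0.781, 0.984), ("ICBD", 0.982, 0.956)]:
    missed, false_pos = expected_misclassified(s, sp, n_diseased=10, n_healthy=990)
    print(f"{name}: missed {missed:.1f} of 10, false positives {false_pos:.1f} of 990")
```

The first two lines of output reproduce the 23.8% and 0.07% figures. With the quoted ICBD specificity of 95.6%, this sketch gives about 44 expected false positives rather than the 46 stated in the text; the comparison with the ISGC (about 18 misclassified in total) is unaffected.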


3. Conceptual problems in making disease criteria and examples from the 2010 ACR/EULAR rheumatoid arthritis classification criteria

We propose that practically all attempts at making disease criteria in rheumatology, including those related to BD, have had and continue to have conceptual problems.

1. Almost all disease criteria are declared before they have been tested in real life situations. As we have seen above with Bayes’ theorem, pretest probabilities are very important when using criteria for diagnosis. With increasing pretest probability the diagnostic utility of disease criteria will significantly improve, so we should search for ways to accomplish this when we prepare criteria. One way to achieve this would be to prepare subspecialty-specific disease criteria [16]. For example, the main diagnostic issue in CNS disease in BD is how to differentiate it from multiple sclerosis or central nervous system vasculitis. A similar situation exists in GI involvement, where the main problem is distinguishing BD from idiopathic inflammatory bowel disease. Another item, almost never utilized in disease criteria, is the family history. All clinicians would agree that a patient with early onset arthritis has a greater pretest probability of having RA if she also has a father with RA. Why we have thus far omitted this important aspect of clinical disease from our criteria is hard to explain.

2. As we have seen above, in making a diagnosis we also get very valuable information from a patient not fulfilling a criteria set. This is usually not given enough emphasis in the discussions related to the utility of disease criteria. What is perhaps more important is that the value of those findings, the absence of which would be rather specific for the disease being sought, is not considered. Assuming we prepare a set of criteria for BD for neurologists, there would be good reason to include features like negative results in multimodal evoked potentials or the absence of typical demyelination plaques in imaging.

3. An important conceptual problem in criteria development is the concern to avoid circular reasoning. We note this concern rather strongly voiced in the recent 2010 ACR/EULAR rheumatoid arthritis classification criteria [17,18]. What we need to realize is that criteria development is, by definition, circular [19]. We propose that the conceptual problem arises when we do not differentiate the process of circularity from circular reasoning. Circular logic is to come to a conclusion unaware that the conclusion was inescapable. On the other hand, to identify what you have previously defined is not circular logic [19]. To give an example, the authors of the recent ACR/EULAR classification criteria underline that, in order to avoid circularity, they did not use the 1987 RA criteria as an endpoint to discern what factors in a patient with early arthritis were responsible for bringing about the advanced disease that the 1987 criteria recognized [17,18]. This is hard to understand. There is nothing circular about defining an end and then analyzing what causes that end. The authors used methotrexate (MTX) use as a surrogate for the bad end in this exercise. This, we propose, was an unfortunate choice. It led to specificity values of only around 60–70% for their criteria when tested in real life [20], almost unacceptable in any disease criteria exercise.

4. Of the conceptual issues we have been discussing, probably the most important is the unwarranted differentiation between diagnostic and classification criteria. The dominant view for the last several decades has been that we should have separate criteria sets for research and diagnosis, the former requiring classification and the latter diagnostic criteria.
Whenever we propose a new set of disease criteria we, almost by default, promise diagnostic criteria to come [19]. This, however, never gets realized, in that there is no methodology for preparing a set of diagnostic criteria that differs from that for classification criteria. They are one and the same. A diagnosis is nothing but a classification a physician makes in the individual patient [19]. Moreover, in this separation of the diagnostic and classification criteria there is the implication that a diagnosis is something more specific than a classification. That simply is not true. We very frequently and inescapably deal with probabilities in our everyday practice and we should surely relate this to our patients. The new ACR/EULAR classification criteria for RA are actually nothing more or less than a set of criteria for starting MTX in a patient with early arthritis. Why do we not then simply tell our patients with early arthritis that our science is not good enough to predict exactly whether they will develop the bad and crippling disease we call RA? We could also tell them that, if we start them on MTX, chances are rather good that we will be able to circumvent, at least partially, that bad outcome. It is interesting to note that the authors of the new criteria say that their criteria set was meant neither to diagnose early RA in the rheumatology clinic nor as a guideline for the general physician to help in referral [17]. Finally, the recent realization that about 20–30% of the patients fulfilling the criteria for early RA turn out later on to have diseases other than RA [20] effectively precludes their use in drug studies, proposed as the current main reason for the existence of these criteria [17].
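Point 1 of this section argues that a higher pretest probability improves the diagnostic utility of a criteria set. A minimal Python sketch of that dependence follows, reusing the 93% sensitivity and 97% specificity assumed for the ISGC in Section 2. The function name is hypothetical, and the pretest probabilities other than 1% are hypothetical values chosen only for illustration, as might apply in a subspecialty clinic or in a patient with a positive family history; they are not data from any study.

```python
def predictive_values(sens: float, spec: float, pretest_prob: float) -> tuple[float, float]:
    """Return (P(disease | criteria met), P(disease | criteria not met))."""
    odds = pretest_prob / (1.0 - pretest_prob)
    pos_odds = odds * sens / (1.0 - spec)    # pretest odds x LR+
    neg_odds = odds * (1.0 - sens) / spec    # pretest odds x LR-
    return pos_odds / (1.0 + pos_odds), neg_odds / (1.0 + neg_odds)


SENS, SPEC = 0.93, 0.97  # mean ISGC values used in Section 2

# 1% is the clinic example from the text; 5% and 20% are hypothetical settings.
for pretest in (0.01, 0.05, 0.20):
    if_met, if_not_met = predictive_values(SENS, SPEC, pretest)
    print(f"pretest {pretest:.0%}: criteria met -> {if_met:.1%}, "
          f"criteria not met -> {if_not_met:.2%}")
```

With the same criteria set, the probability of disease in a patient who fulfills the criteria rises steeply as the pretest probability rises, which is the reasoning behind tailoring criteria to settings where the disease is more likely.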

4. Conclusion

Making diagnostic/classification criteria is inescapable, especially when we deal with diseases of unknown etiology and nonspecific histology, imaging and laboratory findings. This is also true for BD. On the other hand, our lack of appreciation of how important probabilities are in the practice and science of medicine, and the prevailing misconception that diagnostic and classification criteria are somewhat different, adversely affect both the preparation of new disease criteria and the effective use of the existing sets.

Acknowledgment

The authors are grateful to Mrs. Angela Tetmeyer Yazici for her editorial help.

References

[1] Yazici H, Ugurlu S, Seyahi E. Behçet syndrome: is it one condition? Clin Rev Allerg Immunol 2012;43:275–80.
[2] Direskeneli H. Autoimmunity vs autoinflammation in Behcet’s disease: do we oversimplify a complex disorder? Rheumatology (Oxford) 2006;45:1461–5.
[3] Fries JF. Disease criteria for systemic lupus erythematosus. Arch Intern Med 1984;144:252.
[4] Mason RM, Barnes CG. Arthritis of Behçet’s disease. Ann Rheum Dis 1969;28:95–103.
[5] O’Duffy JD. Suggested criteria for diagnosis of Behçet’s disease. J Rheumatol 1974;1(Suppl. 1):18 [abstr].
[6] Pincus T, Yazici Y, Bergman MJ. Hotel-based medicine. J Rheumatol 2008;35:1487–8.
[7] Spiegelhalter DJ. Statistical epidemiology for evaluating gastrointestinal symptoms. Clin Gastroenterol 1985;14:489–515.
[8] International Study Group for Behçet’s Disease. Evaluation of diagnostic (‘Classification’) criteria in Behçet’s disease. Br J Rheumatol 1992;31:199–208.
[9] International Study Group for Behçet’s Disease. Criteria for diagnosis of Behçet’s disease. Lancet 1990;335:1070–80.
[10] Sox H. Tools for decision making. In: Max MB, Lynn J, editors. Interactive textbook on clinical Symptom Research. National Institute of Health, Department of Health Services; http://painconsortium.nih.gov/symptomresearch/chapter_14/Part_1/sec5/chspt1s5pg1.htm [accessed 04.10.13].
[11] Birnbaum MH. http://psych.fullerton.edu/mbirnbaum/bayes/BayesCalc.htm [accessed 04.10.13].
[12] O’Neill TW, Rigby AS, Silman AJ, Barnes C. Validation of the international study group criteria for Behçet’s disease. Br J Rheumatol 1994;33:115–7.
[13] Tunç R, Uluhan A, Melikoğlu M, Ozyazgan Y, Ozdoğan H, Yazici H. A reassessment of the International Study Group criteria for the diagnosis (classification) of Behçet’s syndrome. Clin Exp Rheumatol 2001;19(5 Suppl. 24):S45–7.
[14] Davatchi F, Sadeghi Abdollahi B, Shahram F, Nadji A, Chams-Davatchi C, Shams H, et al. Validation of the International Criteria for Behçet’s disease (ICBD) in Iran. Int J Rheum Dis 2010;13:55–60.
[15] di Meo N, Bergamo S, Vidimari P, Bonin S, Trevisan G. Analysis of diagnostic criteria in Adamantiades-Behçet disease: a retrospective study. Indian J Dermatol 2013;58:275–7.
[16] Yazici H. Diagnostic versus classification criteria - a continuum. Bull NYU Hosp Jt Dis 2009;67:206–8.
[17] Aletaha D, Neogi T, Silman A, Funovits J, Felson DT, Bingham 3rd CO, et al. 2010 Rheumatoid arthritis classification criteria: an American College of Rheumatology/European League Against Rheumatism collaborative initiative. Ann Rheum Dis 2010;69:1580–8.
[18] Neogi T, Aletaha D, Silman AJ, Naden RL, Felson DT, Aggarwal R, et al. The 2010 American College of Rheumatology/European League Against Rheumatism classification criteria for rheumatoid arthritis: phase 2 methodological report. Arthritis Rheum 2010;62:2582–91.
[19] Yazici H. A critical look at diagnostic criteria: time for a change? Bull NYU Hosp Jt Dis 2011;69:101–3.
[20] Sakellariou G, Scirè CA, Zambon A, Caporali R, Montecucco C. Performance of the 2010 Classification Criteria for Rheumatoid Arthritis: a systematic literature review and a meta-analysis. PLoS One 2013;8:e56528.
