Scandinavian Journal of Gastroenterology

ISSN: 0036-5521 (Print) 1502-7708 (Online) Journal homepage: http://www.tandfonline.com/loi/igas20

Prognostic Scores in Oesophageal or Gastric Variceal Bleeding

C. Ohmann, H. Stöltzing, L. Wins, E. Busch & K. Thon

To cite this article: C. Ohmann, H. Stöltzing, L. Wins, E. Busch & K. Thon (1990) Prognostic Scores in Oesophageal or Gastric Variceal Bleeding, Scandinavian Journal of Gastroenterology, 25:5, 501-512

To link to this article: http://dx.doi.org/10.3109/00365529009095522

Published online: 08 Jul 2009.


Prognostic Scores in Oesophageal or Gastric Variceal Bleeding

C. OHMANN, H. STÖLTZING, L. WINS, E. BUSCH & K. THON
Theoretical Surgery Unit and Dept. of General and Trauma Surgery, Centre of Operative Medicine I, Heinrich Heine University of Düsseldorf, Düsseldorf, FRG

Scandinavian Journal of Gastroenterology 1990.25:501-512.

Ohmann C, Stöltzing H, Wins L, Busch E, Thon K. Prognostic scores in oesophageal or gastric variceal bleeding. Scand J Gastroenterol 1990, 25, 501-512

Numerous scoring systems have been developed for the prediction of outcome of variceal bleeding; however, only a few have been evaluated adequately. The aim of this study was to improve the classical Child-Pugh score (CPS) and to test other scores from the literature. Patients (n = 82) with endoscopically confirmed variceal bleeding and long-term sclerotherapy were included in the study. Linear logistic regression (LR) was applied to different sets of prognostic variables with regard to 30-day mortality. In addition, scores from the literature were evaluated on the data set. Performance was measured by the accuracy and by receiver-operating characteristic curves. The application of LR to all five CPS variables (accuracy, 80%) was superior to the classical CPS (70%). LR with selection from the CPS variables or from other sets of variables resulted in no improvement. Compared with CPS, only three scores from the literature, mainly based on subsets of the CPS variables, showed an improved accuracy. It is concluded that CPS is still a good scoring system; however, it can be improved by statistical analysis using the same variables.

Key words: Bleeding; evaluation; logistic regression; oesophageal varices; prognosis; receiver-operating characteristic curves; scoring system

PD Dr. Christian Ohmann, Funktionsbereich Theoretische Chirurgie, Klinik für Allgemeine und Unfallchirurgie, Zentrum für Operative Medizin I, Moorenstr. 5, D-4000 Düsseldorf, FRG

Three steps are essential for prognostic scoring systems to be used in clinical routine: development, evaluation, and clinical application. In liver cirrhosis and variceal bleeding numerous scoring systems have been developed (1-12). Only a few have been evaluated adequately (1, 2, 4, 6), and even fewer are in widespread clinical use (9, 10). This situation makes it difficult to assess the clinical value of most existing scores. Prognostic scores, especially if they have been developed by sophisticated statistical techniques (such as Cox regression or linear logistic regression), primarily reflect the individual sample investigated and often present results that are optimistically biased (13). Without thorough evaluation, including application of the scoring system in a new clinical setting, it remains uncertain whether the results achieved with a specific data set are reproducible in other centres under different conditions (14).

In variceal bleeding prognostic scores are used to define groups at low and high risk. Pretreatment estimations of risk are helpful in demonstrating comparability of treatment groups in randomized clinical trials and in supporting comparisons between treatment results of different centres. At present, most clinical studies investigating variceal bleeding use a rather simple classification system designed in 1964 (9) and slightly revised thereafter (10), the Child-Pugh classification (CPS). This system was designed by an expert using his clinical knowledge, an approach hardly suited to taking correlations between the variables into account or to producing adequate score values. The most widely applied prognostic score in liver cirrhosis and variceal bleeding thus appears not to have been developed by adequate methods (15). We therefore performed a study addressing the following questions: Can the CPS be improved by statistical analysis? How does the CPS perform compared with other scoring systems?


PATIENTS AND METHODS

Data were obtained from a prospective observational study on injection sclerotherapy in 95 consecutive patients with bleeding oesophageal or gastric varices. These patients were recruited from a total population of 505 patients with upper gastrointestinal bleeding admitted to the Surgical Clinic of Marburg University between January 1981 and April 1985. All patients were investigated by early emergency endoscopy, and only those with endoscopically confirmed bleeding from oesophageal or gastric varices were admitted to the study. Thirteen patients had to be excluded because of incomplete data (in 12 patients more than one CPS variable was missing, and 1 patient was lost to follow-up). Of the excluded patients eight died (five within 48 h). Altogether, 82 patients could be analysed in this study.

In all but two patients injection sclerotherapy was performed during emergency endoscopy on admission (one patient received no treatment, one balloon tamponade). In eight patients no primary haemostasis was achieved at emergency endoscopy; five of those patients died within 30 days. Emergency sclerotherapy was followed by repeated sessions, weekly during the hospital stay and bi-weekly after discharge, until no bluish, bulging, serpentine varices could be identified (the varices were not necessarily eradicated). After this phase sufficient protection against further bleeding was assumed. A systematic follow-up examination, including endoscopy, was performed every 4 months, and persistent or reappearing varices were again treated by repeated sclerotherapy until eradication of the varices was achieved. In patients with rebleeding from gastric or oesophageal varices the procedure was repeated. Patients received a median of 2 sessions during the hospital stay (range, 1 to 10). The procedure was performed with a flexible endoscope (Olympus GIF Q, GIF 1T) and with intra- and para-vasal injection of polidocanol (Aethoxysclerol®, 0.5%) in accordance with a standardized technique (16).

History, clinical examination, endoscopic findings, treatment, and outcome were documented prospectively on a computer questionnaire. A set of 19 prognostic variables recorded immediately after admission and before endoscopy was selected for analysis: ascites, encephalopathy, bilirubin value, albumin value, prothrombin index, general appearance, history-taking, stomach disease, shock index, aspartate aminotransferase value, alanine aminotransferase value, bleeding activity, sex, renal disease, hepatitis, enlarged liver, creatinine value, gastric varices, and multiple lesions. For all variables, standardized definitions were used in accordance with the literature (9, 17). Death within 30 days was taken as the outcome variable. Twenty patients died within 30 days; 62 survived this time period. The causes of portal hypertension were alcoholic cirrhosis (39 patients), posthepatic cirrhosis (21), alcoholic and posthepatic cirrhosis (11), primary biliary cirrhosis (3), portal vein thrombosis (1), and other causes (4); the cause remained unknown in 3 patients.

Statistical analysis

Univariate analysis was done with the chi-square test, comparing dead and surviving patients with regard to single variables. Multivariate analysis was performed by linear logistic regression (see Appendix) (18). This method was used for estimating the probability of the outcome death, given the signs, symptoms, and test results of the patient. Three different sets of variables were studied: a) CPS variables as categoric variables (qualitative data); b) CPS variables with bilirubin and prothrombin index as interval-scaled variables (measurements) and ascites, encephalopathy, and albumin as categoric variables; and c) all 19 categoric variables including the CPS variables. Each set was analysed both with all of its variables entered (except for the set of 19 categoric variables) and with stepwise selection.
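The logistic model named above turns a weighted sum of the patient's findings into a probability of death. The following is a minimal sketch of the standard logistic form only; the function name, the coefficients, and the 0/1 coding are hypothetical illustrations and are not the estimates fitted in this study (the authors' own formula is the one referred to in the Appendix).

```python
import math


def death_probability(intercept, coefficients, values):
    """Standard logistic model: p = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    linear_predictor = intercept + sum(b * x for b, x in zip(coefficients, values))
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# Hypothetical coefficients for two categoric variables (1 = present, 0 = absent).
# These numbers are illustrative assumptions, not results from the paper.
INTERCEPT = -2.0
COEFFICIENTS = [1.2, 0.9]  # e.g. ascites, encephalopathy

p_without = death_probability(INTERCEPT, COEFFICIENTS, [0, 0])
p_with = death_probability(INTERCEPT, COEFFICIENTS, [1, 1])
print(f"no findings: {p_without:.2f}, both findings: {p_with:.2f}")
```

With positive coefficients, each additional finding raises the estimated probability, which is the behaviour a prognostic score of this kind relies on.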

The performance of the different applications of the linear logistic regression model was assessed by the accuracy, the sensitivity, and the specificity of the predictions at a given cut-off point (see Appendix) (19). A cut-off point is a selected probability value for death (for example, 0.5) above which the prediction is defined as death and below which the prediction is defined as survival. For each model a set of different cut-off points was investigated. The relationship between sensitivity and specificity as a function of the cut-off point was analysed by receiver-operating characteristic curves (ROC curves; see Appendix and Ref. 19). The ROC curves determined by linear logistic regression analysis were compared with the ROC curve of the CPS.

Throughout this paper Harley's modification was taken as the CPS (11). Harley's score is identical to Pugh's score (10) except for the variable prothrombin time, which has been replaced by the prothrombin index (Table I). In calculating the CPS, 2 points were assigned for ascites; there was no discrimination between slight and moderate ascites. For 27 patients in whom serum albumin values were missing, a score of 1 point was assigned. Thus the total score calculated may underestimate the actual CPS. There were 16 patients with Child A (≤ 6 points; mortality, 0%), 31 with Child B (7 to 9 points; mortality, 16%), and 35 with Child C (≥ 10 points; mortality, 43%). The mortality was higher in patients with rebleeding (8 of 22 = 36%) than in patients without rebleeding (12 of 60 = 20%).

The score values given by the clinical expert in the CPS (for example, bilirubin < 34 µmol/l = one point) were compared with statistically derived scores from linear logistic regression and the Independence Bayes model (see Appendix and Ref. 19).

In addition, several prognostic scores in the literature were applied to our data, and the results were compared with the CPS. The scores investigated were as follows:

a) Two expert modifications of the CPS: Terblanche's score (12) with slightly different intervals with regard to the variable prothrombin index (%) (> 70, 1 point; 40-70, 2 points; [...] 35, 1; 30-34, 2; < 30, 3) and bilirubin (µmol/l) (≤ 25.7, 1; 25.8-42.7, 2; ≥ 42.8, 3).

b) Two statistical scores developed for upper gastrointestinal bleeding, including bleeding varices: Börsch's score (3) based on 6 variables not including any of the CPS variables (age, [...] 60 = 1; pulse (beats/min), < 100 = 0, 101-120 = 0.5, > 120 = 1; liver disease, no = 0, yes = 1; haemoglobin (g/100 ml), > 12 = 0, 9.1-12.0 = 0.5, ≤ 9 = 1; blood stigmata, - = 0, + = 1; shock index, > 1.2 = 0, ≤ 1.2 = 1; total score = 2.5 · age + 2 · pulse + 1.5 · haemoglobin + 1.5 · liver disease + 1 · shock + 1 · blood stigmata), and Provenzale's score (2), mixing general indicators of upper gastrointestinal bleeding with signs of liver disease (melaena, no = 0, yes = -1; haematochezia, no = 0, yes = 1; drop in haematocrit of 5%, no = 0, yes = 1; duration of bleeding (h), > 12 = 0, ≤ 12 = 1, < 3 = 2; systolic blood pressure (mm Hg), > 100 = 0, 90-99 = 1, 80-89 = 2, < 80 = 3; renal disease, no = 0, yes = 1; encephalopathy, no = 0, yes = 1; prothrombin time (sec), < 12 = 0, 12-15 = 1, > 15 = 2; total score = sum of all scores).

c) Three statistical scores for patients with bleeding varices: Sauerbruch's score A (6) based on 4 variables (-3.454 + bilirubin (µmol/l) · 0.00726 + ascites · 0.55293 + aspartate aminotransferase (IU/l) · 0.00423 + age · 0.03878), Sauerbruch's score B (6) based on 2 variables (-2.1 + bilirubin (µmol/l) · (-0.0049) + prothrombin index (%) · 0.047), and Garden's score (1) based on 3 variables (10 - prothrombin ratio · 4.3 - creatinine (µmol/l) · 0.03 - encephalopathy · 0.85).

d) Three statistical scores developed for patients with liver disease: Orrego's score A (7) based on 4 variables (3.6 + prothrombin time (sec) · 0.21 - haemoglobin (% normal) · 0.019 - albumin (g/l) · 0.06 + encephalopathy · 1.14), Orrego's score B (7) based on 3 variables (6 - haemoglobin (% normal) · 0.025 - albumin (g/l) · 0.16 + encephalopathy score · 1.03), and Ginés's score (8) based on 7 variables ((0.5847 · bilirubin (µmol/l) - 10.3) · 0.06306 + (gammaglobulin (g/l) - 18.9) · 0.04734 + (hepatic stigmata - 0.42) · 0.34566 + (prothrombin time (%) - 85.5) · (-0.01873) + (sex - 0.68) · 0.68602 + (age - 50.2) · 0.02411 + (alkaline phosphatase - 0.55) · 0.42832).

Table I. Single-factor analysis of the CPS variables and comparison of expert and statistical scores (columns: variable, category, no. of patients, univariate analysis; rows: ascites (no, slight, moderate), encephalopathy (none, moderate, severe), serum bilirubin (µmol/l), serum albumin (g/l), prothrombin index (%)).
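The literature scores listed above are all linear combinations of item scores or measurements, so they can be evaluated directly. The sketch below covers three of them under stated assumptions: the function names are hypothetical, the inputs are taken to be coded as in the original publications (for example, each Börsch item already coded 0, 0.5, or 1), and the plus signs in the Börsch total are a reconstruction that agrees with the range of cut-off points (0 to 9.5) quoted for that score in Fig. 3.

```python
def borsch_score(age, pulse, haemoglobin, liver_disease, shock, blood_stigmata):
    """Weighted sum of six item scores, each already coded 0, 0.5 or 1."""
    return (2.5 * age + 2.0 * pulse + 1.5 * haemoglobin
            + 1.5 * liver_disease + 1.0 * shock + 1.0 * blood_stigmata)


def sauerbruch_a(bilirubin_umol_l, ascites, ast_iu_l, age_years):
    """Sauerbruch's score A as quoted in the text (ascites coding assumed numeric)."""
    return (-3.454 + 0.00726 * bilirubin_umol_l + 0.55293 * ascites
            + 0.00423 * ast_iu_l + 0.03878 * age_years)


def garden_score(prothrombin_ratio, creatinine_umol_l, encephalopathy):
    """Garden's score as quoted in the text."""
    return (10.0 - 4.3 * prothrombin_ratio
            - 0.03 * creatinine_umol_l - 0.85 * encephalopathy)


# Hypothetical patient values, for illustration only.
print(borsch_score(age=1, pulse=0.5, haemoglobin=0.5, liver_disease=1, shock=0, blood_stigmata=1))
print(sauerbruch_a(bilirubin_umol_l=50, ascites=1, ast_iu_l=80, age_years=55))
print(garden_score(prothrombin_ratio=1.5, creatinine_umol_l=100, encephalopathy=1))
```

The largest possible Börsch total, with every item scored 1, is 9.5, which matches the upper end of the cut-off range used for that score.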

RESULTS

Univariate analysis (Table I)

Univariate analysis showed significant differences between death and survival in only three of the Child-Pugh variables (ascites, encephalopathy, and prothrombin index; Table I). For bilirubin only a trend was observed (p < 0.10), and for albumin no significant result was obtained. However, in one third of the patients the albumin values were missing. Of the 14 other variables 7 showed significant differences: general appearance, history of stomach disease, feasibility of history-taking, shock index (pulse/systolic blood pressure), aspartate aminotransferase, alanine aminotransferase, and bleeding activity.

Linear logistic regression (Table II)

Given a cut-off point of 10, application of the CPS resulted in an accuracy of 70%. Comparable results were achieved with stepwise linear logistic regression applied to the Child-Pugh variables taken as categoric variables (nominal scale) and using a cut-off point of 0.5 (accuracy, 70%). Ascites and encephalopathy were selected in all but 1 (= 81) of the cross-validatory applications of stepwise logistic regression, in which selection of variables and estimation of coefficients were performed on N - 1 = 81 patients, and testing was done on the patient left out. Prothrombin index was selected in half of the sequences (41), albumin in three, and bilirubin in one sequence.

An accuracy of 73% was calculated by applying stepwise logistic regression to the CPS variables analysed as a mixture of categoric variables (ascites, encephalopathy, albumin) and interval-scaled variables (bilirubin, prothrombin index). Bilirubin and encephalopathy were selected in all and ascites in all but 2 of the 82 cross-validatory sequences. Albumin and prothrombin index were not selected in any sequence.

Stepwise selection from the 19 variables, including the Child-Pugh variables, did not improve the results (accuracy, 71%). The following variables were selected in most of the cross-validatory sequences: general appearance (73), second lesion at endoscopy apart from varices (69), enlarged liver (79), prothrombin index (68), and sex (57).

By omitting the selection procedure and applying linear logistic regression to all five Child-Pugh variables, markedly better results could be achieved. Analysing the CPS variables as categoric variables resulted in an accuracy of 78%, and analysing them as mixed categoric/interval-scaled variables resulted in an accuracy of 80%.

Table II. Results of linear logistic regression and the Child-Pugh expert score

Method                      Variables                                     Stepwise selection  Accuracy† (%)
Linear logistic regression  Child-Pugh* (categoric)                       Yes                 70
                                                                          No                  78
                            Child-Pugh (mixed categoric/interval-scaled)  Yes                 73
                                                                          No                  80
                            All 19 variables (categoric)                  Yes                 71
Expert scoring              Child-Pugh score                              -                   70

* Child-Pugh variables: encephalopathy, ascites, bilirubin, albumin, prothrombin index.
† Cut-off points: probability for death = 0.5 (linear logistic regression) and 10 (Child-Pugh score) (see Appendix).

In Fig. 1 the ROC curves for the CPS and the logistic regression models with all five Child-Pugh variables (categoric; mixed categoric/interval-scaled) are presented. The meaning of the ROC curve can best be explained by an example. For a cut-off point of 0.5 and for the logistic regression model with all five CPS variables (mixed categoric/interval-scaled) a sensitivity of 50% and a specificity of 89% were calculated (accuracy, 80%; Table II), resulting in the point on the curve indicated by the filled triangle (Fig. 1). If a cut-off point of 0.20 is selected, sensitivity increases to 75% and specificity decreases to 79%, resulting in the point on the curve indicated by the filled square (Fig. 1). Repeating the procedure for all cut-off points and connecting neighbouring points by straight lines gives the ROC curve. The CPS is superior to the logistic regression models in the left area (high sensitivity, low specificity) but inferior in the right area (high specificity, moderate sensitivity). If both a high specificity (more than 80%) and a high sensitivity (more than 70%) are desired, the linear logistic regression models are superior to the classical CPS.

Fig. 1. ROC curves (sensitivity (%) against specificity (%)) for the Child-Pugh score and the linear logistic regression models with all CPS variables (categoric; mixed). Cut-off points: Child-Pugh score (Harley's modification (11)): 5, 6, ..., 15; linear logistic regression: 0.00, 0.05, ..., 1.00; (▲) cut-off point = 0.50, (■) cut-off point = 0.20 for linear logistic regression (mixed categoric/interval-scaled).

Statistical scoring (Table I)

In Table I the statistically derived score values are compared with the expert scores of the CPS. Remarkable agreement could be observed for the variables ascites, encephalopathy, and prothrombin index. The scores did not coincide for the variable albumin, for which no difference could be calculated between the moderate and the high-risk group (CPS, 2 and 3), or for bilirubin, for which the risk in the middle category (CPS, 2) was reduced compared with the low-risk category (CPS, 1). There was no difference between the logistic regression and the Independence Bayes model, except for the variable albumin. This was because missing data for albumin were taken as a separate category in the logistic regression model but not in the Independence Bayes model.

Fig. 2. ROC curves (sensitivity (%) against specificity (%)) for modifications of the Child-Pugh score. Cut-off points: Child-Pugh score (Harley's modification (11)): 5, 6, ..., 15; Rikkers' modification (5): 4, 5, ..., 12; Terblanche's modification (12): 5, 6, ..., 15.
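The construction of a ROC curve described in the Results, evaluating sensitivity and specificity at each cut-off point and connecting the resulting points, can be sketched as follows. The function names and the six-patient data set are hypothetical and serve only to illustrate the procedure; the sketch assumes that the data contain at least one death and one survivor.

```python
def rates_at_cut_off(probabilities, died, cut_off):
    """Return (sensitivity, specificity, accuracy) for one cut-off point.

    A patient is predicted to die when the estimated probability of death
    exceeds the cut-off point; otherwise survival is predicted.
    """
    tp = fp = tn = fn = 0
    for probability, patient_died in zip(probabilities, died):
        predicted_death = probability > cut_off
        if predicted_death and patient_died:
            tp += 1
        elif predicted_death:
            fp += 1
        elif patient_died:
            fn += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy


def roc_points(probabilities, died, n_steps=20):
    """Evaluate equally spaced cut-off points (0.00, 0.05, ..., 1.00 for n_steps=20)."""
    points = []
    for i in range(n_steps + 1):
        cut_off = i / n_steps
        sensitivity, specificity, _ = rates_at_cut_off(probabilities, died, cut_off)
        points.append((cut_off, sensitivity, specificity))
    return points


# Hypothetical probabilities and outcomes for six patients (not study data).
probabilities = [0.10, 0.30, 0.40, 0.60, 0.70, 0.90]
died = [False, False, True, False, True, True]

for cut_off, sensitivity, specificity in roc_points(probabilities, died, n_steps=4):
    print(f"cut-off {cut_off:.2f}: sensitivity {sensitivity:.2f}, specificity {specificity:.2f}")
```

Lowering the cut-off point can only increase sensitivity and decrease specificity, which is the trade-off shown by the two marked points in Fig. 1.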


Prognostic scores in the literature (Figs. 2-5)

In Fig. 2 three expert modifications of the CPS are compared. CPS is superior to Terblanche's modification (12). Rikkers' score (5) is inferior to the two scores using all five CPS variables. This indicates again that expert scores based on subsets of the CPS variables do not perform satisfactorily.

In Fig. 3 statistical scores developed for upper gastrointestinal bleeding in general, including bleeding varices, are compared with CPS. Börsch's score (3) has a poorer performance than the CPS. Provenzale's score (2) is superior to the CPS in the left and in the right area but not in the important middle area with a high sensitivity and a high specificity.

Fig. 3. ROC curves (sensitivity (%) against specificity (%)) for the Child-Pugh score and scores for upper gastrointestinal bleeding. Cut-off points: Child-Pugh score (Harley's modification (11)): 5, 6, ..., 15; Börsch's score (3): 0, 0.5, ..., 9.5; Provenzale's score (2): -1, 0, 1, ..., 11; missing data in eight patients.

In Fig. 4 statistical scoring systems for patients with bleeding varices are presented. Remarkable improvements compared with the CPS were achieved with Sauerbruch's score A (6). Sauerbruch's score B (6) performed better than the CPS but worse than Sauerbruch's score A. The results with Garden's score (1) were disappointing.

Statistical scoring systems developed for patients with liver disease (alcoholic liver disease, cirrhosis) are compared with the CPS in Fig. 5. Orrego's score A (7) is clearly superior to CPS. Orrego's score B (7) performs worse than CPS in the left area but better in the right area. Ginés's score (8) is inferior to the CPS.

The accuracy of the prognostic scores varies with the cut-off point. In Table III the accuracies are given for a fixed specificity of about 80%. In this situation CPS is 76% accurate (cut-off point, 11) compared with 70% in Table II (cut-off point, 10).

The results can be summarized as follows. Compared with the CPS (11), no improvement was achieved with the other expert scores (Terblanche's modification (12), Rikkers' modification (5)) or with the scores for upper gastrointestinal bleeding (Provenzale's score (2), Börsch's score (3)). Of the scores developed by statistical methods and using continuous (cardinal-scaled) variables, three were markedly superior to the CPS (Sauerbruch's score A (6), Orrego's score A (7), and logistic regression (Child variables, mixed)). Two of four variables in Sauerbruch's score A, three of four variables in Orrego's score A, and all five variables in the linear logistic regression model belong to the CPS variables. The only additional variables in these scores were aspartate aminotransferase and age (Sauerbruch's score A (6)) and haemoglobin (Orrego's score A (7)).

DISCUSSION

The outcome criterion of 30-day mortality was chosen because patients with acute variceal bleeding generally are in such a poor condition that short-term survival is the primary goal of treatment. The results, however, would not have been changed markedly if a period of 6 weeks had been investigated. Only three patients died after 30 days and within 6 weeks. The basic question was whether a score could discriminate between deaths and survivors, rather than analysing several risk groups (such as Child A, B, C). If only two risk groups are considered, this is best measured by the positive and negative predictive values (19). Predictive values, however, depend on the prevalence (that is, mortality), which varies considerably between different centres. Sensitivity and specificity do not depend on the prevalence and are therefore used as the performance measures in our study. In addition, the accuracy of the predictions was determined.

Fig. 4. ROC curves (sensitivity (%) against specificity (%)) for the Child-Pugh score and other scores for variceal bleeding. Cut-off points: Child-Pugh score (Harley's modification (11)): 5, 6, ..., 15; Sauerbruch's score A (6): -3, -2.7, ..., 4.2; Sauerbruch's score B (6): 2.8, 2.5, ..., -2.3; Garden's score (1): -10, -9, ..., 5.

Fig. 5. ROC curves (sensitivity (%) against specificity (%)) for the Child-Pugh score and other scores for liver disease. Cut-off points: Child-Pugh score (Harley's modification (11)): 5, 6, ..., 15; Ginés's score (8): 0, 1, ..., 20, missing data in 25 patients; Orrego's score A (7): 0, 0.5, ..., 5.5, missing data in 27 patients; Orrego's score B (7): -2, -1.5, ..., 4.5, missing data in 27 patients.

Table III. Accuracy of the prognostic scores (specificity fixed to approximately 80%)

Score             Accuracy (%)
Rikkers           69
Provenzale        71
Börsch            71
Terblanche        71
Garden            72
Orrego B          73
Ginés             74
Sauerbruch B      76
Orrego A          78
Sauerbruch A      80
Child-Pugh score  76

A specific data set was used for statistical analysis and evaluation of existing scores on variceal bleeding. Any results obtained relate to the population under study, in our case patients with endoscopically confirmed variceal bleeding and long-term injection sclerotherapy, and to short-term outcome. The target population of the scores in the literature covers patients with liver cirrhosis (8, 9), alcoholic liver disease (7), variceal bleeding (1, 4-6, 10, 12), upper gastrointestinal bleeding including variceal bleeding (2, 3), and other combinations (11). Some scores are based on short-term outcome (1, 3), others on longer follow-up periods (6-8). Treatment policies include operation (9-11), endoscopic treatment (6), and combinations of these (1, 4, 5, 12). It is known that treatment may well change the score values profoundly by having a beneficial effect in some patients and harmful effects in others. In our study only variables recorded before the start of endoscopic treatment (immediately after admission) were used in the analysis.

As is frequently the case in studies of emergency patients, several data are missing. This leads to the conclusion that prognostic scoring systems should be based only on data that are either immediately available or can be collected for the purpose of emergency decision-making. Even in our prospective study and for a simple scoring system such as CPS, missing data could not be avoided, indicating that many of the scores are still too complicated for routine clinical use.

To be considered of possible value for a prognostic score, a single variable should have some ability to discriminate between deaths and survivors. Univariate analysis in the literature applied to the CPS variables results in significant differences in the overwhelming majority of cases, indicating the prognostic value of these variables (1, 7). In our study the prognostic value of ascites, encephalopathy, and prothrombin index was significant. Whether all these variables would contribute independently to prognosis if analysed multivariately is, however, a matter of controversy.

The application of linear logistic regression to all five CPS variables brought good results in this study (Table II). Selection from the CPS variables or selection from a greater pool of variables, including the CPS variables, resulted in a lowering of performance. This indicates that all five CPS variables are important factors for a prognostic score. This is underlined by the poorer results achieved with Rikkers' score (5), which uses only four of the Child-Pugh variables. In addition, Campbell's score, based on the Child variables and expert scoring (20), could not be improved by the addition of other easily obtainable variables or by using various subsets of Child's criteria and additional variables. Sauerbruch et al. (6) found that in their data set the CPS (Terblanche's modification (12)) was superior to two other scores based on statistical methods (Sauerbruch A, B). In some studies, however, the prognostic value of Child's score (and modifications) is described as incomplete, and other variables are demonstrated to have additional prognostic information (21).

Of the statistical scores from the literature investigated in our study, only three showed a clear improvement over CPS: Sauerbruch's score A, including bilirubin, ascites, aspartate aminotransferase, and age; Sauerbruch's score B, based on bilirubin and prothrombin index; and Orrego's score A, including prothrombin time, albumin, encephalopathy, and haemoglobin. These improvements were achieved with subsets of the CPS variables combined with one or two other variables. In all these studies sophisticated statistical methods were used to develop the score. For most studies in the literature it therefore remains unclear which accuracies would have been achieved with all five CPS variables in the analysis and whether improvements over the CPS are due to better methods of analysis or to new sets of variables.

From the results of our study it can be concluded, first, that statistical scoring considerably improves on expert scoring and, secondly, that statistical scores based mainly on the CPS variables performed best. To improve the CPS, the scores for bilirubin and albumin should be redefined. Even better results can be expected if statistical scoring is performed with continuous rather than categoric information for the variables albumin, prothrombin index, and bilirubin. Further studies on prognostic scoring systems in variceal bleeding should compare any new scoring system with the classical CPS (10) and with statistical scoring on the five CPS variables as a baseline.
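The argument in the Discussion that predictive values depend on the prevalence, whereas sensitivity and specificity do not, can be checked with a short calculation. The sketch below uses the sensitivity (50%) and specificity (89%) reported for the mixed logistic regression model at a cut-off point of 0.5 and the 30-day mortality of this series (20 of 82 patients); the function name and the second prevalence of 10% are hypothetical and chosen only for comparison.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (positive predictive value, negative predictive value, accuracy)."""
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    accuracy = true_pos + true_neg
    return ppv, npv, accuracy


SENSITIVITY = 0.50   # reported in the Results for cut-off point 0.5
SPECIFICITY = 0.89   # reported in the Results for cut-off point 0.5

# Mortality in this series (20 of 82) and a hypothetical centre with 10% mortality.
for prevalence in (20 / 82, 0.10):
    ppv, npv, accuracy = predictive_values(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence {prevalence:.2f}: PPV {ppv:.2f}, NPV {npv:.2f}, accuracy {accuracy:.2f}")
```

With the mortality of this series the computed accuracy is close to the 80% given in Table II, while the same sensitivity and specificity yield a lower positive predictive value at the lower prevalence.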

ACKNOWLEDGEMENTS

The work was supported by grant Oh 3912-2 from the Deutsche Forschungsgemeinschaft (DFG). The authors thank Konrad Lang for assistance in the calculations and for preparing the drawings, Ursula Willems for typing the manuscript, and Dr. M. Kraemer for helping with the [...]

REFERENCES

1. Garden OJ, Motyl H, Gilmour WH, Utley RJ, Carter DC. Prediction of outcome following acute variceal haemorrhage. Br J Surg 1985, 72, 91-95
2. Provenzale D, Sandler RS, Wood DR, et al. Development of a scoring system to predict mortality from upper gastrointestinal bleeding. Am J Med Sci 1987, 294, 26-32
3. Börsch G, Matuk ZE, Leverkus F. Prognosis in upper gastrointestinal bleeding: univariate and multivariate statistical analyses in 477 bleeding episodes (in German). Med Klin 1987, 82, 774-780
4. Terés J, Baroni R, Bordas JM, Visa J, Pera C, Rodés J. Randomized trial of portacaval shunt, stapling transection and endoscopic sclerotherapy in uncontrolled variceal bleeding. J Hepatol 1987, 4, 159-167
5. Rikkers LF, Burnett DA, Volentine GD, Buchi KN, Cormier RA. Shunt surgery versus endoscopic sclerotherapy for long-term treatment of variceal bleeding. Early results of a randomized trial. Ann Surg 1987, 206, 267-271
6. Sauerbruch T, Ansari H, Wotzka R, Soehendra N, Köpcke W. Prognostic factors in cirrhosis of the liver, variceal bleeding and sclerotherapy: comparison of prognosis systems obtained by discriminant analysis with the Child-classification (in German). Dtsch Med Wochenschr 1988, 113, 11-14
7. Orrego H, Israel Y, Blake JE, Medline A. Assessment of prognostic factors in alcoholic liver disease: toward a global quantitative expression of severity. Hepatology 1983, 3, 896-905
8. Ginés P, Quintero E, Arroyo V, et al. Compensated cirrhosis: natural history and prognostic factors. Hepatology 1987, 7, 122-128
9. Child CG, Turcotte JG. Surgery and portal hypertension. In: Child CG, ed. The liver and portal hypertension. Saunders, Philadelphia, 1964, 50
10. Pugh RNH, Murray-Lyon IM, Dawson JL, Pietroni MC, Williams R. Transection of the oesophagus for bleeding oesophageal varices. Br J Surg 1973, 60, 646-649
11. Harley HAJ, Morgan T, Redeker AG, et al. Results of a randomized trial of end-to-side portocaval shunt and distal splenorenal shunt in alcoholic liver disease and variceal bleeding. Gastroenterology 1986, 91, 802-809
12. Terblanche J, Northover JMA, Bornman P, et al. A prospective controlled trial of sclerotherapy in the long term management of patients after esophageal variceal bleeding. Surg Gynecol Obstet 1979, 148, 323-333
13. Ohmann C, Künneke M, Zaczyk R, Thon K, Lorenz W. Selection of variables using 'Independence Bayes' in computer-aided diagnosis of upper gastrointestinal bleeding. Statist Med 1986, 5, 503-515
14. Wasson JH, Sox HC, Neff RK, Goldman L. Clinical prediction rules. Applications and methodological standards. N Engl J Med 1985, 313, 793-799
15. Spiegelhalter DJ. Statistical aids in clinical decision-making. Statistician 1982, 31, 19-36
16. Paquet KJ. Prophylactic endoscopic sclerosing treatment in the esophageal wall in varices - a prospective controlled randomized trial. Endoscopy 1982, 14, 4-5
17. Ohmann C, Thon K, Stöltzing H, et al. The personal computer as an aid to documentation of upper gastrointestinal endoscopy. Theor Surg 1986, 1, 69-83
18. Dixon WJ, ed. BMDP statistical software. University of California Press, Berkeley, 1985, 330-344
19. Weinstein MC, Fineberg HV. Clinical decision analysis. Saunders, Philadelphia, 1980
20. Campbell DP, Parker DE, Anagnostopoulos CE. Survival prediction in portocaval shunts: a computerized statistical analysis. Am J Surg 1973, 126, 748-751
21. Cello JP, Deveney KE, Trunkey DD, et al. Factors influencing survival after therapeutic shunts. Results of a discriminant function and linear logistic regression analysis. Am J Surg 1981, 141, 257-265
22. Toussaint GT. Bibliography on estimation of misclassification. IEEE Trans Inform Theory 1974, 20, 472-479
23. Hanley JA, McNeil BJ. A method of comparing the areas under receiver operating characteristic curves derived from the same cases. Radiology 1983, 148, 839-843

Received 4 July 1989
Accepted 13 November 1989

APPENDIX

In linear logistic regression analysis, the probability for death given symptoms, signs, and test results $S_1, \ldots, S_n$ is estimated by the following formula:

$$P(D \mid S_1, \ldots, S_n) = \frac{\exp(\beta_0 + \beta_1 S_1 + \cdots + \beta_n S_n)}{1 + \exp(\beta_0 + \beta_1 S_1 + \cdots + \beta_n S_n)},$$

where $\beta_i$ are the estimated coefficients. The analysis was performed with BMDP LR on an IBM PS/2 Mod 60 (18). Selection of variables was based on forward and backward stepping, assessing the significance of a candidate variable with the maximum likelihood ratio statistic, and entering or removing the variable if the preassigned limits (default values) were passed. To obtain an unbiased estimation of the performance of the statistical model, the cross-validation or leaving-one-out method was used. In this approach all patients except one are used for developing the model (estimating the coefficients), which is then tested on the one patient left out. This is repeatedly done for all 82 patients (22).

The performance of the different applications of the linear logistic regression model and of the scores in the literature was assessed by the accuracy, the sensitivity, and the specificity of the predictions (19). To determine these measures of performance, a cut-off point has to be specified. A cut-off point is a probability (for example, 0.5) or a score value (for example, 10 in CPS) used to differentiate between patients who died and survivors. Above the cut-off point death is predicted, otherwise survival. For each model a set of different cut-off points was investigated. The accuracy was calculated as the percentage of correct predictions, the sensitivity as the percentage of correct predictions in the subgroup of patients with outcome death, and the specificity as the percentage of correct predictions in the subgroup of patients with outcome survival.

Sensitivity and specificity depend on the selection of the cut-off point. If sensitivity is plotted against specificity for different cut-off points a (concave) curve results, known as receiver-operating-characteristic curve (ROC curve) (19). Statistical criteria have not been fully developed for ROC curves. One index of performance is the area under the ROC curve, varying from 0.5 (no apparent accuracy) to 1.0 (perfect accuracy, 100%) as the ROC curve moves towards the right and top. A test of the difference between areas under two ROC curves can be done but is, however, not satisfactory if the curves cross or differ only in parts, as in our case. Nevertheless, we performed such a test using the trapezoidal rule to approximate the areas, a conservative estimate for the standard deviations, and Kendall's τ as measure of the correlation between the areas (23). CPS was significantly superior (p < 0.05) to Rikkers' score and stepwise linear logistic regression applied to the CPS variables (categorical); all other comparisons were not statistically significant. Whether this is due to lack of statistical power, to methodologic problems of the test, or to equivalence of the scores is a matter of controversy.

The scores given by the clinical expert in the CPS classification (for example, bilirubin < 34 μmol/l = 1 point) were compared with statistically derived scores from linear logistic regression and the Independence Bayes model. In the Independence Bayes model conditional independence of the symptoms within each prognostic group is assumed:

$$P(D \mid S_1, \ldots, S_n) = \frac{P(D) \prod_{i=1}^{n} P(S_i \mid D)}{P(D) \prod_{i=1}^{n} P(S_i \mid D) + P(\bar{D}) \prod_{i=1}^{n} P(S_i \mid \bar{D})},$$

where $D$ = death, $\bar{D}$ = survival, $P(D)$, $P(\bar{D})$ = prior probabilities, and $P(S_i \mid D)$ = estimated probability for $S_i$ given $D$. Logarithmic transformation of the ratio of the probability for death and the probability for survival given the clinical findings results in a simple additive scoring system (15):

$$\ln \frac{P(D \mid S_1, \ldots, S_n)}{P(\bar{D} \mid S_1, \ldots, S_n)} = \beta_0 + \sum_{i=1}^{n} \beta_i S_i \quad \text{(logistic regression)}$$

$$\ln \frac{P(D \mid S_1, \ldots, S_n)}{P(\bar{D} \mid S_1, \ldots, S_n)} = \ln \frac{P(D)}{P(\bar{D})} + \sum_{i=1}^{n} \ln \frac{P(S_i \mid D)}{P(S_i \mid \bar{D})} \quad \text{(Independence Bayes theorem)}$$

For better comparison of the CPS with the statistical scores, a constant term was added to each statistical score to give identical sums for the base-line combination of variables (no ascites, no encephalopathy, albumin > 35 g/l, bilirubin < 34 μmol/l, prothrombin > 75%).
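The logistic formula at the start of the Appendix, together with the cut-off rule, can be sketched in a few lines of Python. This is an illustration only: the function names and the coefficients are hypothetical and are not the values estimated with BMDP LR in the study.

```python
import math

def death_probability(beta0, betas, findings):
    """P(death | findings) from the logistic formula exp(z) / (1 + exp(z))."""
    z = beta0 + sum(b * s for b, s in zip(betas, findings))
    # Both branches are algebraically equal to exp(z) / (1 + exp(z));
    # splitting on the sign of z avoids overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_death(probability, cutoff=0.5):
    """Death is predicted above the cut-off point, otherwise survival."""
    return probability > cutoff

# Hypothetical coefficients for two binary findings (NOT taken from the paper).
BETA0 = -2.0
BETAS = [1.5, 1.0]

p_none = death_probability(BETA0, BETAS, [0, 0])  # neither finding present
p_both = death_probability(BETA0, BETAS, [1, 1])  # both findings present
```

With these made-up coefficients a patient with neither finding falls below the 0.5 cut-off and a patient with both findings falls above it.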
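The leaving-one-out procedure described in the Appendix can be sketched as follows. The fitting step here is a deliberately simple stand-in (a cut-off halfway between the mean scores of the two outcome groups), assumed only to keep the example short; it is not the logistic regression used in the study, and the data are invented.

```python
def leave_one_out(patients, outcomes, fit, predict):
    """Develop the model on all patients but one, test on the one left out, repeat for all."""
    correct = 0
    for i in range(len(patients)):
        train_x = patients[:i] + patients[i + 1:]
        train_y = outcomes[:i] + outcomes[i + 1:]
        model = fit(train_x, train_y)
        correct += predict(model, patients[i]) == outcomes[i]
    return 100.0 * correct / len(patients)  # cross-validated accuracy in percent

# Stand-in model (an assumption, NOT logistic regression):
# the cut-off is the midpoint between the two group means.
def fit_midpoint(scores, died):
    dead = [s for s, d in zip(scores, died) if d]
    alive = [s for s, d in zip(scores, died) if not d]
    return (sum(dead) / len(dead) + sum(alive) / len(alive)) / 2.0

def predict_midpoint(cutoff, score):
    return score > cutoff

# Invented scores and outcomes (True = died); not the study data.
SCORES = [6, 7, 8, 9, 10, 11, 12, 13]
DIED = [False, False, False, True, False, True, True, True]
loo_accuracy = leave_one_out(SCORES, DIED, fit_midpoint, predict_midpoint)
```

Because each patient is classified by a model that never saw that patient, the resulting accuracy is not inflated by testing on the development data.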
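The three performance measures defined in the Appendix follow directly from the cut-off rule. A minimal sketch, with a hypothetical function name and invented data, assuming that both outcome groups are non-empty:

```python
def performance(scores, died, cutoff):
    """Return (accuracy, sensitivity, specificity) in percent at one cut-off point."""
    predicted = [s > cutoff for s in scores]  # above the cut-off: death predicted
    n_died = sum(died)
    n_survived = len(died) - n_died
    true_pos = sum(p and d for p, d in zip(predicted, died))
    true_neg = sum((not p) and (not d) for p, d in zip(predicted, died))
    accuracy = 100.0 * (true_pos + true_neg) / len(died)
    sensitivity = 100.0 * true_pos / n_died       # correct among those who died
    specificity = 100.0 * true_neg / n_survived   # correct among survivors
    return accuracy, sensitivity, specificity

# Invented scores and outcomes (True = died); not the study data.
SCORES = [6, 7, 8, 9, 10, 11, 12, 13]
DIED = [False, False, False, True, False, True, True, True]
acc, sens, spec = performance(SCORES, DIED, cutoff=10)
```

Calling `performance` over a range of cut-off values reproduces the idea of investigating a set of cut-off points for each model.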
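The ROC curve and its area under the trapezoidal rule can be sketched as below. Two assumptions are made for the illustration: the horizontal axis is 1 − specificity (the common convention, which leaves the area unchanged), and a score at or above each candidate cut-off predicts death so that every observed score value is used once. The data are invented.

```python
def roc_points(scores, died):
    """(1 - specificity, sensitivity) pairs, from the strictest to the most lenient cut-off."""
    n_died = sum(died)
    n_survived = len(died) - n_died
    points = [(0.0, 0.0)]  # cut-off above every score: nobody is predicted to die
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(s >= cutoff and d for s, d in zip(scores, died))
        fp = sum(s >= cutoff and not d for s, d in zip(scores, died))
        points.append((fp / n_survived, tp / n_died))
    return points

def trapezoidal_area(points):
    """Area under the piecewise-linear curve through the given points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

# Invented scores and outcomes (True = died); not the study data.
SCORES = [6, 7, 8, 9, 10, 11, 12, 13]
DIED = [False, False, False, True, False, True, True, True]
area = trapezoidal_area(roc_points(SCORES, DIED))
```

The significance test for the difference between two such areas (23) additionally needs standard errors and the correlation between the areas; that part is not sketched here.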
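The link between the Independence Bayes model and an additive score can be checked numerically. The sketch below computes the posterior probability of death as a product of conditionally independent likelihoods and, separately, as a sum of log likelihood ratios. The prior and the likelihoods are hypothetical numbers chosen for illustration, and the function names are not from the paper.

```python
import math

def bayes_posterior(prior_death, likelihoods):
    """P(death | findings), assuming conditional independence within each outcome group.

    likelihoods: list of (P(S_i | death), P(S_i | survival)) for the observed findings.
    """
    joint_death = prior_death
    joint_survival = 1.0 - prior_death
    for p_given_death, p_given_survival in likelihoods:
        joint_death *= p_given_death
        joint_survival *= p_given_survival
    return joint_death / (joint_death + joint_survival)

def additive_score(prior_death, likelihoods):
    """Log-odds for death: log prior odds plus one log likelihood ratio per finding."""
    score = math.log(prior_death / (1.0 - prior_death))
    for p_given_death, p_given_survival in likelihoods:
        score += math.log(p_given_death / p_given_survival)
    return score

# Hypothetical prior and likelihoods (NOT estimates from the study).
PRIOR = 0.3
FINDINGS = [(0.6, 0.2), (0.5, 0.25)]
posterior = bayes_posterior(PRIOR, FINDINGS)
score = additive_score(PRIOR, FINDINGS)
# Converting the additive score back from log-odds gives the same posterior.
posterior_from_score = 1.0 / (1.0 + math.exp(-score))
```

Each finding contributes a fixed number of points (its log likelihood ratio), which is what makes the result comparable to a clinical point score such as the CPS.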
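The constant term used to align a statistical score with the CPS can be illustrated as follows. The weights are invented, and the target of 5 assumes that each of the five CPS variables contributes 1 point in its most favourable category, as in the bilirubin example given in the Appendix.

```python
def alignment_constant(weights, baseline_findings, target_sum):
    """Constant that makes the statistical score equal target_sum at the base-line combination."""
    baseline_score = sum(weights[f] for f in baseline_findings)
    return target_sum - baseline_score

# Invented statistical weights for the base-line categories (NOT from the study).
WEIGHTS = {
    "no ascites": -0.5,
    "no encephalopathy": -0.25,
    "albumin > 35 g/l": -0.75,
    "bilirubin < 34 umol/l": -0.5,
    "prothrombin > 75%": -0.5,
}
BASELINE = list(WEIGHTS)
CPS_BASELINE_SUM = 5  # assumed: five variables at 1 point each

constant = alignment_constant(WEIGHTS, BASELINE, CPS_BASELINE_SUM)
aligned_baseline = sum(WEIGHTS[f] for f in BASELINE) + constant
```

Adding the same constant to every patient's score shifts all scores equally, so the ranking of patients and the ROC curve are unaffected.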
