Journal of Consulting and Clinical Psychology 1979, Vol. 47, No. 5, 799-817

Some Characteristics of Effective Psychiatric Treatment Programs

Robert B. Ellsworth
Veterans Administration Medical Center, Salem, Virginia

Joseph F. Collins
Veterans Administration Medical Center, Perry Point, Maryland

Nancy A. Casey
Veterans Administration Medical Center, Northampton, Massachusetts

Reginald A. Schoonover
American University

Robert H. Hickey
Veterans Administration Medical Center, Pittsburgh, Pennsylvania

Leon Hyer
Veterans Administration Medical Center, Augusta, Georgia

Stuart W. Twemlow
Topeka, Kansas

John R. Nesselroade
Pennsylvania State University

This Veterans Administration cooperative study sought to identify the ward milieu characteristics of effective psychiatric programs. It is the largest study undertaken so far in terms of number of programs (79), patients (21,667), characteristics examined (191), and adequacy of outcome measures used to estimate program effectiveness. It was performed as a multivariable, correlational natural history study. Differences in characteristics of patients treated in each program were controlled statistically, and a cross-validation design and multiple outcome measures were used to make spurious findings less likely. Program variables were divided into treatment and setting characteristics, that is, those characteristics that the staff did and did not have control over. Since the most promising treatment characteristics are to be used in a blind follow-up to this study, this article deals mainly with the setting variables. The major finding reported here is that patients admitted to and treated on wards in which a mixture of acute and chronic patients was being treated had better outcomes than patients treated on wards with a more narrowly defined patient population.

In the public domain

This study was supported by the Veterans Administration Cooperative Studies Program, Medical Research Service, Veterans Administration Central Office, Washington, D.C. 20420. The participation of the following Veterans Administration medical centers, investigators, and data collectors in this large cooperative study is gratefully acknowledged: Augusta, Ga., Leon Hyer, Arthur Bryant, Gail Sikes; Battle Creek, Mich., Lawrence Schwartz, Carolyn Knockemus, Jean Smith, Stuart Picard; Brecksville, Ohio, Durand Jacobs, John Lowenfeld, Jim Super, Diane Leskiewicz; Brockton, Mass., Jerald R. Biddle, Gail Quaglieri, Betty White-Rose; Chillicothe, Ohio, William Hirschman, Denise Runyon, Linda Dunn; Dallas, Tex., Walter Penk, John Carpenter; Danville, Ill., August Johnson, Dorothy Knight, Charlene Sherman; Lyons, N.J., Leon Hyer, Nicholas Devone, Charles Buchbauer, Mark Bracaglia, Kathy Locke; Northampton, Mass., Nancy Casey, Joan Phakos; Palo Alto, Calif., Melvin Gallen, Vera Blum, Ron Grant, Jerry Guercio; Perry Point, Md., Edward Sieracki, Monolyne Gaddy, Elaine Smith, Betty Fair; Pittsburgh, Pa., Robert Hickey, Bernadette Brandenstein, Pamela Smail, John Spiegel; Salem, Va., Robert B. Ellsworth, Reggie Schoonover, Margaret Charlton, Joyce Glowers, Pat Collins, Katie James, Murlene Reed; Sepulveda, Calif., James Hawkins, Linda Heinz, Forest Davis; St. Louis, Mo., Nadia Ramzy, Richard Yakimo, Jeff Mayberry, Debra Schirber; Togus, Maine, David Briggs, Warren Doe, Judy Gidney; Topeka, Kans., Richard Cellar, Leily Arndt, Una Thomas; Waco, Tex., Thomas Frank, Clifford Knape, Doris Neumann, Karen Jackson, Kaye Johnson, Lani Phelps, Charlene Pruneda, Gary White.

The following participants are also gratefully acknowledged: consultants, John R. Nesselroade, Stuart W. Twemlow; data analysis, Joseph F. Collins, Stephanie Freeman, Roderic Gillis, Doris Bosworth, Bertha D. Carter; operations committee, Lee Gurel, Gerard E. Hogarty, Marcus O. Kjelsberg, Thomas B. Stage; human rights committee, Colleen Crigler, Reverend N. Ellsworth Bunce, James A. Crothers II, Alan S. Davis, Reverend Maurice Moore; central administration, James A. Hagans.

Requests for reprints should be sent to Joseph F. Collins, Cooperative Studies Program Coordinating Center, Veterans Administration Medical Center, Perry Point, Maryland 21902.

At this time there is little agreement on what the essential characteristics of an effective treatment program are. Twenty-five years ago, Stanton and Schwartz (1954) demonstrated that the treatment environment had a marked effect on a patient's hospital adjustment. For example, when there was an unresolved conflict among the staff, patient behavior became more maladaptive (increased acting out, psychotic symptoms, and withdrawal). During the intervening 25 years, hundreds of articles on milieu therapy have attempted to describe the attributes of a treatment environment that facilitate the recovery of psychiatrically hospitalized patients. In addition, there have been numerous informal reports on the ameliorative effects of different setting characteristics on the adjustment of patients observed in the treatment setting. These studies have a major limitation, however, because research in the last decade has shown that in-setting behavior is not strongly related to posthospital community adjustment (Ellsworth, Foster, Childers, Arthur, & Kroecker, 1968; Erickson, 1975).

Ellsworth (1979) recently reviewed 23 studies that represent serious attempts to identify one or more characteristics of the therapeutic environment that contribute to lowered rates of rehospitalization or to improvement in community adjustment. Only a few consistent findings emerge regarding the characteristics of programs with the highest turnover rates. These include an administrative decision to implement a rapid turnover program, to place long-stay patients in the community, and to adopt a unit system in which each team admits, treats, and discharges a wide range of patient types. But turnover rates are among the poorest indices of program effectiveness (Erickson & Paige, 1973), since they tend to reflect administrative policy rather than quality of treatment. The program characteristics related to return rates, another commonly used outcome criterion, are not clear. A few findings have been suggested, but these have not been replicated. Return rates are regarded as a limited measure of treatment effectiveness because patients return for a variety of reasons, including poor community adjustment, return to a previously helpful resource, and use by families of an expedient solution when tensions arise between themselves and the expatient. The only reported setting characteristic that seems to be related to good posthospital adjustment (the most adequate index of treatment outcome) is the small size of the hospital or treatment unit.

Problems of Earlier Milieu Studies

Perhaps the major reason that there are so few consistent findings in the studies on milieu treatment is that almost all of the studies have one or more major flaws. As elaborated by Ellsworth (1979), they include the following: (a) failure to control for the effects of patient input characteristics on the outcome measure being examined, (b) the use of poorly selected measures of program outcome, (c) the reporting of chance findings due to the small sample of programs relative to the large number of program characteristics, (d) possible errors in attributing causality to correlational findings, and (e) the inability to control for attention-placebo effects. These limitations are reviewed briefly below.

1. Failure to control for patient input characteristics represents a major flaw in many studies because input characteristics are often highly correlated with the outcome measures used in comparing the effectiveness of different programs.
For example, marital status (married), diagnosis (nonpsychotic), and age (young) have been found to be positively related to early release. The probability of rehospitalization, another commonly used outcome measure, is related to frequency of previous admissions; and the level of posthospital adjustment is largely a function of the patient's preadmission adjustment. When studies fail to control for the effects of these patient input variables on the outcome being examined, the probability of reaching erroneous conclusions about the impact of certain program elements is high.

2. Another major problem is that few studies have used the outcome measures regarded as most adequate for evaluating programs, namely, posthospital functioning and adjustment. Until recently, the cost of obtaining adequate information about patients' posthospital adjustment was prohibitive for all but a few well-funded studies, because this information was available only through personal interviews. The lack of scales adequate for mailing prevented most studies from using posttreatment adjustment as a measure of program effectiveness. Now a small number of valid and reliable rating scales have been developed for use by patients or their relatives (Hargreaves, Attkisson, & Sorensen, 1977). These scales significantly reduce costs when sent through the mail, although a trade-off problem of data loss emerges.

3. The probability of reporting spurious findings is high when the number of program characteristics is large, the number of programs is small, and the results are not cross-validated. Very few studies have had a program sample size larger than 18, and those that did were restricted to using measures of program effectiveness that are not related to community adjustment, such as return and turnover rates. One method for guarding against the reporting of findings that can arise by chance is to replicate the study. This has occurred for only 1 of the 23 studies reviewed (Ullmann, 1967). Thus, it seems certain that many of the findings reported in previous milieu studies are ascribable to chance.

4. The limitations of attributing causality to correlational findings are well-known, and most of the studies evaluating the effects of certain milieu characteristics have relied on correlational data. A correlation may indicate a causal relationship: For example, people who do not smoke have fewer problems with lung cancer, or highway death rates were reduced sharply when the 55-mile-per-hour speed limit was introduced. However, a finding that treatment programs with higher staff ratios also have higher turnover rates does not necessarily portray a cause-effect relationship. Higher staff ratios and higher turnover have been found to be related to a third variable, namely, earlier success in placing long-stay patients in the community (Ellsworth, Dickman, & Maroney, 1972). When long-stay patients were placed out of the hospital, this both reduced the number of patients (thereby increasing staff/patient ratios) and freed additional beds to receive, treat, and discharge newly admitted patients (thereby increasing turnover). Also, unless staff characteristics such as education level, age, and sex are controlled, measures of program characteristics may reflect staff differences rather than the nature of the treatment environment (Edelson & Paul, 1976). The findings from correlational data are more believable when the effects of patient input and staff background characteristics are statistically controlled in identifying the program elements most likely to be related to outcome effectiveness. Even then the findings are more open to question than are those that come from a well-designed experimental study.

5. Controlled experiments, on the other hand, also have limitations. A major weakness is the failure to control for attention-placebo effects. Treatment teams and patients in special programs typically know that the program is special and that its effectiveness is being researched. Because of the effects of this artifact, Erickson (1975) has concluded,

    Projects involving psychosocial interventions cannot escape the charge that their results are due to the poorly understood and ubiquitous attention-placebo effect. Comparing an experimental innovation with a traditional control group is not enough. (p. 534)

Controlled experiments present potential problems other than attention-placebo effects.


A limitation of many experimental studies is the artificial constraint they impose on the delivery of treatment. Certain procedures often have to be followed in prescribed ways for predetermined periods of time. The "same" treatment delivered in the usual day-to-day program may differ in important ways from the treatment delivered under experimental conditions. Finally, there is the problem of choosing which program elements (out of potentially hundreds) to evaluate experimentally. Very few independent variables (program characteristics) can be controlled for examination in an experiment. How does one choose the most relevant and promising program characteristics for study? The research to be reported in part here is relevant to this question.

Overview of the Present Study

This study is the largest ever undertaken to examine the ward milieu in terms of patient and program sample size and is one of the most complex in variety of program characteristics studied and outcome measures used. It was conceived as a multivariable, correlational natural history study. As such, wards were left to operate in their usual manner while systematic observations were made of their program characteristics and outcome effectiveness. This approach had the advantage of observation of the programs as they normally functioned, without any artificial constraints being imposed on them, and also made it possible to examine a large number of program variables. The primary purpose was to identify the setting and treatment characteristics related to measures of outcome effectiveness. Only the findings pertaining to setting characteristics are reported here, however, for reasons discussed later.

The study was designed to handle many of the weaknesses of earlier correlational studies. For example, the effects of patient and staff input characteristics on the measures of program characteristics and outcomes were controlled by using appropriate statistical methods. The measures of program effectiveness included patients' functioning in the community in addition to the traditional, but limited, return rate criterion common to earlier studies. The probability of reporting spurious findings was reduced by including a large sample of ward programs and a replication design to cross-validate the findings. Nevertheless, caution should be exercised in attributing causality to the findings, since regression rather than experimental design methods were used to control the effects of patient and staff input characteristics.

Also examined was whether different types of patients had similar treatment outcomes when treated on wards with varying levels of the characteristics of therapeutic programs. For example, do younger or better educated patients have better outcomes if treated in programs with high levels of outcome-related characteristics than if treated in programs with low levels? It may be that some groups of patients do not show a differential response to the high- versus low-level programs. Such patients may show an above average outcome if treated in other types of programs. This issue of patient-by-program-type interaction will be examined in another article.

And finally, as presented in the Discussion section of this article, there are some conclusions that may be of interest to those involved in studies evaluating the outcome effectiveness of multiple programs. Since this study represents a multioutcome evaluation of many programs, several findings of interest to program evaluators are discussed.

Method

Selection of Programs and Patients

Thirty-four Veterans Administration hospitals treating large numbers of psychiatric patients were invited to participate in the study. Of these hospitals, 20 responded favorably, but 2 subsequently withdrew, 1 because of low admission rates to wards and 1 because of lack of staff interest in the study. The 18 remaining hospitals were located in various parts of the United States and represented a good mixture of programs with respect to variety of patient load, staffing ratios, and program characteristics.
A total of 104 wards began the study, but only 79, those that were able to achieve a sample size of patient follow-ups sufficient to determine the treatment effectiveness of their programs, finished. Data collection extended over a 2-year period, with approximately 18 months for patient intake. All programs were unclassified, that is, they admitted, treated, and discharged a wide variety of patients with various psychiatric disorders.

As shown in Table 1, a total of 34,355 male psychiatric patients were screened upon admission during the 18 months of intake. Female patients were not included because they represented less than 2% of admissions to these hospitals. Of those screened, 21,667 received sufficient treatment on a study ward to qualify for the study. However, 4,339 were judged by the ward nurse to be too confused to complete the initial testing and 3,146 refused to give consent. Only 1,796 patients were not included for other reasons, primarily because they were not contacted within 5 days after admission. The remaining 12,388 provided an intake sample that does not represent the most disturbed and uncooperative patients, but does represent almost all of the other patients treated on the participating wards.

The background characteristics of patients admitted to the study were as follows: (a) age: 13% under 25, 32% 26-40, 39% 41-55, and 16% over 55 years old; (b) marital status: 40% married, 33% separated, widowed, or divorced, and 27% never married; (c) education: 37% less than high school, 30% high school graduates, and 33% at least some college; (d) income: 65% under $8,000, 22% between $8,000 and $12,000, and 13% above $12,000; (e) race: 84% white and 16% other; (f) diagnosis: 22% neuroses, 49% psychoses, 16% substance abuse, and 13% other; (g) service-connected disability: 40%; (h) employed: 38% mostly, 37% sometimes, and 25% not employed full time during past 5 years; (i) amount of hospitalization (during last 5 years): 29% none, 30% 1-3 months, 26% 3-12 months, and 15% 1 year or more.

Table 1
Patient Sample Size

Item                                                    n
Patients screened                                  34,355
Insufficient treatment on study ward
  Hospitalized 10+ days elsewhere                   5,626
  Did not remain for treatment 5+ days              5,403
  Less than 50% of time on study ward               1,657
Total qualified for study                          21,667
Dropped from study
  Too confused to complete self-rating              4,339
  Refused to give consent                           3,146
  Not contacted within 5 days                       1,140
  Already in study ("repeaters")                      261
  Remained hospitalized beyond 6 months               395
Total included in study                            12,388
Lost to follow-up
  Rehospitalized within 3 months of release         1,105
Total eligible for follow-up                       11,283
Follow-ups completed
  Follow-ups from veterans                          4,589 (a)
  Follow-ups from significant others
    Patients who named others                       8,801 (b)
    Pre-PARS returned by others                     5,721 (c)
    Follow-up PARS returned by others               3,382 (d)
    Total pre-PARS and post-PARS                    3,382 (e)

Note. PARS = Personal Adjustment and Role Skills Scale.
(a) 41% of eligibles. (b) 78%, or 8,801 out of 11,283. (c) 65% of patients who named others. (d) 59% of others who returned pre-PARS. (e) 30% of eligibles.
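The follow-up percentages given in the note to Table 1 can be rechecked from the counts in the table itself. The short sketch below does only that arithmetic; the variable names and the helper `pct` are illustrative and do not come from the study.

```python
# Counts copied from Table 1.
included = 12_388           # total included in study
rehospitalized = 1_105      # rehospitalized within 3 months of release
eligible = 11_283           # total eligible for follow-up
veteran_followups = 4_589   # follow-ups from veterans
named_others = 8_801        # patients who named a significant other
pre_pars = 5_721            # pre-PARS returned by others
post_pars = 3_382           # follow-up PARS returned by others


def pct(part, whole):
    """Return part/whole as a percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


# Each entry mirrors one lettered footnote of Table 1.
rates = {
    "veteran follow-ups / eligibles": pct(veteran_followups, eligible),
    "named others / eligibles": pct(named_others, eligible),
    "pre-PARS returned / named others": pct(pre_pars, named_others),
    "follow-up PARS / pre-PARS returned": pct(post_pars, pre_pars),
    "pre- and post-PARS / eligibles": pct(post_pars, eligible),
}

for label, value in rates.items():
    print(f"{label}: {value}%")
```

The rounded values agree with the 41%, 78%, 65%, 59%, and 30% reported in the table note, and the eligible count equals the included count minus the patients rehospitalized within 3 months.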

Replication Strategy

The experimental unit for this study was the ward program. To increase the number of programs that were available for a replication analysis, wards that had a high intake of patients were divided into two data sets based on time; that is, patients who entered the study during the first 9 months of intake were placed into a different data set than were those who entered during the second 9 months. Of the 79 wards eventually used in this study, 43 were classified as low intake and 36 were classified as high intake, resulting in a total of 115 program data sets [43 + (36 × 2) = 115]. The low-patient-intake wards were randomly assigned to the original and replicate data sets. One of the data sets from each of the high-patient-intake wards was randomly assigned to the original set of programs and the other to the replicate set. This random assignment resulted in 58 programs in the original data set and 57 in the replicate set. Since no ward was represented twice within any one data set, the analysis within data sets was based on independent measures of ward program characteristics and outcomes.
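The assignment described above can be illustrated with a minimal sketch. Several details are assumptions made for illustration only: the ward labels are hypothetical, the function name `assign_program_data_sets` is invented here, and the low-intake wards are split as evenly as possible (22 and 21), which is one way to arrive at the reported 58 and 57 programs. The article does not state the randomization procedure that was actually used.

```python
import random


def assign_program_data_sets(low_intake, high_intake, seed=0):
    """Split wards into an original and a replicate set of program data sets.

    Each high-intake ward contributes one 9-month half to each set;
    each low-intake ward goes, whole, to exactly one set.
    """
    rng = random.Random(seed)
    original, replicate = [], []

    # High-intake wards: randomly decide which half goes to which set.
    for ward in high_intake:
        halves = [(ward, "first 9 months"), (ward, "second 9 months")]
        rng.shuffle(halves)
        original.append(halves[0])
        replicate.append(halves[1])

    # Low-intake wards: shuffle, then split as evenly as possible (assumption).
    shuffled = list(low_intake)
    rng.shuffle(shuffled)
    cut = (len(shuffled) + 1) // 2
    original += [(ward, "full intake") for ward in shuffled[:cut]]
    replicate += [(ward, "full intake") for ward in shuffled[cut:]]
    return original, replicate


# Hypothetical ward labels; only the counts (43 and 36) come from the article.
low = [f"low-{i}" for i in range(43)]
high = [f"high-{i}" for i in range(36)]
original, replicate = assign_program_data_sets(low, high, seed=1)
print(len(original), len(replicate))
```

Because every ward appears at most once in each set, the programs within a set remain independent of one another, which is the property the replication design relies on.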

Measures of Program Characteristics

The program characteristics included in the study were selected from the research, clinical, and theoretical literature regarding the nature of therapeutic environments for psychiatric patients. Much has been written about the constructs of the therapeutic milieu (Ellsworth, Maroney, Klett, Gordon, & Gunn, 1971; Lawton & Cohen, 1975; Moos, 1974; Paul & Lentz, 1977; Van Putten, 1973; White, 1972), and the research literature on the characteristics of effective programs has been summarized by Ellsworth (1979). The present study represents a serious attempt at measuring the important characteristics derived from the clinical, theoretical, and research literature. A total of 440 measures of program characteristics were considered initially. Many were dropped because (a) they were redundant, that is, they were highly correlated with other measures in the study; (b) they were trivial, that is, they affected less than 2% of the patients on a ward (e.g., a rare combination of medications); or (c) they failed to differentiate among programs at an acceptable level of statistical significance (p < .01 for VETS F(114, 4466) and PARS F(114, 3262)). [...] treatment milieu.

Program variables were grouped into setting and treatment characteristics. Sixty-eight were setting characteristics; that is, they were not under the control of the treatment teams. As outlined in Table 2, these included (a) physical characteristics of the ward, such as size, separate tv room, and privacy; (b) administrative-policy characteristics, including such measures as average ward census, percentage of female patients, number of students, and patient/staff ratios; (c) staff-in-setting characteristics, including sex, training, years of experience, and age; and (d) patients-in-setting characteristics, including such things as the percentage of patients who were chronically hospitalized and the percentage of alcoholics. These patients-in-setting characteristics constitute the "suprapersonal" environment, as described by Lawton and Cohen (1975). It was expected that patients admitted to and treated in an environment with a higher proportion of chronic patients, for example, might have different outcomes from those of patients treated in an environment with a higher percentage of acutes or alcoholics. The movement in the early 1960s toward creating unclassified programs (i.e., those treating a mixture of patient groups) represented a major attempt to improve the treatment effectiveness of psychiatric hospitals by rearranging the suprapersonal characteristics of the treatment environment (Ellsworth et al., 1972).

Table 2
Setting Characteristics

Characteristic                                      Source
Structural and physical
  Visitors room, tv room, reading room              Patients' POW
  Use of individual lockers                         Patients' POW
  Special conveniences (washer, refrigerator)       Observations
  Staff offices on the same floor                   Observations
  Size of bathroom                                  Observations
  Dormitories partitioned                           Observations
  % patients in 2-to-4-bed rooms                    Observations
Administrative policy and practice
  % patients female                                 Records
  M daily census                                    Records
  No. patients/doctor                               Records
  No. patients/nurse                                Records
  No. patients/social worker                        Records
  No. patients/psychologist                         Records
  M days of hospital stay                           Records
  No. residents, no. nursing students, etc.         Records
Staff
  Years experience in mental health                 Staff
  % nurses with bachelor's degree                   Staff
  % board-certified psychiatrists                   Staff
  % staff female                                    Staff
Patients
  Time in hospital                                  Patients' POW
  % service connected                               Patients' POW
  Living with relative                              Patients' POW
  Recently drinking                                 Patients' POW
  M age                                             Patients' POW
  % married                                         Patients' POW
  % treated before                                  Patients' POW
  Length of stay                                    Records
  % psychotic                                       Records
  % neurotic                                        Records
  % alcoholic                                       Records
  % other                                           Records

Note. POW = perception of ward.

The 123 treatment characteristics were those program elements directly under the control of the treatment teams. These included (a) medication practices, such as what percentage of patients were on major tranquilizers and what percentage received drug combinations; (b) psychological treatment variables, including hours in psychotherapy and patient participation in discharge planning; (c) ward environment, including those things that could be altered by the staff to improve appearances and comfort, or to encourage interaction, such as furniture groupings and recreation items; (d) notations according to problem-oriented medical record procedures (Weed, 1969); (e) periodic observations of the percentage of patients socially active or passive; and (f) aftercare information regarding the taking of medication and the frequency of aftercare counseling. Although these aftercare variables were assumed to be at least partially under the control of the treatment teams, they were not direct measures of program characteristics. The last group of treatment characteristics were psychosocial and were derived from the Perception of Ward (POW) [...]

[...] differed on 10 of the 14 measures of pretreatment adjustment. Here again, any effect that this would have had on outcome was controlled by residualizing the scores as described earlier. These pretreatment adjustment differences point out the importance of developing residual outcome scores, since the measures of program effectiveness would otherwise have reflected, in part at least, the pretreatment differences of patients admitted to the different programs.

The important question of whether the treatment programs differed from each other in effectiveness is answered in Table 3. A finding that they did not differ would have indicated either that treatment programs were not affected by the differences in program characteristics or that the measures of program outcomes used were not valid. As seen in Table 3, however, the primary measures of treatment effectiveness, the VETS and PARS Global Adjustment scores, reflected significant differences among programs. The programs also differed significantly on the VETS and PARS judged improvement scores and on return rates, but did not differ on Work Earnings or Alienation outcomes or on four of the PARS scores. Overall, the VETS ratings appeared to be more sensitive to outcome differences among programs than did the PARS ratings. This may be partly accounted for by the fact that some programs had only 17 PARS follow-up ratings, whereas all programs met the minimum sample size estimate of 24 VETS follow-up ratings.

Interrelationships Within Program and Outcome Measures

The interrelationships within program and outcome measures provide another estimate of the validity of the measures used in this study. Was there, for example, agreement among patients, staff observations, and records with respect to various program characteristics? Was there agreement in the ratings of patients and significant others with respect to the outcome effectiveness of programs?

An examination of the interrelationships within program characteristics found moderate to high agreement (p < .01) among many of these measures. For example, ward scores on patient-perceived Structured Program correlated .63 with nursing perception of Order and Organization. Staff perception of Autonomy correlated with patient perception of Patient Involvement (r = .54) and with Patient Autonomy (r = .51). Several observer ratings of the physical environment were also related to patient perceptions. For example, ward scores on patient-perceived Private Space correlated .39 with the recorded presence of reading rooms and correlated .41 with the presence of a separate tv room. And finally, patient perception of Medication Amount correlated .56 with data from record reviews on the percentage of major tranquilizers given three times a day.

The intercorrelations among the various measures of program effectiveness also revealed moderate agreement (p < .01) between the different sources of information, namely, patients and others. As seen in Table 4, there was a correlation of .46 between the judged improvement program scores from veteran ratings and the judged improvement scores from others. For the Global outcome scores, there was a correlation of .40 between the residual program scores computed from patients' ratings and those computed from others' ratings. There were higher correlations between the VETS judged improvement and the VETS Global residual scores (r = .74) and between the PARS judged improvement and the PARS Global residual scores (r = .71), but this was expected because these ratings came from the same data source.

Table 4
Intercorrelations Among Treatment Outcome Measures for 115 Program Data Sets

Outcome measure                               2        3        4      Residual return rate
1. Judged improvement on VETS               .46**    .74**    .30**    .19*
2. Judged improvement on PARS                        .58**    .71**    .15
3. Residual Global Adjustment on VETS                         .40**    .13
4. Residual Global Adjustment on PARS                                  .06

Note. VETS = Veterans' Adjustment Scale; PARS = Personal Adjustment and Role Skills Scale.
* p < .05. ** p < .01.

The intercorrelations in Table 4 indicate that if conclusions about program effectiveness were drawn from veterans' ratings alone they would be in moderate agreement with conclusions drawn from others' ratings alone. However, veterans' ratings would indicate some wards to be more effective than would others' ratings and vice versa. Since the two sources of data are not uniformly consistent, it is important to validate the findings from one source with those from the other. A program whose discharged patients demonstrate significant improvement in pre- versus posthospital adjustment by both self-ratings and others' ratings can be designated as effective with a high degree of confidence.

Return rate was an interesting outcome measure. As can be seen in Table 4, the only significant correlation was a small one, r(115) = .19, p < .05, indicating that programs having the highest scores on VETS judged improvement also had somewhat higher return rates. However, none of the other correlations between residualized return rates and adjustment measures were significant. This seems to demonstrate that return rate and community adjustment, as measured by this study, are not necessarily measures of the same outcome, since the ward averages for these two types of outcome did not correlate well with each other.

Relationship Between Program Characteristics and Treatment Effectiveness

It was noted that the primary distinction between setting and treatment characteristics was that the treatment teams had no direct control over the former. For example, ward staff do not determine the staff/patient ratios, the number of small versus large bedrooms, or the percentage of never married patients admitted to their ward. On the other hand, they do influence such things as the level of Blocked Communication between patients and staff and the percentage of patients receiving a major tranquilizer.
Of the 191 program characteristics measured in this study, 68 were considered to be setting and 123 were considered to be treatment characteristics. This large number of program characteristics was reduced to a manageable number prior to analysis by correlating each of the characteristics with each of the measures of program outcome, namely, VETS and PARS Global residuals, VETS and PARS judged improvement, and return rate. These correlations were computed separately for the 58 programs in the original data set and the 57 in the replicate set. This resulted in 10 sets of 191 correlations: 5 outcome measures × 2 data sets. Several program characteristics, such as percentages and ratios, were found to have nonlinear distributions, and log or other types of transformations were applied to these characteristics. Only program characteristics that correlated significantly (p < .05) with the outcome measure under examination were retained as potential predictors of that outcome. For example, 30 program characteristics were found to correlate significantly with the VETS Global residuals for the original data set programs: 21 treatment and 9 setting characteristics.

The next step was to perform a series of best subset regression analyses (Brown & Dixon, 1977) in order to select the combinations of program characteristics that best predicted each measure of program effectiveness. With regard to setting characteristics, a total of 10 best subsets were established, 8 for predicting the original and replicate VETS and PARS Global residuals and VETS and PARS judged improvement, and 2 for predicting the return rates on both the original and replicate program data sets. Ten best subsets of treatment program characteristics were also established in the same way, but are not described in this report because of their possible application in a blind experimental study.

It must be recognized that this procedure for selecting best subsets of predictors of program characteristics yields artificially high multiple correlation coefficients (Rs), largely because some characteristics correlate with a particular program outcome only by chance.
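The correlational screening step described above (retain a characteristic only if it correlates with the outcome at p < .05) can be sketched as follows. This is an illustration under stated assumptions, not the study's procedure: the significance test is a Fisher z normal approximation chosen here because it needs only the standard library, the function names are invented, and the ward-level data are made up.

```python
from math import atanh, sqrt
from statistics import NormalDist


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant series carries no linear association
    return sxy / sqrt(sxx * syy)


def approx_p_value(r, n):
    """Two-tailed p for H0: rho = 0, via the Fisher z approximation (assumption)."""
    if abs(r) >= 1:
        return 0.0
    z = atanh(r) * sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def screen(characteristics, outcome, alpha=0.05):
    """Keep characteristics whose correlation with the outcome has p < alpha."""
    n = len(outcome)
    retained = {}
    for name, values in characteristics.items():
        r = pearson_r(values, outcome)
        if approx_p_value(r, n) < alpha:
            retained[name] = r
    return retained


# Hypothetical ward-level scores for 58 programs (not study data).
outcome = [float(i) for i in range(58)]
characteristics = {
    "tracks_outcome": [2 * i + (i % 3) for i in range(58)],
    "unrelated": [(-1) ** i for i in range(58)],
}
retained = screen(characteristics, outcome)
print(sorted(retained))
```

Under this approximation, with 115 program data sets a correlation of .19 falls below the .05 threshold while .15 does not, which is consistent with the significance flags shown in Table 4. Screening 191 characteristics at p < .05 will still pass some by chance alone, which is why the cross-validation step matters.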
Thus, we decided to estimate the validity of each best subset by predicting other outcomes from these. This was done by using each best subset of characteristics to predict the remaining measures of program effectiveness. For example, five setting program characteristics composed the best subset for predicting VETS residual outcomes for the original 58 wards. These five characteristics were then used to predict the replicate wards' VETS residuals and judged improvement scores and to predict the four PARS outcomes. These outcomes were relatively independent of the residual VETS outcomes on the original set of wards because they came either from other sources (i.e., significant others) or from the alternate data set. Outcomes from the same source on the same data set (i.e., VETS Global residuals and VETS judged improvement) were largely redundant, as seen in Table 4, and redundant outcome scores were not used for estimating the validity of best subsets of program characteristics.

As seen in Table 5, there were four best subsets of setting characteristics that had statistically significant averaged cross-validated Rs. For example, Subset 3, the set of program characteristics that best predicted the VETS residual outcomes on the replicate data set, also predicted the original VETS residuals (R = .521, p < .01), the original VETS judged improvement (R = .442, p < .05), the original PARS judged improvement (R = .544, p < .01), and the replicate PARS judged improvement (R = .437, p < .05). The average cross-validated R for the six outcome measures for Subset 3 was .433 (p < .05 for four predictors).

Table 5
Cross-Validation of Best Subsets of Setting Characteristics

Outcome predicted from best subsets:
1. VETS original residuals
2. VETS original judged improvement
3. VETS replicate residuals
4. VETS replicate judged improvement
5. PARS original residuals
6. PARS original judged improvement
7. PARS replicate residuals
8. PARS replicate judged improvement
Additional headings: M R; No. predictors.

Entries in the order extracted (cell alignment is not preserved in the source):
.349; .437*; .261; .523** .274 .482** .164 .314; .388 .544** .327 .437*; .443*; .235 .490** .263 .413* — — .411 .359; .381* .365; .521** .442*; .459** .428** — .328 .466* .343; — .284; .569** .491*; .409; .454*; .204; .292; .266; .192; .529** .445*; .312 .489** .257 .501** .391 .336; .370 .386; .446*; .329 .179; .243; .513** .508** .348 .470*; .468*

Note. VETS = Veterans' Adjustment Scale; PARS = Personal Adjustment and Role Skills Scale.
* p < .05 for number of predictors and program data sets.
** p < .01 for number of predictors and program data sets.

The same procedures were followed in selecting the best subsets of treatment characteristics for predicting the measures of program outcomes, as mentioned earlier. These cannot be disclosed here, however, because the most promising ones are going to be investigated further in an experimental study, by blind comparison with a set of control variables. The opportunity for testing these findings experimentally would be compromised if either staff or patients knew what they were. Consequently, this report focuses on setting characteristics only.

Best subsets were also considered for predicting return rates. For the treatment characteristics, there were no characteristics that predicted return rates to any significant degree. However, two suprapersonal characteristics did correlate well with both the original and the replicate return rates, namely, patients' self-reports of recent drinking and whether they had recently been hospitalized for psychiatric reasons. Ward scores for the patients' self-reports of recent drinking, derived from the patient POW testing, correlated .59 (p < .01) with return rate for the original 58 wards and .32 (p < .02) for the 57 replicate wards. These correlations suggest that initially eligible study patients admitted to wards in which many patients had recent problems with drinking were more likely to be rehospitalized within 3 months than were those patients admitted to other wards. Ward scores for patients' self-reports of recent psychiatric hospitalization correlated .39 (p < .01) with return rate for the original set of wards and .40 (p < .01) for the replicate wards. These correlations suggest that wards whose admissions had recently been hospitalized would tend to have higher return rates than wards in which most admission patients had not recently been hospitalized.
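
The significance levels quoted for these ward-level correlations can be checked approximately. The sketch below uses Fisher's z transformation with a normal approximation. That choice is an assumption made here, since the article does not say which test was used, and the function name `fisher_z_p_value` is hypothetical. The r and n values are the ones reported in the paragraph above.

```python
import math


def fisher_z_p_value(r, n):
    """Approximate two-tailed p value for H0: rho = 0.

    Uses Fisher's z transform, z = atanh(r) * sqrt(n - 3), and treats z
    as standard normal (an approximation, not the article's stated method).
    """
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))


# (label, reported r, number of wards) taken from the text above.
REPORTED = [
    ("recent drinking, original wards", 0.59, 58),
    ("recent drinking, replicate wards", 0.32, 57),
    ("recent hospitalization, original wards", 0.39, 58),
    ("recent hospitalization, replicate wards", 0.40, 57),
]

if __name__ == "__main__":
    for label, r, n in REPORTED:
        print(f"{label}: r = {r:.2f}, n = {n}, approx. p = {fisher_z_p_value(r, n):.4f}")
```

Under this approximation the replicate drinking correlation (r = .32, n = 57) falls between .01 and .02, and the other three fall below .01, which agrees with the levels reported in the text.
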


Setting Characteristics of Effective Programs

The four subsets of setting characteristics shown in Table 5 were reduced further to one best subset. This was accomplished by first listing all characteristics that appeared in the subsets that had significant averaged cross-validated Rs. There were 11 different setting characteristics that appeared at least once in the four subsets shown in Table 5. Six of these 11 characteristics, however, were found to contribute little toward the prediction of outcome. This was revealed by averaging the t statistics for each characteristic's beta weight used in predicting the six alternative outcomes, where the t statistic tests whether the beta weight is significantly different from zero. Program characteristic predictors that had averaged t-statistic values below 1.00 were dropped (t = 1.00 has a two-tailed significance level of approximately .30 for our samples of wards). Those predictors with t-statistic values above 1.00 were retained, and all five were used in a prediction equation for predicting all eight outcomes. When used in combination, three of the five were found to have beta weight t statistics that averaged above 1.00 and were also consistently good predictors for both the original and replicate set of outcome measures. These three were considered to be the overall best subset of setting characteristics. The same process was applied to the treatment characteristics, resulting in six with averaged beta weight t statistics above 1.00 that were also consistently good predictors for both the original and replicate set of outcome measures.

The overall best subset of setting characteristics was as follows: (a) separate tv room, (b) percentage of ward population hospitalized more than 3 months, and (c) percentage of ward population who were single (never married). The direction of each relationship to program effectiveness may be surprising to many. Wards that performed best on the various outcome measures were characterized by the lack of a separate tv room, by a relatively high percentage of long-stay patients on the ward, and by a rather high percentage of single patients admitted to the ward. The least effective programs, on the other hand, were those that had a separate room for watching tv, a lower percentage of long-stay patients, and a higher percentage of married or once married admissions. The possible reasons for this are considered in the Discussion section.

Table 6 presents the correlations between the three setting characteristics and the various measures of program outcomes. As can be seen, the variables separate tv room and percentage never married were somewhat better predictors of program outcomes than was percentage hospitalized 3+ months. Nevertheless, this last variable, when combined with the other two, improved the R. As can also be seen from Table 6, the ward outcomes scored from the VETS rating were more predictable from these three setting characteristics than were the PARS residual outcomes, and the PARS judged improvement outcomes were more predictable than were the PARS residuals. This suggests that the simple-to-obtain rating of judged improvement from significant others and from veterans is a useful measure of program effectiveness.

The Rs of the combined predictors are given at the bottom of Table 6. The first set of Rs (Set A) is the unadjusted multiple correlation coefficient. These unadjusted Rs are often inflated because of the number of predictors or observations used. The second set of Rs (Set B), therefore, is the set of Rs obtained when the Rs are adjusted for the number of predictors and observations. As can be seen, these Rs drop somewhat from Set A. By considering the adjusted Rs, one concludes that an average of about 15% (.392²) of the variance in the measures of program effectiveness can be accounted for by these three setting characteristics. Although not presented in this article, the six best treatment characteristics accounted for about 30% of the variance in program outcomes when the Rs were adjusted. This suggests that treatment characteristics are more important in accounting for differences in program effectiveness than are setting characteristics.

Table 6
Correlations Between Setting Characteristics and Program Outcomes

Outcome columns: VETS original (Resid., J. imp.), VETS replicate (Resid., J. imp.), PARS original (Resid., J. imp.), PARS replicate (Resid., J. imp.), and M R.

Entries for each row in the order extracted (the final entry in each row is M R):
Separate tv room: -.37**, -.27*, -.12, -.28*, -.27*, -.18, -.24, -.27*, -.25
% population hospitalized 3+ months: .19, .14, .24, .13, .22, .21, .23, .18, .19
% population never married: .35**, .26*, .27*, .39**, .12, .43**, .01, .27*, .26
R: .513**, .392*, .528**, .505**, .352, .537**, .324, .437*, .448
Adjusted R: .472**, .327*, .487**, .462**, .274, .498**, .235, .383*, .392

Note. VETS = Veterans' Adjustment Scale; PARS = Personal Adjustment and Role Skills Scale; Resid. = residual; J. imp. = judged improvement.
* p < .05 for ns of 57 or 58 programs and number of predictors.
** p < .01 for ns of 57 or 58 programs and number of predictors.

To describe the most and least effective wards in terms of these three setting characteristics, the 79 treatment programs were examined with respect to their averaged VETS and PARS judged improvement and residual score outcomes. The 20 programs with the highest combined VETS and PARS outcomes and the 20 wards with the lowest outcomes were identified. Only 15% of the most effective programs had separate tv rooms, whereas 60% of the least effective programs had them (p < .01 for the difference between proportions). The patient population hospitalized beyond 3 months averaged 44% for the most effective wards and 25% for the least effective programs (p < .01), and 33% of the patients admitted to the most effective programs were never married, whereas 22% of those admitted to the least effective programs were never married (p < .01). These findings represent another perspective in the confirmation of the fact that patients who were admitted to and treated on wards without separate tv rooms and on wards having higher proportions of long-stay patients and of single (never married) admissions had better treatment outcomes than patients admitted to contrasting types of wards.

Several setting characteristics that might reasonably be expected to correlate with measures of program effectiveness were found not to do so. For example, the most and least effective programs did not differ from each other in terms of (a) number of patients on the ward, (b) days of treatment for admission patients, (c) staff/patient ratios, (d) whether the psychiatrist was board certified, (e) the percentage of staff nurses with bachelor's degrees, and (f) the number of years the staff had worked in mental health. Thus, size of ward, length of stay for admission patients, staffing ratios, and external measures of staff qualification were found to be unrelated to program effectiveness. Some expected suprapersonal characteristics of the ward's patient population were also not related to program effectiveness. For example, study patients treated on wards that had a high percentage of patients diagnosed as psychotic or alcoholic had treatment outcomes as good as those for patients treated on wards having fewer of these kinds of patients.
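
Two of the calculations in this section can be checked by hand. The sketch below is illustrative only. It assumes the adjusted Rs come from the standard shrinkage formula, adjusted R² = 1 - (1 - R²)(n - 1)/(n - k - 1), which the article does not state but which reproduces the Table 6 values to rounding. It also assumes the comparison of tv-room proportions was a pooled two-proportion z test, with 15% and 60% of 20 programs taken as 3 and 12 programs. The function names are hypothetical.

```python
import math


def adjusted_r(r, n, k):
    """Adjusted multiple R for n observations and k predictors.

    Assumes the standard shrinkage formula; negative adjusted R^2 is
    clamped to zero before taking the square root.
    """
    r2_adj = 1 - (1 - r ** 2) * (n - 1) / (n - k - 1)
    return math.sqrt(max(r2_adj, 0.0))


def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z test; returns (z, two-tailed p)."""
    p1 = x1 / n1
    p2 = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return z, math.erfc(abs(z) / math.sqrt(2))


if __name__ == "__main__":
    # VETS original residuals: R = .513 with 3 predictors and 58 programs.
    print(f"adjusted R = {adjusted_r(0.513, 58, 3):.3f}")
    # Variance accounted for by the mean adjusted R of .392.
    print(f"variance accounted for = {0.392 ** 2:.3f}")
    # Separate tv rooms: 3 of 20 most effective vs. 12 of 20 least effective.
    z, p = two_proportion_z(3, 20, 12, 20)
    print(f"z = {z:.2f}, two-tailed p = {p:.4f}")
```

Under these assumptions, an unadjusted R of .513 shrinks to about .47, the mean adjusted R of .392 corresponds to about 15% of the variance, and the difference between 15% and 60% of 20 programs is significant beyond the .01 level.
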


Effect of Best Settings on Different Patient Subgroups

Also examined was whether or not all patient types responded equally well when treated on the best setting wards, that is, on those wards having a high level of the three setting characteristics. Ward scores for the three setting characteristics were standardized and combined for the 115 program data sets. The 40 programs with the highest and lowest levels of setting characteristics were identified. Eleven patient background characteristics were examined, including age, race, education, income, previous treatment and hospitalization, length of difficulty, employment, marital status, diagnosis, and size of city of residence. Two-way analyses of variance were computed that examined the PARS and VETS outcomes for different subgroups of patients within each background characteristic (e.g., under age 25, 26-35, etc.), the outcomes of patients treated under high and low setting conditions, and the interaction between patient characteristics and level of setting conditions.

In examining patient subgroup response to high and low setting conditions, four statistically significant interaction effects were noted. The subgroups of patients not especially responsive to the best setting programs were those who (a) had been treated before but over a year ago, (b) had never married, (c) were diagnosed as psychotic, and (d) came from towns with populations under 5,000. These patient subgroups had only slightly better outcomes if treated in the best setting programs. On the other hand, patients showing unusually good response to the best setting programs were those who (a) had no previous psychiatric treatment, (b) were married, (c) had a diagnosis of alcohol abuse, and (d) came from cities between 15,000 and 100,000 in population. It may be that patients who are relatively poor responders to these setting conditions would respond better to other program conditions. Or it may be that their treatment response is not strongly affected by any program characteristics. A subsequent analysis is planned to determine whether there are other kinds of treatment environments that are more helpful to these poor-responder patient subtypes.

Discussion

Although several promising findings have emerged from this massive study and more are anticipated as the data are analyzed further, the findings described so far have also raised questions. How, for example, can one account for the finding that only a few of the many program characteristics measured were related to treatment outcome? How does one interpret the findings?

On first consideration, the three setting characteristics that were predictive of program effectiveness seem surprising. Consider first the variable of separate tv room. This appears to be a marker item representing other aspects of the treatment environment. For example, the presence or absence of a separate tv room was correlated (p < .01) with (a) presence of conveniently located vending machines (r = .23), (b) staff offices on the same floor (r = .27), (c) a separate reading room (r = .37), and (d) number of patients on the ward (r = .29). This cluster of items suggests a physical structure designed to provide adequate space, and such an arrangement tended to occur on the larger wards. The four variables related to separate tv room were themselves not significantly correlated with outcome, but they help to identify aspects of the environment that characterize wards with tv rooms. The size of the correlation between the separate tv room characteristic and the measures of program effectiveness, plus caution in attributing causality to correlational findings, indicates that separate tv rooms should not necessarily be abolished. The findings suggest that watching tv may serve as a substitute for more therapeutic interactions, and a separate tv room may encourage withdrawal from ward activities.

The variable percentage hospitalized 3+ months serves as a marker item for the observed presence (p < .01) of (a) current magazines (r = .31), (b) newspapers (r = .25), (c) pictures on the dayroom walls (r = .26), (d) doors on toilet stalls (r = .33), and (e) music during meals (r = .25). This cluster of items suggests attempts by the staff to make the physical environment pleasant and more livable and indicates the kind of environment that may be found on wards treating a mixture of newly admitted and long-stay patients. The percentage of patients on the wards for 3 or more months also correlated (p < .01) with (f) fewer staff per patient (r = -.52), (g) fewer nursing students (r = -.23), (h) a higher percentage of whites (r = -.42), and (i) the average number of hospital days for study patients admitted to the ward (r = .29). The meaning of the percentage of whites is not clear. But the fewer staff per patient, the absence of nursing students, and the slightly longer treatment times are consistent with the kind of environment that treats more chronically hospitalized patients as well as new admissions. Longer treatment times for study patients should not be construed as being characteristic of the more effective programs, because the correlation between ward treatment times and PARS and VETS outcomes averaged only .09. Rather, teams that treat both newly admitted and long-stay patients appear to take slightly more time in working with their newly admitted patients.

The relations between the chronicity of ward patient populations and measures of treatment effectiveness with newly admitted patients clearly support the rationale used in introducing the "unclassified unit system" into psychiatric treatment in the early 1960s (Duran, 1964; Ellsworth & Stokes, 1963; Garcia, 1960; Jackson & Smith, 1961; Ortega, 1958; Schulberg & Baker, 1969). Recall that the criteria of program effectiveness in this study were ratings of the posthospital functioning of patients hospitalized less than 6 months.
Evidence for the effectiveness of unclassified wards can also be found in the work of Fairweather (1964), who reported that a mixture of acute and chronic, active and passive patients increased the effectiveness of his treatment method. Currently, there appears to be a movement away from the unclassified unit system (Ellsworth et al., 1972) as more specialized wards are set up to treat designated categories of patients (first admissions, depressive disorders, long-stay patients, etc.). This drift back toward the classified system may well result in an increasing number of wards filled mostly with chronically hospitalized patients. History could repeat itself if these wards again generate the self-fulfilling prophecy characteristic of the "back wards" in the 1950s, namely, that "chronics" can't improve much or attain independent living outside a mental hospital.

The variable percentage never married correlated well (p < .01) with several suprapersonal measures of the treatment environment. Wards that admitted a high percentage of single patients also had more patients who (a) were diagnosed as psychotic (r = .61), (b) had been treated before (r = .53), (c) had been hospitalized before (r = .50), (d) were younger (r = -.72 with average age for the ward), (e) were better educated (r = .46), (f) were employed prior to admission (r = .39), and (g) had not lived with a relative prior to admission (r = -.30). These variables suggest settings in which a mixture of patients were treated: some with long-standing problems (Variables a, b, and c) and some who were younger, better educated, and employed. For older patients, the status of never married is often indicative of the marginal psychosocial adjustment characteristically found in the process type of schizophrenic patient. Single status for younger patients does not necessarily indicate marginal adjustment, and more of these patients were found to be better educated and recently employed. Again, this mixture of acute and chronic patients identified by the marker term never married appears to characterize the more effective programs and lends additional support to the effectiveness of the unclassified unit system for treating newly admitted patients.
The question of why only 3 setting and 6 treatment characteristics were related to program effectiveness also requires consideration. Out of a possible 191 characteristics, only 3 setting and 6 treatment characteristics withstood the tests of statistical significance. Obviously, many that have been reported in other studies as contributing to treatment outcome did not show up here. First, some of the 191 characteristics turned out to be redundant measures. For example, the patients' number of children was a good predictor of some program outcomes, but the underlying variable was already measured by the percentage never married. Another major factor was the use of a cross-validation design. Without this, a great many more variables would have been reported erroneously as contributing significantly to outcome. On the other hand, there is still the possibility that other program characteristics will emerge as significant in the treatment of specific kinds of patients. Data pertaining to the secondary purpose of the study are presently being analyzed to determine whether different kinds of patients respond differentially to treatment programs with other characteristics.

Evaluating Outcomes of Multiple Programs

This study found that psychiatric programs differed in treatment effectiveness, but not on all criteria. The ratings from patients appeared to provide more sensitive measures of program differences than the ratings by significant others. Part of this may have been due to the somewhat lower number of PARS ratings used to estimate program effectiveness. But that the ratings from veterans differentiated more clearly among wards was surprising because patients have traditionally not been considered to be accurate reporters of their adjustment and functioning. Also surprising was the finding that the simple-to-obtain measure of judged improvement performed as well as the more elaborate measures of program effectiveness, as scored from pre- and postratings by veterans and others. And finally, this study found no evidence that ward return rates, traditionally used measures of program effectiveness, were related to the measures of community adjustment obtained from veterans and others. In fact, there was a slight tendency for programs whose patients were adjusting better after discharge also to have slightly higher return rates, as seen in Table 4.

Although the wards differed significantly on many of the outcome measures of program effectiveness, some of these differences reached only a minimal level of statistical significance (p < .05), and several measures indicated no differences. This was surprising because it would not have taken large observed differences to reach statistical significance given the sample size of programs and patients. It is commonly assumed that programs do differ in outcome effectiveness, but it is less commonly known that these differences are not usually very large. Furthermore, there was not much stability in the measures of outcomes over time, since for the 36 high discharge programs, the correlations between Time 1 and Time 2 averaged only .15. Thus, a ward that was more treatment effective at Time 1 was not necessarily so at Time 2. Although measures of treatment effectiveness do not differ sharply among programs and do not remain constant over time, the level of observed program effectiveness is apparently related to certain program characteristics, as indicated by this study.

Those involved in program evaluation might also find interesting the finding of only moderate agreement (rs in the .40s) in the measures of ward effectiveness, as scored from different data sources (veterans vs. others). Thus, one cannot generalize with a high degree of certainty from one data source to another. Strupp and Hadley (1977) have pointed out that patients and families represent two rather different perspectives, namely, observations versus subjective feelings. It would seem safer to regard a program that performed well from both perspectives as more effective than one that performed well on only one. Some programs may in fact produce substantial changes in observable behavior but not in subjective feelings. In such a situation, one would need to consider the relative merit of one perspective as compared with another.

Data loss in follow-up ratings was a problem in this study, as in others, and the data loss from significant others was bigger than that from patients. The reason for this is that data loss from others can occur on two occasions, at the initial and follow-up mailings. A special study of data loss, which probed the pool of nonresponders to the mailed questionnaire, found no evidence that data loss was related to patient adjustment. Thus, the relative effectiveness of programs can be estimated with validity even though some programs have somewhat more data loss than others. It must be recognized that only the relative effectiveness of programs can be estimated when data loss is high, because the absolute level of patient outcomes probably cannot be estimated accurately when there are high rates of data loss.

A certain amount of caution should be exercised in considering the findings and conclusions of this study. First, the findings apply only to cooperative and capable male veterans treated in Veterans Administration medical centers, and some findings may not be generalizable to other types of psychiatric hospitals. Second, statistical means were used to control the effects of patient background and initial adjustment on the outcome measures used in this study. Although statistical controls were essential to avoid confounding the analyses and reports of results, they are not as acceptable as experimentally controlled studies.

On the other hand, the advantages of the present study are many. It is the largest study of psychiatric treatment settings ever undertaken, both with respect to the number of program characteristics examined and to the sample size of patients and programs. A serious effort was made to control for the effects of patient input characteristics on the outcome measures used. The outcome measures were highly relevant because they measured community rather than hospital adjustment and were not limited to return rates typical of many previous studies. Finally, there was a serious attempt to cross-validate the selection of program characteristics by using them to predict program effectiveness from different perspectives (veterans and relatives) and to predict both original and replicate data set outcomes. As such, the findings from this study can be accepted with considerably more confidence than can those of earlier studies.

Reference Note

1. Ellsworth, R. B. Characteristics of psychiatric programs and their relationship to treatment effectiveness (Veterans Administration cooperative study proposal). Salem, Va.: Veterans Administration Hospital, 1975.

References

Brown, M. B., & Dixon, W. J. (Eds.). BMDP-77: Biomedical computer programs (P Series). Berkeley: University of California Press, 1977.
Duran, F. A. The unit system—Its effects upon nursing. Journal of Psychiatric Nursing, 1964, 2, S32-S39.
Edelson, R. I., & Paul, G. L. Some problems in the use of "attitude" and "atmosphere" scores as indicators of staff effectiveness in institutional treatment. Journal of Nervous and Mental Disease, 1976, 162, 248-257.
Ellsworth, R. B. Consumer feedback in measuring the effectiveness of mental health programs. In M. Guttentag & E. L. Struening (Eds.), Handbook of evaluation research (Vol. 2). Beverly Hills, Calif.: Sage, 1975.
Ellsworth, R. B. Characteristics of effective treatment settings: A research review. In J. A. Gunderson, O. A. Will, Jr., & L. R. Mosher (Eds.), The principles and practices of milieu therapy. New York: Aronson, 1979.
Ellsworth, R. B. Does follow-up loss reflect poor outcome? Evaluation and the Health Professions, in press.
Ellsworth, R. B., Dickman, H. R., & Maroney, R. J. Characteristics of productive and unproductive unit systems in VA psychiatric hospitals. Hospital and Community Psychiatry, 1972, 23, 261-271.
Ellsworth, R. B., Finnell, K. C., & Leuthold, C. Community treatment for young psychiatric patients: A case study in program evaluation. Evaluation and the Health Professions, 1978, 1, 66-80.
Ellsworth, R. B., Foster, L., Childers, B., Arthur, G., & Kroecker, D. Hospital and community adjustment as perceived by psychiatric patients, their families, and staff. Journal of Consulting and Clinical Psychology Monograph, 1968, 32(5, Pt. 2).
Ellsworth, R. B., Maroney, R., Klett, W., Gordon, H., & Gunn, R. Milieu characteristics of successful psychiatric treatment programs. American Journal of Orthopsychiatry, 1971, 41, 427-441.
Ellsworth, R. B., & Stokes, H. A. Staff attitudes and patient release. Psychiatric Studies & Projects, 1963, 1, No. 7.
Erickson, R. C. Outcome studies in mental hospitals: A review. Psychological Bulletin, 1975, 82, 519-540.
Erickson, R., & Paige, A. Fallacies in using length-of-stay and return rates as measures of success. Hospital and Community Psychiatry, 1973, 24, 559-561.
Fairweather, G. W. (Ed.). Social psychology in treating mental illness: An experimental approach. New York: Wiley, 1964.
Garcia, L. B. The Clarinda Plan: An ecological approach to hospital organization. Mental Hospitals, 1960, 11, 30-31.
Hargreaves, W. A., Attkisson, C. C., & Sorensen, J. E. Resource materials for community mental health evaluation (DHEW Publication No. ADM 77-328). Washington, D.C.: U.S. Government Printing Office, 1977.
Jackson, G. W., & Smith, F. V. The Kansas Plan. Mental Hospitals, 1961, 12, 5-8.
Lawton, M. P., & Cohen, J. Organizational studies of mental hospitals. In M. Guttentag & E. L. Struening (Eds.), Handbook of evaluation research (Vol. 2). Beverly Hills, Calif.: Sage, 1975.
Mace, A. E. Sample size determination. New York: Reinhold, 1964.
Moos, R. H. Evaluating treatment environments: A social ecological approach. New York: Wiley, 1974.
Ortega, M. J. A service-centered plan for a therapeutic community. Mental Hospitals, 1958, 9, 5-9.
Paul, G. L., & Lentz, R. J. Psychosocial treatment of chronic mental patients: Milieu vs. social learning programs. Cambridge, Mass.: Harvard University Press, 1977.
Schulberg, H. C., & Baker, F. Unitization: Decentralizing the mental hospitalopolis. International Journal of Psychiatry, 1969, 7, 213-223.
Stanton, A. H., & Schwartz, M. S. The mental hospital. New York: Basic Books, 1954.
Straus, A., Schatzman, L., Buchen, R., Ehrlich, D., & Sapshin, M. Psychiatric ideologies and institutions. New York: Free Press of Glencoe, 1964.
Strupp, H. H., & Hadley, S. W. A tripartite model of mental health and therapeutic outcomes. American Psychologist, 1977, 32, 187-196.
Ullmann, L. P. Institution and outcome. New York: Pergamon Press, 1967.
Van Putten, T. Milieu therapy: Contraindications? Archives of General Psychiatry, 1973, 29, 640-643.
Weed, L. L. Medical records, medical education, and patient care. Cleveland, Ohio: Press of Case Western Reserve University, 1969.
White, N. F. The descent of milieu therapy. Canadian Psychiatric Association Journal, 1972, 14, 41-58.

Received April 16, 1979
