Comparisons Among Quality Assurance Systems: From Outcome Assessment to Clinical Utility

Copyright 2001 by the American Psychological Association, Inc. 0022-006X/01/$5.00 DOI: 10.1037//0022-006X.69.2.197

Journal of Consulting and Clinical Psychology 2001, Vol. 69, No. 2, 197-204

Larry E. Beutler

This document is copyrighted by the American Psychological Association or one of its allied publishers. This article is intended solely for the personal use of the individual user and is not to be disseminated broadly.

University of California, Santa Barbara

This special section describes contemporary systems for assessing the quality and effectiveness of service delivery. These systems have in common their commitment to the belief that by continuously monitoring treatment-related change, identifying problem cases, and providing feedback to clinicians or agencies regarding patient progress, the benefits of treatment may be increased. Beyond these commonalities, much can be learned from the varying ways in which these systems gather information and provide feedback to clinicians or health care managers. The methods vary both as a function of the sociopolitical climate of the country in which they were developed and of the personal preferences and assumptions of the developers. An articulation of these differences can be of interest to health care managers and to psychotherapy researchers.

The Editor of the special section asked me to comment on the commonalities and differences among the quality assurance (QA), tracking, and assessment systems that have been presented in this special section and to compare these systems with a system that has been developed in my research laboratory. This latter system is based on the initial work of Beutler and Clarkin (1990) and is called systematic treatment selection (STS; Beutler & Williams, 1995; Fisher, Beutler, & Williams, 1999). STS is conceptualized in a significantly different way than the systems presented here: Although it includes many features of the other systems, its principal focus is to preemptively suggest treatments that will maximize patient benefit rather than to recommend changes only after negative effects appear. Thus, it serves as a useful method to which the others can be contrasted. Its conceptualization and design also illustrate some emerging developments and challenges in the field of clinical assessment, generally, and of the application of clinical research to practice, more specifically. This brief presentation addresses several issues that face those of us who dare enter the synaptic space between research and practice; it also raises some issues that are just beginning to emerge as we begin to apply computer technology to the task of improving clinical treatment. It raises these issues, however, in the hope that doing so will serve as a reinforcement both to the authors who have presented their work in this special section and to the collaborating managed care communities who have worked with them. It also is designed to be a challenge to these managed care programs to go even further than they have. 
To anticipate my conclusions, I find the systems presented in this special section to be, at once, exciting, on the cutting edge of health care, and unnecessarily limited in both the knowledge used and the technology transferred from the research laboratory to the real world of clinical practice.

The Sociopolitical Nature of QA

The articles in this special section illustrate very nicely how political forces have converged to force researchers to address the clinical utility of their work. Throughout the western world, health care costs have risen dramatically, and this has provoked political bodies to mandate methods of reducing these costs. Though sometimes belated and almost always after a circuitous route of pandering to special interests, false starts, and mistakes, many politicians have turned to research scientists to find ways to increase the efficacy and effectiveness of mental health treatments. Although western countries share both the need to reduce costs and the hope of finding answers in research science, the particular political forces at work in different countries have affected the nature of the resulting procedures. As noted by Kordy et al. (2001), for example, the Bismarck law placed the responsibility for maintaining the health of its citizens on the German government. This system emphasizes parity between medical and mental health care, and because the provider corporations are separated by law from the national health insurance program, decisions about treatment are relatively protected from the politics of funding. Thus, long-term therapy is widely supported in a way that starkly contrasts with contemporary programs in the United States. Although the German government mandates and financially supports health coverage for all citizens, the providers themselves set the policies and define how, how much, and what services will be provided. Only the provider corporations have access to the knowledge of who treated whom, with what results, and over what period. Thus, both quality control and the definition of services fall indirectly to practitioners.
There is a distinct appeal to this system, but it should be remembered that lack of fiscal responsibility for treatment, on the part of the provider, has often been a recipe for abuses of the system by providers (e.g., excessively long and redundant care) in the United States. On the positive side, only a small percentage of patients are privately insured in Germany. Thus, those who seek to assess the quality of services can reach most patients who seek mental health services by working with a small number of regional provider corporations, all of which operate according to similar rules. Therefore, those who assess the clinical utility of treatment have access to large and representative samples of patients through a relatively well-integrated network of providers, all of whom are playing by similar rules. Within this value system, a QA program that requires completion of a long battery of tests and measures, such as that advocated by Kordy et al. (2001), is more acceptable than it would be in the United States, where more short-term (some would say "short-sighted") gains are valued.

Work on this article was supported by Grant R01-DA09395. Correspondence concerning this article should be addressed to Larry E. Beutler, Counseling/Clinical/School Psychology Program, Department of Education, University of California, Santa Barbara, California 93106.

Like Germany, the United Kingdom maintains a centralized system of health care, but like the United States, the political environment favors short-term treatments (Roth & Fonagy, 1996). The particular relationships between the National Health Service and the providers in the United Kingdom, however, apparently are such that features like an assessment system with a projection of the normatively expected course of change become a contentious issue (Barkham et al., 2001). Perhaps because the government is so closely involved in both assessing the quality of the services provided and making decisions about coverage, clinicians in the United Kingdom are fearful that if their services fail to bring the patient into accord with the projected course of change, the government will intrude and micromanage the clinicians' practices (Barkham et al., 2001). The centralized, government-supported health care management systems that characterize the United Kingdom and Germany have certain advantages over the systems of care available in the United States. The U.S. system is characterized by multiple, competing, and usually contradictory health care management corporations with which one must negotiate in order to study quality of care among large groups of patients. The systems of quality management used in the United States, such as those represented by Lambert, Hansen, and Finch (2001) and Lueger et al. (2001), must be adaptable to many different sets of procedures, expectations, demands, and objectives.

These differences between and among countries result not only in differences in the nature of the methods used to measure the effects of care but also in differences in the nature of the care itself. For example, in the United States quality must be evaluated over a much shorter time frame than in Europe, especially in comparison with Germany. Concomitantly, the treatments that are thought to be empirically supported are likely to be more highly structured, more rigidly specified (e.g., manualized), and more varied than they are in these other countries. Compare, for example, the recommendations of effective treatments made by Roth and Fonagy (1996) in the United Kingdom with those provided by Nathan and Gorman (1998) in the United States. The former recommendations are strikingly more flexible and integrative.

The sociopolitical contexts in which the presented systems occur have led to considerable variability in how they are structured, delivered, and used. Some (e.g., Clinical Outcomes in Routine Evaluation—Outcome Measure [CORE-OM], Barkham et al., 2001) are in the public domain, others (AKQUASI, Kordy et al., 2001) are supported by governmental agencies, and still others (Outcome Questionnaire—45 [OQ-45], Lambert et al., 2001; COMPASS, Lueger et al., 2001) are offered for a fee to practitioners or patients. Likewise, they differ in length, format, comprehensiveness, and a variety of other ways. None is suitable for children, and most are untested among nonambulatory patients. These differences may make them appealing and differentially useful to different populations. These differences also dictate the degree to which they provide useful data from a research perspective.

A Comparative Analysis

There are significant differences in the ways that the instruments and systems presented in this series are conceptualized and in the values that these systems implicitly embody. Note, for example, that both the United Kingdom and the German systems are flexible in their methods of assessment, allowing clinicians to exercise options that increase the specificity and breadth of problems and populations for which the methods are suited. Thus, these European systems allow for both inpatient and outpatient use and provide a multitude of feedback procedures. In contrast, the U.S. systems use a single, unvarying (inflexible) assessment system, are most applicable to outpatient settings, and provide a limited variety of feedback options. At the same time, all of the systems operate on a tacit model that focuses on error correction rather than on prospective planning and adapting treatments to patients and settings. These assumptions, including both the similarities and the differences among systems, have a significant impact on their usefulness both to the health care system and to research scientists. Including the differences mentioned, each of the presentations in this special section has identified strengths and weaknesses of its own or other systems. Some features of all of these systems are similar, suggesting a common value system, whereas others are very distinctive, bringing into high relief the nature of specific underlying assumptions.
These confessions and observations by the authors provide the basis to compare the systems on eight easily identifiable dimensions: (a) the source of information, (b) the length of assessment, (c) the degree of flexibility in selecting the assessment procedures, (d) the breadth of population to which the system applies, (e) the variation in the use of a projection of the course of change, (f) the methods of providing feedback to clinicians and care-giving institutions, (g) the indices of reliable change, and (h) the use of a marker or signal of risk. There are other qualities, such as cost and method of recruiting participants, that also distinguish the systems, but these factors are largely incidental to the structure of the systems themselves. For this comparison, I restrict my discussion to the eight formal aspects of the various systems identified above.

Source of Assessment

The four methods presented rely heavily or totally on patient self-report. The COMPASS includes four items that are completed by the therapist, and the OQ-45 is experimenting with a method that includes clinician and parental reports for children. However, the most dramatic exception to the general reliance on patients' reports is the Stuttgart-Heidelberg (Kordy et al., 2001) method, which includes a variable number of instruments that are completed by outside observers, usually the clinician. Maintaining source variability is an advantage in view of the observation that various sources of rating are poorly correlated (Beutler & Hamblin, 1986; Garfield, Prager, & Bergin, 1971; Strupp, Horowitz, & Lambert, 1997).


SPECIAL SECTION: COMPARING QA SYSTEMS

Reliance on self-report has other limitations as well. Lueger et al. (2001), for example, point out that self-report measures assume that the pathology itself does not distort the responses. This assumption requires a great deal of faith. It is likely that some of the same factors that lead patients to be noncompliant in completing these instruments also contribute to low rates of change (e.g., low investment and poor motivation). Self-report measures are highly affected by dropout and compliance rates. Note, for example, that Barkham et al. (2001), in analyses related to their studies, found that among patients in secondary care, only 9% (224 of 2,507) completed a second assessment. Other samples were more compliant, but with a good deal of variability. In what is probably quite representative of such studies, Kordy et al. (2001) found that 17% did not provide outcome data, and an additional 4% provided unusable data. Conclusions about the effects of treatment based on 9% to 80% of those who receive it pose a significant risk to external validity. This, however, is the nature of self-reports. Although self-reports make sense from a clinical perspective (clinicians dislike adding time to their own schedules, and most assessments are based on patient self-report), methods that are less directly affected by patient motivation and benefit are likely to provide a sounder basis for research that transfers to clinical decision making. Although clinicians' ratings are likely to have their own biases, especially if reimbursement is tied to effectiveness, they would likely confound patient factors and outcomes less than self-reports do, even if they do not produce higher rates of compliance.

Length of Assessment

Brevity of a system, especially if it relies on self-report measures, probably enhances the likelihood of patient cooperation. Of the systems presented, the shortest is the CORE-OM (Barkham et al., 2001), which consists of 34 items yielding four subscales. This instrument is short but can be made even shorter by the use of one of two 18-item short forms. It is logical to assume that short forms may be conducive to securing higher rates of patient cooperation, and thus they may be especially useful for conducting sequential evaluations over time. The use of alternate forms, by the same token, may reduce the tempering effects of practice. Alternatively, the briefer the assessment, the more likely one is to sacrifice sensitivity and specificity of measurement. The Stuttgart-Heidelberg system (Kordy et al., 2001) is the longest assessment system of the four and, depending on the clinician's selection of measures, possesses good reliability and sensitivity. It relies on an extensive but variable battery that provides rich information, but it does so at the expense of both patient and clinician time and effort. As noted earlier, it would probably be difficult to ensure the needed level of cooperation by patients and therapists in the United States.

Another downside of brevity is that the number of domains assessed is limited. The OQ-45 (Lambert et al., 2001) is nearly as short as the CORE-OM system but provides only three subscales on which outcomes are assessed. This necessarily limits the amount of information available to the clinician and on which a decision to alter treatment is made. To a lesser extent, this concern is also present with the COMPASS (Lueger et al., 2001). Although the latter is a longer instrument than the OQ-45 (78 items on first administration, and 54 on subsequent administrations), in its original form it embodies not only three subscales but a general composite measure as well. This latter measure may be expected to be more stable than the individual measures from which it is composed. Because it led the way into this field and has accumulated a very large database (now over 40,000 cases), the COMPASS has a much richer research history to support the use of its scales for clinician profiling, predicting change, and modifying treatment than any of the other systems. It remains the standard in the field, and other systems must prove themselves against it.
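The expectation that a composite is more stable than the subscales it aggregates follows from simple error aggregation: averaging k subscales with comparable and partly independent error shrinks the standard error roughly by the square root of k. The following minimal sketch, with invented numbers, illustrates the point; it is not a description of the COMPASS scoring procedure.

```python
import math

def composite_se(subscale_se, k):
    """Approximate standard error of a composite that averages k
    subscales with equal, independent error. Real subscale errors are
    partly correlated, so the true reduction is smaller than this."""
    return subscale_se / math.sqrt(k)

# A composite of three subscales, each with a standard error of 4.0
# points, has a noticeably smaller standard error.
print(composite_se(4.0, 3))  # ≈ 2.31
```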

Flexibility of Selecting Assessment Procedures

Flexibility is defined as the degree to which the assessment procedure can be adapted either to the needs of the individual patient or to the preferences of the assessing clinician. The most flexible system is the Stuttgart-Heidelberg (Kordy et al., 2001) program, which allows clinicians to select from a wide variety of instruments reflecting various levels of specificity for different problems. An advantage of this procedure is that clinicians can select instruments with which they are familiar and that they have previously found to be useful. The CORE-OM also provides some flexibility of assessment procedures, both by virtue of the availability of three forms (the long form and two short forms) and by the use of population-specific extensions. These extensions are composed of questions that allow more depth in the assessment of certain problems of the clinician's choice. They also can be patterned to fit the needs of the population seen in a given setting. Each of the other systems relies on a single instrument that was developed specifically for the purpose of tracking outcomes. It is noteworthy, however, that Lambert et al. (2001) are experimenting with a form of the OQ-45 for children that can variously be applied by parents, clinicians, or children themselves. This addition should increase the flexibility of the instrument. Generally, however, the advantage of using the same instrument for multiple purposes is the clarity and simplicity of interpretation and administration, but the cost is a loss of flexibility and population-specific relevance.

Breadth of Population Applicability

A related issue is the breadth of the population to which the procedure can be applied. Breadth is limited, in all these systems, by the self-report format. In virtually all cases, this makes the procedures inaccessible or inappropriate for children. Their use is limited to adults or, perhaps, to adolescents who are able to read at a suitable level and who have sufficient functional ability to attend and comply with the requirements of a written examination. They are also limited by the patient's level of insight, cooperation, and availability. However, within these constraints, the AKQUASI (Kordy et al., 2001) and CORE-OM (Barkham et al., 2001) systems are designed to be adaptable to patients who vary widely in level of impairment, and at least the former, conceptually, could be extended and adapted to the use of checklists for children. Both of these instruments can be applied to inpatient as well as outpatient samples and to those who have serious as well as transitory conditions. The OQ-45 (Lambert et al., 2001) and COMPASS (Lueger et al., 2001) systems are largely applicable to outpatient samples, but the OQ-45 is being adapted for use with seriously impaired patients. The COMPASS, however, computes projections on a sample that is restricted to those who initially score within the pathological range, and the OQ-45 is apparently not very sensitive to changes among mildly disturbed patients (Smart, 1998). Yet we do not know whether the other instruments are any more sensitive to change in mildly disturbed groups, because these figures have not been computed.


Projecting the Course of Treatment

An important feature of a QA system is its ability to identify when treatment is not working. Because of severity of impairment and social deficits, some patients may change very slowly or not at all, even with the best of treatments. Without treatment they may deteriorate, and with treatment they make few gains. Treatment may be oriented toward maintenance rather than toward improvement in these cases. However, unless we can identify who these patients are and distinguish them from those who are receiving insufficient or ineffective treatment, concluding that the treatment is not working would be inappropriate. Thus, being able to project a course of change for a given patient on the basis of an algorithm that is drawn from a group of similar patients is a critical feature of a good QA instrument. All of these instruments, except the CORE-OM, have applied some form of growth curve analysis to plotting the projected courses of change with which the actual course of a patient's progress can then be compared. Comparison of patient progress with a statistically derived projection is one of the usual bases from which the various systems flag patients who are doing poorly. This is then used to initiate recommendations for changing treatment. However, Barkham et al. (2001) observed that clinicians in the United Kingdom are resistant to these projections, apparently fearing that the federal insurance managers will usurp clinical prerogative if the information is provided. The other systems, at least potentially, allow for a multidimensional projection of change, ranging from three dimensions (OQ-45) to a virtually endless array (Stuttgart-Heidelberg; Kordy et al., 2001). The number of dimensions and scales yielded by a system is important because it limits the degree to which the complexity of problems presented by patients, and the varied rates of change expected of these varied problems, can be identified.
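The projection-and-flagging logic common to these systems can be sketched in a few lines. Everything below is an illustrative assumption: the archival data points, the log-linear form of the expected-change curve, and the fixed tolerance band are invented for the example and do not reproduce the growth-curve models actually used by any of the systems reviewed.

```python
import math

# Hypothetical archival data: (session number, mean symptom score of
# similar patients); lower scores are better. A real system would fit
# its curve to thousands of cases, not five summary points.
ARCHIVE = [(1, 30.0), (2, 26.0), (4, 22.0), (8, 18.0), (16, 14.0)]

def fit_log_linear(data):
    """Least-squares fit of score = a + b * ln(session)."""
    xs = [math.log(s) for s, _ in data]
    ys = [y for _, y in data]
    n = len(data)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return a, b

def flag_patient(a, b, session, observed, tolerance=3.0):
    """Flag a patient whose observed score is worse (higher) than the
    projected course by more than `tolerance` points."""
    expected = a + b * math.log(session)
    return observed > expected + tolerance

a, b = fit_log_linear(ARCHIVE)
# A patient still scoring 29 at session 8 is well off the expected
# track (the projection at session 8 is 18) and would be flagged.
print(flag_patient(a, b, session=8, observed=29.0))  # prints True
```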
It is important to note, however, that some advantage accrues to the COMPASS system (Lueger et al., 2001) because of its use of an empirically defined phase model to predict change. That is, expected course is not represented simply by a line on a graph but by a pattern and predictive sequence of changes. The ability of the instrument to determine whether patients are progressing in accordance with the expected and necessary sequence of changes potentially allows for early identification of those who need a change of treatment and even provides guidance as to what type of treatment might be initiated.

Providing Feedback to Clinicians

All four of the systems are designed to provide feedback to administrators or to clinicians. The Stuttgart-Heidelberg (Kordy et al., 2001) and CORE-OM (Barkham et al., 2001) systems particularly address the differences between feedback to an institution and feedback to an individual clinician. These distinctions reflect the nature of the federally based health care systems that they are designed to serve. However, the OQ-45 (Lambert et al., 2001) and the COMPASS (Lueger et al., 2001) systems are easily applicable to these multiple purposes, and the latter instrument has been adapted to provide clinician profiles for the use of health care systems as well (Sperry, Brill, Howard, & Grissom, 1996). The richness of the feedback provided differs substantially among the various systems, however. The Stuttgart-Heidelberg system provides the richest feedback, including report cards, graphs, and charts. The COMPASS and OQ-45 also provide good feedback, with graphs that include an index of failure of treatment. The OQ-45 also provides color codes to identify categorical groups of individuals who vary in how closely they are following the projected course of change. Lambert et al. (2001) have directly tested the effects of providing feedback, and their work points to some important directions in how this can improve patient outcome. The type of feedback provided by the OQ-45 seems to help clinicians adapt to the patients who are identified as getting worse, but it has less of an effect on those who are simply not making the expected rate of change (Lambert et al., 2001).
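The phase-model idea, that changes should arrive in a predictable sequence rather than along a single curve, can be sketched as an ordering check. The three domains below echo the familiar three-phase sequence often associated with Howard and colleagues (subjective well-being improving before symptoms, and symptoms before life functioning); the function itself is an illustrative assumption, not COMPASS's published algorithm.

```python
# Domains in the order the phase model expects them to improve.
EXPECTED_ORDER = ["well_being", "symptoms", "functioning"]

def follows_phase_sequence(first_improved):
    """`first_improved` maps each domain to the session at which
    reliable improvement first appeared (None if not yet observed).
    Returns True when the improvements seen so far respect the
    expected ordering."""
    last = float("-inf")
    for i, domain in enumerate(EXPECTED_ORDER):
        session = first_improved.get(domain)
        if session is None:
            # If this phase has not improved, no later phase should have.
            return all(first_improved.get(d) is None
                       for d in EXPECTED_ORDER[i:])
        if session < last:
            return False  # a later phase improved before an earlier one
        last = session
    return True

# On track: well-being first, then symptoms, then functioning.
print(follows_phase_sequence(
    {"well_being": 3, "symptoms": 8, "functioning": 15}))  # prints True
```

A patient whose symptoms improve while well-being has not would be flagged by this check as off the expected sequence, which is the kind of early signal the phase model makes possible.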

Index of Change

All instruments report an index that allows the clinician or health care manager to identify the degree to which the change made is clinically significant. That is, they attempt to identify individuals who have become more similar to a normal population than to a help-seeking one. This requires a normative reference sample, and most instruments have solicited such a sample for this purpose. The nature of the normative sample by which one identifies nontreatment-seeking individuals varies from instrument to instrument, however. It should be noted that a comparison of a targeted patient with a nonpathological normative sample, as a way of indicating need for further treatment, does not ensure that a patient who is found to be similar to nonpatient norms is either nonsymptomatic or not in need of further treatment. Both the number and variety of the dimensions assessed and the statistical procedures used to suggest "equivalence" with nonsymptomatic groups affect the accuracy of identifying the level of a patient's current clinical need. Available statistical methods to test the level of similarity and the equivalence of the patient to a nonpathological group are seldom used, and even when they are, the procedures are seldom broadly based enough to assure that all potential areas of pathological functioning are measured. The Stuttgart-Heidelberg (AKQUASI; Kordy et al., 2001) system, which uses established and broad-ranging instruments, has perhaps the highest level of specificity. It is doubtful, however, that any of the systems have adequate nonpatient norms and adequate breadth to assure that similarity means that the patient is no longer in need of treatment. Thus, although it is advantageous to have systematic means to compare patient responses with those of nonpatients, there remains a need for a truly census-based sample. Alternatively, because of the problems of breadth and sensitivity in all of these comparisons, there may be a place for the development of a system that assesses whether a clinically significant problem exists without reference to a nonpatient group. This would probably require an external judgment, however. Degree of compliance with diagnostic criteria holds some promise for these purposes, but this possibility has not been explored by the current methods.
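The logic of comparing a treated patient with a nonpatient reference group is commonly formalized through a reliable change index and a normative cutoff, conventions widely associated with Jacobson and Truax. The sketch below uses those general conventions with invented numbers; it is not the index computed by any of the systems reviewed here.

```python
import math

def reliable_change_index(pre, post, sd_pre, reliability):
    """Reliable change index: raw change divided by the standard error
    of the difference. Values beyond +/-1.96 suggest change larger
    than would be expected from measurement error alone."""
    se_measure = sd_pre * math.sqrt(1 - reliability)
    s_diff = math.sqrt(2 * se_measure ** 2)
    return (post - pre) / s_diff

def cutoff_c(mean_clin, sd_clin, mean_norm, sd_norm):
    """Cutoff c: the score at which a patient becomes more similar to
    the normative distribution than to the clinical one."""
    return (sd_norm * mean_clin + sd_clin * mean_norm) / (sd_clin + sd_norm)

# Invented example: symptom scale where lower is better.
rci = reliable_change_index(pre=30, post=18, sd_pre=7, reliability=0.90)  # ≈ -3.83
c = cutoff_c(mean_clin=30, sd_clin=7, mean_norm=12, sd_norm=6)            # ≈ 20.31
# This patient changed reliably AND crossed into the normative range.
print(abs(rci) > 1.96 and 18 < c)
```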


Markers of Risk

All of the systems identify patients who are considered to be at risk. Notably, each of the systems marks a different proportion of individuals as being at risk, a fact that suggests differences in sensitivity and specificity. The Stuttgart-Heidelberg (Kordy et al., 2001) system identifies 30% of patients as at risk for deterioration, a figure that seems unusually high. The OQ-45 (Lambert et al., 2001) reports a substantially lower percentage. The emphasis of these instruments has been on sensitivity rather than on specificity. Thus, they are likely to be overinclusive in identifying risk, but the consequences of overidentifying risk for those so identified should be carefully considered. Kordy et al. note that the accuracy of identifying "alarm" (signal or at-risk) cases increases with the length of the initial assessment and the use of multiple sources of ratings, an observation that argues against brief and single-source measures.
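The sensitivity-specificity trade-off behind these differing alarm rates can be made concrete with a confusion table. The counts below are hypothetical and do not reproduce any system's reported rates; they simply show how a deliberately liberal alarm rule buys sensitivity at the cost of specificity.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity: proportion of true deteriorators who are flagged.
    Specificity: proportion of non-deteriorators who are not flagged."""
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical counts for a liberal alarm rule: nearly all true
# deteriorators are caught (18 of 20), but 20 of 80 patients who would
# have done fine are flagged as well.
sens, spec = sensitivity_specificity(tp=18, fn=2, tn=60, fp=20)
print(sens, spec)  # 0.9 0.75
```

Lowering the alarm threshold raises the first number and lowers the second, which is exactly the overinclusiveness the text describes.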

The STS System

The STS assessment system (Beutler & Williams, 1995; Fisher et al., 1999) evolved from the work of Beutler and Clarkin (1990). It is a computer-interactive, clinician-based measure that is designed not only to track patient progress but also to help clinicians and health care agencies map out and develop effective treatment plans. The STS is founded on the observation that patient factors interact with the nature of the treatment provided. That is, patients respond differently to different kinds and styles of interventions as a function of their own coping and response styles. To optimize a patient's treatment, however, these preexisting qualities must be identifiable. Likewise, the validity of the system requires that the nature of the fit and mix of treatment and patient qualities be identified, all of which requires very time-intensive analyses at the stage of instrument development. The STS was originally designed as a clinician-based rather than as a self-report measure in order to overcome problems of patient noncompliance and selective dropout. However, it is also recognized that a self-report measure has advantages in terms of time savings. Hence, a newly evolving alternative form of the STS is now being piloted that presents the items in spoken form, by means of a computer. The patient responds to the verbally presented questions by pressing one of three keys ("yes," "no," and "repeat"). This form is expected to reduce the contamination of low reading and education levels on self-report measures. It also permits the measure to be administered remotely by telephone. The STS was not originally designed as a method of evaluating treatment quality or outcome.
It was specifically derived from an effort to help clinicians initially design and select treatment methods and was constructed on the strength of evidence that certain patient qualities and characteristics either portend certain levels of outcome, in their own right, or serve as indicators that potentiate and activate the power of different types of treatment (see Beutler, Clarkin, & Bongar, 2000). Stated another way, the STS addresses the fact that there are some patients for whom medication, multiperson, individual, relationship-oriented, insight-oriented, behaviorally focused, therapist-directed, patient-directed, cathartic, and supportive therapies (among others) are especially effective and others for whom each of these families of treatments is ineffective. This distinction of the STS from the tracking systems described here means that recommendations designed to enhance treatment are offered before treatment actually begins, rather than outcomes alone being used to redirect treatment after it has failed. The development and subsequent cross-validation of the STS has focused on developing predictive algorithms that allow the identification of characteristics of treatments that are most likely to produce good effects (Beutler et al., 1999; Beutler et al., 2000). Among other things, treatments vary in (a) context (e.g., frequency, inpatient vs. outpatient setting, and intensity of care), (b) patterning of medical and psychosocial modes of treatment, (c) application in a group, couple, or individual format, (d) level of therapist guidance and directiveness, (e) balance of interventions that directly alter behavior and symptoms relative to those that enhance awareness and insight, and (f) preference for cathartic versus supportive methods (Beutler & Clarkin, 1990). Decisions about treatment selection reflect variations in these dimensions.

Like the systems presented in this special section, the STS was designed with attention to flexibility and length. Three methods were used to ensure flexibility and to address issues of length. First, the system allows the clinician to select one of three levels of assessment, with each level trading assessment time against the amount of information provided in the report and accompanying graphs. The first level allows completion within a period of 10 min, with the others progressively increasing the amount of time required by either the clinician or the patient.
The second procedure, which both increases flexibility and reduces clinician time, provides a means for the clinician to enter scores from supplemental instruments, drawing from a list of established psychological measures (e.g., Beck Depression Inventory [Beck, 1978]; Minnesota Multiphasic Personality Inventory [Butcher, Dahlstrom, Graham, Tellegen, & Kaemmer, 1989]; SCL-90-R [Derogatis, Rickels, & Rock, 1976]). The scores from these tests replace one or more of the subscales of the STS, and the computerized administration skips the corresponding questions. The computer generates score equivalents from these measures and inserts them into the algorithms that produce the output. Thus, the most complete assessment varies in length from 20 to 40 min, with the longer time required when the clinician does not use additional, formal psychological tests. The third procedure for increasing efficiency is to bypass irrelevant subscales. A total of 31 subscales are available in the STS. Many of these are symptom scales that are activated only by checking a primary and a secondary problem from a checklist. The core dimensions used in treatment planning are reflected in 6 composite scales. These variables indicate what type and quality of treatment is expected to be helpful. The specific symptom and problem scales allow the clinician to select and evaluate specific problem areas that can then be tracked over time.¹

¹ Information on the systematic treatment selection (STS) can be obtained from the Center for Behavioral HealthCare Technologies, Inc., 3600 S. Harbor Blvd., #86, Oxnard, CA 93035. Electronic mail may be sent to [email protected]

A general well-being composite is also derived and provides an overall measure of change, supplemented by the more specific symptom and problem scores selected by the clinician. A short measure made up of the General Well-Being Scale and the individually identified symptom scales is used to track change over time. Thus, follow-up evaluations take only 10 min of patient or clinician time and can be completed in conjunction with entering a progress note in the system's database. The STS has not yet been applied broadly to general clinic samples, as have the other instruments described here; it is just entering this arena. To date, its psychometric properties have been established (Fisher et al., 1999), and its predictive validity has been established on both prospective and archival samples from randomized trials research (Beutler et al., 1999, 2000). The latter samples were broad ranging; they included members with chemical-abuse and anxiety-based conditions, and all manifested mild to moderately severe depression. In addition, my colleagues and I are in the process of testing the system on several clinical samples (still unpublished) to assess clinician response and compliance and have begun marketing it to clinicians and to health care organizations. Aside from the descriptive material provided here, I summarize some of the responses we have obtained from clinicians and administrators.

In return for the 20 to 45 min of clinician or patient time required to complete the highest level of the STS, depending on the number of additional extensions the clinician selects for exploration, the clinician receives the following information:

1. an extensive narrative intake report;
2. a projected course of treatment;
3. tracking of both specific symptoms identified as problems and general well-being;
4. bar graphs that describe patient dimensions at intake, including symptom areas, interpersonal problems, addiction, risk, etc.;
5. bar graphs that depict patient personality and trait-state patterns that are relevant to treatment (distress, functional impairment, coping style, social support, chronicity, resistance proneness, etc.).

The narrative report is quite complete, addressing therapist assignment, length of expected treatment, treatment goals, risk level, level of needed care, recommendations about medication use and medical consultation, formulation of the problem, and recommended treatment procedures. In addition, a pull-down menu contains a list and associated descriptions of specific empirically derived treatments that best fit the particular patient on the basis of the assessed patient characteristics. This menu selects from among 42 briefly described manuals that can be called up and that describe the procedures characterizing these treatments. Thus, even before beginning treatment, clinicians have a narrative intake report, a cross-cutting description of a workable treatment, and a list of empirically supported manuals (with more extensive descriptions) to help guide their decisions. Uniquely, the STS also provides a number of other up-front tools for assisting the clinician. These include a diagnostic helper that contains all of the symptoms and descriptions of the Diagnostic and Statistical Manual of Mental Disorders (4th ed.; American Psychiatric Association, 1994) and a procedure for profiling therapists in a given clinic and tracking their success (this allows clinician profiling and establishes a procedure for entering progress notes while rating ongoing treatment). To put the STS in the context of the current discussion, Table 1 provides a summary comparison of the systems under consideration along the eight points of contrast previously identified. As the table shows, the STS includes (a) both self-report and clinician-report forms, to address problems of source variance, attrition, and bias, and (b) a moderately long assessment format, to increase the stability of specific measures, although this length can be shortened either directly, by selecting one of three levels of completeness, or indirectly, by keying subroutines that confine the questions asked to areas identified as being of primary and secondary concern by the clinician.
Moreover, (c) the procedures are designed to be flexible, allowing both selection of targeted areas of assessment and replacement of STS questions with scores from standardized tests, and (d) the STS can be applied to a very wide range of patients, both inpatients and outpatients, including those with educational or reading handicaps and the socially impaired, because it relies on clinician ratings and is not dependent on patient reading and comprehension ability. At present, its predictive validity has not been tested on nonambulatory patients, but the concepts measured by the STS are currently being explored for use with children and noncompliant adolescents. Further, (e) the STS provides a projection of expected treatment response that is statistically derived by extracting subsamples of patients from the extant database who are similar to the identified patient on certain key variables (impairment, distress, and

Table 1
A Comparison of Five Systems for Managing Psychotherapy Outcomes

Dimension            | S-H System            | OQ-45                 | CORE-OM                 | COMPASS                         | STS
---------------------|-----------------------|-----------------------|-------------------------|---------------------------------|----------------------
Source               | Clinician and patient | Patient only          | Patient only            | Patient (4 clinician questions) | Clinician and patient
Length               | Very long             | Very short            | Very short and variable | Moderate                        | Variable
Question flexibility | Very flexible         | Nonflexible           | Flexible                | Nonflexible                     | Very flexible
Breadth              | IP and OP             | OP                    | OP                      | OP                              | IP and OP
Projected response   | Yes                   | Yes                   | No                      | Yes                             | Yes
Feedback             | Graphic and narrative | Graph and color codes | Graphic                 | Graphic                         | Graphic and narrative
Clinical change      | Yes                   | Yes                   | Yes                     | Yes                             | Yes (modified)
Signal risk          | Yes                   | Yes                   | Yes                     | Yes                             | Yes

Note. S-H System = Stuttgart-Heidelberg System; OQ-45 = Outcome Questionnaire-45; CORE-OM = Clinical Outcomes in Routine Evaluation—Outcome Measure; STS = Systematic Treatment Selection measure; IP = inpatient; OP = outpatient.


coping style). Subsequent evaluations compare the patient's rate of response with this standard. Next, (f) the STS provides multidimensional feedback to the clinician or manager through a series of bar graphs, line graphs, and narration, and (g) it identifies when the patient has made clinically reliable changes and when treatment termination can be considered. These latter estimates are based on the relative absence of diagnostic symptoms rather than on a normal-group comparison. Finally, (h) the STS identifies several markers of risk, including a marker that signifies failure to change at the expected rate. Like the other instruments described here, this marker is based on the Reliable Change Index (Jacobson & Truax, 1991), in which the patient's change is compared with the expected rate, with confidence intervals defined by the standard error of measurement. In addition, most clinicians in our beta-testing samples have found the up-front tools helpful in establishing a diagnosis and developing a treatment plan. Clinicians also seem to be drawn to the minimanuals, which not only refer the clinician to relevant readings about how to treat specific problems but also describe, in a brief document, the treatments that have been found empirically to be effective with the targeted patient group. Health care systems find the procedures for entering clinician data and for tracking clinician performance useful. In multiclinician settings, the latter procedure is used in the report to identify and flag the therapist who is most likely to be helpful to a given patient, by comparing available therapists on the amount and speed of change achieved among similar patients. Because it was designed to plan treatment, the development of the STS has been very labor intensive.
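The failure-to-change marker rests on Jacobson and Truax's (1991) Reliable Change Index, which can be illustrated with a short sketch. This is a generic implementation of the published formula, not code from the STS itself; the function names and the 1.96 cutoff (a 95% confidence band) are illustrative choices.

```python
import math

def reliable_change_index(pre, post, sd_baseline, reliability):
    """Jacobson & Truax (1991) Reliable Change Index.

    pre, post   -- scores at intake and at a later evaluation
    sd_baseline -- standard deviation of the measure in a reference sample
    reliability -- reliability coefficient (e.g., test-retest) of the measure
    """
    se_measurement = sd_baseline * math.sqrt(1.0 - reliability)
    s_diff = math.sqrt(2.0 * se_measurement ** 2)  # standard error of the difference score
    return (post - pre) / s_diff

def is_reliable_change(pre, post, sd_baseline, reliability, cutoff=1.96):
    # A change is "reliable" when it exceeds the confidence band
    # around the hypothesis of no true change.
    return abs(reliable_change_index(pre, post, sd_baseline, reliability)) > cutoff
```

For example, a symptom score dropping from 30 to 18 on a scale with a baseline standard deviation of 10 and a reliability of .90 yields an index of about -2.68, which exceeds the 1.96 band and would be flagged as reliable change, whereas a drop from 30 to 27 would not.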
The standardization sample on which the psychometric and cross-validation studies were based has been described elsewhere (Beutler et al., 2000; Fisher et al., 1999) but deserves brief description here. It was drawn from three sites and included samples from four systematic, randomized-treatment studies (archival) and one naturalistic, prospective sample. The patients were 289 individuals with varied and carefully established diagnoses, including two samples of patients with major depression; a sample of chemically dependent and comorbid individuals; and a sample of mixed, consecutively admitted outpatients. The structured treatments included group, couple, and individual formats and represented cognitive, experiential, psychoanalytic, and biological (antidepressant pharmacotherapy) models. In addition, the treatments included a manualized self-directed therapy and a therapy-as-usual protocol. All treatments were monitored for compliance and checked for proficiency. To validate the fit of classes of therapy procedures to type of patient, independent clinicians rated recordings of intake interviews, supplementing their ratings with demographic information, test scores, and historical information that would ordinarily be available to clinicians at intake. Using a rating scale that identified key treatment ingredients and processes, a separate group of raters judged two sessions of treatment for each patient to assess the nature of the treatment actually provided. Because the patient and treatment variables were clinician-based, dropout and noncompliance were not problems, and outcome data were available (by means of the STS) on 284 of the 298 patients in the total sample. These are the patients who completed more than one treatment session and who had at least undertaken

a videotaped intake interview from which both patient and treatment characteristics could be judged. Both structural equation modeling (Beutler et al., 2000) and regression analyses (Beutler et al., 1999) were used to cross-validate and refine 18 specific hypotheses, which then formed the basis for specifying a set of treatment-planning guidelines. These principles are described by Beutler et al. (2000), and the application of these principles to psychotherapy is described by Beutler and Harwood (2000).
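The expected-treatment-response projection described earlier, in which subsamples of similar past patients are extracted from the database and their change serves as the standard, can be sketched as follows. The record layout, the matching rule, and the toy database below are assumptions made for illustration; they are not the STS's actual variables or algorithm.

```python
from statistics import mean

# Hypothetical record format: each past case carries intake ratings on the
# three matching variables plus session-by-session well-being scores.
database = [
    {"impairment": 3, "distress": 4, "coping": "internalizing",
     "trajectory": [20, 24, 27, 30, 33]},
    {"impairment": 3, "distress": 5, "coping": "internalizing",
     "trajectory": [18, 21, 25, 29, 31]},
    {"impairment": 1, "distress": 2, "coping": "externalizing",
     "trajectory": [30, 34, 36, 38, 40]},
]

def expected_response(patient, database, max_distance=1):
    """Average the trajectories of past patients who share the index
    patient's coping style and fall within max_distance on the ordinal
    impairment and distress ratings; None if no similar cases exist."""
    matches = [
        case for case in database
        if case["coping"] == patient["coping"]
        and abs(case["impairment"] - patient["impairment"]) <= max_distance
        and abs(case["distress"] - patient["distress"]) <= max_distance
    ]
    if not matches:
        return None
    n_sessions = min(len(c["trajectory"]) for c in matches)
    return [mean(c["trajectory"][i] for c in matches) for i in range(n_sessions)]

patient = {"impairment": 3, "distress": 4, "coping": "internalizing"}
curve = expected_response(patient, database)  # session-by-session standard
```

Subsequent evaluations would then compare the patient's observed scores against this matched-cohort curve; falling reliably below it is what triggers the failure-to-progress signal discussed earlier.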

Conclusions

Each of the systems evaluated here, including the STS, has its own pattern of strengths and weaknesses. The selection and use of a given procedure should depend on the nature of the objectives and the values that guide treatment in a given setting. Each advantage comes at some cost, and all systems carry both costs and benefits for the clinicians and health care managers who might use them. Not the least of these are variations in the up-front financial costs of implementing the system and in the presence or absence of supplemental information and tools that might assist treatment. For example, as noted earlier, some systems are in the public domain (CORE-OM), whereas others vary in expense in concert with the effort required to develop and maintain them and with the availability of additional flexibility of use and application. By whatever means, the movement to measure patient change objectively must be heralded as a major step forward in making clinical research available and useful to practitioners. This trend toward clinical utility will and should continue, and it will set the agenda for much of the clinical research in the new century. To make these methods and systems applicable to research, however, a variety of problems must be overcome. These include problems associated with variable rates of patient compliance, self-report bias, and the acceptability of measures to clinicians. Moreover, to make any of the systems optimally useful in a research context, detailed information is needed about the nature of the treatments available and used. Time-intensive research on what actually takes place in treatment is needed to replace the current reliance on clinician and patient reports, both of which may be quite inaccurate.
Supplementing the largely naturalistic and correlational studies with procedures that introduce varying levels of control or randomization in the assignment of treatments and therapists will also be necessary to resolve many of the unanswered questions facing researchers who venture to make their work clinically useful.

References

American Psychiatric Association. (1994). Diagnostic and statistical manual of mental disorders (4th ed.). Washington, DC: Author.
Barkham, M., Margison, F., Leach, C., Lucock, M., Mellor-Clark, J., Evans, C., Benson, L., Connell, J., Audin, K., & McGrath, G. (2001). Service profiling and outcome benchmarking using the CORE-OM: Toward practice-based evidence in the psychological therapies. Journal of Consulting and Clinical Psychology, 69, 184-196.
Beck, A. T. (1978). Depression Inventory. Philadelphia: Center for Cognitive Therapy.
Beutler, L. E., Albanese, A. L., Fisher, D., Karno, M., Sandowicz, M., & Williams, O. B. (1999, June). Selecting and matching treatment to patient variables. Paper presented at the annual meeting of the Society for Psychotherapy Research, Braga, Portugal.
Beutler, L. E., & Clarkin, J. F. (1990). Systematic treatment selection: Toward targeted therapeutic interventions. New York: Brunner/Mazel.
Beutler, L. E., Clarkin, J. F., & Bongar, B. (2000). Guidelines for the systematic treatment of the depressed patient. New York: Oxford University Press.
Beutler, L. E., & Hamblin, D. L. (1986). Individual outcome measures of internal change: Methodological considerations. Journal of Consulting and Clinical Psychology, 54, 48-53.
Beutler, L. E., & Harwood, T. M. (2000). Prescriptive psychotherapy. New York: Oxford University Press.
Beutler, L. E., & Williams, O. B. (1995, July/August). Computer applications for the selection of optimal psychosocial therapeutic interventions. Behavioral Healthcare Tomorrow, 6, 66-68.
Butcher, J. N., Dahlstrom, W. G., Graham, J. R., Tellegen, A., & Kaemmer, B. (1989). Manual for administration and scoring: MMPI-2. Minneapolis: University of Minnesota Press.
Derogatis, L. R., Rickels, K., & Rock, A. F. (1976). The new SCL-90 and the MMPI: A step in the validation of a new self-report scale. British Journal of Psychiatry, 128, 280-289.
Fisher, D., Beutler, L. E., & Williams, O. B. (1999). Making assessment relevant to treatment planning: The STS clinician rating form. Journal of Clinical Psychology, 55, 825-842.
Garfield, S. L., Prager, R. A., & Bergin, A. E. (1971). Evaluation of outcome in psychotherapy. Journal of Consulting and Clinical Psychology, 37, 307-313.
Jacobson, N. S., & Truax, P. (1991). Clinical significance: A statistical approach to defining meaningful change in psychotherapy research. Journal of Consulting and Clinical Psychology, 59, 12-19.
Kordy, H., Hanover, W., & Richard, M. (2001). Computer-assisted feedback-driven quality management for psychotherapy: The Stuttgart-Heidelberg Model. Journal of Consulting and Clinical Psychology, 69, 173-183.
Lambert, M. J., Hansen, N. B., & Finch, A. E. (2001). Patient-focused research: Using patient outcome data to enhance treatment effects. Journal of Consulting and Clinical Psychology, 69, 159-172.
Lueger, R. J., Howard, K. I., Martinovich, Z., Lutz, W., Anderson, E. E., & Grissom, G. (2001). Assessing treatment progress of individual patients using expected treatment response models. Journal of Consulting and Clinical Psychology, 69, 150-158.
Nathan, P. E., & Gorman, J. M. (Eds.). (1998). A guide to treatments that work. New York: Oxford University Press.
Roth, A., & Fonagy, P. (1996). What works for whom? A critical review of psychotherapy research. New York: Guilford Press.
Smart, D. W. (1998, August). Recovery curves, outcome modeling and satisfaction: A counseling center study. Symposium on the OQ-45 presented at the 106th Annual Convention of the American Psychological Association, San Francisco, CA.
Sperry, L., Brill, P. L., Howard, K. I., & Grissom, G. R. (1996). Treatment outcomes in psychotherapy and psychiatric interventions. New York: Brunner/Mazel.
Strupp, H. H., Horowitz, L. M., & Lambert, M. J. (1997). Measuring patient changes in mood, anxiety, and personality disorders: Toward a core battery. Washington, DC: American Psychological Association.

Received June 5, 2000
Revision received June 12, 2000
Accepted June 27, 2000

