Comprehensive assessment of thyroidectomy skills development: a pilot project


The Laryngoscope
© 2011 The American Laryngological, Rhinological and Otological Society, Inc.

David A. Diaz Voss Varela, MD; Mohammad U. Malik, MD; Carol B. Thompson, MS, MBA; Charles W. Cummings, MD; Nasir I. Bhatti, MD; Ralph P. Tufano, MD

Objectives/Hypothesis: To test the validity, reliability, and feasibility of an evaluation tool designed to measure the development of trainees’ surgical skills in the operating room for thyroid surgery.

Study Design: Prospective validation study.

Methods: A modified Delphi technique was employed to develop a new Objective Structured Assessment of Technical Skills–based instrument for thyroid surgery. During a 1-year period, 16 otolaryngology–head and neck surgery residents (ranging from postgraduate year 2 to 6) and one endocrine surgery fellow were evaluated by one faculty member, yielding a total of 94 evaluations. Performance was rated using a task-based checklist (TBC) and a global rating scale (GRS). The TBC measured trainees’ thyroidectomy technical skills, and the GRS assessed their overall surgical performance.

Results: Based on four clinical levels (junior, intermediate, senior, and surgical fellow), our tool demonstrated construct validity for both components of the assessment instrument, specifically for the TBC showing a mean difference of 0.9 (95% confidence interval: 0.5-1.3, P < .001) between the contiguous clinical levels senior versus intermediate. Cronbach α, a measure of internal consistency, was 0.96 for both components of the instrument. The correlation between the TBC and GRS was also high within trainee (r = 0.62, n = 94, P < .001) and across trainees (r = 0.96, n = 17, P < .001).

Conclusions: Our tool proved to be a valid, reliable, and feasible instrument for assessing competency in thyroid surgery. It is effective in providing timely formative feedback during and upon the conclusion of the surgical procedure by identifying procedural tasks for which additional training is necessary. In addition, it enables longitudinal tracking of residents’ surgical performance, thus ensuring their appropriate development.

Key Words: Accreditation Council for Graduate Medical Education (ACGME), surgical competency, education, thyroidectomy, core competencies, surgical skills assessment, otolaryngology.

Level of Evidence: 1b.

Laryngoscope, 122:103–109, 2012

From the Department of Otolaryngology–Head and Neck Surgery, Johns Hopkins University School of Medicine (D.A.D.V.V., M.U.M., C.W.C., N.I.B., R.P.T.); and the Biostatistics Center, Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health (C.B.T.), Baltimore, Maryland, U.S.A.
Editor’s Note: This Manuscript was accepted for publication September 8, 2011.
Ralph P. Tufano, MD, uses the Medtronic NIM monitor while performing and assessing residents’ performance in thyroid surgery. The authors have no funding, financial relationships, or conflicts of interest to disclose.
Send correspondence to Nasir I. Bhatti, MD, 601 N. Caroline St., Suite 6241, Johns Hopkins Outpatient Center, Baltimore, MD 21287. E-mail: [email protected]
DOI: 10.1002/lary.22381

Laryngoscope 122: January 2012
Diaz Voss Varela et al.: Thyroidectomy Skills Development

INTRODUCTION

In the past decade, graduate medical education has experienced major changes that are paving the way for a new outcome-based education. Residency programs are now expected to produce competent physicians because of increased pressure by the public and by their respective specialty boards.1 In 2001, the Accreditation Council for Graduate Medical Education (ACGME) implemented the Outcome Project, whose mission statement is to enhance residency education through outcomes.1 Currently, residency programs are required to objectively measure their trainees for six core competencies: patient care, medical knowledge, practice-based learning and improvement, interpersonal and communication skills, professionalism, and systems-based practice. Although most of these competencies can be relatively easy to assess, measuring trainees’ operative competence (an integral part of patient care) in surgical specialties may pose a challenge owing to the lack of standardized objective assessment tools.2 Work-hour limitations and other barriers, such as faculty workload and insufficient administrative support, may hinder the program directors’ ability to fully comply with the ACGME mandate.3 The addition of objective, quantifiable measures to the subjective end-of-rotation assessments that have traditionally guided trainee evaluations would help program directors meet the ACGME’s requirements. Therefore, the creation and implementation of objective, valid, and reliable assessment tools for surgical competency are deemed essential.

In 1997, Martin et al.4 developed a performance-based evaluation for assessing trainees’ surgical competence, also known as Objective Structured Assessment of Technical Skills (OSATS). Subsequently, several surgical specialties, such as general surgery, ophthalmology, and obstetrics and gynecology, have successfully adopted this evaluation format to assess their residents’ surgical skills.5–7 In otolaryngology–head and neck surgery (OHNS), several groups have been able to develop, implement, and validate feasible and reliable instruments for various surgical procedures: endoscopic sinus surgery, mastoidectomy, and rigid bronchoscopy and direct laryngoscopy.2,8,9

Thyroidectomy, a core procedure in OHNS, is a good example of a surgical procedure for which residents must evaluate the patient comprehensively before surgery while integrating their knowledge of the complex detail of head and neck anatomy with their raw surgical skills.10 These characteristics make this procedure an ideal candidate for evaluating a trainee’s surgical competence with an OSATS-based assessment tool. Therefore, the purpose of this study was to develop, implement, and pilot-test a new two-component thyroidectomy evaluation instrument, consisting of a task-based checklist (TBC) and a global rating scale (GRS), for head and neck surgery trainees. In addition, we wanted to evaluate this tool for its feasibility and construct validity for the assessment of technical skills in primary thyroid surgery in the operating room (OR).

Fig. 1. Task-based checklist component of the thyroid surgery assessment tool.


MATERIALS AND METHODS

Study Design and Participants

After obtaining approval from the institutional review board at the Johns Hopkins Hospital, we conducted this prospective pilot study of thyroidectomy skills development and assessment in head and neck surgery trainees over a period of 1 year. Sixteen residents, ranging from postgraduate year (PGY) 2 to 6, from the department of OHNS at the Johns Hopkins University and one endocrine surgery fellow (PGY-7) from the same institution were observed while performing thyroid surgery in the OR. The fellow was enrolled in an American Association of Endocrine Surgeons accredited fellowship and had completed an ACGME accredited general surgery residency. All trainees were evaluated with a newly developed TBC and GRS for thyroid surgery at the end of each procedure by one faculty member of the division of Head and Neck Surgery whose practice primarily focuses on thyroid and parathyroid surgery.

Components of the Assessment Tool

A modified Delphi technique was used to develop the contents of a new OSATS-based instrument for thyroid surgery by a panel of head and neck surgeons with thyroid surgery experience in the OHNS department. With a 5-point Likert scale linked to descriptors at the middle and ends of each scale item, a thyroidectomy TBC was developed with detailed faculty input. After extensive review, 10 items, considered the critical, specific, and assessable tasks for achieving the goals of a thyroidectomy procedure, were included in this checklist (Fig. 1).

Fig. 2. Global rating scale component of the thyroid surgery assessment tool.


Based on a previously developed and validated tool to assess technical skills in the OR by Winckel et al.,11 a second component of the evaluation instrument (GRS) for the same procedure was created by the same panel of faculty members (Fig. 2). The purpose of the GRS is to evaluate a trainee’s overall surgical performance. It also aims to measure the visual-motor and cognitive performance required to perform a thyroidectomy in a safe and successful manner. A descriptive 5-point Likert scale was also developed for each of its nine items.

Two additional yes/no questions and a "procedure comments" section were included to help the evaluating faculty member provide structured and timely formative feedback to the trainee at the end of each case. The first yes/no question asked whether the rater had provided feedback to the resident during and after the procedure. The second asked whether the faculty member thought the trainee was competent to perform the surgery independently. The "procedure comments" section allowed the evaluator to classify each procedure as a standard or difficult case. Similar TBC and GRS components have been validated for other OHNS core procedures.2,8,12

Fig. 3. Mean and 95% confidence interval (CI) scores on the task-based checklist (TBC) for trainees at different levels of training. PGY = postgraduate year.

Statistical Analysis

Construct validity was evaluated for each component of the tool by comparing trainees’ mean percentage scores across advancing PGY levels using general linear model analysis, adjusting for multiple evaluations per trainee. The inter-item reliability for both the TBC and the GRS was measured by assessing their internal consistency with Cronbach α. A value of .80 was considered acceptable. Correlations between TBC and GRS scores were computed within trainee13 and across trainees.14 All data analyses were performed using STATA version 11.1 (StataCorp LP, College Station, TX) software.
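As an illustration of the reliability and correlation measures named above, the following is a minimal Python sketch, not the authors’ Stata analysis. It assumes the textbook formula for Cronbach α, and it assumes one common way to separate the two correlations: centering each trainee’s scores on that trainee’s own mean for the within-trainee coefficient, and correlating unweighted per-trainee means for the across-trainee coefficient. Whether this matches the cited methods (references 13 and 14) exactly is an assumption. All function names and the toy data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cronbach_alpha(ratings):
    """Cronbach alpha for a list of evaluations, each a list of k item scores."""
    k = len(ratings[0])
    # Sample variance of each item across evaluations.
    item_vars = [variance([row[j] for row in ratings]) for j in range(k)]
    # Sample variance of the summed score across evaluations.
    total_var = variance([sum(row) for row in ratings])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def group_by_trainee(records):
    """Group (trainee_id, tbc, grs) records into {trainee_id: [(tbc, grs), ...]}."""
    grouped = {}
    for trainee_id, tbc, grs in records:
        grouped.setdefault(trainee_id, []).append((tbc, grs))
    return grouped


def within_trainee_r(records):
    """Correlate deviations from each trainee's own mean (uses every evaluation)."""
    dx, dy = [], []
    for pairs in group_by_trainee(records).values():
        mx = mean([p[0] for p in pairs])
        my = mean([p[1] for p in pairs])
        dx.extend(p[0] - mx for p in pairs)
        dy.extend(p[1] - my for p in pairs)
    return pearson(dx, dy)


def across_trainee_r(records):
    """Correlate unweighted per-trainee mean scores (one point per trainee)."""
    means = [
        (mean([p[0] for p in pairs]), mean([p[1] for p in pairs]))
        for pairs in group_by_trainee(records).values()
    ]
    return pearson([m[0] for m in means], [m[1] for m in means])


# Hypothetical toy data, not taken from the study.
demo_ratings = [[1, 1, 1], [3, 3, 3], [5, 5, 5]]  # 3 evaluations x 3 items
demo_records = [("A", 2, 3), ("A", 3, 4), ("A", 4, 5), ("B", 4, 2), ("B", 5, 3)]

print(cronbach_alpha(demo_ratings))
print(within_trainee_r(demo_records))
print(across_trainee_r(demo_records))
```

In this toy data the within-trainee correlation is positive while the across-trainee correlation is negative, which shows why the two coefficients answer different questions and are worth reporting separately. The within-trainee version uses every evaluation, whereas the across-trainee version uses one point per trainee.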

RESULTS

A total of 94 evaluations were completed for 17 trainees across six PGY levels (levels 2–7). They were evaluated by a single faculty member from the division of Head and Neck Surgery as they performed thyroid surgery in the OR during a period of 1 year. In this pilot study, the evaluation instrument was considered feasible to use based on the rater’s high rate of usage and the time taken to complete each evaluation (median time of 2 minutes; range, 1–7 minutes). In addition, the evaluator found this assessment tool to be understandable, simple, and practical. Moreover, trainees found the tool to be useful in providing timely formative feedback during and after each case, thus improving their performance in subsequent evaluations.

The tool’s internal consistency, used to determine its inter-item reliability, was evaluated with Cronbach α. The Cronbach α on the 94 observations was 0.96 for both the TBC and the GRS. Because Cronbach α does not adjust for the multiple evaluations per trainee, three additional evaluations were performed based on one evaluation for each of the 17 trainees: their earliest evaluation, their latest evaluation, and a randomly selected evaluation. The α values for the TBC were 0.96, 0.98, and 0.97, respectively. The α values for the GRS were 0.96, 0.96, and 0.96, respectively. All are consistent with the Cronbach α values obtained on the 94 evaluations.

Correlations between the two components of the assessment tool were also evaluated. Within the same trainee, an increase in the TBC score was shown to be associated with an increase in the GRS score, taking into account that there are multiple evaluations per trainee (r = 0.62, n = 94, P < .001). Trainees with a high value on the TBC score also tended to have a high value on the GRS score, taking into account multiple evaluations per trainee (r = 0.96, n = 17, P < .001).

For this study, both the TBC and the GRS demonstrated construct validity. To capture major gradations of experience rather than individual PGY levels, we created four clinical groups to distinguish levels of trainees for this analysis: junior (PGY-2), intermediate (PGY-3 and 4), senior (PGY-5 and 6), and clinical surgical fellow (PGY-7). Our results show a trend of achieving a higher average score with advancing clinical groups for both the TBC (Fig. 3) and the GRS (Fig. 4). Following a general linear model analysis adjusting for within-trainee correlation of evaluations, we performed pairwise comparisons between contiguous clinical groups, adjusting for an experiment-wise error rate. Table I shows comparisons made for the TBC. Similarly, Table II shows comparisons made for the GRS. The mean difference between intermediate and junior trainees was not significant for either the TBC (0.23, 95% confidence interval [CI]: -0.23 to 0.69, P > .303) or the GRS (0.23, 95% CI: -0.33 to 0.79, P > .397). However, the mean difference between senior and


Fig. 4. Mean and 95% confidence interval (CI) scores on the global rating scale (GRS) for trainees at different levels of training. PGY = postgraduate year.


TABLE I. Pairwise Comparisons for the Task-Based Checklist Between Contiguous Clinical Groups.

| Clinical Groups Compared | Mean Difference | 95% CI | P Value |
|---|---|---|---|
| Intermediate trainees (PGY-3 and 4) and junior trainees (PGY-2) | 0.23 | -0.23 to 0.69 | >.303 |
| Senior trainees (PGY-5 and 6) and intermediate trainees | 0.9 | 0.5-1.3 | |
