Assessing clinical reasoning and ensuring that clinical competence develops correctly is challenging. Clinical reasoning – defined as the ability, process, or result by which physicians observe, collect, and interpret data to diagnose and treat patients – requires an evidence-based approach grounded in medical education research. The American professor Michelle Daniel has undertaken this work. Her scoping review, published in Academic Medicine, concludes that the assessment of clinical reasoning requires programs that combine ways to assess its components in simulated environments, in class, and in the workplace.
Madrid, July 16, 2019. “If medical educators want to ensure that [students and professionals] are competent in clinical reasoning, (…) they must arrange for adequate sampling. This can only be accomplished by employing multiple assessment methods.” This is the conclusion of a review of 377 articles on the assessment of clinical reasoning in students, residents, and practicing physicians. Dr. Daniel, assistant dean for curriculum at the University of Michigan (USA) medical school, along with her team, identified three categories of assessment methods: non-workplace-based assessments (non-WBAs), assessments in simulated clinical environments, and workplace-based assessments (WBAs). Their hypothesis was that the three categories complement one another.
There are abundant evaluation methods aligned with different components of the complex construct of clinical reasoning. Consolidating competence involves developing evaluation programs that address each of these components. “Such programs are ideally constructed of complementary assessment methods to account for each method’s validity, feasibility issues, advantages, and disadvantages.” Adopting a constructivist paradigm, the authors clustered the articles into 20 assessment methods and, in a descriptive appendix, summarized each method in terms of common stimuli, response formats, scoring, typical uses, validity considerations, and advantages and disadvantages, among other features.
Dr. Daniel observed that some methods had a large number of articles, e.g., script concordance testing (SCT) and technology-enhanced simulation, with more than 60 publications each. In contrast, others had only a few, e.g., comprehensive integrative puzzles (CIPs), with only three. From a practical perspective, non-WBAs have the advantages of broad sampling, validity, content control, internal consistency, and reliability (most have a single response format). For their part, WBAs, embedded in the clinical environment, lend authenticity to the content and give greater scope to the response process, while assessments in simulated environments ensure evaluation throughout the task, despite consuming time and resources.
Many types of commonly used non-WBAs, including multiple-choice questions (MCQs), extended matching questions (EMQs), key feature examinations (KFEs), and SCTs, are poor at evaluating information gathering, hypothesis generation, and problem representation. Their strength lies mainly in the evaluation of differential diagnoses, leading diagnoses, and management and treatment. Assessments in simulated clinical settings and WBAs are better at evaluating information gathering, with direct observation and objective structured clinical examinations (OSCEs) being the strongest in this domain.
Two methods are effective in measuring hypothesis generation and problem representation because they force the participant to articulate the steps of the reasoning process. The first is self-regulated learning microanalysis (SRL-M), a structured interview protocol that collects information in the moment, at the task level, and evaluates metacognition. The second is think aloud (TA), a technique in which students are given a task and asked to express their thoughts simultaneously and in unfiltered form. Combining methods that are strong on different components of clinical reasoning seems to be the best approach, for example, MCQ + SRL-M + OSCE.
Non-WBAs, simulated environments, and WBAs
Most non-WBAs use written clinical vignettes, although images or videos can be added to enrich the materials. Non-WBA methods are often used for summative pass/fail judgments, as well as licensing, certification, and accreditation decisions. The defensibility of this practice is questionable, however, since the results obtained do not ensure a successful transfer of skills to clinical practice. Even so, non-WBAs can be useful as evidence of progress in formative assessment for learning, because of the effect they have on the development of clinical reasoning (for example, using concept maps to create cognitive networks).
Assessments in simulated clinical settings are usually limited to mannequins or virtual reality patient avatars. The response format for OSCEs and technology-enhanced simulations is usually task performance or verbal or written responses. Scoring is usually done using itemized checklists that can be dichotomous or behaviorally anchored. Global rating scales are also common. OSCEs are used for summative decision-making, while technology-enhanced simulations are used for formative evaluations. Their correlation with clinical practice is reasonable, but an important obstacle is that they are resource-intensive.
Workplace methods rely on real patients as stimuli. The grading systems vary and include detailed or global rating scales and checklists. WBAs are used for formative assessment during clinical internships and residency. Because these methods are integrated into authentic clinical settings, there is reasonable evidence of validity for both content and response process. However, the unsystematic nature of clinical practice can present challenges with regard to content coverage. In addition, many of these methods require an observer, with the consequent risk of bias inherent in any judgment.
Conclusions concerning undergraduate and graduate students
While evaluations in the workplace receive greater emphasis in educational programs, time and cost often limit the number and variety of cases that can be sampled. Undergraduate students are often assessed with a combination of MCQs, OSCEs, global assessments, oral presentations, and written notes, whereas direct observation and the incorporation of methods such as TA or SRL-M could reach components of clinical reasoning that are currently underassessed. In graduate medical education, most evaluations are carried out in the clinical setting, occasionally augmented by technology-enhanced simulation and in-training examinations, which are composed largely of MCQs.
The concept of programmatic assessment of clinical reasoning is still in its infancy. Institutions should carry out frequent evaluations and collect information longitudinally from multiple sources and in various contexts. Looking ahead, it is important to consider assessment for learning. For example, concept maps are very useful for learning, as they help students develop illness scripts and form connections. Clinical reasoning assessments such as direct observation and technology-enhanced simulation are essential means of obtaining formative feedback.
“Future research is also needed to determine how to best combine various methods into valid programs of clinical reasoning assessment to allow medical schools, residency programs, and licensing boards to confidently determine the competence of their learners,” concludes Dr. Daniel.
Daniel M, Rencic J, Durning SJ et al. Clinical reasoning assessment methods: a scoping review and practical guidance. Acad Med. 2019; 94: 902-912. doi:10.1097/ACM.0000000000002618