Abstract
Context: Good examinations have a number of characteristics: validity, reliability, educational impact, practicability and acceptability. The OSCE has improved reliability over the single long case, but concerns about validity have led to modifications and to the development of other models such as the mini-CEX and OSLER. These retain some characteristics of the long case but feature repeated encounters and more structure. Nevertheless, the practicalities and cost of mounting large-scale examinations remain significant. The lack of metrics handicaps progress. This paper reports a system in which a sequential design concentrates limited resources where they are most needed, so as to maintain reliability and practicability at the pass/fail interface.
Methods: Data were taken from the final examination in 2009. In the complete examination, candidates see 8 real patients (the OSLER) and encounter 12 OSCE stations. Those judged entirely satisfactory after the first 4 patients and 6 OSCE stations are not examined further. The others, about a third, see the remaining patients and stations, and their results are based on all the data. Reliability was calculated using generalisability theory, and practicability was assessed in terms of financial resources. Functioning of the sequential system was assessed by the ability of the first part to predict the final result.
Results: Generalisability for the OSLER was 0.63 after 4 patients and 0.77 after 8. The OSCE was less reliable (0.38 after 6 stations and 0.55 after 12). There was only a weak correlation between the OSLER and the OSCE. The first stage was highly predictive of the results of the second stage. Cost savings from the sequential design were approximately £30K.
Conclusions: The overall utility of an examination involves compromise. The system described provides good perceived validity with reasonable reliability; a sequential design can concentrate resources where they are most needed and still allow wide sampling of tasks.
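Illustrative check (not part of the published abstract): the reported coefficients can be related to test length with the Spearman-Brown formula, which has the same form as a single-facet generalisability coefficient when the number of cases changes. The sketch below assumes that simplified single-facet model; the abstract does not state the actual generalisability design used, and the function name `spearman_brown` is hypothetical.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Project reliability when test length is multiplied by length_factor.

    Assumes a simple single-facet model (an assumption, not the paper's
    stated analysis): r_k = k * r / (1 + (k - 1) * r).
    """
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)


# First-stage coefficients reported in the abstract.
osler_first_stage = 0.63  # OSLER, 4 patients
osce_first_stage = 0.38   # OSCE, 6 stations

# Doubling the number of cases (4 -> 8 patients, 6 -> 12 stations).
osler_full = spearman_brown(osler_first_stage, 2)
osce_full = spearman_brown(osce_first_stage, 2)

print(round(osler_full, 2), round(osce_full, 2))
```

Under this assumption the projected values round to the full-length coefficients reported in the Results (0.77 and 0.55), which suggests the gain from the second stage is roughly what lengthening the test alone would predict.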
Original language | English |
---|---|
Pages (from-to) | 741–747 |
Number of pages | 7 |
Journal | Medical Education |
Volume | 45 |
Issue number | 7 |
DOIs | |
Publication status | Published - Jul 2011 |