TitleUsing Automated Item Generation in a Large-scale Medical Licensure Exam Program: Lessons Learned.
Publication TypeConference Paper
Year of Publication2017
AuthorsDe Champlain, AF
Conference Name2017 IACAT Conference
Date Published08.2017
PublisherNiigata Seiryo University
Conference LocationNiigata, Japan
KeywordsAutomated item generation, large scale, medical licensure

On-demand testing has become commonplace with most large-scale testing programs. Continuous testing is appealing for candidates in that it affords greater flexibility in scheduling a session at the desired location. Furthermore, the push for more comprehensive systems of assessment (e.g. CBAL) is predicated on the availability of more frequently administered tasks given the purposeful link between instruction and assessment in these frameworks. However, continuous testing models impose several challenges to programs, including overexposure of items. Robust item banks are therefore needed to support routine retirement and replenishment of items. In a traditional approach to developing items, content experts select a topic and then develop an item consisting of a stem, lead-in question, a correct answer and list of distractors. The item then undergoes review by a panel of experts to validate the content and identify any potential flaws. The process involved in developing quality MCQ items can be time-consuming as well as costly, with estimates as high as $1500-$2500 USD per item (Rudner, 2010). The Medical Council of Canada (MCC) has been exploring a novel item development process to supplement traditional approaches. Specifically, the use of automated item generation (AIG), which uses technology to generate test items from cognitive models, has been studied for over five years. Cognitive models are representations of the knowledge and skills that are required to solve any given problem. While developing a cognitive model for a medical scenario, for example, content experts are asked to deconstruct the (clinical) reasoning process involved via clearly stated variables and related elements. The latter information is then entered into a computer program that uses algorithms to generate MCQs. The MCC has been piloting AIG –based items for over five years with the MCC Qualifying Examination Part I (MCCQE I), a pre-requisite for licensure in Canada. The aim of this presentation is to provide an overview of the practical lessons learned in the use and operational rollout of AIG with the MCCQE I. Psychometrically, the quality of the items is at least equal, and in many instances superior, to that of traditionally written MCQs, based on difficulty, discrimination, and information. In fact, 96% of the AIG based items piloted in a recent administration were retained for future operational scoring based on pre-defined inclusion criteria. AIG also offers a framework for the systematic creation of plausible distractors, in that the content experts not only need to provide the clinical reasoning underlying a correct response but also the cognitive errors associated with each of the distractors (Lai et al. 2016). Consequently, AIG holds great promise in regard to improving and tailoring diagnostic feedback for remedial purposes (Pugh, De Champlain, Gierl, Lai, Touchie, 2016). Furthermore, our test development process has been greatly enhanced by the addition of AIG as it requires that item writers use metacognitive skills to describe how they solve problems. We are hopeful that sharing our experiences with attendees might not only help other testing organizations interested in adopting AIG, but also foster discussion which might benefit all participants.


Lai, H., Gierl, M.J., Touchie, C., Pugh, D., Boulais, A.P., & De Champlain, A.F. (2016). Using automatic item generation to improve the quality of MCQ distractors. Teaching and Learning in Medicine, 28, 166-173.

Pugh, D., De Champlain, A.F., Lai, H., Gierl, M., & Touchie, C. (2016). Using cognitive models to develop quality multiple choice questions. Medical Teacher, 38, 838-843.

Rudner, L. (2010). Implementing the Graduate Management Admission Test Computerized Adaptive Test. In W. van der Linden & C. Glass (Eds.), Elements of adaptive testing (pp. 151-165). New York, NY: Springer. 

Presentation Video