%0 Generic
%D 2011
%T Item banks for measuring emotional distress from the Patient-Reported Outcomes Measurement Information System (PROMIS®): depression, anxiety, and anger
%A Pilkonis, P. A.
%A Choi, S. W.
%A Reise, S. P.
%A Stover, A. M.
%A Riley, W. T.
%A Cella, D.
%B Assessment
%@ 1073-1911
%G eng
%& June 21, 2011

%0 Journal Article
%J Educational and Psychological Measurement
%D 2010
%T A new stopping rule for computerized adaptive testing
%A Choi, S. W.
%A Grady, M. W.
%A Dodd, B. G.
%X The goal of the current study was to introduce a new stopping rule for computerized adaptive testing. The predicted standard error reduction stopping rule (PSER) uses the predictive posterior variance to determine the reduction in standard error that would result from the administration of additional items. The performance of the PSER was compared to that of the minimum standard error stopping rule and a modified version of the minimum information stopping rule in a series of simulated adaptive tests, drawn from a number of item pools. Results indicate that the PSER makes efficient use of CAT item pools, administering fewer items when predictive gains in information are small and increasing measurement precision when information is abundant.
%B Educational and Psychological Measurement
%7 2011/02/01
%V 70
%P 1-17
%8 Dec 1
%@ 0013-1644 (Print) 0013-1644 (Linking)
%G Eng
%M 21278821
%2 3028267

%0 Book Section
%D 2009
%T A burdened CAT: Incorporating response burden with maximum Fisher's information for item selection
%A Swartz, R.J.
%A Choi, S. W.
%X Widely used in various educational and vocational assessment applications, computerized adaptive testing (CAT) has recently begun to be used to measure patient-reported outcomes. Although successful in reducing respondent burden, most current CAT algorithms do not formally consider it as part of the item selection process.
This study used a loss function approach motivated by decision theory to develop an item selection method that incorporates respondent burden into the item selection process based on maximum Fisher information (MFI) item selection. Several different loss functions placing varying degrees of importance on respondent burden were compared, using an item bank of 62 polytomous items measuring depressive symptoms. One dataset consisted of the real responses from the 730 subjects who responded to all the items. A second dataset consisted of simulated responses to all the items based on a grid of latent trait scores with replicates at each grid point. Compared with using MFI alone, the algorithm enables a CAT administrator to control respondent burden more efficiently without severely affecting measurement precision. In particular, the loss function incorporating respondent burden protected respondents from receiving longer tests when their estimated trait score fell in a region where there were few informative items.
%C In D. J. Weiss (Ed.), Proceedings of the 2009 GMAC Conference on Computerized Adaptive Testing.
%G eng

%0 Journal Article
%J Applied Psychological Measurement
%D 2009
%T Comparison of CAT item selection criteria for polytomous items
%A Choi, S. W.
%A Swartz, R.J.
%B Applied Psychological Measurement
%V 33
%P 419–440
%G eng

%0 Journal Article
%J Applied Psychological Measurement
%D 2009
%T Firestar: Computerized adaptive testing simulation program for polytomous IRT models
%A Choi, S. W.
%B Applied Psychological Measurement
%7 2009/12/17
%V 33
%P 644-645
%8 Nov 1
%@ 1552-3497 (Electronic) 0146-6216 (Linking)
%G Eng
%M 20011609
%2 2790213

%0 Journal Article
%J Applied Psychological Measurement
%D 2009
%T Firestar: Computerized adaptive testing simulation program for polytomous IRT models
%A Choi, S. W.
%B Applied Psychological Measurement
%V 33
%P 644–645
%G eng

%0 Journal Article
%J Spine
%D 2008
%T Letting the CAT out of the bag: Comparing computer adaptive tests and an 11-item short form of the Roland-Morris Disability Questionnaire
%A Cook, K. F.
%A Choi, S. W.
%A Crane, P. K.
%A Deyo, R. A.
%A Johnson, K. L.
%A Amtmann, D.
%K *Disability Evaluation
%K *Health Status Indicators
%K Adult
%K Aged
%K Aged, 80 and over
%K Back Pain/*diagnosis/psychology
%K Calibration
%K Computer Simulation
%K Diagnosis, Computer-Assisted/*standards
%K Humans
%K Middle Aged
%K Models, Psychological
%K Predictive Value of Tests
%K Questionnaires/*standards
%K Reproducibility of Results
%X STUDY DESIGN: A post hoc simulation of a computer adaptive administration of the items of a modified version of the Roland-Morris Disability Questionnaire. OBJECTIVE: To evaluate the effectiveness of adaptive administration of back pain-related disability items compared with a fixed 11-item short form. SUMMARY OF BACKGROUND DATA: Short form versions of the Roland-Morris Disability Questionnaire have been developed. An alternative to paper-and-pencil short forms is to administer items adaptively so that items are presented based on a person's responses to previous items. Theoretically, this allows precise estimation of back pain disability with administration of only a few items. MATERIALS AND METHODS: Data were gathered from 2 previously conducted studies of persons with back pain. An item response theory model was used to calibrate scores based on all items, items of a paper-and-pencil short form, and several computer adaptive tests (CATs). RESULTS: Correlations between each CAT condition and scores based on a 23-item version of the Roland-Morris Disability Questionnaire ranged from 0.93 to 0.98. Compared with an 11-item short form, an 11-item CAT produced scores that were significantly more highly correlated with scores based on the 23-item scale.
CATs with even fewer items also produced scores that were highly correlated with scores based on all items. For example, scores from a 5-item CAT had a correlation of 0.93 with full scale scores. Seven- and 9-item CATs correlated at 0.95 and 0.97, respectively. A CAT with a standard-error-based stopping rule produced scores that correlated at 0.95 with full scale scores. CONCLUSION: A CAT-based back pain-related disability measure may be a valuable tool for use in clinical and research contexts. Use of CAT for other common measures in back pain research, such as other functional scales or measures of psychological distress, may offer similar advantages.
%B Spine
%7 2008/05/23
%V 33
%P 1378-83
%8 May 20
%@ 1528-1159 (Electronic)
%G eng
%M 18496352

%0 Journal Article
%J Quality of Life Research
%D 2007
%T The future of outcomes measurement: item banking, tailored short-forms, and computerized adaptive assessment
%A Cella, D.
%A Gershon, R. C.
%A Lai, J-S.
%A Choi, S. W.
%X The use of item banks and computerized adaptive testing (CAT) begins with clear definitions of important outcomes, and references those definitions to specific questions gathered into large and well-studied pools, or “banks” of items. Items can be selected from the bank to form customized short scales, or can be administered in a sequence and length determined by a computer programmed for precision and clinical relevance. Although far from perfect, such item banks can form a common definition and understanding of human symptoms and functional problems such as fatigue, pain, depression, mobility, social function, sensory function, and many other health concepts that we can only measure by asking people directly. The support of the National Institutes of Health (NIH), as witnessed by its cooperative agreement with measurement experts through the NIH Roadmap Initiative known as PROMIS (www.nihpromis.org), is a big step in that direction.
Our approach to item banking and CAT is practical, as focused on application as it is on science or theory. From a practical perspective, we frequently must decide whether to re-write and retest an item, add more items to fill gaps (often at the ceiling of the measure), re-test a bank after some modifications, or split up a bank into units that are more unidimensional, yet less clinically relevant or complete. These decisions are not easy, and yet they are rarely unforgiving. We encourage people to build practical tools that are capable of producing multiple short form measures and CAT administrations from common banks, and to further our understanding of these banks with various clinical populations and ages, so that with time the scores that emerge from these many activities begin to have not only a common metric and range, but a shared meaning and understanding across users. In this paper, we provide an overview of item banking and CAT, discuss our approach to item banking and its byproducts, describe testing options, discuss an example of CAT for fatigue, and discuss models for long term sustainability of an entity such as PROMIS. Some barriers to success include limitations in the methods themselves, controversies and disagreements across approaches, and end-user reluctance to move away from the familiar.
%B Quality of Life Research
%V 16
%P 133-141
%@ 0962-9343
%G eng