%0 Journal Article %J Quality of Life Research %D 2017 %T The validation of a computer-adaptive test (CAT) for assessing health-related quality of life in children and adolescents in a clinical sample: study design, methods and first results of the Kids-CAT study %A Barthel, D. %A Otto, C. %A Nolte, S. %A Meyrose, A.-K. %A Fischer, F. %A Devine, J. %A Walter, O. %A Mierke, A. %A Fischer, K. I. %A Thyen, U. %A Klein, M. %A Ankermann, T. %A Rose, M. %A Ravens-Sieberer, U. %X Recently, we developed a computer-adaptive test (CAT) for assessing health-related quality of life (HRQoL) in children and adolescents: the Kids-CAT. It measures five generic HRQoL dimensions. The aims of this article were (1) to present the study design and (2) to investigate its psychometric properties in a clinical setting. %B Quality of Life Research %V 26 %P 1105–1117 %8 May %U https://doi.org/10.1007/s11136-016-1437-9 %R 10.1007/s11136-016-1437-9 %0 Journal Article %J Journal of Educational Measurement %D 2015 %T Variable-Length Computerized Adaptive Testing Using the Higher Order DINA Model %A Hsu, Chia-Ling %A Wang, Wen-Chung %X Cognitive diagnosis models provide profile information about a set of latent binary attributes, whereas item response models yield a summary report on a latent continuous trait. To utilize the advantages of both models, higher order cognitive diagnosis models were developed in which information about both latent binary attributes and latent continuous traits is available. To facilitate the utility of cognitive diagnosis models, corresponding computerized adaptive testing (CAT) algorithms were developed. Most of them adopt the fixed-length rule to terminate CAT and are limited to ordinary cognitive diagnosis models. In this study, the higher order deterministic-input, noisy-and-gate (DINA) model was used as an example, and three criteria based on the minimum-precision termination rule were implemented: one for the latent class, one for the latent trait, and the other for both. The simulation results demonstrated that all of the termination criteria were successful when items were selected according to the Kullback-Leibler information and the posterior-weighted Kullback-Leibler information, and the minimum-precision rule outperformed the fixed-length rule with a similar test length in recovering the latent attributes and the latent trait. %B Journal of Educational Measurement %V 52 %P 125–143 %U http://dx.doi.org/10.1111/jedm.12069 %R 10.1111/jedm.12069 %0 Journal Article %J Applied Psychological Measurement %D 2013 %T Variable-Length Computerized Adaptive Testing Based on Cognitive Diagnosis Models %A Hsu, Chia-Ling %A Wang, Wen-Chung %A Chen, Shu-Ying %X

Interest in developing computerized adaptive testing (CAT) under cognitive diagnosis models (CDMs) has increased recently. CAT algorithms that use a fixed-length termination rule frequently lead to different degrees of measurement precision for different examinees. Fixed precision, in which the examinees receive the same degree of measurement precision, is a major advantage of CAT over nonadaptive testing. In addition to the precision issue, test security is another important issue in practical CAT programs. In this study, the authors implemented two termination criteria for the fixed-precision rule and evaluated their performance under two popular CDMs using simulations. The results showed that using the two criteria with the posterior-weighted Kullback–Leibler information procedure for selecting items could achieve the prespecified measurement precision. A control procedure was developed to control item exposure and test overlap simultaneously among examinees. The simulation results indicated that in contrast to no method of controlling exposure, the control procedure developed in this study could maintain item exposure and test overlap at the prespecified level at the expense of only a few more items.

%B Applied Psychological Measurement %V 37 %P 563-582 %U http://apm.sagepub.com/content/37/7/563.abstract %R 10.1177/0146621613488642 %0 Journal Article %J Health and Quality of Life Outcomes %D 2010 %T Validation of a computer-adaptive test to evaluate generic health-related quality of life %A Rebollo, P. %A Castejon, I. %A Cuervo, J. %A Villa, G. %A Garcia-Cueto, E. %A Diaz-Cuervo, H. %A Zardain, P. C. %A Muniz, J. %A Alonso, J. %X BACKGROUND: Health Related Quality of Life (HRQoL) is a relevant variable in the evaluation of health outcomes. Questionnaires based on Classical Test Theory typically require a large number of items to evaluate HRQoL. Computer Adaptive Testing (CAT) can be used to reduce test length while maintaining and, in some cases, improving accuracy. This study aimed at validating a CAT based on Item Response Theory (IRT) for the evaluation of generic HRQoL: the CAT-Health instrument. METHODS: Cross-sectional study of subjects aged over 18 attending Primary Care Centres for any reason. CAT-Health was administered along with the SF-12 Health Survey. Age, gender and a checklist of chronic conditions were also collected. CAT-Health was evaluated considering: 1) feasibility: completion time and test length; 2) content range coverage, Item Exposure Rate (IER) and test precision; and 3) construct validity: differences in the CAT-Health scores according to clinical variables and correlations between both questionnaires. RESULTS: 396 subjects answered CAT-Health and the SF-12; 67.2% were female, and the mean age (SD) was 48.6 (17.7) years. 36.9% did not report any chronic condition. Median completion time for CAT-Health was 81 seconds (interquartile range = 59-118), and it increased with age (p < 0.001). The median number of items administered was 8 (interquartile range = 6-10). Neither ceiling nor floor effects were found for the score. None of the items in the pool had an IER of 100%, and it was over 5% for 27.1% of the items. The Test Information Function (TIF) peaked between levels -1 and 0 of HRQoL. Statistically significant differences were observed in the CAT-Health scores according to the number and type of conditions. CONCLUSIONS: Although domain-specific CATs exist for various areas of HRQoL, CAT-Health is one of the first IRT-based CATs designed to evaluate generic HRQoL, and it has proven feasible, valid and efficient when administered to a broad sample of individuals attending primary care settings. %B Health and Quality of Life Outcomes %7 2010/12/07 %V 8 %P 147 %@ 1477-7525 (Electronic); 1477-7525 (Linking) %G eng %M 21129169 %2 3022567 %0 Journal Article %J Applied Psychological Measurement %D 2010 %T Variations on Stochastic Curtailment in Sequential Mastery Testing %A Finkelman, Matthew David %X

In sequential mastery testing (SMT), assessment via computer is used to classify examinees into one of two mutually exclusive categories. Unlike paper-and-pencil tests, SMT has the capability to use variable-length stopping rules. One approach to shortening variable-length tests is stochastic curtailment, which halts examination if the probability of changing the classification decision is low. The estimation of such a probability is therefore a critical component of a stochastically curtailed test. This article examines several variations on stochastic curtailment in which the key probability is estimated more aggressively than in the standard formulation, resulting in additional savings in average test length (ATL). In two simulation sets, the variations successfully reduced the ATL, and in many cases the average loss, compared with the standard formulation.

%B Applied Psychological Measurement %V 34 %P 27-45 %U http://apm.sagepub.com/content/34/1/27.abstract %R 10.1177/0146621609336113 %0 Journal Article %J Psychological Services %D 2009 %T Validation of the MMPI-2 computerized adaptive version (MMPI-2-CA) in a correctional intake facility %A Forbey, J. D. %A Ben-Porath, Y. S. %A Gartland, D. %X Computerized adaptive testing in personality assessment can improve efficiency by significantly reducing the number of items administered to answer an assessment question. The time savings afforded by this technique could be of particular benefit in settings where large numbers of psychological screenings are conducted, such as correctional facilities. In the current study, item and time savings, as well as the test–retest and extratest correlations associated with an audio-augmented administration of all the scales of the Minnesota Multiphasic Personality Inventory-2 Computerized Adaptive version (MMPI-2-CA), are reported. Participants were 366 men, ages 18 to 62 years (M = 33.04, SD = 10.40), undergoing intake into a large Midwestern state correctional facility. Results indicate considerable item and corresponding time savings for the MMPI-2-CA compared with conventional administration of the test, as well as comparability in terms of test–retest correlations and correlations with external measures. Future directions for adaptive personality testing are discussed. %B Psychological Services %V 6 %P 279-292 %@ 1939-148X %G eng %0 Book Section %D 2007 %T Validity and decision issues in selecting a CAT measurement model %A Olsen, J. B. %A Bunderson, C. V. %C D. J. Weiss (Ed.), Proceedings of the 2007 GMAC Conference on Computerized Adaptive Testing. %0 Thesis %D 2006 %T Validitätssteigerungen durch adaptives Testen [Increasing validity by adaptive testing] %A Frey, A. %G eng %9 Doctoral %0 Conference Paper %B Paper presented at the annual meeting of the American Educational Research Association %D 2006 %T A variant of the progressive restricted item exposure control procedure in computerized adaptive testing systems based on the 3PL and the partial credit model %A McClarty, L. K. %A Sperling, R. %A Dodd, B. G. %B Paper presented at the annual meeting of the American Educational Research Association %C San Francisco %G eng %0 Journal Article %J Psychological Assessment %D 2005 %T Validation of a computerized adaptive version of the Schedule for Nonadaptive and Adaptive Personality (SNAP) %A Simms, L. J. %A Clark, L. A. %X This is a validation study of a computerized adaptive (CAT) version of the Schedule for Nonadaptive and Adaptive Personality (SNAP) conducted with 413 undergraduates who completed the SNAP twice, 1 week apart. Participants were assigned randomly to 1 of 4 retest groups: (a) paper-and-pencil (P&P) SNAP, (b) CAT, (c) P&P/CAT, and (d) CAT/P&P. With the number of items held constant, computerized administration had little effect on descriptive statistics, rank ordering of scores, reliability, and concurrent validity, but was preferred over P&P administration by most participants. CAT administration yielded somewhat lower precision and validity than P&P administration, but required 36% to 37% fewer items and 58% to 60% less time to complete. These results confirm not only key findings from previous CAT simulation studies of personality measures but extend them for the first time to a live assessment setting.
%B Psychological Assessment %V 17(1) %P 28-43 %G eng %0 Journal Article %J Quality of Life Research %D 2004 %T Validating the German computerized adaptive test for anxiety on a healthy sample (A-CAT) %A Becker, J. %A Walter, O. B. %A Fliege, H. %A Bjorner, J. B. %A Kocalevent, R. D. %A Schmid, G. %A Klapp, B. F. %A Rose, M. %B Quality of Life Research %V 13 %P 1515 %G eng %0 Journal Article %J Educational Measurement: Issues and Practice %D 2001 %T Validity issues in computer-based testing %A Huff, K. L. %A Sireci, S. G. %B Educational Measurement: Issues and Practice %V 20(3) %P 16-25 %G eng %0 Generic %D 2000 %T Variations in mean response times for questions on the computer-adaptive GRE General Test: Implications for fair assessment (GRE Board Professional Report No. 96-20P; Educational Testing Service Research Report 00-7) %A Bridgeman, B. %A Cline, F. %C Princeton NJ: Educational Testing Service %G eng %0 Conference Paper %B annual meeting of the American Educational Research Association %D 1997 %T Validation of CATSIB to investigate DIF of CAT data %A Nandakumar, R. %A Roussos, L. A. %K computerized adaptive testing %X This paper investigates the performance of CATSIB (a modified version of the SIBTEST computer program) in assessing differential item functioning (DIF) in the context of computerized adaptive testing (CAT). One of the distinguishing features of CATSIB is its theoretically built-in regression correction, which controls Type I error rates when the distributions of the reference and focal groups differ on the intended ability; this phenomenon is also called impact. The Type I error rate of CATSIB with the regression correction (WRC) was compared with that of CATSIB without the regression correction (WORC) to see whether the regression correction was indeed effective. Also of interest was the power level of CATSIB after the regression correction. The subtest size was set at 25 items, and the sample size, impact level, and amount of DIF were varied. Results show that the regression correction was very useful in controlling the Type I error rate: CATSIB WORC had inflated observed Type I error rates, especially when impact levels were high, whereas CATSIB WRC had observed Type I error rates very close to the nominal level of 0.05. The power rates of CATSIB WRC were impressive. As expected, power increased as the sample size increased and as the amount of DIF increased.
Even for small samples with high impact rates, power rates were 64% or higher for high DIF levels. For large samples, power rates were over 90% for high DIF levels. %B annual meeting of the American Educational Research Association %C Chicago, IL, USA %G eng %M ED409332 %0 Book Section %D 1997 %T Validation of the experimental CAT-ASVAB system %A Segall, D. O. %A Moreno, K. E. %A Kieckhaefer, W. F. %A Vicino, F. L. %A McBride, J. R. %C W. A. Sands, B. K. Waters, and J. R. McBride (Eds.), Computerized adaptive testing: From inquiry to operation. Washington, DC: American Psychological Association. %G eng %0 Journal Article %J Teaching and Learning in Medicine %D 1996 %T Validity of item selection: A comparison of automated computerized adaptive and manual paper and pencil examinations %A Lunz, M. E. %A Deville, C. W. %B Teaching and Learning in Medicine %V 8 %P 152-157 %G eng %0 Book Section %D 1990 %T Validity %A Steinberg, L. %A Thissen, D. %A Wainer, H. %C H. Wainer (Ed.), Computerized adaptive testing: A primer (pp. 187-231). Hillsdale NJ: Erlbaum. %G eng %0 Generic %D 1990 %T Validity study in multidimensional latent space and efficient computerized adaptive testing (Final Report R01-1069-11-004-91) %A Samejima, F. %C Knoxville TN: University of Tennessee, Department of Psychology %G eng %0 Conference Paper %B Paper presented at the annual meeting of the American Psychological Association %D 1985 %T Validity of adaptive testing: A summary of research results %A Sympson, J. B. %A Moreno, K. E. %B Paper presented at the annual meeting of the American Psychological Association %G eng %0 Conference Paper %B Proceedings of the 27th Annual Conference of the Military Testing Association %D 1985 %T A validity study of the computerized adaptive testing version of the Armed Services Vocational Aptitude Battery %A Moreno, K. E. %A Segall, D. O. %A Kieckhaefer, W. F. %B Proceedings of the 27th Annual Conference of the Military Testing Association %G eng %0 Generic %D 1981 %T A validity comparison of adaptive and conventional strategies for mastery testing (Research Report 81-3) %A Kingsbury, G. G. %A Weiss, D. J. %C Minneapolis MN: University of Minnesota, Department of Psychology, Psychometric Methods Program, Computerized Adaptive Testing Laboratory %G eng %0 Book Section %D 1980 %T A validity study of an adaptive test of reading comprehension %A Hornke, L. F. %A Sauter, M. B. %C D. J. Weiss (Ed.), Proceedings of the 1979 Computerized Adaptive Testing Conference (pp. 57-67). Minneapolis MN: University of Minnesota, Department of Psychology, Psychometric Methods Program. %G eng %0 Journal Article %J Educational and Psychological Measurement %D 1974 %T The validity of Bayesian tailored testing %A Jensema, C. J. %B Educational and Psychological Measurement %V 34 %P 757-766 %G eng