TY - JOUR T1 - Computerized adaptive testing to screen children for emotional and behavioral problems by preventive child healthcare JF - BMC Pediatrics Y1 - 2020 A1 - Theunissen, Meinou H.C. A1 - de Wolff, Marianne S. A1 - Deurloo, Jacqueline A. A1 - Vogels, Anton G. C. AB -

Background

Questionnaires to detect emotional and behavioral problems (EBP) in Preventive Child Healthcare (PCH) should be short, which potentially affects their validity and reliability. Simulation studies have shown that Computerized Adaptive Testing (CAT) could overcome these weaknesses. We studied the applicability (using participation rate, satisfaction, and efficiency as measures) and the validity of CAT in routine PCH practice.

Methods

We analyzed data on 461 children aged 10–11 years (response 41%), who were assessed during routine well-child examinations by PCH professionals. Before the visit, parents completed the CAT and the Child Behavior Checklist (CBCL). Satisfaction was measured by parent and PCH professional report. Efficiency of the CAT procedure was measured as the number of items needed to assess whether or not a child has serious problems. Its validity was assessed using the CBCL as the criterion.

Results

Parents and PCH professionals rated the CAT on average as good. The procedure required on average 16 items to assess whether or not a child has serious problems. Agreement of scores on the CAT scales with the corresponding CBCL scales was high (range of Spearman correlations 0.59–0.72). Areas under the curve (AUC) were high (range: 0.95–0.97) for the Psycat total, externalizing, and hyperactivity scales using the corresponding CBCL scale scores as the criterion. For the Psycat internalizing scale the AUC was somewhat lower but still high (0.86).

Conclusions

CAT is a valid procedure for the identification of emotional and behavioral problems in children aged 10–11 years. It may support the efficient and accurate identification of children with overall, and potentially also specific, emotional and behavioral problems in routine PCH.
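The Spearman correlations and AUCs reported in the Results above are standard criterion-validity summaries. As a minimal, hypothetical sketch (the variable names, values, and clinical cutoff below are illustrative assumptions, not the study's data), they could be computed as:

```python
import numpy as np
from scipy.stats import spearmanr
from sklearn.metrics import roc_auc_score

# Hypothetical data: CAT scale scores, corresponding CBCL scale scores,
# and a dichotomized CBCL criterion (1 = clinical range).
cat_score = np.array([0.2, 1.4, -0.3, 2.1, 0.8])
cbcl_score = np.array([3, 14, 1, 22, 9])
cbcl_case = (cbcl_score >= 12).astype(int)  # illustrative cutoff, not the study's

print("Spearman r:", spearmanr(cat_score, cbcl_score).correlation)
print("AUC:", roc_auc_score(cbcl_case, cat_score))
```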

VL - 20 UR - https://bmcpediatr.biomedcentral.com/articles/10.1186/s12887-020-2018-1 IS - Article number: 119 ER - TY - JOUR T1 - Computerized Adaptive Testing for Cognitively Based Multiple-Choice Data JF - Applied Psychological Measurement Y1 - 2019 A1 - Hulya D. Yigit A1 - Miguel A. Sorrel A1 - Jimmy de la Torre AB - Cognitive diagnosis models (CDMs) are latent class models that hold great promise for providing diagnostic information about student knowledge profiles. The increasing use of computers in classrooms enhances the advantages of CDMs for more efficient diagnostic testing by using adaptive algorithms, referred to as cognitive diagnosis computerized adaptive testing (CD-CAT). When multiple-choice items are involved, CD-CAT can be further improved by using polytomous scoring (i.e., considering the specific options students choose), instead of dichotomous scoring (i.e., marking answers as either right or wrong). In this study, the authors propose and evaluate the performance of the Jensen–Shannon divergence (JSD) index as an item selection method for the multiple-choice deterministic inputs, noisy “and” gate (MC-DINA) model. Attribute classification accuracy and item usage are evaluated under different conditions of item quality and test termination rule. The proposed approach is compared with the random selection method and an approximate approach based on dichotomized responses. The results show that under the MC-DINA model, JSD improves the attribute classification accuracy significantly by considering the information from distractors, even with a very short test length. This result has important implications in practical classroom settings as it can allow for dramatically reduced testing times, thus resulting in more targeted learning opportunities. VL - 43 UR - https://doi.org/10.1177/0146621618798665 ER - TY - JOUR T1 - Computerized Adaptive Testing in Early Education: Exploring the Impact of Item Position Effects on Ability Estimation JF - Journal of Educational Measurement Y1 - 2019 A1 - Albano, Anthony D. A1 - Cai, Liuhan A1 - Lease, Erin M. A1 - McConnell, Scott R. AB - Studies have shown that item difficulty can vary significantly based on the context of an item within a test form. In particular, item position may be associated with practice and fatigue effects that influence item parameter estimation. The purpose of this research was to examine the relevance of item position specifically for assessments used in early education, an area of testing that has received relatively limited psychometric attention. In an initial study, multilevel item response models fit to data from an early literacy measure revealed statistically significant increases in difficulty for items appearing later in a 20-item form. The estimated linear change in logits for an increase of 1 in position was .024, resulting in a predicted change of .46 logits for a shift from the beginning to the end of the form. A subsequent simulation study examined impacts of item position effects on person ability estimation within computerized adaptive testing. Implications and recommendations for practice are discussed. VL - 56 UR - https://onlinelibrary.wiley.com/doi/abs/10.1111/jedm.12215 ER - TY - JOUR T1 - A Comparison of Constraint Programming and Mixed-Integer Programming for Automated Test-Form Generation JF - Journal of Educational Measurement Y1 - 2018 A1 - Li, Jie A1 - van der Linden, Wim J.
AB - The final step of the typical process of developing educational and psychological tests is to place the selected test items in a formatted form. The step involves the grouping and ordering of the items to meet a variety of formatting constraints. As this activity tends to be time-intensive, the use of mixed-integer programming (MIP) has been proposed to automate it. The goal of this article is to show how constraint programming (CP) can be used as an alternative to automate test-form generation problems with a large variety of formatting constraints, and how it compares with MIP-based form generation with respect to its models, solutions, and running times. Two empirical examples are presented: (i) automated generation of a computerized fixed-form; and (ii) automated generation of shadow tests for multistage testing. Both examples show that CP works well, with feasible solutions and running times likely to be better than those for MIP-based applications. VL - 55 UR - https://onlinelibrary.wiley.com/doi/abs/10.1111/jedm.12187 ER - TY - JOUR T1 - Constructing Shadow Tests in Variable-Length Adaptive Testing JF - Applied Psychological Measurement Y1 - 2018 A1 - Qi Diao A1 - Hao Ren AB - Imposing content constraints is very important in most operational computerized adaptive testing (CAT) programs in educational measurement. The shadow test approach to CAT (Shadow CAT) offers an elegant solution to imposing statistical and nonstatistical constraints by projecting future consequences of item selection. The original form of Shadow CAT presumes fixed test lengths. The goal of the current study was to extend Shadow CAT to tests under variable-length termination conditions and evaluate its performance relative to other content balancing approaches. The study demonstrated the feasibility of constructing Shadow CAT with variable test lengths and in operational CAT programs. The results indicated the superiority of the approach compared with other content balancing methods. VL - 42 UR - https://doi.org/10.1177/0146621617753736 ER - TY - JOUR T1 - A Continuous a-Stratification Index for Item Exposure Control in Computerized Adaptive Testing JF - Applied Psychological Measurement Y1 - 2018 A1 - Alan Huebner A1 - Chun Wang A1 - Bridget Daly A1 - Colleen Pinkelman AB - The method of a-stratification aims to reduce item overexposure in computerized adaptive testing, as items that are administered at very high rates may threaten the validity of test scores. In existing methods of a-stratification, the item bank is partitioned into a fixed number of nonoverlapping strata according to the items’ a, or discrimination, parameters. This article introduces a continuous a-stratification index which incorporates exposure control into the item selection index itself and thus eliminates the need for fixed discrete strata. The new continuous a-stratification index is compared with existing stratification methods via simulation studies in terms of ability estimation bias, mean squared error, and control of item exposure rates. VL - 42 UR - https://doi.org/10.1177/0146621618758289 ER - TY - CONF T1 - Is CAT Suitable for Automated Speaking Test? T2 - IACAT 2017 Conference Y1 - 2017 A1 - Shingo Imai KW - Automated Speaking Test KW - CAT KW - language testing AB -

We have developed an automated scoring system for Japanese speaking proficiency, SJ-CAT (Speaking Japanese Computerized Adaptive Test), which has been operational for the last few months. One of its unique features is that it is an adaptive test based on polytomous IRT.

SJ-CAT consists of two sections: Section 1 has sentence reading-aloud tasks and multiple-choice reading tasks, and Section 2 has sentence generation tasks and open-answer tasks. In a reading-aloud task, a test taker reads a phoneme-balanced sentence on the screen after listening to a model reading. In a multiple-choice reading task, a test taker sees a picture and reads aloud the one sentence among three on the screen that describes the scene most appropriately. In a sentence generation task, a test taker sees a picture or watches a video clip and describes the scene in his/her own words for about ten seconds. In an open-answer task, the test taker expresses support for or opposition to a topic (e.g., nuclear power generation) and gives reasons for about 30 seconds.

In the course of developing the test, we found many unexpected and unique characteristics of a speaking CAT that are not found in usual multiple-choice CATs. In this presentation, we discuss some of the factors that we had not noticed in our previous project, the dichotomous J-CAT (Japanese Computerized Adaptive Test), which consists of vocabulary, grammar, reading, and listening sections. First, we claim that the distribution of item difficulty parameters depends on the types of items: with an item pool of unrestricted item types, such as open questions, it is difficult to achieve an ideal distribution, whether normal or uniform. Second, contrary to our expectations, open questions are not necessarily more difficult to handle in an automated scoring system than more restricted tasks such as sentence reading, as long as a suitable scoring algorithm for open questions can be set up. Third, we show that the standard deviation of the posterior distribution (the standard error of the theta parameter) converges faster under the polytomous IRT model used for SJ-CAT than under the dichotomous IRT model used in J-CAT. Fourth, we discuss problems in equating items in SJ-CAT and suggest introducing deep learning with reinforcement learning instead of traditional equating. Finally, we discuss issues in operating SJ-CAT on the web, including scoring speed, operating costs, and security, among others.
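The third point above concerns how quickly the posterior standard deviation of theta shrinks under a polytomous model. SJ-CAT's actual scoring engine is not described here; the following is only a generic sketch, assuming generalized partial credit items and a standard normal prior, of how EAP estimates and their posterior SDs can be tracked after each response. All names are hypothetical.

```python
import numpy as np

def gpcm_probs(theta, a, b):
    """Category probabilities for one generalized partial credit item.
    b is the vector of step parameters; categories run from 0 to len(b)."""
    steps = np.concatenate(([0.0], np.cumsum(a * (theta - np.asarray(b)))))
    num = np.exp(steps - steps.max())  # subtract max for numerical stability
    return num / num.sum()

def eap_and_sd(responses, item_params, nodes=np.linspace(-4, 4, 81)):
    """EAP estimate and posterior SD given polytomous responses (standard normal prior)."""
    prior = np.exp(-0.5 * nodes ** 2)
    like = np.ones_like(nodes)
    for x, (a, b) in zip(responses, item_params):
        like *= np.array([gpcm_probs(t, a, b)[x] for t in nodes])
    post = prior * like
    post /= post.sum()
    eap = np.sum(nodes * post)
    sd = np.sqrt(np.sum((nodes - eap) ** 2 * post))
    return eap, sd

# Hypothetical example: two 4-category items, responses in categories 2 and 3
print(eap_and_sd([2, 3], [(1.2, [-1.0, 0.0, 1.0]), (0.9, [-0.5, 0.5, 1.5])]))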


JF - IACAT 2017 Conference PB - Niigata Seiryo University CY - Niigata, Japan ER - TY - CONF T1 - Comparison of Pretest Item Calibration Methods in a Computerized Adaptive Test (CAT) T2 - IACAT 2017 Conference Y1 - 2017 A1 - Huijuan Meng A1 - Chris Han KW - CAT KW - Pretest Item Calibration AB -

Calibration methods for pretest items in a computerized adaptive test (CAT) are not a new area of research inquiry. After decades of research on CAT, the fixed item parameter calibration (FIPC) method has been widely accepted and used by practitioners to address two CAT calibration issues: (a) the restricted ability range to which each item is exposed, and (b) a sparse response data matrix. In FIPC, the parameters of the operational items are fixed at their original values, and multiple expectation maximization (EM) cycles are used to estimate the parameters of the pretest items, with the prior ability distribution being updated multiple times (Ban, Hanson, Wang, Yi, & Harris, 2001; Kang & Petersen, 2009; Pommerich & Segall, 2003).

Another calibration method is the fixed person parameter calibration (FPPC) method proposed by Stocking (1988) as “Method A.” Under this approach, candidates’ ability estimates are fixed in the calibration of pretest items and define the scale on which the parameter estimates are reported. The logic of FPPC is suitable for CAT applications because the person parameters are estimated from the operational items and are readily available for pretest item calibration. In Stocking (1988), the FPPC was evaluated using the LOGIST computer program developed by Wood, Wingersky, and Lord (1976). Stocking reported that “Method A” produced larger root mean square errors (RMSEs) in the middle ability range than “Method B,” which required the use of anchor items (administered non-adaptively) and linking steps to attempt to correct for the potential scale drift due to the use of imperfect ability estimates.
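As a rough illustration of the FPPC logic only (not Stocking's or any software's actual implementation), the sketch below estimates a single 2PL pretest item's parameters by maximum likelihood while holding the operational-test theta estimates fixed; the function and variable names are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

def calibrate_pretest_item(theta_hat, x):
    """Fixed person parameter calibration sketch: estimate a 2PL pretest item's
    (a, b) by maximum likelihood, treating theta_hat (from operational items)
    as known constants. x holds the 0/1 responses to the pretest item."""
    theta_hat = np.asarray(theta_hat, dtype=float)
    x = np.asarray(x, dtype=float)

    def negloglik(params):
        a, b = params
        p = np.clip(expit(a * (theta_hat - b)), 1e-9, 1 - 1e-9)
        return -np.sum(x * np.log(p) + (1 - x) * np.log(1 - p))

    res = minimize(negloglik, x0=[1.0, 0.0], method="Nelder-Mead")
    return res.x  # estimated (a, b)
```

Operational tools such as flexMIRT use marginal ML or MH-RM estimation, 3PL models, and priors, but the core idea of conditioning on fixed person parameters is the same.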

Since then, new commercial software tools such as BILOG-MG and flexMIRT (Cai, 2013) have been developed to handle the FPPC method with different implementations (e.g., the MH-RM algorithm with flexMIRT). The performance of the FPPC method with those new software tools, however, has rarely been researched in the literature.

In our study, we evaluated the performance of these two pretest item calibration methods using flexMIRT. The FIPC and FPPC are compared under various CAT settings. Each simulated exam contains 75% operational items and 25% pretest items, and real item parameters are used to generate the CAT data. This study also addresses the lack of guidelines in the existing CAT item calibration literature regarding population ability shift and exam length (more accurate theta estimates are expected in longer exams). Thus, this study investigates the following four factors and their impact on parameter estimation accuracy: (1) candidate population changes (3 ability distributions); (2) exam length (20: 15 OP + 5 PT, 40: 30 OP + 10 PT, and 60: 45 OP + 15 PT); (3) data model fit (3PL and 3PL with fixed c); and (4) pretest item calibration sample sizes (300, 500, and 1000). This study’s findings will fill the gap in this area of research and thus provide new information on which practitioners can base their decisions when selecting a pretest calibration method for their exams.

References

Ban, J. C., Hanson, B. A., Wang, T., Yi, Q., & Harris, D. J. (2001). A comparative study of online pretest item calibration/scaling methods in computerized adaptive testing. Journal of Educational Measurement, 38(3), 191–212.

Cai, L. (2013). flexMIRT® Flexible Multilevel Multidimensional Item Analysis and Test Scoring (Version 2) [Computer software]. Chapel Hill, NC: Vector Psychometric Group.

Kang, T., & Petersen, N. S. (2009). Linking item parameters to a base scale (Research Report No. 2009-2). Iowa City, IA: ACT.

Pommerich, M., & Segall, D.O. (2003, April). Calibrating CAT pools and online pretest items using marginal maximum likelihood methods. Paper presented at the annual meeting of the National Council on Measurement in Education, Chicago, IL.

Stocking, M. L. (1988). Scale drift in online calibration (Research Report No. 88–28). Princeton, NJ: Educational Testing Service.

Wood, R. L., Wingersky, M. S., & Lord, F. M. (1976). LOGIST: A computer program for estimating examinee ability and item characteristic curve parameters (RM76-6) [Computer program]. Princeton, NJ: Educational Testing Service.


JF - IACAT 2017 Conference PB - Niigata Seiryo University CY - Niigata, Japan ER - TY - CONF T1 - A Comparison of Three Empirical Reliability Estimates for Computerized Adaptive Testing T2 - IACAT 2017 conference Y1 - 2017 A1 - Dong Gi Seo KW - CAT KW - Reliability AB -

Reliability estimates in computerized adaptive testing (CAT) are derived from estimated thetas and the standard errors of those estimates. In practice, the observed standard error (OSE) of an estimated theta can be obtained from the test information function for each examinee under item response theory (IRT). Unlike in classical test theory (CTT), OSEs in IRT are conditional values given each estimated theta, so these values must be marginalized to evaluate test reliability. The arithmetic mean, the harmonic mean, and Jensen equality were applied to marginalize the OSEs and estimate CAT reliability. Based on the different marginalization methods, three empirical CAT reliabilities were compared with true reliability. Results showed that all three empirical CAT reliabilities underestimated true reliability at short test lengths (< 40), whereas at long test lengths (> 40) the magnitude of the CAT reliabilities was ordered as Jensen equality, harmonic mean, and arithmetic mean. Specifically, Jensen equality overestimated true reliability across all conditions at long test lengths (> 50).
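One common way to turn conditional OSEs into a single marginal reliability is rel = var(theta_hat) / (var(theta_hat) + E[SE^2]), where the expectation over examinees can be taken with different averaging rules. The sketch below only illustrates that idea; the exact definitions used in the paper, in particular for the "Jensen equality" variant, may differ from the guesses coded here.

```python
import numpy as np

def empirical_reliability(theta_hat, se, method="arithmetic"):
    """Marginal CAT reliability from theta estimates and their conditional SEs (sketch)."""
    theta_hat = np.asarray(theta_hat, dtype=float)
    se = np.asarray(se, dtype=float)
    if method == "arithmetic":
        err_var = np.mean(se ** 2)                 # arithmetic mean of error variances
    elif method == "harmonic":
        err_var = len(se) / np.sum(1.0 / se ** 2)  # harmonic mean of error variances
    elif method == "jensen":
        err_var = np.mean(se) ** 2                 # assumption: square of the mean SE (Jensen-style bound)
    else:
        raise ValueError(f"unknown method: {method}")
    obs_var = np.var(theta_hat, ddof=1)
    return obs_var / (obs_var + err_var)

# Hypothetical example
rng = np.random.default_rng(1)
theta = rng.normal(0, 1, 500)
se = rng.uniform(0.25, 0.45, 500)
print({m: round(empirical_reliability(theta, se, m), 3) for m in ("arithmetic", "harmonic", "jensen")})
```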


JF - IACAT 2017 conference PB - Niigata Seiryo University CY - Niigata, Japan UR - https://drive.google.com/file/d/1gXgH-epPIWJiE0LxMHGiCAxZZAwy4dAH/view?usp=sharing ER - TY - JOUR T1 - Is a Computerized Adaptive Test More Motivating Than a Fixed-Item Test? JF - Applied Psychological Measurement Y1 - 2017 A1 - Guangming Ling A1 - Yigal Attali A1 - Bridgid Finn A1 - Elizabeth A. Stone AB - Computer adaptive tests provide important measurement advantages over traditional fixed-item tests, but research on the psychological reactions of test takers to adaptive tests is lacking. In particular, it has been suggested that test-taker engagement, and possibly test performance as a consequence, could benefit from the control that adaptive tests have on the number of test items examinees answer correctly. However, previous research on this issue found little support for this possibility. This study expands on previous research by examining this issue in the context of a mathematical ability assessment and by considering the possible effect of immediate feedback of response correctness on test engagement, test anxiety, time on task, and test performance. Middle school students completed a mathematics assessment under one of three test type conditions (fixed, adaptive, or easier adaptive) and either with or without immediate feedback about the correctness of responses. Results showed little evidence for test type effects. The easier adaptive test resulted in higher engagement and lower anxiety than either the adaptive or fixed-item tests; however, no significant differences in performance were found across test types, although performance was significantly higher across all test types when students received immediate feedback. In addition, these effects were not related to ability level, as measured by the state assessment achievement levels. The possibility that test experiences in adaptive tests may not in practice be significantly different than in fixed-item tests is raised and discussed to explain the results of this and previous studies. VL - 41 UR - https://doi.org/10.1177/0146621617707556 ER - TY - CONF T1 - Computerized Adaptive Testing for Cognitive Diagnosis in Classroom: A Nonparametric Approach T2 - IACAT 2017 Conference Y1 - 2017 A1 - Yuan-Pei Chang A1 - Chia-Yi Chiu A1 - Rung-Ching Tsai KW - CD-CAT KW - non-parametric approach AB -

In the past decade, CDMs of educational test performance have received increasing attention among educational researchers (for details, see Fu & Li, 2007, and Rupp, Templin, & Henson, 2010). CDMs of educational test performance decompose the ability domain of a given test into specific skills, called attributes, each of which an examinee may or may not have mastered. The resulting attribute profile documents the individual's strengths and weaknesses within the ability domain. Cognitive diagnostic computerized adaptive testing (CD-CAT) has been suggested by researchers as a diagnostic tool for assessment and evaluation (e.g., Cheng & Chang, 2007; Cheng, 2009; Liu, You, Wang, Ding, & Chang, 2013; Tatsuoka & Tatsuoka, 1997). While model-based CD-CAT is relatively well researched in the context of large-scale assessments, this type of system has not received the same degree of development in small-scale settings, where it would be most useful. The main challenge is that the statistical estimation techniques successfully applied to parametric CD-CAT require large samples to guarantee reliable calibration of item parameters and accurate estimation of examinees' attribute profiles. In response to this challenge, a nonparametric approach that does not require any parameter calibration, and thus can be used in small educational programs, is proposed. The proposed nonparametric CD-CAT relies on the same principle as the regular CAT algorithm, but uses the nonparametric classification method (Chiu & Douglas, 2013) to assess and update the student's ability state as the test proceeds. Based on a student's initial responses, a neighborhood of candidate proficiency classes is identified, and items not characteristic of the chosen proficiency classes are precluded from being chosen next. The response to the next item then allows for an update of the skill profile, and the set of possible proficiency classes is further narrowed. In this manner, the nonparametric CD-CAT cycles through item administration and update stages until the most likely proficiency class has been pinpointed. The simulation results show that the proposed method outperformed the parametric CD-CAT algorithms it was compared with, and the differences were significant when the item parameter calibration was not optimal.
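The nonparametric classification step referenced above (Chiu & Douglas, 2013) compares an examinee's responses with the ideal response pattern of every candidate attribute profile and picks the closest one. A minimal sketch under a conjunctive (DINA-like) ideal-response rule, with hypothetical names, follows; the paper's adaptive item-selection layer on top of this is not shown.

```python
import numpy as np
from itertools import product

def ideal_responses(q_matrix, alpha):
    """Conjunctive ideal response: 1 iff the profile alpha masters every attribute the item requires."""
    return np.all(alpha >= q_matrix, axis=1).astype(int)

def npc_classify(responses, q_matrix):
    """Pick the attribute profile whose ideal response pattern is closest in Hamming distance."""
    responses = np.asarray(responses)
    n_attr = q_matrix.shape[1]
    best, best_dist = None, np.inf
    for alpha in product([0, 1], repeat=n_attr):
        eta = ideal_responses(q_matrix, np.array(alpha))
        dist = np.sum(responses != eta)
        if dist < best_dist:
            best, best_dist = np.array(alpha), dist
    return best, best_dist

# Hypothetical example: 4 items, 2 attributes
q = np.array([[1, 0], [0, 1], [1, 1], [1, 0]])
print(npc_classify([1, 0, 0, 1], q))
```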

References

Cheng, Y. (2009). When cognitive diagnosis meets computerized adaptive testing: CD-CAT. Psychometrika, 74, 619-632.

Cheng, Y., & Chang, H. (2007). The modified maximum global discrimination index method for cognitive diagnostic CAT. In D. Weiss (Ed.) Proceedings of the 2007 GMAC Computerized Adaptive Testing Conference.

Chiu, C.-Y., & Douglas, J. A. (2013). A nonparametric approach to cognitive diagnosis by proximity to ideal response patterns. Journal of Classification, 30, 225-250.

Fu, J., & Li, Y. (2007). An integrative review of cognitively diagnostic psychometric models. Paper presented at the Annual Meeting of the National Council on Measurement in Education. Chicago, Illinois.

Liu, H., You, X., Wang, W., Ding, S., & Chang, H. (2013). The development of computerized adaptive testing with cognitive diagnosis for an English achievement test in China. Journal of Classification, 30, 152-172.

Rupp, A. A., Templin, J. L., & Henson, R. A. (2010). Diagnostic measurement: Theory, methods, and applications. New York: Guilford.

Tatsuoka, K. K., & Tatsuoka, M. M. (1997). Computerized cognitive diagnostic adaptive testing: Effect on remedial instruction as empirical validation. Journal of Educational Measurement, 34, 3–20.


JF - IACAT 2017 Conference PB - Niigata Seiryo University CY - Niigata, Japan ER - TY - CONF T1 - Concerto 5 Open Source CAT Platform: From Code to Nodes T2 - IACAT 2017 Conference Y1 - 2017 A1 - David Stillwell KW - Concerto 5 KW - Open Source CAT AB -

Concerto 5 is the newest version of the Concerto open source R-based Computer-Adaptive Testing platform, which is currently used in educational testing and in clinical trials. In our quest to make CAT accessible to all, the latest version uses flowchart nodes to connect different elements of a test, so that CAT test creation is an intuitive high-level process that does not require writing code.

A test creator might connect an Info Page node to a Consent Page node, to a CAT node, and then to a Feedback node. After uploading their items, the test is done.

This talk will show the new flowchart interface, and demonstrate the creation of a CAT test from scratch in less than 10 minutes.

Concerto 5 also includes a new Polytomous CAT node, so CATs with Likert items can be easily created in the flowchart interface. This node is currently used in depression and anxiety tests in a clinical trial.


JF - IACAT 2017 Conference PB - Niigata Seiryo University CY - Niigata, Japan UR - https://drive.google.com/open?id=11eu1KKILQEoK5c-CYO1P1AiJgiQxX0E0 ER - TY - CONF T1 - Considerations in Performance Evaluations of Computerized Formative Assessments T2 - IACAT 2017 Conference Y1 - 2017 A1 - Michael Chajewski A1 - John Harnisher KW - algebra KW - Formative Assessment KW - Performance Evaluations AB -

Computerized adaptive instruments have been widely established and used in the context of summative assessments for purposes including licensure, admissions, and proficiency testing. The benefits of examinee-tailored examinations, which can provide estimates of performance that are more reliable and valid, have in recent years attracted a wider audience (e.g., patient-oriented outcomes, test preparation). Formative assessments, which are most widely understood in their implementation as diagnostic tools, have recently started to expand into lesser-known areas of computerized testing, such as implementations of instructional designs aiming to maximize examinee learning through targeted practice.

Using a CAT instrument within the framework of evaluating repeated examinee performances (in settings such as Quiz Bank practice, for example) poses unique challenges not germane to summative assessments. The scale on which item parameters (and subsequently examinee performance estimates, such as maximum likelihood estimates) are determined usually does not take change over time into consideration. While vertical scaling features resolve the learning acquisition problem, most content practice engines do not make use of explicit practice windows that could be vertically aligned. Alternatively, multidimensional (MIRT) and hierarchical (HIRT) item response theory models allow for the specification of random effects associated with change over time in examinees' skills, but they are often complex and require content and usage resources not often observed in practice.

The research submitted for consideration simulated examinees' repeated, variable-length Quiz Bank practice in algebra using a 500-item 1PL operational item pool. The stability simulations sought to determine which rolling item interval size would provide the most informative insight into the examinees' learning progression over time. Estimates were evaluated in terms of reduction in estimate uncertainty, bias, and RMSD relative to the true and total-item-based ability estimates. It was found that rolling item intervals of 20-25 items provided the best reduction of uncertainty around the estimate without compromising the ability to provide informed performance estimates to students. However, while intervals of 20-25 items asymptotically tended to provide adequate estimates of performance, changes over shorter periods of time assessed with shorter quizzes could not be detected, as those changes were suppressed in favor of the performance based on the full interval considered. Implications for infrastructure (such as recommendation engines), product, and scale development are discussed.
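The rolling-interval idea evaluated above can be illustrated with a small sketch: re-estimate theta after each quiz item under the 1PL/Rasch model using only the most recent window of items. The names and the simple Newton-Raphson routine below are illustrative assumptions, not the engine the study actually used.

```python
import numpy as np

def rasch_mle(x, b, max_iter=25):
    """Newton-Raphson MLE of theta under the Rasch model for responses x to items with difficulties b."""
    x, b = np.asarray(x, dtype=float), np.asarray(b, dtype=float)
    if x.sum() in (0, len(x)):   # MLE is infinite for all-wrong/all-right patterns
        return np.nan
    theta = 0.0
    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-(theta - b)))
        theta += np.sum(x - p) / np.sum(p * (1 - p))
    return theta

def rolling_theta(responses, difficulties, window=20):
    """Ability estimate after each administered item, based only on the last `window` items."""
    estimates = []
    for t in range(1, len(responses) + 1):
        lo = max(0, t - window)
        estimates.append(rasch_mle(responses[lo:t], difficulties[lo:t]))
    return estimates
```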


JF - IACAT 2017 Conference PB - Niigata Seiryo University CY - Niigata, Japan ER - TY - CONF T1 - Construction of Gratitude Scale Using Polytomous Item Response Theory Model T2 - IACAT 2017 Conference Y1 - 2017 A1 - Nurul Arbiyah KW - Gratitude Scale KW - polytomous items AB -

Various studies have shown that gratitude is essential to increasing the happiness and quality of life of every individual. Unfortunately, research on gratitude has still received little attention, and there is no standardized measure of it. Existing gratitude scales were developed overseas and have not been adapted to the Indonesian cultural context. Moreover, scale development is generally performed with a classical test theory approach, which has some drawbacks. This research develops a gratitude scale using a polytomous item response theory (IRT) model, the Partial Credit Model (PCM).

The pilot study results showed that the gratitude scale (with 44 items) is a reliable measure (α = 0.944) and valid (meeting both convergent and discriminant validity requirements). The pilot study results also showed that the gratitude scale satisfies the unidimensionality assumption.

Calibration with the PCM showed that the gratitude scale fit the model. Of the 44 items, one item did not fit and was eliminated. A second calibration of the remaining 43 items showed that they all fit the model and were suitable for measuring gratitude. Differential item functioning (DIF) analysis showed that four items had a gender-based response bias. Thus, 39 items remain in the scale.


JF - IACAT 2017 Conference PB - Niigata Seiryo University CY - Niigata, Japan UR - https://drive.google.com/open?id=1pHhO4cq2-wh24ht3nBAoXNHv7234_mjH ER - TY - JOUR T1 - A Comparison of Constrained Item Selection Methods in Multidimensional Computerized Adaptive Testing JF - Applied Psychological Measurement Y1 - 2016 A1 - Su, Ya-Hui AB - The construction of assessments in computerized adaptive testing (CAT) usually involves fulfilling a large number of statistical and non-statistical constraints to meet test specifications. To improve measurement precision and test validity, the multidimensional priority index (MPI) and the modified MPI (MMPI) can be used to monitor many constraints simultaneously under a between-item and a within-item multidimensional framework, respectively. As both item selection methods can be implemented easily and computed efficiently, they are important and useful for operational CATs; however, no thorough simulation study has compared the performance of these two item selection methods under two different item bank structures. The purpose of this study was to investigate the efficiency of the MMPI and the MPI item selection methods under the between-item and within-item multidimensional CAT through simulations. The MMPI and the MPI item selection methods yielded similar performance in measurement precision for both multidimensional pools and yielded similar performance in exposure control and constraint management for the between-item multidimensional pool. For the within-item multidimensional pool, the MPI method yielded slightly better performance in exposure control but yielded slightly worse performance in constraint management than the MMPI method. VL - 40 UR - http://apm.sagepub.com/content/40/5/346.abstract ER - TY - JOUR T1 - On Computing the Key Probability in the Stochastically Curtailed Sequential Probability Ratio Test JF - Applied Psychological Measurement Y1 - 2016 A1 - Huebner, Alan R. A1 - Finkelman, Matthew D. AB - The Stochastically Curtailed Sequential Probability Ratio Test (SCSPRT) is a termination criterion for computerized classification tests (CCTs) that has been shown to be more efficient than the well-known Sequential Probability Ratio Test (SPRT). The performance of the SCSPRT depends on computing the probability that at a given stage in the test, an examinee’s current interim classification status will not change before the end of the test. Previous work discusses two methods of computing this probability, an exact method in which all potential responses to remaining items are considered and an approximation based on the central limit theorem (CLT) requiring less computation. Generally, the CLT method should be used early in the test when the number of remaining items is large, and the exact method is more appropriate at later stages of the test when few items remain. However, there is currently a dearth of information as to the performance of the SCSPRT when using the two methods. For the first time, the exact and CLT methods of computing the crucial probability are compared in a simulation study to explore whether there is any effect on the accuracy or efficiency of the CCT. The article is focused toward practitioners and researchers interested in using the SCSPRT as a termination criterion in an operational CCT. 
VL - 40 UR - http://apm.sagepub.com/content/40/2/142.abstract ER - TY - JOUR T1 - Comparing Simple Scoring With IRT Scoring of Personality Measures: The Navy Computer Adaptive Personality Scales JF - Applied Psychological Measurement Y1 - 2015 A1 - Oswald, Frederick L. A1 - Shaw, Amy A1 - Farmer, William L. AB -

This article analyzes data from U.S. Navy sailors (N = 8,956), with the central measure being the Navy Computer Adaptive Personality Scales (NCAPS). Analyses and results from this article extend and qualify those from previous research efforts by examining the properties of the NCAPS and its adaptive structure in more detail. Specifically, this article examines item exposure rates, the efficiency of item use based on item response theory (IRT)–based Expected A Posteriori (EAP) scoring, and a comparison of IRT-EAP scoring with much more parsimonious scoring methods that appear to work just as well (stem-level scoring and dichotomous scoring). The cutting-edge nature of adaptive personality testing will necessitate a series of future efforts like this: to examine the benefits of adaptive scoring schemes and novel measurement methods continually, while pushing testing technology further ahead.

VL - 39 UR - http://apm.sagepub.com/content/39/2/144.abstract ER - TY - JOUR T1 - A Comparison of IRT Proficiency Estimation Methods Under Adaptive Multistage Testing JF - Journal of Educational Measurement Y1 - 2015 A1 - Kim, Sooyeon A1 - Moses, Tim A1 - Yoo, Hanwook (Henry) AB - This inquiry is an investigation of item response theory (IRT) proficiency estimators’ accuracy under multistage testing (MST). We chose a two-stage MST design that includes four modules (one at Stage 1, three at Stage 2) and three difficulty paths (low, middle, high). We assembled various two-stage MST panels (i.e., forms) by manipulating two assembly conditions in each module, such as difficulty level and module length. For each panel, we investigated the accuracy of examinees’ proficiency levels derived from seven IRT proficiency estimators. The choice of Bayesian (prior) versus non-Bayesian (no prior) estimators was of more practical significance than the choice of number-correct versus item-pattern scoring estimators. The Bayesian estimators were slightly more efficient than the non-Bayesian estimators, resulting in smaller overall error. Possible score changes caused by the use of different proficiency estimators would be nonnegligible, particularly for low- and high-performing examinees. VL - 52 UR - http://dx.doi.org/10.1111/jedm.12063 ER - TY - JOUR T1 - Considering the Use of General and Modified Assessment Items in Computerized Adaptive Testing JF - Applied Measurement in Education Y1 - 2015 A1 - Wyse, A. E. A1 - Albano, A. D. AB - This article used several data sets from a large-scale state testing program to examine the feasibility of combining general and modified assessment items in computerized adaptive testing (CAT) for different groups of students. Results suggested that several of the assumptions made when employing this type of mixed-item CAT may not be met for students with disabilities that have typically taken alternate assessments based on modified achievement standards (AA-MAS). A simulation study indicated that the abilities of AA-MAS students can be underestimated or overestimated by the mixed-item CAT, depending on students’ location on the underlying ability scale. These findings held across grade levels and test lengths. The mixed-item CAT appeared to function well for non-AA-MAS students. VL - 28 IS - 2 ER - TY - JOUR T1 - Cognitive Diagnostic Models and Computerized Adaptive Testing: Two New Item-Selection Methods That Incorporate Response Times JF - Journal of Computerized Adaptive Testing Y1 - 2014 A1 - Finkelman, M. D. A1 - Kim, W. A1 - Weissman, A. A1 - Cook, R.J. VL - 2 UR - http://www.iacat.org/jcat/index.php/jcat/article/view/43/21 IS - 4 ER - TY - JOUR T1 - A Comparison of Four Item-Selection Methods for Severely Constrained CATs JF - Educational and Psychological Measurement Y1 - 2014 A1 - He, Wei A1 - Diao, Qi A1 - Hauser, Carl AB -

This study compared four item-selection procedures developed for use with severely constrained computerized adaptive tests (CATs). Severely constrained CATs refer to adaptive tests that seek to meet a complex set of constraints that are often not mutually exclusive (i.e., an item may contribute to the satisfaction of several constraints at the same time). The procedures examined in the study included the weighted deviation model (WDM), the weighted penalty model (WPM), the maximum priority index (MPI), and the shadow test approach (STA). In addition, two modified versions of the MPI procedure were introduced to deal with an edge-case condition that results in the item selection procedure becoming dysfunctional during a test. The results suggest that the STA worked best among all candidate methods in terms of measurement accuracy and constraint management. The other three heuristic approaches did not differ significantly in measurement accuracy or in constraint management at the lower-bound level. However, the WPM method appears to perform considerably better in overall constraint management than either the WDM or MPI method. Limitations and future research directions are also discussed.
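For reference, the maximum priority index weights each item's information by how much room remains under each constraint the item belongs to. One common formulation (after Cheng & Chang, 2009) is PI_i = I_i * prod_k (w_k f_k)^{c_ik}, with f_k the fraction of constraint k's quota still unfilled; the sketch below uses hypothetical inputs, and the exact variants compared in this study may differ.

```python
import numpy as np

def priority_index(info, c, w, quota, used):
    """Maximum priority index (one common formulation, sketch).
    info[i]  : Fisher information of item i at the current theta
    c[i, k]  : 1 if item i is relevant to constraint k, else 0
    w[k]     : weight of constraint k
    quota[k] : number of items allowed/required for constraint k
    used[k]  : number already administered toward constraint k
    """
    f = np.clip((quota - used) / quota, 0.0, None)   # remaining quota fraction; 0 once filled
    return info * np.prod((w * f) ** c, axis=1)      # select the item with the largest index

# Hypothetical example: 3 items, 2 content constraints
info = np.array([0.8, 1.2, 0.9])
c = np.array([[1, 0], [0, 1], [1, 1]])
print(priority_index(info, c, w=np.ones(2), quota=np.array([5, 5]), used=np.array([2, 4])))
```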

VL - 74 UR - http://epm.sagepub.com/content/74/4/677.abstract ER - TY - JOUR T1 - A Comparison of Multi-Stage and Linear Test Designs for Medium-Size Licensure and Certification Examinations JF - Journal of Computerized Adaptive Testing Y1 - 2014 A1 - Brossman, Bradley. G. A1 - Guille, R.A. VL - 2 IS - 2 ER - TY - JOUR T1 - Computerized Adaptive Testing for the Random Weights Linear Logistic Test Model JF - Applied Psychological Measurement Y1 - 2014 A1 - Crabbe, Marjolein A1 - Vandebroek, Martina AB -

This article discusses four item-selection rules for designing efficient individualized tests under the random weights linear logistic test model (RWLLTM): minimum posterior-weighted D-error, minimum expected posterior-weighted D-error, maximum expected Kullback–Leibler divergence between subsequent posteriors (KLP), and maximum mutual information (MUI). The RWLLTM decomposes test items into a set of subtasks or cognitive features and assumes individual-specific effects of the features on the difficulty of the items. The model extends and improves the well-known linear logistic test model, in which feature effects are only estimated at the aggregate level. Simulations show that the efficiencies of the designs obtained with the different criteria appear to be equivalent. However, KLP and MUI are given preference over the two D-error-based criteria due to their lesser complexity, which significantly reduces the computational burden.

VL - 38 UR - http://apm.sagepub.com/content/38/6/415.abstract ER - TY - BOOK T1 - Computerized multistage testing: Theory and applications Y1 - 2014 A1 - Duanli Yan A1 - Alina A von Davier A1 - Charles Lewis PB - CRC Press CY - Boca Raton FL SN - 13-978-1-4665-0577-3 ER - TY - JOUR T1 - Comparing the Performance of Five Multidimensional CAT Selection Procedures With Different Stopping Rules JF - Applied Psychological Measurement Y1 - 2013 A1 - Yao, Lihua AB -

Through simulated data, five multidimensional computerized adaptive testing (MCAT) selection procedures with varying test lengths are examined and compared using different stopping rules. Fixed item exposure rates are used for all the items, and the Priority Index (PI) method is used for the content constraints. Two stopping rules, standard error (SE) and predicted standard error reduction (PSER), are proposed; each MCAT selection process is stopped if either the required precision has been achieved or the selected number of items has reached the maximum limit. The five procedures are as follows: minimum angle (Ag), volume (Vm), minimize the error variance of the linear combination (V1), minimize the error variance of the composite score with the optimized weight (V2), and Kullback–Leibler (KL) information. The recovery of the domain scores or content scores and their overall score, test length, and test reliability are compared across the five MCAT procedures and between the two stopping rules. It is found that the two stopping rules are implemented successfully and that KL uses the fewest items to reach the same precision level, followed by Vm; Ag uses the largest number of items. On average, to reach a precision of SE = .35, 40, 55, 63, 63, and 82 items are needed for KL, Vm, V1, V2, and Ag, respectively, under the SE stopping rule. PSER yields 38, 45, 53, 58, and 68 items for KL, Vm, V1, V2, and Ag, respectively; PSER yields only slightly worse results than SE, but with far fewer items. Overall, KL is recommended for varying-length MCAT.

VL - 37 UR - http://apm.sagepub.com/content/37/1/3.abstract ER - TY - JOUR T1 - A Comparison of Computerized Classification Testing and Computerized Adaptive Testing in Clinical Psychology JF - Journal of Computerized Adaptive Testing Y1 - 2013 A1 - Smits, N. A1 - Finkelman, M. D. VL - 1 IS - 2 ER - TY - JOUR T1 - A Comparison of Exposure Control Procedures in CAT Systems Based on Different Measurement Models for Testlets JF - Applied Measurement in Education Y1 - 2013 A1 - Boyd, Aimee M. A1 - Dodd, Barbara A1 - Fitzpatrick, Steven VL - 26 UR - http://www.tandfonline.com/doi/abs/10.1080/08957347.2013.765434 ER - TY - JOUR T1 - A Comparison of Exposure Control Procedures in CATs Using the 3PL Model JF - Educational and Psychological Measurement Y1 - 2013 A1 - Leroux, Audrey J. A1 - Lopez, Myriam A1 - Hembry, Ian A1 - Dodd, Barbara G. AB -

This study compares the progressive-restricted standard error (PR-SE) exposure control procedure to three commonly used procedures in computerized adaptive testing, the randomesque, Sympson–Hetter (SH), and no exposure control methods. The performance of these four procedures is evaluated using the three-parameter logistic model under the manipulated conditions of item pool size (small vs. large) and stopping rules (fixed-length vs. variable-length). PR-SE provides the advantage of similar constraints to SH, without the need for a preceding simulation study to execute it. Overall for the large and small item banks, the PR-SE method administered almost all of the items from the item pool, whereas the other procedures administered about 52% or less of the large item bank and 80% or less of the small item bank. The PR-SE yielded the smallest amount of item overlap between tests across conditions and administered fewer items on average than SH. PR-SE obtained these results with similar, and acceptable, measurement precision compared to the other exposure control procedures while vastly improving on item pool usage.

VL - 73 UR - http://epm.sagepub.com/content/73/5/857.abstract ER - TY - JOUR T1 - A Comparison of Four Methods for Obtaining Information Functions for Scores From Computerized Adaptive Tests With Normally Distributed Item Difficulties and Discriminations JF - Journal of Computerized Adaptive Testing Y1 - 2013 A1 - Ito, K. A1 - Segall, D.O. VL - 1 IS - 5 ER - TY - JOUR T1 - Comparison Between Dichotomous and Polytomous Scoring of Innovative Items in a Large-Scale Computerized Adaptive Test JF - Educational and Psychological Measurement Y1 - 2012 A1 - Jiao, H. A1 - Liu, J. A1 - Haynie, K. A1 - Woo, A. A1 - Gorham, J. AB -

This study explored the impact of partial credit scoring of one type of innovative items (multiple-response items) in a computerized adaptive version of a large-scale licensure pretest and operational test settings. The impacts of partial credit scoring on the estimation of the ability parameters and classification decisions in operational test settings were explored in one real data analysis and two simulation studies when two different polytomous scoring algorithms, automated polytomous scoring and rater-generated polytomous scoring, were applied. For the real data analyses, the ability estimates from dichotomous and polytomous scoring were highly correlated; the classification consistency between different scoring algorithms was nearly perfect. Information distribution changed slightly in the operational item bank. In the two simulation studies comparing each polytomous scoring with dichotomous scoring, the ability estimates resulting from polytomous scoring had slightly higher measurement precision than those resulting from dichotomous scoring. The practical impact related to classification decision was minor because of the extremely small number of items that could be scored polytomously in this current study.

VL - 72 ER - TY - JOUR T1 - Comparison of Exposure Controls, Item Pool Characteristics, and Population Distributions for CAT Using the Partial Credit Model JF - Educational and Psychological Measurement Y1 - 2012 A1 - Lee, HwaYoung A1 - Dodd, Barbara G. AB -

This study investigated item exposure control procedures under various combinations of item pool characteristics and ability distributions in computerized adaptive testing based on the partial credit model. Three variables were manipulated: item pool characteristics (120 items for each of easy, medium, and hard item pools), two ability distributions (normally distributed and negatively skewed data), and three exposure control procedures (randomesque procedure, progressive–restricted procedure, and maximum information procedure). A number of measurement precision indexes such as descriptive statistics, correlations between known and estimated ability levels, bias, root mean squared error, and average absolute difference, exposure rates, item usage, and item overlap were computed to assess the impact of matched or nonmatched item pool and ability distributions on the accuracy of ability estimation and the performance of exposure control procedures. As expected, the medium item pool produced better precision of measurement than both the easy and hard item pools. The progressive–restricted procedure performed better in terms of maximum exposure rates, item average overlap, and pool utilization than both the randomesque procedure and the maximum information procedure. The easy item pool with the negatively skewed data as a mismatched condition produced the worst performance.

VL - 72 UR - http://epm.sagepub.com/content/72/1/159.abstract ER - TY - JOUR T1 - Comparison of two Bayesian methods to detect mode effects between paper-based and computerized adaptive assessments: a preliminary Monte Carlo study. JF - BMC Med Res Methodol Y1 - 2012 A1 - Riley, Barth B A1 - Carle, Adam C KW - Bayes Theorem KW - Data Interpretation, Statistical KW - Humans KW - Mathematical Computing KW - Monte Carlo Method KW - Outcome Assessment (Health Care) AB -

BACKGROUND: Computerized adaptive testing (CAT) is being applied to health outcome measures developed as paper-and-pencil (P&P) instruments. Differences in how respondents answer items administered by CAT vs. P&P can increase error in CAT-estimated measures if not identified and corrected.

METHOD: Two methods for detecting item-level mode effects are proposed using Bayesian estimation of posterior distributions of item parameters: (1) a modified robust Z (RZ) test, and (2) 95% credible intervals (CrI) for the CAT-P&P difference in item difficulty. A simulation study was conducted under the following conditions: (1) data-generating model (one- vs. two-parameter IRT model); (2) moderate vs. large DIF sizes; (3) percentage of DIF items (10% vs. 30%), and (4) mean difference in θ estimates across modes of 0 vs. 1 logits. This resulted in a total of 16 conditions with 10 generated datasets per condition.

RESULTS: Both methods evidenced good to excellent false positive control, with RZ providing better control of false positives and CrI providing slightly higher power, irrespective of measurement model. False positives increased when items were very easy to endorse and when there were mode differences in mean trait level. True positives were predicted by CAT item usage, absolute item difficulty, and item discrimination. RZ outperformed CrI, due to better control of false positive DIF.

CONCLUSIONS: Whereas false positives were well controlled, particularly for RZ, power to detect DIF was suboptimal. Research is needed to examine the robustness of these methods under varying prior assumptions concerning the distribution of item and person parameters and when data fail to conform to prior assumptions. False identification of DIF when items were very easy to endorse is a problem warranting additional investigation.
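Of the two methods described above, the credible-interval check is the simpler one to sketch: given posterior draws of an item's difficulty from the CAT and P&P calibrations, a mode effect is flagged when the 95% interval for the difference excludes zero. This is only an illustration, assuming independent, equal-length vectors of posterior draws; the robust Z variant and the paper's exact estimation setup are not reproduced here.

```python
import numpy as np

def cri_mode_effect(b_draws_cat, b_draws_pp, level=0.95):
    """Flag an item-level mode effect if the credible interval for the
    CAT - P&P difference in item difficulty excludes zero."""
    diff = np.asarray(b_draws_cat) - np.asarray(b_draws_pp)
    alpha = 1.0 - level
    lo, hi = np.quantile(diff, [alpha / 2, 1 - alpha / 2])
    return not (lo <= 0.0 <= hi), (lo, hi)

# Hypothetical posterior draws for one item
rng = np.random.default_rng(0)
flag, interval = cri_mode_effect(rng.normal(0.6, 0.1, 2000), rng.normal(0.2, 0.1, 2000))
print(flag, interval)
```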

VL - 12 ER - TY - JOUR T1 - Computerized Adaptive Testing for Student Selection to Higher Education JF - Journal of Higher Education Y1 - 2012 A1 - Kalender, I. AB -

The purpose of the present study is to discuss the applicability of a computerized adaptive testing format as an alternative to the current student selection examinations for higher education in Turkey. First, problems associated with the current student selection system are described. These problems exert pressure on students that results in test anxiety, produce measurement practices that can be criticized, and lessen the credibility of the student selection system. Next, computerized adaptive tests are introduced and the advantages they provide are presented. Then the results of a study that used two research designs (simulation and live testing) are presented. Results revealed that (i) the computerized adaptive format provided a reduction of up to 80% in the number of items given to students compared with the paper-and-pencil format of the student selection examination, and (ii) ability estimates had high reliabilities; correlations between ability estimates obtained from the simulation and the traditional format were higher than 0.80. At the end of the study, solutions offered by a computerized adaptive testing implementation to the current problems are discussed, along with some issues concerning the application of a CAT format to student selection examinations in Turkey.

ER - TY - THES T1 - Computerized adaptive testing in industrial and organizational psychology Y1 - 2012 A1 - Makransky, G. PB - University of Twente CY - Twente, The Netherlands VL - Ph.D. ER - TY - JOUR T1 - Computerized Adaptive Testing Using a Class of High-Order Item Response Theory Models JF - Applied Psychological Measurement Y1 - 2012 A1 - Huang, Hung-Yu A1 - Chen, Po-Hsi A1 - Wang, Wen-Chung AB -

In the human sciences, a common assumption is that latent traits have a hierarchical structure. Higher order item response theory models have been developed to account for this hierarchy. In this study, computerized adaptive testing (CAT) algorithms based on these kinds of models were implemented, and their performance under a variety of situations was examined using simulations. The results showed that the CAT algorithms were very effective. The progressive method for item selection, the Sympson and Hetter method with online and freeze procedure for item exposure control, and the multinomial model for content balancing can simultaneously maintain good measurement precision, item exposure control, content balance, test security, and pool usage.

VL - 36 UR - http://apm.sagepub.com/content/36/8/689.abstract ER - TY - JOUR T1 - catR: An R Package for Computerized Adaptive Testing JF - Applied Psychological Measurement Y1 - 2011 A1 - Magis, D. A1 - Raîche, G. KW - computer program KW - computerized adaptive testing KW - Estimation KW - Item Response Theory AB -

Computerized adaptive testing (CAT) is an active current research field in psychometrics and educational measurement. However, there is very little software available to handle such adaptive tasks. The R package catR was developed to perform adaptive testing with as much flexibility as possible, in an attempt to provide a developmental and testing platform to the interested user. Several item-selection rules and ability estimators are implemented. The item bank can be provided by the user or randomly generated from parent distributions of item parameters. Three stopping rules are available. The output can be graphically displayed.

ER - TY - JOUR T1 - A Comment on Early Student Blunders on Computer-Based Adaptive Tests JF - Applied Psychological Measurement Y1 - 2011 A1 - Green, Bert F. AB -

This article refutes a recent claim that computer-based tests produce biased scores for very proficient test takers who make mistakes on one or two initial items and that the “bias” can be reduced by using a four-parameter IRT model. Because the same effect occurs with pattern scores on nonadaptive tests, the effect results from IRT scoring, not from adaptive testing. Because very proficient test takers rarely err on items of middle difficulty, the so-called bias is one of selective data analysis. Furthermore, the apparently large score penalty for one error on an otherwise perfect response pattern is shown to result from the relative stretching of the IRT scale at very high and very low proficiencies. The recommended use of a four-parameter IRT model is shown to have drawbacks.

VL - 35 UR - http://apm.sagepub.com/content/35/2/165.abstract IS - 2 ER - TY - JOUR T1 - Computer adaptive testing for small scale programs and instructional systems JF - Journal of Applied Testing Technology Y1 - 2011 A1 - Rudner, L. M. A1 - Guo, F. AB -

This study investigates measurement decision theory (MDT) as an underlying model for computer adaptive testing when the goal is to classify examinees into one of a finite number of groups. The first analysis compares MDT with a popular item response theory model and finds little difference in terms of the percentage of correct classifications. The second analysis examines the number of examinees needed to calibrate MDT item parameters and finds accurate classifications even with calibration sample sizes as small as 100 examinees.
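Measurement decision theory classifies an examinee into the mastery group with the highest posterior probability given the observed responses, essentially a naive-Bayes calculation over per-group item probabilities. A minimal sketch with made-up numbers (not the study's calibrated parameters) is:

```python
import numpy as np

def mdt_classify(x, p, priors):
    """Measurement decision theory classification (sketch).
    p[k, i] = P(correct on item i | group k); x = 0/1 responses to the administered items.
    Returns posterior probabilities over groups; classify into the argmax group."""
    x = np.asarray(x)
    likelihood = np.prod(np.where(x == 1, p, 1 - p), axis=1)
    posterior = priors * likelihood
    return posterior / posterior.sum()

# Hypothetical two-group example with three items
p = np.array([[0.8, 0.7, 0.9],    # masters
              [0.4, 0.3, 0.5]])   # non-masters
print(mdt_classify([1, 0, 1], p, priors=np.array([0.5, 0.5])))
```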

VL - 12 IS - 1 ER - TY - JOUR T1 - Computerized adaptive assessment of personality disorder: Introducing the CAT–PD project JF - Journal of Personality Assessment Y1 - 2011 A1 - Simms, L. J. A1 - Goldberg, L .R. A1 - Roberts, J. E. A1 - Watson, D. A1 - Welte, J. A1 - Rotterman, J. H. AB - Assessment of personality disorders (PD) has been hindered by reliance on the problematic categorical model embodied in the most recent Diagnostic and Statistical Model of Mental Disorders (DSM), lack of consensus among alternative dimensional models, and inefficient measurement methods. This article describes the rationale for and early results from a multiyear study funded by the National Institute of Mental Health that was designed to develop an integrative and comprehensive model and efficient measure of PD trait dimensions. To accomplish these goals, we are in the midst of a 5-phase project to develop and validate the model and measure. The results of Phase 1 of the project—which was focused on developing the PD traits to be assessed and the initial item pool—resulted in a candidate list of 59 PD traits and an initial item pool of 2,589 items. Data collection and structural analyses in community and patient samples will inform the ultimate structure of the measure, and computerized adaptive testing will permit efficient measurement of the resultant traits. The resultant Computerized Adaptive Test of Personality Disorder (CAT–PD) will be well positioned as a measure of the proposed DSM–5 PD traits. Implications for both applied and basic personality research are discussed. VL - 93 SN - 0022-3891 ER - TY - JOUR T1 - Computerized Adaptive Testing with the Zinnes and Griggs Pairwise Preference Ideal Point Model JF - International Journal of Testing Y1 - 2011 A1 - Stark, Stephen A1 - Chernyshenko, Oleksandr S. VL - 11 UR - http://www.tandfonline.com/doi/abs/10.1080/15305058.2011.561459 ER - TY - JOUR T1 - Computerized Classification Testing Under the Generalized Graded Unfolding Model JF - Educational and Psychological Measurement Y1 - 2011 A1 - Wang, Wen-Chung A1 - Liu, Chen-Wei AB -

The generalized graded unfolding model (GGUM) has been recently developed to describe item responses to Likert items (agree-disagree) in attitude measurement. In this study, the authors (a) developed two item selection methods in computerized classification testing under the GGUM, the current estimate/ability confidence interval method and the cut score/sequential probability ratio test method, and (b) evaluated their accuracy and efficiency in classification through simulations. The results indicated that both methods were very accurate and efficient. The more points each item had and the fewer the classification categories, the more accurate and efficient the classification would be. However, the latter method may yield a very low accuracy in dichotomous items with a short maximum test length. Thus, if it is to be used to classify examinees with dichotomous items, the maximum test length should be increased.
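
As a rough illustration of the cut score/sequential probability ratio test termination rule described above, the sketch below applies the SPRT to a dichotomous 2PL model rather than the GGUM; the item parameters, indifference region, and error rates are all hypothetical.

```python
# Illustrative SPRT stopping rule for classification testing, shown with a 2PL
# item model for brevity (the article works with the GGUM).
import math

def p2pl(theta, a, b):
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def sprt_decision(responses, items, cut, delta=0.3, alpha=0.05, beta=0.05):
    """Compare likelihoods at theta = cut + delta vs. theta = cut - delta."""
    log_lr = 0.0
    for u, (a, b) in zip(responses, items):
        p_hi, p_lo = p2pl(cut + delta, a, b), p2pl(cut - delta, a, b)
        log_lr += math.log(p_hi / p_lo) if u == 1 else math.log((1 - p_hi) / (1 - p_lo))
    upper = math.log((1 - beta) / alpha)
    lower = math.log(beta / (1 - alpha))
    if log_lr >= upper:
        return "classify above cut"
    if log_lr <= lower:
        return "classify below cut"
    return "continue testing"

items = [(1.2, 0.0), (0.8, -0.5), (1.5, 0.4), (1.0, 0.1)]  # hypothetical (a, b)
print(sprt_decision([1, 1, 0, 1], items, cut=0.0))
```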

VL - 71 UR - http://epm.sagepub.com/content/71/1/114.abstract ER - TY - JOUR T1 - Computerized Classification Testing Under the One-Parameter Logistic Response Model With Ability-Based Guessing JF - Educational and Psychological Measurement Y1 - 2011 A1 - Wang, Wen-Chung A1 - Huang, Sheng-Yun AB -

The one-parameter logistic model with ability-based guessing (1PL-AG) has been recently developed to account for the effect of ability on guessing behavior in multiple-choice items. In this study, the authors developed algorithms for computerized classification testing under the 1PL-AG and conducted a series of simulations to evaluate their performance. Four item selection methods (the Fisher information, the Fisher information with a posterior distribution, the progressive method, and the adjusted progressive method) and two termination criteria (the ability confidence interval [ACI] method and the sequential probability ratio test [SPRT]) were developed. In addition, the Sympson–Hetter online method with freeze (SHOF) was implemented for item exposure control. Major results include the following: (a) when no item exposure control was made, all the four item selection methods yielded very similar correct classification rates, but the Fisher information method had the worst item bank usage and the highest item exposure rate; (b) SHOF can successfully maintain the item exposure rate at a prespecified level without substantially compromising accuracy and efficiency in classification; (c) once SHOF was implemented, all the four methods performed almost identically; (d) ACI appeared to be slightly more efficient than SPRT; and (e) in general, a higher weight of ability in guessing led to a slightly higher accuracy and efficiency, and a lower forced classification rate.
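
For readers unfamiliar with Sympson–Hetter-style exposure control, the following sketch shows the classic probabilistic filter (not the online-with-freeze variant evaluated in the article); the exposure parameters are hypothetical and would normally be derived from prior simulations.

```python
# Sketch of the classic Sympson-Hetter exposure filter: after an item is
# *selected*, it is only *administered* with probability K[item]; otherwise the
# next-best candidate is tried. Exposure parameters K are invented here.
import random

def administer_with_exposure_control(ranked_items, K):
    """ranked_items: item ids ordered from most to least informative."""
    for item in ranked_items:
        if random.random() <= K[item]:   # passes the exposure lottery
            return item
    return ranked_items[-1]              # fall back to the last candidate

K = {"it1": 0.35, "it2": 0.80, "it3": 1.00}
print(administer_with_exposure_control(["it1", "it2", "it3"], K))
```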

VL - 71 UR - http://epm.sagepub.com/content/71/6/925.abstract ER - TY - JOUR T1 - Content range and precision of a computer adaptive test of upper extremity function for children with cerebral palsy JF - Physical & Occupational Therapy in Pediatrics Y1 - 2011 A1 - Montpetit, K. A1 - Haley, S. A1 - Bilodeau, N. A1 - Ni, P. A1 - Tian, F. A1 - Gorton, G., 3rd A1 - Mulcahey, M. J. AB - This article reports on the content range and measurement precision of an upper extremity (UE) computer adaptive testing (CAT) platform of physical function in children with cerebral palsy. Upper extremity items representing skills of all abilities were administered to 305 parents. These responses were compared with two traditional standardized measures: Pediatric Outcomes Data Collection Instrument and Functional Independence Measure for Children. The UE CAT correlated strongly with the upper extremity component of these measures and had greater precision when describing individual functional ability. The UE item bank has wider range with items populating the lower end of the ability spectrum. This new UE item bank and CAT have the capability to quickly assess children of all ages and abilities with good precision and, most importantly, with items that are meaningful and appropriate for their age and level of physical function. VL - 31 SN - 1541-3144 (Electronic)0194-2638 (Linking) N1 - Montpetit, KathleenHaley, StephenBilodeau, NathalieNi, PengshengTian, FengGorton, George 3rdMulcahey, M JEnglandPhys Occup Ther Pediatr. 2011 Feb;31(1):90-102. Epub 2010 Oct 13. JO - Phys Occup Ther Pediatr ER - TY - CONF T1 - Continuous Testing (an avenue for CAT research) T2 - Annual Conference of the International Association for Computerized Adaptive Testing Y1 - 2011 A1 - G. Gage Kingsbury KW - CAT KW - item filter KW - item filtration AB -

Publishing an Adaptive Test

Problems with Publishing

Research Questions

JF - Annual Conference of the International Association for Computerized Adaptive Testing ER - TY - JOUR T1 - Creating a K-12 Adaptive Test: Examining the Stability of Item Parameter Estimates and Measurement Scales JF - Journal of Applied Testing Technology Y1 - 2011 A1 - Kingsbury, G. G. A1 - Wise, S. L. AB -

Development of adaptive tests used in K-12 settings requires the creation of stable measurement scales to measure the growth of individual students from one grade to the next, and to measure change in groups from one year to the next. Accountability systems
like No Child Left Behind require stable measurement scales so that accountability has meaning across time. This study examined the stability of the measurement scales used with the Measures of Academic Progress. Difficulty estimates for test questions from the reading and mathematics scales were examined over a period ranging from 7 to 22 years. Results showed high correlations between item difficulty estimates from the time at which they were originally calibrated and the current calibration. The average drift in item difficulty estimates was less than .01 standard deviations. The average impact of change in item difficulty estimates was less than the smallest reported difference on the score scale for two actual tests. The findings of the study indicate that an IRT scale can be stable enough to allow consistent measurement of student achievement.
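
The kind of drift summary reported above can be computed in a few lines; the sketch below correlates original and recalibrated difficulty estimates and expresses the average absolute drift in standard-deviation units. The numbers are invented, not data from the study.

```python
# Sketch of an item-parameter drift check: correlation between original and
# recalibrated Rasch difficulties, plus mean absolute drift in SD units.
import numpy as np

original = np.array([-1.20, -0.40, 0.10, 0.65, 1.30])      # hypothetical values
recalibrated = np.array([-1.18, -0.43, 0.12, 0.66, 1.27])

r = np.corrcoef(original, recalibrated)[0, 1]
mean_drift_sd = np.mean(np.abs(recalibrated - original)) / np.std(original)
print(f"correlation = {r:.3f}, mean |drift| = {mean_drift_sd:.3f} SD units")
```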

VL - 12 UR - http://www.testpublishers.org/journal-of-applied-testing-technology ER - TY - ABST T1 - Cross-cultural development of an item list for computer-adaptive testing of fatigue in oncological patients Y1 - 2011 A1 - Giesinger, J. M. A1 - Petersen, M. A. A1 - Groenvold, M. A1 - Aaronson, N. K. A1 - Arraras, J. I. A1 - Conroy, T. A1 - Gamper, E. M. A1 - Kemmler, G. A1 - King, M. T. A1 - Oberguggenberger, A. S. A1 - Velikova, G. A1 - Young, T. A1 - Holzner, B. A1 - Eortc-Qlg, E. O. AB - ABSTRACT: INTRODUCTION: Within an ongoing project of the EORTC Quality of Life Group, we are developing computerized adaptive test (CAT) measures for the QLQ-C30 scales. These new CAT measures are conceptualised to reflect the same constructs as the QLQ-C30 scales. Accordingly, the Fatigue-CAT is intended to capture physical and general fatigue. METHODS: The EORTC approach to CAT development comprises four phases (literature search, operationalisation, pre-testing, and field testing). Phases I-III are described in detail in this paper. A literature search for fatigue items was performed in major medical databases. After refinement through several expert panels, the remaining items were used as the basis for adapting items and/or formulating new items fitting the EORTC item style. To obtain feedback from patients with cancer, these English items were translated into Danish, French, German, and Spanish and tested in the respective countries. RESULTS: Based on the literature search a list containing 588 items was generated. After a comprehensive item selection procedure focusing on content, redundancy, item clarity and item difficulty a list of 44 fatigue items was generated. Patient interviews (n=52) resulted in 12 revisions of wording and translations. DISCUSSION: The item list developed in phases I-III will be further investigated within a field-testing phase (IV) to examine psychometric characteristics and to fit an item response theory model. The Fatigue CAT based on this item bank will provide scores that are backward-compatible to the original QLQ-C30 fatigue scale. JF - Health and Quality of Life Outcomes VL - 9 SN - 1477-7525 (Electronic)1477-7525 (Linking) N1 - Health Qual Life Outcomes. 2011 Mar 29;9(1):19. ER - TY - JOUR T1 - A Comparison of Content-Balancing Procedures for Estimating Multiple Clinical Domains in Computerized Adaptive Testing: Relative Precision, Validity, and Detection of Persons With Misfitting Responses JF - Applied Psychological Measurement Y1 - 2010 A1 - Barth B. Riley A1 - Michael L. Dennis A1 - Conrad, Kendon J. AB -

This simulation study sought to compare four different computerized adaptive testing (CAT) content-balancing procedures designed for use in a multidimensional assessment with respect to measurement precision, symptom severity classification, validity of clinical diagnostic recommendations, and sensitivity to atypical responding. The four content-balancing procedures were (a) no content balancing, (b) screener-based, (c) mixed (screener plus content balancing), and (d) full content balancing. In full content balancing and in mixed content balancing following administration of the screener items, item selection was based on (a) whether the target number of items for the item’s subscale was reached and (b) the item’s information function. Mixed and full content balancing provided the best representation of items from each of the main subscales of the Internal Mental Distress Scale. These procedures also resulted in higher CAT to full-scale correlations for the Trauma and Homicidal/Suicidal Thought subscales and improved detection of atypical responding.
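
A minimal sketch of the "full content balancing" idea described above, assuming the next item is chosen by maximum information among subscales that have not yet met their target counts; the identifiers, counts, and information values are hypothetical.

```python
# Sketch of content-balanced item selection: restrict selection to subscales
# that have not reached their target item count, then take the most informative
# eligible item. Data structures below are invented for illustration.
def select_item(candidates, counts, targets):
    """candidates: dicts with 'id', 'subscale', 'info' (information at current theta)."""
    eligible = [c for c in candidates if counts.get(c["subscale"], 0) < targets[c["subscale"]]]
    pool = eligible or candidates                 # relax the constraint if all targets are met
    return max(pool, key=lambda c: c["info"])

candidates = [{"id": 1, "subscale": "somatic", "info": 0.9},
              {"id": 2, "subscale": "anxiety", "info": 0.7},
              {"id": 3, "subscale": "somatic", "info": 0.8}]
targets = {"somatic": 1, "anxiety": 2}
print(select_item(candidates, counts={"somatic": 1}, targets=targets))
```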

VL - 34 UR - http://apm.sagepub.com/content/34/6/410.abstract ER - TY - JOUR T1 - A comparison of content-balancing procedures for estimating multiple clinical domains in computerized adaptive testing: Relative precision, validity, and detection of persons with misfitting responses JF - Applied Psychological Measurement Y1 - 2010 A1 - Riley, B. B. A1 - Dennis, M. L. A1 - Conrad, K. J. AB - This simulation study sought to compare four different computerized adaptive testing (CAT) content-balancing procedures designed for use in a multidimensional assessment with respect to measurement precision, symptom severity classification, validity of clinical diagnostic recommendations, and sensitivity to atypical responding. The four content-balancing procedures were (a) no content balancing, (b) screener-based, (c) mixed (screener plus content balancing), and (d) full content balancing. In full content balancing and in mixed content balancing following administration of the screener items, item selection was based on (a) whether the target number of items for the item’s subscale was reached and (b) the item’s information function. Mixed and full content balancing provided the best representation of items from each of the main subscales of the Internal Mental Distress Scale. These procedures also resulted in higher CAT to full-scale correlations for the Trauma and Homicidal/Suicidal Thought subscales and improved detection of atypical responding. VL - 34 SN - 0146-6216 1552-3497 ER - TY - JOUR T1 - A Comparison of Item Selection Techniques for Testlets JF - Applied Psychological Measurement Y1 - 2010 A1 - Murphy, Daniel L. A1 - Dodd, Barbara G. A1 - Vaughn, Brandon K. AB -

This study examined the performance of the maximum Fisher’s information, the maximum posterior weighted information, and the minimum expected posterior variance methods for selecting items in a computerized adaptive testing system when the items were grouped in testlets. A simulation study compared the efficiency of ability estimation among the item selection techniques under varying conditions of local-item dependency when the response model was either the three-parameter-logistic item response theory or the three-parameter-logistic testlet response theory. The item selection techniques performed similarly within any particular condition, the practical implications of which are discussed within the article.
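
For context, the sketch below shows maximum Fisher information selection under a 3PL model for standalone items; the testlet-based variants compared in the article require a testlet response model and are not reproduced here. Item parameters are invented.

```python
# Sketch of maximum Fisher information item selection under a 3PL model.
import math

def p3pl(theta, a, b, c):
    return c + (1 - c) / (1 + math.exp(-a * (theta - b)))

def fisher_info_3pl(theta, a, b, c):
    p = p3pl(theta, a, b, c)
    q = 1 - p
    return (a ** 2) * (q / p) * ((p - c) / (1 - c)) ** 2

def next_item(theta, items, administered):
    """items: dict id -> (a, b, c); pick the unadministered item with maximum information."""
    remaining = {i: pars for i, pars in items.items() if i not in administered}
    return max(remaining, key=lambda i: fisher_info_3pl(theta, *remaining[i]))

items = {101: (1.1, -0.3, 0.20), 102: (0.9, 0.5, 0.20), 103: (1.6, 0.1, 0.25)}
print(next_item(theta=0.2, items=items, administered={101}))
```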

VL - 34 UR - http://apm.sagepub.com/content/34/6/424.abstract ER - TY - CONF T1 - Computerized adaptive testing based on decision trees T2 - 10th IEEE International Conference on Advanced Learning Technologies Y1 - 2010 A1 - Ueno, M. A1 - Songmuang, P. JF - 10th IEEE International Conference on Advanced Learning Technologies PB - IEEE Computer Sience CY - Sousse, Tunisia VL - 58 ER - TY - CHAP T1 - Constrained Adaptive Testing with Shadow Tests T2 - Elements of Adaptive Testing Y1 - 2010 A1 - van der Linden, W. J. JF - Elements of Adaptive Testing ER - TY - CONF T1 - Comparing methods to recalibrate drifting items in computerized adaptive testing T2 - American Educational Research Association Y1 - 2009 A1 - Masters, J. S. A1 - Muckle, T. J. A1 - Bontempo, B JF - American Educational Research Association CY - San Diego, CA ER - TY - CHAP T1 - Comparison of ability estimation and item selection methods in multidimensional computerized adaptive testing Y1 - 2009 A1 - Diao, Q. A1 - Reckase, M. CY - D. J. Weiss (Ed.), Proceedings of the 2009 GMAC Conference on Computerized Adaptive Testing. N1 - {PDF File, 342 KB} ER - TY - CHAP T1 - Comparison of adaptive Bayesian estimation and weighted Bayesian estimation in multidimensional computerized adaptive testing Y1 - 2009 A1 - Chen, P. H. CY - D. J. Weiss (Ed.), Proceedings of the 2009 GMAC Conference on Computerized Adaptive Testing. N1 - {PDF file, 308KB} ER - TY - JOUR T1 - Comparison of CAT Item Selection Criteria for Polytomous Items JF - Applied Psychological Measurement Y1 - 2009 A1 - Choi, Seung W. A1 - Swartz, Richard J. AB -

Item selection is a core component in computerized adaptive testing (CAT). Several studies have evaluated new and classical selection methods; however, the few that have applied such methods to the use of polytomous items have reported conflicting results. To clarify these discrepancies and further investigate selection method properties, six different selection methods are compared systematically. The results showed no clear benefit from more sophisticated selection criteria and showed one method previously believed to be superior—the maximum expected posterior weighted information (MEPWI)—to be mathematically equivalent to a simpler method, the maximum posterior weighted information (MPWI).

VL - 33 UR - http://apm.sagepub.com/content/33/6/419.abstract ER - TY - JOUR T1 - Comparison of CAT item selection criteria for polytomous items JF - Applied Psychological Measurement Y1 - 2009 A1 - Choi, S. W. A1 - Swartz, R.J.. VL - 33 ER - TY - JOUR T1 - Comparison of methods for controlling maximum exposure rates in computerized adaptive testing JF - Psicothema Y1 - 2009 A1 - Barrada, J A1 - Abad, F. J. A1 - Veldkamp, B. P. KW - *Numerical Analysis, Computer-Assisted KW - Psychological Tests/*standards/*statistics & numerical data AB - This paper has two objectives: (a) to provide a clear description of three methods for controlling the maximum exposure rate in computerized adaptive testing —the Symson-Hetter method, the restricted method, and the item-eligibility method— showing how all three can be interpreted as methods for constructing the variable sub-bank of items from which each examinee receives the items in his or her test; (b) to indicate the theoretical and empirical limitations of each method and to compare their performance. With the three methods, we obtained basically indistinguishable results in overlap rate and RMSE (differences in the third decimal place). The restricted method is the best method for controlling exposure rate, followed by the item-eligibility method. The worst method is the Sympson-Hetter method. The restricted method presents problems of sequential overlap rate. Our advice is to use the item-eligibility method, as it saves time and satisfies the goals of restricting maximum exposure. Comparación de métodos para el control de tasa máxima en tests adaptativos informatizados. Este artículo tiene dos objetivos: (a) ofrecer una descripción clara de tres métodos para el control de la tasa máxima en tests adaptativos informatizados, el método Symson-Hetter, el método restringido y el métodode elegibilidad del ítem, mostrando cómo todos ellos pueden interpretarse como métodos para la construcción del subbanco de ítems variable, del cual cada examinado recibe los ítems de su test; (b) señalar las limitaciones teóricas y empíricas de cada método y comparar sus resultados. Se obtienen resultados básicamente indistinguibles en tasa de solapamiento y RMSE con los tres métodos (diferencias en la tercera posición decimal). El método restringido es el mejor en el control de la tasa de exposición,seguido por el método de elegibilidad del ítem. El peor es el método Sympson-Hetter. El método restringido presenta un problema de solapamiento secuencial. Nuestra recomendación sería utilizar el método de elegibilidad del ítem, puesto que ahorra tiempo y satisface los objetivos de limitar la tasa máxima de exposición. VL - 21 SN - 0214-9915 (Print)0214-9915 (Linking) N1 - Barrada, Juan RamonAbad, Francisco JoseVeldkamp, Bernard PComparative StudySpainPsicothemaPsicothema. 2009 May;21(2):313-20. ER - TY - CHAP T1 - A comparison of three methods of item selection for computerized adaptive testing Y1 - 2009 A1 - Costa, D. R. A1 - Karino, C. A. A1 - Moura, F. A. S. A1 - Andrade, D. F. CY - D. J. Weiss (Ed.), Proceedings of the 2009 GMAC Conference on Computerized Adaptive Testing. N1 - PDF file, 531 K ER - TY - CHAP T1 - Computerized adaptive testing by mutual information and multiple imputations Y1 - 2009 A1 - Thissen-Roe, A. AB - Over the years, most computerized adaptive testing (CAT) systems have used score estimation procedures from item response theory (IRT). IRT models have salutary properties for score estimation, error reporting, and next-item selection. 
However, some testing purposes favor scoring approaches outside IRT. Where a criterion metric is readily available and more relevant than the assessed construct, for example in the selection of job applicants, a predictive model might be appropriate (Scarborough & Somers, 2006). In these cases, neither IRT scoring nor a unidimensional assessment structure can be assumed. Yet, the primary benefit of CAT remains desirable: shorter assessments with minimal loss of accuracy due to unasked items. In such a case, it remains possible to create a CAT system that produces an estimated score from a subset of available items, recognizes differential item information given the emerging item response pattern, and optimizes the accuracy of the score estimated at every successive item. The method of multiple imputations (Rubin, 1987) can be used to simulate plausible scores given plausible response patterns to unasked items (Thissen-Roe, 2005). Mutual information can then be calculated in order to select an optimally informative next item (or set of items). Previously observed response patterns to two complete neural network-scored assessments were resampled according to MIMI CAT item selection. The reproduced CAT scores were compared to full-length assessment scores. Approximately 95% accurate assignment of examinees to one of three score categories was achieved with a 70%-80% reduction in median test length. Several algorithmic factors influencing accuracy and computational performance were examined. CY - D. J. Weiss (Ed.), Proceedings of the 2009 GMAC Conference on Computerized Adaptive Testing. N1 - {PDF file, 179 KB} ER - TY - CHAP T1 - Computerized adaptive testing for cognitive diagnosis Y1 - 2009 A1 - Cheng, Y CY - D. J. Weiss (Ed.), Proceedings of the 2009 GMAC Conference on Computerized Adaptive Testing. N1 - {PDF File, 308 KB} ER - TY - CONF T1 - Computerized adaptive testing using the two parameter logistic model with ability-based guessing T2 - Paper presented at the International Meeting of the Psychometric Society. Cambridge Y1 - 2009 A1 - Shih, H.-J. A1 - Wang, W-C. JF - Paper presented at the International Meeting of the Psychometric Society. Cambridge ER - TY - CHAP T1 - Computerized classification testing in more than two categories by using stochastic curtailment Y1 - 2009 A1 - Wouda, J. T. A1 - Theo Eggen CY - D. J. Weiss (Ed.), Proceedings of the 2009 GMAC Conference on Computerized Adaptive Testing. N1 - {PDF file, 298 KB} ER - TY - JOUR T1 - A conditional exposure control method for multidimensional adaptive testing JF - Journal of Educational Measurement Y1 - 2009 A1 - Finkelman, M. A1 - Nering, M. L. A1 - Roussos, L. A. VL - 46 ER - TY - JOUR T1 - A Conditional Exposure Control Method for Multidimensional Adaptive Testing JF - Journal of Educational Measurement Y1 - 2009 A1 - Matthew Finkelman A1 - Nering, Michael L. A1 - Roussos, Louis A. AB -

In computerized adaptive testing (CAT), ensuring the security of test items is a crucial practical consideration. A common approach to reducing item theft is to define maximum item exposure rates, i.e., to limit the proportion of examinees to whom a given item can be administered. Numerous methods for controlling exposure rates have been proposed for tests employing the unidimensional 3-PL model. The present article explores the issues associated with controlling exposure rates when a multidimensional item response theory (MIRT) model is utilized and exposure rates must be controlled conditional upon ability. This situation is complicated by the exponentially increasing number of possible ability values in multiple dimensions. The article introduces a new procedure, called the generalized Stocking-Lewis method, that controls the exposure rate for students of comparable ability as well as with respect to the overall population. A realistic simulation set compares the new method with three other approaches: Kullback-Leibler information with no exposure control, Kullback-Leibler information with unconditional Sympson-Hetter exposure control, and random item selection.

VL - 46 UR - http://dx.doi.org/10.1111/j.1745-3984.2009.01070.x ER - TY - JOUR T1 - Considerations about expected a posteriori estimation in adaptive testing: adaptive a priori, adaptive correction for bias, and adaptive integration interval JF - Journal of Applied Measurement Y1 - 2009 A1 - Raiche, G. A1 - Blais, J. G. KW - *Bias (Epidemiology) KW - *Computers KW - Data Interpretation, Statistical KW - Models, Statistical AB - In a computerized adaptive test, we would like to obtain an acceptable precision of the proficiency level estimate using an optimal number of items. Unfortunately, decreasing the number of items is accompanied by a certain degree of bias when the true proficiency level differs significantly from the a priori estimate. The authors suggest that it is possible to reduced the bias, and even the standard error of the estimate, by applying to each provisional estimation one or a combination of the following strategies: adaptive correction for bias proposed by Bock and Mislevy (1982), adaptive a priori estimate, and adaptive integration interval. VL - 10 SN - 1529-7713 (Print)1529-7713 (Linking) N1 - Raiche, GillesBlais, Jean-GuyUnited StatesJournal of applied measurementJ Appl Meas. 2009;10(2):138-56. ER - TY - CHAP T1 - Constrained item selection using a stochastically curtailed SPRT Y1 - 2009 A1 - Wouda, J. T. A1 - Theo Eggen CY - D. J. Weiss (Ed.), Proceedings of the 2009 GMAC Conference on Computerized Adaptive Testing. N1 - {PDF File, 298 KB}{PDF File, 298 KB} ER - TY - JOUR T1 - Constraint-weighted a-stratification for computerized adaptive testing with nonstatistical constraints: Balancing measurement efficiency and exposure control JF - Educational and Psychological Measurement Y1 - 2009 A1 - Cheng, Y A1 - Chang, Hua-Hua A1 - Douglas, J. A1 - Guo, F. VL - 69 ER - TY - JOUR T1 - Constraint-Weighted a-Stratification for Computerized Adaptive Testing With Nonstatistical Constraints JF - Educational and Psychological Measurement Y1 - 2009 A1 - Ying Cheng, A1 - Chang, Hua-Hua A1 - Douglas, Jeffrey A1 - Fanmin Guo, AB -

a-stratification is a method that utilizes items with small discrimination (a) parameters early in an exam and those with higher a values when more is learned about the ability parameter. It can achieve much better item usage than the maximum information criterion (MIC). To make a-stratification more practical and more widely applicable, a method for weighting the item selection process in a-stratification as a means of satisfying multiple test constraints is proposed. This method is studied in simulation against an analogous method without stratification as well as a-stratification using descending- rather than ascending-a procedures. In addition, a variation of a-stratification that allows for unbalanced usage of a parameters is included in the study to examine the trade-off between efficiency and exposure control. Finally, MIC and randomized item selection are included as baseline measures. Results indicate that the weighting mechanism successfully addresses the constraints, that stratification helps to a great extent in balancing exposure rates, and that the ascending-a design improves measurement precision.
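
A compact sketch of ascending-a stratified selection, assuming the pool is split into strata by discrimination and the item whose difficulty is closest to the current theta estimate is chosen within the current stage's stratum; the pool and parameters are hypothetical, and the constraint-weighting mechanism proposed in the article is not shown.

```python
# Sketch of a-stratified item selection: early stages draw from low-a strata,
# later stages from high-a strata, matching item difficulty b to current theta.
def build_strata(items, n_strata):
    """items: dict id -> (a, b); returns a list of id-lists ordered by ascending a."""
    ordered = sorted(items, key=lambda i: items[i][0])
    size = -(-len(ordered) // n_strata)          # ceiling division
    return [ordered[k:k + size] for k in range(0, len(ordered), size)]

def select(stage, theta, items, strata, used):
    stratum = [i for i in strata[min(stage, len(strata) - 1)] if i not in used]
    return min(stratum, key=lambda i: abs(items[i][1] - theta))

items = {1: (0.5, -1.0), 2: (0.6, 0.2), 3: (1.2, -0.4), 4: (1.4, 0.8)}  # invented (a, b)
strata = build_strata(items, n_strata=2)
print(select(stage=0, theta=0.1, items=items, strata=strata, used=set()))
```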

VL - 69 UR - http://epm.sagepub.com/content/69/1/35.abstract ER - TY - CHAP T1 - Criterion-related validity of an innovative CAT-based personality measure Y1 - 2009 A1 - Schneider, R. J. A1 - McLellan, R. A. A1 - Kantrowitz, T. M. A1 - Houston, J. S. A1 - Borman, W. C. AB - This paper describes development and initial criterion-related validation of the PreVisor Computer Adaptive Personality Scales (PCAPS), a computerized adaptive testing-based personality measure that uses an ideal point IRT model based on forced-choice, paired-comparison responses. Based on results from a large consortium study, a composite of six PCAPS scales identified as relevant to the population of interest (first-line supervisors) had an estimated operational validity against an overall job performance criterion of ρ = .25. Uncorrected and corrected criterion-related validity results for each of the six PCAPS scales making up the composite are also reported. Because the PCAPS algorithm computes intermediate scale scores until a stopping rule is triggered, we were able to graph number of statement-pairs presented against criterion-related validities. Results showed generally monotonically increasing functions. However, asymptotic validity levels, or at least a reduction in the rate of increase in slope, were often reached after 5-7 statement-pairs were presented. In the case of the composite measure, there was some evidence that validities decreased after about six statement-pairs. A possible explanation for this is provided. CY - D. J. Weiss (Ed.), Proceedings of the 2009 GMAC Conference on Computerized Adaptive Testing. N1 - {PDF File, 163 KB} ER - TY - JOUR T1 - CAT-MD: Computerized adaptive testing on mobile devices JF - International Journal of Web-Based Learning and Teaching Technologies Y1 - 2008 A1 - Triantafillou, E. A1 - Georgiadou, E. A1 - Economides, A. A. VL - 3 ER - TY - JOUR T1 - Combining computer adaptive testing technology with cognitively diagnostic assessment JF - Behavioral Research Methods Y1 - 2008 A1 - McGlohen, M. A1 - Chang, Hua-Hua KW - *Cognition KW - *Computers KW - *Models, Statistical KW - *User-Computer Interface KW - Diagnosis, Computer-Assisted/*instrumentation KW - Humans AB - A major advantage of computerized adaptive testing (CAT) is that it allows the test to home in on an examinee's ability level in an interactive manner. The aim of the new area of cognitive diagnosis is to provide information about specific content areas in which an examinee needs help. The goal of this study was to combine the benefit of specific feedback from cognitively diagnostic assessment with the advantages of CAT. In this study, three approaches to combining these were investigated: (1) item selection based on the traditional ability level estimate (theta), (2) item selection based on the attribute mastery feedback provided by cognitively diagnostic assessment (alpha), and (3) item selection based on both the traditional ability level estimate (theta) and the attribute mastery feedback provided by cognitively diagnostic assessment (alpha). The results from these three approaches were compared for theta estimation accuracy, attribute mastery estimation accuracy, and item exposure control. The theta- and alpha-based condition outperformed the alpha-based condition regarding theta estimation, attribute mastery pattern estimation, and item exposure control. 
Both the theta-based condition and the theta- and alpha-based condition performed similarly with regard to theta estimation, attribute mastery estimation, and item exposure control, but the theta- and alpha-based condition has an additional advantage in that it uses the shadow test method, which allows the administrator to incorporate additional constraints in the item selection process, such as content balancing, item type constraints, and so forth, and also to select items on the basis of both the current theta and alpha estimates, which can be built on top of existing 3PL testing programs. VL - 40 SN - 1554-351X (Print) N1 - McGlohen, MeghanChang, Hua-HuaUnited StatesBehavior research methodsBehav Res Methods. 2008 Aug;40(3):808-21. ER - TY - JOUR T1 - Comparability of Computer-Based and Paper-and-Pencil Testing in K–12 Reading Assessments JF - Educational and Psychological Measurement Y1 - 2008 A1 - Shudong Wang, A1 - Hong Jiao, A1 - Young, Michael J. A1 - Brooks, Thomas A1 - Olson, John AB -

In recent years, computer-based testing (CBT) has grown in popularity, is increasingly being implemented across the United States, and will likely become the primary mode for delivering tests in the future. Although CBT offers many advantages over traditional paper-and-pencil testing, assessment experts, researchers, practitioners, and users have expressed concern about the comparability of scores between the two test administration modes. To help provide an answer to this issue, a meta-analysis was conducted to synthesize the administration mode effects of CBTs and paper-and-pencil tests on K—12 student reading assessments. Findings indicate that the administration mode had no statistically significant effect on K—12 student reading achievement scores. Four moderator variables—study design, sample size, computer delivery algorithm, and computer practice—made statistically significant contributions to predicting effect size. Three moderator variables—grade level, type of test, and computer delivery method—did not affect the differences in reading scores between test modes.

VL - 68 UR - http://epm.sagepub.com/content/68/1/5.abstract ER - TY - JOUR T1 - Computer Adaptive-Attribute Testing A New Approach to Cognitive Diagnostic Assessment JF - Zeitschrift für Psychologie / Journal of Psychology Y1 - 2008 A1 - Gierl, M. J. A1 - Zhou, J. KW - cognition and assessment KW - cognitive diagnostic assessment KW - computer adaptive testing AB -

The influence of interdisciplinary forces stemming from developments in cognitive science, mathematical statistics, educational psychology, and computing science is beginning to appear in educational and psychological assessment. Computer adaptive-attribute testing (CA-AT) is one example. The concepts and procedures in CA-AT can be found at the intersection between computer adaptive testing and cognitive diagnostic assessment. CA-AT allows us to fuse the administrative benefits of computer adaptive testing with the psychological benefits of cognitive diagnostic assessment to produce an innovative psychologically-based adaptive testing approach. We describe the concepts behind CA-AT as well as illustrate how it can be used to promote formative, computer-based, classroom assessment.

VL - 216 IS - 1 ER - TY - JOUR T1 - Computer-Based and Paper-and-Pencil Administration Mode Effects on a Statewide End-of-Course English Test JF - Educational and Psychological Measurement Y1 - 2008 A1 - Kim, Do-Hong A1 - Huynh, Huynh AB -

The current study compared student performance between paper-and-pencil testing (PPT) and computer-based testing (CBT) on a large-scale statewide end-of-course English examination. Analyses were conducted at both the item and test levels. The overall results suggest that scores obtained from PPT and CBT were comparable. However, at the content domain level, a rather large difference in the reading comprehension section suggests that the reading comprehension test may be more affected by the test administration mode. Results from the confirmatory factor analysis suggest that the administration mode did not alter the construct of the test.

VL - 68 UR - http://epm.sagepub.com/content/68/4/554.abstract ER - TY - JOUR T1 - Computerized adaptive testing for follow-up after discharge from inpatient rehabilitation: II. Participation outcomes JF - Archives of Physical Medicine and Rehabilitation Y1 - 2008 A1 - Haley, S. M. A1 - Gandek, B. A1 - Siebens, H. A1 - Black-Schaffer, R. M. A1 - Sinclair, S. J. A1 - Tao, W. A1 - Coster, W. J. A1 - Ni, P. A1 - Jette, A. M. KW - *Activities of Daily Living KW - *Adaptation, Physiological KW - *Computer Systems KW - *Questionnaires KW - Adult KW - Aged KW - Aged, 80 and over KW - Chi-Square Distribution KW - Factor Analysis, Statistical KW - Female KW - Humans KW - Longitudinal Studies KW - Male KW - Middle Aged KW - Outcome Assessment (Health Care)/*methods KW - Patient Discharge KW - Prospective Studies KW - Rehabilitation/*standards KW - Subacute Care/*standards AB - OBJECTIVES: To measure participation outcomes with a computerized adaptive test (CAT) and compare CAT and traditional fixed-length surveys in terms of score agreement, respondent burden, discriminant validity, and responsiveness. DESIGN: Longitudinal, prospective cohort study of patients interviewed approximately 2 weeks after discharge from inpatient rehabilitation and 3 months later. SETTING: Follow-up interviews conducted in patient's home setting. PARTICIPANTS: Adults (N=94) with diagnoses of neurologic, orthopedic, or medically complex conditions. INTERVENTIONS: Not applicable. MAIN OUTCOME MEASURES: Participation domains of mobility, domestic life, and community, social, & civic life, measured using a CAT version of the Participation Measure for Postacute Care (PM-PAC-CAT) and a 53-item fixed-length survey (PM-PAC-53). RESULTS: The PM-PAC-CAT showed substantial agreement with PM-PAC-53 scores (intraclass correlation coefficient, model 3,1, .71-.81). On average, the PM-PAC-CAT was completed in 42% of the time and with only 48% of the items as compared with the PM-PAC-53. Both formats discriminated across functional severity groups. The PM-PAC-CAT had modest reductions in sensitivity and responsiveness to patient-reported change over a 3-month interval as compared with the PM-PAC-53. CONCLUSIONS: Although continued evaluation is warranted, accurate estimates of participation status and responsiveness to change for group-level analyses can be obtained from CAT administrations, with a sizeable reduction in respondent burden. VL - 89 SN - 1532-821X (Electronic)0003-9993 (Linking) N1 - Haley, Stephen MGandek, BarbaraSiebens, HilaryBlack-Schaffer, Randie MSinclair, Samuel JTao, WeiCoster, Wendy JNi, PengshengJette, Alan MK02 HD045354-01A1/HD/NICHD NIH HHS/United StatesK02 HD45354-01/HD/NICHD NIH HHS/United StatesR01 HD043568/HD/NICHD NIH HHS/United StatesR01 HD043568-01/HD/NICHD NIH HHS/United StatesResearch Support, N.I.H., ExtramuralUnited StatesArchives of physical medicine and rehabilitationArch Phys Med Rehabil. 2008 Feb;89(2):275-83. U2 - 2666330 ER - TY - JOUR T1 - Computerized adaptive testing for patients with knee inpairments produced valid and responsive measures of function JF - Journal of Clinical Epidemiology Y1 - 2008 A1 - Hart, D. L. A1 - Wang, Y-C. A1 - Stratford, P. W. A1 - Mioduski, J. E. VL - 61 ER - TY - JOUR T1 - Computerized adaptive testing in back pain: Validation of the CAT-5D-QOL JF - Spine Y1 - 2008 A1 - Kopec, J. A. A1 - Badii, M. A1 - McKenna, M. A1 - Lima, V. D. A1 - Sayre, E. C. A1 - Dvorak, M. 
KW - *Disability Evaluation KW - *Health Status Indicators KW - *Quality of Life KW - Adult KW - Aged KW - Algorithms KW - Back Pain/*diagnosis/psychology KW - British Columbia KW - Diagnosis, Computer-Assisted/*standards KW - Feasibility Studies KW - Female KW - Humans KW - Internet KW - Male KW - Middle Aged KW - Predictive Value of Tests KW - Questionnaires/*standards KW - Reproducibility of Results AB - STUDY DESIGN: We have conducted an outcome instrument validation study. OBJECTIVE: Our objective was to develop a computerized adaptive test (CAT) to measure 5 domains of health-related quality of life (HRQL) and assess its feasibility, reliability, validity, and efficiency. SUMMARY OF BACKGROUND DATA: Kopec and colleagues have recently developed item response theory based item banks for 5 domains of HRQL relevant to back pain and suitable for CAT applications. The domains are Daily Activities (DAILY), Walking (WALK), Handling Objects (HAND), Pain or Discomfort (PAIN), and Feelings (FEEL). METHODS: An adaptive algorithm was implemented in a web-based questionnaire administration system. The questionnaire included CAT-5D-QOL (5 scales), Modified Oswestry Disability Index (MODI), Roland-Morris Disability Questionnaire (RMDQ), SF-36 Health Survey, and standard clinical and demographic information. Participants were outpatients treated for mechanical back pain at a referral center in Vancouver, Canada. RESULTS: A total of 215 patients completed the questionnaire and 84 completed a retest. On average, patients answered 5.2 items per CAT-5D-QOL scale. Reliability ranged from 0.83 (FEEL) to 0.92 (PAIN) and was 0.92 for the MODI, RMDQ, and Physical Component Summary (PCS-36). The ceiling effect was 0.5% for PAIN compared with 2% for MODI and 5% for RMQ. The CAT-5D-QOL scales correlated as anticipated with other measures of HRQL and discriminated well according to the level of satisfaction with current symptoms, duration of the last episode, sciatica, and disability compensation. The average relative discrimination index was 0.87 for PAIN, 0.67 for DAILY and 0.62 for WALK, compared with 0.89 for MODI, 0.80 for RMDQ, and 0.59 for PCS-36. CONCLUSION: The CAT-5D-QOL is feasible, reliable, valid, and efficient in patients with back pain. This methodology can be recommended for use in back pain research and should improve outcome assessment, facilitate comparisons across studies, and reduce patient burden. VL - 33 SN - 1528-1159 (Electronic)0362-2436 (Linking) N1 - Kopec, Jacek ABadii, MaziarMcKenna, MarioLima, Viviane DSayre, Eric CDvorak, MarcelResearch Support, Non-U.S. Gov'tValidation StudiesUnited StatesSpineSpine (Phila Pa 1976). 2008 May 20;33(12):1384-90. ER - TY - JOUR T1 - Computerized Adaptive Testing of Personality Traits JF - Zeitschrift für Psychologie / Journal of Psychology Y1 - 2008 A1 - Hol, A. M. A1 - Vorst, H. C. M. A1 - Mellenbergh, G. J. KW - Adaptive Testing KW - cmoputer-assisted testing KW - Item Response Theory KW - Likert scales KW - Personality Measures AB -

A computerized adaptive testing (CAT) procedure was simulated with ordinal polytomous personality data collected using a
conventional paper-and-pencil testing format. An adapted Dutch version of the dominance scale of Gough and Heilbrun’s Adjective
Check List (ACL) was used. This version contained Likert response scales with five categories. Item parameters were estimated using Samejima’s graded response model from the responses of 1,925 subjects. The CAT procedure was simulated using the responses of 1,517 other subjects. The value of the required standard error in the stopping rule of the CAT was manipulated. The relationship between CAT latent trait estimates and estimates based on all dominance items was studied. Additionally, the pattern of relationships between the CAT latent trait estimates and the other ACL scales was compared to that between latent trait estimates based on the entire item pool and the other ACL scales. The CAT procedure resulted in latent trait estimates qualitatively equivalent to latent trait estimates based on all items, while a substantial reduction in the number of items administered could be realized (with a stopping rule of 0.4, about 33% of the 36 items were used).
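
A sketch of the standard-error stopping rule manipulated in this study, assuming EAP estimation of theta under Samejima's graded response model on a coarse grid; the item parameters, responses, and the 0.4 threshold below are used purely for illustration.

```python
# Sketch of a standard-error stopping rule: EAP estimation of theta under the
# graded response model on a fixed grid, stopping once the posterior SD falls
# below a threshold. Item parameters (a, ordered thresholds b_k) are invented.
import numpy as np

def grm_category_probs(theta, a, bs):
    """Return P(X = k | theta) for k = 0..len(bs); bs are ordered thresholds."""
    cum = [1.0] + [1 / (1 + np.exp(-a * (theta - b))) for b in bs] + [0.0]
    return np.array([cum[k] - cum[k + 1] for k in range(len(bs) + 1)])

def eap_and_sd(responses, items, grid=np.linspace(-4, 4, 81)):
    """responses: category indices; items: list of (a, bs) tuples."""
    prior = np.exp(-0.5 * grid ** 2)                 # standard normal prior (unnormalized)
    like = np.ones_like(grid)
    for x, (a, bs) in zip(responses, items):
        like *= np.array([grm_category_probs(t, a, bs)[x] for t in grid])
    post = prior * like
    post /= post.sum()
    eap = np.sum(grid * post)
    sd = np.sqrt(np.sum((grid - eap) ** 2 * post))
    return eap, sd

items = [(1.3, [-1.5, -0.5, 0.5, 1.5]), (1.0, [-1.0, 0.0, 1.0, 2.0])]
theta_hat, se = eap_and_sd(responses=[3, 2], items=items)
print(f"EAP = {theta_hat:.2f}, SE = {se:.2f}, stop = {se < 0.4}")
```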

VL - 216 IS - 1 ER - TY - JOUR T1 - Controlling item exposure and test overlap on the fly in computerized adaptive testing JF - British Journal of Mathematical and Statistical Psychology Y1 - 2008 A1 - Chen, S-Y. A1 - Lei, P. W. A1 - Liao, W. H. KW - *Decision Making, Computer-Assisted KW - *Models, Psychological KW - Humans AB - This paper proposes an on-line version of the Sympson and Hetter procedure with test overlap control (SHT) that can provide item exposure control at both the item and test levels on the fly without iterative simulations. The on-line procedure is similar to the SHT procedure in that exposure parameters are used for simultaneous control of item exposure rates and test overlap rate. The exposure parameters for the on-line procedure, however, are updated sequentially on the fly, rather than through iterative simulations conducted prior to operational computerized adaptive tests (CATs). Unlike the SHT procedure, the on-line version can control item exposure rate and test overlap rate without time-consuming iterative simulations even when item pools or examinee populations have been changed. Moreover, the on-line procedure was found to perform better than the SHT procedure in controlling item exposure and test overlap for examinees who take tests earlier. Compared with two other on-line alternatives, this proposed on-line method provided the best all-around test security control. Thus, it would be an efficient procedure for controlling item exposure and test overlap in CATs. VL - 61 SN - 0007-1102 (Print)0007-1102 (Linking) N1 - Chen, Shu-YingLei, Pui-WaLiao, Wen-HanResearch Support, Non-U.S. Gov'tEnglandThe British journal of mathematical and statistical psychologyBr J Math Stat Psychol. 2008 Nov;61(Pt 2):471-92. Epub 2007 Jul 23. ER - TY - CHAP T1 - CAT Security: A practitioner’s perspective Y1 - 2007 A1 - Guo, F. CY - D. J. Weiss (Ed.), Proceedings of the 2007 GMAC Conference on Computerized Adaptive Testing. N1 - {PDF file, 104 KB} ER - TY - CHAP T1 - Choices in CAT models in the context of educational testing Y1 - 2007 A1 - Theo Eggen CY - D. J. Weiss (Ed.), Proceedings of the 2007 GMAC Conference on Computerized Adaptive Testing. N1 - {PDF file, 123 KB} ER - TY - CONF T1 - Choices in CAT models in the context of educattional testing T2 - GMAC Conference on Computerized Adaptive Testing Y1 - 2007 A1 - Theo Eggen JF - GMAC Conference on Computerized Adaptive Testing PB - Graduate Management Admission Council CY - St. Paul, MN ER - TY - CHAP T1 - Comparison of computerized adaptive testing and classical methods for measuring individual change Y1 - 2007 A1 - Kim-Kang, G. A1 - Weiss, D. J. CY - D. J. Weiss (Ed.). Proceedings of the 2007 GMAC Conference on Computerized Adaptive Testing. N1 - {PDF file, 347 KB} ER - TY - JOUR T1 - The comparison of maximum likelihood estimation and expected a posteriori in CAT using the graded response model JF - Journal of Elementary Education Y1 - 2007 A1 - Chen, S-K. VL - 19 ER - TY - BOOK T1 - A comparison of two methods of polytomous computerized classification testing for multiple cutscores Y1 - 2007 A1 - Thompson, N. A. CY - Unpublished doctoral dissertation, University of Minnesota N1 - {PDF file, 363 KB} ER - TY - JOUR T1 - Computerized adaptive personality testing: A review and illustration with the MMPI-2 Computerized Adaptive Version JF - Psychological Assessment Y1 - 2007 A1 - Forbey, J. D. A1 - Ben-Porath, Y. S. 
KW - Adolescent KW - Adult KW - Diagnosis, Computer-Assisted/*statistics & numerical data KW - Female KW - Humans KW - Male KW - MMPI/*statistics & numerical data KW - Personality Assessment/*statistics & numerical data KW - Psychometrics/statistics & numerical data KW - Reference Values KW - Reproducibility of Results AB - Computerized adaptive testing in personality assessment can improve efficiency by significantly reducing the number of items administered to answer an assessment question. Two approaches have been explored for adaptive testing in computerized personality assessment: item response theory and the countdown method. In this article, the authors review the literature on each and report the results of an investigation designed to explore the utility, in terms of item and time savings, and validity, in terms of correlations with external criterion measures, of an expanded countdown method-based research version of the Minnesota Multiphasic Personality Inventory-2 (MMPI-2), the MMPI-2 Computerized Adaptive Version (MMPI-2-CA). Participants were 433 undergraduate college students (170 men and 263 women). Results indicated considerable item savings and corresponding time savings for the adaptive testing modalities compared with a conventional computerized MMPI-2 administration. Furthermore, computerized adaptive administration yielded comparable results to computerized conventional administration of the MMPI-2 in terms of both test scores and their validity. Future directions for computerized adaptive personality testing are discussed. VL - 19 SN - 1040-3590 (Print) N1 - Forbey, Johnathan DBen-Porath, Yossef SResearch Support, Non-U.S. Gov'tUnited StatesPsychological assessmentPsychol Assess. 2007 Mar;19(1):14-24. ER - TY - JOUR T1 - Computerized adaptive testing for measuring development of young children JF - Statistics in Medicine Y1 - 2007 A1 - Jacobusse, G. A1 - Buuren, S. KW - *Child Development KW - *Models, Statistical KW - Child, Preschool KW - Diagnosis, Computer-Assisted/*statistics & numerical data KW - Humans KW - Netherlands AB - Developmental indicators that are used for routine measurement in The Netherlands are usually chosen to optimally identify delayed children. Measurements on the majority of children without problems are therefore quite imprecise. This study explores the use of computerized adaptive testing (CAT) to monitor the development of young children. CAT is expected to improve the measurement precision of the instrument. We do two simulation studies - one with real data and one with simulated data - to evaluate the usefulness of CAT. It is shown that CAT selects developmental indicators that maximally match the individual child, so that all children can be measured to the same precision. VL - 26 SN - 0277-6715 (Print) N1 - Jacobusse, GertBuuren, Stef vanEnglandStatistics in medicineStat Med. 2007 Jun 15;26(13):2629-38. ER - TY - JOUR T1 - Computerized adaptive testing for polytomous motivation items: Administration mode effects and a comparison with short forms JF - Applied Psychological Measurement Y1 - 2007 A1 - Hol, A. M. A1 - Vorst, H. C. M. A1 - Mellenbergh, G. J. KW - 2220 Tests & Testing KW - Adaptive Testing KW - Attitude Measurement KW - computer adaptive testing KW - Computer Assisted Testing KW - items KW - Motivation KW - polytomous motivation KW - Statistical Validity KW - Test Administration KW - Test Forms KW - Test Items AB - In a randomized experiment (n=515), a computerized and a computerized adaptive test (CAT) are compared. 
The item pool consists of 24 polytomous motivation items. Although items are carefully selected, calibration data show that Samejima's graded response model did not fit the data optimally. A simulation study is done to assess possible consequences of model misfit. CAT efficiency was studied by a systematic comparison of the CAT with two types of conventional fixed length short forms, which are created to be good CAT competitors. Results showed no essential administration mode effects. Efficiency analyses show that CAT outperformed the short forms in almost all aspects when results are aggregated along the latent trait scale. The real and the simulated data results are very similar, which indicate that the real data results are not affected by model misfit. (PsycINFO Database Record (c) 2007 APA ) (journal abstract) VL - 31 SN - 0146-6216 N1 - 10.1177/0146621606297314Journal; Peer Reviewed Journal; Journal Article ER - TY - JOUR T1 - Computerized Adaptive Testing for Polytomous Motivation Items: Administration Mode Effects and a Comparison With Short Forms JF - Applied Psychological Measurement Y1 - 2007 A1 - Hol, A. Michiel A1 - Vorst, Harrie C. M. A1 - Mellenbergh, Gideon J. AB -

In a randomized experiment (n = 515), a computerized and a computerized adaptive test (CAT) are compared. The item pool consists of 24 polytomous motivation items. Although items are carefully selected, calibration data show that Samejima's graded response model did not fit the data optimally. A simulation study is done to assess possible consequences of model misfit. CAT efficiency was studied by a systematic comparison of the CAT with two types of conventional fixed length short forms, which are created to be good CAT competitors. Results showed no essential administration mode effects. Efficiency analyses show that CAT outperformed the short forms in almost all aspects when results are aggregated along the latent trait scale. The real and the simulated data results are very similar, which indicate that the real data results are not affected by model misfit.

VL - 31 UR - http://apm.sagepub.com/content/31/5/412.abstract ER - TY - CHAP T1 - Computerized adaptive testing with the bifactor model Y1 - 2007 A1 - Weiss, D. J. A1 - Gibbons, R. D. CY - D. J. Weiss (Ed.). Proceedings of the 2007 GMAC Conference on Computerized Adaptive Testing. N1 - {PDF file, 159 KB} ER - TY - CHAP T1 - Computerized attribute-adaptive testing: A new computerized adaptive testing approach incorporating cognitive psychology Y1 - 2007 A1 - Zhou, J. A1 - Gierl, M. J. A1 - Cui, Y. CY - D. J. Weiss (Ed.), Proceedings of the 2007 GMAC Conference on Computerized Adaptive Testing. N1 - {PDF file, 296 KB} ER - TY - CHAP T1 - Computerized classification testing with composite hypotheses Y1 - 2007 A1 - Thompson, N. A. A1 - Ro, S. CY - D. J. Weiss (Ed.). Proceedings of the 2007 GMAC Conference on Computerized Adaptive Testing. N1 - {PDF file, 96 KB} ER - TY - Generic T1 - Computerized classification testing with composite hypotheses T2 - GMAC Conference on Computerized Adaptive Testing Y1 - 2007 A1 - Thompson, N. A. A1 - Ro, S. KW - computerized adaptive testing JF - GMAC Conference on Computerized Adaptive Testing PB - Graduate Management Admissions Council CY - St. Paul, MN N1 - Proceedings of the 2007 GMAC Conference on Computerized Adaptive Testing. Retrieved [date] from www. psych. umn. edu/psylabs/CATCentral ER - TY - JOUR T1 - Computerizing Organizational Attitude Surveys JF - Educational and Psychological Measurement Y1 - 2007 A1 - Mueller, Karsten A1 - Liebig, Christian A1 - Hattrup, Keith AB -

Two quasi-experimental field studies were conducted to evaluate the psychometric equivalence of computerized and paper-and-pencil job satisfaction measures. The present research extends previous work in the area by providing better control of common threats to validity in quasi-experimental research on test mode effects and by evaluating a more comprehensive measurement model for job attitudes. Results of both studies demonstrated substantial equivalence of the computerized measure with the paper-and-pencil version. Implications for the practical use of computerized organizational attitude surveys are discussed.

VL - 67 UR - http://epm.sagepub.com/content/67/4/658.abstract ER - TY - JOUR T1 - Conditional Item-Exposure Control in Adaptive Testing Using Item-Ineligibility Probabilities JF - Journal of Educational and Behavioral Statistics Y1 - 2007 A1 - van der Linden, Wim J. A1 - Veldkamp, Bernard P. AB -

Two conditional versions of the exposure-control method with item-ineligibility constraints for adaptive testing in van der Linden and Veldkamp (2004) are presented. The first version is for unconstrained item selection, the second for item selection with content constraints imposed by the shadow-test approach. In both versions, the exposure rates of the items are controlled using probabilities of item ineligibility given θ that adapt the exposure rates automatically to a goal value for the items in the pool. In an extensive empirical study with an adaptive version of the Law School Admission Test, the authors show how the method can be used to drive conditional exposure rates below goal values as low as 0.025. Obviously, the price to be paid for minimal exposure rates is a decrease in the accuracy of the ability estimates. This trend is illustrated with empirical data.
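
The sketch below gives a simplified rendering of exposure control through item-eligibility probabilities, nudging each item's eligibility toward a goal exposure rate; this is an illustrative adaptation, not the exact updating rule derived in the article, and all values are hypothetical.

```python
# Simplified sketch of eligibility-based exposure control: before each test,
# every item is declared eligible with probability p_elig[i]; afterwards the
# probabilities are adjusted so observed exposure rates move toward r_max.
# This is NOT the authors' exact recursion; values are invented.
import random

def draw_eligible(p_elig):
    return {i for i, p in p_elig.items() if random.random() <= p}

def update_eligibility(p_elig, exposure_rate, r_max):
    """Lower eligibility for over-exposed items, gently relax it for under-exposed ones."""
    new = {}
    for i, p in p_elig.items():
        if exposure_rate[i] > r_max:
            new[i] = max(0.0, p * r_max / exposure_rate[i])
        else:
            new[i] = min(1.0, p * 1.05)
    return new

p_elig = {"a": 1.0, "b": 1.0}
print("eligible this session:", draw_eligible(p_elig))
p_elig = update_eligibility(p_elig, exposure_rate={"a": 0.40, "b": 0.05}, r_max=0.25)
print("updated eligibility probabilities:", p_elig)
```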

VL - 32 UR - http://jeb.sagepub.com/cgi/content/abstract/32/4/398 ER - TY - CONF T1 - Cutscore location and classification accuracy in computerized classification testing T2 - Paper presented at the international meeting of the Psychometric Society Y1 - 2007 A1 - Ro, S. A1 - Thompson, N. A. JF - Paper presented at the international meeting of the Psychometric Society CY - Tokyo, Japan N1 - {PDF file, 94 KB} ER - TY - ABST T1 - A CAT with personality and attitude Y1 - 2006 A1 - Hol, A. M. CY - Enschede, The Netherlands: PrintPartners Ipskamp B N1 - #HO06-01 . ER - TY - JOUR T1 - Comparing methods of assessing differential item functioning in a computerized adaptive testing environment JF - Journal of Educational Measurement Y1 - 2006 A1 - Lei, P-W. A1 - Chen, S-Y. A1 - Yu, L. KW - computerized adaptive testing KW - educational testing KW - item response theory likelihood ratio test KW - logistic regression KW - trait estimation KW - unidirectional & non-unidirectional differential item functioning AB - Mantel-Haenszel and SIBTEST, which have known difficulty in detecting non-unidirectional differential item functioning (DIF), have been adapted with some success for computerized adaptive testing (CAT). This study adapts logistic regression (LR) and the item-response-theory-likelihood-ratio test (IRT-LRT), capable of detecting both unidirectional and non-unidirectional DIF, to the CAT environment in which pretest items are assumed to be seeded in CATs but not used for trait estimation. The proposed adaptation methods were evaluated with simulated data under different sample size ratios and impact conditions in terms of Type I error, power, and specificity in identifying the form of DIF. The adapted LR and IRT-LRT procedures are more powerful than the CAT version of SIBTEST for non-unidirectional DIF detection. The good Type I error control provided by IRT-LRT under extremely unequal sample sizes and large impact is encouraging. Implications of these and other findings are discussed. all rights reserved) PB - Blackwell Publishing: United Kingdom VL - 43 SN - 0022-0655 (Print) ER - TY - JOUR T1 - Comparing Methods of Assessing Differential Item Functioning in a Computerized Adaptive Testing Environment JF - Journal of Educational Measurement Y1 - 2006 A1 - Lei, Pui-Wa A1 - Chen, Shu-Ying A1 - Yu, Lan AB -

Mantel-Haenszel and SIBTEST, which have known difficulty in detecting non-unidirectional differential item functioning (DIF), have been adapted with some success for computerized adaptive testing (CAT). This study adapts logistic regression (LR) and the item-response-theory-likelihood-ratio test (IRT-LRT), capable of detecting both unidirectional and non-unidirectional DIF, to the CAT environment in which pretest items are assumed to be seeded in CATs but not used for trait estimation. The proposed adaptation methods were evaluated with simulated data under different sample size ratios and impact conditions in terms of Type I error, power, and specificity in identifying the form of DIF. The adapted LR and IRT-LRT procedures are more powerful than the CAT version of SIBTEST for non-unidirectional DIF detection. The good Type I error control provided by IRT-LRT under extremely unequal sample sizes and large impact is encouraging. Implications of these and other findings are discussed.
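
As background for the logistic regression procedure adapted here, the sketch below runs the standard two-model likelihood-ratio DIF test on simulated data, using a theta estimate as the matching variable; the data generation and the built-in DIF effect are invented for illustration.

```python
# Sketch of the logistic-regression DIF test: compare a model predicting item
# correctness from the matching variable (theta) against one that adds group
# and group-by-theta terms; a likelihood-ratio chi-square on 2 df flags
# uniform or non-uniform DIF. Data are simulated purely for illustration.
import numpy as np
import statsmodels.api as sm
from scipy.stats import chi2

rng = np.random.default_rng(0)
n = 500
theta = rng.normal(size=n)
group = rng.integers(0, 2, size=n)                  # 0 = reference, 1 = focal
p = 1 / (1 + np.exp(-(theta - 0.5 * group)))        # built-in uniform DIF
y = rng.binomial(1, p)

X0 = sm.add_constant(np.column_stack([theta]))
X1 = sm.add_constant(np.column_stack([theta, group, theta * group]))
ll0 = sm.Logit(y, X0).fit(disp=0).llf
ll1 = sm.Logit(y, X1).fit(disp=0).llf
g2 = 2 * (ll1 - ll0)
print(f"LR chi-square = {g2:.2f}, p = {chi2.sf(g2, df=2):.4f}")
```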

VL - 43 UR - http://dx.doi.org/10.1111/j.1745-3984.2006.00015.x ER - TY - JOUR T1 - The comparison among item selection strategies of CAT with multiple-choice items JF - Acta Psychologica Sinica Y1 - 2006 A1 - Hai-qi, D. A1 - De-zhi, C. A1 - Shuliang, D. A1 - Taiping, D. KW - CAT KW - computerized adaptive testing KW - graded response model KW - item selection strategies KW - multiple choice items AB - The initial purpose of comparing item selection strategies for CAT was to increase the efficiency of tests. As studies continued, however, it was found that increasing the efficiency of item bank using was also an important goal of comparing item selection strategies. These two goals often conflicted. The key solution was to find a strategy with which both goals could be accomplished. The item selection strategies for graded response model in this study included: the average of the difficulty orders matching with the ability; the medium of the difficulty orders matching with the ability; maximum information; A stratified (average); and A stratified (medium). The evaluation indexes used for comparison included: the bias of ability estimates for the true; the standard error of ability estimates; the average items which the examinees have administered; the standard deviation of the frequency of items selected; and sum of the indices weighted. Using the Monte Carlo simulation method, we obtained some data and computer iterated the data 20 times each under the conditions that the item difficulty parameters followed the normal distribution and even distribution. The results were as follows; The results indicated that no matter difficulty parameters followed the normal distribution or even distribution. Every type of item selection strategies designed in this research had its strong and weak points. In general evaluation, under the condition that items were stratified appropriately, A stratified (medium) (ASM) had the best effect. (PsycINFO Database Record (c) 2007 APA, all rights reserved) PB - Science Press: China VL - 38 SN - 0439-755X (Print) ER - TY - CONF T1 - A comparison of online calibration methods for a CAT T2 - Presented at the National Council on Measurement on Education Y1 - 2006 A1 - Morgan, D. L. A1 - Way, W. D. A1 - Augemberg, K.E. JF - Presented at the National Council on Measurement on Education CY - San Francisco, CA ER - TY - JOUR T1 - Comparison of the Psychometric Properties of Several Computer-Based Test Designs for Credentialing Exams With Multiple Purposes JF - Applied Measurement in Education Y1 - 2006 A1 - Jodoin, Michael G. A1 - Zenisky, April A1 - Hambleton, Ronald K. VL - 19 UR - http://www.tandfonline.com/doi/abs/10.1207/s15324818ame1903_3 ER - TY - JOUR T1 - Computer adaptive testing improved accuracy and precision of scores over random item selection in a physical functioning item bank JF - Journal of Clinical Epidemiology Y1 - 2006 A1 - Haley, S. M. A1 - Ni, P. A1 - Hambleton, R. K. A1 - Slavin, M. D. A1 - Jette, A. M. 
KW - *Recovery of Function KW - Activities of Daily Living KW - Adolescent KW - Adult KW - Aged KW - Aged, 80 and over KW - Confidence Intervals KW - Factor Analysis, Statistical KW - Female KW - Humans KW - Male KW - Middle Aged KW - Outcome Assessment (Health Care)/*methods KW - Rehabilitation/*standards KW - Reproducibility of Results KW - Software AB - BACKGROUND AND OBJECTIVE: Measuring physical functioning (PF) within and across postacute settings is critical for monitoring outcomes of rehabilitation; however, most current instruments lack sufficient breadth and feasibility for widespread use. Computer adaptive testing (CAT), in which item selection is tailored to the individual patient, holds promise for reducing response burden, yet maintaining measurement precision. We calibrated a PF item bank via item response theory (IRT), administered items with a post hoc CAT design, and determined whether CAT would improve accuracy and precision of score estimates over random item selection. METHODS: 1,041 adults were interviewed during postacute care rehabilitation episodes in either hospital or community settings. Responses for 124 PF items were calibrated using IRT methods to create a PF item bank. We examined the accuracy and precision of CAT-based scores compared to a random selection of items. RESULTS: CAT-based scores had higher correlations with the IRT-criterion scores, especially with short tests, and resulted in narrower confidence intervals than scores based on a random selection of items; gains, as expected, were especially large for low and high performing adults. CONCLUSION: The CAT design may have important precision and efficiency advantages for point-of-care functional assessment in rehabilitation practice settings. VL - 59 SN - 0895-4356 (Print) N1 - Haley, Stephen MNi, PengshengHambleton, Ronald KSlavin, Mary DJette, Alan MK02 hd45354-01/hd/nichdR01 hd043568/hd/nichdComparative StudyResearch Support, N.I.H., ExtramuralResearch Support, U.S. Gov't, Non-P.H.S.EnglandJournal of clinical epidemiologyJ Clin Epidemiol. 2006 Nov;59(11):1174-82. Epub 2006 Jul 11. ER - TY - JOUR T1 - Computer adaptive testing improved accuracy and precision of scores over random item selection in a physical functioning item bank JF - Journal of Clinical Epidemiology Y1 - 2006 A1 - Haley, S. A1 - Ni, P. A1 - Hambleton, R. K. A1 - Slavin, M. A1 - Jette, A. VL - 59 SN - 08954356 ER - TY - CHAP T1 - Computer-based testing T2 - Handbook of multimethod measurement in psychology Y1 - 2006 A1 - F Drasgow A1 - Chuah, S. C. KW - Adaptive Testing computerized adaptive testing KW - Computer Assisted Testing KW - Experimentation KW - Psychometrics KW - Theories AB - (From the chapter) There has been a proliferation of research designed to explore and exploit opportunities provided by computer-based assessment. This chapter provides an overview of the diverse efforts by researchers in this area. It begins by describing how paper-and-pencil tests can be adapted for administration by computers. Computerization provides the important advantage that items can be selected so they are of appropriate difficulty for each examinee. Some of the psychometric theory needed for computerized adaptive testing is reviewed. Then research on innovative computerized assessments is summarized. These assessments go beyond multiple-choice items by using formats made possible by computerization. Then some hardware and software issues are described, and finally, directions for future work are outlined. 
(PsycINFO Database Record (c) 2006 APA ) JF - Handbook of multimethod measurement in psychology PB - American Psychological Association CY - Washington D.C. USA VL - xiv N1 - Using Smart Source ParsingHandbook of multimethod measurement in psychology. (pp. 87-100). Washington, DC : American Psychological Association, [URL:http://www.apa.org/books]. xiv, 553 pp ER - TY - JOUR T1 - Computerized adaptive testing for follow-up after discharge from inpatient rehabilitation: I. Activity outcomes JF - Archives of Physical Medicine and Rehabilitation Y1 - 2006 A1 - Haley, S. M. A1 - Siebens, H. A1 - Coster, W. J. A1 - Tao, W. A1 - Black-Schaffer, R. M. A1 - Gandek, B. A1 - Sinclair, S. J. A1 - Ni, P. KW - *Activities of Daily Living KW - *Adaptation, Physiological KW - *Computer Systems KW - *Questionnaires KW - Adult KW - Aged KW - Aged, 80 and over KW - Chi-Square Distribution KW - Factor Analysis, Statistical KW - Female KW - Humans KW - Longitudinal Studies KW - Male KW - Middle Aged KW - Outcome Assessment (Health Care)/*methods KW - Patient Discharge KW - Prospective Studies KW - Rehabilitation/*standards KW - Subacute Care/*standards AB - OBJECTIVE: To examine score agreement, precision, validity, efficiency, and responsiveness of a computerized adaptive testing (CAT) version of the Activity Measure for Post-Acute Care (AM-PAC-CAT) in a prospective, 3-month follow-up sample of inpatient rehabilitation patients recently discharged home. DESIGN: Longitudinal, prospective 1-group cohort study of patients followed approximately 2 weeks after hospital discharge and then 3 months after the initial home visit. SETTING: Follow-up visits conducted in patients' home setting. PARTICIPANTS: Ninety-four adults who were recently discharged from inpatient rehabilitation, with diagnoses of neurologic, orthopedic, and medically complex conditions. INTERVENTIONS: Not applicable. MAIN OUTCOME MEASURES: Summary scores from AM-PAC-CAT, including 3 activity domains of movement and physical, personal care and instrumental, and applied cognition were compared with scores from a traditional fixed-length version of the AM-PAC with 66 items (AM-PAC-66). RESULTS: AM-PAC-CAT scores were in good agreement (intraclass correlation coefficient model 3,1 range, .77-.86) with scores from the AM-PAC-66. On average, the CAT programs required 43% of the time and 33% of the items compared with the AM-PAC-66. Both formats discriminated across functional severity groups. The standardized response mean (SRM) was greater for the movement and physical fixed form than the CAT; the effect size and SRM of the 2 other AM-PAC domains showed similar sensitivity between CAT and fixed formats. Using patients' own report as an anchor-based measure of change, the CAT and fixed length formats were comparable in responsiveness to patient-reported change over a 3-month interval. CONCLUSIONS: Accurate estimates for functional activity group-level changes can be obtained from CAT administrations, with a considerable reduction in administration time. VL - 87 SN - 0003-9993 (Print) N1 - Haley, Stephen MSiebens, HilaryCoster, Wendy JTao, WeiBlack-Schaffer, Randie MGandek, BarbaraSinclair, Samuel JNi, PengshengK0245354-01/phsR01 hd043568/hd/nichdResearch Support, N.I.H., ExtramuralUnited StatesArchives of physical medicine and rehabilitationArch Phys Med Rehabil. 2006 Aug;87(8):1033-42. 
ER - TY - JOUR T1 - Computerized adaptive testing of diabetes impact: a feasibility study of Hispanics and non-Hispanics in an active clinic population JF - Quality of Life Research Y1 - 2006 A1 - Schwartz, C. A1 - Welch, G. A1 - Santiago-Kelley, P. A1 - Bode, R. A1 - Sun, X. KW - *Computers KW - *Hispanic Americans KW - *Quality of Life KW - Adult KW - Aged KW - Data Collection/*methods KW - Diabetes Mellitus/*psychology KW - Feasibility Studies KW - Female KW - Humans KW - Language KW - Male KW - Middle Aged AB - BACKGROUND: Diabetes is a leading cause of death and disability in the US and is twice as common among Hispanic Americans as non-Hispanics. The societal costs of diabetes provide an impetus for developing tools that can improve patient care and delay or prevent diabetes complications. METHODS: We implemented a feasibility study of a Computerized Adaptive Test (CAT) to measure diabetes impact using a sample of 103 English- and 97 Spanish-speaking patients (mean age = 56.5, 66.5% female) in a community medical center with a high proportion of minority patients (28% African-American). The 37 items of the Diabetes Impact Survey were translated using forward-backward translation and cognitive debriefing. Participants were randomized to receive either the full-length tool or the Diabetes-CAT first, in the patient's native language. RESULTS: The number of items and the amount of time needed to complete the survey with the CAT were reduced to one-sixth of those for the full-length tool in both languages, across disease severity. Confirmatory Factor Analysis confirmed that the Diabetes Impact Survey is unidimensional. The Diabetes-CAT demonstrated acceptable internal consistency reliability, construct validity, and discriminant validity in the overall sample, although subgroup analyses suggested that the English sample evidenced higher levels of reliability and validity than the Spanish sample, which also showed issues with discriminant validity. Differential item functioning analysis revealed differences in response tendencies by language group in 3 of the 37 items. Participant interviews suggested that the Spanish-speaking patients generally preferred the paper survey to the computer-assisted tool, and were twice as likely to experience difficulties understanding the items. CONCLUSIONS: While the Diabetes-CAT demonstrated clear advantages in reducing respondent burden as compared to the full-length tool, simplifying the item bank will be necessary for enhancing the feasibility of the Diabetes-CAT for use with low-literacy patients. VL - 15 SN - 0962-9343 (Print) N1 - Schwartz, CarolynWelch, GarrySantiago-Kelley, PaulaBode, RitaSun, Xiaowu1 r43 dk066874-01/dk/niddkResearch Support, N.I.H., ExtramuralNetherlandsQuality of life research : an international journal of quality of life aspects of treatment, care and rehabilitationQual Life Res. 2006 Nov;15(9):1503-18. Epub 2006 Sep 26. ER - TY - JOUR T1 - Computerized adaptive testing under nonparametric IRT models JF - Psychometrika Y1 - 2006 A1 - Xu, X. A1 - Douglas, J. VL - 71 ER - TY - CONF T1 - Constraints-weighted information method for item selection of severely constrained computerized adaptive testing T2 - Paper presented at the annual meeting of the National Council on Measurement in Education Y1 - 2006 A1 - Cheng, Y A1 - Chang, Hua-Hua A1 - Wang, X. B.
JF - Paper presented at the annual meeting of the National Council on Measurement in Education CY - San Francisco ER - TY - JOUR T1 - A closer look at using judgments of item difficulty to change answers on computerized adaptive tests JF - Journal of Educational Measurement Y1 - 2005 A1 - Vispoel, W. P. A1 - Clough, S. J. A1 - Bleiler, T. VL - 42 ER - TY - BOOK T1 - A comparison of adaptive mastery testing using testlets with the 3-parameter logistic model Y1 - 2005 A1 - Jacobs-Cassuto, M.S. CY - Unpublished doctoral dissertation, University of Minnesota, Minneapolis, MN ER - TY - JOUR T1 - A comparison of item-selection methods for adaptive tests with content constraints JF - Journal of Educational Measurement Y1 - 2005 A1 - van der Linden, W. J. KW - Adaptive Testing KW - Algorithms KW - content constraints KW - item selection method KW - shadow test approach KW - spiraling method KW - weighted deviations method AB - In test assembly, a fundamental difference exists between algorithms that select a test sequentially or simultaneously. Sequential assembly allows us to optimize an objective function at the examinee's ability estimate, such as the test information function in computerized adaptive testing. But it leads to the non-trivial problem of how to realize a set of content constraints on the test—a problem more naturally solved by a simultaneous item-selection method. Three main item-selection methods in adaptive testing offer solutions to this dilemma. The spiraling method moves item selection across categories of items in the pool proportionally to the numbers needed from them. Item selection by the weighted-deviations method (WDM) and the shadow test approach (STA) is based on projections of the future consequences of selecting an item. These two methods differ in that the former calculates a projection of a weighted sum of the attributes of the eventual test and the latter a projection of the test itself. The pros and cons of these methods are analyzed. An empirical comparison between the WDM and STA was conducted for an adaptive version of the Law School Admission Test (LSAT), which showed equally good item-exposure rates but violations of some of the constraints and larger bias and inaccuracy of the ability estimator for the WDM. PB - Blackwell Publishing: United Kingdom VL - 42 SN - 0022-0655 (Print) ER - TY - JOUR T1 - A comparison of item-selection methods for adaptive tests with content constraints JF - Journal of Educational Measurement Y1 - 2005 A1 - van der Linden, W. J. VL - 42 ER - TY - JOUR T1 - Computer adaptive testing JF - Journal of Applied Measurement Y1 - 2005 A1 - Gershon, R. C. VL - 6 ER - TY - JOUR T1 - Computer adaptive testing JF - Journal of Applied Measurement Y1 - 2005 A1 - Gershon, R. C. KW - *Internet KW - *Models, Statistical KW - *User-Computer Interface KW - Certification KW - Health Surveys KW - Humans KW - Licensure KW - Microcomputers KW - Quality of Life AB - The creation of item response theory (IRT) and Rasch models, inexpensive accessibility to high speed desktop computers, and the growth of the Internet, has led to the creation and growth of computerized adaptive testing or CAT. This form of assessment is applicable for both high stakes tests such as certification or licensure exams, as well as health related quality of life surveys. This article discusses the historical background of CAT including its many advantages over conventional (typically paper and pencil) alternatives. 
The process of CAT is then described, including the specific differences among CATs based on 1-, 2-, and 3-parameter IRT models and various Rasch models. Numerous topics relevant to CAT in practice are described, including initial item selection, content balancing, test difficulty, test length, and stopping rules. The article concludes with the author's reflections regarding the future of CAT. VL - 6 SN - 1529-7713 (Print) N1 - Gershon, Richard CReviewUnited StatesJournal of applied measurementJ Appl Meas. 2005;6(1):109-27. ER - TY - JOUR T1 - A computer adaptive testing approach for assessing physical functioning in children and adolescents JF - Developmental Medicine and Child Neurology Y1 - 2005 A1 - Haley, S. M. A1 - Ni, P. A1 - Fragala-Pinkham, M. A. A1 - Skrinar, A. M. A1 - Corzo, D. KW - *Computer Systems KW - Activities of Daily Living KW - Adolescent KW - Age Factors KW - Child KW - Child Development/*physiology KW - Child, Preschool KW - Computer Simulation KW - Confidence Intervals KW - Demography KW - Female KW - Glycogen Storage Disease Type II/physiopathology KW - Health Status Indicators KW - Humans KW - Infant KW - Infant, Newborn KW - Male KW - Motor Activity/*physiology KW - Outcome Assessment (Health Care)/*methods KW - Reproducibility of Results KW - Self Care KW - Sensitivity and Specificity AB - The purpose of this article is to demonstrate: (1) the accuracy and (2) the reduction in amount of time and effort in assessing physical functioning (self-care and mobility domains) of children and adolescents using computer-adaptive testing (CAT). A CAT algorithm selects questions directly tailored to the child's ability level, based on previous responses. Using a CAT algorithm, a simulation study was used to determine the number of items necessary to approximate the score of a full-length assessment. We built simulated CATs (5-, 10-, 15-, and 20-item versions) for self-care and mobility domains and tested their accuracy in a normative sample (n=373; 190 males, 183 females; mean age 6y 11mo [SD 4y 2m], range 4mo to 14y 11mo) and a sample of children and adolescents with Pompe disease (n=26; 21 males, 5 females; mean age 6y 1mo [SD 3y 10mo], range 5mo to 14y 10mo). Results indicated that score estimates comparable to the full-length tests (based on computer simulations) can be achieved in a 20-item CAT version for all age ranges and for normative and clinical samples. No more than 13 to 16% of the items in the full-length tests were needed for any one administration. These results support further consideration of using CAT programs for accurate and efficient clinical assessments of physical functioning. VL - 47 SN - 0012-1622 (Print) N1 - Haley, Stephen MNi, PengshengFragala-Pinkham, Maria ASkrinar, Alison MCorzo, DeyaniraComparative StudyResearch Support, Non-U.S. Gov'tEnglandDevelopmental medicine and child neurologyDev Med Child Neurol. 2005 Feb;47(2):113-20. ER - TY - CHAP T1 - Computer adaptive testing quality requirements Y1 - 2005 A1 - Economides, A. A. CY - Proceedings E-Learn 2005 World Conference on E-Learning in Corporate, Government, Healthcare, and Higher Education, pp. 288-295, Vancouver, Canada, AACE, October 2005. ER - TY - JOUR T1 - A computer-assisted test design and diagnosis system for use by classroom teachers JF - Journal of Computer Assisted Learning Y1 - 2005 A1 - He, Q. A1 - Tymms, P.
KW - Computer Assisted Testing KW - Computer Software KW - Diagnosis KW - Educational Measurement KW - Teachers AB - Computer-assisted assessment (CAA) has become increasingly important in education in recent years. A variety of computer software systems have been developed to help assess the performance of students at various levels. However, such systems are primarily designed to provide objective assessment of students and analysis of test items, and focus has been mainly placed on higher and further education. Although there are commercial professional systems available for use by primary and secondary educational institutions, such systems are generally expensive and require skilled expertise to operate. In view of the rapid progress made in the use of computer-based assessment for primary and secondary students by education authorities here in the UK and elsewhere, there is a need to develop systems which are economic and easy to use and can provide the necessary information that can help teachers improve students' performance. This paper presents the development of a software system that provides a range of functions including generating items and building item banks, designing tests, conducting tests on computers and analysing test results. Specifically, the system can generate information on the performance of students and test items that can be easily used to identify curriculum areas where students are under performing. A case study based on data collected from five secondary schools in Hong Kong involved in the Curriculum, Evaluation and Management Centre's Middle Years Information System Project, Durham University, UK, has been undertaken to demonstrate the use of the system for diagnostic and performance analysis. (PsycINFO Database Record (c) 2006 APA ) (journal abstract) VL - 21 ER - TY - JOUR T1 - Computerized adaptive testing: a mixture item selection approach for constrained situations JF - British Journal of Mathematical and Statistical Psychology Y1 - 2005 A1 - Leung, C. K. A1 - Chang, Hua-Hua A1 - Hau, K. T. KW - *Computer-Aided Design KW - *Educational Measurement/methods KW - *Models, Psychological KW - Humans KW - Psychometrics/methods AB - In computerized adaptive testing (CAT), traditionally the most discriminating items are selected to provide the maximum information so as to attain the highest efficiency in trait (theta) estimation. The maximum information (MI) approach typically results in unbalanced item exposure and hence high item-overlap rates across examinees. Recently, Yi and Chang (2003) proposed the multiple stratification (MS) method to remedy the shortcomings of MI. In MS, items are first sorted according to content, then difficulty and finally discrimination parameters. As discriminating items are used strategically, MS offers a better utilization of the entire item pool. However, for testing with imposed non-statistical constraints, this new stratification approach may not maintain its high efficiency. Through a series of simulation studies, this research explored the possible benefits of a mixture item selection approach (MS-MI), integrating the MS and MI approaches, in testing with non-statistical constraints. In all simulation conditions, MS consistently outperformed the other two competing approaches in item pool utilization, while the MS-MI and the MI approaches yielded higher measurement efficiency and offered better conformity to the constraints. 
Furthermore, the MS-MI approach was shown to perform better than MI on all evaluation criteria when control of item exposure was imposed. VL - 58 SN - 0007-1102 (Print)0007-1102 (Linking) N1 - Leung, Chi-KeungChang, Hua-HuaHau, Kit-TaiEnglandBr J Math Stat Psychol. 2005 Nov;58(Pt 2):239-57. ER - TY - JOUR T1 - Computerized adaptive testing with the partial credit model: Estimation procedures, population distributions, and item pool characteristics JF - Applied Psychological Measurement Y1 - 2005 A1 - Gorin, J. A1 - Dodd, B. G. A1 - Fitzpatrick, S. J. A1 - Shieh, Y. Y. VL - 29 ER - TY - JOUR T1 - Computerized Adaptive Testing With the Partial Credit Model: Estimation Procedures, Population Distributions, and Item Pool Characteristics JF - Applied Psychological Measurement Y1 - 2005 A1 - Gorin, Joanna S. A1 - Dodd, Barbara G. A1 - Fitzpatrick, Steven J. A1 - Shieh, Yann Yann AB -

The primary purpose of this research is to examine the impact of estimation methods, actual latent trait distributions, and item pool characteristics on the performance of a simulated computerized adaptive testing (CAT) system. In this study, three estimation procedures are compared for accuracy of estimation: maximum likelihood estimation (MLE), expected a priori (EAP), and Warm's weighted likelihood estimation (WLE). Some research has shown that MLE and EAP perform equally well in polytomous CAT systems under certain conditions, such as when the prior matches the actual latent trait distribution. However, little research has compared these methods when prior estimates of the latent trait distribution are extremely poor. In general, it appears that MLE, EAP, and WLE procedures perform equally well when using an optimal item pool. However, the use of EAP procedures may be advantageous under nonoptimal testing conditions when the item pool is not appropriately matched to the examinees.

VL - 29 UR - http://apm.sagepub.com/content/29/6/433.abstract ER - TY - JOUR T1 - Computerized adaptive testing with the partial credit model: Estimation procedures, population distributions, and item pool characteristics JF - Applied Psychological Measurement Y1 - 2005 A1 - Gorin, J. S. VL - 29 SN - 0146-6216 ER - TY - Generic T1 - Computerizing statewide assessments in Minnesota: A report on the feasibility of converting the Minnesota Comprehensive Assessments to a computerized adaptive format Y1 - 2005 A1 - Peterson, K.A. A1 - Davison. M. L. A1 - Hjelseth, L. PB - Office of Educational Accountability, College of Education and Human Development, University of Minnesota ER - TY - ABST T1 - Constraining item exposure in computerized adaptive testing with shadow tests Y1 - 2005 A1 - van der Linden, W. J. A1 - Veldkamp, B. P. CY - Law School Admission Council Computerized Testing Report 02-03 ER - TY - JOUR T1 - Constructing a Computerized Adaptive Test for University Applicants With Disabilities JF - Applied Measurement in Education Y1 - 2005 A1 - Moshinsky, Avital A1 - Kazin, Cathrael VL - 18 UR - http://www.tandfonline.com/doi/abs/10.1207/s15324818ame1804_3 ER - TY - JOUR T1 - Contemporary measurement techniques for rehabilitation outcomes assessment JF - Journal of Rehabilitation Medicine Y1 - 2005 A1 - Jette, A. M. A1 - Haley, S. M. KW - *Disability Evaluation KW - Activities of Daily Living/classification KW - Disabled Persons/classification/*rehabilitation KW - Health Status Indicators KW - Humans KW - Outcome Assessment (Health Care)/*methods/standards KW - Recovery of Function KW - Research Support, N.I.H., Extramural KW - Research Support, U.S. Gov't, Non-P.H.S. KW - Sensitivity and Specificity computerized adaptive testing AB - In this article, we review the limitations of traditional rehabilitation functional outcome instruments currently in use within the rehabilitation field to assess Activity and Participation domains as defined by the International Classification of Function, Disability, and Health. These include a narrow scope of functional outcomes, data incompatibility across instruments, and the precision vs feasibility dilemma. Following this, we illustrate how contemporary measurement techniques, such as item response theory methods combined with computer adaptive testing methodology, can be applied in rehabilitation to design functional outcome instruments that are comprehensive in scope, accurate, allow for compatibility across instruments, and are sensitive to clinically important change without sacrificing their feasibility. Finally, we present some of the pressing challenges that need to be overcome to provide effective dissemination and training assistance to ensure that current and future generations of rehabilitation professionals are familiar with and skilled in the application of contemporary outcomes measurement. VL - 37 N1 - 1650-1977 (Print)Journal ArticleReview ER - TY - JOUR T1 - Controlling item exposure and test overlap in computerized adaptive testing JF - Applied Psychological Measurement Y1 - 2005 A1 - Chen, S-Y. A1 - Lei, P-W. KW - Adaptive Testing KW - Computer Assisted Testing KW - Item Content (Test) computerized adaptive testing AB - This article proposes an item exposure control method, which is the extension of the Sympson and Hetter procedure and can provide item exposure control at both the item and test levels. Item exposure rate and test overlap rate are two indices commonly used to track item exposure in computerized adaptive tests. 
By considering both indices, item exposure can be monitored at both the item and test levels. To control the item exposure rate and test overlap rate simultaneously, the modified procedure attempted to control not only the maximum value but also the variance of item exposure rates. Results indicated that the item exposure rate and test overlap rate could be controlled simultaneously by implementing the modified procedure. Item exposure control was improved and precision of trait estimation decreased when a prespecified maximum test overlap rate was stringent. (PsycINFO Database Record (c) 2005 APA ) (journal abstract) VL - 29 ER - TY - JOUR T1 - Controlling item exposure and test overlap in computerized adaptive testing JF - Applied Psychological Measurement Y1 - 2005 A1 - Chen, S.Y. A1 - Lei, P. W. VL - 29(2) ER - TY - JOUR T1 - Controlling Item Exposure and Test Overlap in Computerized Adaptive Testing JF - Applied Psychological Measurement Y1 - 2005 A1 - Chen, Shu-Ying A1 - Lei, Pui-Wa AB -

This article proposes an item exposure control method, which is an extension of the Sympson and Hetter procedure and can provide item exposure control at both the item and test levels. Item exposure rate and test overlap rate are two indices commonly used to track item exposure in computerized adaptive tests. By considering both indices, item exposure can be monitored at both the item and test levels. To control the item exposure rate and test overlap rate simultaneously, the modified procedure attempted to control not only the maximum value but also the variance of item exposure rates. Results indicated that the item exposure rate and test overlap rate could be controlled simultaneously by implementing the modified procedure. Item exposure control was improved and precision of trait estimation decreased when a prespecified maximum test overlap rate was stringent.

VL - 29 UR - http://apm.sagepub.com/content/29/3/204.abstract ER - TY - CONF T1 - Combining computer adaptive testing technology with cognitively diagnostic assessment T2 - Paper presented at the annual meeting of the National Council on Measurement in Education Y1 - 2004 A1 - McGlohen, MK A1 - Chang, Hua-Hua A1 - Wills, J. T. JF - Paper presented at the annual meeting of the National Council on Measurement in Education CY - San Diego CA N1 - {PDF file, 782 KB} ER - TY - JOUR T1 - Computer adaptive testing: a strategy for monitoring stroke rehabilitation across settings JF - Stroke Rehabilitation Y1 - 2004 A1 - Andres, P. L. A1 - Black-Schaffer, R. M. A1 - Ni, P. A1 - Haley, S. M. KW - *Computer Simulation KW - *User-Computer Interface KW - Adult KW - Aged KW - Aged, 80 and over KW - Cerebrovascular Accident/*rehabilitation KW - Disabled Persons/*classification KW - Female KW - Humans KW - Male KW - Middle Aged KW - Monitoring, Physiologic/methods KW - Severity of Illness Index KW - Task Performance and Analysis AB - Current functional assessment instruments in stroke rehabilitation are often setting-specific and lack precision, breadth, and/or feasibility. Computer adaptive testing (CAT) offers a promising potential solution by providing a quick, yet precise, measure of function that can be used across a broad range of patient abilities and in multiple settings. CAT technology yields a precise score by selecting very few relevant items from a large and diverse item pool based on each individual's responses. We demonstrate the potential usefulness of a CAT assessment model with a cross-sectional sample of persons with stroke from multiple rehabilitation settings. VL - 11 SN - 1074-9357 (Print) N1 - Andres, Patricia LBlack-Schaffer, Randie MNi, PengshengHaley, Stephen MR01 hd43568/hd/nichdEvaluation StudiesResearch Support, U.S. Gov't, Non-P.H.S.Research Support, U.S. Gov't, P.H.S.United StatesTopics in stroke rehabilitationTop Stroke Rehabil. 2004 Spring;11(2):33-9. ER - TY - CONF T1 - Computer adaptive testing and the No Child Left Behind Act T2 - Paper presented at the annual meeting of the American Educational Research Association Y1 - 2004 A1 - Kingsbury, G. G. A1 - Hauser, C. JF - Paper presented at the annual meeting of the American Educational Research Association CY - San Diego CA N1 - {PDF file, 117 KB} ER - TY - CHAP T1 - Computer-adaptive testing Y1 - 2004 A1 - Luecht, RM CY - B. Everett, and D. Howell (Eds.), Encyclopedia of statistics in behavioral science. New York: Wiley. ER - TY - ABST T1 - Computer-based test designs with optimal and non-optimal tests for making pass-fail decisions Y1 - 2004 A1 - Hambleton, R. K. A1 - Xing, D. CY - Research Report, University of Massachusetts, Amherst, MA ER - TY - JOUR T1 - A computerized adaptive knowledge test as an assessment tool in general practice: a pilot study JF - Medical Teacher Y1 - 2004 A1 - Roex, A. A1 - Degryse, J. KW - *Computer Systems KW - Algorithms KW - Educational Measurement/*methods KW - Family Practice/*education KW - Humans KW - Pilot Projects AB - Advantageous to assessment in many fields, CAT (computerized adaptive testing) use in general practice has been scarce. In adapting CAT to general practice, the basic assumptions of item response theory and the case specificity must be taken into account. In this context, this study first evaluated the feasibility of converting written extended matching tests into CAT. Second, it questioned the content validity of CAT. 
A stratified sample of students was invited to participate in the pilot study. The items used in this test, together with their parameters, originated from the written test. The detailed test paths of the students were retained and analysed thoroughly. Using the predefined pass-fail standard, one student failed the test. There was a positive correlation between the number of items and the candidate's ability level. The majority of students were presented with questions in seven of the 10 existing domains. Although it proved to be a feasible test format, CAT cannot substitute for the existing high-stakes large-scale written test. It may provide a reliable instrument for identifying candidates who are at risk of failing the written test. VL - 26 N1 - 0142-159xJournal Article ER - TY - JOUR T1 - Computerized adaptive measurement of depression: A simulation study JF - BMC Psychiatry Y1 - 2004 A1 - Gardner, W. A1 - Shear, K. A1 - Kelleher, K. J. A1 - Pajer, K. A. A1 - Mammen, O. A1 - Buysse, D. A1 - Frank, E. KW - *Computer Simulation KW - Adult KW - Algorithms KW - Area Under Curve KW - Comparative Study KW - Depressive Disorder/*diagnosis/epidemiology/psychology KW - Diagnosis, Computer-Assisted/*methods/statistics & numerical data KW - Factor Analysis, Statistical KW - Female KW - Humans KW - Internet KW - Male KW - Mass Screening/methods KW - Patient Selection KW - Personality Inventory/*statistics & numerical data KW - Pilot Projects KW - Prevalence KW - Psychiatric Status Rating Scales/*statistics & numerical data KW - Psychometrics KW - Research Support, Non-U.S. Gov't KW - Research Support, U.S. Gov't, P.H.S. KW - Severity of Illness Index KW - Software AB - Background: Efficient, accurate instruments for measuring depression are increasingly important in clinical practice. We developed a computerized adaptive version of the Beck Depression Inventory (BDI). We examined its efficiency and its usefulness in identifying Major Depressive Episodes (MDE) and in measuring depression severity. Methods: Subjects were 744 participants in research studies in which each subject completed both the BDI and the SCID. In addition, 285 patients completed the Hamilton Depression Rating Scale. Results: The adaptive BDI had an AUC as an indicator of a SCID diagnosis of MDE of 88%, equivalent to the full BDI. The adaptive BDI asked fewer questions than the full BDI (5.6 versus 21 items). The adaptive latent depression score correlated r = .92 with the BDI total score and the latent depression score correlated more highly with the Hamilton (r = .74) than the BDI total score did (r = .70). Conclusions: Adaptive testing for depression may provide greatly increased efficiency without loss of accuracy in identifying MDE or in measuring depression severity. VL - 4 ER - TY - CHAP T1 - Computerized adaptive testing Y1 - 2004 A1 - Segall, D. O. CY - Encyclopedia of social measurement. Academic Press. N1 - {PDF file, 180 KB} ER - TY - CHAP T1 - Computerized adaptive testing and item banking Y1 - 2004 A1 - Bjorner, J. B. A1 - Kosinski, M. A1 - Ware, J. E., Jr. CY - P. M. Fayers and R. D. Hays (Eds.) Assessing Quality of Life. Oxford: Oxford University Press. N1 - {PDF file 371 KB} ER - TY - JOUR T1 - Computerized Adaptive Testing for Effective and Efficient Measurement in Counseling and Education JF - Measurement and Evaluation in Counseling and Development Y1 - 2004 A1 - Weiss, D. J.
VL - 37 ER - TY - JOUR T1 - Computerized adaptive testing with multiple-form structures JF - Applied Psychological Measurement Y1 - 2004 A1 - Armstrong, R. D. A1 - Jones, D. H. A1 - Koppel, N. B. A1 - Pashley, P. J. KW - computerized adaptive testing KW - Law School Admission Test KW - multiple-form structure KW - testlets AB - A multiple-form structure (MFS) is an ordered collection or network of testlets (i.e., sets of items). An examinee's progression through the network of testlets is dictated by the correctness of an examinee's answers, thereby adapting the test to his or her trait level. The collection of paths through the network yields the set of all possible test forms, allowing test specialists the opportunity to review them before they are administered. Also, limiting the exposure of an individual MFS to a specific period of time can enhance test security. This article provides an overview of methods that have been developed to generate parallel MFSs. The approach is applied to the assembly of an experimental computerized Law School Admission Test (LSAT). (PsycINFO Database Record (c) 2007 APA, all rights reserved) PB - Sage Publications: US VL - 28 SN - 0146-6216 (Print) ER - TY - JOUR T1 - Computerized Adaptive Testing With Multiple-Form Structures JF - Applied Psychological Measurement Y1 - 2004 A1 - Armstrong, Ronald D. A1 - Jones, Douglas H. A1 - Koppel, Nicole B. A1 - Pashley, Peter J. AB -

A multiple-form structure (MFS) is an ordered collection or network of testlets (i.e., sets of items). An examinee’s progression through the network of testlets is dictated by the correctness of an examinee’s answers, thereby adapting the test to his or her trait level. The collection of paths through the network yields the set of all possible test forms, allowing test specialists the opportunity to review them before they are administered. Also, limiting the exposure of an individual MFS to a specific period of time can enhance test security. This article provides an overview of methods that have been developed to generate parallel MFSs. The approach is applied to the assembly of an experimental computerized Law School Admission Test (LSAT).

VL - 28 UR - http://apm.sagepub.com/content/28/3/147.abstract ER - TY - JOUR T1 - Computers in clinical assessment: Historical developments, present status, and future challenges JF - Journal of Clinical Psychology Y1 - 2004 A1 - Butcher, J. N. A1 - Perry, J. L. A1 - Hahn, J. A. KW - clinical assessment KW - computerized testing method KW - Internet KW - psychological assessment services AB - Computerized testing methods have long been regarded as a potentially powerful asset for providing psychological assessment services. Ever since computers were first introduced and adapted to the field of assessment psychology in the 1950s, they have been a valuable aid for scoring, data processing, and even interpretation of test results. The history and status of computer-based personality and neuropsychological tests are discussed in this article. Several pertinent issues involved in providing test interpretation by computer are highlighted. Advances in computer-based test use, such as computerized adaptive testing, are described and problems noted. Today, there is great interest in expanding the availability of psychological assessment applications on the Internet. Although these applications show great promise, there are a number of problems associated with providing psychological tests on the Internet that need to be addressed by psychologists before the Internet can become a major medium for psychological service delivery. (PsycINFO Database Record (c) 2007 APA, all rights reserved) PB - John Wiley & Sons: US VL - 60 SN - 0021-9762 (Print); 1097-4679 (Electronic) ER - TY - JOUR T1 - Constraining Item Exposure in Computerized Adaptive Testing With Shadow Tests JF - Journal of Educational and Behavioral Statistics Y1 - 2004 A1 - van der Linden, Wim J. A1 - Veldkamp, Bernard P. AB -

Item-exposure control in computerized adaptive testing is implemented by imposing item-ineligibility constraints on the assembly process of the shadow tests. The method resembles Sympson and Hetter’s (1985) method of item-exposure control in that the decisions to impose the constraints are probabilistic. The method does not, however, require time-consuming simulation studies to set values for control parameters before the operational use of the test. Instead, it can set the probabilities of item ineligibility adaptively during the test using the actual item-exposure rates. An empirical study using an item pool from the Law School Admission Test showed that application of the method yielded perfect control of the item-exposure rates and had negligible impact on the bias and mean-squared error functions of the ability estimator.

VL - 29 UR - http://jeb.sagepub.com/cgi/content/abstract/29/3/273 ER - TY - JOUR T1 - Constraining item exposure in computerized adaptive testing with shadow tests JF - Journal of Educational and Behavioral Statistics Y1 - 2004 A1 - van der Linden, W. J. A1 - Veldkamp, B. P. KW - computerized adaptive testing KW - item exposure control KW - item ineligibility constraints KW - Probability KW - shadow tests AB - Item-exposure control in computerized adaptive testing is implemented by imposing item-ineligibility constraints on the assembly process of the shadow tests. The method resembles Sympson and Hetter’s (1985) method of item-exposure control in that the decisions to impose the constraints are probabilistic. The method does not, however, require time-consuming simulation studies to set values for control parameters before the operational use of the test. Instead, it can set the probabilities of item ineligibility adaptively during the test using the actual item-exposure rates. An empirical study using an item pool from the Law School Admission Test showed that application of the method yielded perfect control of the item-exposure rates and had negligible impact on the bias and mean-squared error functions of the ability estimator. PB - American Educational Research Assn: US VL - 29 SN - 1076-9986 (Print) ER - TY - JOUR T1 - Constructing rotating item pools for constrained adaptive testing JF - Journal of Educational Measurement Y1 - 2004 A1 - Ariel, A. A1 - Veldkamp, B. P. A1 - van der Linden, W. J. KW - computerized adaptive tests KW - constrained adaptive testing KW - item exposure KW - rotating item pools AB - Preventing items in adaptive testing from being over- or underexposed is one of the main problems in computerized adaptive testing. Though the problem of overexposed items can be solved using a probabilistic item-exposure control method, such methods are unable to deal with the problem of underexposed items. Using a system of rotating item pools, on the other hand, is a method that potentially solves both problems. In this method, a master pool is divided into (possibly overlapping) smaller item pools, which are required to have similar distributions of content and statistical attributes. These pools are rotated among the testing sites to realize desirable exposure rates for the items. A test assembly model, motivated by Gulliksen's matched random subtests method, was explored to help solve the problem of dividing a master pool into a set of smaller pools. Different methods to solve the model are proposed. An item pool from the Law School Admission Test was used to evaluate the performances of computerized adaptive tests from systems of rotating item pools constructed using these methods. (PsycINFO Database Record (c) 2007 APA, all rights reserved) PB - Blackwell Publishing: United Kingdom VL - 41 SN - 0022-0655 (Print) ER - TY - CONF T1 - The context effects of multidimensional CAT on the accuracy of multidimensional abilities and the item exposure rates T2 - Paper presented at the annual meeting of the American Educational Research Association Y1 - 2004 A1 - Li, Y. H. A1 - Schafer, W. D. 
JF - Paper presented at the annual meeting of the American Educational Research Association CY - San Diego CA N1 - {Incomplete PDF file, 202 KB} ER - TY - BOOK T1 - Contributions to the theory and practice of computerized adaptive testing Y1 - 2004 A1 - Theo Eggen CY - Arnhem, The Netherlands: Citogroep ER - TY - CONF T1 - Calibrating CAT item pools and online pretest items using MCMC methods T2 - Paper presented at the Annual meeting of the National Council on Measurement in Education Y1 - 2003 A1 - Segall, D. O. JF - Paper presented at the Annual meeting of the National Council on Measurement in Education CY - Chicago IL N1 - {PDF file, 155 KB} ER - TY - CONF T1 - Calibrating CAT pools and online pretest items using marginal maximum likelihood methods T2 - Paper presented at the annual meeting of the National Council on Measurement in Education Y1 - 2003 A1 - Pommerich, M A1 - Segall, D. O. JF - Paper presented at the annual meeting of the National Council on Measurement in Education CY - Chicago IL N1 - {PDF file, 284 KB} ER - TY - CONF T1 - Calibrating CAT pools and online pretest items using nonparametric and adjusted marginal maximum likelihood methods T2 - Paper presented at the annual meeting of the National Council on Measurement in Education Y1 - 2003 A1 - Krass, I. A. A1 - Williams, B. JF - Paper presented at the annual meeting of the National Council on Measurement in Education CY - Chicago IL N1 - PDF file, 128 K ER - TY - JOUR T1 - Calibration of an item pool for assessing the burden of headaches: an application of item response theory to the Headache Impact Test (HIT) JF - Quality of Life Research Y1 - 2003 A1 - Bjorner, J. B. A1 - Kosinski, M. A1 - Ware, J. E., Jr. KW - *Cost of Illness KW - *Decision Support Techniques KW - *Sickness Impact Profile KW - Adolescent KW - Adult KW - Aged KW - Comparative Study KW - Disability Evaluation KW - Factor Analysis, Statistical KW - Headache/*psychology KW - Health Surveys KW - Human KW - Longitudinal Studies KW - Middle Aged KW - Migraine/psychology KW - Models, Psychological KW - Psychometrics/*methods KW - Quality of Life/*psychology KW - Software KW - Support, Non-U.S. Gov't AB - BACKGROUND: Measurement of headache impact is important in clinical trials, case detection, and the clinical monitoring of patients. Computerized adaptive testing (CAT) of headache impact has potential advantages over traditional fixed-length tests in terms of precision, relevance, real-time quality control and flexibility. OBJECTIVE: To develop an item pool that can be used for a computerized adaptive test of headache impact. METHODS: We analyzed responses to four well-known tests of headache impact from a population-based sample of recent headache sufferers (n = 1016). We used confirmatory factor analysis for categorical data and analyses based on item response theory (IRT). RESULTS: In factor analyses, we found very high correlations between the factors hypothesized by the original test constructers, both within and between the original questionnaires. These results suggest that a single score of headache impact is sufficient. We established a pool of 47 items which fitted the generalized partial credit IRT model. By simulating a computerized adaptive health test we showed that an adaptive test of only five items had a very high concordance with the score based on all items and that different worst-case item selection scenarios did not lead to bias. CONCLUSION: We have established a headache impact item pool that can be used in CAT of headache impact. 
VL - 12 N1 - 0962-9343Journal Article ER - TY - JOUR T1 - Can an item response theory-based pain item bank enhance measurement precision? JF - Clinical Therapeutics Y1 - 2003 A1 - Lai, J-S. A1 - Dineen, K. A1 - Cella, D. A1 - Von Roenn, J. VL - 25 JO - Clin Ther ER - TY - CONF T1 - Can We Assess Pre-K Kids With Computer-Based Tests: STAR Early Literacy Data T2 - Presentation to the 33rd Annual National Conference on Large-Scale Assessment. Y1 - 2003 A1 - J. R. McBride JF - Presentation to the 33rd Annual National Conference on Large-Scale Assessment. CY - San Antonio TX ER - TY - ABST T1 - CAT-ASVAB prototype Internet delivery system: Final report (FR-03-06) Y1 - 2003 A1 - Sticha, P. J. A1 - Barber, G. CY - Arlington VA: Human Resources Rsearch Organization N1 - {PDF file, 393 KB} ER - TY - CONF T1 - Cognitive CAT in foreign language assessment T2 - Proceedings 11th International PEG Conference Y1 - 2003 A1 - Giouroglou, H. A1 - Economides, A. A. JF - Proceedings 11th International PEG Conference CY - Powerful ICT Tools for Learning and Teaching, PEG '03, CD-ROM, 2003 ER - TY - JOUR T1 - A comparative study of item exposure control methods in computerized adaptive testing JF - Journal of Educational Measurement Y1 - 2003 A1 - Chang, S-W. A1 - Ansley, T. N. KW - Adaptive Testing KW - Computer Assisted Testing KW - Educational KW - Item Analysis (Statistical) KW - Measurement KW - Strategies computerized adaptive testing AB - This study compared the properties of five methods of item exposure control within the purview of estimating examinees' abilities in a computerized adaptive testing (CAT) context. Each exposure control algorithm was incorporated into the item selection procedure and the adaptive testing progressed based on the CAT design established for this study. The merits and shortcomings of these strategies were considered under different item pool sizes and different desired maximum exposure rates and were evaluated in light of the observed maximum exposure rates, the test overlap rates, and the conditional standard errors of measurement. Each method had its advantages and disadvantages, but no one possessed all of the desired characteristics. There was a clear and logical trade-off between item exposure control and measurement precision. The M. L. Stocking and C. Lewis conditional multinomial procedure and, to a slightly lesser extent, the T. Davey and C. G. Parshall method seemed to be the most promising considering all of the factors that this study addressed. (PsycINFO Database Record (c) 2005 APA ) VL - 40 ER - TY - CONF T1 - A comparison of exposure control procedures in CAT systems based on different measurement models for testlets using the verbal reasoning section of the MCAT T2 - Paper presented at the Annual meeting of the National Council on Measurement in Education Y1 - 2003 A1 - Boyd, A. M A1 - Dodd, B. G. A1 - Fitzpatrick, S. J. JF - Paper presented at the Annual meeting of the National Council on Measurement in Education CY - Chicago IL N1 - {PDF file, 405 KB} ER - TY - CONF T1 - A comparison of item exposure control procedures using a CAT system based on the generalized partial credit model T2 - Paper presented at the annual meeting of the American Educational Research Association Y1 - 2003 A1 - Burt, W. M A1 - Kim, S.-J A1 - Davis, L. L. A1 - Dodd, B. G. 
JF - Paper presented at the annual meeting of the American Educational Research Association CY - Chicago IL N1 - {PDF file, 265 KB} ER - TY - CONF T1 - A comparison of learning potential results at various educational levels T2 - Paper presented at the 6th Annual Society for Industrial and Organisational Psychology of South Africa (SIOPSA) conference Y1 - 2003 A1 - De Beer, M. JF - Paper presented at the 6th Annual Society for Industrial and Organisational Psychology of South Africa (SIOPSA) conference CY - 25-27 June 2003 N1 - {PDF file, 391 KB} ER - TY - CONF T1 - Comparison of multi-stage tests with computer adaptive and paper and pencil tests T2 - Paper presented at the Annual meeting of the National Council on Measurement in Education Y1 - 2003 A1 - Rotou, O. A1 - Patsula, L. A1 - Steffen, M. A1 - Rizavi, S. JF - Paper presented at the Annual meeting of the National Council on Measurement in Education CY - Chicago IL N1 - {PDF file, 695 KB} ER - TY - JOUR T1 - A computer adaptive testing simulation applied to the FIM instrument motor component JF - Arch Phys Med Rehabil Y1 - 2003 A1 - Dijkers, M.P. VL - 84 ER - TY - JOUR T1 - Computer-adaptive test for measuring personality factors using item response theory JF - Dissertation Abstracts International: Section B: The Sciences & Engineering Y1 - 2003 A1 - Macdonald, Paul Lawrence AB - The aim of the present research was to develop a computer adaptive test with the graded response model to measure the Five Factor Model of personality attributes. In the first of three studies, simulated items and simulated examinees were used to investigate systematically the impact of several variables on the accuracy and efficiency of a computer adaptive test. Item test banks containing more items, items with greater trait discrimination, and more response options resulted in increased accuracy and efficiency of the computer adaptive test. It was also found that large stopping rule values required fewer items before stopping but had less accuracy compared to smaller stopping rule values. This demonstrated a trade-off between accuracy and efficiency such that greater measurement accuracy can be obtained at a cost of decreased test efficiency. In the second study, the archival responses of 501 participants to five 30-item test banks measuring the Five Factor Model of personality were utilized in simulations of a computer adaptive personality test. The computer adaptive test estimates of participant trait scores were highly correlated with the item response theory trait estimates, and the magnitude of the correlation was related directly to the stopping rule value with higher correlations and less measurement error being associated with smaller stopping rule values. It was also noted that the performance of the computer adaptive test was dependent on the personality factor being measured whereby Conscientiousness required the most number of items to be administered and Neuroticism required the least. The results confirmed that a simulated computer adaptive test using archival personality data could accurately and efficiently attain trait estimates. In the third study, 276 student participants selected response options with a click of a mouse in a computer adaptive personality test (CAPT) measuring the Big Five factors of the Five Factor Model of personality structure. Participant responses to alternative measures of the Big Five were also collected using conventional paper-and-pencil personality questionnaires. 
It was found that the CAPT obtained trait estimates that were very accurate even with very few administered items. Similarly, the CAPT trait estimates demonstrated moderate to high concurrent validity with the alternative Big Five measures, and the strength of the estimates varied as a result of the similarity of the personality items and assessment methodology. It was also found that the computer adaptive test was accurately able to detect, with relatively few items, the relations between the measured personality traits and several socially interesting variables such as smoking behavior, alcohol consumption rating, and number of dates per month. Implications of the results of this research are discussed in terms of the utility of computer adaptive testing of personality characteristics. As well, methodological limitations of the studies are noted and directions for future research are considered. (PsycINFO Database Record (c) 2004 APA, all rights reserved). VL - 64 ER - TY - JOUR T1 - Computerized adaptive rating scales for measuring managerial performance JF - International Journal of Selection and Assessment Y1 - 2003 A1 - Schneider, R. J. A1 - Goff, M. A1 - Anderson, S. A1 - Borman, W. C. KW - Adaptive Testing KW - Algorithms KW - Associations KW - Citizenship KW - Computer Assisted Testing KW - Construction KW - Contextual KW - Item Response Theory KW - Job Performance KW - Management KW - Management Personnel KW - Rating Scales KW - Test AB - Computerized adaptive rating scales (CARS) had been developed to measure contextual or citizenship performance. This rating format used a paired-comparison protocol, presenting pairs of behavioral statements scaled according to effectiveness levels, and an iterative item response theory algorithm to obtain estimates of ratees' citizenship performance (W. C. Borman et al, 2001). In the present research, we developed CARS to measure the entire managerial performance domain, including task and citizenship performance, thus addressing a major limitation of the earlier CARS. The paper describes this development effort, including an adjustment to the algorithm that reduces substantially the number of item pairs required to obtain almost as much precision in the performance estimates. (PsycINFO Database Record (c) 2005 APA ) VL - 11 ER - TY - CHAP T1 - Computerized adaptive testing Y1 - 2003 A1 - Ponsoda, V. A1 - Olea, J. CY - R. Fernández-Ballesteros (Ed.): Encyclopaedia of Psychological Assessment. London: Sage. ER - TY - CONF T1 - Computerized adaptive testing: A comparison of three content balancing methods T2 - Paper presented at the annual meeting of the National Council on Measurement in Education Y1 - 2003 A1 - Leung, C-K.. A1 - Chang, Hua-Hua A1 - Hau, K-T. A1 - Wen. Z. JF - Paper presented at the annual meeting of the National Council on Measurement in Education CY - Chicago IL N1 - {PDF file, 227 KB} ER - TY - JOUR T1 - Computerized adaptive testing: A comparison of three content balancing methods JF - The Journal of Technology, Learning and Assessment Y1 - 2003 A1 - Leung, C-K.. A1 - Chang, Hua-Hua A1 - Hau, K-T. AB - Content balancing is often a practical consideration in the design of computerized adaptive testing (CAT). This study compared three content balancing methods, namely, the constrained CAT (CCAT), the modified constrained CAT (MCCAT), and the modified multinomial model (MMM), under various conditions of test length and target maximum exposure rate. 
Results of a series of simulation studies indicate that there is no systematic effect of content balancing method in measurement efficiency and pool utilization. However, among the three methods, the MMM appears to consistently over-expose fewer items. VL - 2 ER - TY - JOUR T1 - Computerized adaptive testing using the nearest-neighbors criterion JF - Applied Psychological Measurement Y1 - 2003 A1 - Cheng, P. E. A1 - Liou, M. KW - (Statistical) KW - Adaptive Testing KW - Computer Assisted Testing KW - Item Analysis KW - Item Response Theory KW - Statistical Analysis KW - Statistical Estimation computerized adaptive testing KW - Statistical Tests AB - Item selection procedures designed for computerized adaptive testing need to accurately estimate every taker's trait level (θ) and, at the same time, effectively use all items in a bank. Empirical studies showed that classical item selection procedures based on maximizing Fisher or other related information yielded highly varied item exposure rates; with these procedures, some items were frequently used whereas others were rarely selected. In the literature, methods have been proposed for controlling exposure rates; they tend to affect the accuracy in θ estimates, however. A modified version of the maximum Fisher information (MFI) criterion, coined the nearest neighbors (NN) criterion, is proposed in this study. The NN procedure improves to a moderate extent the undesirable item exposure rates associated with the MFI criterion and keeps sufficient precision in estimates. The NN criterion will be compared with a few other existing methods in an empirical study using the mean squared errors in θ estimates and plots of item exposure rates associated with different distributions. (PsycINFO Database Record (c) 2005 APA ) (journal abstract) VL - 27 ER - TY - JOUR T1 - Computerized adaptive testing using the nearest-neighbors criterion JF - Applied Psychological Measurement Y1 - 2003 A1 - Cheng, P. E. A1 - Liou, M. VL - 27 ER - TY - JOUR T1 - Computerized adaptive testing with item cloning JF - Applied Psychological Measurement Y1 - 2003 A1 - Glas, C. A. W. A1 - van der Linden, W. J. KW - computerized adaptive testing AB - (from the journal abstract) To increase the number of items available for adaptive testing and reduce the cost of item writing, the use of techniques of item cloning has been proposed. An important consequence of item cloning is possible variability between the item parameters. To deal with this variability, a multilevel item response (IRT) model is presented which allows for differences between the distributions of item parameters of families of item clones. A marginal maximum likelihood and a Bayesian procedure for estimating the hyperparameters are presented. In addition, an item-selection procedure for computerized adaptive testing with item cloning is presented which has the following two stages: First, a family of item clones is selected to be optimal at the estimate of the person parameter. Second, an item is randomly selected from the family for administration. Results from simulation studies based on an item pool from the Law School Admission Test (LSAT) illustrate the accuracy of these item pool calibration and adaptive testing procedures. (PsycINFO Database Record (c) 2003 APA, all rights reserved). 
VL - 27 N1 - References .Sage Publications, US ER - TY - CONF T1 - Constraining item exposure in computerized adaptive testing with shadow tests T2 - Paper presented at the Annual meeting of the National Council on Measurement in Education Y1 - 2003 A1 - van der Linden, W. J. A1 - Veldkamp, B. P. JF - Paper presented at the Annual meeting of the National Council on Measurement in Education CY - Chicago IL N1 - #vdLI03-02 ER - TY - CONF T1 - Constructing rotating item pools for constrained adaptive testing T2 - Paper presented at the Annual meeting of the National Council on Measurement in Education Y1 - 2003 A1 - Ariel, A. A1 - Veldkamp, B. A1 - van der Linden, W. J. JF - Paper presented at the Annual meeting of the National Council on Measurement in Education CY - Chicago IL N1 - {PDF file, 395 KB} ER - TY - CONF T1 - Controlling item exposure and item eligibility in computerized adaptive testing Y1 - 2003 A1 - van der Linden, W. J. A1 - Veldkamp, B. P. ER - TY - CONF T1 - Criterion item characteristic curve function for evaluating the differential weight procedure adjusted to on-line item calibration T2 - Paper presented at the annual meeting of the NCME Y1 - 2003 A1 - Samejima, F. JF - Paper presented at the annual meeting of the NCME CY - Chicago IL ER - TY - JOUR T1 - Can examinees use judgments of item difficulty to improve proficiency estimates on computerized adaptive vocabulary tests? JF - Journal of Educational Measurement Y1 - 2002 A1 - Vispoel, W. P. A1 - Clough, S. J. A1 - Bleiler, T. A1 - Hendrickson, A. B. A1 - Ihrig, D. VL - 39 ER - TY - CONF T1 - Comparing three item selection approaches for computerized adaptive testing with content balancing requirement T2 - Paper presented at the annual meeting of the National Council on Measurement in Education Y1 - 2002 A1 - Leung, C-K.. A1 - Chang, Hua-Hua A1 - Hau, K-T. JF - Paper presented at the annual meeting of the National Council on Measurement in Education CY - New Orleans LA N1 - {PDF file, 226 KB} ER - TY - CONF T1 - A comparison of computer mastery models when pool characteristics vary T2 - Paper presented at the annual meeting of the National Council on Measurement in Education Y1 - 2002 A1 - Smith, R. L. A1 - Lewis, C. JF - Paper presented at the annual meeting of the National Council on Measurement in Education CY - New Orleans LA N1 - {PDF file, 692 KB} ER - TY - JOUR T1 - A comparison of item selection techniques and exposure control mechanisms in CATs using the generalized partial credit model JF - Applied Psychological Measurement Y1 - 2002 A1 - Pastor, D. A. A1 - Dodd, B. G. A1 - Chang, Hua-Hua KW - (Statistical) KW - Adaptive Testing KW - Algorithms computerized adaptive testing KW - Computer Assisted Testing KW - Item Analysis KW - Item Response Theory KW - Mathematical Modeling AB - The use of more performance items in large-scale testing has led to an increase in the research investigating the use of polytomously scored items in computer adaptive testing (CAT). Because this research has to be complemented with information pertaining to exposure control, the present research investigated the impact of using five different exposure control algorithms in two sized item pools calibrated using the generalized partial credit model. The results of the simulation study indicated that the a-stratified design, in comparison to a no-exposure control condition, could be used to reduce item exposure and overlap, increase pool utilization, and only minorly degrade measurement precision. 
Use of the more restrictive exposure control algorithms, such as the Sympson-Hetter and conditional Sympson-Hetter, controlled exposure to a greater extent but at the cost of measurement precision. Because convergence of the exposure control parameters was problematic for some of the more restrictive exposure control algorithms, use of the more simplistic exposure control mechanisms, particularly when the test length to item pool size ratio is large, is recommended. (PsycINFO Database Record (c) 2005 APA ) (journal abstract) VL - 26 ER - TY - JOUR T1 - A comparison of non-deterministic procedures for the adaptive assessment of knowledge JF - Psychologische Beiträge Y1 - 2002 A1 - Hockemeyer, C. VL - 44 ER - TY - CONF T1 - Comparison of the psychometric properties of several computer-based test designs for credentialing exams T2 - Paper presented at the annual meeting of the National Council on Measurement in Education Y1 - 2002 A1 - Jodoin, M. A1 - Zenisky, A. L. A1 - Hambleton, R. K. JF - Paper presented at the annual meeting of the National Council on Measurement in Education CY - New Orleans LA N1 - {PDF file, 261 KB} ER - TY - JOUR T1 - Computer adaptive testing: The impact of test characteristics on perceived performance and test takers' reactions JF - Dissertation Abstracts International: Section B: the Sciences & Engineering Y1 - 2002 A1 - Tonidandel, S. KW - computerized adaptive testing AB - This study examined the relationship between characteristics of adaptive testing and test takers' subsequent reactions to the test. Participants took a computer adaptive test in which two features, the difficulty of the initial item and the difficulty of subsequent items, were manipulated. These two features of adaptive testing determined the number of items answered correctly by examinees and their subsequent reactions to the test. The data show that the relationship between test characteristics and reactions was fully mediated by perceived performance on the test. In addition, the impact of feedback on reactions to adaptive testing was also evaluated. In general, feedback that was consistent with perceptions of performance had a positive impact on reactions to the test. Implications for adaptive test design concerning maximizing test takers' reactions are discussed. (PsycINFO Database Record (c) 2003 APA, all rights reserved). VL - 62 ER - TY - JOUR T1 - Computer-adaptive testing: The impact of test characteristics on perceived performance and test takers’ reactions JF - Journal of Applied Psychology Y1 - 2002 A1 - Tonidandel, S. A1 - Quiñones, M. A. A1 - Adams, A. A. VL - 87 ER - TY - JOUR T1 - Computerised adaptive testing JF - British Journal of Educational Technology Y1 - 2002 A1 - Latu, E. A1 - Chapman, E. KW - computerized adaptive testing AB - Considers the potential of computer adaptive testing (CAT). Discusses the use of CAT instead of traditional paper and pencil tests, identifies decisions that impact the efficacy of CAT, and concludes that CAT is beneficial when used to its full potential on certain types of tests. (LRW) VL - 33 ER - TY - CONF T1 - Confirmatory item factor analysis using Markov chain Monte Carlo estimation with applications to online calibration in CAT T2 - Paper presented at the annual meeting of the National Council on Measurement in Education Y1 - 2002 A1 - Segall, D. O.
JF - Paper presented at the annual meeting of the National Council on Measurement in Education CY - New Orleans, LA ER - TY - ABST T1 - Constraining item exposure in computerized adaptive testing with shadow tests (Research Report No. 02-06) Y1 - 2002 A1 - van der Linden, W. J. A1 - Veldkamp, B. P. CY - University of Twente, The Netherlands ER - TY - CONF T1 - Content-stratified random item selection in computerized classification testing T2 - Paper presented at the annual meeting of the National Council on Measurement in Education Y1 - 2002 A1 - Guille, R. A1 - Lipner, R. S. A1 - Norcini, J. J. JF - Paper presented at the annual meeting of the National Council on Measurement in Education CY - New Orleans LA N1 - #GU02-01 ER - TY - CHAP T1 - Controlling item exposure and maintaining item security Y1 - 2002 A1 - Davey, T. A1 - Nering, M. CY - C. N. Mills, M. T. Potenza, and J. J. Fremer (Eds.), Computer-Based Testing: Building the Foundation for Future Assessments (pp. 165-191). Mahwah, NJ: Lawrence Erlbaum Associates, Inc. ER - TY - CONF T1 - Can examinees use judgments of item difficulty to improve proficiency estimates on computerized adaptive vocabulary tests? T2 - Paper presented at the annual meeting of the National Council on Measurement in Education Y1 - 2001 A1 - Vispoel, W. P. A1 - Clough, S. J. A1 - Bleiler, T. A1 - Hendrickson, A. B. A1 - Ihrig, D. JF - Paper presented at the annual meeting of the National Council on Measurement in Education CY - Seattle WA N1 - #VI01-01 ER - TY - RPRT T1 - CATSIB: A modified SIBTEST procedure to detect differential item functioning in computerized adaptive tests (Research report) Y1 - 2001 A1 - Nandakumar, R. A1 - Roussos, L. PB - Law School Admission Council CY - Newtown, PA ER - TY - ABST T1 - CB BULATS: Examining the reliability of a computer based test using test-retest method Y1 - 2001 A1 - Geranpayeh, A. CY - Cambridge ESOL Research Notes, Issue 5, July 2001, pp. 14-16. N1 - #GE01-01 {PDF file, 456 KB} ER - TY - JOUR T1 - A comparative study of on-line pretest item calibration/scaling methods in computerized adaptive testing JF - Journal of Educational Measurement Y1 - 2001 A1 - Ban, J. C. A1 - Hanson, B. A. A1 - Wang, T. A1 - Yi, Q. A1 - Harris, D. J. AB - The purpose of this study was to compare and evaluate five on-line pretest item-calibration/scaling methods in computerized adaptive testing (CAT): marginal maximum likelihood estimate with one EM cycle (OEM), marginal maximum likelihood estimate with multiple EM cycles (MEM), Stocking's Method A, Stocking's Method B, and BILOG/Prior. The five methods were evaluated in terms of item-parameter recovery, using three different sample sizes (300, 1000 and 3000). The MEM method appeared to be the best choice among these, because it produced the smallest parameter-estimation errors for all sample size conditions. MEM and OEM are mathematically similar, although the OEM method produced larger errors. MEM also was preferable to OEM, unless the amount of time involved in iterative computation is a concern. Stocking's Method B also worked very well, but it required anchor items that either would increase test lengths or require larger sample sizes depending on test administration design. Until more appropriate ways of handling sparse data are devised, the BILOG/Prior method may not be a reasonable choice for small sample sizes.
Stocking's Method A had the largest weighted total error, as well as a theoretical weakness (i.e., treating estimated ability as true ability); thus, there appeared to be little reason to use it. VL - 38 ER - TY - CONF T1 - Comparison of the SPRT and CMT procedures in computerized adaptive testing T2 - Paper presented at the annual meeting of the American Educational Research Association Y1 - 2001 A1 - Yi, Q. A1 - Hanson, B. A1 - Widiatmo, H. A1 - Harris, D. J. JF - Paper presented at the annual meeting of the American Educational Research Association CY - Seattle WA ER - TY - JOUR T1 - Computerized adaptive testing with equated number-correct scoring JF - Applied Psychological Measurement Y1 - 2001 A1 - van der Linden, W. J. AB - A constrained computerized adaptive testing (CAT) algorithm is presented that can be used to equate CAT number-correct (NC) scores to a reference test. As a result, the CAT NC scores also are equated across administrations. The constraints are derived from van der Linden & Luecht’s (1998) set of conditions on item response functions that guarantees identical observed NC score distributions on two test forms. An item bank from the Law School Admission Test was used to compare the results of the algorithm with those for equipercentile observed-score equating, as well as the prediction of NC scores on a reference test using its test response function. The effects of the constraints on the statistical properties of the θ estimator in CAT were examined. VL - 25 N1 - Sage Publications, US ER - TY - JOUR T1 - Computerized adaptive testing with the generalized graded unfolding model JF - Applied Psychological Measurement Y1 - 2001 A1 - Roberts, J. S. A1 - Lin, Y. A1 - Laughlin, J. E. KW - Attitude Measurement KW - College Students computerized adaptive testing KW - Computer Assisted Testing KW - Item Response KW - Models KW - Statistical Estimation KW - Theory AB - Examined the use of the generalized graded unfolding model (GGUM) in computerized adaptive testing. The objective was to minimize the number of items required to produce equiprecise estimates of person locations. Simulations based on real data about college student attitudes toward abortion and on data generated to fit the GGUM were used. It was found that as few as 7 or 8 items were needed to produce accurate and precise person estimates using an expected a posteriori procedure. The number of items in the item bank (20, 40, or 60 items) and their distribution on the continuum (uniform locations or item clusters in moderately extreme locations) had only small effects on the accuracy and precision of the estimates. These results suggest that adaptive testing with the GGUM is a good method for achieving estimates with an approximately uniform level of precision using a small number of items. (PsycINFO Database Record (c) 2005 APA ) VL - 25 ER - TY - UNPB T1 - Computerized-adaptive versus paper-and-pencil testing environments: An experimental analysis of examinee experience Y1 - 2001 A1 - Bringsjord, E. L. VL - Doctoral dissertation ER - TY - JOUR T1 - Concerns with computerized adaptive oral proficiency assessment. A commentary on "Comparing examinee attitudes toward computer-assisted and other oral proficiency assessments": Response to the Norris Commentary JF - Language Learning and Technology Y1 - 2001 A1 - Norris, J. M. A1 - Kenyon, D. M. A1 - Malabonga, V.
AB - Responds to an article on computerized adaptive second language (L2) testing, expressing concerns about the appropriateness of such tests for informing language educators about the language skills of L2 learners and users and fulfilling the intended purposes and achieving the desired consequences of language test use. The authors of the original article respond. (Author/VWL) VL - 5 ER - TY - JOUR T1 - CUSUM-based person-fit statistics for adaptive testing JF - Journal of Educational and Behavioral Statistics Y1 - 2001 A1 - van Krimpen-Stoop, E. M. L. A. A1 - Meijer, R. R. VL - 26 ER - TY - JOUR T1 - Capitalization on item calibration error in adaptive testing JF - Applied Measurement in Education Y1 - 2000 A1 - van der Linden, W. J. A1 - Glas, C. A. W. KW - computerized adaptive testing AB - (from the journal abstract) In adaptive testing, item selection is sequentially optimized during the test. Because the optimization takes place over a pool of items calibrated with estimation error, capitalization on chance is likely to occur. How serious the consequences of this phenomenon are depends not only on the distribution of the estimation errors in the pool or the conditional ratio of the test length to the pool size given ability, but may also depend on the structure of the item selection criterion used. A simulation study demonstrated a dramatic impact of capitalization on estimation errors on ability estimation. Four different strategies to minimize the likelihood of capitalization on error in computerized adaptive testing are discussed. VL - 13 N1 - Lawrence Erlbaum, US ER - TY - JOUR T1 - CAT administration of language placement examinations JF - Journal of Applied Measurement Y1 - 2000 A1 - Stahl, J. A1 - Bergstrom, B. A1 - Gershon, R. C. KW - *Language KW - *Software KW - Aptitude Tests/*statistics & numerical data KW - Educational Measurement/*statistics & numerical data KW - Humans KW - Psychometrics KW - Reproducibility of Results KW - Research Support, Non-U.S. Gov't AB - This article describes the development of a computerized adaptive test for Cégep de Jonquière, a community college located in Quebec, Canada. Computerized language proficiency testing allows the simultaneous presentation of sound stimuli as the question is being presented to the test-taker. With a properly calibrated bank of items, the language proficiency test can be offered in an adaptive framework. By adapting the test to the test-taker's level of ability, an assessment can be made with significantly fewer items. We also describe our initial attempt to detect instances in which "cheating low" is occurring. In the "cheating low" situation, test-takers deliberately answer questions incorrectly, questions that they are fully capable of answering correctly had they been taking the test honestly. VL - 1 N1 - 1529-7713; Journal Article ER - TY - CHAP T1 - Caveats, pitfalls, and unexpected consequences of implementing large-scale computerized testing Y1 - 2000 A1 - Wainer, H. A1 - Eignor, D. R. CY - Wainer, H. (Ed.), Computerized adaptive testing: A primer (2nd ed.), pp. 271-299. Mahwah, NJ: Lawrence Erlbaum Associates. ER - TY - ABST T1 - CBTS: Computer-based testing simulation and analysis [computer software] Y1 - 2000 A1 - Robin, F.
CY - Amherst, MA: University of Massachusetts, School of Education ER - TY - CONF T1 - Change in distribution of latent ability with item position in CAT sequence T2 - Paper presented at the annual meeting of the National Council on Measurement in Education in New Orleans Y1 - 2000 A1 - Krass, I. A. JF - Paper presented at the annual meeting of the National Council on Measurement in Education in New Orleans CY - LA N1 - {PDF file, 103 KB} ER - TY - JOUR T1 - The choice of item difficulty in self-adapted testing JF - European Journal of Psychological Assessment Y1 - 2000 A1 - Hontangas, P. A1 - Ponsoda, V. A1 - Olea, J. A1 - Wise, S. L. VL - 16 IS - 1 ER - TY - CONF T1 - Classification accuracy and test security for a computerized adaptive mastery test calibrated with different IRT models T2 - Paper presented at the annual meeting of the National Council on Measurement in Education Y1 - 2000 A1 - Robin, F. A1 - Xing, D. A1 - Scrams, D. A1 - Potenza, M. JF - Paper presented at the annual meeting of the National Council on Measurement in Education CY - New Orleans LA ER - TY - JOUR T1 - A comparison of computerized adaptive testing and multistage testing JF - Dissertation Abstracts International: Section B: the Sciences & Engineering Y1 - 2000 A1 - Patsula, L. N. KW - computerized adaptive testing AB - There is considerable evidence to show that computerized-adaptive testing (CAT) and multi-stage testing (MST) are viable frameworks for testing. With many testing organizations looking to move towards CAT or MST, it is important to know which framework is superior in different situations and at what cost in terms of measurement. What was needed was a comparison of the different testing procedures under various realistic testing conditions. This dissertation addressed the important problem of the increase or decrease in accuracy of ability estimation in using MST rather than CAT. The purpose of this study was to compare the accuracy of ability estimates produced by MST and CAT while keeping some variables fixed and varying others. A simulation study was conducted to investigate the effects of several factors on the accuracy of ability estimation using different CAT and MST designs. The factors that were manipulated were the number of stages, the number of subtests per stage, and the number of items per subtest. Kept constant were test length, distribution of subtest information, method of determining cut-points on subtests, amount of overlap between subtests, and method of scoring the total test. The primary question of interest was, given a fixed test length, how many stages and how many subtests per stage should there be to maximize measurement precision? Furthermore, how many items should there be in each subtest? Should there be more in the routing test or should there be more in the higher-stage tests? Results showed that, in general, increasing the number of stages from two to three decreased the amount of errors in ability estimation. Increasing the number of subtests from three to five increased the accuracy of ability estimates as well as the efficiency of the MST designs relative to the P&P and CAT designs at most ability levels (-.75 to 2.25). Finally, at most ability levels (-.75 to 2.25), varying the number of items per stage had little effect on either the resulting accuracy of ability estimates or the relative efficiency of the MST designs to the P&P and CAT designs. (PsycINFO Database Record (c) 2003 APA, all rights reserved).
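Patsula's comparison above turns on how a multi-stage test routes examinees between stages. The snippet below is a minimal, hypothetical illustration of number-correct routing after a 10-item routing test; the cut scores and module labels are placeholders, not values taken from the dissertation.

```python
def route_to_module(number_correct, cut_low=4, cut_high=7):
    """Route an examinee after a 10-item routing test to a second-stage module.

    The cut scores are illustrative; an operational MST would derive them from
    the routing test's information function and the intended decision points.
    """
    if number_correct <= cut_low:
        return "easy module"
    if number_correct <= cut_high:
        return "medium module"
    return "hard module"

for score in (3, 6, 9):
    print(score, "->", route_to_module(score))
```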
VL - 60 ER - TY - JOUR T1 - A comparison of item selection rules at the early stages of computerized adaptive testing JF - Applied Psychological Measurement Y1 - 2000 A1 - Chen, S-Y. A1 - Ankenmann, R. D. A1 - Chang, Hua-Hua KW - Adaptive Testing KW - Computer Assisted Testing KW - Item Analysis (Test) KW - Statistical Estimation computerized adaptive testing AB - The effects of 5 item selection rules--Fisher information (FI), Fisher interval information (FII), Fisher information with a posterior distribution (FIP), Kullback-Leibler information (KL), and Kullback-Leibler information with a posterior distribution (KLP)--were compared with respect to the efficiency and precision of trait (θ) estimation at the early stages of computerized adaptive testing (CAT). FII, FIP, KL, and KLP performed marginally better than FI at the early stages of CAT for θ=-3 and -2. For tests longer than 10 items, there appeared to be no precision advantage for any of the selection rules. (PsycINFO Database Record (c) 2005 APA ) (journal abstract) VL - 24 ER - TY - JOUR T1 - A comparison of item selection rules at the early stages of computerized adaptive testing JF - Applied Psychological Measurement Y1 - 2000 A1 - Chen, S.Y. A1 - Ankenmann, R. D. A1 - Chang, Hua-Hua VL - 24 ER - TY - CHAP T1 - Computer-adaptive sequential testing Y1 - 2000 A1 - Luecht, RM A1 - Nungester, R. J. CY - W. J. van der Linden (Ed.), Computerized Adaptive Testing: Theory and Practice (pp. 289-209). Dordrecht, The Netherlands: Kluwer. ER - TY - CHAP T1 - Computer-adaptive testing: A methodology whose time has come T2 - Development of Computerised Middle School Achievement Tests Y1 - 2000 A1 - Linacre, J. M. ED - Kang, U. ED - Jean, E. ED - Linacre, J. M. KW - computerized adaptive testing JF - Development of Computerised Middle School Achievement Tests PB - MESA CY - Chicago, IL. USA VL - 69 ER - TY - ABST T1 - Computer-adaptive testing: A methodology whose time has come. MESA Memorandum No 9 Y1 - 2000 A1 - Linacre, J. M. CY - Chicago : MESA psychometric laboratory, Unversity of Chicago. ER - TY - JOUR T1 - Computerization and adaptive administration of the NEO PI-R JF - Assessment Y1 - 2000 A1 - Reise, S. P. A1 - Henson, J. M. KW - *Personality Inventory KW - Algorithms KW - California KW - Diagnosis, Computer-Assisted/*methods KW - Humans KW - Models, Psychological KW - Psychometrics/methods KW - Reproducibility of Results AB - This study asks, how well does an item response theory (IRT) based computerized adaptive NEO PI-R work? To explore this question, real-data simulations (N = 1,059) were used to evaluate a maximum information item selection computerized adaptive test (CAT) algorithm. Findings indicated satisfactory recovery of full-scale facet scores with the administration of around four items per facet scale. Thus, the NEO PI-R could be reduced in half with little loss in precision by CAT administration. However, results also indicated that the CAT algorithm was not necessary. We found that for many scales, administering the "best" four items per facet scale would have produced similar results. In the conclusion, we discuss the future of computerized personality assessment and describe the role IRT methods might play in such assessments. VL - 7 N1 - 1073-1911 (Print)Journal Article ER - TY - JOUR T1 - Computerized adaptive administration of the self-evaluation examination JF - AANA.J Y1 - 2000 A1 - LaVelle, T. A1 - Zaglaniczny, K., A1 - Spitzer, L.E. 
VL - 68 ER - TY - ABST T1 - Computerized adaptive rating scales (CARS): Development and evaluation of the concept Y1 - 2000 A1 - Borman, W. C. A1 - Hanson, M. A. A1 - Kubisiak, U. C. A1 - Buck, D. E. CY - (Institute Rep No. 350). Tampa FL: Personnel Decisions Research Institute. ER - TY - BOOK T1 - Computerized adaptive testing: A primer (2nd edition) Y1 - 2000 A1 - Wainer, H. A1 - Dorans, N. A1 - Eignor, D. R. A1 - Flaugher, R. A1 - Green, B. F. A1 - Mislevy, R. A1 - Steinberg, L. A1 - Thissen, D. CY - Hillsdale, N. J.: Lawrence Erlbaum Associates ER - TY - JOUR T1 - Computerized adaptive testing for classifying examinees into three categories JF - Educational and Psychological Measurement Y1 - 2000 A1 - Theo Eggen A1 - Straetmans, G. J. J. M. KW - computerized adaptive testing KW - Computerized classification testing AB - The objective of this study was to explore the possibilities for using computerized adaptive testing in situations in which examinees are to be classified into one of three categories. Testing algorithms with two different statistical computation procedures are described and evaluated. The first computation procedure is based on statistical testing and the other on statistical estimation. Item selection methods based on maximum information (MI) considering content and exposure control are considered. The measurement quality of the proposed testing algorithms is reported. The results of the study are that a reduction of at least 22% in the mean number of items can be expected in a computerized adaptive test (CAT) compared to an existing paper-and-pencil placement test. Furthermore, statistical testing is a promising alternative to statistical estimation. Finally, it is concluded that imposing constraints on the MI selection strategy does not negatively affect the quality of the testing algorithms. VL - 60 ER - TY - BOOK T1 - Computerized adaptive testing: Theory and practice Y1 - 2000 A1 - van der Linden, W. J. A1 - Glas, C. A. W. PB - Kluwer Academic Publishers CY - Dordrecht, The Netherlands ER - TY - CONF T1 - Computerized testing – the adolescent years: Juvenile delinquent or positive role model T2 - Paper presented at the annual meeting of the National Council on Measurement in Education Y1 - 2000 A1 - Reckase, M. D. JF - Paper presented at the annual meeting of the National Council on Measurement in Education CY - New Orleans LA ER - TY - CHAP T1 - Constrained adaptive testing with shadow tests Y1 - 2000 A1 - van der Linden, W. J. CY - W. J. van der Linden and C. A. W. Glas (eds.), Computerized adaptive testing: Theory and practice (pp. 27-52). Norwell MA: Kluwer. ER - TY - BOOK T1 - The construction and evaluation of a dynamic computerised adaptive test for the measurement of learning potential Y1 - 2000 A1 - De Beer, M. CY - Unpublished D. Litt et Phil dissertation. University of South Africa, Pretoria. ER - TY - CONF T1 - Content balancing in stratified computerized adaptive testing designs T2 - Paper presented at the annual meeting of the American Educational Research Association Y1 - 2000 A1 - Leung, C-K. A1 - Chang, Hua-Hua A1 - Hau, K-T. JF - Paper presented at the annual meeting of the American Educational Research Association CY - New Orleans, LA N1 - {PDF file, 427 KB} ER - TY - CHAP T1 - Cross-validating item parameter estimation in adaptive testing Y1 - 2000 A1 - van der Linden, W. J. A1 - Glas, C. A. W. CY - A. Boorsma, M. A. J. van Duijn, and T. A. B. Snijders (Eds.) (pp. 205-219), Essays on item response theory. New York: Springer.
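The Eggen and Straetmans abstract above contrasts a statistical-testing procedure with statistical estimation for classifying examinees. A common form of such a testing procedure is the sequential probability ratio test; the sketch below illustrates the idea for a single cut point under the 2PL (a three-category classification would run two such comparisons, one per cut score). All item parameters, cut points, and error rates here are illustrative assumptions, not values from the article.

```python
import numpy as np

def sprt_status(responses, a, b, theta_low, theta_high, alpha=0.05, beta=0.05):
    """Sequential probability ratio test between two ability points (2PL).

    responses, a, b are arrays for the items answered so far. Returns
    'at or above', 'below', or 'continue testing'.
    """
    def log_likelihood(theta):
        p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
        return float(np.sum(responses * np.log(p) + (1 - responses) * np.log(1.0 - p)))

    log_ratio = log_likelihood(theta_high) - log_likelihood(theta_low)
    if log_ratio >= np.log((1.0 - beta) / alpha):
        return "at or above"
    if log_ratio <= np.log(beta / (1.0 - alpha)):
        return "below"
    return "continue testing"

# Toy example: five items near the cut region, mixed responses.
responses = np.array([1, 0, 1, 1, 0])
a = np.full(5, 1.2)
b = np.array([-0.5, 0.0, 0.2, 0.5, 0.8])
print(sprt_status(responses, a, b, theta_low=-0.2, theta_high=0.2))
```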
ER - TY - JOUR T1 - Can examinees use a review option to obtain positively biased ability estimates on a computerized adaptive test? JF - Journal of Educational Measurement Y1 - 1999 A1 - Vispoel, W. P. A1 - Rocklin, T. R. A1 - Wang, T. A1 - Bleiler, T. VL - 36 ER - TY - CONF T1 - CAT administration of language placement exams T2 - Paper presented at the annual meeting of the National Council on Measurement in Education Y1 - 1999 A1 - Stahl, J. A1 - Gershon, R. C. A1 - Bergstrom, B. JF - Paper presented at the annual meeting of the National Council on Measurement in Education CY - Montreal, Canada ER - TY - CHAP T1 - CAT for certification and licensure T2 - Innovations in computerized assessment Y1 - 1999 A1 - Bergstrom, Betty A. A1 - Lunz, M. E. KW - computerized adaptive testing AB - (from the chapter) This chapter discusses implementing computerized adaptive testing (CAT) for high-stakes examinations that determine whether or not a particular candidate will be certified or licensed. The experience of several boards who have chosen to administer their licensure or certification examinations using the principles of CAT illustrates the process of moving into this mode of administration. Examples of the variety of options that can be utilized within a CAT administration are presented, the decisions that boards must make to implement CAT are discussed, and a timetable for completing the tasks that need to be accomplished is provided. In addition to the theoretical aspects of CAT, practical issues and problems are reviewed. (PsycINFO Database Record (c) 2002 APA, all rights reserved). JF - Innovations in computerized assessment PB - Lawrence Erlbaum Associates CY - Mahwah, N.J. N1 - Using Smart Source ParsingInnovations in computerized assessment. (pp. 67-91). xiv, 266pp ER - TY - CONF T1 - A comparative study of ability estimates from computer-adaptive testing and multi-stage testing T2 - Paper presented at the annual meeting of the National Council on Measurement in Education Y1 - 1999 A1 - Patsula, L N. A1 - Hambleton, R. K. JF - Paper presented at the annual meeting of the National Council on Measurement in Education CY - Montreal Canada ER - TY - BOOK T1 - A comparison of computerized-adaptive testing and multi-stage testing Y1 - 1999 A1 - Patsula, L N. CY - Unpublished doctoral dissertation, University of Massachusetts at Amherst ER - TY - CONF T1 - A comparison of conventional and adaptive testing procedures for making single-point decisions T2 - Paper presented at the annual meeting of the National Council on Measurement in Education Y1 - 1999 A1 - Kingsbury, G. G. A1 - A Zara JF - Paper presented at the annual meeting of the National Council on Measurement in Education CY - Montreal, Canada N1 - #KI99-1 ER - TY - CONF T1 - Comparison of stratum scored and maximum likelihood scoring T2 - Paper presented at the annual meeting of the National Council on Measurement in Education Y1 - 1999 A1 - Wise, S. L. JF - Paper presented at the annual meeting of the National Council on Measurement in Education CY - Montreal, Canada ER - TY - ABST T1 - A comparison of testlet-based test designs for computerized adaptive testing (LSAC Computerized Testing Report 97-01) Y1 - 1999 A1 - Schnipke, D. L. A1 - Reese, L. M. CY - Newtown, PA: LSAC. ER - TY - CONF T1 - Comparison of the a-stratified method, the Sympson-Hetter method, and the matched difficulty method in CAT administration T2 - Paper presented at the annual meeting of the Psychometric Society Y1 - 1999 A1 - Ban, J A1 - Wang, T. A1 - Yi, Q. 
JF - Paper presented at the annual meeting of the Psychometric Society CY - Lawrence KS N1 - #BA99-01 ER - TY - JOUR T1 - Competency gradient for child-parent centers JF - Journal of Outcomes Measurement Y1 - 1999 A1 - Bezruczko, N. KW - *Models, Statistical KW - Activities of Daily Living/classification/psychology KW - Adolescent KW - Chicago KW - Child KW - Child, Preschool KW - Early Intervention (Education)/*statistics & numerical data KW - Female KW - Follow-Up Studies KW - Humans KW - Male KW - Outcome and Process Assessment (Health Care)/*statistics & numerical data AB - This report describes an implementation of the Rasch model during the longitudinal evaluation of a federally-funded early childhood preschool intervention program. An item bank is described for operationally defining a psychosocial construct called community life-skills competency, an expected teenage outcome of the preschool intervention. This analysis examined the position of teenage students on this scale structure, and investigated a pattern of cognitive operations necessary for students to pass community life-skills test items. Then this scale structure was correlated with nationally standardized reading and math achievement scores, teacher ratings, and school records to assess its validity as a measure of the community-related outcome goal for this intervention. The results show a functional relationship between years of early intervention and magnitude of effect on the life-skills competency variable. VL - 3 N1 - 1090-655X (Print)Journal ArticleResearch Support, U.S. Gov't, P.H.S. ER - TY - JOUR T1 - Computerized adaptive assessment with the MMPI-2 in a clinical setting JF - Psychological Assessment Y1 - 1999 A1 - Handel, R. W. A1 - Ben-Porath, Y. S. A1 - Watt, M. E. VL - 11 ER - TY - ABST T1 - Computerized adaptive testing in the Bundeswehr Y1 - 1999 A1 - Storm, E. G. CY - Unpublished manuscript N1 - #ST99-01 {PDF file, 427 KB} ER - TY - JOUR T1 - Computerized adaptive testing: Overview and introduction JF - Applied Psychological Measurement Y1 - 1999 A1 - Meijer, R. R. A1 - Nering, M. L. VL - 23 ER - TY - JOUR T1 - Computerized Adaptive Testing: Overview and Introduction JF - Applied Psychological Measurement Y1 - 1999 A1 - Meijer, R. R. A1 - Nering, M. L. KW - computerized adaptive testing AB - Use of computerized adaptive testing (CAT) has increased substantially since it was first formulated in the 1970s. This paper provides an overview of CAT and introduces the contributions to this Special Issue. The elements of CAT discussed here include item selection procedures, estimation of the latent trait, item exposure, measurement precision, and item bank development. Some topics for future research are also presented. VL - 23 ER - TY - Generic T1 - Computerized classification testing under practical constraints with a polytomous model T2 - annual meeting of the American Educational Research Association Y1 - 1999 A1 - Lau, CA A1 - Wang, T. JF - annual meeting of the American Educational Research Association CY - Montreal, Quebec, Canada ER - TY - CONF T1 - Computerized classification testing under practical constraints with a polytomous model T2 - Paper presented at the annual meeting of the American Educational Research Association Y1 - 1999 A1 - Lau, C. A, A1 - Wang, T. 
JF - Paper presented at the annual meeting of the American Educational Research Association CY - Montreal N1 - PDF file, 579 K ER - TY - CONF T1 - Computerized testing – Issues and applications (Mini-course manual) T2 - Annual Meeting of the National Council on Measurement in Education Y1 - 1999 A1 - Parshall, C. A1 - Davey, T. A1 - Spray, J. A1 - Kalohn, J. JF - Annual Meeting of the National Council on Measurement in Education CY - Montreal ER - TY - CONF T1 - Constructing adaptive tests to parallel conventional programs T2 - Paper presented at the annual meeting of the National council on Measurement in Education Y1 - 1999 A1 - Fan, M. A1 - Thompson, T. A1 - Davey, T. JF - Paper presented at the annual meeting of the National council on Measurement in Education CY - Montreal N1 - #FA99-01 ER - TY - CHAP T1 - Creating computerized adaptive tests of music aptitude: Problems, solutions, and future directions Y1 - 1999 A1 - Vispoel, W. P. CY - F. Drasgow and J. B. Olson-Buchanan (Eds.), Innovations in computerized assessment (pp. 151-176). Mahwah NJ: Erlbaum. ER - TY - ABST T1 - Current and future research in multi-stage testing (Research Report No 370) Y1 - 1999 A1 - Zenisky, A. L. CY - Amherst MA: University of Massachusetts, Laboratory of Pychometric and Evaluative Research. N1 - {PDF file, 131 KB} ER - TY - JOUR T1 - CUSUM-based person-fit statistics for adaptive testing JF - Journal of Educational and Behavioral Statistics Y1 - 1999 A1 - van Krimpen-Stoop, E. M. L. A. A1 - Meijer, R. R. VL - 26 ER - TY - ABST T1 - CUSUM-based person-fit statistics for adaptive testing (Research Report 99-05) Y1 - 1999 A1 - van Krimpen-Stoop, E. M. L. A. A1 - Meijer, R. R. CY - Enschede, The Netherlands: University of Twente, Faculty of Educational Science and Technology, Department of Measurement and Data Analysis ER - TY - ABST T1 - Capitalization on item calibration error in adaptive testing (Research Report 98-07) Y1 - 1998 A1 - van der Linden, W. J. A1 - Glas, C. A. W. CY - Enschede, The Netherlands: University of Twente, Faculty of Educational Science and Technology, Department of Measurement and Data Analysis ER - TY - ABST T1 - CASTISEL [Computer software] Y1 - 1998 A1 - Luecht, RM CY - Philadelphia, PA: National Board of Medical Examiners ER - TY - CONF T1 - CAT item calibration T2 - Paper presented at the annual meeting of the National Council on Measurement in Education Y1 - 1998 A1 - Hsu, Y. A1 - Thompson, T.D. A1 - Chen, W-H. JF - Paper presented at the annual meeting of the National Council on Measurement in Education CY - San Diego ER - TY - CONF T1 - CAT Item exposure control: New evaluation tools, alternate methods and integration into a total CAT program T2 - Paper presented at the annual meeting of the National Council of Measurement in Education Y1 - 1998 A1 - Thomasson, G. L. JF - Paper presented at the annual meeting of the National Council of Measurement in Education CY - San Diego, CA ER - TY - ABST T1 - Comparability of paper-and-pencil and computer adaptive test scores on the GRE General Test (GRE Board Professional Report No 95-08P; Educational Testing Service Research Report 98-38) Y1 - 1998 A1 - Schaeffer, G. A1 - Bridgeman, B. A1 - Golub-Smith, M. L. A1 - Lewis, C. A1 - Potenza, M. T. A1 - Steffen, M. CY - Princeton, NJ: Educational Testing Service ER - TY - RPRT T1 - Comparability of paper-and-pencil and computer adaptive test scores on the GRE General Test Y1 - 1998 A1 - Schaeffer, G. A. A1 - Bridgeman, B. A1 - Golub-Smith, M. L. A1 - Lewis, C. A1 - Potenza, M. T. 
A1 - Steffen, M. PB - Educational Testing Service CY - Princeton, N.J. SN - ETS Research Report 98-38 ER - TY - ABST T1 - A comparative study of item exposure control methods in computerized adaptive testing Y1 - 1998 A1 - Chang, S-W. A1 - Twu, B.-Y. CY - Research Report Series 98-3, Iowa City: American College Testing. N1 - #CH98-03 ER - TY - BOOK T1 - A comparative study of item exposure control methods in computerized adaptive testing Y1 - 1998 A1 - Chang, S-W. CY - Unpublished doctoral dissertation, University of Iowa, Iowa City IA ER - TY - CONF T1 - Comparing and combining dichotomous and polytomous items with SPRT procedure in computerized classification testing T2 - Paper presented at the annual meeting of the American Educational Research Association Y1 - 1998 A1 - Lau, CA A1 - Wang, T. JF - Paper presented at the annual meeting of the American Educational Research Association CY - San Diego N1 - PDF file, 375 K ER - TY - JOUR T1 - A comparison of item exposure control methods in computerized adaptive testing JF - Journal of Educational Measurement Y1 - 1998 A1 - Revuelta, J. A1 - Ponsoda, V. VL - 35 ER - TY - JOUR T1 - A comparison of maximum likelihood estimation and expected a posteriori estimation in CAT using the partial credit model JF - Educational and Psychological Measurement Y1 - 1998 A1 - Chen, S. A1 - Hou, L. A1 - Dodd, B. G. VL - 58 ER - TY - CONF T1 - A comparison of two methods of controlling item exposure in computerized adaptive testing T2 - Paper presented at the meeting of the American Educational Research Association, San Diego CA Y1 - 1998 A1 - Tang, L. A1 - Jiang, H. A1 - Chang, Hua-Hua JF - Paper presented at the meeting of the American Educational Research Association, San Diego CA ER - TY - ABST T1 - Computer adaptive testing – Approaches for item selection and measurement Y1 - 1998 A1 - Armstrong, R. D. A1 - Jones, D. H. CY - Rutgers Center for Operations Research, New Brunswick NJ ER - TY - JOUR T1 - Computer-assisted test assembly using optimization heuristics JF - Applied Psychological Measurement Y1 - 1998 A1 - Luecht, RM VL - 22 ER - TY - CONF T1 - Computerized adaptive rating scales that measure contextual performance T2 - Paper presented at the 13th annual conference of the Society for Industrial and Organizational Psychology Y1 - 1998 A1 - Borman, W. C. A1 - Hanson, M. A. A1 - Motowidlo, S. J. A1 - F Drasgow A1 - Foster, L A1 - Kubisiak, U. C. JF - Paper presented at the 13th annual conference of the Society for Industrial and Organizational Psychology CY - Dallas TX ER - TY - JOUR T1 - Computerized adaptive testing: What it is and how it works JF - Educational Technology Y1 - 1998 A1 - Straetmans, G. J. J. M. A1 - Theo Eggen AB - Describes the workings of computerized adaptive testing (CAT). Focuses on the key concept of information and then discusses two important components of a CAT system: the calibrated item bank and the testing algorithm. Describes a CAT that was designed for making placement decisions on the basis of two typical test administrations and notes the most significant differences between traditional paper-based testing and CAT. (AEF) VL - 38 ER - TY - CONF T1 - Computerized adaptive testing with multiple form structures T2 - Paper presented at the annual meeting of the Psychometric Society Y1 - 1998 A1 - Armstrong, R. D. A1 - Jones, D. H. A1 - Berliner, N.
JF - Paper presented at the annual meeting of the Psychometric Society CY - Urbana, IL ER - TY - CONF T1 - Constructing adaptive tests to parallel conventional programs T2 - Paper presented at the annual meeting of the National Council on Measurement in Education Y1 - 1998 A1 - Thompson, T. A1 - Davey, T. A1 - Nering, M. L. JF - Paper presented at the annual meeting of the National Council on Measurement in Education CY - San Diego ER - TY - CONF T1 - Constructing passage-based tests that parallel conventional programs T2 - Paper presented at the annual meeting of the Psychometric Society Y1 - 1998 A1 - Thompson, T. A1 - Davey, T. A1 - Nering, M. L. JF - Paper presented at the annual meeting of the Psychometric Society CY - Urbana, IL ER - TY - CONF T1 - Controlling item exposure and maintaining item security T2 - Paper presented at an Educational Testing Service-sponsored colloquium entitled “Computer-based testing: Building the foundations for future assessments” Y1 - 1998 A1 - Davey, T. A1 - Nering, M. L. JF - Paper presented at an Educational Testing Service-sponsored colloquium entitled “Computer-based testing: Building the foundations for future assessments” CY - Philadelphia PA ER - TY - JOUR T1 - Controlling item exposure conditional on ability in computerized adaptive testing JF - Journal of Educational & Behavioral Statistics Y1 - 1998 A1 - Stocking, M. L. A1 - Lewis, C. AB - The interest in the application of large-scale adaptive testing for secure tests has served to focus attention on issues that arise when theoretical advances are made operational. One such issue is that of ensuring item and pool security in the continuous testing environment made possible by the computerized administration of a test, as opposed to the more periodic testing environment typically used for linear paper-and-pencil tests. This article presents a new method of controlling the exposure rate of items conditional on ability level in this continuous testing environment. The properties of such conditional control on the exposure rates of items, when used in conjunction with a particular adaptive testing algorithm, are explored through studies with simulated data. VL - 23 N1 - American Educational Research Assn, US ER - TY - CONF T1 - Calibration of CAT items administered online for classification: Assumption of local independence T2 - Paper presented at the annual meeting of the Psychometric Society Y1 - 1997 A1 - Spray, J. A. A1 - Parshall, C. G. A1 - Huang, C.-H. JF - Paper presented at the annual meeting of the Psychometric Society CY - Gatlinburg TN ER - TY - ABST T1 - CAST 5 for Windows users' guide Y1 - 1997 A1 - J. R. McBride A1 - Cooper, R. R. CY - Contract No. MDA903-93-D-0032, DO 0054. Alexandria, VA: Human Resources Research Organization ER - TY - CHAP T1 - CAT-ASVAB cost and benefit analyses Y1 - 1997 A1 - Wise, L. L. A1 - Curran, L. T. A1 - J. R. McBride CY - W. A. Sands, B. K. Waters, and J. R. McBride (Eds.), Computer adaptive testing: From inquiry to operation (pp. 227-236). Washington, DC: American Psychological Association. ER - TY - CHAP T1 - CAT-ASVAB operational test and evaluation Y1 - 1997 A1 - Moreno, K. E. CY - W. A. Sands, B. K. Waters, and J. R. McBride (Eds.), Computerized adaptive testing: From inquiry to operation (pp. 199-205). Washington DC: American Psychological Association. ER - TY - ABST T1 - CATSIB: A modified SIBTEST procedure to detect differential item functioning in computerized adaptive tests (Research report) Y1 - 1997 A1 - Nandakumar, R. A1 - Roussos, L.
CY - Newtown, PA: Law School Admission Council ER - TY - CONF T1 - Comparability and validity of computerized adaptive testing with the MMPI-2 using a clinical sample T2 - Paper presented at the 32nd Annual Symposium and Recent Developments in the use of the MMPI-2 and MMPI-A. Minneapolis MN. Y1 - 1997 A1 - Handel, R. W. A1 - Ben-Porath, Y. S. A1 - Watt, M. JF - Paper presented at the 32nd Annual Symposium and Recent Developments in the use of the MMPI-2 and MMPI-A. Minneapolis MN. ER - TY - JOUR T1 - A comparison of maximum likelihood estimation and expected a posteriori estimation in computerized adaptive testing using the generalized partial credit model JF - Dissertation Abstracts International: Section B: the Sciences & Engineering Y1 - 1997 A1 - Chen, S-K. KW - computerized adaptive testing AB - A simulation study was conducted to investigate the application of expected a posteriori (EAP) trait estimation in computerized adaptive tests (CAT) based on the generalized partial credit model (Muraki, 1992), and to compare the performance of EAP with maximum likelihood trait estimation (MLE). The performance of EAP was evaluated under different conditions: the number of quadrature points (10, 20, and 30), and the type of prior distribution (normal, uniform, negatively skewed, and positively skewed). The relative performance of the MLE and EAP estimation methods were assessed under two distributional forms of the latent trait, one normal and the other negatively skewed. Also, both the known item parameters and estimated item parameters were employed in the simulation study. Descriptive statistics, correlations, scattergrams, accuracy indices, and audit trails were used to compare the different methods of trait estimation in CAT. The results showed that, regardless of the latent trait distribution, MLE and EAP with a normal prior, a uniform prior, or the prior that matches the latent trait distribution using either 20 or 30 quadrature points provided relatively accurate estimation in CAT based on the generalized partial credit model. However, EAP using only 10 quadrature points did not work well in the generalized partial credit CAT. Also, the study found that increasing the number of quadrature points from 20 to 30 did not increase the accuracy of EAP estimation. Therefore, it appears 20 or more quadrature points are sufficient for accurate EAP estimation. The results also showed that EAP with a negatively skewed prior and positively skewed prior performed poorly for the normal data set, and EAP with positively skewed prior did not provide accurate estimates for the negatively skewed data set. Furthermore, trait estimation in CAT using estimated item parameters produced results similar to those obtained using known item parameters. In general, when at least 20 quadrature points are used, EAP estimation with a normal prior, a uniform prior or the prior that matches the latent trait distribution appears to be a good alternative to MLE in the application of polytomous CAT based on the generalized partial credit model. (PsycINFO Database Record (c) 2003 APA, all rights reserved). VL - 58 ER - TY - CONF T1 - A comparison of testlet-based test designs for computerized adaptive testing T2 - Paper presented at the meeting of American Educational Research Association Y1 - 1997 A1 - Schnipke, D. L. A1 - Reese, L. M. 
JF - Paper presented at the meeting of American Educational Research Association CY - Chicago, IL ER - TY - CONF T1 - Computer assembly of tests so that content reigns supreme T2 - Paper presented at the annual meeting of the National Council on Measurement in Education Y1 - 1997 A1 - Case, S. M. A1 - Luecht, RM JF - Paper presented at the annual meeting of the National Council on Measurement in Education CY - Chicago IL ER - TY - JOUR T1 - Computer-adaptive testing of listening comprehension: A blueprint of CAT Development JF - The Language Teacher Online 21 Y1 - 1997 A1 - Dunkel, P. VL - no. 10 ER - TY - JOUR T1 - Computerized adaptive and fixed-item testing of music listening skill: A comparison of efficiency, precision, and concurrent validity JF - Journal of Educational Measurement Y1 - 1997 A1 - Vispoel, W. P. A1 - Wang, T. VL - 34 ER - TY - JOUR T1 - Computerized adaptive and fixed-item testing of music listening skill: A comparison of efficiency, precision, and concurrent validity JF - Journal of Educational Measurement Y1 - 1997 A1 - Vispoel, W. P. A1 - Wang, T. A1 - Bleiler, T. VL - 34 ER - TY - BOOK T1 - Computerized adaptive testing: From inquiry to operation Y1 - 1997 A1 - Sands, W. A. A1 - B. K. Waters A1 - J. R. McBride KW - computerized adaptive testing AB - (from the cover) This book traces the development of computerized adaptive testing (CAT) from its origins in the 1960s to its integration with the Armed Services Vocational Aptitude Battery (ASVAB) in the 1990s. A paper-and-pencil version of the battery (P&P-ASVAB) has been used by the Defense Department since the 1970s to measure the abilities of applicants for military service. The test scores are used both for initial qualification and for classification into entry-level training opportunities. /// This volume provides the developmental history of the CAT-ASVAB through its various stages in the Joint-Service arena. Although the majority of the book concerns the myriad technical issues that were identified and resolved, information is provided on various political and funding support challenges that were successfully overcome in developing, testing, and implementing the battery into one of the nation's largest testing programs. The book provides useful information to professionals in the testing community and everyone interested in personnel assessment and evaluation. (PsycINFO Database Record (c) 2004 APA, all rights reserved). PB - American Psychological Association CY - Washington, D.C., USA N1 - xvii pp. ER - TY - JOUR T1 - A computerized adaptive testing system for speech discrimination measurement: The Speech Sound Pattern Discrimination Test JF - Journal of the Acoustical Society of America Y1 - 1997 A1 - Bochner, J. A1 - Garrison, W. A1 - Palmer, L. A1 - MacKenzie, D. A1 - Braveman, A. KW - *Diagnosis, Computer-Assisted KW - *Speech Discrimination Tests KW - *Speech Perception KW - Adolescent KW - Adult KW - Audiometry, Pure-Tone KW - Human KW - Middle Age KW - Psychometrics KW - Reproducibility of Results AB - A computerized, adaptive test-delivery system for the measurement of speech discrimination, the Speech Sound Pattern Discrimination Test, is described and evaluated.
Using a modified discrimination task, the testing system draws on a pool of 130 items spanning a broad range of difficulty to estimate an examinee's location along an underlying continuum of speech processing ability, yet does not require the examinee to possess a high level of English language proficiency. The system is driven by a mathematical measurement model which selects only test items which are appropriate in difficulty level for a given examinee, thereby individualizing the testing experience. Test items were administered to a sample of young deaf adults, and the adaptive testing system evaluated in terms of respondents' sensory and perceptual capabilities, acoustic and phonetic dimensions of speech, and theories of speech perception. Data obtained in this study support the validity, reliability, and efficiency of this test as a measure of speech processing ability. VL - 101 N1 - 972575560001-4966Journal Article ER - TY - CONF T1 - Computerized adaptive testing through the World Wide Web Y1 - 1997 A1 - Shermis, M. D. ER - TY - ABST T1 - Computerized adaptive testing through the World Wide Web Y1 - 1997 A1 - Shermis, M. D. A1 - Mzumara, H. A1 - Brown, M. A1 - Lillig, C. CY - (ERIC No. ED414536) ER - TY - CHAP T1 - Computerized adaptive testing using the partial credit model for attitude measurement Y1 - 1997 A1 - Baek, S. G. CY - M. Wilson, G. Engelhard Jr and K. Draney (Eds.), Objective measurement: Theory into practice, volume 4. Norwood NJ: Ablex. ER - TY - CONF T1 - Controlling test and computer anxiety: Test performance under CAT and SAT conditions T2 - Paper presented at the annual meeting of the National Council on Measurement in Education Y1 - 1997 A1 - Shermis, M. D. A1 - Mzumara, H. A1 - Bublitz, S. JF - Paper presented at the annual meeting of the National Council on Measurement in Education CY - Chicago IL ER - TY - CHAP T1 - Current and future challenges Y1 - 1997 A1 - Segall, D. O. A1 - Moreno, K. E. CY - W. A. Sands, B. K. Waters, and J. R. McBride (Eds.). Computerized adaptive testing: From inquiry to operation (pp 257-269). Washington DC: American Psychological Association. ER - TY - CONF T1 - Can examinees use a review option to positively bias their scores on a computerized adaptive test? Paper presented at the annual meeting of the National council on Measurement in Education, New York T2 - Paper presented at the annual meeting of the National council on Measurement in Education Y1 - 1996 A1 - Rocklin, T. R. A1 - Vispoel, W. P. A1 - Wang, T. A1 - Bleiler, T. L. JF - Paper presented at the annual meeting of the National council on Measurement in Education CY - New York NY ER - TY - BOOK T1 - A comparison of adaptive self-referenced testing and classical approaches to the measurement of individual change Y1 - 1996 A1 - VanLoy, W. J. CY - Unpublished doctoral dissertation, University of Minnesota ER - TY - JOUR T1 - Comparison of SPRT and sequential Bayes procedures for classifying examinees into two categories using a computerized test JF - Journal of Educational & Behavioral Statistics Y1 - 1996 A1 - Spray, J. A. A1 - Reckase, M. D. VL - 21 ER - TY - CONF T1 - A comparison of the traditional maximum information method and the global information method in CAT item selection T2 - annual meeting of the National Council on Measurement in Education Y1 - 1996 A1 - Tang, K. L. 
KW - computerized adaptive testing KW - item selection JF - annual meeting of the National Council on Measurement in Education CY - New York, NY USA ER - TY - JOUR T1 - Computerized adaptive skill assessment in a statewide testing program JF - Journal of Research on Computing in Education Y1 - 1996 A1 - Shermis, M. D. A1 - Stemmer, P. M. A1 - Webb, P. M. VL - 29(1) ER - TY - ABST T1 - Computerized adaptive testing for classifying examinees into three categories (Measurement and Research Department Rep 96-3) Y1 - 1996 A1 - Theo Eggen A1 - Straetmans, G. J. J. M. CY - Arnhem, The Netherlands: Cito N1 - #EG96-3. [Reprinted in Chapter 5 in #EG04-01] ER - TY - JOUR T1 - Computerized adaptive testing for reading assessment and diagnostic assessment JF - Journal of Developmental Education Y1 - 1996 A1 - Shermis, M. D. A1 - et al. ER - TY - JOUR T1 - Computerized adaptive testing for the national certification examination JF - AANA.J Y1 - 1996 A1 - Bergstrom, Betty A. VL - 64 ER - TY - JOUR T1 - Computerized adaptive testing for the national certification examination Y1 - 1996 A1 - Bergstrom, Betty A. ER - TY - CONF T1 - Computing scores for incomplete GRE General computer adaptive tests T2 - Paper presented at the annual meeting of the National Council on Measurement in Education Y1 - 1996 A1 - Slater, S. C. A1 - Schaffer, G.A. JF - Paper presented at the annual meeting of the National Council on Measurement in Education CY - New York NY ER - TY - JOUR T1 - Conducting self-adapted testing using MicroCAT JF - Educational and Psychological Measurement Y1 - 1996 A1 - Roos, L. L. A1 - Wise, S. L. A1 - Yoes, M. E. A1 - Rocklin, T. R. VL - 56 ER - TY - CONF T1 - Constructing adaptive tests to parallel conventional programs T2 - Paper presented at the annual meeting of the American Educational Research Association Y1 - 1996 A1 - Davey, T. A1 - Thomas, L. JF - Paper presented at the annual meeting of the American Educational Research Association CY - New York ER - TY - CHAP T1 - A content-balanced adaptive testing algorithm for computer-based training systems Y1 - 1996 A1 - Huang, S. X. CY - Frasson, C., Gauthier, G., and Lesgold, A. (Eds.), Intelligent Tutoring Systems, Third International Conference, ITS'96, Montréal, Canada, June 1996, Proceedings. Lecture Notes in Computer Science 1086. Berlin Heidelberg: Springer-Verlag, 306-314. ER - TY - CONF T1 - A critical analysis of the argument for and against item review in computerized adaptive testing T2 - Paper presented at the annual meeting of the National Council on Measurement in Education Y1 - 1996 A1 - Wise, S. L. JF - Paper presented at the annual meeting of the National Council on Measurement in Education CY - New York ER - TY - CONF T1 - Current research in computer-based testing for personnel selection and classification in the United States T2 - Invited address to the Centre for Recruitment and Selection, Belgian Armed Forces Y1 - 1996 A1 - J. R. McBride JF - Invited address to the Centre for Recruitment and Selection, Belgian Armed Forces ER - TY - JOUR T1 - Comparability and validity of computerized adaptive testing with the MMPI-2 JF - Journal of Personality Assessment Y1 - 1995 A1 - Roper, B. L. A1 - Ben-Porath, Y. S. A1 - Butcher, J. N. AB - The comparability and validity of a computerized adaptive (CA) Minnesota Multiphasic Personality Inventory-2 (MMPI-2) were assessed in a sample of 571 undergraduate college students.
The CA MMPI-2 adaptively administered Scales L, F, the 10 clinical scales, and the 15 content scales, utilizing the countdown method (Butcher, Keller, & Bacon, 1985). All subjects completed the MMPI-2 twice, in one of three experimental conditions: booklet test-retest, booklet-CA, and conventional computerized (CC)-CA. Profiles across administration modalities show a high degree of similarity, providing evidence for the comparability of the three forms. Correlations between MMPI-2 scales and other psychometric measures (Beck Depression Inventory; Symptom Checklist-Revised; State-Trait Anxiety and Anger Scales; and the Anger Expression Scale) support the validity of the CA MMPI-2. Substantial item savings may be realized with the implementation of the countdown procedure. VL - 65 SN - 0022-3891 (Print) N1 - J Pers Assess. 1995 Oct;65(2):358-71. ER - TY - CONF T1 - Comparability studies for the GRE CAT General Test and the NCLEX using CAT T2 - Paper presented at the annual meeting of the National Council on Measurement in Education Y1 - 1995 A1 - Eignor, D. R. A1 - Schaffer, G. A. JF - Paper presented at the annual meeting of the National Council on Measurement in Education CY - San Francisco ER - TY - CONF T1 - A comparison of classification agreement between adaptive and full-length test under the 1-PL and 2-PL models T2 - Paper presented at the annual meeting of the American Educational Research Association Y1 - 1995 A1 - Lewis, M. J. A1 - Subhiyah, R. G. A1 - Morrison, C. A. JF - Paper presented at the annual meeting of the American Educational Research Association CY - San Francisco CA N1 - (cited in #RE98311) ER - TY - CONF T1 - A comparison of gender differences on paper-and-pencil and computer-adaptive versions of the Graduate Record Examination T2 - Paper presented at the annual meeting of the American Educational Research Association Y1 - 1995 A1 - Bridgeman, B. A1 - Schaeffer, G. A. JF - Paper presented at the annual meeting of the American Educational Research Association CY - San Francisco CA ER - TY - JOUR T1 - A comparison of item selection routines in linear and adaptive tests JF - Journal of Educational Measurement Y1 - 1995 A1 - Schnipke, D. L. A1 - Green, B. F. VL - 32 ER - TY - CONF T1 - A comparison of two IRT-based models for computerized mastery testing when item parameter estimates are uncertain T2 - Paper presented at the annual meeting of the American Educational Research Association Y1 - 1995 A1 - Way, W. D. A1 - Lewis, C. A1 - Smith, R. L. JF - Paper presented at the annual meeting of the American Educational Research Association CY - San Francisco ER - TY - CONF T1 - Computer adaptive testing in a medical licensure setting: A comparison of outcomes under the one- and two-parameter logistic models T2 - Paper presented at the annual meeting of the American Educational Research Association Y1 - 1995 A1 - Morrison, C. A. A1 - Nungester, R. J. JF - Paper presented at the annual meeting of the American Educational Research Association CY - San Francisco ER - TY - JOUR T1 - Computer-adaptive testing: A new breed of assessment JF - Journal of the American Dietetic Association Y1 - 1995 A1 - Ruiz, B. A1 - Fitz, P. A. A1 - Lewis, C. A1 - Reidy, C.
VL - 95 ER - TY - JOUR T1 - Computer-adaptive testing: CAT: A Bayesian maximum-falsification approach JF - Rasch Measurement Transactions Y1 - 1995 A1 - Linacre, J. M. VL - 9 ER - TY - BOOK T1 - Computerized adaptive attitude testing using the partial credit model Y1 - 1995 A1 - Baek, S. G. CY - Dissertation Abstracts International-A, 55(7-A), 1922 (UMI No. AAM9430378) ER - TY - JOUR T1 - Computerized adaptive testing: Tracking candidate response patterns JF - Journal of Educational Computing Research Y1 - 1995 A1 - Lunz, M. E. A1 - Bergstrom, Betty A. AB - Tracked the effect of candidate response patterns on a computerized adaptive test. Data were from a certification examination in laboratory science administered in 1992 to 155 candidates, using a computerized adaptive algorithm. The 90-item certification examination was divided into 9 units of 10 items each to track the pattern of initial responses and response alterations on ability estimates and test precision across the 9 test units. The precision of the test was affected most by response alterations during early segments of the test. While candidates generally benefited from altering responses, individual candidates showed different patterns of response alterations across test segments. Test precision was minimally affected, suggesting that the tailoring of computerized adaptive testing is minimally affected by response alterations. (PsycINFO Database Record (c) 2002 APA, all rights reserved). VL - 13 N1 - Baywood Publishing, US ER - TY - JOUR T1 - Computerized Adaptive Testing With Polytomous Items JF - Applied Psychological Measurement Y1 - 1995 A1 - Dodd, B. G. A1 - De Ayala, R. J. A1 - Koch. W.R., VL - 19 IS - 1 ER - TY - JOUR T1 - Computerized adaptive testing with polytomous items JF - Applied Psychological Measurement Y1 - 1995 A1 - Dodd, B. G. A1 - De Ayala, R. J., A1 - Koch, W. R. AB - Discusses polytomous item response theory models and the research that has been conducted to investigate a variety of possible operational procedures (item bank, item selection, trait estimation, stopping rule) for polytomous model-based computerized adaptive testing (PCAT). Studies are reviewed that compared PCAT systems based on competing item response theory models that are appropriate for the same measurement objective, as well as applications of PCAT in marketing and educational psychology. Directions for future research using PCAT are suggested. VL - 19 ER - TY - CHAP T1 - Computerized testing for licensure Y1 - 1995 A1 - Vale, C. D. CY - J. Impara (ed.), Licensure testing: Purposes, procedures, and Practices (pp. 291-320). Lincoln NE: Buros Institute of Mental Measurements. ER - TY - ABST T1 - Controlling item exposure conditional on ability in computerized adaptive testing (Research Report 95-25) Y1 - 1995 A1 - Stocking, M. L. A1 - Lewis, C. CY - Princeton NJ: Educational Testing Service. N1 - #ST95-25; also see #ST98057 ER - TY - BOOK T1 - CAT software system [computer program Y1 - 1994 A1 - Gershon, R. C. CY - Chicago IL: Computer Adaptive Technologies ER - TY - ABST T1 - CAT-GATB simulation studies Y1 - 1994 A1 - Segall, D. O. CY - San Diego CA: Navy Personnel Research and Development Center ER - TY - CONF T1 - Comparing computerized adaptive and self-adapted tests: The influence of examinee achievement locus of control T2 - Paper presented at the annual meeting of the National Council on Measurement in Education Y1 - 1994 A1 - Wise, S. L. A1 - Roos, L. L. A1 - Plake, B. S. 
JF - Paper presented at the annual meeting of the National Council on Measurement in Education CY - New Orleans LA ER - TY - JOUR T1 - A Comparison of Item Calibration Media in Computerized Adaptive Testing JF - Applied Psychological Measurement Y1 - 1994 A1 - Hetter, R. D. A1 - Segall, D. O. A1 - Bloxom, B. M. VL - 18 IS - 3 ER - TY - JOUR T1 - A comparison of item calibration media in computerized adaptive tests JF - Applied Psychological Measurement Y1 - 1994 A1 - Hetter, R. D. A1 - Segall, D. O. A1 - Bloxom, B. M. VL - 18 ER - TY - JOUR T1 - Computer adaptive testing JF - International journal of Educational Research Y1 - 1994 A1 - Lunz, M. E. A1 - Bergstrom, Betty A. A1 - Gershon, R. C. VL - 6 ER - TY - JOUR T1 - Computer adaptive testing: A shift in the evaluation paradigm JF - Educational Technology Systems Y1 - 1994 A1 - Carlson, R. VL - 22 (3) ER - TY - JOUR T1 - Computer adaptive testing: Assessment of the future JF - Curriculum/Technology Quarterly Y1 - 1994 A1 - Diones, R. A1 - Everson, H. VL - 4 (2) ER - TY - CONF T1 - Computerized adaptive testing exploring examinee response time using hierarchical linear modeling T2 - Paper presented at the annual meeting of the American Educational Research Association Y1 - 1994 A1 - Bergstrom, B. A1 - Gershon, R. C. JF - Paper presented at the annual meeting of the American Educational Research Association CY - New Orleans LA N1 - ERIC No. ED 400 286). ER - TY - JOUR T1 - Computerized adaptive testing for licensure and certification JF - CLEAR Exam Review Y1 - 1994 A1 - Bergstrom, Betty A. A1 - Gershon, R. C. VL - Winter 1994 ER - TY - JOUR T1 - Computerized adaptive testing: Revolutionizing academic assessment JF - Community College Journal Y1 - 1994 A1 - Smittle, P. VL - 65 (1) ER - TY - ABST T1 - Computerized mastery testing using fuzzy set decision theory (Research Report 94-37) Y1 - 1994 A1 - Du, Y. A1 - Lewis, C. A1 - Pashley, P. J. CY - Princeton NJ: Educational Testing Service ER - TY - ABST T1 - Computerized Testing (Research Report 94-22). Y1 - 1994 A1 - Oltman, P. K. CY - Princeton NJ: Educational Testing Service ER - TY - JOUR T1 - Computerized-adaptive and self-adapted music-listening tests: Features and motivational benefits JF - Applied Measurement in Education Y1 - 1994 A1 - Vispoel, W. P., A1 - Coffman, D. D. VL - 7 ER - TY - ABST T1 - Case studies in computer adaptive test design through simulation (Research Report RR-93-56) Y1 - 1993 A1 - Eignor, D. R. A1 - Stocking, M. L. A1 - Way, W. D. A1 - Steffen, M. CY - Princeton NJ: Educational Testing Service N1 - #EI93-56 (also presented at the 1993 National Council on Measurement in Education meeting in Atlanta GA) ER - TY - ABST T1 - Case studies in computerized adaptive test design through simulation (Research Report 93-56) Y1 - 1993 A1 - Eignor, D. R. A1 - Way, W. D. A1 - Stocking, M. A1 - Steffen, M. CY - Princeton NJ: Educational Testing Service ER - TY - JOUR T1 - Comparability and validity of computerized adaptive testing with the MMPI-2 JF - Dissertation Abstracts International Y1 - 1993 A1 - Roper, B. L. KW - computerized adaptive testing VL - 53 ER - TY - BOOK T1 - A comparison of computer adaptive test administration methods Y1 - 1993 A1 - Dolan, S. CY - Unpublished doctoral dissertation, University of Chicago ER - TY - CONF T1 - Comparison of SPRT and sequential Bayes procedures for classifying examinees into two categories using an adaptive test T2 - Unpublished manuscript. ( Y1 - 1993 A1 - Spray, J. A. A1 - Reckase, M. D. JF - Unpublished manuscript. 
( ER - TY - JOUR T1 - Computer adaptive testing: A comparison of four item selection strategies when used with the golden section search strategy for estimating ability JF - Dissertation Abstracts International Y1 - 1993 A1 - Carlson, R. D. KW - computerized adaptive testing VL - 54 ER - TY - JOUR T1 - Computer adaptive testing: A new era JF - Journal of Developmental Education Y1 - 1993 A1 - Smittle, P. VL - 17 (1) ER - TY - JOUR T1 - Computerized adaptive and fixed-item versions of the ITED Vocabulary test JF - Educational and Psychological Measurement Y1 - 1993 A1 - Vispoel, W. P. VL - 53 ER - TY - CONF T1 - Computerized adaptive testing in computer science: assessing student programming abilities T2 - Proceedings of the twenty-fourth SIGCSE Technical Symposium on Computer Science Education Y1 - 1993 A1 - Syang, A. A1 - Dale, N.B. JF - Proceedings of the twenty-fourth SIGCSE Technical Symposium on Computer Science Education CY - Indianapolis IN ER - TY - JOUR T1 - Computerized adaptive testing in instructional settings JF - Educational Technology Research and Development Y1 - 1993 A1 - Welch, R. E., A1 - Frick, T. VL - 41(3) ER - TY - BOOK T1 - Computerized adaptive testing strategies: Golden section search, dichotomous search, and Z-score strategies (Doctoral dissertation, Iowa State University, 1990) Y1 - 1993 A1 - Xiao, B. CY - Dissertation Abstracts International, 54-03B, 1720 ER - TY - JOUR T1 - Computerized adaptive testing: the future is upon us JF - Nurs Health Care Y1 - 1993 A1 - Halkitis, P. N. A1 - Leahy, J. M. KW - *Computer-Assisted Instruction KW - *Education, Nursing KW - *Educational Measurement KW - *Reaction Time KW - Humans KW - Pharmacology/education KW - Psychometrics VL - 14 SN - 0276-5284 (Print) N1 - Halkitis, P NLeahy, J MUnited statesNursing & health care : official publication of the National League for NursingNurs Health Care. 1993 Sep;14(7):378-85. ER - TY - JOUR T1 - Computerized adaptive testing using the partial credit model: Effects of item pool characteristics and different stopping rules JF - Educational and Psychological Measurement Y1 - 1993 A1 - Dodd, B. G. A1 - Koch, W. R. A1 - De Ayala, R. J., AB - Simulated datasets were used to research the effects of the systematic variation of three major variables on the performance of computerized adaptive testing (CAT) procedures for the partial credit model. The three variables studied were the stopping rule for terminating the CATs, item pool size, and the distribution of the difficulty of the items in the pool. Results indicated that the standard error stopping rule performed better across the variety of CAT conditions than the minimum information stopping rule. In addition it was found that item pools that consisted of as few as 30 items were adequate for CAT provided that the item pool was of medium difficulty. The implications of these findings for implementing CAT systems based on the partial credit model are discussed. VL - 53 ER - TY - JOUR T1 - Computerized mastery testing using fuzzy set decision theory JF - Applied Measurement in Education Y1 - 1993 A1 - Du, Y. A1 - Lewis, C. A1 - Pashley, P. J. VL - 6 N1 - (Also Educational Testing Service Research Report 94-37) ER - TY - ABST T1 - Controlling item exposure rates in a realistic adaptive testing paradigm (Research Report 93-2) Y1 - 1993 A1 - Stocking, M. L. 
CY - Princeton NJ: Educational Testing Service ER - TY - JOUR T1 - CAT-ASVAB precision JF - Proceedings of the 34th Annual Conference of the Military Testing Association Y1 - 1992 A1 - Moreno, K. E., A1 - Segall, D. O. VL - 1 ER - TY - CONF T1 - A comparison of computerized adaptive and paper-and-pencil versions of the national registered nurse licensure examination T2 - Paper presented at the annual meeting of the American Educational Research Association Y1 - 1992 A1 - A Zara JF - Paper presented at the annual meeting of the American Educational Research Association CY - San Francisco CA ER - TY - CONF T1 - Comparison of item targeting strategies for pass/fail adaptive tests T2 - Paper presented at the annual meeting of the American Educational Research Association Y1 - 1992 A1 - Bergstrom, B. A1 - Gershon, R. C. JF - Paper presented at the annual meeting of the American Educational Research Association CY - San Francisco CA N1 - (ERIC NO. ED 400 287). ER - TY - BOOK T1 - A comparison of methods for adaptive estimation of a multidimensional trait Y1 - 1992 A1 - Tam, S. S. CY - Unpublished doctoral dissertation, Columbia University ER - TY - JOUR T1 - A comparison of self-adapted and computerized adaptive achievement tests JF - Journal of Educational Measurement Y1 - 1992 A1 - Wise, S. L. A1 - Plake, S. S A1 - Johnson, P. L. A1 - Roos, S. L. VL - 29 ER - TY - JOUR T1 - A comparison of the partial credit and graded response models in computerized adaptive testing JF - Applied Measurement in Education Y1 - 1992 A1 - De Ayala, R. J. A1 - Dodd, B. G. A1 - Koch, W. R. VL - 5 ER - TY - JOUR T1 - A comparison of the performance of simulated hierarchical and linear testlets JF - Journal of Educational Measurement Y1 - 1992 A1 - Wainer, H., A1 - Kaplan, B. A1 - Lewis, C. VL - 29 ER - TY - BOOK T1 - Computer adaptive versus paper-and-pencil tests Y1 - 1992 A1 - Bergstrom, B. CY - Unpublished doctoral dissertation, University of Chicago ER - TY - JOUR T1 - Computer-based adaptive testing in music research and instruction JF - Psychomusicology Y1 - 1992 A1 - Bowers, D. R. VL - 10 ER - TY - ABST T1 - Computerized adaptive assessment of cognitive abilities among disabled adults Y1 - 1992 A1 - Engdahl, B. CY - ERIC Document No ED301274 ER - TY - JOUR T1 - Computerized adaptive mastery tests as expert systems JF - Journal of Educational Computing Research Y1 - 1992 A1 - Frick, T. W. VL - 8 ER - TY - JOUR T1 - Computerized adaptive mastery tests as expert systems JF - Journal of Educational Computing Research Y1 - 1992 A1 - Frick, T. W. VL - 8(2) ER - TY - JOUR T1 - Computerized adaptive testing for NCLEX-PN JF - Journal of Practical Nursing Y1 - 1992 A1 - Fields, F. A. KW - *Licensure KW - *Programmed Instruction KW - Educational Measurement/*methods KW - Humans KW - Nursing, Practical/*education VL - 42 SN - 0022-3867 (Print) N1 - Fields, F AUnited statesThe Journal of practical nursingJ Pract Nurs. 1992 Jun;42(2):8-10. ER - TY - JOUR T1 - Computerized adaptive testing: Its potential substantive contribution to psychological research and assessment JF - Current Directions in Psychological Science Y1 - 1992 A1 - Embretson, S. E. VL - 1 ER - TY - JOUR T1 - Computerized adaptive testing of music-related skills JF - Bulletin of the Council for Research in Music Education Y1 - 1992 A1 - Vispoel, W. P., A1 - Coffman, D. D. VL - 112 ER - TY - JOUR T1 - Computerized adaptive testing with different groups JF - Educational Measurement: Issues and Practice Y1 - 1992 A1 - Legg, S. M., A1 - Buhr, D. C. 
VL - 11 (2) ER - TY - CONF T1 - Computerized adaptive testing with the MMPI-2: Reliability, validity, and comparability to paper and pencil administration T2 - Paper presented at the 27th Annual Symposium on Recent Developments in the MMPI/MMPI-2 Y1 - 1992 A1 - Ben-Porath, Y. S. A1 - Roper, B. L. JF - Paper presented at the 27th Annual Symposium on Recent Developments in the MMPI/MMPI-2 CY - Minneapolis MN ER - TY - JOUR T1 - Computerized mastery testing with nonequivalent testlets JF - Applied Psychological Measurement Y1 - 1992 A1 - Sheehan, K. A1 - Lewis, C. VL - 16 ER - TY - JOUR T1 - Computerized Mastery Testing With Nonequivalent Testlets JF - Applied Psychological Measurement Y1 - 1992 A1 - Sheehan, K. A1 - Lewis, C. VL - 16 IS - 1 ER - TY - JOUR T1 - Confidence in pass/fail decisions for computer adaptive and paper and pencil examinations JF - Evaluation and the Health Professions Y1 - 1992 A1 - Bergstrom, Betty A. VL - 15(4) ER - TY - JOUR T1 - Confidence in pass/fail decisions for computer adaptive and paper and pencil examinations JF - Evaluation and the Health Professions Y1 - 1992 A1 - Bergstrom, Betty A. A1 - Lunz, M. E. AB - Compared the level of confidence in pass/fail decisions obtained with computer adaptive tests (CADTs) and pencil-and-paper tests (PPTs). 600 medical technology students took a variable-length CADT and 2 fixed-length PPTs. The CADT was stopped when the examinee ability estimate was either 1.3 times the standard error of measurement above or below the pass/fail point or when a maximum test length was reached. Results show that greater confidence in the accuracy of the pass/fail decisions was obtained for more examinees when the CADT implemented a 90% confidence stopping rule than with PPTs of comparable test length. (PsycINFO Database Record (c) 2002 APA, all rights reserved). VL - 15 N1 - Sage Publications, US ER - TY - JOUR T1 - Confidence in pass/fail decisions for computer adaptive and paper and pencil examinations JF - Evaluation and the Health Professions Y1 - 1992 A1 - Bergstrom, Betty A. VL - 15 IS - 4 ER - TY - ABST T1 - Collected works on the legal aspects of computerized adaptive testing Y1 - 1991 A1 - Stenson, H. A1 - Graves, P. A1 - Gardiner, J. A1 - Dally, L. CY - Chicago, IL: National Council of State Boards of Nursing, Inc ER - TY - JOUR T1 - Comparability of computerized adaptive and conventional testing with the MMPI-2 JF - Journal of Personality Assessment Y1 - 1991 A1 - Roper, B. L. A1 - Ben-Porath, Y. S. A1 - Butcher, J. N. AB - A computerized adaptive version and the standard version of the Minnesota Multiphasic Personality Inventory-2 (MMPI-2) were administered 1 week apart to a sample of 155 college students to assess the comparability of the two versions. The countdown method was used to adaptively administer Scales L, F, the 10 clinical scales, and the 15 new content scales. Profiles across administration modalities show a high degree of similarity, providing evidence for the comparability of computerized adaptive and conventional testing with the MMPI-2. Substantial item savings were found with the adaptive version. Future directions in the study of adaptive testing with the MMPI-2 are discussed. VL - 57 SN - 0022-3891 (Print) N1 - J Pers Assess. 1991 Oct;57(2):278-90. ER - TY - JOUR T1 - Comparability of decisions for computer adaptive and written examinations JF - Journal of Allied Health Y1 - 1991 A1 - Lunz, M. E. A1 - Bergstrom, Betty A.
VL - 20 ER - TY - JOUR T1 - A comparison of paper-and-pencil, computer-administered, computerized feedback, and computerized adaptive testing methods for classroom achievement testing JF - Dissertation Abstracts International Y1 - 1991 A1 - Kuan, Tsung Hao KW - computerized adaptive testing VL - 52 ER - TY - JOUR T1 - A comparison of procedures for content-sensitive item selection JF - Applied Measurement in Education Y1 - 1991 A1 - Kingsbury, G. G. ER - TY - JOUR T1 - A comparison of procedures for content-sensitive item selection in computerized adaptive tests JF - Applied Measurement in Education Y1 - 1991 A1 - Kingsbury, G. G. A1 - A Zara VL - 4 ER - TY - ABST T1 - Comparisons of computer adaptive and pencil and paper tests Y1 - 1991 A1 - Bergstrom, Betty A. A1 - Lunz, M. E. CY - Chicago IL: American Society of Clinical Pathologists N1 - Unpublished manuscript. ER - TY - CHAP T1 - Computerized adaptive testing: Theory, applications, and standards Y1 - 1991 A1 - Hambleton, R. K. A1 - Zaal, J. N. A1 - Pieters, J. P. M. CY - R. K. Hambleton and J. N. Zaal (Eds.), Advances in educational and psychological testing: Theory and Applications (pp. 341-366). Boston: Kluwer. ER - TY - CONF T1 - Confidence in pass/fail decisions for computer adaptive and paper and pencil examinations T2 - Paper presented at the annual meeting of the American Educational Research Association Y1 - 1991 A1 - Bergstrom, B. B A1 - Lunz, M. E. JF - Paper presented at the annual meeting of the American Educational Research Association CY - Chicago IL ER - TY - ABST T1 - Construction and validation of the SON-R 5-17, the Snijders-Oomen non-verbal intelligence test Y1 - 1991 A1 - Laros, J. A. A1 - Tellegen, P. J. CY - Groningen: Wolters-Noordhoff ER - TY - JOUR T1 - Correlates of examinee item choice behavior in self-adapted testing JF - Mid-Western Educational Researcher Y1 - 1991 A1 - Johnson, J. L. A1 - Roos, L. L. A1 - Wise, S. L. A1 - Plake, B. S. VL - 4 ER - TY - ABST T1 - A comparison of Rasch and three-parameter logistic models in computerized adaptive testing Y1 - 1990 A1 - Parker, S.B. A1 - J. R. McBride CY - Unpublished manuscript ER - TY - JOUR T1 - A comparison of three decision models for adapting the length of computer-based mastery tests JF - Journal of Educational Computing Research Y1 - 1990 A1 - Frick, T. W. VL - 6 IS - 4 ER - TY - JOUR T1 - Computer testing: Pragmatic issues and research needs JF - Educational Measurement: Issues and Practice Y1 - 1990 A1 - Rudner, L. M. VL - 9 (2) N1 - Sum 1990. ER - TY - JOUR T1 - Computerized adaptive measurement of attitudes JF - Measurement and Evaluation in Counseling and Development Y1 - 1990 A1 - Koch, W. R. A1 - Dodd, B. G. A1 - Fitzpatrick, S. J. VL - 23 ER - TY - CONF T1 - Computerized adaptive music tests: A new solution to three old problems T2 - Paper presented at the biannual meeting of the Music Educators National Conference Y1 - 1990 A1 - Vispoel, W. P. JF - Paper presented at the biannual meeting of the Music Educators National Conference CY - Washington DC ER - TY - BOOK T1 - Computerized adaptive testing: A primer (Eds.) Y1 - 1990 A1 - Wainer, H., A1 - Dorans, N. J. A1 - Flaugher, R. A1 - Green, B. F. A1 - Mislevy, R. J. A1 - Steinberg, L. A1 - Thissen, D. CY - Hillsdale NJ: Erlbaum ER - TY - JOUR T1 - The construction of customized two-staged tests Y1 - 1990 A1 - Adema, J. J. VL - 27 ER - TY - CHAP T1 - Creating adaptive tests of musical ability with limited-size item pools Y1 - 1990 A1 - Vispoel, W. T. A1 - Twing, J. S CY - D. 
Dalton (Ed.), ADCIS 32nd International Conference Proceedings (pp. 105-112). Columbus OH: Association for the Development of Computer-Based Instructional Systems. ER - TY - BOOK T1 - CAT administrator [Computer program] Y1 - 1989 A1 - Gershon, R. C. CY - Chicago: Micro Connections ER - TY - CONF T1 - Commercial applications of computerized adaptive testing T2 - C.E. Davis Chair, Computerized Adaptive Testing–Military and Commercial Developments Ten Years Later: Symposium conducted at the Annual Conference of the Military Testing Association (524-529) Y1 - 1989 A1 - J. R. McBride JF - C.E. Davis Chair, Computerized Adaptive Testing–Military and Commercial Developments Ten Years Later: Symposium conducted at the Annual Conference of the Military Testing Association (524-529) CY - San Antonio, TX ER - TY - ABST T1 - A comparison of an expert systems approach to computerized adaptive testing and an IRT model Y1 - 1989 A1 - Frick, T. W. CY - Unpublished manuscript (submitted to American Educational Research Journal) ER - TY - JOUR T1 - A comparison of the nominal response model and the three-parameter logistic model in computerized adaptive testing JF - Educational and Psychological Measurement Y1 - 1989 A1 - De Ayala, R. J., VL - 49 ER - TY - CONF T1 - A comparison of three adaptive testing strategies using MicroCAT T2 - Paper presented at the annual meeting of the American Educational Research Association Y1 - 1989 A1 - Ho, R. A1 - Hsu, T. C. JF - Paper presented at the annual meeting of the American Educational Research Association CY - San Francisco N1 - #HO89-01 Tables and figures only.) ER - TY - JOUR T1 - Comparisons of paper-administered, computer-administered and computerized adaptive achievement tests JF - Journal of Educational Computing Research Y1 - 1989 A1 - Olson, J. B A1 - Maynes, D. D. A1 - Slawson, D. A1 - Ho, K AB - This research study was designed to compare student achievement scores from three different testing methods: paper-administered testing, computer-administered testing, and computerized adaptive testing. The three testing formats were developed from the California Assessment Program (CAP) item banks for grades three and six. The paper-administered and the computer-administered tests were identical in item content, format, and sequence. The computerized adaptive test was a tailored or adaptive sequence of the items in the computer-administered test. VL - 5 ER - TY - CONF T1 - A computerized adaptive mathematics screening test T2 - Paper presented at the Annual Meeting of the California Educational Research Association Y1 - 1989 A1 - J. R. McBride JF - Paper presented at the Annual Meeting of the California Educational Research Association CY - Burlingame, CA N1 - ERIC Document Reproduction Service No. ED 316 554) ER - TY - BOOK T1 - Computerized adaptive personality assessment Y1 - 1989 A1 - Waller, N. G. CY - Unpublished master’s thesis, Harvard University, Cambridge MA ER - TY - ABST T1 - Computerized adaptive tests Y1 - 1989 A1 - Grist, S. A1 - Rudner, L. M. A1 - Wise CY - ERIC Clearinghouse on Tests, Measurement, and Evaluation, no. 107 ER - TY - ABST T1 - A consideration for variable length adaptive tests (Research Report 89-40) Y1 - 1989 A1 - Wingersky, M. S. CY - Princeton NJ: Educational Testing Service ER - TY - JOUR T1 - The College Board computerized placement tests: An application of computerized adaptive testing JF - Machine-Mediated Learning Y1 - 1988 A1 - W. C. 
Ward VL - 2 ER - TY - CONF T1 - A comparison of achievement level estimates from computerized adaptive testing and paper-and-pencil testing T2 - Paper presented at the annual meeting of the American Educational Research Association Y1 - 1988 A1 - Kingsbury, G. G. A1 - Houser, R. L. JF - Paper presented at the annual meeting of the American Educational Research Association CY - New Orleans LA N1 - {PDF file, 43 KB} ER - TY - CONF T1 - A comparison of two methods for the adaptive administration of the MMPI-2 content scales T2 - Paper presented at the 86th Annual Convention of the American Psychological Association Y1 - 1988 A1 - Ben-Porath, Y. S. A1 - Waller, N. G. A1 - Slutske, W. S. A1 - Butcher, J. N. JF - Paper presented at the 86th Annual Convention of the American Psychological Association CY - Atlanta GA ER - TY - CONF T1 - Computerized adaptive attitude measurement: A comparison of the graded response and rating scale models T2 - Paper presented at the annual meeting of the American Educational Research Association Y1 - 1988 A1 - Dodd, B. G. A1 - Koch, W. R. A1 - De Ayala, R. J. JF - Paper presented at the annual meeting of the American Educational Research Association CY - New Orleans ER - TY - JOUR T1 - Computerized adaptive testing: A comparison of the nominal response model and the three parameter model JF - Dissertation Abstracts International Y1 - 1988 A1 - De Ayala, R. J. KW - computerized adaptive testing VL - 48 ER - TY - JOUR T1 - Computerized adaptive testing: A four-year-old pilot study shows that CAT can work JF - Technological Horizons in Education Y1 - 1988 A1 - Kingsbury, G. G. A1 - et al. VL - 16 (4) ER - TY - CONF T1 - Computerized adaptive testing: A good idea waiting for the right technology T2 - Paper presented at the meeting of the American Educational Research Association Y1 - 1988 A1 - Reckase, M. D. JF - Paper presented at the meeting of the American Educational Research Association CY - New Orleans, April 1988 ER - TY - CONF T1 - Computerized adaptive testing program at Miami-Dade Community College, South Campus T2 - Laguna Hills CA: League for Innovation in the Community College. Y1 - 1988 A1 - Schinoff, R. B. A1 - Stead, L. JF - Laguna Hills CA: League for Innovation in the Community College. ER - TY - ABST T1 - Computerized adaptive testing: The state of the art in assessment at three community colleges Y1 - 1988 A1 - League-for-Innovation-in-the-Community-College CY - Laguna Hills CA: Author N1 - (25431 Cabot Road, Suite 203, Laguna Hills CA 92653) ER - TY - BOOK T1 - Computerized adaptive testing: The state of the art in assessment at three community colleges Y1 - 1988 A1 - Doucette, D. CY - Laguna Hills CA: League for Innovation in the Community College ER - TY - CONF T1 - A computerized adaptive version of the Differential Aptitude Tests T2 - Paper presented at the meeting of the American Psychological Association Y1 - 1988 A1 - J. R. McBride JF - Paper presented at the meeting of the American Psychological Association CY - Atlanta GA ER - TY - JOUR T1 - Computerized mastery testing JF - Machine-Mediated Learning Y1 - 1988 A1 - Lewis, C. A1 - Sheehan, K. VL - 2 ER - TY - CHAP T1 - Construct validity of computer-based tests Y1 - 1988 A1 - Green, B. F. CY - H. Wainer and H. Braun (Eds.), Test validity (pp. 77-103). Hillsdale NJ: Erlbaum. ER - TY - JOUR T1 - Critical problems in computer-based psychological measurement JF - Applied Measurement in Education Y1 - 1988 A1 - Green, B. F.
VL - 1 ER - TY - JOUR T1 - CATS, testlets, and test construction: A rationale for putting test developers back into CAT JF - Journal of Educational Measurement Y1 - 1987 A1 - Wainer, H., A1 - Kiely, G. L. VL - 32 N1 - (volume number appears to incorrect) ER - TY - ABST T1 - A computer program for adaptive testing by microcomputer (MESA Memorandum No 40) Y1 - 1987 A1 - Linacre, J. M. CY - Chicago: University of Chicago. (ERIC ED 280 895.) ER - TY - ABST T1 - Computerized adaptive language testing: A Spanish placement exam Y1 - 1987 A1 - Larson, J. W. CY - In Language Testing Research Selected Papers from the Colloquium, Monterey CA N1 - (ERIC No. FL016939) ER - TY - CONF T1 - Computerized adaptive testing: A comparison of the nominal response model and the three-parameter logistic model T2 - Paper presented at the annual meeting of the National Council on Measurement in Education Y1 - 1987 A1 - De Ayala, R. J., A1 - Koch, W. R. JF - Paper presented at the annual meeting of the National Council on Measurement in Education CY - Washington DC ER - TY - CHAP T1 - Computerized adaptive testing for measuring abilities and other psychological variables Y1 - 1987 A1 - Weiss, D. J. A1 - Vale, C. D. CY - J. N. Butcher (Ed.), Computerized personality measurement: A practitioners guide (pp. 325-343). New York: Basic Books. ER - TY - CONF T1 - Computerized adaptive testing made practical: The Computerized Adaptive Edition of the Differential Aptitude Tests T2 - Presented at the U.S. Department of Labor National Test Development Conference Y1 - 1987 A1 - J. R. McBride JF - Presented at the U.S. Department of Labor National Test Development Conference CY - San Francisco, CA ER - TY - CONF T1 - Computerized adaptive testing with the rating scale model T2 - Paper presented at the Fourth International Objective Measurement Workshop Y1 - 1987 A1 - Dodd, B. G. JF - Paper presented at the Fourth International Objective Measurement Workshop CY - Chicago ER - TY - JOUR T1 - Computerized psychological testing: Overview and critique JF - Professional Psychology: Research and Practice Y1 - 1987 A1 - Burke, M. J, A1 - Normand, J A1 - Raju, N. M. VL - 1 ER - TY - ABST T1 - CATs, testlets, and test construction: A rationale for putting test developers back into CAT (Technical Report 86-71) Y1 - 1986 A1 - Wainer, H., A1 - Kiely, G. L. CY - Princeton NJ: Educational Testing Service, Program Statistics Research N1 - #WA86-71 ER - TY - CHAP T1 - A cognitive error diagnostic adaptive testing system Y1 - 1986 A1 - Tatsuoka, K. K. CY - the 28th ADCIS International Conference Proceedings. Washington DC: ADCIS. ER - TY - ABST T1 - College Board computerized placement tests: Validation of an adaptive test of basic skills (Research Report 86-29) Y1 - 1986 A1 - W. C. Ward A1 - Kline, R. G. A1 - Flaugher, J. CY - Princeton NJ: Educational Testing Service. ER - TY - CONF T1 - Comparison and equating of paper-administered, computer-administered, and computerized adaptive tests of achievement T2 - Paper presented at the annual meeting of the American Educational Research Association Y1 - 1986 A1 - Olsen, J. B. A1 - Maynes, D. D. A1 - Slawson, D. A1 - Ho, K JF - Paper presented at the annual meeting of the American Educational Research Association CY - San Francisco CA ER - TY - CONF T1 - A computer-adaptive placement test for college mathematics T2 - Paper presented at the annual meeting of the American Educational Research Association Y1 - 1986 A1 - Shermis, M. D. 
JF - Paper presented at the annual meeting of the American Educational Research Association CY - San Francisco CA N1 - #SH86-01 ER - TY - CONF T1 - Computerized adaptive achievement testing: A prototype T2 - Paper presented at the annual meeting of the National Council on Measurement in Education Y1 - 1986 A1 - J. R. McBride A1 - Moe, K. C. JF - Paper presented at the annual meeting of the National Council on Measurement in Education CY - San Francisco CA ER - TY - CONF T1 - A computerized adaptive edition of the Differential Aptitude Tests T2 - Paper presented at the meeting of the American Psychological Association Y1 - 1986 A1 - J. R. McBride JF - Paper presented at the meeting of the American Psychological Association CY - Washington DC N1 - ERIC No. ED 285 918) ER - TY - CONF T1 - A computerized adaptive edition of the Differential Aptitude Tests T2 - Presented at the National Assessment Conference of the Education Commission of the States Y1 - 1986 A1 - J. R. McBride JF - Presented at the National Assessment Conference of the Education Commission of the States CY - Boulder, CO ER - TY - CHAP T1 - Computerized adaptive testing: A pilot project Y1 - 1986 A1 - Kingsbury, G. G. CY - W. C. Ryan (ed.), Proceedings: NECC 86, National Educational Computing Conference (pp.172-176). Eugene OR: University of Oregon, International Council on Computers in Education. ER - TY - JOUR T1 - Computerized testing technology JF - Advances in Reading/Language Research Y1 - 1986 A1 - Wolfe, J. H. VL - 4 ER - TY - CONF T1 - Computerized adaptive attitude measurement T2 - Paper presented at the annual meeting of the American Educational Research Association Y1 - 1985 A1 - Koch, W. R. A1 - Dodd, B. G. JF - Paper presented at the annual meeting of the American Educational Research Association CY - Chicago ER - TY - JOUR T1 - Computerized adaptive testing JF - Educational Leadership Y1 - 1985 A1 - J. R. McBride VL - 43 ER - TY - CONF T1 - Computerized adaptive testing: An overview and an example T2 - Presented at the Assessment Conference of the Education Commission of the States Y1 - 1985 A1 - J. R. McBride JF - Presented at the Assessment Conference of the Education Commission of the States CY - Boulder, CO ER - TY - JOUR T1 - Controlling item exposure conditional on ability in computerized adaptive testing JF - Journal of Educational and Behavioral Statistics Y1 - 1985 A1 - Sympson, J. B. A1 - Hetter, R. D. VL - 23 ER - TY - CHAP T1 - Controlling item-exposure rates in computerized adaptive testing Y1 - 1985 A1 - Sympson, J. B. A1 - Hetter, R. D. CY - Proceedings of the 27th annual meeting of the Military Testing Association (pp. 973-977). San Diego CA: Navy Personnel Research and Development Center. ER - TY - JOUR T1 - Current developments and future directions in computerized personality assessment JF - Journal of Consulting and Clinical Psychology Y1 - 1985 A1 - Butcher, J. N. A1 - Keller, L. S. A1 - Bacon, S. F. AB - Although computer applications in personality assessment have burgeoned rapidly in recent years, the majority of these uses capitalize on the computer's speed, accuracy, and memory capacity rather than its potential for the development of new, flexible assessment strategies. A review of current examples of computer usage in personality assessment reveals wide acceptance of automated clerical tasks such as test scoring and even test administration. The computer is also assuming tasks previously reserved for expert clinicians, such as writing narrative interpretive reports from test results. 
All of these functions represent automation of established assessment devices and interpretive strategies. The possibility also exists of harnessing some of the computer's unique adaptive capabilities to alter standard devices and even develop new ones. Three proposed strategies for developing computerized adaptive personality tests are described, with the conclusion that the computer's potential in this area justifies a call for further research efforts. (C) 1985 by the American Psychological Association VL - 53 N1 - Miscellaneous Article ER - TY - BOOK T1 - A comparison of the maximum likelihood strategy and stradaptive test on a micro-computer Y1 - 1984 A1 - Bill, B. C. CY - Unpublished M.S. thesis, University of Wisconsin, Madison. N1 - #BI84-01 ER - TY - JOUR T1 - Computerized adaptive testing in the Maryland Public Schools JF - MicroCAT News Y1 - 1984 A1 - Stevenson, J. VL - 1 ER - TY - JOUR T1 - Computerized diagnostic testing JF - Journal of Educational Measurement Y1 - 1984 A1 - McArthur, D. L. A1 - Choppin, B. H. VL - 21 ER - TY - CHAP T1 - A comparison of IRT-based adaptive mastery testing and a sequential mastery testing procedure Y1 - 1983 A1 - Kingsbury, G. G. A1 - Weiss, D. J. CY - D. J. Weiss (Ed.), New horizons in testing: Latent trait theory and computerized adaptive testing (pp. 1-8). New York: Academic Press. ER - TY - CHAP T1 - A comparison of IRT-based adaptive mastery testing and a sequential mastery testing procedure. T2 - New horizons in testing: Latent trait test theory and computerized adaptive testing Y1 - 1983 A1 - Kingsbury, G. G. A1 - Weiss, D. J. JF - New horizons in testing: Latent trait test theory and computerized adaptive testing PB - Academic Press. CY - New York, NY. USA ER - TY - CHAP T1 - A comparison of IRT-based adaptive mastery testing and a sequential mastery testing procedure Y1 - 1983 A1 - Kingsbury, G. G. A1 - Weiss, D. J. CY - D. J. Weiss (Ed.), New horizons in testing: Latent trait test theory and computerized adaptive testing (pp. 257-283). New York: Academic Press. ER - TY - RPRT T1 - Comparison of live and simulated adaptive tests Y1 - 1982 A1 - Hunter, D. R. JF - Air Force Human Resources Laboratory PB - Air Force Systems Command CY - Brooks Air Force Base, Texas ER - TY - ABST T1 - Computerized adaptive testing project: Objectives and requirements (Tech Note 82-22) Y1 - 1982 A1 - J. R. McBride CY - San Diego CA: Navy Personnel Research and Development Center. (AD A118 447) N1 - #McB82-22 ER - TY - ABST T1 - Computerized adaptive testing system design: Preliminary design considerations (Tech. Report 82-52) Y1 - 1982 A1 - Croll, P. R. CY - San Diego CA: Navy Personnel Research and Development Center. (AD A118 495) ER - TY - ABST T1 - Computerized Adaptive Testing system development and project management. Y1 - 1982 A1 - J. R. McBride CY - Minutes of the ASVAB (Armed Services Vocational Aptitude Battery) Steering Committee. Washington, DC: Office of the Assistant Secretary of Defense (Manpower, Reserve Affairs and Logistics), Accession Policy Directorate. ER - TY - CHAP T1 - The computerized adaptive testing system development project Y1 - 1982 A1 - J. R. McBride A1 - Sympson, J. B. CY - D. J. Weiss (Ed.), Proceedings of the 1982 Item Response Theory and Computerized Adaptive Testing Conference (pp. 342-349). Minneapolis: University of Minnesota, Department of Psychology. N1 - {PDF file, 296 KB} ER - TY - CHAP T1 - Computerized testing in the German Federal Armed Forces (FAF): Empirical approaches Y1 - 1982 A1 - Wildgrube, W. CY - D. J.
Weiss (Ed.), Proceedings of the 1982 Item Response Theory and Computerized Adaptive Testing Conference (pp.353-359). Minneapolis: University of Minnesota, Department of Psychology, Psychometric Methods Program. N1 - PDF file, 384 K ER - TY - RPRT T1 - A comparison of a Bayesian and a maximum likelihood tailored testing procedure Y1 - 1981 A1 - McKinley, R. L., A1 - Reckase, M. D. JF - Research Report 81-2 PB - University of Missouri, Department of Educational Psychology, Tailored Testing Research Laboratory CY - Columbia MO ER - TY - CONF T1 - A comparison of a maximum likelihood and a Bayesian estimation procedure for tailored testing T2 - Paper presented at the annual meeting of the National Council on Measurement in Education Y1 - 1981 A1 - Rosso, M. A. A1 - Reckase, M. D. JF - Paper presented at the annual meeting of the National Council on Measurement in Education CY - Los Angeles CA N1 - #RO81-01 ER - TY - ABST T1 - A comparison of two methods of interactive testing Final report. Y1 - 1981 A1 - Nicewander, W. A. A1 - Chang, H. S. A1 - Doody, E. N. CY - National Institute of Education Grant 79-1045 ER - TY - THES T1 - A comparative evaluation of two Bayesian adaptive ability estimation procedures with a conventional test strategy Y1 - 1980 A1 - Gorman, S. PB - Catholic University of America CY - Washington DC VL - Ph.D. ER - TY - BOOK T1 - A comparative evaluation of two Bayesian adaptive ability estimation procedures Y1 - 1980 A1 - Gorman, S. CY - Unpublished doctoral dissertation, the Catholic University of America ER - TY - ABST T1 - A comparison of adaptive, sequential, and conventional testing strategies for mastery decisions (Research Report 80-4) Y1 - 1980 A1 - Kingsbury, G. G. A1 - Weiss, D. J. CY - Minneapolis, Department of Psychology, Psychometric Methods Program, Computerized Adaptive Testing Laboratory N1 - {PDF file, 1.905 MB} ER - TY - CHAP T1 - A comparison of ICC-based adaptive mastery testing and the Waldian probability ratio method Y1 - 1980 A1 - Kingsbury, G. G. A1 - Weiss, D. J. CY - D. J. Weiss (Ed.). Proceedings of the 1979 Computerized Adaptive Testing Conference (pp. 120-139). Minneapolis MN: University of Minnesota, Department of Psychology, Psychometric Methods Program, Computerized Adaptive Testing Laboratory N1 - 51 MB} ER - TY - CHAP T1 - A comparison of the accuracy of Bayesian adaptive and static tests using a correction for regression Y1 - 1980 A1 - Gorman, S. CY - D. J. Weiss (Ed.), Proceedings of the 1979 Computerized Adaptive Testing Conference (pp. 35-50). Minneapolis MN: University of Minnesota, Department of Psychology, Computerized Adaptive Testing Laboratory. N1 - {PDF file, 735 KB} ER - TY - JOUR T1 - Computer applications in audiology and rehabilitation of the hearing impaired JF - Journal of Communication Disorders Y1 - 1980 A1 - Levitt, H. VL - 13 ER - TY - JOUR T1 - Computer applications to ability testing JF - Association for Educational Data Systems Journal Y1 - 1980 A1 - McKinley, R. L., A1 - Reckase, M. D. VL - 13 ER - TY - ABST T1 - Computerized instructional adaptive testing model: Formulation and validation (AFHRL-TR-79-33, Final Report) Y1 - 1980 A1 - Kalisch, S. J. CY - Brooks Air Force Base TX: Air Force Human Resources Laboratory", Also Catalog of Selected Documents in Psychology, February 1981, 11, 20 (Ms. No, 2217) ER - TY - CHAP T1 - Computerized testing in the German Federal Armed Forces (FAF) Y1 - 1980 A1 - Wildgrube, W. CY - D. J. 
Weiss (Ed.), Proceedings of the 1979 Item Response Theory and Computerized Adaptive Testing Conference (pp. 68-77). Minneapolis: University of Minnesota, Department of Psychology, Psychometric Methods Program, Computerized Adaptive Testing Laborator N1 - {PDF file, 595 KB} ER - TY - ABST T1 - Criterion-related validity of adaptive testing strategies (Research Report 80-3) Y1 - 1980 A1 - Thompson, J. G. A1 - Weiss, D. J. CY - Minneapolis: University of Minnesota, Department of Psychology, Psychometric Methods Program, Computerized Adaptive Testing Laboratory N1 - #TH80-03 {PDF file, 2.708 MB} ER - TY - JOUR T1 - A comparison of a standard and a computerized adaptive paradigm in Bekesy fixed-frequency audiometry JF - Journal of Auditory Research Y1 - 1979 A1 - Harris, J. D. A1 - Smith, P. F. VL - 19 ER - TY - ABST T1 - Computerized adaptive testing: The state of the art (ARI Technical Report 423) Y1 - 1979 A1 - J. R. McBride CY - Alexandria, VA: U.S. Army Research Institute for the Behavioral and Social Sciences. ER - TY - CONF T1 - Criterion-related validity of conventional and adaptive tests in a military environment T2 - Paper presented at the 1979 Computerized Adaptive Testing Conference Y1 - 1979 A1 - Sympson, J. B. JF - Paper presented at the 1979 Computerized Adaptive Testing Conference CY - Minneapolis MN ER - TY - JOUR T1 - Combining auditory and visual stimuli in the adaptive testing of speech discrimination JF - Journal of Speech and Hearing Disorders Y1 - 1978 A1 - Steele, J. A. A1 - Binnie, C. A. A1 - Cooper, W. A. VL - 43 ER - TY - BOOK T1 - A comparison of Bayesian and maximum likelihood scoring in a simulated stradaptive test Y1 - 1978 A1 - Maurelli, V. A. CY - Unpublished Masters thesis, St. Mary’s University of Texas, San Antonio TX ER - TY - ABST T1 - A comparison of the fairness of adaptive and conventional testing strategies (Research Report 78-1) Y1 - 1978 A1 - Pine, S. M. A1 - Weiss, D. J. CY - Minneapolis: University of Minnesota, Department of Psychology, Psychometric Methods Program ER - TY - JOUR T1 - Computer-assisted tailored testing: Examinee reactions and evaluation JF - Educational and Psychological Measurement Y1 - 1978 A1 - Schmidt, F. L. A1 - Urry, V. W. A1 - Gugel, J. F. VL - 38 ER - TY - JOUR T1 - Computerized adaptive testing: Principles and directions JF - Computers and Education Y1 - 1978 A1 - Kreitzberg, C. B. A1 - Stocking, M., A1 - Swanson, L. VL - 2 ER - TY - JOUR T1 - Computerized adaptive testing: Principles and directions JF - Computers and Education Y1 - 1978 A1 - Kreitzberg, C. B. VL - 2 (4) ER - TY - ABST T1 - A construct validation of adaptive achievement testing (Research Report 78-4) Y1 - 1978 A1 - Bejar, I. I. A1 - Weiss, D. J. CY - Minneapolis MN: University of Minnesota, Department of Psychology, Psychometric Methods Program, Computerized Adaptive Testing Laboratory ER - TY - ABST T1 - Calibration of an item pool for the adaptive measurement of achievement (Research Report 77-5) Y1 - 1977 A1 - Bejar, I. I. A1 - Weiss, D. J. A1 - Kingsbury, G. G. CY - Minneapolis: Department of Psychology, Psychometric Methods Program ER - TY - CHAP T1 - A comparison of conventional and adaptive achievement testing Y1 - 1977 A1 - Bejar, I. I. CY - D. J. Weiss (Ed.), Proceedings of the 1977 Computerized Adaptive Testing Conference. Minneapolis MN: University of Minnesota, Department of Psychology, Psychometric Methods Program. 
ER - TY - BOOK T1 - A comparison of the classification of students by two methods of administration of a mathematics placement test Y1 - 1977 A1 - Brooks, S. CY - Unpublished doctoral dissertation, Syracuse University, 1977 ER - TY - BOOK T1 - A computer adaptive approach to the measurement of personality variables Y1 - 1977 A1 - Sapinkopf, R. C. CY - Unpublished doctoral dissertation, University of Maryland, Baltimore ER - TY - JOUR T1 - A computer simulation study of tailored testing strategies for objective-based instructional programs JF - Educational and Psychological Measurement Y1 - 1977 A1 - Spineti, J. P. A1 - Hambleton, R. K. AB - One possible way of reducing the amount of time spent testing in . objective-based instructional programs would involve the implementation of a tailored testing strategy. Our purpose was to provide some additional data on the effectiveness of various tailored testing strategies for different testing situations. The three factors of a tailored testing strategy under study with various hypothetical distributions of abilities across two learning hierarchies were test length, mastery cutting score, and starting point. Overall, our simulation results indicate that it is possible to obtain a reduction of more than 50% in testing time without any loss in decision-making accuracy, when compared to a conventional testing procedure, by implementing a tailored testing strategy. In addition, our study of starting points revealed that it was generally best to begin testing in the middle of the learning hierarchy. Finally we observed a 40% reduction in errors of classification as the number of items for testing each objective was increased from one to five. VL - 37 ER - TY - ABST T1 - Computer-assisted tailored testing: Examinee reactions and evaluation (PB-276 748) Y1 - 1977 A1 - Schmidt, F. L. A1 - Urry, V. W. A1 - Gugel, J. F. CY - Washington DC: U. S. Civil Service Commission, Personnel Research and Development Center. N1 - #SC77-01 ER - TY - CHAP T1 - Computerized Adaptive Testing and Personnel Accessioning System Design Y1 - 1977 A1 - Underwood, M. A. CY - D. J. Weiss (Ed.), Proceedings of the 1977 Computerized Adaptive Testing Conference. Minneapolis MN: University of Minnesota, Department of Psychology, Psychometric Methods Program. ER - TY - CHAP T1 - Computerized Adaptive Testing research and development Y1 - 1977 A1 - J. R. McBride CY - H. Taylor, Proceedings of the Second Training and Personnel Technology Conference. Washington, DC: Office of the Director of Defense Research and Engineering. ER - TY - CHAP T1 - Computerized Adaptive Testing with a Military Population Y1 - 1977 A1 - Gorman, S. CY - D. J. Weiss (Ed.), Proceedings of the 1977 Computerized Adaptive Testing Conference. Minneapolls MN: University of Minnesota, Department of Psychology, Psychometric Methods Program ER - TY - CHAP T1 - Computer-assisted testing: An orderly transition from theory to practice Y1 - 1976 A1 - McKillip, R. H. A1 - Urry, V. W. CY - C. K. Clark (Ed.), Proceedings of the First Conference on Computerized Adaptive Testing (pp. 95-96). Washington DC: U.S. Government Printing Office. N1 - {PDF file, 191 KB} ER - TY - ABST T1 - Computer-assisted testing with live examinees: A rendezvous with reality (TN 75-3) Y1 - 1976 A1 - Urry, V. W. CY - Washington DC: U. S. 
Civil Service Commission, Personnel Research and Development Center ER - TY - JOUR T1 - Complete orders from incomplete data: Interactive ordering and tailored testing JF - Psychological Bulletin Y1 - 1975 A1 - Cliff, N. A. VL - 82 ER - TY - JOUR T1 - Computerized adaptive ability measurement JF - Naval Research Reviews Y1 - 1975 A1 - Weiss, D. J. VL - 28 ER - TY - CHAP T1 - Computerized adaptive trait measurement: Problems and prospects (Research Report 75-5) Y1 - 1975 A1 - Weiss, D. J. CY - Minneapolis: University of Minnesota, Department of Psychology, Psychometric Methods Program. ER - TY - BOOK T1 - The comparison of two tailored testing models and the effects of the models variables on actual loss Y1 - 1974 A1 - Kalisch, S. J. CY - Unpublished doctoral dissertation, Florida State University ER - TY - ABST T1 - A computer software system for adaptive ability measurement (Research Report 74-1) Y1 - 1974 A1 - De Witt, J. J. A1 - Weiss, D. J. CY - Minneapolis MN: University of Minnesota, Department of Psychology, Computerized Adaptive Testing Laboratory ER - TY - ABST T1 - Computer-assisted testing: The calibration and evaluation of the verbal ability bank (Technical Study 74-3) Y1 - 1974 A1 - Urry, V. W. CY - Washington DC: U. S. Civil Service Commission, Personnel Research and Development Center ER - TY - ABST T1 - Computer-based adaptive testing models for the Air Force technical training environment: Phase I: Development of a computerized measurement system for Air Force technical Training Y1 - 1974 A1 - Hansen, D. N. A1 - Johnson, B. F. A1 - Fagan, R. L. A1 - Tan, P. A1 - Dick, W. CY - JSAS Catalogue of Selected Documents in Psychology, 5, 1-86 (MS No. 882). AFHRL Technical Report 74-48. ER - TY - CHAP T1 - Computer-based psychological testing Y1 - 1973 A1 - Jones, D. A1 - Weinman, J. CY - A. Elithorn and D. Jones (Eds.), Artificial and human thinking (pp. 83-93). San Francisco CA: Jossey-Bass. ER - TY - JOUR T1 - A comparison of computer-simulated conventional and branching tests JF - Educational and Psychological Measurement Y1 - 1971 A1 - Waters, C. J. A1 - Bayroff, A. G. VL - 31 ER - TY - ABST T1 - A comparison of four methods of selecting items for computer-assisted testing (Technical Bulletin STB 72-5) Y1 - 1971 A1 - Bryson, R. CY - San Diego: Naval Personnel and Training Research Laboratory ER - TY - ABST T1 - Computer assistance for individualizing measurement Y1 - 1971 A1 - Ferguson, R. L. CY - Pittsburgh PA: University of Pittsburgh R and D Center ER - TY - BOOK T1 - Computerized adaptive sequential testing Y1 - 1971 A1 - Wood, R. L. CY - Unpublished doctoral dissertation, University of Chicago ER - TY - CHAP T1 - Comments on tailored testing Y1 - 1970 A1 - Green, B. F. CY - W. H. Holtzman, (Ed.), Computer-assisted instruction, testing, and guidance (pp. 184-197). New York: Harper and Row. ER - TY - ABST T1 - Computer assistance for individualizing measurement Y1 - 1970 A1 - Ferguson, R. L. CY - Pittsburgh PA: University of Pittsburgh, Learning Research and Development Center ER - TY - JOUR T1 - Computer assistance for individualizing measurement JF - Computers and Automation Y1 - 1970 A1 - Ferguson, R. L. VL - March 1970 ER - TY - ABST T1 - Computer-assisted criterion-referenced measurement (Working Paper No 49) Y1 - 1969 A1 - Ferguson, R. L. CY - Pittsburgh PA: University of Pittsburgh, Learning and Research Development Center. (ERIC No. ED 037 089) ER - TY - BOOK T1 - Computer-assisted testing (Eds.) Y1 - 1968 A1 - Harman, H. H. A1 - Helm, C. E. A1 - Loye, D. E. 
CY - Princeton NJ: Educational Testing Service ER - TY - ABST T1 - Construction of an experimental sequential item test (Research Memorandum 60-1) Y1 - 1960 A1 - Bayroff, A. G. A1 - Thomas, J. J. A1 - Anderson, A. A. CY - Washington DC: Personnel Research Branch, Department of the Army ER - TY - JOUR T1 - A clinical study of consecutive and adaptive testing with the revised Stanford-Binet JF - Journal of Consulting Psychology Y1 - 1947 A1 - Hutt, M. L. VL - 11 ER -