In an ironic twist of history, modern psychological testing has returned to an adaptive format quite common when testing was not yet standardized. Important stimuli to the renewed interest in adaptive testing have been the development of item-response theory in psychometrics, which models the responses on test items using separate parameters for the items and test takers, and the use of computers in test administration, which enables us to estimate the parameter for a test taker and select the items in real time. This article reviews a selection from the latest developments in the technology of adaptive testing, such as constrained adaptive item selection, adaptive testing using rule-based item generation, multidimensional adaptive testing, adaptive use of test batteries, and the use of response times in adaptive testing.

VL - 216 ER - TY - JOUR T1 - Assembling a computerized adaptive testing item pool as a set of linear tests JF - Journal of Educational and Behavioral Statistics Y1 - 2006 A1 - van der Linden, W. J. A1 - Ariel, A. A1 - Veldkamp, B. P. KW - Algorithms KW - computerized adaptive testing KW - item pool KW - linear tests KW - mathematical models KW - statistics KW - Test Construction KW - Test Items AB - Test-item writing efforts typically result in item pools with an undesirable correlational structure between the content attributes of the items and their statistical information. If such pools are used in computerized adaptive testing (CAT), the algorithm may be forced to select items with less than optimal information, that violate the content constraints, and/or have unfavorable exposure rates. Although at first sight somewhat counterintuitive, it is shown that if the CAT pool is assembled as a set of linear test forms, undesirable correlations can be broken down effectively. It is proposed to assemble such pools using a mixed integer programming model with constraints that guarantee that each test meets all content specifications and an objective function that requires them to have maximal information at a well-chosen set of ability values. An empirical example with a previous master pool from the Law School Admission Test (LSAT) yielded a CAT with nearly uniform bias and mean-squared error functions for the ability estimator and item-exposure rates that satisfied the target for all items in the pool. PB - Sage Publications: US VL - 31 SN - 1076-9986 (Print) ER - TY - JOUR T1 - Equating scores from adaptive to linear tests JF - Applied Psychological Measurement Y1 - 2006 A1 - van der Linden, W. J. KW - computerized adaptive testing KW - equipercentile equating KW - local equating KW - score reporting KW - test characteristic function AB - Two local methods for observed-score equating are applied to the problem of equating an adaptive test to a linear test.
In an empirical study, the methods were evaluated against a method based on the test characteristic function (TCF) of the linear test and traditional equipercentile equating applied to the ability estimates on the adaptive test for a population of test takers. The two local methods were generally best. Surprisingly, the TCF method performed slightly worse than the equipercentile method. Both methods showed strong bias and uniformly large inaccuracy, but the TCF method suffered from extra error due to the lower asymptote of the test characteristic function. It is argued that the worse performances of the two methods are a consequence of the fact that they use a single equating transformation for an entire population of test takers and therefore have to compromise between the individual score distributions. PB - Sage Publications: US VL - 30 SN - 0146-6216 (Print) ER - TY - JOUR T1 - A comparison of item-selection methods for adaptive tests with content constraints JF - Journal of Educational Measurement Y1 - 2005 A1 - van der Linden, W. J. KW - Adaptive Testing KW - Algorithms KW - content constraints KW - item selection method KW - shadow test approach KW - spiraling method KW - weighted deviations method AB - In test assembly, a fundamental difference exists between algorithms that select a test sequentially or simultaneously. Sequential assembly allows us to optimize an objective function at the examinee's ability estimate, such as the test information function in computerized adaptive testing. But it leads to the non-trivial problem of how to realize a set of content constraints on the test—a problem more naturally solved by a simultaneous item-selection method. Three main item-selection methods in adaptive testing offer solutions to this dilemma. The spiraling method moves item selection across categories of items in the pool proportionally to the numbers needed from them. 
Item selection by the weighted-deviations method (WDM) and the shadow test approach (STA) is based on projections of the future consequences of selecting an item. These two methods differ in that the former calculates a projection of a weighted sum of the attributes of the eventual test and the latter a projection of the test itself. The pros and cons of these methods are analyzed. An empirical comparison between the WDM and STA was conducted for an adaptive version of the Law School Admission Test (LSAT), which showed equally good item-exposure rates but violations of some of the constraints and larger bias and inaccuracy of the ability estimator for the WDM. PB - Blackwell Publishing: United Kingdom VL - 42 SN - 0022-0655 (Print) ER - TY - JOUR T1 - Constraining item exposure in computerized adaptive testing with shadow tests JF - Journal of Educational and Behavioral Statistics Y1 - 2004 A1 - van der Linden, W. J. A1 - Veldkamp, B. P. KW - computerized adaptive testing KW - item exposure control KW - item ineligibility constraints KW - Probability KW - shadow tests AB - Item-exposure control in computerized adaptive testing is implemented by imposing item-ineligibility constraints on the assembly process of the shadow tests. The method resembles Sympson and Hetter’s (1985) method of item-exposure control in that the decisions to impose the constraints are probabilistic. The method does not, however, require time-consuming simulation studies to set values for control parameters before the operational use of the test. Instead, it can set the probabilities of item ineligibility adaptively during the test using the actual item-exposure rates. An empirical study using an item pool from the Law School Admission Test showed that application of the method yielded perfect control of the item-exposure rates and had negligible impact on the bias and mean-squared error functions of the ability estimator. 
PB - American Educational Research Assn: US VL - 29 SN - 1076-9986 (Print) ER - TY - JOUR T1 - Constructing rotating item pools for constrained adaptive testing JF - Journal of Educational Measurement Y1 - 2004 A1 - Ariel, A. A1 - Veldkamp, B. P. A1 - van der Linden, W. J. KW - computerized adaptive tests KW - constrained adaptive testing KW - item exposure KW - rotating item pools AB - Preventing items in adaptive testing from being over- or underexposed is one of the main problems in computerized adaptive testing. Though the problem of overexposed items can be solved using a probabilistic item-exposure control method, such methods are unable to deal with the problem of underexposed items. Using a system of rotating item pools, on the other hand, is a method that potentially solves both problems. In this method, a master pool is divided into (possibly overlapping) smaller item pools, which are required to have similar distributions of content and statistical attributes. These pools are rotated among the testing sites to realize desirable exposure rates for the items. A test assembly model, motivated by Gulliksen's matched random subtests method, was explored to help solve the problem of dividing a master pool into a set of smaller pools. Different methods to solve the model are proposed. An item pool from the Law School Admission Test was used to evaluate the performances of computerized adaptive tests from systems of rotating item pools constructed using these methods. (PsycINFO Database Record (c) 2007 APA, all rights reserved) PB - Blackwell Publishing: United Kingdom VL - 41 SN - 0022-0655 (Print) ER - TY - JOUR T1 - Computerized adaptive testing with item cloning JF - Applied Psychological Measurement Y1 - 2003 A1 - Glas, C. A. W. A1 - van der Linden, W. J. 
KW - computerized adaptive testing AB - (from the journal abstract) To increase the number of items available for adaptive testing and reduce the cost of item writing, the use of techniques of item cloning has been proposed. An important consequence of item cloning is possible variability between the item parameters. To deal with this variability, a multilevel item response (IRT) model is presented which allows for differences between the distributions of item parameters of families of item clones. A marginal maximum likelihood and a Bayesian procedure for estimating the hyperparameters are presented. In addition, an item-selection procedure for computerized adaptive testing with item cloning is presented which has the following two stages: First, a family of item clones is selected to be optimal at the estimate of the person parameter. Second, an item is randomly selected from the family for administration. Results from simulation studies based on an item pool from the Law School Admission Test (LSAT) illustrate the accuracy of these item pool calibration and adaptive testing procedures. (PsycINFO Database Record (c) 2003 APA, all rights reserved). VL - 27 N1 - References .Sage Publications, US ER - TY - JOUR T1 - Optimal stratification of item pools in α-stratified computerized adaptive testing JF - Applied Psychological Measurement Y1 - 2003 A1 - Chang, Hua-Hua A1 - van der Linden, W. J. KW - Adaptive Testing KW - Computer Assisted Testing KW - Item Content (Test) KW - Item Response Theory KW - Mathematical Modeling KW - Test Construction KW - computerized adaptive testing AB - A method based on 0-1 linear programming (LP) is presented to stratify an item pool optimally for use in α-stratified adaptive testing. Because the 0-1 LP model belongs to the subclass of models with a network flow structure, efficient solutions are possible.
The method is applied to a previous item pool from the computerized adaptive testing (CAT) version of the Graduate Record Exams (GRE) Quantitative Test. The results indicate that the new method performs well in practical situations. It improves item exposure control, reduces the mean squared error in the θ estimates, and increases test reliability. (PsycINFO Database Record (c) 2005 APA ) (journal abstract) VL - 27 ER - TY - JOUR T1 - Some alternatives to Sympson-Hetter item-exposure control in computerized adaptive testing JF - Journal of Educational and Behavioral Statistics Y1 - 2003 A1 - van der Linden, W. J. KW - Adaptive Testing KW - Computer Assisted Testing KW - Test Items KW - computerized adaptive testing AB - The Hetter and Sympson (1997; 1985) method is a method of probabilistic item-exposure control in computerized adaptive testing. Setting its control parameters to admissible values requires an iterative process of computer simulations that has been found to be time consuming, particularly if the parameters have to be set conditional on a realistic set of values for the examinees’ ability parameter. Formal properties of the method are identified that help us explain why this iterative process can be slow and does not guarantee admissibility. In addition, some alternatives to the SH method are introduced. The behavior of these alternatives was estimated for an adaptive test from an item pool from the Law School Admission Test (LSAT). Two of the alternatives showed attractive behavior and converged smoothly to admissibility for all items in a relatively small number of iteration steps. VL - 28 ER - TY - JOUR T1 - Using response times to detect aberrant responses in computerized adaptive testing JF - Psychometrika Y1 - 2003 A1 - van der Linden, W. J. A1 - van Krimpen-Stoop, E. M. L. A.
KW - Adaptive Testing KW - Behavior KW - Computer Assisted Testing KW - computerized adaptive testing KW - Models KW - person Fit KW - Prediction KW - Reaction Time AB - A lognormal model for response times is used to check response times for aberrances in examinee behavior on computerized adaptive tests. Both classical procedures and Bayesian posterior predictive checks are presented. For a fixed examinee, responses and response times are independent; checks based on response times offer thus information independent of the results of checks on response patterns. Empirical examples of the use of classical and Bayesian checks for detecting two different types of aberrances in response times are presented. The detection rates for the Bayesian checks outperformed those for the classical checks, but at the cost of higher false-alarm rates. A guideline for the choice between the two types of checks is offered. VL - 68 ER - TY - RPRT T1 - Mathematical-programming approaches to test item pool design Y1 - 2002 A1 - Veldkamp, B. P. A1 - van der Linden, W. J. A1 - Ariel, A. KW - Adaptive Testing KW - Computer Assisted KW - Computer Programming KW - Educational Measurement KW - Item Response Theory KW - Mathematics KW - Psychometrics KW - Statistical Rotation KW - computerized adaptive testing KW - Test Items KW - Testing AB - (From the chapter) This paper presents an approach to item pool design that has the potential to improve on the quality of current item pools in educational and psychological testing and hence to increase both measurement precision and validity. The approach consists of the application of mathematical programming techniques to calculate optimal blueprints for item pools. These blueprints can be used to guide the item-writing process. Three different types of design problems are discussed, namely for item pools for linear tests, item pools for computerized adaptive testing (CAT), and systems of rotating item pools for CAT.
The paper concludes with an empirical example of the problem of designing a system of rotating item pools for CAT. PB - University of Twente, Faculty of Educational Science and Technology CY - Twente, The Netherlands SN - 02-09 N1 - Using Smart Source ParsingAdvances in psychology research, Vol. ( Hauppauge, NY : Nova Science Publishers, Inc, [URL:http://www.Novapublishers.com]. vi, 228 pp ER - TY - JOUR T1 - Capitalization on item calibration error in adaptive testing JF - Applied Measurement in Education Y1 - 2000 A1 - van der Linden, W. J. A1 - Glas, C. A. W. KW - computerized adaptive testing AB - (from the journal abstract) In adaptive testing, item selection is sequentially optimized during the test. Because the optimization takes place over a pool of items calibrated with estimation error, capitalization on chance is likely to occur. How serious the consequences of this phenomenon are depends not only on the distribution of the estimation errors in the pool or the conditional ratio of the test length to the pool size given ability, but may also depend on the structure of the item selection criterion used. A simulation study demonstrated a dramatic impact of capitalization on estimation errors on ability estimation. Four different strategies to minimize the likelihood of capitalization on error in computerized adaptive testing are discussed. VL - 13 N1 - References .Lawrence Erlbaum, US ER - TY - JOUR T1 - Multidimensional adaptive testing with a minimum error-variance criterion JF - Journal of Educational and Behavioral Statistics Y1 - 1999 A1 - van der Linden, W. J. KW - computerized adaptive testing AB - Adaptive testing under a multidimensional logistic response model is addressed. An algorithm is proposed that minimizes the (asymptotic) variance of the maximum-likelihood estimator of a linear combination of abilities of interest. The criterion results in a closed-form expression that is easy to evaluate. 
In addition, it is shown how the algorithm can be modified if the interest is in a test with a "simple ability structure". The statistical properties of the adaptive ML estimator are demonstrated for a two-dimensional item pool with several linear combinations of the abilities. VL - 24 ER - TY - JOUR T1 - Using response-time constraints to control for differential speededness in computerized adaptive testing JF - Applied Psychological Measurement Y1 - 1999 A1 - van der Linden, W. J. A1 - Scrams, D. J. A1 - Schnipke, D. L. KW - computerized adaptive testing AB - An item-selection algorithm is proposed for neutralizing the differential effects of time limits on computerized adaptive test scores. The method is based on a statistical model for distributions of examinees’ response times on items in a bank that is updated each time an item is administered. Predictions from the model are used as constraints in a 0-1 linear programming model for constrained adaptive testing that maximizes the accuracy of the trait estimator. The method is demonstrated empirically using an item bank from the Armed Services Vocational Aptitude Battery. VL - 23 N1 - Sage Publications, US ER - TY - JOUR T1 - A model for optimal constrained adaptive testing JF - Applied Psychological Measurement Y1 - 1998 A1 - van der Linden, W. J. A1 - Reese, L. M. KW - computerized adaptive testing AB - A model for constrained computerized adaptive testing is proposed in which the information in the test at the trait level (θ) estimate is maximized subject to a number of possible constraints on the content of the test. At each item-selection step, a full test is assembled to have maximum information at the current θ estimate, fixing the items already administered. Then the item with maximum information is selected.
All test assembly is optimal because a linear programming (LP) model is used that automatically updates to allow for the attributes of the items already administered and the new value of the θ estimator. The LP model also guarantees that each adaptive test always meets the entire set of constraints. A simulation study using a bank of 753 items from the Law School Admission Test showed that the θ estimator for adaptive tests of realistic lengths did not suffer any loss of efficiency from the presence of 433 constraints on the item selection process. VL - 22 N1 - Sage Publications, US ER -
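
Illustrative note on the last record (van der Linden and Reese, 1998): the selection step described in that abstract (assemble a full test with maximum information at the current ability estimate while keeping the items already administered, then administer the most informative free item from it) can be sketched roughly as follows. This is a toy sketch under stated assumptions, not the authors' implementation: the two-parameter logistic information function, the six-item pool, the content labels "A" and "B", and the function names are all hypothetical, and brute-force enumeration stands in for the linear programming solver used in the paper.

```python
import math
from itertools import combinations

# Hypothetical toy pool: discrimination a, difficulty b, content category.
POOL = [
    {"a": 2.0, "b": 0.0, "content": "A"},
    {"a": 1.8, "b": 0.1, "content": "A"},
    {"a": 1.5, "b": 0.0, "content": "A"},
    {"a": 0.5, "b": 0.0, "content": "B"},
    {"a": 0.8, "b": 1.0, "content": "B"},
    {"a": 0.4, "b": -2.0, "content": "B"},
]


def info_2pl(theta, item):
    """Fisher information of a 2PL item at ability theta (assumed model)."""
    p = 1.0 / (1.0 + math.exp(-item["a"] * (theta - item["b"])))
    return item["a"] ** 2 * p * (1.0 - p)


def shadow_test(pool, administered, theta, length, min_per_content):
    """Return the most informative full-length test that contains every
    administered item and meets the minimum count per content category.
    Enumeration replaces the LP solver; only feasible for tiny pools."""
    free = [i for i in range(len(pool)) if i not in administered]
    best, best_info = None, -1.0
    for extra in combinations(free, length - len(administered)):
        test = list(administered) + list(extra)
        counts = {}
        for i in test:
            key = pool[i]["content"]
            counts[key] = counts.get(key, 0) + 1
        if any(counts.get(c, 0) < n for c, n in min_per_content.items()):
            continue  # violates a content constraint
        total = sum(info_2pl(theta, pool[i]) for i in test)
        if total > best_info:
            best, best_info = test, total
    return best


def next_item(pool, administered, theta, length, min_per_content):
    """Pick the most informative not-yet-administered item of the shadow test."""
    test = shadow_test(pool, administered, theta, length, min_per_content)
    if test is None:
        raise ValueError("no feasible test for these constraints")
    candidates = [i for i in test if i not in administered]
    return max(candidates, key=lambda i: info_2pl(theta, pool[i]))
```

In this toy pool, once the two strongest category-A items have been given, the constraint of at least two items per category forces the next pick to come from category B even though a more informative category-A item remains in the pool. That is the sense in which every adaptive test assembled this way meets the full constraint set.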