TY - JOUR
T1 - Measuring patient-reported outcomes adaptively: Multidimensionality matters!
JF - Applied Psychological Measurement
Y1 - 2018
A1 - Paap, Muirne C. S.
A1 - Kroeze, Karel A.
A1 - Glas, C. A. W.
A1 - Terwee, C. B.
A1 - van der Palen, Job
A1 - Veldkamp, Bernard P.
ER -

TY - JOUR
T1 - Item usage in a multidimensional computerized adaptive test (MCAT) measuring health-related quality of life
JF - Quality of Life Research
Y1 - 2017
A1 - Paap, Muirne C. S.
A1 - Kroeze, Karel A.
A1 - Terwee, Caroline B.
A1 - van der Palen, Job
A1 - Veldkamp, Bernard P.
VL - 26
UR - https://doi.org/10.1007/s11136-017-1624-3
ER -

TY - JOUR
T1 - Robust Automated Test Assembly for Testlet-Based Tests: An Illustration with Analytical Reasoning Items
JF - Frontiers in Education
Y1 - 2017
A1 - Veldkamp, Bernard P.
A1 - Paap, Muirne C. S.
VL - 2
UR - https://www.frontiersin.org/article/10.3389/feduc.2017.00063
ER -

TY - JOUR
T1 - On the Issue of Item Selection in Computerized Adaptive Testing With Response Times
JF - Journal of Educational Measurement
Y1 - 2016
A1 - Veldkamp, Bernard P.
AB - Many standardized tests are now administered via computer rather than paper-and-pencil format. The computer-based delivery mode brings with it certain advantages. One advantage is the ability to adapt the difficulty level of the test to the ability level of the test taker in what has been termed computerized adaptive testing (CAT). A second advantage is the ability to record not only the test taker's response to each item (i.e., question), but also the amount of time the test taker spends considering and answering each item. Combining these two advantages, various methods were explored for utilizing response time data in selecting appropriate items for an individual test taker. Four strategies for incorporating response time data were evaluated, and the precision of the final test-taker score was assessed by comparing it to a benchmark value that did not take response time information into account. While differences in measurement precision and testing times were expected, results showed that the strategies did not differ much with respect to measurement precision but that there were differences with regard to the total testing time.
VL - 53
UR - http://dx.doi.org/10.1111/jedm.12110
ER -

TY - JOUR
T1 - Multidimensional Computerized Adaptive Testing for Classifying Examinees With Within-Dimensionality
JF - Applied Psychological Measurement
Y1 - 2016
A1 - van Groen, Maaike M.
A1 - Eggen, Theo J. H. M.
A1 - Veldkamp, Bernard P.
AB - A classification method is presented for adaptive classification testing with a multidimensional item response theory (IRT) model in which items are intended to measure multiple traits, that is, within-dimensionality. The reference composite is used with the sequential probability ratio test (SPRT) to make decisions and decide whether testing can be stopped before reaching the maximum test length. Item-selection methods are provided that maximize the determinant of the information matrix at the cutoff point or at the projected ability estimate. A simulation study illustrates the efficiency and effectiveness of the classification method. Simulations were run with the new item-selection methods, random item selection, and maximization of the determinant of the information matrix at the ability estimate. The study also showed that the SPRT with multidimensional IRT has the same characteristics as the SPRT with unidimensional IRT and results in more accurate classifications than the latter when used for multidimensional data.
VL - 40
UR - http://apm.sagepub.com/content/40/6/387.abstract
ER -

TY - JOUR
T1 - Item Selection Methods Based on Multiple Objective Approaches for Classifying Respondents Into Multiple Levels
JF - Applied Psychological Measurement
Y1 - 2014
A1 - van Groen, Maaike M.
A1 - Eggen, Theo J. H. M.
A1 - Veldkamp, Bernard P.
AB - Computerized classification tests classify examinees into two or more levels while maximizing accuracy and minimizing test length. The majority of currently available item selection methods maximize information at one point on the ability scale, but in a test with multiple cutting points selection methods could take all these points simultaneously into account. If for each cutting point one objective is specified, the objectives can be combined into one optimization function using multiple objective approaches. Simulation studies were used to compare the efficiency and accuracy of eight selection methods in a test based on the sequential probability ratio test. Small differences were found in accuracy and efficiency between different methods depending on the item pool and settings of the classification method. The size of the indifference region had little influence on accuracy but considerable influence on efficiency. Content and exposure control had little influence on accuracy and efficiency.
VL - 38
UR - http://apm.sagepub.com/content/38/3/187.abstract
ER -

TY - JOUR
T1 - Uncertainties in the Item Parameter Estimates and Robust Automated Test Assembly
JF - Applied Psychological Measurement
Y1 - 2013
A1 - Veldkamp, Bernard P.
A1 - Matteucci, Mariagiulia
A1 - de Jong, Martijn G.
AB - Item response theory parameters have to be estimated, and because of the estimation process, they do have uncertainty in them. In most large-scale testing programs, the parameters are stored in item banks, and automated test assembly algorithms are applied to assemble operational test forms. These algorithms treat item parameters as fixed values, and uncertainty is not taken into account. As a consequence, resulting tests might be off target or less informative than expected. In this article, the process of parameter estimation is described to provide insight into the causes of uncertainty in the item parameters. The consequences of uncertainty are studied. In addition, an alternative automated test assembly algorithm is presented that is robust against uncertainties in the data. Several numerical examples demonstrate the performance of the robust test assembly algorithm and illustrate the consequences of not taking this uncertainty into account. Finally, some recommendations about the use of robust test assembly and some directions for further research are given.
VL - 37
UR - http://apm.sagepub.com/content/37/2/123.abstract
ER -

TY - JOUR
T1 - Multiple Maximum Exposure Rates in Computerized Adaptive Testing
JF - Applied Psychological Measurement
Y1 - 2009
A1 - Barrada, Juan Ramón
A1 - Veldkamp, Bernard P.
A1 - Olea, Julio
AB - Computerized adaptive testing is subject to security problems, as the item bank content remains operative over long periods and administration time is flexible for examinees. Spreading the content of a part of the item bank could lead to an overestimation of the examinees' trait level. The most common way of reducing this risk is to impose a maximum exposure rate (rmax) that no item should exceed. Several methods have been proposed with this aim. All of these methods establish a single value of rmax throughout the test. This study presents a new method, the multiple-rmax method, that defines as many values of rmax as the number of items presented in the test. In this way, it is possible to impose a high degree of randomness in item selection at the beginning of the test, leaving the administration of items with the best psychometric properties to the moment when the trait level estimation is most accurate. The implementation of the multiple-rmax method is described and is tested in simulated item banks and in an operative bank. Compared with a single maximum exposure method, the new method has a more balanced usage of the item bank and delays the possible distortion of trait estimation due to security problems, with either no or only slight decrements of measurement accuracy.
VL - 33
UR - http://apm.sagepub.com/content/33/1/58.abstract
ER -

TY - JOUR
T1 - Implementing Sympson-Hetter Item-Exposure Control in a Shadow-Test Approach to Constrained Adaptive Testing
JF - International Journal of Testing
Y1 - 2008
A1 - Veldkamp, Bernard P.
A1 - van der Linden, Wim J.
VL - 8
UR - http://www.tandfonline.com/doi/abs/10.1080/15305050802262233
ER -

TY - JOUR
T1 - Conditional Item-Exposure Control in Adaptive Testing Using Item-Ineligibility Probabilities
JF - Journal of Educational and Behavioral Statistics
Y1 - 2007
A1 - van der Linden, Wim J.
A1 - Veldkamp, Bernard P.
AB - Two conditional versions of the exposure-control method with item-ineligibility constraints for adaptive testing in van der Linden and Veldkamp (2004) are presented. The first version is for unconstrained item selection, the second for item selection with content constraints imposed by the shadow-test approach. In both versions, the exposure rates of the items are controlled using probabilities of item ineligibility given θ that adapt the exposure rates automatically to a goal value for the items in the pool. In an extensive empirical study with an adaptive version of the Law School Admission Test, the authors show how the method can be used to drive conditional exposure rates below goal values as low as 0.025. Obviously, the price to be paid for minimal exposure rates is a decrease in the accuracy of the ability estimates. This trend is illustrated with empirical data.
VL - 32
UR - http://jeb.sagepub.com/cgi/content/abstract/32/4/398
ER -

TY - JOUR
T1 - Optimal Testlet Pool Assembly for Multistage Testing Designs
JF - Applied Psychological Measurement
Y1 - 2006
A1 - Ariel, Adelaide
A1 - Veldkamp, Bernard P.
A1 - Breithaupt, Krista
AB - Computerized multistage testing (MST) designs require sets of test questions (testlets) to be assembled to meet strict, often competing criteria. Rules that govern testlet assembly may dictate the number of questions on a particular subject or may describe desirable statistical properties for the test, such as measurement precision. In an MST design, testlets of differing difficulty levels must be created. Statistical properties for assembly of the testlets can be expressed using item response theory (IRT) parameters. The testlet test information function (TIF) value can be maximized at a specific point on the IRT ability scale. In practical MST designs, parallel versions of testlets are needed, so sets of testlets with equivalent properties are built according to equivalent specifications. In this project, the authors study the use of a mathematical programming technique to simultaneously assemble testlets to ensure equivalence and fairness to candidates who may be administered different testlets.
VL - 30
UR - http://apm.sagepub.com/content/30/3/204.abstract
ER -

TY - JOUR
T1 - Automated Simultaneous Assembly for Multistage Testing
JF - International Journal of Testing
Y1 - 2005
A1 - Breithaupt, Krista
A1 - Ariel, Adelaide
A1 - Veldkamp, Bernard P.
VL - 5
UR - http://www.tandfonline.com/doi/abs/10.1207/s15327574ijt0503_8
ER -

TY - JOUR
T1 - Constraining Item Exposure in Computerized Adaptive Testing With Shadow Tests
JF - Journal of Educational and Behavioral Statistics
Y1 - 2004
A1 - van der Linden, Wim J.
A1 - Veldkamp, Bernard P.
AB - Item-exposure control in computerized adaptive testing is implemented by imposing item-ineligibility constraints on the assembly process of the shadow tests. The method resembles Sympson and Hetter's (1985) method of item-exposure control in that the decisions to impose the constraints are probabilistic. The method does not, however, require time-consuming simulation studies to set values for control parameters before the operational use of the test. Instead, it can set the probabilities of item ineligibility adaptively during the test using the actual item-exposure rates. An empirical study using an item pool from the Law School Admission Test showed that application of the method yielded perfect control of the item-exposure rates and had negligible impact on the bias and mean-squared error functions of the ability estimator.
VL - 29
UR - http://jeb.sagepub.com/cgi/content/abstract/29/3/273
ER -