|Title||A Comparison of Three Empirical Reliability Estimates for Computerized Adaptive Testing|
|Publication Type||Conference Paper|
|Year of Publication||2017|
|Conference Name||IACAT 2017 conference|
|Publisher||Niigata Seiryo University|
|Conference Location||Niigata, Japan|
Reliability estimates in Computerized Adaptive Testing (CAT) are derived from estimated thetas and standard error of estimated thetas. In practical, the observed standard error (OSE) of estimated thetas can be estimated by test information function for each examinee with respect to Item response theory (IRT). Unlike classical test theory (CTT), OSEs in IRT are conditional values given each estimated thetas so that those values should be marginalized to consider test reliability. Arithmetic mean, Harmonic mean, and Jensen equality were applied to marginalize OSEs to estimate CAT reliability. Based on different marginalization method, three empirical CAT reliabilities were compared with true reliabilities. Results showed that three empirical CAT reliabilities were underestimated compared to true reliability in short test length (< 40), whereas the magnitude of CAT reliabilities was followed by Jensen equality, Harmonic mean, and Arithmetic mean in long test length (> 40). Specifically, Jensen equality overestimated true reliability across all conditions in long test length (>50).