An Automatic Online Calibration Design in Adaptive Testing


  • Master Management International A/S and University of Twente
  • University of Twente


An accurately calibrated item bank is essential for a valid computerized adaptive test. However, in some settings, such as occupational testing, access to test takers for calibration is limited, which makes it difficult to collect enough data to calibrate an item bank accurately. In such settings, the item bank can instead be calibrated online in an operational setting; that is, the items are calibrated while test takers are being processed and the scores they obtain have consequences. This study explored three automatic online calibration strategies, with the intent of calibrating items accurately while estimating ability precisely and fairly. A simulation study was used to identify the optimal calibration strategy. The outcome measure was the mean absolute error of the ability estimates of the test takers participating in the calibration phase. Manipulated variables were the calibration strategy, the size of the calibration sample, the size of the item bank, and the item response model.
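The outcome measure described above, the mean absolute error (MAE) of ability estimates, can be illustrated with a minimal simulation sketch. The sketch below assumes a Rasch model and a complete (non-adaptive) response matrix for simplicity; the bank sizes, sample sizes, and the grid-based maximum-likelihood estimator are illustrative choices, not the study's actual design.

```python
import numpy as np

rng = np.random.default_rng(0)

def rasch_prob(theta, b):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + np.exp(-(theta - b)))

# Hypothetical setup: item difficulties and test-taker abilities drawn from N(0, 1).
n_items, n_takers = 30, 200
b_true = rng.normal(0.0, 1.0, n_items)       # true item difficulties
theta_true = rng.normal(0.0, 1.0, n_takers)  # true abilities

# Simulate a full response matrix (every taker answers every item).
p = rasch_prob(theta_true[:, None], b_true[None, :])
responses = (rng.random(p.shape) < p).astype(int)

def estimate_theta(resp, b, grid=np.linspace(-4.0, 4.0, 161)):
    """Maximum-likelihood ability estimate evaluated on a fixed grid."""
    pg = rasch_prob(grid[:, None], b[None, :])          # (grid, items)
    loglik = resp @ np.log(pg).T + (1 - resp) @ np.log(1 - pg).T
    return grid[np.argmax(loglik, axis=1)]              # one estimate per taker

theta_hat = estimate_theta(responses, b_true)
mae = np.mean(np.abs(theta_hat - theta_true))
print(f"MAE of ability estimates: {mae:.3f}")
```

In an online calibration study, the same MAE would be computed while item parameters are still being estimated from operational responses, so calibration error in `b` propagates into the ability estimates; the sketch uses the true difficulties only to show the baseline computation.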


Computerized Adaptive Testing, Item Bank, Item Response Theory, Online Calibration



