Erbay Mermer, Ş. Comparison Of Item Selection and Ability Estimation Methods In Computerized Adaptive Testing: A Simulation-Based Study | Academic Journal of Education and Social Sciences

This is an outdated version published on 2025-07-07.

Published: 2025-07-07

Updated: 2025-07-07

Versions:

2025-07-07 (3)

2025-07-07 (2)

2025-07-07 (1)

DOI: https://doi.org/10.5281/zenodo.15829038

Keywords:

Computerized adaptive testing, item selection methods, ability estimation

Şeyma ERBAY MERMER

BİLECİK ŞEYH EDEBALİ ÜNİVERSİTESİ

https://orcid.org/0000-0002-7747-9545

Abstract

This study used simulation to compare the performance of different item selection and ability estimation methods in Computerized Adaptive Testing. Data were simulated under the three-parameter logistic model, the simulations were run in the SimulCAT software, and estimation errors were calculated for the item selection and ability estimation methods. Maximum Likelihood Estimation and Bayesian estimation were compared as ability estimation methods; for each, the accuracy of the ability estimates and the average standard errors were calculated with Maximum Fisher Information and Maximum Likelihood Weighted Information as item selection methods. The interim theta values obtained during the test were also examined, and the method that yielded better results was recorded. A fixed-length rule was used as the test termination criterion: the test ended for each individual after 20 items. The simulation involved a total of 1,000 individuals and an item pool of 500 items. The analyses used the average of the results obtained from 25 replications for each method. According to the results, under Maximum Likelihood Estimation, Maximum Likelihood Weighted Information yielded the most accurate ability estimates with the least error; under Bayesian estimation, the Maximum Fisher Information criterion provided the most accurate estimates. The interim ability estimates showed the same pattern: Maximum Likelihood Weighted Information performed better under Maximum Likelihood Estimation, and Maximum Fisher Information performed better under Bayesian estimation.
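
The design described in the abstract can be illustrated with a minimal sketch of one condition: a fixed-length, 20-item adaptive test under the three-parameter logistic model, with Maximum Fisher Information item selection and a Bayesian ability estimate. This is not the SimulCAT implementation and does not reproduce the study's results. The item parameter distributions, the standard normal prior, the use of the posterior mean (EAP) as the Bayesian estimator, the omission of the scaling constant D, and all function names are assumptions made for illustration.

```python
import math
import random

def p_3pl(theta, a, b, c):
    """Probability of a correct response under the three-parameter logistic model."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def fisher_info(theta, a, b, c):
    """Fisher information of a 3PL item at ability theta."""
    p = p_3pl(theta, a, b, c)
    return a * a * ((1.0 - p) / p) * ((p - c) / (1.0 - c)) ** 2

# Quadrature points for theta on [-4, 4] (assumed grid, step 0.1).
GRID = [-4.0 + 0.1 * k for k in range(81)]

def eap_estimate(responses, items):
    """Posterior mean and posterior SD of theta with an assumed N(0, 1) prior."""
    post = []
    for t in GRID:
        w = math.exp(-0.5 * t * t)  # unnormalised standard normal prior
        for u, (a, b, c) in zip(responses, items):
            p = p_3pl(t, a, b, c)
            w *= p if u == 1 else 1.0 - p
        post.append(w)
    total = sum(post)
    mean = sum(t * w for t, w in zip(GRID, post)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(GRID, post)) / total
    return mean, math.sqrt(var)

def run_cat(true_theta, pool, test_length=20, rng=random):
    """Fixed-length CAT: Maximum Fisher Information selection, EAP scoring."""
    theta, se = 0.0, 1.0  # start at the prior mean
    used, items, responses = set(), [], []
    for _ in range(test_length):
        # Pick the unused item that is most informative at the interim theta.
        best = max((i for i in range(len(pool)) if i not in used),
                   key=lambda i: fisher_info(theta, *pool[i]))
        used.add(best)
        items.append(pool[best])
        # Simulate the response from the true ability.
        correct = rng.random() < p_3pl(true_theta, *pool[best])
        responses.append(1 if correct else 0)
        theta, se = eap_estimate(responses, items)  # interim estimate
    return theta, se

def make_pool(n_items=500, rng=random):
    """Hypothetical item pool of (a, b, c) triples; distributions are assumptions."""
    return [(rng.uniform(0.5, 2.0), rng.gauss(0.0, 1.0), rng.uniform(0.05, 0.25))
            for _ in range(n_items)]

if __name__ == "__main__":
    rng = random.Random(2025)
    pool = make_pool(500, rng)
    thetas = [rng.gauss(0.0, 1.0) for _ in range(50)]  # small demo sample
    results = [run_cat(t, pool, 20, rng) for t in thetas]
    n = len(thetas)
    bias = sum(est - t for (est, _), t in zip(results, thetas)) / n
    rmse = math.sqrt(sum((est - t) ** 2 for (est, _), t in zip(results, thetas)) / n)
    mean_se = sum(se for _, se in results) / n
    print(f"bias={bias:.3f} rmse={rmse:.3f} mean_se={mean_se:.3f}")
```

Bias, root mean square error, and the mean standard error are the kinds of summary statistics such a comparison would average over replications. Swapping in a different selection criterion or a maximum likelihood estimator would only require replacing `fisher_info` or `eap_estimate`.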

Issue

Vol. 3 No. 1 (2026): Volume 3, Issue 1, 2026 (Online First)

Section

Research Article

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

All published content is protected under the CC BY-NC-ND 4.0 license.

How to Cite

Erbay Mermer, Ş. Comparison Of Item Selection and Ability Estimation Methods In Computerized Adaptive Testing: A Simulation-Based Study. (2025). Academic Journal of Education and Social Sciences, 3(1), 1-12. https://doi.org/10.5281/zenodo.15829038
