Design of a Computer-Adaptive Test to Measure English Literacy and Numeracy in the Singapore Workforce: Considerations, Benefits, and Implications
A computer-adaptive test (CAT) is a delivery methodology that serves the larger goals of the assessment system in which it is embedded. A thorough analysis of the assessment system for which a CAT is being designed is critical to ensure that the delivery platform is appropriate and addresses all relevant complexities. A CAT engine must therefore be designed to support the validity and reliability of the overall system, which in practice means adhering to the goals and objectives of the assessment program. When an assessment is adapted for use in another country, consideration must be given to any necessary revisions, including content differences. This article addresses these considerations, drawing in part on the process followed in developing the CAT delivery system designed to test English-language workplace skills for the Singapore Workforce Development Agency. Topics include item creation and selection, calibration of the item pool, analysis and testing of psychometric properties, and reporting and interpretation of scores. The characteristics and benefits of the CAT delivery system are detailed, along with implications for testing programs considering the use of a CAT delivery system.
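The design steps named above, selecting items adaptively, calibrating a pool, and estimating a score, are commonly built on an item response model. As a minimal illustrative sketch (not necessarily the approach used in this particular system), the following Python pairs a Rasch-model response function with maximum-information item selection and a damped Newton-style ability update; the function names and tuning parameters (`lr`, `steps`) are assumptions for illustration only.

```python
import math

def rasch_p(theta, b):
    """Probability of a correct response under the Rasch model
    for ability theta and item difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def fisher_info(theta, b):
    """Fisher information of a Rasch item at ability theta."""
    p = rasch_p(theta, b)
    return p * (1.0 - p)

def next_item(theta, pool, administered):
    """Pick the not-yet-administered item (by index into pool, a list
    of difficulties) with maximum information at the current theta."""
    candidates = [i for i in range(len(pool)) if i not in administered]
    return max(candidates, key=lambda i: fisher_info(theta, pool[i]))

def update_theta(theta, responses, pool, lr=0.5, steps=25):
    """Damped Newton-style maximum-likelihood update of theta.
    responses: list of (item_index, score) pairs with score 0 or 1.
    Note: plain MLE diverges for all-correct or all-wrong patterns;
    operational systems handle that case (e.g. Bayesian estimation)."""
    for _ in range(steps):
        grad = sum(score - rasch_p(theta, pool[i]) for i, score in responses)
        info = sum(fisher_info(theta, pool[i]) for i, _ in responses) or 1.0
        theta += lr * grad / info
    return theta
```

In this sketch, the test loop alternates `next_item` and `update_theta`: with difficulties `[-1.0, 0.0, 1.0, 2.0]`, a current theta of 0.0, and items 0 and 1 already administered, `next_item` selects item 2 (difficulty 1.0), the most informative remaining item near theta 0.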