Assessment Class B 2014 UM

A Summary on “Parallel Tests and Equating: Theories, Principles, and Practice”
By: Agus Eko Cahyono and Jumariati

In the context of language testing, parallel tests are an important issue. Multiple test forms are said to be parallel when they are as equal to one another as possible, both in test specifications (type, form, content, and purpose) and in statistical criteria (level of difficulty, discriminating power, and distractor functioning). A common example is a school program that prepares two parallel tests: one for the main achievement test and the other for test-takers who need retesting. In this case, the tests must be parallel because their function is the same, namely to assess the achievement of the test-takers. A high-stakes test such as the Ujian Nasional (the national examination) in Indonesian schools is likewise meant to be parallel across forms: its function is to assess students’ learning achievement against a certain level, even though it may be administered at different points in time throughout the country. Thus, the items differ from one administration to another, but the forms remain equivalent. This implies that multiple test forms should be assembled very carefully to ensure fairness to each test-taker and, at the same time, to maintain the security of the tests.
In practice, multiple test forms that have been developed may still not be equivalent; some differences in their statistical characteristics are often found. Equating methods are therefore needed to address this problem. Equating parallel tests is an important issue in standardized testing because it maintains fairness among test-takers, whether they take the test at the same time or at different points in time. Kolen and Brennan (2004) describe several requirements that need to be met for test forms to be equated. First is the equal construct requirement: the tests to be equated must measure the same construct. If the tests’ constructs are different, they cannot be equated. Second is the equal reliability requirement: the forms should measure the construct with the same reliability. Third is the symmetry requirement: the equating transformation must be symmetrical, so that the function converting scores on form X to the scale of form Y is the inverse of the function converting form Y to form X. Fourth is the equity requirement: it must be a matter of indifference to each test-taker whether test form X or test form Y is administered. Finally, the population invariance requirement means that the equating relationship is the same regardless of the group of test-takers on which the equating is performed. These principles need to be taken into consideration whenever an equating is carried out.
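The symmetry requirement can be made concrete with a small sketch. The example below uses linear equating, which is one traditional method and is not named in the summary above; the function name and the score lists are hypothetical and serve only as an illustration. It converts a form X score to the form Y scale and then back again, which should return the original score if the transformation is symmetrical.

```python
from statistics import mean, pstdev

def linear_equate(score, from_scores, to_scores):
    """Place a score from one form onto the scale of another form.

    Linear equating matches standardized scores: a score that lies z standard
    deviations from the mean on one form maps to the score that lies z
    standard deviations from the mean on the other form.
    """
    mu_from, sd_from = mean(from_scores), pstdev(from_scores)
    mu_to, sd_to = mean(to_scores), pstdev(to_scores)
    return mu_to + (sd_to / sd_from) * (score - mu_from)

# Hypothetical raw scores from two groups, one per form (assumed data).
form_x_scores = [12, 15, 18, 20, 25]
form_y_scores = [14, 18, 21, 24, 28]

x = 20
y_equivalent = linear_equate(x, form_x_scores, form_y_scores)
x_recovered = linear_equate(y_equivalent, form_y_scores, form_x_scores)

# Symmetry: converting X to Y and then Y back to X recovers the original score.
print(y_equivalent, x_recovered)
```

A regression of Y on X would not pass this round-trip check in general, which is one reason regression is not regarded as equating.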
In the practice of equating test forms, several methods are used, such as traditional equating methods, methods based on Item Response Theory (IRT), the kernel method, and computer-based approaches such as Automated Test Assembly (ATA). Traditional equating is commonly done through a random groups design. In this design, the forms are handed out in alternation: test Model A is given to the first test-taker, Model B to the second, Model A to the third, and so forth. The results obtained by the test-takers working on Model A are then compared with those of the test-takers working on Model B. Because the alternating assignment makes the two groups comparable in ability, a difference between the groups’ scores can be attributed to the forms rather than to the test-takers. If students working on Model B obtain higher scores than those working on Model A, we can conclude that Model B is easier than Model A and thus that the two forms are not equal (parallel).
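The random groups procedure described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the test-taker labels, the scores, and the function names are all hypothetical, and a real study would use far larger groups and a statistical test rather than a bare difference in means.

```python
from statistics import mean

def spiral_assign(test_takers, forms=("A", "B")):
    """Alternate the forms down the list: first person A, second B, third A, ..."""
    return {person: forms[i % len(forms)] for i, person in enumerate(test_takers)}

def mean_difference(scores, assignment):
    """Mean score of the Model B group minus mean score of the Model A group."""
    group_a = [s for person, s in scores.items() if assignment[person] == "A"]
    group_b = [s for person, s in scores.items() if assignment[person] == "B"]
    return mean(group_b) - mean(group_a)

# Hypothetical test-takers and raw scores (assumed data).
takers = ["t1", "t2", "t3", "t4", "t5", "t6"]
assignment = spiral_assign(takers)
scores = {"t1": 20, "t2": 24, "t3": 18, "t4": 23, "t5": 22, "t6": 25}

diff = mean_difference(scores, assignment)
# A positive difference suggests that Model B is the easier form.
print(assignment, diff)
```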
The use of computer technology in the form of ATA to assemble multiple test forms has recently been preferred by test assemblers because of its fast processing and the availability of large item pools (Lin, 2008). With the development of ATA, pre-equated parallel test forms can be achieved more efficiently. The computer software processes the test criteria that have been laid out in the test blueprint. These criteria fall into two groups: psychometric and non-psychometric attributes. Non-psychometric attributes include test content, test format, test length, item usage frequency, and item exclusion. Psychometric attributes, meanwhile, deal with classical item statistics, IRT-based item parameter estimates, item response functions, or item information functions.
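To show how blueprint criteria can drive assembly, the sketch below builds two forms from a tiny item pool. It is a toy heuristic and not the procedure studied by Lin (2008) or used by any real ATA software, which typically relies on optimization techniques. The item pool, the field names `id`, `area`, and `p` (classical proportion correct), and the blueprint are all hypothetical. Content coverage and single use of each item stand in for the non-psychometric attributes; mean difficulty stands in for the psychometric ones.

```python
from collections import defaultdict
from statistics import mean

def assemble_parallel_forms(pool, blueprint):
    """Build two forms with equal content coverage and similar mean difficulty.

    Within each content area the items are sorted by difficulty and dealt out
    in the order A, B, B, A, so that neither form collects all the easy items.
    Each item is used at most once.
    """
    by_area = defaultdict(list)
    for item in pool:
        by_area[item["area"]].append(item)

    form_a, form_b = [], []
    for area, per_form in blueprint.items():
        items = sorted(by_area[area], key=lambda it: it["p"])
        if len(items) < 2 * per_form:
            raise ValueError(f"not enough items in content area {area!r}")
        for k, item in enumerate(items[: 2 * per_form]):
            (form_a if k % 4 in (0, 3) else form_b).append(item)
    return form_a, form_b

# Hypothetical item pool: "p" is the proportion of test-takers answering correctly.
pool = [
    {"id": "g1", "area": "grammar", "p": 0.40},
    {"id": "g2", "area": "grammar", "p": 0.50},
    {"id": "g3", "area": "grammar", "p": 0.60},
    {"id": "g4", "area": "grammar", "p": 0.70},
    {"id": "r1", "area": "reading", "p": 0.30},
    {"id": "r2", "area": "reading", "p": 0.45},
    {"id": "r3", "area": "reading", "p": 0.55},
    {"id": "r4", "area": "reading", "p": 0.80},
]
blueprint = {"grammar": 2, "reading": 2}  # items per content area, per form

form_a, form_b = assemble_parallel_forms(pool, blueprint)
mean_a = mean(item["p"] for item in form_a)
mean_b = mean(item["p"] for item in form_b)
print([item["id"] for item in form_a], mean_a)
print([item["id"] for item in form_b], mean_b)
```

The two forms end up with the same number of items per content area and close mean difficulty, but matching means alone does not make forms parallel; discrimination and the other statistics listed above would also need to be checked.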
In conclusion, equating multiple test forms is a crucial means of ensuring the equality of the tests; it helps test designers guarantee the fairness of the test to each test-taker and the security of the test forms.

References:
Kolen, M. J., & Brennan, R. L. 2004. Test Equating, Scaling, and Linking: Methods and
Practices (2nd ed.). New York: Springer-Verlag.

Lin, C.-J. 2008. Comparisons between Classical Test Theory and Item Response Theory
in Automated Assembly of Parallel Test Forms. Journal of Technology, Learning, and
Assessment, 6(8). Retrieved February 10, 2015, from http://www.jtla.org


Tuesday, 17 February 2015
