Assessment Class B 2014 UM: Parallel Tests & Equating: Theory, Principles, and Practice

Summarized by:

Marwa & Erlik Widiyani Styati

Parallel tests are multiple forms of the same test. The forms share the same objective and are made as similar to one another as possible in test form, content, and item analysis. They should also be equivalent in content, cognitive demand, and item format. This is an important issue in testing: multiple test forms should be designed carefully to ensure fairness to test-takers. Parallel forms should be tried out and shown to be stable before they are administered. After the parallel forms are administered and scored, the scores are made comparable through a process commonly called test equating.

Equating parallel test forms maintains fairness among test-takers who take the test at the same time or at different times, as usually happens in standardized tests (for example, the final examination held by the government). Test equating is a technical procedure conducted to establish comparable scores, with equivalent meaning, on different forms of the same test; it allows the forms to be used interchangeably (Ryan and Brockmann, 2008). It is the practice of placing two or more tests on the same scale and satisfying other requirements so that the scores can be used interchangeably, as having the same meaning. Moreover, equating is successful to the extent that the form taken is a matter of indifference to each examinee (Kolen and Brennan, 2004). The aim of equating is to determine whether a difference between two groups' test scores is caused by a difference in proficiency or by a difference in the average difficulty of the forms. Equating therefore lets us determine to what extent one form is harder or one group is more able, to what extent a difference in performance arises because one group is better prepared than the other, and how students' ability compares across the two groups. It also allows the scores from both forms to be used interchangeably. The goals of score equating are validity across forms and years, fairness, test security, and greater continuity in programs that release items or require ongoing development.

The methods of equating draw on item response theory (IRT) and classical test theory (CTT). The use of IRT is now widespread, and almost all statewide programs employ it along with CTT. Kolen and Brennan (2004) state that IRT has numerous implications for equating and test construction, and it is popular among statewide assessment and accountability programs. This popularity may result from its ability to handle certain situations in which CTT is limited. The fundamental model of CTT is that an observed raw score is composed of two components: the “true” score and the “error”, which arises from the items or test, from idiosyncrasies of the particular testing setting, or from variation in the student’s ability to perform. Item response theory refers to a large collection of technical procedures for analyzing test items and scaling students based on their item responses. A student’s ability estimate takes into account both the characteristics of the items the student took and the student’s responses to those items. That is, the estimate reflects not only the raw score but also the particular characteristics of the items the student answered correctly. The IRT models are the 1-Parameter Logistic Model (sometimes denoted as “1PL” or “the Rasch Model”), the 2-Parameter Logistic Model (sometimes denoted as “2PL”), and the 3-Parameter Logistic Model (sometimes denoted as “3PL”).
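
As an illustration of how the three logistic models relate, here is a minimal Python sketch. It assumes the usual textbook form of the item response function, which this summary does not spell out: a is item discrimination, b is item difficulty, c is the guessing parameter, and theta is student ability. The function name and the example values are made up for illustration, and the optional scaling constant (about 1.7) that some texts include is left out.

```python
import math

def p_correct(theta, b, a=1.0, c=0.0):
    """Probability that a student with ability theta answers an item correctly.

    Assumed textbook 3PL form: c + (1 - c) / (1 + exp(-a * (theta - b))).
    With c = 0 it reduces to the 2PL; with c = 0 and a = 1 it reduces to
    the 1PL (Rasch) form.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical items, all with difficulty b = 0.
p_1pl = p_correct(theta=0.0, b=0.0)                # ability equals difficulty
p_2pl = p_correct(theta=1.0, b=0.0, a=2.0)         # a more discriminating item
p_3pl = p_correct(theta=0.0, b=0.0, a=1.0, c=0.2)  # guessing raises the floor

print(round(p_1pl, 3), round(p_2pl, 3), round(p_3pl, 3))
```

When ability equals difficulty, the 1PL probability is 0.5; a guessing parameter of 0.2 lifts the same item to 0.6, because even a student who does not know the answer has some chance of choosing it.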

In practice, equating two test forms requires actual scores as a starting point. The scores (data) used to perform linking and equating calculations are collected according to established principles known as data collection designs or equating designs. The designs are (1) the Equivalent Groups (Random Groups) Design, which is used in many large-scale assessment programs; (2) the Single Group Design, which provides the conceptual basis for the other designs; (3) the Single Group Design with Counterbalancing; and (4) the Anchor Test Design, which uses concepts and practices that are quite common in many statewide testing programs.
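
To make the idea of placing scores on the same scale concrete, here is a minimal Python sketch of linear equating, one common CTT-based method, under the Equivalent Groups Design. It is an assumption-laden illustration rather than the procedure described in the sources above: the score lists are made up, the function name is hypothetical, and the two groups are assumed to be randomly equivalent so that any difference in their score distributions is attributed to form difficulty.

```python
from statistics import mean, pstdev

def linear_equate(x, scores_x, scores_y):
    """Map a Form X score onto the Form Y scale.

    Linear equating matches standardized scores on the two forms:
    (x - mean_x) / sd_x = (y - mean_y) / sd_y, solved for y.
    """
    mean_x, mean_y = mean(scores_x), mean(scores_y)
    sd_x, sd_y = pstdev(scores_x), pstdev(scores_y)
    return mean_y + (sd_y / sd_x) * (x - mean_x)

# Made-up raw scores from two randomly equivalent groups.
form_x_scores = [10, 12, 14, 16, 18]  # group that took Form X
form_y_scores = [13, 16, 19, 22, 25]  # group that took Form Y

equated_mean = linear_equate(14, form_x_scores, form_y_scores)
equated_16 = linear_equate(16, form_x_scores, form_y_scores)
print(round(equated_mean, 2), round(equated_16, 2))
```

In this made-up data, Form X yields lower and less spread-out scores, so a raw score of 16 on Form X is treated as comparable to a raw score of about 22 on Form Y.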

To sum up, parallel tests are multiple forms of the same test that share the same objective and are as similar to one another as possible in test form, content, and item analysis. They should also be equivalent in content, cognitive demand, and item format. Test equating is a technical procedure conducted to establish comparable scores, with equivalent meaning, on different forms of the same test; it allows the forms to be used interchangeably. So, parallel tests are test forms that are administered, scored, and then equated, and the equating draws on two approaches, namely IRT and CTT.

References

Kolen, M. J., & Brennan, R. L. 2004. Test equating, scaling, and linking: Methods and practices, 2nd ed. New York, NY: Springer.

Ryan, J., & Brockmann. 2008. A Practitioner’s Introduction to Equating with Primers on Classical Test Theory and Item Response Theory. UK: The Council of Chief State School Officers.

Tuesday, 24 March 2015
