A Summary of the Requirements, Assumptions, and Estimation of Parameters as well
as Instrument Development Based on IRT
By: Agus Eko Cahyono and Jumariati
In IRT-based test development, we need
to consider the requirements, assumptions, and estimation of the parameters of
the test items. Large samples of examinees are required to accurately estimate
the IRT item parameters, and longer tests provide more accurate estimates of θ
(the examinee's latent trait, or ability). To
a lesser extent, increasing the test length can also improve the accuracy of
the item parameter estimation. This results from either improved estimation of
the θs or improved estimation of the shape of the θ distribution. In addition, increasing
the number of examinees can somewhat improve the estimation of θ through
improved estimation of the item parameters.
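The link between test length and the precision of θ can be illustrated with a
minimal sketch. It assumes a two-parameter logistic (2PL) model, which the
summary does not specify, and the item parameters are hypothetical values
chosen only for illustration. Under that assumption, the standard error of θ is
approximately one over the square root of the test information, so adding items
shrinks it.

```python
import math

def p_correct(theta, a, b):
    """Probability of a correct response under an assumed 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def standard_error(theta, items):
    """Approximate standard error of theta: 1 / sqrt(test information)."""
    info = 0.0
    for a, b in items:
        p = p_correct(theta, a, b)
        info += a * a * p * (1.0 - p)  # 2PL item information
    return 1.0 / math.sqrt(info)

# Hypothetical items as (discrimination a, difficulty b) pairs.
short_test = [(1.0, -1.0), (1.0, 0.0), (1.0, 1.0)]
# Repeat the same items four times, purely to make the test longer.
long_test = short_test * 4

se_short = standard_error(0.0, short_test)
se_long = standard_error(0.0, long_test)
print(f"SE with {len(short_test)} items: {se_short:.3f}")
print(f"SE with {len(long_test)} items: {se_long:.3f}")
```

Because the longer test here has four times the information, its standard error
is half that of the short test. Real items differ in how much information they
add, so the gain from lengthening an actual test will vary.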
There are some assumptions underlying
IRT models and the estimation of their parameters. The first is
unidimensionality: a test is
unidimensional if it consists of items that tap into only one dimension.
Whenever only a single score is reported for a test, there is an implicit
assumption that the items share a common primary construct. Unidimensionality
means that the model has a single θ for each examinee, and any other factors
affecting the item response are treated as random error or nuisance dimensions
unique to that item and not shared by other items. One simple method of testing
unidimensionality is based on the eigenvalues (roots) of the inter-item
correlation matrix: a first eigenvalue that is much larger than the remaining
ones suggests a single dominant dimension. Another assumption of IRT is local independence. If the
item responses are not locally independent under a unidimensional model,
another dimension must be causing the dependence. With tests of local
independence, however, the focus is on dependencies among pairs of items. These
dependencies might not emerge as separate dimensions, unless they influenced a
larger group of items, and thus might not be detectable by tests of unidimensionality.
Consequently, separate procedures have been developed to detect local
dependencies.
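The eigenvalue check for unidimensionality described in the paragraph on
assumptions can be sketched as follows. This is only an illustration under
stated assumptions: the responses are simulated from a hypothetical 2PL model
with one θ per examinee, the item parameters are invented, and the eigenvalues
are found with a simple power iteration written in plain Python. In practice a
linear algebra routine such as `numpy.linalg.eigvalsh` would normally be used
instead.

```python
import math
import random

def simulate_responses(n_examinees, items, rng):
    """Simulate 0/1 responses from an assumed unidimensional 2PL model."""
    data = []
    for _ in range(n_examinees):
        theta = rng.gauss(0.0, 1.0)
        row = []
        for a, b in items:
            p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
            row.append(1 if rng.random() < p else 0)
        data.append(row)
    return data

def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(responses):
    """Inter-item correlations; rows are examinees, columns are items."""
    columns = list(zip(*responses))
    return [[pearson(ci, cj) for cj in columns] for ci in columns]

def dominant_eigenvalue(matrix, iterations=500):
    """Power iteration: returns (largest eigenvalue, unit eigenvector)."""
    k = len(matrix)
    v = [float(i + 1) for i in range(k)]  # arbitrary non-uniform start
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    value = 0.0
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(k)) for i in range(k)]
        value = sum(v[i] * w[i] for i in range(k))  # Rayleigh quotient
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return value, v

def top_two_eigenvalues(matrix):
    """First eigenvalue, then the largest eigenvalue after deflation."""
    k = len(matrix)
    first, v = dominant_eigenvalue(matrix)
    deflated = [[matrix[i][j] - first * v[i] * v[j] for j in range(k)]
                for i in range(k)]
    second, _ = dominant_eigenvalue(deflated)
    return first, second

# Hypothetical items as (discrimination a, difficulty b) pairs.
items = [(1.5, b) for b in (-1.5, -1.0, -0.5, -0.2, 0.2, 0.5, 1.0, 1.5)]
responses = simulate_responses(1000, items, random.Random(0))
first, second = top_two_eigenvalues(correlation_matrix(responses))
print(f"first eigenvalue: {first:.2f}, second eigenvalue: {second:.2f}")
```

A first eigenvalue that is clearly larger than the second is consistent with a
single dominant dimension. How large the gap must be is a judgment call, and
this sketch does not address local dependence between pairs of items.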
Score estimation utilizes the likelihood
function, that is, the probability of the observed responses viewed as a
function of θ. The simplest case occurs when the item parameters have been estimated
in a previous sample and are used to estimate an individual examinee’s θ score
or the θ distribution of a group of examinees. This would be the case for
on-demand testing, in which examinees take the test at different times and
receive a score immediately. It would also be the case for standardized tests
in which all of the operational items have been calibrated previously in
another sample(s).
One method of estimating the item parameters is marginal maximum likelihood
(MML), which relies on a marginal
distribution. It is the distribution of one variable after marginalizing
(averaging) over the distribution of another variable. In this case, the
marginal likelihood referred to in MML is the likelihood of the item parameters
after marginalizing over θ. By marginalizing over the θ distribution, this procedure
greatly reduces the number of unknowns to be estimated.
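Both ideas in this section, scoring an examinee with known item parameters and
marginalizing the likelihood over θ, can be sketched in a few lines. This is a
minimal illustration and not a full MML calibration: it assumes a 2PL model and
a standard normal θ distribution, neither of which is stated in the summary,
and the item parameters are hypothetical. The grid search and the simple
quadrature are stand-ins chosen for readability.

```python
import math

def p_correct(theta, a, b):
    """Probability of a correct response under an assumed 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def log_likelihood(theta, responses, items):
    """Log-likelihood of a 0/1 response pattern at a given theta."""
    total = 0.0
    for u, (a, b) in zip(responses, items):
        p = p_correct(theta, a, b)
        total += math.log(p) if u == 1 else math.log(1.0 - p)
    return total

def estimate_theta(responses, items, low=-4.0, high=4.0, steps=801):
    """Maximum likelihood theta by grid search, item parameters held fixed."""
    grid = [low + (high - low) * i / (steps - 1) for i in range(steps)]
    return max(grid, key=lambda t: log_likelihood(t, responses, items))

def marginal_likelihood(responses, items, low=-6.0, high=6.0, steps=241):
    """Likelihood of a pattern averaged over an assumed N(0, 1) theta."""
    width = (high - low) / (steps - 1)
    total = 0.0
    for i in range(steps):
        theta = low + width * i
        density = math.exp(-0.5 * theta * theta) / math.sqrt(2.0 * math.pi)
        total += math.exp(log_likelihood(theta, responses, items)) * density * width
    return total

# Hypothetical, previously calibrated items as (a, b) pairs.
items = [(1.0, -1.5), (1.0, -0.5), (1.0, 0.5), (1.0, 1.5)]
pattern = [1, 1, 0, 0]

theta_hat = estimate_theta(pattern, items)
marginal = marginal_likelihood(pattern, items)
print(f"theta estimate: {theta_hat:.2f}")
print(f"marginal likelihood of the pattern: {marginal:.4f}")
```

In the marginal likelihood, θ no longer appears as an unknown for each
examinee. Only the item parameters remain, which is the reduction in unknowns
described above. An actual MML calibration would then search for the item
parameters that maximize the product of these marginal likelihoods over all
examinees.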
References:
Baker, F.B. 2001. The Basics of Item Response Theory. ERIC Clearinghouse on
Assessment and Evaluation.
DeMars, C. 2010. Item Response Theory: Understanding Statistics Measurement.
New York: Oxford University Press.