Introduction to Item Response Theory
I.G.A. Lokita
Purnamika Utami
Rina Sari
Item analysis provides a way of measuring the quality of questions – seeing how appropriate they were for the respondents and how well they measured their ability or trait. It also provides a way of re-using items over and over again in different tests with prior knowledge of how they are going to perform, creating a population of questions with known properties (e.g. a test bank).
Classical Test Theory
Classical Test Theory (CTT) analyses are the easiest and most widely used form of analysis. The statistics can be computed with readily available statistical packages (or even by hand). Classical analyses are performed on the test as a whole rather than on the item; although item statistics can be generated, they apply only to that group of students on that collection of items.
CTT is based on the true score model, X = T + E: an observed score (X) is the sum of a true score (T) and an error term (E). In CTT we assume that the error:
- Is normally distributed
- Is uncorrelated with the true score
- Has a mean of zero
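These assumptions can be checked with a quick simulation (a minimal sketch using only the Python standard library; the score scale, standard deviations, and sample size are illustrative, not from the text):

```python
import random

random.seed(42)

# Simulate the CTT true score model X = T + E and check the
# assumptions about the error term E.
n = 100_000
T = [random.gauss(50, 10) for _ in range(n)]   # true scores (illustrative scale)
E = [random.gauss(0, 5) for _ in range(n)]     # error: normal, mean 0
X = [t + e for t, e in zip(T, E)]              # observed scores

mean_E = sum(E) / n                            # should be close to 0

# The sample correlation between T and E should also be close to 0.
mT, mE = sum(T) / n, mean_E
cov = sum((t - mT) * (e - mE) for t, e in zip(T, E)) / n
sdT = (sum((t - mT) ** 2 for t in T) / n) ** 0.5
sdE = (sum((e - mE) ** 2 for e in E) / n) ** 0.5
corr = cov / (sdT * sdE)

print(round(mean_E, 2), round(corr, 3))
```

With a large sample, both the mean error and the T–E correlation come out near zero, as the model assumes.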
Classical Test Theory vs. Latent Trait Models
Classical analysis has the test (not the item) as its basis. Although the statistics generated are often generalised to similar students taking a similar test, they only really apply to those students taking that test. Latent trait models aim to look beyond that at the underlying traits which produce the test performance. They are measured at item level and provide sample-free measurement.
Latent Trait Models
Latent trait models have been around since the 1940s, but were not widely used until the 1960s. Although theoretically possible, it is practically unfeasible to use them without specialized software. They aim to measure the underlying ability (or trait) which produces the test performance rather than measuring performance per se. This makes them sample-free: because the statistics are not dependent on the test situation which generated them, they can be used more flexibly.
Item Response Theory
Item Response Theory (IRT) refers to a family of latent trait models used to establish the psychometric properties of items and scales. It is sometimes referred to as modern psychometrics because, in large-scale educational assessment, testing programs, and professional testing firms, IRT has almost completely replaced CTT as the method of choice. IRT has many advantages over CTT that have brought it into more frequent use.
Three Basic Components of IRT
- Item Response Function (IRF) – a mathematical function that relates the latent trait to the probability of endorsing an item.
- Item Information Function (IIF) – an indication of item quality: an item's ability to differentiate among respondents.
- Invariance – position on the latent trait can be estimated from any items with known IRFs, and item characteristics are population independent within a linear transformation.
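The text does not give a formula for the item information function; for the 2PL model (introduced later) it takes the standard form I(θ) = a² · P(θ) · (1 − P(θ)), which peaks at the item's location b. A minimal sketch (parameter values are illustrative):

```python
import math

def irf_2pl(theta, a, b):
    """2PL item response function: probability of endorsing the item."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """2PL item information: I(theta) = a^2 * P * (1 - P).
    Information peaks at theta = b, where P = 0.5."""
    p = irf_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

# Information is highest at the item's location b...
print(item_information(1.0, a=2.0, b=1.0))   # peak: 2^2 * 0.25 = 1.0
# ...and falls off for respondents far from b.
print(item_information(3.0, a=2.0, b=1.0))
```

This is why a well-targeted test places item locations near the trait levels of the respondents it is meant to measure.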
IRT - Item Response Function
Item Response Function (IRF) - characterizes the relation between a latent variable (i.e., individual differences on a construct) and the probability of endorsing an item. The IRF models the relationship between examinee trait level, item properties, and the probability of endorsing the item. Examinee trait level is signified by the Greek letter theta (θ) and is typically scaled to have a mean of 0 and a standard deviation of 1.
IRF – Item Parameters: Location (b)
An item’s location is defined as the amount of the latent trait needed to have a .5 probability of endorsing the item. The higher the “b” parameter, the higher on the trait a respondent needs to be in order to endorse the item. It is analogous to difficulty in CTT. Like z scores, the values of b typically range from -3 to +3.
IRF – Item Parameters: Discrimination (a)
Discrimination indicates the steepness of the IRF at the item’s location. An item’s discrimination indicates how strongly the item is related to the latent trait, like loadings in a factor analysis. Items with high discriminations are better at differentiating respondents around the location point: small changes in the latent trait lead to large changes in probability. The reverse holds for items with low discriminations.
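To make the steepness contrast concrete, here is a minimal sketch using the 2PL IRF (the parameter values are illustrative, not from the text):

```python
import math

def irf_2pl(theta, a, b):
    """2PL item response function."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

b = 0.0
# The same small step in theta around the item's location (b - 0.5 to b + 0.5)
# produces a much larger change in probability for the high-discrimination item.
low_a  = irf_2pl(0.5, a=0.5, b=b) - irf_2pl(-0.5, a=0.5, b=b)
high_a = irf_2pl(0.5, a=2.0, b=b) - irf_2pl(-0.5, a=2.0, b=b)

print(round(low_a, 3), round(high_a, 3))   # 0.124 0.462
```

The item with a = 2.0 separates respondents just below and just above its location far more sharply than the item with a = 0.5.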
IRF – Item Parameters: Guessing (c)
The inclusion of a “c” parameter suggests that respondents very low on the trait may still choose the correct answer. In other words, respondents with low trait levels may still have a small probability of endorsing an item. This is mostly used with multiple-choice testing, and the value should not vary excessively from the reciprocal of the number of choices (e.g. c ≈ .25 for a four-option item).
IRF – Item Parameters: Upper Asymptote (d)
The inclusion of a “d” parameter suggests that respondents very high on the latent trait are not guaranteed to endorse the item (i.e. they have a probability of less than 1). This is often the case for an item that is difficult to endorse (e.g. suicidal ideation as an indicator of depression).
IRT - Item Response Function
The 4-parameter logistic model:

P(θ) = c + (d − c) / (1 + e^(−a(θ − b)))

where
- θ represents examinee trait level
- b is the item difficulty that determines the location of the IRF
- a is the item’s discrimination that determines the steepness of the IRF
- c is a lower asymptote parameter for the IRF
- d is an upper asymptote parameter for the IRF
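As a sketch, the 4PL model can be written directly in code (the parameter values below are illustrative, not from the text):

```python
import math

def irf_4pl(theta, a, b, c, d):
    """4-parameter logistic IRF:
    P(theta) = c + (d - c) / (1 + exp(-a * (theta - b))).
    c is the lower asymptote (guessing), d the upper asymptote."""
    return c + (d - c) / (1.0 + math.exp(-a * (theta - b)))

# At theta = b the probability is exactly halfway between the
# asymptotes: (c + d) / 2.
p = irf_4pl(theta=1.0, a=1.5, b=1.0, c=0.2, d=0.95)
print(round(p, 3))   # 0.575 = (0.2 + 0.95) / 2
```

Note that with c > 0, the location b no longer marks the point of .5 probability; it marks the midpoint between the two asymptotes.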
The 3-parameter logistic model
· If the upper asymptote parameter is set to 1.0, then the model is termed a 3PL.
· In this model, individuals at low trait levels have a non-zero probability of endorsing the item.
The 2-parameter logistic model
· If in addition the lower asymptote parameter is constrained to zero, then the model is termed a 2PL.
· In the 2PL, IRFs vary both in their discrimination and difficulty (i.e., location) parameters.
The 1-parameter logistic model
· If the item discrimination is set to 1.0 (or any constant), the result is a 1PL.
· A 1PL assumes that all scale items relate to the latent trait equally, and items vary only in difficulty (equivalent to having equal factor loadings across items).
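This nesting can be sketched by constraining a single 4PL function (a minimal sketch; the parameter values are illustrative, not from the text):

```python
import math

def irf_4pl(theta, a, b, c=0.0, d=1.0):
    """4PL IRF; the 3PL, 2PL, and 1PL are obtained by constraining parameters."""
    return c + (d - c) / (1.0 + math.exp(-a * (theta - b)))

theta, b = 0.7, 0.3

p3 = irf_4pl(theta, a=1.2, b=b, c=0.25)   # 3PL: d constrained to 1
p2 = irf_4pl(theta, a=1.2, b=b)           # 2PL: c also constrained to 0
p1 = irf_4pl(theta, a=1.0, b=b)           # 1PL: a fixed to a constant (1.0)

# The guessing floor lifts the 3PL curve above the 2PL everywhere.
print(p3 > p2)   # True
```

Because the models are nested in this way, a more constrained model can be compared against a less constrained one to decide which parameters the data actually support.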
