Tuesday, 13 January 2015

Stages of Test Development
by I.G.A Lokita Purnamika Utami and Rina Sari

Test development is best carried out by a team. It is difficult for an individual to develop a test alone, especially at the item-writing stage: a fault in an item that is obvious to others may be invisible to its writer. An item writer therefore needs certain qualities, one of which is the willingness to accept justified criticism. Other qualities of an item writer or test developer are a native or near-native command of the language, intelligence, and imagination (to create contexts for items and to foresee possible misinterpretations).
The following are stages of test construction:
1.      Stating the problem
This is the stage at which the test developer should be clear about: what kind of test it is to be, what its purpose is, what abilities are to be tested, how detailed the results need to be, how important backwash is, and what constraints are set by the unavailability of expertise, facilities, and time.

2.      Writing specifications for the test
Test specifications give information on content, test structure, timing, medium/channel, techniques to be used, criterial levels of performance, and scoring procedures.

a.      Content

Content should be specified as fully as possible. The following is a possible framework for describing the content of a test:
-          Operation: the task to be carried out.
-          Types of text: for a writing test, these may include letters and academic essays.
-          Addressees of texts: the kind of people the candidate is expected to be able to write to or to speak to.
-          Length of texts: for a reading test, this could be the length of the passages.
-          Topics
-          Readability
-          Structural range: a list of structures which may occur in the texts or which should be excluded from them, or a general indication of the range of structures.
-          Vocabulary range
-          Dialect, accent, style: the dialects and accents the test takers should understand; style may be formal or informal.
-          Speed of processing: in a reading test this is reading speed; in a speaking test it could be the rate of speech.

b.      Structure, timing, medium/channel and techniques
-          Test structure: the sections of the test and what is to be tested in each section
-          Number of items
-          Number of passages
-          Medium/channel: paper and pencil test, tape, computer, face to face, telephone, etc.
-          Time: for each section or the entire test
-          Techniques: the techniques used to measure the skills and subskills

c.       Criterial levels of performance
            The level of performance required for different levels of success should be specified. For example, to demonstrate 'mastery', 80% of the items may have to be answered correctly. For speaking and writing, however, the criterial levels can be expected to be much more complex.
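            As a minimal illustration (not part of the original text), a mastery check against such a criterial level could look like this in Python; the 80% threshold and the function name are assumptions made for the example:

    # Illustrative sketch: deciding 'mastery' against a criterial level of performance.
    def has_mastery(correct_items, total_items, criterial_level=0.80):
        """Return True if the proportion of correctly answered items meets the criterial level."""
        return correct_items / total_items >= criterial_level

    # Example: 42 correct responses out of 50 items is 0.84, so mastery is demonstrated.
    print(has_mastery(42, 50))  # True
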
d.      Scoring procedure
            Test developers should be clear about how they will achieve high reliability and validity in scoring, especially where scoring is subjective. This includes considering what rating scale will be used, how many raters will be employed, and what happens when raters disagree over a piece of work.
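            As a minimal sketch (an assumption for illustration, not a procedure given in the text), a scoring procedure with two raters might average their band scores and refer large disagreements to a third rater:

    # Illustrative sketch: combining two raters' band scores on a piece of writing.
    # The one-band tolerance and the referral rule are assumptions made for this example.
    def combine_scores(rater_a, rater_b, tolerance=1):
        """Average two ratings; flag the script for a third rater if the raters differ too much."""
        if abs(rater_a - rater_b) > tolerance:
            return None, "refer to a third rater"
        return (rater_a + rater_b) / 2, "agreed"

    print(combine_scores(5, 6))  # (5.5, 'agreed'); a pair such as 3 and 6 would be referred
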
3.      Writing and moderating items
Here are the procedures:
-          Sampling: texts and tasks should be chosen from as wide a range of topics and types of writing as is compatible with the specifications.
-          Writing items: item writers should try to anticipate possible misinterpretations. They cannot be expected to produce consistently perfect items; some items will have to be rejected or reworked, and moderation is the best way to identify those that need improvement.
-          Moderating items: moderation is the scrutiny of proposed items by at least two colleagues in order to detect weaknesses. A checklist is useful for the moderators of a test.

4.      Informal trialling of items on native speakers
The items should be tried out on native speakers, about twenty or so. These native speakers should be similar to the intended test takers in terms of age, education, and general background. This does not need to be conducted formally; the native speakers can do the test in their own time. Items which native speakers find difficult certainly need revision or replacement, and so do items that produce unexpected or inappropriate responses.

5.      Trialling of the test on a group of non-native speakers similar to those for whom the test is intended
The items that have survived moderation and trialling on native speakers should be put together into a test and tried out on a group of non-native speakers similar to those for whom the test is intended. Problems in administration and scoring should be noted.
6.      Analysis of the results of the trial: making any necessary changes
There are two kinds of analysis: statistical and qualitative. Statistical analysis reveals qualities of the test as a whole and of individual items (how difficult they are and how well they discriminate between stronger and weaker candidates). Qualitative analysis is based on examining the responses for misinterpretations and for unanticipated but possibly correct responses. Items which prove to be faulty should be dropped or modified.
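As an illustration of the statistical side, the sketch below computes a facility value (difficulty) and a discrimination index for one item; these measures and the upper/lower-third split are standard conventions of classical item analysis assumed here, not details given in the text:

    # Illustrative sketch: facility value (difficulty) and discrimination index for one item.
    def item_statistics(item_scores, total_scores):
        """item_scores: 1/0 per candidate for this item; total_scores: each candidate's test total."""
        n = len(item_scores)
        facility = sum(item_scores) / n                    # proportion of candidates answering correctly

        # Rank candidates by total score, then compare the top and bottom thirds on this item.
        ranked = [score for _, score in sorted(zip(total_scores, item_scores), reverse=True)]
        k = n // 3
        discrimination = (sum(ranked[:k]) - sum(ranked[-k:])) / k   # near 1 = discriminates well, near 0 = poorly
        return facility, discrimination

    # Example with nine trial candidates:
    item = [1, 1, 1, 0, 1, 0, 0, 1, 0]
    totals = [48, 45, 40, 37, 35, 30, 28, 25, 20]
    print(item_statistics(item, totals))  # roughly (0.56, 0.67)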

7.      Calibration of scales
Where rating scales are going to be used for oral testing or the testing of writing, they should be calibrated. This means collecting samples of performance which cover the full range of the scales. A team of experts then looks at these samples and assigns each one to a point on the relevant scale. These samples provide reference points for future use of the scales and can also serve as training materials.

8.      Validation
The final version of the test can be validated.

9.      Writing handbooks for test takers, test users and staff
The handbook may be expected to contain the following: the rationale of the test, an account of how the test was developed and validated, a description of the test, sample items, advice on preparing to take the test, an explanation of how test scores are to be interpreted, training materials, and details of test administration.

10.  Training staff
Using the handbook, all staff should be trained. These people may include interviewers, raters, scorers, computer operators and invigilators.
Reference
Hughes, A. 2003. Testing for Language Teachers. Cambridge: Cambridge University Press.
