Selasa, 28 April 2015



A Summary on the Requirements, Assumptions, and Estimations of Parameters as well as Instrument Development based on IRT
By: Agus Eko Cahyono and Jumariati

In IRT-based test development, we need to consider the requirements, assumptions, and estimation of the parameters of the test items. Large samples of examinees are required to estimate the IRT item parameters accurately, and longer tests provide more accurate θ (ability) estimates. To a lesser extent, increasing the test length can also improve the accuracy of item parameter estimation; this results from either improved estimation of the θs or improved estimation of the shape of the θ distribution. In addition, increasing the number of examinees can somewhat improve the estimation of θ through improved estimation of the item parameters.
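As a rough illustration of this sample-size point (our own sketch, not part of the original summary), the toy simulation below generates responses to a single Rasch-type item and recovers its difficulty by inverting the item characteristic curve; all function names and the fixed-ability simplification are assumptions made for the example.

```python
import math
import random

def simulate_item(b, n_examinees, rng):
    """Simulate n dichotomous responses to one Rasch item of difficulty b,
    holding every examinee's ability fixed at theta = 0 for simplicity."""
    theta = 0.0
    p_correct = 1.0 / (1.0 + math.exp(-(theta - b)))
    return [1 if rng.random() < p_correct else 0 for _ in range(n_examinees)]

def estimate_difficulty(responses):
    """Invert the Rasch curve at theta = 0: b_hat = -logit(proportion correct)."""
    p = sum(responses) / len(responses)
    return -math.log(p / (1.0 - p))

rng = random.Random(0)
true_b = 0.5
small = estimate_difficulty(simulate_item(true_b, 50, rng))    # noisy estimate
large = estimate_difficulty(simulate_item(true_b, 5000, rng))  # typically much closer to 0.5
```

With more examinees the proportion correct stabilizes, so the recovered difficulty settles near its true value, which is the intuition behind the large-sample requirement above.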


A Summary on the Introduction to Item Response Theory
By: Agus Eko Cahyono and Jumariati

In the practice of equating test forms, methods such as Item Response Theory (IRT) and Classical Test Theory (CTT) are used. CTT is a theory about test scores that introduces three concepts: test score (often called the observed score), true score, and error score. Within that theoretical framework, models of various forms have been formulated. For example, in what is often referred to as the "classical test model," a simple linear model is postulated linking the observable test score (X) to the sum of two unobservable (or latent) variables, true score (T) and error score (E), that is, X = T + E. Because for each examinee there are two unknowns in the equation, it is not solvable unless some simplifying assumptions are made. The assumptions of the classical test model are that (a) true scores and error scores are uncorrelated, (b) the average error score in the population of examinees is zero, and (c) error scores on parallel tests are uncorrelated. In this formulation, where error scores are so defined, the true score is the difference between the test score and the error score.
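The model X = T + E and assumptions (a) and (b) can be made concrete with a toy simulation (our own illustration, not part of the summary): generate true scores, add independent zero-mean error, and check that the error averages near zero and is roughly uncorrelated with the true score. The score distributions chosen here are arbitrary.

```python
import random

rng = random.Random(42)
n = 10000

# True scores T and zero-mean error scores E (assumption b of the model)
T = [rng.gauss(50, 10) for _ in range(n)]
E = [rng.gauss(0, 5) for _ in range(n)]
X = [t + e for t, e in zip(T, E)]  # observed score: X = T + E

def corr(a, b):
    """Pearson correlation, computed from first principles."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)
    va = sum((x - ma) ** 2 for x in a) / len(a)
    vb = sum((y - mb) ** 2 for y in b) / len(b)
    return cov / (va * vb) ** 0.5

mean_E = sum(E) / n          # close to 0 by construction (assumption b)
corr_TE = corr(T, E)         # close to 0 by construction (assumption a)
```

Because E is drawn independently of T, both checks come out near zero, mirroring the assumptions that make the single equation solvable.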

Selasa, 21 April 2015

Introduction to Item Response Theory
I.G.A. Lokita Purnamika Utami
Rina Sari
- Item analysis provides a way of measuring the quality of questions: seeing how appropriate they were for the respondents and how well they measured the intended ability or trait.
- It also provides a way of re-using items over and over again in different tests with prior knowledge of how they are going to perform, creating a population of questions with known properties (e.g., a test bank).
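Two quantities routinely computed in classical item analysis, the facility (proportion correct) and a simple upper-lower discrimination index, can be sketched as follows; the sample data and function names are invented for illustration and are not from the slides.

```python
def facility(item_scores):
    """Proportion of examinees answering the item correctly (0..1)."""
    return sum(item_scores) / len(item_scores)

def discrimination(item_scores, total_scores, frac=0.27):
    """Upper-lower discrimination index: facility among the top ~27%
    of total scorers minus facility among the bottom ~27%."""
    k = max(1, int(len(total_scores) * frac))
    order = sorted(range(len(total_scores)), key=lambda i: total_scores[i])
    lower = [item_scores[i] for i in order[:k]]
    upper = [item_scores[i] for i in order[-k:]]
    return facility(upper) - facility(lower)

# Invented example: 10 examinees' scores on one item, plus their test totals
item = [1, 0, 1, 1, 0, 1, 0, 1, 1, 0]
totals = [9, 3, 8, 7, 2, 9, 4, 8, 6, 1]
print(facility(item))                   # 0.6
print(discrimination(item, totals))     # 1.0: high scorers got it, low scorers did not
```

An item with a middling facility and a high discrimination index is the kind of "question with known properties" worth keeping in a test bank.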

Issues in the Development of Authentic Assessment

by: I.G.A. Lokita Purnamika Utami & Rina Sari

WHAT IS IT? Performance assessment, also known as alternative or authentic assessment, is a form of testing that requires students to perform a task rather than select an answer from a ready-made list. For example, a student may be asked to explain historical events, generate scientific hypotheses, solve math problems, converse in a foreign language, or conduct research on an assigned topic. Experienced raters--either teachers or other trained staff--then judge the quality of the student's work based on an agreed-upon set of criteria. This new form of assessment is most widely used to directly assess writing ability based on text produced by students under test instructions.

Minggu, 19 April 2015

A Summary on the Issues on the Development of Authentic Assessment
By: Agus Eko Cahyono and Jumariati

There is a mismatch between measures of language competence and the actual communicative competence required in real-world communicative interaction (Duran, 1988; Kitao & Kitao, 1996; McNamara, 1996; O'Malley & Valdez Pierce, 1996; Spolsky, 1995). The authentic assessment movement is an attempt to achieve a more appropriate and valid representation of students' communicative competencies than that derived from standardized objective tests. Authentic assessment is also called performance-based assessment (Meyer, 1992; Marzano, 1993), while Wiggins (1990) named it alternative assessment. Authentic assessment simulates, as far as possible, the authentic behavior which learners will need to enact in real situations. Some examples of authentic assessment are self- and peer-assessment, projects or exhibitions, observations, journals, and portfolios.

Selasa, 07 April 2015

Issues in the Development of Non-test Instruments (Rating Scales, Semantic Differential Scales, Checklists, Questionnaires, and others)
By: Agus Eko Cahyono and Jumariati

Rating scales are instruments used when the aspect of performance or the quality of a product varies from low to high, best to worst, good to bad, or on some other implicit continuum (Roid & Haladyna, 1982). The steps to construct a rating scale are first to define what aspects of performance are to be rated, and then to create the scale by employing one of these types: (1) simple numerical, (2) simple graphic, and (3) descriptive. A simple numerical scale uses numbers to rate the qualities, for instance from 1 (very poor), 2 (poor), 3 (fair), 4 (good), to 5 (very good). This type is very efficient and probably the most popular: numbers represent degrees, and the rater merely assigns a number to each object or performance being observed. The next type is the simple graphic scale, in which the rater is confronted with 3 to 7 terms representing the degrees. This type allows less chance for deviation among raters, but it is less efficient, as it requires more time and more pages for writing the answer options. The last type describes the points on the rating scale more fully and is easy to use even by untrained raters; the disadvantage is that it takes more time to develop.
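The simple numerical type above can be sketched in a few lines of code; the scale labels follow the text, while the essay ratings and function name are invented for illustration.

```python
# A 1-5 simple numerical rating scale, labeled as in the text
SCALE = {1: "very poor", 2: "poor", 3: "fair", 4: "good", 5: "very good"}

def summarize(ratings):
    """Average the numeric ratings assigned by several raters and
    report the nearest scale label for the mean."""
    mean = sum(ratings) / len(ratings)
    label = SCALE[min(SCALE, key=lambda k: abs(k - mean))]
    return mean, label

# Invented example: three raters score one performance
mean, label = summarize([4, 5, 4])  # mean about 4.33, nearest label "good"
```

Because the rater only assigns a number, scores from multiple raters can be pooled arithmetically like this, which is part of why the numerical type is so efficient.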