
Saturday, 17 November 2012

Classical Item Analysis (The Effectiveness of Distracters, and Criteria for Item Selection and Item Assembly)





Presented By:
Sitti Fatimah Saleng (120221539886)
Niki Raga Tantri (120221539852)

A. Effectiveness of Distracters
In tests that use multiple-choice items, the incorrect answer choices to a question (distracters) play an important role. The alternatives of an item include the correct answer and several plausible wrong answers; these wrong answers are the distracters. The function of a distracter is to distract the lower group of test takers. A good test is a test with good distracters. What makes a distracter good?
A good distracter functions effectively: it attracts the low-level test takers (the lower group) but is not chosen by the upper group, or at least the lower group chooses it more often than the upper group.
By analysing the effectiveness of distracters, we can determine:
a. how many test takers chose the correct option;
b. which distracters are so obviously wrong that no test taker chooses them;
c. which distracters mislead the test takers; and
d. which distracters attract the lower group but not the upper group.
To see the answer distribution for a single item clearly, look at the table below:
Group        |  A |  B | *C |  D |  E | Total
Upper group  |  2 |  3 |  7 |  2 |  0 |  14
Lower group  |  4 |  2 |  4 |  4 |  0 |  14
(* marks the correct answer)





Distracter B may not be effective, because more of the upper group chose it than the lower group. This contradicts the purpose of a distracter: the lower group should choose it more often than the upper group. Distracter E is also ineffective, since no test taker chose it at all. Distracters A and D, by contrast, were chosen more often by the lower group than by the upper group, so these distracters are good.
The effectiveness of a distracter can be determined with the following formula:

effectiveness (%) = (number of test takers who chose the distracter / total number of test takers) × 100
According to Depdikbud (1993), a distracter is effective if it is chosen by at least 5% of test takers on a four-option item, or by at least 3% on a five-option item. According to Fernandes (1984), a distracter is good if it is chosen by at least 2% of all test takers.
Distracters that do not fulfil these criteria should be replaced with alternatives that are more likely to attract test takers.
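As an illustration, here is a minimal Python sketch of this analysis (not taken from any of the sources cited here). It applies the formula above and the upper/lower-group comparison to the figures from the example table; the function and variable names are our own, and the 3% default threshold is the Depdikbud (1993) five-option criterion.

    # Minimal sketch of distracter analysis for one multiple-choice item.
    # Assumes equal-sized upper and lower groups, as in the table above.

    def distracter_report(upper, lower, key, min_pct=3.0):
        """Evaluate each wrong option (distracter) of one item.

        upper, lower -- option label -> count of upper/lower-group test
                        takers who chose that option
        key          -- label of the correct option
        min_pct      -- minimum share of all test takers a distracter must
                        attract (Depdikbud 1993: 5% for four options, 3%
                        for five; Fernandes 1984: 2%)
        """
        total = sum(upper.values()) + sum(lower.values())
        for option in upper:
            if option == key:
                continue
            chosen = upper[option] + lower[option]
            pct = 100.0 * chosen / total               # the formula above
            frequent = pct >= min_pct                  # chosen often enough?
            plausible = lower[option] > upper[option]  # fools the lower group?
            verdict = "good" if frequent and plausible else "revise"
            print(f"{option}: {pct:.1f}% chose it, upper={upper[option]}, "
                  f"lower={lower[option]} -> {verdict}")

    # Figures from the example table (five options, *C is the key).
    upper = {"A": 2, "B": 3, "C": 7, "D": 2, "E": 0}
    lower = {"A": 4, "B": 2, "C": 4, "D": 4, "E": 0}
    distracter_report(upper, lower, key="C")

Run as written, the sketch flags distracters B and E for revision and keeps A and D, matching the discussion above.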

B. Criteria for Item Selection
The item type selected should match both the learning or topic to be tested and the potential of the test instrument. Since there are numerous test formats, such as multiple-choice items, true-false items, and essays, the test writer should choose the format wisely, based on the purpose of the test. If the teacher wants to test a range of detailed factual information, which is readily done using matching or classification items, a selected-response item format should be used. If, on the other hand, the teacher wants to test the candidate's capacity to perform integrated, higher-order skills, such as synthesizing knowledge, attempting evaluations, or solving a complex problem, and an extended response is needed to do those skills justice, then a constructed-response item format should be used.
1. Criteria for Choice of Selected-Response Item Formats

1) True-False
   Advantages: easy to write; easy to mark; easy to sample variety within a course.
   Disadvantages: guessing factor very high (50%); limited to unequivocal choices; cannot test higher-order skills.

2) Matching items
   Advantages: useful for testing relationships; useful for testing factual information; easy to construct a large number.
   Disadvantages: the cluster approach destroys item independence; instructions are difficult to word.

3) Classification items
   Advantages: relatively easy to construct; easy to mark; useful for testing factual information; useful for testing simple relationships.
   Disadvantages: the cluster approach destroys item independence to some degree; limited to factual sorting; limited to unequivocal facts.

4) Multiple-choice items
   Advantages: reduces the guessing factor; versatile, as it can measure a wide range of cognitive processes; reduces the problem of subjective scoring; analysis of results can provide much diagnostic information; easy to mark.
   Disadvantages: little, if any, stimulus to creative thought; expensive and time-consuming to construct; difficult to measure organization and presentation of ideas; plausible distracters are hard to write; presents wrong information as if it were right.

2. Criteria for Choice of Constructed-Response Item Formats

1) Short-answer items
   Advantages: excellent for testing factual knowledge; successful guessing is reduced; easy to write; easy to mark.
   Disadvantages: difficult to phrase so that only one answer is correct; largely limited to factual recall; cannot easily test higher-order skills.

2) Fill-in-the-blank / sentence completion
   Advantages: easy to test a range of factual knowledge; the guessing factor is reduced; easy to write; easy to mark.
   Disadvantages: more than one answer may fit the blank; limited to factual knowledge.

3) Cloze, modified cloze
   Advantages: easy to construct; a good measure of word knowledge; tests passage understanding.
   Disadvantages: blanks within a passage are not fully independent; scoring requires decisions about acceptable alternative completions.

4) Extended responses
   Advantages: a means of assessing higher-order skills; relatively easy to construct; stimulates creative and critical thought as well as learned responses; can measure learning in the affective domain.
   Disadvantages: time-consuming to mark; scoring is subjective and can be unreliable; only a limited sample of content can be covered.

5) Problem solutions
   Advantages: a means of assessing higher-order skills; can measure complex learning outcomes; relatively easy to construct.
   Disadvantages: can be time-consuming to mark; stable assessment criteria are sometimes difficult to establish.

C. The Function of Item Analysis for the Development of Tests
The benefits of item analysis are not limited to the improvement of individual test items; there are also a number of benefits of special value to classroom teachers. The most important of these are as follows:
1. Item analysis data provide a basis for efficient class discussion of the test.
2. Item analysis data provide a basis for remedial work.
3. Item analysis data provide a basis for general improvement of classroom instruction.
4. Item analysis procedures provide a basis for increased skill in test construction.

D. Item Assembly 
Since there are no sources that explicitly explain item assembly, the writers first define 'assembly'. According to the Cambridge dictionary, "assembly is the process of putting together the parts of a machine or structure". Item assembly, therefore, is the process of putting items together into a test. According to Gronlund (1990), there are two procedures for assembling items before they are stored in an item bank:
1. Recording Test Items
When constructing the test items, it is desirable to write each one on a separate index card. The index card should contain information concerning the subject area, instructional objective measured, and item characteristics.
a. Subject Area
Each item should clearly indicate the subject area of the test. The subject area may be a school subject (mathematics, English, etc.) or the skill, combination of skills, or competencies being assessed.
b. Instructional Objective Measured
In assembling the test items, the instructional objective measured by each item, or the type of desired learning outcome, should be recorded to make the items easier to categorize. The instructional objectives should be in harmony with the goals of the curriculum and will reflect the state content standards to whatever degree the school curriculum does. In addition, Gronlund (1998) mentioned that an item can reflect the types of desired learning outcomes, such as those in Bloom's Taxonomy of Educational Objectives.
 
Table 1. A revision of Bloom's Taxonomy of Educational Objectives by Anderson and Krathwohl (2001)

c. Item Characteristics
Each item's record should describe the item's characteristics: its difficulty, its discrimination, the effectiveness of its distracters, and its validity. A sketch of the first two statistics is given below.
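As an illustration, here is a minimal Python sketch of the two simplest characteristics, difficulty and discrimination, computed with the upper/lower-group method used in section A. The function names are our own, and the example reuses the figures for item *C from the table in section A.

    # Minimal sketch of item difficulty (proportion correct) and item
    # discrimination (upper-group minus lower-group proportion correct).
    # Assumes equal-sized upper and lower groups.

    def item_difficulty(correct, total):
        """p-value: proportion of all test takers answering correctly."""
        return correct / total

    def item_discrimination(correct_upper, correct_lower, group_size):
        """D: difference between upper- and lower-group proportions correct."""
        return (correct_upper - correct_lower) / group_size

    # Item *C from section A: 7 of 14 upper and 4 of 14 lower answered it.
    p = item_difficulty(7 + 4, 28)       # 0.39 -- a fairly difficult item
    d = item_discrimination(7, 4, 14)    # 0.21 -- positive discrimination
    print(f"difficulty p = {p:.2f}, discrimination D = {d:.2f}")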
2.   Reviewing Test Items
No matter how carefully test items have been prepared, teachers sometimes overlook the clarity and conciseness of the items. It is therefore necessary to have the items reviewed by fellow teachers or experts before they are grouped into a test. Below are some steps for reviewing items before they are assembled into a test form:
a. The item format should be appropriate for the learning outcome being measured.
b. The knowledge, understanding, or thinking skill required by the item should match the specific learning outcome and subject-matter content being measured.
c. The point of the item should be clear.
d. The item should be free from excessive verbiage; from racial, ethnic, and sexual bias; and from technical errors and irrelevant clues.
e. The item should be of appropriate difficulty.
f. The item should have an answer that experts would agree on.

E. Anchor Items
Selected items in an item bank can be categorized as anchor items and non-anchor items. Anchor items are used to equate test forms, helping maintain a stable scale across test administrations, while non-anchor items are the remaining test items that satisfy acceptable statistical properties and content-standard constraints.
Anchor items:
- should measure the content standards as specified in the test blueprint;
- should not be used in consecutive years (e.g. avoid using the same item in spring 2000 and spring 2001);
- should be selected from previous operational test forms;
- should not be edited; and
- should span a range of difficulty levels, but not be extremely easy or difficult.

Non-anchor items:
- should follow psychometric guidelines (e.g. p-values between .2 and .9; see the sketch after this list);
- should not have unacceptable item statistics;
- should be selected from the entire field-test pool;
- may receive minor changes (formatting, style, grammar), as long as these do not impact students' performance; and
- should not be overly difficult.
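As a rough illustration, here is a minimal Python sketch of screening an item pool against two of the guidelines above: the p-value range and the rule that anchors come from previous operational forms. The record structure and field names are illustrative assumptions, not taken from any real item-banking system.

    # Minimal sketch: screen an item pool for anchor candidates using the
    # p-value range (.2 <= p <= .9) and the previous-operational-form rule.
    # Field names ("p", "from_operational_form") are illustrative only.

    items = [
        {"id": "Q01", "p": 0.85, "from_operational_form": True},
        {"id": "Q02", "p": 0.15, "from_operational_form": True},   # too hard
        {"id": "Q03", "p": 0.55, "from_operational_form": False},  # field-test only
        {"id": "Q04", "p": 0.95, "from_operational_form": True},   # too easy
    ]

    anchor_candidates = [item for item in items
                         if 0.2 <= item["p"] <= 0.9
                         and item["from_operational_form"]]
    print([item["id"] for item in anchor_candidates])   # ['Q01']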




References:
Gronlund, N. E., & Linn, R. L. 1990. Measurement and Evaluation in Teaching (6th ed.). New York: Macmillan Publishing Company.
Gronlund, N. E., & Waugh, C. K. 2009. Assessment of Student Achievement. Upper Saddle River, NJ: Pearson Education.
Heaton, J. B. 1988. Writing English Language Tests. New York: Longman Inc.
Henning, Grant. 1987. A Guide to Language Testing. Massachusetts: Newbury House.
Hetzel, Susan Matlock. 1997. Basic Concepts in Item and Test Analysis. <http://ericae.net/ft/tamu/Espy.htm>, retrieved 30 September 2012.
Hughes, Arthur. 2003. Testing for Language Teachers. Cambridge: Cambridge University Press.
Withers, Graeme. 2005. Module 5: Item Writing for Tests and Examinations. Paris: IIEP-UNESCO. <http://www.unesco.org/iiep>, retrieved 30 September 2012.
http://www.pepuny.blogspot.com/2007/11/efektifitas_distractor_07.html, retrieved 1 October 2012.