Robert B. Frary
TESTING MEMO 4 gave a few suggestions for item-writing, but only to a limited extent, due to its coverage of other aspects of test development. What follows is a more comprehensive list of recommendations. Some of these are backed up by psychometric research; i.e., it has been found that, generally, the resulting scores are more accurate indicators of each student's knowledge when the recommendations are followed than when they are violated. Other recommendations result from logical deduction.
1.) Do ask questions that require more than knowledge of facts. For example, a question might require selection of the best answer when all of the options contain elements of correctness. Such questions tend to be more difficult and discriminating than questions that merely ask for a fact. (See TESTING MEMO 2 for a discussion of the advantages of more difficult tests. Of course, a more difficult test can yield the same letter grade distribution as an easier one.) Justifying the "best-ness" of the keyed option may be as challenging to the instructor as the item was to the students, but, after all, isn't challenging students and responding to their challenges a big part of what being a professor is all about?
2.) Don't ask a question that begins, "Which of the following is true [or false]?" followed by a collection of unrelated options. Each test question should focus on some specific aspect of the course. Therefore, it's OK to use items that begin, "Which of the following is true [or false] concerning X?" followed by options all pertaining to X. However, this construction should be used sparingly if there is a tendency to resort to trivial reasons for falseness or an opposite tendency to offer options that are too obviously true. A few true-false questions (in among the multiple-choice questions) may forestall these problems. The options would be: 1) True 2) False.
3.) Do ask questions with varying numbers of options. There is no psychometric advantage to having a uniform number, especially if doing so results in options that are so implausible that no one or almost no one marks them. In fact, some valid and important questions demand only two or three options, e.g.,
"If drug X is administered, body temperature will probably:
1) increase, 2) stay about the same, 3) decrease."
4.) Don't put negative options following a negative stem. Empirically (or statistically) such items may appear to perform adequately, but this is probably only because brighter students who naturally tend to get higher scores are also better able to cope with the logical complexity of a double negative.
5.) Don't use "all of the above." Recognition of one wrong option eliminates "all of the above," and recognition of two right options identifies it as the answer, even if the other options are completely unknown to the student. Actually, some instructors may use items with "all of the above" in an unconscious or misguided effort to extend their teaching into the test. It just seems so good to have the students affirm, say, all of the major causes of some phenomenon. With this approach, "all of the above" is the answer to almost every item containing it, and the students soon figure this out.
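The elimination logic described in item 5 can be traced with a short sketch. The function name `remaining_choices` and the way the student's partial knowledge is encoded are assumptions made for illustration only. The sketch also assumes a single-answer item whose final choice is "all of the above."

```python
def remaining_choices(n_options, known_right=(), known_wrong=()):
    """Return the choices a test-wise student still has to consider.

    Choices 1..n_options are ordinary options; choice n_options + 1 is
    "all of the above". Exactly one choice is keyed as the answer.
    known_right / known_wrong hold the option numbers the student can
    judge; every other option is unknown to the student.
    """
    all_of_above = n_options + 1
    known_right, known_wrong = set(known_right), set(known_wrong)

    # Two options known to be right cannot both be the single keyed
    # answer, so the key must be "all of the above".
    if len(known_right) >= 2:
        return [all_of_above]

    # One option known to be wrong rules out itself and "all of the above".
    excluded = known_wrong | {all_of_above} if known_wrong else set()
    candidates = [c for c in range(1, all_of_above + 1) if c not in excluded]

    # An option known to be right is either the key itself or is covered
    # by "all of the above"; nothing else can be the key.
    if known_right:
        candidates = [c for c in candidates
                      if c in known_right or c == all_of_above]
    return candidates


print(remaining_choices(4, known_wrong=[2]))     # [1, 3, 4]
print(remaining_choices(4, known_right=[1, 3]))  # [5]
```

With four options plus "all of the above," recognizing one wrong option cuts the field from five choices to three, and recognizing two right options settles the item without any knowledge of the remaining options.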
6.) Do ask questions with "none of the above" as the final option, especially if the answer requires computation. Its use makes the question harder and more discriminating, because the uncertain student cannot focus on a set of options that must contain the answer. Of course, "none of the above" cannot be used if the question requires selection of the best answer and should not be used following a negative stem. Also, it is important that "none of the above" be the answer to a reasonable proportion of the questions containing it.
7.) Don't use items like the following:
What is (are) the capital(s) of Bolivia?
A. La Paz B. Sucre C. Santa Cruz
1) A only 2) B only 3) C only
4) Both A and B 5) All of the above
Research on this item type has consistently shown it to be easier and less discriminating than items with distinct options. In the example above, one only needs to remember that Bolivia has two capitals to be assured of answering correctly. This problem can be alleviated by offering all possible combinations of the three basic options, namely:
1) A only, 2) B only, 3) C only, 4) A and B
5) A and C, 6) B and C
7) A, B, and C, 8) None of the above
However, due to its complexity, initial use of this adaptation should be limited.
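The full set of combinations in the adaptation above can be generated mechanically. The following is a minimal sketch using Python's standard `itertools.combinations`; the function name `combination_choices` and the exact wording of each choice are assumptions chosen to match the list above.

```python
from itertools import combinations


def combination_choices(labels):
    """List every non-empty combination of the basic options, followed by
    "None of the above", ordered from single options to the full set."""
    choices = []
    for size in range(1, len(labels) + 1):
        for combo in combinations(labels, size):
            if size == 1:
                choices.append(f"{combo[0]} only")
            elif size == 2:
                choices.append(f"{combo[0]} and {combo[1]}")
            else:
                choices.append(", ".join(combo[:-1]) + f", and {combo[-1]}")
    choices.append("None of the above")
    return choices


for number, text in enumerate(combination_choices(["A", "B", "C"]), start=1):
    print(f"{number}) {text}")
```

Three basic options already produce eight choices, and each additional basic option doubles the count, which is consistent with the caution about complexity.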
9.) Don't include superfluous information in the options. The reasons given for 8 above apply. In addition, as another manifestation of the desire to teach while testing, the additional information is likely to appear on the correct answer:
1) W, 2) X, 3) Y, because ...., 4) Z.
Students are very sensitive to this tendency and take advantage of it.
10.) Don't use specific determiners in distractors. Sometimes in a desperate effort to produce another, often unneeded, distractor (see 3 above), a statement is made incorrect by the inclusion of words like "all" or "never," e.g., "All humans have 46 chromosomes." Students learn to classify such statements as distractors when otherwise ignorant.
11.) Don't repeat wording from the stem in the correct option. Again, an ignorant student will take advantage of this practice.
Most violations of the recommendations given thus far should not be classified as outright errors, but, instead, perhaps, as lapses of judgment. And, as almost all rules have exceptions, there are probably circumstances where some of 1-11 above would not hold. However, there are three not-too-uncommon item-writing/test-preparation errors that represent nothing less than negligence. They are mentioned below to encourage careful preparation and proofreading of tests:
1.) Typos. These are more likely to appear in distractors than in the stem and the correct answer, which get more scrutiny from the test preparer. Students easily become aware of this tendency if it is present.
2.) Grammatical inconsistency between stem and options. Almost always, the stem and the correct answer are grammatically consistent, but distractors, often produced as afterthoughts, may not mesh properly with the stem. Again, students quickly learn to take advantage of this foible.
3.) Overlapping distractors. For example:
Due to cutbacks, the Virginia Tech library now subscribes to fewer than _?_ periodicals.
1) 25,000 2) 20,000 3) 15,000 4) 10,000
Perhaps surprisingly, not all students "catch on" to items like this, but many do. Worse yet, the instructor might indicate option 2 as the correct answer, in which case option 1 is necessarily correct as well.
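The overlap in the library item can be checked directly. In the sketch below, the function name `true_options` is hypothetical, and the figure of 18,000 periodicals is a made-up value used only to show which options come out true; the memo does not give the actual count.

```python
def true_options(thresholds, actual_count):
    """Return the 1-based numbers of the options that make the statement
    "subscribes to fewer than X periodicals" true for actual_count."""
    return [number
            for number, limit in enumerate(thresholds, start=1)
            if actual_count < limit]


thresholds = [25_000, 20_000, 15_000, 10_000]

# 18,000 is an invented figure, not a fact about the library.
print(true_options(thresholds, 18_000))  # [1, 2]
```

Whenever any option is true, option 1 is true too, so a student who knows nothing about the library can mark option 1 and never be wrong on the facts.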
Finally, consider the following set of options (stem omitted):
1) Abraham Lincoln, 2) Stephen A. Douglas
3) Robert E. Lee, 4) Andrew Jackson
The test-wise but ignorant student will select Lincoln because it represents the intersection of two categories of prominent nineteenth century people, namely, presidents and men associated with the Civil War.
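The test-wise strategy just described amounts to picking the option that shares the most categories with the other options. The sketch below illustrates that idea; the function name `pick_by_overlap` and the category tags are assumptions, and tagging Douglas with the Civil War category is one reading of the example rather than something the memo states.

```python
def pick_by_overlap(options):
    """options maps each option text to a set of category tags.

    Score each option by how often its categories recur among the other
    options, and return the highest-scoring option.
    """
    def score(name):
        return sum(len(options[name] & tags)
                   for other, tags in options.items()
                   if other != name)

    return max(options, key=score)


# Category tags are illustrative assumptions, not part of the item.
options = {
    "Abraham Lincoln": {"president", "civil_war"},
    "Stephen A. Douglas": {"civil_war"},
    "Robert E. Lee": {"civil_war"},
    "Andrew Jackson": {"president"},
}
print(pick_by_overlap(options))  # Abraham Lincoln
```

An item writer can run the same kind of check in reverse: if one option stands out as the intersection of the categories spanned by the distractors, the distractors probably need rebalancing.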
Try this one:
1) before breakfast, 2) on a full stomach
3) with meals, 4) before going to bed