Test Output
Test Scoring returns a number of statistics for each test processed:
Average Percentage of Light Marks.
Number of Students who took each form of the test.
Mean Number Right - the average number of questions each student answered correctly on each form.
Median Number Right - the midpoint of the distribution where half the scores are above and half below.
Standard Deviation of Number Right - a measure of how spread out the scores are on either side of the mean number right.
Mean Number Omitted - the average number of questions omitted by each student.
Standard Deviation of Number Omitted - a measure of how spread out
the numbers of omissions are on either side of the mean number of items
omitted.
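As a rough illustration of how the per-form summary statistics above could be computed, here is a minimal Python sketch. The function name `form_summary` and the input layout (one count per student) are assumptions made for this example, and the population standard deviation is used because the page does not say which form the scoring program applies.

```python
from statistics import mean, median, pstdev

def form_summary(number_right, number_omitted):
    """Summarize one test form from per-student counts (hypothetical helper)."""
    return {
        "students": len(number_right),
        "mean_right": mean(number_right),
        "median_right": median(number_right),
        # Assumption: population standard deviation (divide by N).
        "sd_right": pstdev(number_right),
        "mean_omitted": mean(number_omitted),
        "sd_omitted": pstdev(number_omitted),
    }

# Five students: number right and number omitted for each.
summary = form_summary([8, 6, 9, 5, 7], [0, 1, 0, 2, 0])
print(summary)
```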
Reliability Estimate (KR-20) - an estimate of the correlation
between the scores on the test and the scores that would be obtained
if students were given a similar test on the same material.
Satisfactory levels range from .5 for tests of fewer than 15 items to .8
and up for longer tests. The method for computing this estimate assumes
that essentially a single area of knowledge is being tested. See Testing Memo 8 for more information on reliability scores.
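For readers who want to see the arithmetic, the textbook Kuder-Richardson formula 20 can be sketched as below. The function name `kr20` and the 0/1 response matrix are assumptions for this example, as is the use of the population variance of total scores; the scoring program's exact conventions are not stated on this page.

```python
from statistics import pvariance

def kr20(responses):
    """KR-20 from a list of per-student lists of 0/1 item scores."""
    n_students = len(responses)
    k = len(responses[0])                      # number of items
    totals = [sum(row) for row in responses]   # number right per student
    total_var = pvariance(totals)              # assumption: population variance
    if total_var == 0:
        raise ValueError("all students have the same score")
    pq_sum = 0.0
    for i in range(k):
        p = sum(row[i] for row in responses) / n_students  # proportion correct
        pq_sum += p * (1 - p)
    return (k / (k - 1)) * (1 - pq_sum / total_var)

# Five students, four items (1 = right, 0 = wrong or omitted).
data = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
reliability = kr20(data)
print(round(reliability, 3))
```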
Standard Error of Measurement - an estimate of how far a student's
observed score may be from the true score, that is, the mean score the
student would earn over many independent administrations of the test.
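The standard error of measurement is commonly derived from the standard deviation and the reliability estimate. The sketch below uses that textbook relationship; the function name is hypothetical, and it is an assumption that the scoring program computes the value the same way.

```python
import math

def standard_error_of_measurement(sd, reliability):
    """Textbook SEM: SD of scores times sqrt(1 - reliability)."""
    return sd * math.sqrt(1.0 - reliability)

# Example: standard deviation of 5 points, KR-20 of .84.
sem = standard_error_of_measurement(sd=5.0, reliability=0.84)
print(round(sem, 2))
```

A smaller value means observed scores are expected to sit closer to true scores.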
Item Analysis
An item analysis aids in
determining which test questions were weak or revealed
unsatisfactory levels of understanding. For every question, the
analysis includes the question number, the correct choice, the number of
omissions of that item, the number of persons marking each choice
(shown on the first row), the percentage of the class selecting each choice
(second row) and a Pearson product-moment correlation between each
choice and total scores.
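The three rows described above can be sketched for a single item as follows. The names `pearson` and `analyze_item` are hypothetical, omissions are represented as `None`, and correlating each choice with the raw total score (including the item itself) is an assumption, since the page does not specify that detail.

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation; None if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)

def analyze_item(answers, totals, choices="ABCDE"):
    """Count, percentage and correlation for each choice of one item."""
    n = len(answers)
    rows = {}
    for c in choices:
        picked = [1 if a == c else 0 for a in answers]  # 1 if student chose c
        rows[c] = {
            "count": sum(picked),
            "percent": 100 * sum(picked) / n,
            "r": pearson(picked, totals),
        }
    omitted = sum(1 for a in answers if a is None)
    return rows, omitted

# Six students: their answer to this item and their total test score.
answers = ["A", "A", "B", "A", "C", None]
totals = [9, 8, 4, 7, 3, 5]
rows, omitted = analyze_item(answers, totals, choices="ABC")
print(rows, omitted)
```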
Results for the correct choice
(marked by an asterisk) are most important. Good test questions have
most or all of the following characteristics:
Between 30 and 85% of
examinees should answer correctly. If the item is too hard or too easy
it contributes relatively little toward ranking examinees according to
their knowledge.
Each of the answer choices should attract at
least some of the examinees. If a wrong choice is so obvious that no
one selects it, student testing time is saved by omitting it on future
tests. There is no need to have the same number of choices for all
items.
The correlations between choices and total
score should be positive (preferably .3 or higher) for the right choice
and negative for the wrong choices. This outcome would indicate that
better students tended to get the item right. If a positive correlation
occurs for the wrong answer, it indicates that better students were
misled into selecting the wrong choice. A poor item correlation can
also alert the instructor to a possible mistake in filling out the key.
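The guidelines above can be turned into a simple checklist. This is only a sketch: the function name `check_item` and the dictionary inputs are assumptions for illustration, while the cutoffs (30 to 85 percent correct, a correlation of .3 for the keyed choice) come from the text.

```python
def check_item(key, percents, correlations):
    """List the guidelines one item fails.

    percents / correlations map each choice letter to its value;
    a correlation of None means it could not be computed.
    """
    problems = []
    if not 30 <= percents[key] <= 85:
        problems.append("percent correct outside 30-85")
    for choice in percents:
        r = correlations.get(choice)
        if percents[choice] == 0:
            problems.append(f"choice {choice} selected by no one")
        elif choice == key and (r is None or r < 0.3):
            problems.append("keyed choice correlation below .3")
        elif choice != key and r is not None and r > 0:
            problems.append(f"wrong choice {choice} correlates positively")
    return problems

problems = check_item(
    key="B",
    percents={"A": 20, "B": 60, "C": 20, "D": 0},
    correlations={"A": -0.25, "B": 0.45, "C": 0.10, "D": None},
)
print(problems)
```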
To the right of the three rows of statistics for each item, there may be messages that alert the instructor to likely problems:
THE KEYED ANSWER MAY BE INCORRECT - shown
when a very small percentage of examinees answered correctly or when the
correlation between the correct choice and the total score is near zero
or negative. The correlation condition is ignored if a very large
percentage answered correctly.
CHOICE _ MAY BE THE CORRECT ANSWER -
indicates that a wrong choice was chosen by at least a moderate
proportion of examinees and correlates at least somewhat positively
with total score.
CHOICE _ MAY BE ALTERNATE ANSWER - shown in
cases similar to the previous situation but for which the percentage
marking the choice may be much smaller. Such answers are sometimes
technically correct due to obscure conditions known mainly by better
students.
Note: These
messages are not printed for tests with fewer than 25 examinees, for
the statistics on which they are based would be unreliable.
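The logic behind these messages might look roughly like the sketch below. The page gives no numeric cutoffs for "very small", "very large", "near zero", "moderate" or "somewhat positively", so every threshold parameter here is a hypothetical placeholder, as is the function name; only the 25-examinee minimum comes from the text.

```python
def item_messages(key, percents, correlations, n_examinees,
                  very_low_pct=20.0, very_high_pct=90.0,
                  near_zero_r=0.05, moderate_pct=25.0, some_r=0.10):
    """Return warning messages for one item (all thresholds are assumed)."""
    if n_examinees < 25:
        return []  # statistics too unreliable to print messages
    messages = []
    pct_key, r_key = percents[key], correlations[key]
    # A weak correlation is ignored when nearly everyone answered correctly.
    weak_r = (r_key is None or r_key <= near_zero_r) and pct_key < very_high_pct
    if pct_key < very_low_pct or weak_r:
        messages.append("THE KEYED ANSWER MAY BE INCORRECT")
    for choice, pct in percents.items():
        r = correlations[choice]
        if choice == key or r is None or r < some_r:
            continue
        if pct >= moderate_pct:
            messages.append(f"CHOICE {choice} MAY BE THE CORRECT ANSWER")
        elif pct > 0:
            messages.append(f"CHOICE {choice} MAY BE ALTERNATE ANSWER")
    return messages

percents = {"A": 15.0, "B": 60.0, "C": 5.0, "D": 20.0}
correlations = {"A": -0.20, "B": 0.40, "C": 0.15, "D": -0.30}
messages = item_messages("A", percents, correlations, n_examinees=40)
small_class = item_messages("A", percents, correlations, n_examinees=20)
print(messages)
```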
T-Score
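A T-score is a standardized score: the number of standard deviations a raw score lies from the mean, rescaled so that the mean is 50 and the standard deviation is 10. T-scores may be used for "curving" grades.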
See Testing Memo 6 for a discussion of using T-scores to assign grades.
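Under the common convention that a T-score has a mean of 50 and a standard deviation of 10, the conversion from raw scores can be sketched as below. The function name is hypothetical, and both the 50/10 scaling and the population standard deviation are assumptions about how the scoring program computes the value.

```python
from statistics import mean, pstdev

def t_scores(raw_scores):
    """Convert raw scores to T-scores: 50 + 10 * (score - mean) / SD."""
    m = mean(raw_scores)
    sd = pstdev(raw_scores)  # assumption: population standard deviation
    if sd == 0:
        raise ValueError("all scores are identical")
    return [50 + 10 * (x - m) / sd for x in raw_scores]

t = t_scores([60, 70, 80])
print([round(value, 1) for value in t])
```

A student at the class mean receives a T-score of 50, and each standard deviation above or below the mean adds or subtracts 10 points.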
Light Marks
The scanner may fail to record a
mark if it is so light that the numeral can still be seen within the
circle. Instructors should tell students they may lose credit for
correct answers if their marks are too light.
If a sheet with over 30% light
marks has omissions (possibly due to light marks), a message with the
student's name will appear at the top of the printout noting the
percentage of light marks and the numbers of the omitted questions. It
is good practice to inspect sheets as they are turned in and have
students darken any light marks. If a lightly marked sheet with
omissions is processed, it should be photocopied before it is returned to
the student, since lightly marked answer sheets with omissions are
easier to revise without detection.
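The warning rule described above (over 30% light marks together with at least one omission) can be sketched as follows. The data model is hypothetical: each question is recorded as "dark", "light" or `None` for an omission, and the percentage is taken over marked questions only, which is an assumption about how the program counts.

```python
def light_mark_warning(name, marks, threshold=30.0):
    """Return a warning string for one sheet, or None if no warning applies."""
    marked = [m for m in marks if m is not None]
    omitted = [i + 1 for i, m in enumerate(marks) if m is None]  # 1-based
    if not marked or not omitted:
        return None
    pct_light = 100 * sum(m == "light" for m in marked) / len(marked)
    if pct_light <= threshold:
        return None
    return f"{name}: {pct_light:.0f}% light marks; omitted questions {omitted}"

sheet = ["dark", "light", "light", None, "dark", "light", None, "dark"]
warning = light_mark_warning("Pat Doe", sheet)
no_warning = light_mark_warning("Pat Doe", ["light", "light", "dark"])
print(warning)
```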
Multiple Marks
The opscan system is not able
to accommodate questions with more than one correct answer (multiple
marks). Multiple marks on a line always result in rejection of the
sheet by the scanner. The operator then circles the problem marks with
a red pen and reprocesses the sheet. The scoring program counts
multiple marks as omissions and writes an error message on the output
identifying the student and the problem item number.
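A minimal sketch of the scoring rule described above, in which an item with multiple marks is counted as an omission and reported. The function name, the use of a set of marked choices per item, and the message wording are all assumptions for illustration.

```python
def score_sheet(student, marks, key):
    """Score one sheet.

    marks: one set of marked choices per item (empty set = omitted).
    key:   string of correct choices, one letter per item.
    """
    right = omitted = 0
    errors = []
    for number, (marked, correct) in enumerate(zip(marks, key), start=1):
        if len(marked) > 1:
            omitted += 1  # multiple marks are counted as an omission
            errors.append(f"{student}: multiple marks on item {number}")
        elif not marked:
            omitted += 1
        elif marked == {correct}:
            right += 1
    return right, omitted, errors

right, omitted_count, errors = score_sheet(
    "Lee", [{"A"}, {"B", "C"}, set(), {"A"}], key="ABCD"
)
print(right, omitted_count, errors)
```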
If only one question has two
correct answers, alternatives are available on request. They will
require some hand processing by both Test Scoring operators and the
instructor and are not recommended.
Page last updated August 7, 2008.