Definition:
The discrimination index measures the degree to which a test item differentiates between high-ability and low-ability test-takers. Items with high discrimination tend to be answered correctly by students who did well on the overall test and missed by students who did poorly. Items with low or negative discrimination are problematic: they may be ambiguous, poorly written, or testing something unrelated to the construct.
In-Depth Explanation
Calculation (upper-lower method):
- Rank all test-takers by total score
- Take the top 27% (or top third) = Upper group
- Take the bottom 27% (or bottom third) = Lower group
- D = (proportion correct in upper group) − (proportion correct in lower group)
Example: If 90% of the top group answered item #5 correctly, but only 40% of the bottom group did:
D = 0.90 − 0.40 = 0.50
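The upper-lower steps can be sketched in a few lines of Python. This is a minimal illustration, not a standard library routine: the function name `discrimination_index`, its parameters, and the sample data are all hypothetical. It assumes each response is scored 0 or 1 and that ties at a group boundary are simply broken by input order.

```python
def discrimination_index(item_correct, total_scores, group_fraction=0.27):
    """Upper-lower discrimination index D for one item.

    item_correct   -- 0/1 responses to the item, one per test-taker
    total_scores   -- total test scores, in the same order
    group_fraction -- share of test-takers in each extreme group
    """
    if len(item_correct) != len(total_scores):
        raise ValueError("item_correct and total_scores must have the same length")
    n = len(total_scores)
    k = max(1, round(n * group_fraction))  # number of people in each group
    if 2 * k > n:
        raise ValueError("upper and lower groups would overlap")

    # Rank test-takers from highest to lowest total score.
    order = sorted(range(n), key=lambda i: total_scores[i], reverse=True)
    upper, lower = order[:k], order[-k:]

    p_upper = sum(item_correct[i] for i in upper) / k
    p_lower = sum(item_correct[i] for i in lower) / k
    return p_upper - p_lower


# Made-up data: 30 test-takers split into thirds, with 9 of the top 10
# and 4 of the bottom 10 answering the item correctly.
scores = list(range(30, 0, -1))
item = [1] * 9 + [0] + [1] * 6 + [0] * 4 + [1] * 4 + [0] * 6
print(round(discrimination_index(item, scores, group_fraction=1 / 3), 2))
```

The sample data reproduces the worked example: 0.90 correct in the upper group and 0.40 in the lower group.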
Interpreting D:
| D Value | Interpretation | Action |
|---|---|---|
| 0.40+ | Excellent discrimination | Keep the item |
| 0.30–0.39 | Good | Keep, possibly minor revision |
| 0.20–0.29 | Acceptable | Consider revision |
| 0.00–0.19 | Poor | Revise or discard |
| Negative | Inverse discrimination | Discard — something is wrong |
Negative discrimination means that low-ability students are outperforming high-ability students on that item. This usually indicates:
- An ambiguous item where stronger students overthink it
- A keyed answer that is wrong or debatable
- An item testing a completely different construct
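For batch item analysis, the bands in the table above can be applied with a small lookup. The sketch below is illustrative only: `interpret_d` is a hypothetical helper, and it assumes half-open intervals so that values falling between the printed ranges (for example 0.195) are assigned to the lower band.

```python
def interpret_d(d):
    """Return (interpretation, action) for a discrimination index d."""
    if d < 0:
        return "Inverse discrimination", "Discard"
    if d < 0.20:
        return "Poor", "Revise or discard"
    if d < 0.30:
        return "Acceptable", "Consider revision"
    if d < 0.40:
        return "Good", "Keep, possibly minor revision"
    return "Excellent discrimination", "Keep the item"


for d in (0.50, 0.25, -0.10):
    label, action = interpret_d(d)
    print(f"D = {d:+.2f}: {label} ({action})")
```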
Alternative: Point-Biserial Correlation
A more statistically robust measure is the point-biserial correlation (r_pbis), which correlates item scores (0 or 1) with total test scores across all test-takers. Because it uses every response, it avoids the information loss of the upper-lower split, which ignores the middle 46% of test-takers when 27% groups are used.
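Since the point-biserial is the ordinary Pearson correlation applied to a 0/1 variable and a continuous one, it can be computed directly from that definition. The following is a minimal sketch using only the standard library; the function name `point_biserial` is hypothetical. It assumes the total score passed in is whatever the analyst has chosen to use. Some analysts subtract the item's own score from the total first, which this sketch does not do.

```python
import math


def point_biserial(item_correct, total_scores):
    """Pearson correlation between 0/1 item scores and total test scores."""
    n = len(item_correct)
    if n != len(total_scores) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 entries")

    mean_x = sum(item_correct) / n
    mean_y = sum(total_scores) / n

    # Sums of cross-products and squared deviations (the 1/n factors cancel).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(item_correct, total_scores))
    sxx = sum((x - mean_x) ** 2 for x in item_correct)
    syy = sum((y - mean_y) ** 2 for y in total_scores)

    if sxx == 0 or syy == 0:
        # Everyone gave the same response, or all totals are equal.
        raise ValueError("correlation is undefined when either variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Made-up data: the two highest scorers answer correctly, the two lowest do not.
print(round(point_biserial([1, 1, 0, 0], [4, 3, 2, 1]), 3))
```

A positive value means that test-takers who answered the item correctly tended to have higher totals, which is the same direction as a positive D.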
Relationship to difficulty:
Discrimination and difficulty interact. Here p is the item difficulty, the proportion of all test-takers who answer the item correctly:
- Very easy items (p > 0.90) can't discriminate well, because nearly everyone gets them right
- Very hard items (p < 0.10) can't discriminate well, because nearly everyone gets them wrong
- Moderate-difficulty items (p ≈ 0.50) have the maximum potential for discrimination
This is why well-designed tests include items across a range of difficulties, with most items in the moderate range to maximize overall discrimination.
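The ceiling that difficulty places on D can be made concrete. The sketch below is an illustration under stated assumptions, not a published formula from the sources listed here. It takes an item's overall proportion correct p, assumes upper and lower groups that each contain a fraction `group_fraction` of test-takers, ignores rounding to whole people, and asks how large D could be in the best case where every correct answer comes from the highest scorers. The function name `max_discrimination` is hypothetical.

```python
def max_discrimination(p, group_fraction=0.27):
    """Largest attainable D for an item with overall proportion correct p."""
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    if not 0 < group_fraction <= 0.5:
        raise ValueError("group_fraction must be in (0, 0.5]")

    # Best case: correct answers are concentrated among the top scorers,
    # so the upper group fills up first.
    p_upper = min(1.0, p / group_fraction)
    # The lower group contains correct answers only once everyone ranked
    # above it (a share of 1 - group_fraction) has answered correctly.
    p_lower = max(0.0, (p - (1 - group_fraction)) / group_fraction)
    return p_upper - p_lower


for p in (0.05, 0.10, 0.50, 0.90, 0.95):
    print(f"p = {p:.2f}  ->  max D = {max_discrimination(p):.2f}")
```

Under these assumptions the ceiling shrinks toward zero as p approaches 0 or 1. When the groups are halves (`group_fraction=0.5`), it reduces to twice the smaller of p and 1 − p, which peaks at p = 0.50.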
Related Terms
See Also
Research
- Ebel, R. L., & Frisbie, D. A. (1991). Essentials of Educational Measurement (5th ed.). Prentice Hall. — Classic treatment of discrimination index calculation and interpretation.
- Bachman, L. F. (1990). Fundamental Considerations in Language Testing. Oxford University Press. — Applies item analysis including discrimination to language testing.