Difficulty Index

Definition:

The difficulty index (also called the p-value or facility index) is the proportion of test-takers who answer a particular test item correctly. It is the simplest item-level statistic in Classical Test Theory. The value ranges from 0.00 (nobody answered correctly) to 1.00 (everybody answered correctly). Counterintuitively, a higher p-value means an easier item, which is why some authors prefer the name "facility index". This p-value is a proportion and is unrelated to the p-value used in statistical significance testing.

In-Depth Explanation

Formula:

Difficulty Index (p) = Number of correct responses ÷ Total number of test-takers

Example: If 75 out of 100 students answered item #12 correctly, the difficulty index is 0.75.
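The calculation can be sketched in a few lines of Python. This is a minimal illustration, assuming each response has already been scored as 1 (correct) or 0 (incorrect); the function name difficulty_index is hypothetical and not taken from any library.

```python
def difficulty_index(scores):
    """Return the proportion of correct responses for a single item.

    `scores` holds one 0/1 value per test-taker (assumed scoring format).
    """
    if not scores:
        raise ValueError("need at least one response")
    return sum(scores) / len(scores)


# 75 correct responses out of 100 test-takers, as in the example above
item_12 = [1] * 75 + [0] * 25
print(difficulty_index(item_12))  # 0.75
```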

Interpreting the p-value:

p-value Range	Interpretation
0.90–1.00	Very easy — almost everyone gets it right
0.70–0.89	Easy
0.30–0.69	Moderate difficulty — ideal range for most tests
0.10–0.29	Difficult
0.00–0.09	Very difficult — almost nobody gets it right
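The bands in the table can be applied mechanically. The sketch below is one possible encoding, with a hypothetical function name; it assumes that values falling between the printed bands (for example 0.895) belong to the lower band, which the table itself does not specify.

```python
def interpret_p(p):
    """Map a difficulty index to the interpretation bands in the table."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if p >= 0.90:
        return "very easy"
    if p >= 0.70:
        return "easy"
    if p >= 0.30:
        return "moderate"
    if p >= 0.10:
        return "difficult"
    return "very difficult"


print(interpret_p(0.75))  # easy
print(interpret_p(0.05))  # very difficult
```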

Optimal difficulty depends on test purpose:

Achievement tests (checking what students learned): Items should cluster around 0.60–0.80 to confirm learning occurred
Proficiency/placement tests (maximizing discrimination): Items around 0.40–0.60 provide the most information about ability differences
Mastery tests (pass/fail): Higher p-values are acceptable if the goal is confirming minimum competence
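As a rough sketch of how these guidelines might be used during item review, the code below flags items whose p-value falls outside the target range for a given test purpose. The ranges come from the list above; the item names and p-values are invented for illustration, and mastery tests are omitted because the text gives no numeric range for them.

```python
# Target ranges from the guidelines above (mastery tests have no numeric range)
TARGET_RANGES = {
    "achievement": (0.60, 0.80),
    "proficiency": (0.40, 0.60),
}


def items_outside_target(p_values, purpose):
    """Return the names of items whose p-value is outside the target range."""
    low, high = TARGET_RANGES[purpose]
    return [item for item, p in p_values.items() if not low <= p <= high]


# Hypothetical p-values for three items
p_values = {"item_1": 0.95, "item_2": 0.72, "item_3": 0.45}
print(items_outside_target(p_values, "achievement"))  # ['item_1', 'item_3']
print(items_outside_target(p_values, "proficiency"))  # ['item_1', 'item_2']
```

A flagged item is a candidate for review, not automatic removal; whether to keep it depends on the test's purpose and content coverage.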

Limitations:

The difficulty index is sample-dependent — it changes based on who takes the test. An item might have p = 0.30 for beginners and p = 0.90 for advanced students.
It does not account for guessing. On a 4-option multiple-choice item, random guessing alone produces an expected p of 0.25, so an observed value near that level may reflect chance rather than knowledge.
A very easy item (p = 0.95) has almost no ability to discriminate between students — see Discrimination Index.
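The first two limitations can be illustrated with a small sketch. The response data are invented to match the p = 0.30 and p = 0.90 figures mentioned above, and the function names are hypothetical.

```python
def difficulty_index(scores):
    """Proportion of correct (1) responses among all responses."""
    return sum(scores) / len(scores)


def chance_level(n_options):
    """Expected p if every test-taker guesses at random among n_options."""
    return 1 / n_options


# Same item, two invented groups of ten test-takers each
beginners = [1] * 3 + [0] * 7
advanced = [1] * 9 + [0] * 1

print(difficulty_index(beginners))             # 0.3
print(difficulty_index(advanced))              # 0.9
print(difficulty_index(beginners + advanced))  # 0.6
print(chance_level(4))                         # 0.25
```

The same item yields three different p-values depending on which sample is analyzed, so a reported difficulty index is only meaningful alongside a description of the test-taker group.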

In language testing:

JLPT items are calibrated so that items within each level span a range of difficulties. N5 items are designed to be easier than N1 items, but within each level, there’s a spread: some items most test-takers get right, some only the strongest at that level get right.

Related Terms

Research

Crocker, L., & Algina, J. (2008). Introduction to Classical and Modern Test Theory. Cengage Learning. — Standard reference for difficulty index calculation and interpretation.
Brown, J. D. (2005). Testing in Language Programs: A Comprehensive Guide to English Language Assessment. McGraw-Hill. — Applied difficulty index analysis for language tests.
