Definition:
Classical Test Theory (CTT) is a psychometric framework based on the equation: Observed Score = True Score + Error. Every test score contains both the person’s actual ability (true score) and random measurement error. CTT focuses on test-level statistics — total score reliability, item difficulty, and item discrimination — rather than modeling individual item-person interactions as Item Response Theory does.
In-Depth Explanation
The fundamental equation:
X = T + E
Where:
- X = the observed (actual) test score
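- T = the true score (the person’s “real” ability, which cannot be observed directly; formally, the average score the person would obtain over many independent administrations of the test)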
- E = random error (noise from fatigue, guessing, ambiguous items, etc.)
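CTT assumes that the error averages to zero over repeated testing and is uncorrelated with the true score. Because E is random, it cannot be removed from any single score; the goal of good test design in CTT is to minimize the variance of E so that X approximates T as closely as possible.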
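The decomposition can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the normal distributions, the mean of 50, the standard deviations, and the function name `simulate_scores` are all hypothetical choices, not part of CTT itself. It generates true scores and independent errors, adds them, and compares the variance of the observed scores with the variances of its two parts.

```python
import random
import statistics

def simulate_scores(n_people=5000, true_sd=10.0, error_sd=5.0, seed=0):
    """Simulate observed scores X = T + E (all distributions are assumptions)."""
    rng = random.Random(seed)
    # True scores: hypothetical normal distribution centred on 50.
    true_scores = [rng.gauss(50.0, true_sd) for _ in range(n_people)]
    # Errors: mean zero, drawn independently of the true scores.
    errors = [rng.gauss(0.0, error_sd) for _ in range(n_people)]
    observed = [t + e for t, e in zip(true_scores, errors)]
    return true_scores, errors, observed

true_scores, errors, observed = simulate_scores()

var_t = statistics.pvariance(true_scores)
var_e = statistics.pvariance(errors)
var_x = statistics.pvariance(observed)

# With T and E uncorrelated, var(X) is approximately var(T) + var(E).
# The share of var(X) that comes from var(T) is the reliability; with the
# settings above it should land near 100 / (100 + 25) = 0.8.
reliability = var_t / var_x
```

Shrinking `error_sd` in this sketch pushes `reliability` toward 1, which is the sense in which reducing error brings X closer to T.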
Key CTT concepts:
| Concept | Definition |
|---|---|
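| Reliability | The proportion of observed score variance due to true score variance (not error). Range: 0 to 1. Higher = better. |
| Difficulty Index | Proportion of test-takers who answered an item correctly (the item “p-value”, unrelated to the p-value of significance testing). Range: 0 to 1. Higher = easier item. |
| Discrimination Index | How well an item separates high-ability from low-ability test-takers, commonly measured by the item-total (point-biserial) correlation or by the difference in proportion correct between the top- and bottom-scoring groups. |
| Standard Error of Measurement (SEM) | Estimated standard deviation of error scores, computed as SD of observed scores × √(1 − reliability). Lower = more precise. |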
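As a rough sketch of how the item-level statistics in the table might be computed, the example below works on a small, made-up matrix of right/wrong responses. The data and the helper names (`pearson`, `item_difficulty`, `item_discrimination`, `sem`) are hypothetical. Discrimination is computed here as a corrected item-total correlation (the item against the total of the remaining items), which is one common choice among several.

```python
import math
import statistics

# Hypothetical data: rows are test-takers, columns are items (1 = correct).
responses = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 0, 0, 0],
]

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def item_difficulty(responses, item):
    """Proportion of test-takers who answered the item correctly."""
    return statistics.mean([row[item] for row in responses])

def item_discrimination(responses, item):
    """Corrected item-total correlation: item score vs. total of other items."""
    item_scores = [row[item] for row in responses]
    rest_scores = [sum(row) - row[item] for row in responses]
    return pearson(item_scores, rest_scores)

def sem(total_scores, reliability):
    """Standard error of measurement: SD of observed scores * sqrt(1 - reliability)."""
    return statistics.pstdev(total_scores) * math.sqrt(1 - reliability)

total_scores = [sum(row) for row in responses]
difficulties = [item_difficulty(responses, i) for i in range(4)]
discriminations = [item_discrimination(responses, i) for i in range(4)]
# The reliability value passed here is an arbitrary placeholder, not an estimate.
example_sem = sem(total_scores, reliability=0.75)
```

The sketch uses the population standard deviation (`pstdev`); using the sample standard deviation instead is an equally defensible choice and changes the result only slightly for realistic sample sizes.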
Reliability estimation methods:
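- Test-retest: Administer the same test to the same group twice and correlate the two sets of scores
- Parallel forms: Administer two equivalent forms of the test to the same group and correlate the scores
- Internal consistency: Estimate reliability from a single test administration, using a split-half correlation (with the Spearman-Brown correction) or Cronbach’s alpha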
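The internal-consistency estimates can be sketched in a few lines. The example below is a minimal illustration on made-up data, with hypothetical helper names; it computes Cronbach’s alpha from the item variances and the total-score variance, and a split-half estimate that correlates odd- and even-numbered items and then applies the Spearman-Brown correction. The odd/even split is an assumption; other ways of halving a test give somewhat different values.

```python
import math
import statistics

# Hypothetical data: rows are test-takers, columns are items (1 = correct).
responses = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 0, 0, 0],
]

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def cronbach_alpha(responses):
    """alpha = k / (k - 1) * (1 - sum of item variances / variance of totals)."""
    k = len(responses[0])
    item_vars = [
        statistics.variance([row[i] for row in responses]) for i in range(k)
    ]
    total_var = statistics.variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

def split_half_reliability(responses):
    """Correlate odd- and even-item half scores, then apply Spearman-Brown."""
    odd_half = [sum(row[0::2]) for row in responses]
    even_half = [sum(row[1::2]) for row in responses]
    r_halves = pearson(odd_half, even_half)
    # Spearman-Brown: project the half-test correlation to full test length.
    return 2 * r_halves / (1 + r_halves)

alpha = cronbach_alpha(responses)
split_half = split_half_reliability(responses)
```

Test-retest and parallel-forms estimates need no special formula in this sketch: given two score lists for the same people, `pearson(first_scores, second_scores)` is the reliability estimate.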
CTT vs. IRT:
| Feature | CTT | IRT |
|---|---|---|
| Unit of analysis | Total test score | Individual item-person interaction |
| Sample dependence | Item statistics depend on the sample | Item parameters are (theoretically) sample-independent |
| Score precision | Assumed equal for all test-takers | Varies by ability level |
| Complexity | Simple to calculate | Requires specialized software and larger samples |
| Still used? | Yes, widely, especially for classroom and small-scale tests | Yes, especially for large-scale, high-stakes tests |
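The “sample dependence” row can be made concrete with a simulation. The sketch below is a hypothetical illustration, not a method from the sources cited here: it assumes a simple logistic response model (borrowed from IRT purely to generate data), a single item with a fixed location, and two groups whose mean abilities differ. The same item then yields a different CTT difficulty index in each group.

```python
import math
import random

def simulate_p_value(mean_ability, item_location=0.0, n_people=2000, seed=1):
    """CTT difficulty index of one simulated item in one sample.

    The logistic response model and all parameter values are assumptions
    made only to generate illustrative data.
    """
    rng = random.Random(seed)
    correct = 0
    for _ in range(n_people):
        ability = rng.gauss(mean_ability, 1.0)
        # Probability of a correct answer rises with ability.
        p_correct = 1.0 / (1.0 + math.exp(-(ability - item_location)))
        if rng.random() < p_correct:
            correct += 1
    return correct / n_people

# Same item, two samples with different average ability.
p_low = simulate_p_value(mean_ability=-1.0)
p_high = simulate_p_value(mean_ability=1.0)
```

In this setup `p_low` comes out well below `p_high`, even though the item itself never changed, which is why CTT item statistics should be reported together with a description of the sample they came from.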
Why CTT persists:
Despite IRT’s theoretical advantages, CTT remains widely used because:
- It’s simpler to implement and understand
- It works well for classroom tests, formative assessments, and smaller samples
- Its statistics (Cronbach’s alpha, item difficulty, item discrimination) are familiar to most educators
- For many practical purposes, CTT and IRT produce similar conclusions
Research
- Crocker, L., & Algina, J. (2008). Introduction to Classical and Modern Test Theory. Cengage Learning. — Standard textbook covering both CTT and IRT.
- Bachman, L. F. (1990). Fundamental Considerations in Language Testing. Oxford University Press. — Applies CTT concepts specifically to language testing contexts.