Classical Test Theory

Definition:

Classical Test Theory (CTT) is a psychometric framework built on the equation Observed Score = True Score + Error. Every observed test score is treated as the sum of a true score, which reflects the person’s underlying ability, and random measurement error. CTT works with test-level and item-level summary statistics (total-score reliability, item difficulty, and item discrimination) rather than modeling individual item-person interactions, as Item Response Theory (IRT) does.


In-Depth Explanation

The fundamental equation:

X = T + E

Where:

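  • X = the observed score (the score the person actually receives)
  • T = the true score: the average score the person would obtain over many hypothetical independent administrations of the same test. It stands in for the person’s “real” ability and cannot be observed directly.
  • E = random error (noise from fatigue, guessing, ambiguous items, etc.), assumed to average zero and to be uncorrelated with T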

The goal of good test design in CTT is to keep the variance of E small relative to the variance of T, so that X approximates T as closely as possible.
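
As a rough illustration of the equation, the sketch below simulates true scores and random errors and checks that, when E averages zero and is unrelated to T, the variance of X is approximately Var(T) + Var(E). The normal distributions, sample size, and names (`simulate_scores`, `true_sd`, `error_sd`) are assumptions chosen for the example, not part of CTT itself.

```python
import random
import statistics


def simulate_scores(n, true_mean=50.0, true_sd=10.0, error_sd=5.0, seed=0):
    """Simulate X = T + E for n hypothetical test-takers.

    Normal distributions and the default parameters are illustrative
    assumptions; CTT only requires that E averages zero and is
    uncorrelated with T.
    """
    rng = random.Random(seed)
    true_scores = [rng.gauss(true_mean, true_sd) for _ in range(n)]
    errors = [rng.gauss(0.0, error_sd) for _ in range(n)]
    observed = [t + e for t, e in zip(true_scores, errors)]
    return true_scores, errors, observed


true_scores, errors, observed = simulate_scores(20_000)

var_t = statistics.pvariance(true_scores)
var_e = statistics.pvariance(errors)
var_x = statistics.pvariance(observed)

# Share of observed-score variance that comes from true scores.
# With true_sd=10 and error_sd=5 the theoretical value is 100 / 125 = 0.8.
reliability = var_t / var_x

print(f"Var(T) + Var(E) = {var_t + var_e:.1f}, Var(X) = {var_x:.1f}")
print(f"Estimated reliability = {reliability:.3f}")
```

Shrinking `error_sd` moves the ratio toward 1, which is the sense in which reducing error makes X a closer approximation of T. In real testing T and E are never observed separately, so this ratio has to be estimated indirectly.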

Key CTT concepts:

Concept | Definition
Reliability | The proportion of observed-score variance that is due to true-score variance rather than error. Ranges from 0 to 1; higher is better.
Difficulty Index | Proportion of test-takers who answered an item correctly (the item p-value, not a significance-test p-value). Ranges from 0 to 1; higher values mean an easier item.
Discrimination Index | How well an item separates high-ability from low-ability test-takers.
Standard Error of Measurement (SEM) | Estimated standard deviation of error scores. Lower means more precise.
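
The item-level statistics above can be sketched on a small scored response matrix. This is a minimal illustration rather than a standard implementation: the data are invented, the function names are hypothetical, the discrimination index shown is the upper-lower group difference (one of several common definitions), and the reliability value passed to the SEM formula is made up. The SEM line uses the usual CTT relation SEM = SD × sqrt(1 − reliability).

```python
import math
import statistics

# Toy data: rows are test-takers, columns are items (1 = correct, 0 = wrong).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
]


def difficulty_index(matrix):
    """Proportion of test-takers answering each item correctly."""
    n = len(matrix)
    return [sum(row[j] for row in matrix) / n for j in range(len(matrix[0]))]


def discrimination_index(matrix, fraction=0.5):
    """Upper-lower index: p(correct) in the top group minus the bottom group.

    Groups are formed by ranking test-takers on total score. fraction=0.5
    is used here only because the toy sample is tiny; smaller fractions
    (for example the top and bottom 27%) are common with larger samples.
    """
    totals = [sum(row) for row in matrix]
    order = sorted(range(len(matrix)), key=lambda i: totals[i], reverse=True)
    k = max(1, round(fraction * len(matrix)))
    upper = [matrix[i] for i in order[:k]]
    lower = [matrix[i] for i in order[-k:]]
    return [
        sum(r[j] for r in upper) / k - sum(r[j] for r in lower) / k
        for j in range(len(matrix[0]))
    ]


def standard_error_of_measurement(total_scores, reliability):
    """SEM = SD of observed scores * sqrt(1 - reliability)."""
    return statistics.pstdev(total_scores) * math.sqrt(1.0 - reliability)


p_values = difficulty_index(responses)
d_values = discrimination_index(responses)
totals = [sum(row) for row in responses]
# The reliability of 0.80 is a made-up value for illustration.
sem = standard_error_of_measurement(totals, reliability=0.80)

print("Difficulty:", [round(p, 2) for p in p_values])
print("Discrimination:", [round(d, 2) for d in d_values])
print("SEM:", round(sem, 2))
```

On this toy data the third item has the lowest discrimination (about 0.33), because one low-scoring test-taker answered it correctly while one high scorer missed it.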

Reliability estimation methods:

  • Test-retest: Give the same test to the same group twice and correlate the two sets of scores
  • Parallel forms: Give two equivalent versions of the test and correlate the scores
  • Internal consistency: Estimate reliability from a single test administration, for example with a split-half correlation or Cronbach’s alpha
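
The two internal-consistency estimates can be sketched as follows. The response data are invented and the function names are hypothetical; the split-half version uses an odd-even item split with the Spearman-Brown correction, which is one common choice among several possible splits.

```python
import math
import statistics

# Toy data: rows are test-takers, columns are items (1 = correct, 0 = wrong).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
]


def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def split_half_reliability(matrix):
    """Odd-even split-half correlation, stepped up with Spearman-Brown."""
    half_a = [sum(row[0::2]) for row in matrix]  # items 1, 3, ...
    half_b = [sum(row[1::2]) for row in matrix]  # items 2, 4, ...
    r = pearson(half_a, half_b)
    # Spearman-Brown: projects the half-test correlation to full length.
    return 2 * r / (1 + r)


def cronbach_alpha(matrix):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of totals)."""
    k = len(matrix[0])
    item_vars = [statistics.pvariance([row[j] for row in matrix]) for j in range(k)]
    total_var = statistics.pvariance([sum(row) for row in matrix])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


split_half = split_half_reliability(responses)
alpha = cronbach_alpha(responses)

print(f"Split-half (Spearman-Brown): {split_half:.3f}")
print(f"Cronbach's alpha: {alpha:.3f}")
```

Both estimates come out low here (roughly 0.46 and 0.50) because the toy test has only four items and six test-takers. The same `pearson` helper would serve for test-retest or parallel-forms reliability, given two lists of total scores from the two administrations.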

CTT vs. IRT:

Feature | CTT | IRT
Unit of analysis | Total test score | Individual item-person interaction
Sample dependence | Item statistics depend on the sample | Item parameters are (theoretically) sample-independent
Score precision | Assumed equal for all test-takers | Varies by ability level
Complexity | Simple to calculate | Requires specialized software and larger samples
Still used? | Yes, widely | Yes, for high-stakes tests
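
The sample-dependence row can be made concrete with a small simulation: the same item receives a different CTT difficulty index depending on who takes it. The sketch below is illustrative only. Responses are generated from a one-parameter logistic model purely as a convenient way to produce data, and the function name, ability distributions, and sample sizes are assumptions.

```python
import math
import random


def simulate_item_p_value(ability_mean, n=5000, item_location=0.0, seed=0):
    """CTT difficulty index for one item in a simulated sample.

    Responses are generated from a one-parameter logistic model, an
    assumption made only to produce data; the item itself is identical
    in every sample.
    """
    rng = random.Random(seed)
    correct = 0
    for _ in range(n):
        ability = rng.gauss(ability_mean, 1.0)
        prob_correct = 1.0 / (1.0 + math.exp(-(ability - item_location)))
        if rng.random() < prob_correct:
            correct += 1
    return correct / n


# Same item, two groups that differ only in average ability.
p_low = simulate_item_p_value(ability_mean=-1.0, seed=1)
p_high = simulate_item_p_value(ability_mean=1.0, seed=2)

print(f"p-value in lower-ability sample:  {p_low:.2f}")
print(f"p-value in higher-ability sample: {p_high:.2f}")
```

The item looks hard in the first group and easy in the second even though nothing about the item changed, which is why CTT item statistics should be interpreted relative to the sample that produced them.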

Why CTT persists:

Despite IRT’s theoretical advantages, CTT remains widely used because:

  • It’s simpler to implement and understand
  • It works well for classroom tests, formative assessments, and smaller samples
  • Its statistics (Cronbach’s alpha, item difficulty, item discrimination) are familiar to most educators
  • For many practical purposes, CTT and IRT produce similar conclusions

Research

  • Crocker, L., & Algina, J. (2008). Introduction to Classical and Modern Test Theory. Cengage Learning. — Standard textbook covering both CTT and IRT.
  • Bachman, L. F. (1990). Fundamental Considerations in Language Testing. Oxford University Press. — Applies CTT concepts specifically to language testing contexts.