Definition:
A lexeme is the abstract lexical unit that underlies a set of grammatically related word-forms. The lexeme WALK (conventionally written in small capitals) unifies the surface word-forms walk, walks, walked, walking, all of which are inflected realizations of the same vocabulary item. Lexemes are the unit counted in vocabulary size estimates and listed as dictionary entries: when learners are said to “know 10,000 words,” they know roughly 10,000 lexemes. The lexeme concept is essential for distinguishing the vocabulary unit (KNOW) from its surface forms (know, knows, knew, known, knowing) and for understanding what it means to “learn a word” in second language acquisition.
Lexeme vs. Word-Form vs. Token
Three levels of analysis apply to words:
| Term | Definition | Example |
|---|---|---|
| Token | Any individual occurrence of a word in text | “The dog saw the dog” = 5 tokens |
| Word-form | A specific inflected surface form | dog, dogs, dog’s = 3 word-forms |
| Lexeme | The abstract unit underlying related forms | DOG = 1 lexeme (covering dog, dogs, dog’s) |
When counting vocabulary size, linguists count lexemes. When counting text frequency, they count tokens. Word-forms sit in between.
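The three counting levels can be illustrated with a minimal sketch. The `FORM_TO_LEXEME` table and the `count_levels` function are hypothetical names made up for this example; a real system would use a lemmatizer and would need context to decide, for instance, whether saw realizes SEE or the noun SAW.

```python
# Hypothetical toy lookup from word-form to lexeme (assumption for illustration only).
FORM_TO_LEXEME = {
    "the": "THE",
    "dog": "DOG",
    "dogs": "DOG",
    "saw": "SEE",  # assumes the verb reading; the noun SAW would be a different lexeme
    "see": "SEE",
}

def count_levels(text):
    """Return (token count, distinct word-form count, distinct lexeme count)."""
    tokens = text.lower().split()          # every occurrence counts
    word_forms = set(tokens)               # distinct surface forms
    # Unknown forms fall back to themselves in upper case.
    lexemes = {FORM_TO_LEXEME.get(form, form.upper()) for form in word_forms}
    return len(tokens), len(word_forms), len(lexemes)

counts = count_levels("The dogs saw the dog")
print(counts)
```

For this sentence the sketch counts five tokens, four word-forms (the, dogs, saw, dog), and three lexemes (THE, DOG, SEE).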
Inflectional Forms of a Lexeme
A paradigm is the full set of inflectional forms realizing a single lexeme:
English verb lexeme WRITE:
| Form | Word-form |
|---|---|
| Base | write |
| 3rd person singular present | writes |
| Past tense | wrote |
| Past participle | written |
| Present participle | writing |
All five are inflectional forms of the single lexeme WRITE.
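One possible way to represent a paradigm in code is as a record with one slot per grammatical category. The class name `VerbParadigm` and its field names are assumptions chosen for this sketch, not a standard representation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class VerbParadigm:
    """One English verb lexeme with a slot for each inflectional category."""
    base: str
    third_singular: str
    past: str
    past_participle: str
    present_participle: str

    def forms(self):
        # Distinct word-forms; two slots may share one form (syncretism).
        return {
            self.base,
            self.third_singular,
            self.past,
            self.past_participle,
            self.present_participle,
        }

WRITE = VerbParadigm("write", "writes", "wrote", "written", "writing")
WALK = VerbParadigm("walk", "walks", "walked", "walked", "walking")

print(len(WRITE.forms()), len(WALK.forms()))
```

Keeping slots separate from forms shows why WALK has only four distinct word-forms even though it fills the same five slots as WRITE: its past tense and past participle coincide.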
Lexeme and Homonymy vs. Polysemy
Two distinct lexemes can share the same word-form (homonymy):
- bank (financial institution) = one lexeme
- bank (riverbank) = a different lexeme
A single lexeme can have multiple related senses (polysemy):
- run meaning “move quickly,” “manage,” “function,” “be a candidate” = one lexeme, multiple senses
Homonyms get separate dictionary entries; polysemous senses appear under one entry — this reflects underlying lexeme identity.
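The dictionary convention described above can be sketched as a small data structure: homonyms are separate entries that happen to share a headword, while polysemy is a list of senses inside one entry. The `Lexeme` class, the `LEXICON` list, and `entries_for` are hypothetical names for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Lexeme:
    """One dictionary entry: a headword plus its related senses."""
    lemma: str
    senses: list = field(default_factory=list)

# Toy lexicon (assumption): two homonymous entries for "bank", one polysemous "run".
LEXICON = [
    Lexeme("bank", ["financial institution"]),
    Lexeme("bank", ["riverbank"]),
    Lexeme("run", ["move quickly", "manage", "function", "be a candidate"]),
]

def entries_for(form):
    """Return every lexeme whose headword matches the given form."""
    return [lex for lex in LEXICON if lex.lemma == form]

print(len(entries_for("bank")))          # number of homonymous lexemes
print(len(entries_for("run")[0].senses)) # number of senses of one lexeme
```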
Lexeme and Derivation
Derivation creates new lexemes from existing ones:
- WRITE (verb) → WRITER (noun): a new lexeme
- HAPPY (adj) → HAPPINESS (noun): a new lexeme
- HAPPY (adj) → UNHAPPY (adj): a new lexeme
By contrast, inflection does not create a new lexeme — walked is still the lexeme WALK.
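The contrast can be made concrete with a small check: two word-forms related by inflection map to the same lexeme, while a derived form maps to a different one. The mapping below is hand-written toy data and `same_lexeme` is a hypothetical helper, not part of any library.

```python
# Hypothetical toy data: each word-form mapped to the lexeme it realizes.
FORM_TO_LEXEME = {
    "walk": "WALK", "walks": "WALK", "walked": "WALK", "walking": "WALK",
    "write": "WRITE", "wrote": "WRITE",
    "writer": "WRITER", "writers": "WRITER",
    "happy": "HAPPY", "happier": "HAPPY",
    "happiness": "HAPPINESS",
    "unhappy": "UNHAPPY",
}

def same_lexeme(form_a, form_b):
    """True when both word-forms are inflectional realizations of one lexeme."""
    return FORM_TO_LEXEME[form_a] == FORM_TO_LEXEME[form_b]

print(same_lexeme("walk", "walked"))   # inflection: same lexeme
print(same_lexeme("write", "writer"))  # derivation: different lexemes
```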
Lexeme in L2 Acquisition
For vocabulary learning research, the lexeme is the appropriate unit:
- Vocabulary size in L2 learners is measured in lexemes (the “lexical coverage” research, Nation 2001)
- A learner needs to know roughly 8,000–9,000 word families for comfortable reading
- A word family extends the lexeme concept to include derivationally related forms (write, writes, wrote, written, writing, and writer as one word family)
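Because vocabulary size depends on the counting unit, the same learner can be described with different numbers. The sketch below assumes a hand-written grouping of lexemes into word families; `LEXEME_TO_FAMILY` and `vocabulary_size` are hypothetical names for illustration.

```python
# Hypothetical grouping of lexemes into word families, keyed by the family's base lexeme.
LEXEME_TO_FAMILY = {
    "WRITE": "WRITE",
    "WRITER": "WRITE",
    "HAPPY": "HAPPY",
    "HAPPINESS": "HAPPY",
    "UNHAPPY": "HAPPY",
    "WALK": "WALK",
}

def vocabulary_size(known_lexemes):
    """Count a learner's vocabulary in lexemes and in word families."""
    lexemes = set(known_lexemes)
    # A lexeme missing from the table is treated as its own family.
    families = {LEXEME_TO_FAMILY.get(lex, lex) for lex in lexemes}
    return {"lexemes": len(lexemes), "word_families": len(families)}

known = ["WRITE", "WRITER", "HAPPY", "UNHAPPY", "WALK"]
print(vocabulary_size(known))
```

Here five known lexemes collapse into three word families, which is why a figure stated in word families corresponds to a larger number of lexemes.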
History
The term lexeme was introduced and formalized in linguistics by Lyons (1968) and widely adopted in subsequent lexicological and morphological theory. Earlier structuralists had intuited the concept but lacked a unified term. Harris (1951) and Bloomfield (1933) worked with similar notions under other labels (morpheme, word). The lexeme concept became central to lexicography and computational linguistics in the late 20th century.
Common Misconceptions
- “A word = a lexeme”: A word-form is not the same as a lexeme; walked, walk, walking are three word-forms but one lexeme, WALK
- “Words in a dictionary are words, not lexemes”: Dictionary headwords (lemmas) are citation forms that stand for lexemes; the dictionary lists lexemes, not every inflected form
Criticisms
- The boundary between inflection (which doesn’t create a new lexeme) and derivation (which does) is theoretically debated; some analyses treat all derivation as new-lexeme-creating while others are more conservative
Social Media Sentiment
The term “lexeme” is primarily academic but underlies widely discussed topics like “how many words do you need to know?” and vocabulary size debates in language learning communities. Last updated: 2026-04
Practical Application
- Teach vocabulary in word families: learn write, then recognize writes, writing, wrote, written as forms of the same lexeme and writer as a derived member of the same family, which reduces the memorization burden
- Word-family awareness is a core vocabulary-building strategy recommended by researchers like Nation
Related Terms
- Morpheme
- Inflectional Morphology
- Derivational Morphology
- Free Morpheme
- Affixation
- Second Language Acquisition
Research
- Lyons, J. (1968). Introduction to Theoretical Linguistics. Cambridge University Press. — Introduced and defined the lexeme as a theoretical unit.
- Nation, I. S. P. (2001). Learning Vocabulary in Another Language. Cambridge University Press. — Seminal treatment of vocabulary size in L2; uses lexeme-family as the unit of vocabulary knowledge.
- Bauer, L., & Nation, P. (1993). Word families. International Journal of Lexicography, 6(4), 253–279. — Defines word family (derivational + inflectional extension of lexeme) and documents coverage implications.