Word Family

Definition:

A word family is a group of words that share the same base (root) word: the base itself, its inflected forms (plurals, past tenses, comparatives), and its derived forms (variants built with prefixes or suffixes that change grammatical category or meaning). For example, the word family for teach includes: teach, teaches, taught, teaching, teacher, teachers, teachable, unteachable, reteach. The word family is a fundamental unit in vocabulary size measurement: when researchers say “a learner needs 3,000 word families” for basic reading comprehension, they mean 3,000 distinct bases together with their related forms. Paul Nation’s work on vocabulary frequency and size relies heavily on the word family as the unit of counting.


Components of a Word Family

A word family includes:

  • Base word: The canonical dictionary entry form (teach)
  • Inflections: Grammatical variants that don’t change lexical category or core meaning (teaches, taught, teaching)
  • Derivatives: Forms created by adding derivational affixes, which often change grammatical category (teacher [noun], teachable [adjective], unteachable [adjective])

A key boundary issue: How far does derivation extend? Researchers differ on whether semantically distant derivatives should count as the same family:

  • teach → teacher: clearly the same family
  • sign → signal → significant → significance: where does the family stop?

Nation’s BNC/COCA word family lists use relatively conservative criteria.

Word Family vs. Lemma vs. Type

Unit | Definition | Example (for teach)
Word type | Each distinct word form | teach, teaches, taught, teaching, teacher… (separate types)
Lemma | Base form + inflections only | teach, teaches, taught, teaching
Word family | Base + inflections + derivatives | teach, teacher, teachable, unteachable…

For vocabulary size estimation, word family is the most common unit because it credits learners appropriately for knowing related forms they can often infer morphologically.
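
The difference between the three counting units can be made concrete with a small sketch. The mapping below is hypothetical toy data written for this illustration, not taken from any published word list; it assigns each form of teach a lemma and a family headword, then counts the same tokens three ways.

```python
# Hypothetical toy data: each word form maps to (lemma, family headword).
# Real studies would use a published list; this mapping is illustrative only.
FORMS = {
    "teach": ("teach", "teach"),
    "teaches": ("teach", "teach"),
    "taught": ("teach", "teach"),
    "teaching": ("teach", "teach"),
    "teacher": ("teacher", "teach"),
    "teachers": ("teacher", "teach"),
    "teachable": ("teachable", "teach"),
    "unteachable": ("unteachable", "teach"),
}


def count_units(tokens):
    """Count distinct word types, lemmas, and word families in a token list."""
    types = set(tokens)
    lemmas = {FORMS[t][0] for t in types}
    families = {FORMS[t][1] for t in types}
    return {"types": len(types), "lemmas": len(lemmas), "families": len(families)}


counts = count_units(
    ["teach", "teaches", "taught", "teacher", "teachers", "unteachable", "teach"]
)
print(counts)  # {'types': 6, 'lemmas': 3, 'families': 1}
```

The same seven tokens yield six types, three lemmas, and one word family, which is why vocabulary size figures are only comparable when they use the same unit.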

Why Word Families Matter for Vocabulary Size

Nation’s research established minimum vocabulary coverage thresholds:

  • 98% text coverage is needed for comfortable reading with occasional dictionary use
  • 3,000 word families cover about 90% of most texts
  • 5,000 word families cover about 95%
  • 8,000–9,000 word families are needed for 98% coverage of general written text (Nation, 2006)

These targets guide curriculum design and learner goal-setting.
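
Text coverage is simply the share of running words (tokens) in a text whose word family the learner knows. A minimal sketch of that calculation follows; the `FAMILY_OF` mapping, the sample sentence, and the set of known families are all hypothetical, and the tokenizer is deliberately crude.

```python
import re

# Hypothetical mapping from word form to family headword (illustrative only).
FAMILY_OF = {
    "teach": "teach", "teaches": "teach", "teacher": "teach", "taught": "teach",
    "student": "student", "students": "student",
    "the": "the", "a": "a",
    "curriculum": "curriculum",
}


def coverage(text, known_families):
    """Return the fraction of tokens whose word family is in known_families."""
    tokens = re.findall(r"[a-z]+", text.lower())  # crude ASCII-only tokenizer
    if not tokens:
        return 0.0
    covered = sum(1 for t in tokens if FAMILY_OF.get(t) in known_families)
    return covered / len(tokens)


text = "The teacher taught the students a curriculum."
known = {"the", "a", "teach", "student"}
print(f"{coverage(text, known):.0%}")  # 86% (6 of 7 tokens)
```

Read the other way around, coverage gives the density of unknown words: at 90% coverage about one token in 10 is unknown, at 95% one in 20, and at 98% one in 50.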

Word Families and Japanese

Applying the word family concept to Japanese is complicated by:

  • Kanji compounds (jukugo): Two or more kanji combine to form new words — knowing individual kanji gives partial access to compound meanings (e.g., 学 [study] + 校 [school] = 学校 [school])
  • Verb forms: Japanese verbs are highly agglutinative — a single verb root generates many conjugated forms (て-form, た-form, ない-form, potential form, causative, passive, causative-passive…)
  • Reading variants: A kanji has multiple readings (on-yomi, kun-yomi) that function differently in different word families
  • Wago/Kango overlap: Many kanji have both native Japanese (kun) and Sino-Japanese (on) readings used in different word family contexts (山 yama “mountain” vs. 山脈 sanmyaku “mountain range”)

Testing Vocabulary Size Using Word Families

The Vocabulary Levels Test (VLT) — designed by Paul Nation and revised by Schmitt — organizes test items by frequency bands of word families (1,000, 2,000, 3,000, 5,000, 10,000 word families) to diagnose where a learner’s vocabulary coverage breaks down.


History

The word family concept was formalized in applied linguistics by Bauer and Nation (1993), who proposed a systematic framework for defining word families at different levels of affixal complexity. A word family consists of a base word, its inflected forms, and its transparent derived forms; for example, “develop,” “develops,” “developing,” “developed,” “development,” and “developer” constitute one word family. Bauer and Nation established seven levels of word families, progressing from treating each form as a separate word (Level 1), through inflections only (Level 2) and the most productive and frequent derivational affixes (Levels 3-6), to rare and classical affixes (Level 7). Nation’s (2001, 2006) vocabulary frequency lists, organized by word families, became the standard tool for vocabulary size and text coverage research. The word family unit underpins the most widely used vocabulary size tests and frequency lists in applied linguistics.


Common Misconceptions

“All forms in a word family are equally easy to learn.”

Research shows that knowing a base word does not guarantee knowledge of its derived forms. A learner who knows “develop” may not know “developmental” or “underdevelopment.” The word family assumption — that knowing the base provides access to all family members — overestimates learner knowledge.

“Word families are the same in all languages.”

The word family concept was developed for English, which has productive derivational morphology (prefixes and suffixes that create new words). Japanese word formation works differently: kanji compounds (漢語), native Japanese derivation (和語), and loanword formation (外来語) do not map onto the English word family model.

“Learning word families is the most efficient vocabulary strategy.”

Learning morphological patterns (prefixes, suffixes, kanji components) that help decode unfamiliar family members is efficient, but only after learners have sufficient base vocabulary to apply these patterns. Beginners benefit more from learning individual high-frequency words than from morphological analysis.

“The word family is the only useful unit for counting vocabulary.”

Lemma-based counting (base form + inflections, excluding derivations) and flemma-based counting (base form + inflections, grouped by spelling regardless of part of speech) are alternatives that may better represent what learners actually know, especially for learners who have not acquired the morphological knowledge to connect family members.


Criticisms

The word family as the standard unit for vocabulary research has been challenged by McLean (2018) and others who argue that it overestimates learner vocabulary knowledge. If a learner knows “compete” but not “competitive” or “competitiveness,” counting all three as known (because they belong to the same family) inflates vocabulary size estimates and may produce misleading coverage calculations.
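
The size of this inflation can be illustrated with a toy calculation. The families and the learner’s known forms below are hypothetical examples, not data from McLean (2018); the sketch compares the number of forms credited under the word family assumption with the number the learner actually knows.

```python
# Hypothetical families and learner knowledge, for illustration only.
FAMILIES = {
    "compete": ["compete", "competitive", "competitiveness"],
    "develop": ["develop", "development", "developmental"],
}
known_forms = {"compete", "develop", "development"}


def family_credit(families, known):
    """Forms credited as known when knowing the base implies the whole family."""
    return sum(len(members) for base, members in families.items() if base in known)


def actual_known(families, known):
    """Forms the learner actually knows, counted one by one."""
    return sum(1 for members in families.values() for m in members if m in known)


credited = family_credit(FAMILIES, known_forms)
actual = actual_known(FAMILIES, known_forms)
print(credited, actual)  # 6 3
```

In this toy case the family-based count credits six forms while the learner knows three, so any coverage figure built on the family count would be correspondingly optimistic.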

The concept has been criticized for being English-centric: Japanese, Chinese, and other languages with different morphological systems are poorly served by the word family framework. Japanese vocabulary research requires alternative units — kanji-based approaches, compound analysis, or morpheme-level counting — that the standard word family model does not provide. Additionally, the Bauer and Nation levels, while systematic, involve subjective decisions about which affixes are “transparent” at each level, leading to inconsistency in how different researchers define word families.


Social Media Sentiment

The word family concept is not widely discussed by name in language learning communities, but the underlying principle influences vocabulary learning practice. Learners commonly share “word family lists” (teach → teacher, teaching, taught) and discuss how learning roots and affixes can accelerate vocabulary growth.

In Japanese learning communities, the closest equivalent is kanji-component learning: recognizing that 食 appears in 食べる, 食事, 食堂, 食品 provides a morphological grouping similar to a word family. The advice to “learn kanji radicals/components” connects to the word family principle of morphological generalization.


Practical Application

  1. Learn morphological patterns — Study common prefixes, suffixes, and word-building patterns in your target language. In English: un-, re-, -tion, -ment. In Japanese: kanji compound patterns (漢語), common suffixes (〜的, 〜性, 〜化).
  2. Don’t assume family members are automatically known — When you learn a base word, explicitly note and study its important derived forms. “Develop” → “development” is not automatic knowledge; it needs exposure.
  3. Use word families for vocabulary estimation — When assessing your vocabulary size, word family counts give a rough but useful estimate. But be honest about which family members you actually know versus which you’re counting by assumption.
  4. Build kanji component awareness for Japanese — Recognizing shared kanji across words (会議, 会社, 会話) provides a Japanese-specific version of word family grouping that aids vocabulary expansion.

Research

Bauer and Nation (1993) established the systematic framework for defining word family levels. Nation (2006) used the word family unit to develop the BNC/COCA frequency-based word lists that underpin vocabulary size research.

McLean (2018) challenged the word family assumption, providing empirical evidence that L2 learners’ knowledge of derived forms within a word family is inconsistent: knowing a base word does not reliably predict knowledge of its derivations. This finding has implications for vocabulary size estimation and has led to increased interest in lemma-based and flemma-based alternatives.

For Japanese vocabulary research, Matsushita (2012) developed frequency-based lists using unit definitions that differ from the English word family, recognizing that Japanese vocabulary structure requires different analytical units because of the roles of kanji compounds, katakana loanwords, and native vocabulary in the Japanese lexicon.