Definition:
Forvo (forvo.com) is a free online pronunciation dictionary that hosts human-recorded audio of words, names, and phrases in over 400 languages, spoken by self-declared native speakers from different regions. When a learner encounters an unfamiliar word and wants to know how it actually sounds, rather than how a text-to-speech engine approximates it, Forvo provides recordings from real people who speak that language natively. Multiple recordings for the same word are often available, reflecting regional accent variation. Forvo is widely used as a reference pronunciation resource, particularly for languages where learners have limited access to native speakers.
In-Depth Explanation
Text-to-speech (TTS) engines have improved significantly, but they produce synthetic speech that may mismodel prosody, stress, and phoneme realization — especially for less-resourced languages. Forvo addresses this with human recordings: every audio file on the platform was recorded by a registered user who has indicated native or near-native speaker status for that language.
How It Works
Looking up a word:
Enter a word in any language into the Forvo search bar. If the word is in the database:
- All available recordings are listed, one per native speaker contributor
- Each recording shows the speaker’s username, country of origin, and (sometimes) regional dialect
- Each recording can be streamed or downloaded
- Users can upvote or downvote recordings for quality
Multiple recordings:
Because multiple native speakers contribute pronunciations for the same word, learners can hear regional accent variation. Searching for a Spanish word might return recordings by speakers from Spain, Mexico, Argentina, and Colombia, illustrating how pronunciation differs across varieties of the same language.
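The regional spread of recordings can be pictured as a simple grouping step. The following is a minimal sketch in Python, assuming a hypothetical Recording record with word, username, and country fields. It does not reflect Forvo's actual data model or API, and the sample entries are invented for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Recording:
    """Hypothetical record for one contributed pronunciation (not Forvo's schema)."""
    word: str
    username: str
    country: str


def group_by_country(recordings):
    """Return a dict mapping each country to the usernames who recorded the word."""
    grouped = defaultdict(list)
    for rec in recordings:
        grouped[rec.country].append(rec.username)
    return dict(grouped)


# Invented sample data for illustration only.
samples = [
    Recording("calle", "speaker_a", "Spain"),
    Recording("calle", "speaker_b", "Mexico"),
    Recording("calle", "speaker_c", "Argentina"),
    Recording("calle", "speaker_d", "Spain"),
]

by_country = group_by_country(samples)
for country, speakers in sorted(by_country.items()):
    print(f"{country}: {', '.join(speakers)}")
```

Grouping by country lets a learner compare, for example, the two Spain recordings against the Mexico and Argentina ones for the same word.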
Request system:
If a word is not in the database, any registered user can submit a pronunciation request. Native speakers then record it. This crowd-sourcing mechanism means the database expands continuously based on learner demand.
Language Coverage
Forvo covers over 400 languages, including:
- Major world languages (English, Spanish, French, Mandarin, Arabic, Japanese, Korean, Russian, etc.)
- Regional varieties within languages (British vs. American English, Castilian vs. Latin American Spanish)
- Less-resourced languages (minority and endangered languages, indigenous languages, smaller national languages)
The database contains millions of recordings, making it one of the largest human-voice pronunciation dictionaries online.
Integration with Other Tools
Forvo is commonly integrated into language learning tools:
- Anki add-ons: Multiple Anki add-ons can pull Forvo audio automatically when creating flashcards, adding native-speaker pronunciation audio to cards without manual recording
- Yomitan: Can be configured to fetch Forvo audio for Japanese words during hover-over lookups
- Pleco: Forvo recordings are available as an optional audio source for Chinese word lookups
- Language learning browser extensions: Various extensions query Forvo for pronunciation alongside dictionary lookups
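As a rough illustration of the flashcard step, the sketch below shows how a tool might name a downloaded audio file and reference it from an Anki field. Anki plays media referenced with the [sound:filename] syntax. The function name, the forvo_ filename prefix, and the naming scheme are assumptions made for this example and are not taken from any particular add-on. Downloading the audio itself is out of scope here.

```python
import re


def anki_sound_field(word, language_code, extension="mp3"):
    """Build a media filename and the matching Anki [sound:...] field text.

    The naming scheme is hypothetical; real add-ons choose their own.
    """
    # Replace runs of non-word characters (spaces, hyphens, etc.) with underscores.
    safe_word = re.sub(r"[^\w]+", "_", word.strip()).strip("_")
    filename = f"forvo_{language_code}_{safe_word}.{extension}"
    return filename, f"[sound:{filename}]"


filename, field = anki_sound_field("Aix-en-Provence", "fr")
print(filename)  # file to place in the Anki media folder
print(field)     # text to put in the card's audio field
```

Keeping the filename and the field text together in one helper avoids the two drifting apart, which would leave a card pointing at a file that does not exist.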
Names and Proper Nouns
Forvo is particularly valuable for names — personal names, place names, and brand names — that dictionaries often omit. Knowing how to pronounce a Japanese person’s name, a French city, or a Korean company correctly is difficult without a native-speaker reference. Forvo’s proper noun database is extensive.
History
- 2008: Forvo founded in Spain; initially launched as a Spanish-language pronunciation community.
- 2009–2011: Rapid expansion to cover dozens of languages as the platform attracted native speaker contributors globally; relaunched as a fully multilingual service.
- 2010s: Became a widely used pronunciation reference integrated into language learning workflows worldwide; API made available for third-party tool integration.
- Ongoing: Database continues growing through community contributions; API used by hundreds of educational tools.
Common Misconceptions
“Forvo recordings might not be accurate — anyone can contribute.”
Forvo has a quality control mechanism: registered users must declare their native language, and other users can rate recordings. Low-quality recordings are downvoted and higher-quality recordings rise to the top. While imperfect, this system generally surfaces reliable pronunciations. Learners are advised to check multiple recordings and prefer highly-rated contributors.
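The advice to prefer highly rated recordings can be sketched as a sort. The example below is a minimal Python illustration that assumes each recording carries upvote and downvote counts and ranks by net votes. The source does not describe Forvo's actual ranking formula, so the scoring rule, the RatedRecording type, and the sample data are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RatedRecording:
    """Hypothetical recording with community vote counts (not Forvo's schema)."""
    username: str
    upvotes: int
    downvotes: int

    @property
    def net_votes(self):
        return self.upvotes - self.downvotes


def rank_recordings(recordings):
    """Sort by net votes, highest first; ties go to the recording with more upvotes."""
    return sorted(recordings, key=lambda r: (r.net_votes, r.upvotes), reverse=True)


# Invented sample data for illustration only.
candidates = [
    RatedRecording("speaker_a", upvotes=3, downvotes=2),
    RatedRecording("speaker_b", upvotes=12, downvotes=1),
    RatedRecording("speaker_c", upvotes=5, downvotes=0),
]

ranked = rank_recordings(candidates)
print([r.username for r in ranked])
```

A learner following the advice above would listen to the first few entries of the ranked list rather than relying on a single recording.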
“TTS is good enough — I don’t need Forvo.”
For common words in major languages, modern TTS is often reliable. But for proper nouns, regional vocabulary, technical terms, and minority languages, TTS audio may be absent, incorrect, or unnatural. In these cases, Forvo's human recordings supply pronunciations that TTS engines often cannot match.
Social Media Sentiment
- r/languagelearning: Near-universal positive mention as a reliable pronunciation reference. Particularly praised for proper noun coverage and multi-regional recording variety.
- r/LearnJapanese: Frequently cited for Japanese name and place name pronunciation — an area where other tools fall short.
- Language learning podcasts and YouTube: Frequently included in “essential tools” resource lists across language learning communities.
Last updated: 2026-04
Research
- Derwing, T. M., & Munro, M. J. (2005). Second language accent and pronunciation teaching: A research-based approach. TESOL Quarterly, 39(3), 379–397. https://doi.org/10.2307/3588486
Summary: Establishes that access to accurate native-speaker pronunciation models is foundational for pronunciation development — the core rationale for tools like Forvo that provide authentic human recordings.
- Bauer, L., & Nation, P. (1993). Word families. International Journal of Lexicography, 6(4), 253–279. https://doi.org/10.1093/ijl/6.4.253
Summary: Word form knowledge — including phonological form — is established as a core component of vocabulary knowledge, supporting the value of pronunciation reference tools in vocabulary acquisition.