Search and filter 500+ world languages. Click any card for a full typological profile.
Hierarchical view of all language families, branches, and individual languages.
Intelligibility matching, grammar similarity, side-by-side comparison, timeline, and world statistics.
Select a language to see which other languages its speakers can understand, and to what degree. Based on Gooskens (2007) and Tang & van Heuven (2009).
Find languages that share grammatical features across families. Scored via WALS-derived typological features.
Select 2–5 languages for a side-by-side typological comparison. Matching features are highlighted.
Extinct and historical languages with their approximate dates of last attestation.
Global overview of language diversity, distribution, and endangerment.
Data provenance, scoring methodology, and academic references. All typological data derives from peer-reviewed databases and published literature.
Comprehensive catalogue of all known language families, languages, and dialects. Source for ISO 639-3 codes, family classification, and geographic coordinates.
Cross-linguistic database of structural (typological) properties. Primary source for word order, morphological type, case alignment, gender, and evidentiality data.
Repository of cross-linguistic phonological inventory data. Source for consonant/vowel inventory sizes, click phonemes, ejectives, tonal classifications, and nasal vowels.
Classification of languages by endangerment: safe, vulnerable, definitely endangered, severely endangered, critically endangered, and dormant/extinct. Used for all unescoStatus values.
Reference catalogue of all known living languages. Source for speaker population estimates, country-level distribution, and EGIDS vitality levels.
Cross-linguistic morphological database. Secondary source for morphological type and case system complexity.
Intelligibility scores represent the percentage of content a native speaker can understand without prior study of the target language. Scores are directional: Portuguese → Spanish intelligibility differs from Spanish → Portuguese.
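Because scores are directional, a lookup keyed on an ordered (listener, target) pair is one natural way to store them. The following is a minimal sketch, not GlossaForge's actual data model: the `INTELLIGIBILITY` table, the `intelligibility` function, and the two numbers are hypothetical placeholders, stored as fractions between 0 and 1 rather than as measured values from the cited studies.

```python
from typing import Optional

# Placeholder numbers for illustration only; NOT values from the cited studies.
# Keys are ordered pairs of ISO 639-3 codes: (listener's language, target language).
INTELLIGIBILITY: dict[tuple[str, str], float] = {
    ("por", "spa"): 0.6,  # Portuguese speakers listening to Spanish
    ("spa", "por"): 0.5,  # Spanish speakers listening to Portuguese
}


def intelligibility(listener: str, target: str) -> Optional[float]:
    """Return the stored score for this direction, or None if no data exists."""
    if listener == target:
        return 1.0  # a language is fully intelligible to its own speakers
    # No fallback to the reverse pair: the two directions are separate measurements.
    return INTELLIGIBILITY.get((listener, target))
```

Returning `None` for a missing pair, instead of reusing the opposite direction, keeps the asymmetry visible to callers.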
Key literature:
| Feature | Weight | Source |
|---|---|---|
| Word Order (SOV/SVO/VSO…) | 15% | WALS ch. 81A |
| Morphological Type | 15% | WALS ch. 26A |
| Case Alignment | 12% | WALS ch. 98A |
| Case Count | 8% | WALS ch. 49A |
| Tone System | 10% | PHOIBLE / WALS ch. 13A |
| Gender/Noun Class System | 8% | WALS ch. 30A |
| Evidentiality | 7% | WALS ch. 77A |
| Vowel Harmony | 5% | WALS ch. 116A |
| Consonant Inventory Size | 5% | PHOIBLE / WALS ch. 1A |
| Vowel Inventory Size | 5% | PHOIBLE / WALS ch. 2A |
| Definiteness Strategy | 5% | WALS ch. 37A |
| Negation Strategy | 5% | WALS ch. 143A |
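The weights in the table sum to 100%, so they can be combined into a single similarity score. The sketch below assumes the simplest possible rule: a feature contributes its full weight when two languages have exactly the same value and nothing otherwise. That rule, the feature key names, and the `grammar_similarity` function are assumptions made for illustration; the table does not say how individual features are compared.

```python
# Weights copied from the table above, as fractions. Key names are hypothetical.
WEIGHTS = {
    "word_order": 0.15,
    "morphological_type": 0.15,
    "case_alignment": 0.12,
    "case_count": 0.08,
    "tone_system": 0.10,
    "gender_system": 0.08,
    "evidentiality": 0.07,
    "vowel_harmony": 0.05,
    "consonant_inventory": 0.05,
    "vowel_inventory": 0.05,
    "definiteness": 0.05,
    "negation": 0.05,
}


def grammar_similarity(a: dict, b: dict) -> float:
    """Sum the weights of the features on which both profiles agree exactly.

    Features missing from either profile contribute nothing (assumed rule).
    """
    score = 0.0
    for feature, weight in WEIGHTS.items():
        if feature in a and feature in b and a[feature] == b[feature]:
            score += weight
    return score


# Partial, illustrative profiles; a real profile would fill in every feature.
turkish = {"word_order": "SOV", "morphological_type": "agglutinative", "vowel_harmony": True}
japanese = {"word_order": "SOV", "morphological_type": "agglutinative", "vowel_harmony": False}
```

With these partial profiles, `grammar_similarity(turkish, japanese)` counts only word order and morphological type, giving roughly 0.30. A graded comparison for numeric features such as case count or inventory size would be a reasonable refinement.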
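Safe: spoken in all domains, with secure intergenerational transmission. Vulnerable: most children speak the language, but its use may be restricted to certain domains. Definitely endangered: children no longer learn the language as a mother tongue. Severely endangered: very few speakers, mostly of the grandparent generation or older. Critically endangered: near extinction, with only a few elderly speakers who use the language infrequently. Extinct: no native speakers remain.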
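The six levels form an ordered scale, which an integer-backed enum captures directly. This is a sketch under assumptions: the `UnescoStatus` class, its member names, and the `is_endangered` helper are hypothetical and only mirror the levels listed above.

```python
from enum import IntEnum


class UnescoStatus(IntEnum):
    """Endangerment levels, ordered from least to most severe (names assumed)."""
    SAFE = 0
    VULNERABLE = 1
    DEFINITELY_ENDANGERED = 2
    SEVERELY_ENDANGERED = 3
    CRITICALLY_ENDANGERED = 4
    EXTINCT = 5


def is_endangered(status: UnescoStatus) -> bool:
    """True for every level strictly between SAFE and EXTINCT."""
    return UnescoStatus.SAFE < status < UnescoStatus.EXTINCT
```

Using `IntEnum` means levels can be compared and sorted, for example to filter for languages at `SEVERELY_ENDANGERED` or worse.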
Canonical order of Subject (S), Verb (V), and Object (O) in declarative clauses. SOV (e.g. Japanese), SVO (English), VSO (Welsh), VOS (Malagasy), OVS (Hixkaryana), OSV (rare). Source: WALS 81A.
Isolating: words are typically single morphemes (Mandarin). Agglutinative: morphemes stack with clear boundaries, each carrying one meaning (Turkish). Fusional: a single morpheme blends several grammatical meanings (Russian). Polysynthetic: a single word can encode what other languages express as a whole sentence (Inuktitut).
Three-letter code identifying an individual human language, maintained by SIL International. GlossaForge uses ISO 639-3 codes as its primary language identifiers (e.g., eng = English, cmn = Mandarin Chinese).
Unique identifier from the Glottolog catalogue. Unlike ISO 639-3, glottocodes also cover dialects, proto-languages, and language families at any node in the tree.
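The two identifier schemes have different shapes, so a string can be classified by pattern alone. The sketch below assumes the usual formats: three lowercase letters for ISO 639-3, and four lowercase alphanumeric characters followed by four digits for a glottocode (for example `stan1293`). The `identifier_kind` function is a hypothetical helper; it checks shape only and does not confirm that a code is actually assigned.

```python
import re

# ISO 639-3: exactly three lowercase ASCII letters, e.g. "eng", "cmn".
ISO_639_3 = re.compile(r"[a-z]{3}")
# Glottocode: four lowercase alphanumerics followed by four digits, e.g. "stan1293".
GLOTTOCODE = re.compile(r"[a-z0-9]{4}[0-9]{4}")


def identifier_kind(code: str) -> str:
    """Classify a string as 'iso639-3', 'glottocode', or 'unknown' by shape."""
    if ISO_639_3.fullmatch(code):
        return "iso639-3"
    if GLOTTOCODE.fullmatch(code):
        return "glottocode"
    return "unknown"
```

A full check would look the code up in the relevant registry, since many well-formed strings are not assigned to any language.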
Grammatical marking of the source of information: whether the speaker witnessed the event, heard about it, or inferred it. Grammaticalized in languages such as Quechua, Bulgarian, and Turkish.
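Nominative-accusative: the subjects of transitive and intransitive verbs share one form, and the object of a transitive verb is marked differently (Latin, Russian). Ergative-absolutive: the subject of an intransitive verb and the object of a transitive verb share one form, and the subject of a transitive verb is marked differently (Basque, Chechen).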
Degree to which speakers of one language can understand speakers of another without prior study. Ranges from 0 (none) to 1 (complete), equivalent to 0% to 100%. Values here are from empirical listening tests reported in peer-reviewed linguistics research.