Typology: Data & Tools Anna Siewierska & Dik Bakker Typology: Data & Tools
Course overview: DAY 1: Data & Tools - Our domain; DAY 2: Sampling; DAY 3: Exercise - Questionnaires; DAY 4: Corpora and typology; DAY 5: Exercise - Other applications
Typology: domain. LINGUISTICS; DOMAIN: INDEFINITENESS, GENDER, CASE, ASPECT, PERSON, INCLUSIVITY, IMPERSONALS, WORD ORDER, … Names shown alongside, in the same order: Haspelmath, Corbett, Blake, Comrie, Siewierska, Cysouw, Stassen, Dryer, You!
Domain: knowledge. DOMAIN “KNOWLEDGE”: ARTICLES, MONOGRAPHS, GRAMMARS, EXPERTS, TEXTS, SPEAKERS
Domain: structure [diagram built up over several slides; labels in order of appearance]: sources (ARTICLES, MONOGRAPHS, GRAMMARS, EXPERTS, TEXTS, SPEAKERS); PROVISIONAL: Definition, Categories (C1, C2, …), Hypotheses; languages (L); DATA; Categories extended to C1 … C3; TEST; PROVISIONAL replaced by DEFINITIVE
Empirical Cycle [diagram]; Typology (narrow): HYPOTHESES, LINGUISTIC THEORY, DATA
Empirical Cycle [diagram]; Typology (broad): HYPOTHESES, TYPOLOGICAL DOMAIN X, DATA
Data central [diagram]: HYPOTHESES, TYPOLOGICAL DOMAIN X, DATA. DATA: Categorization, Collection, Organization, Analysis
Data + Computational Tools: DATA: Categorization, Collection, Organization, Analysis
Language data: Let’s not forget that language is the noise that people produce when chatting to each other …
Language data: Primary data (spoken; written); Analytical data
Language data + tools:
Primary data: CORPUS ((scanning), annotation, parsing, exploration)
Secondary data: QUESTIONNAIRE
Analytical data: DATA BASE (data entry, coding, storage, retrieval, sampling, analysis)
Analytical data: ANALYTICAL DATA? Domains: INDEFINITENESS, GENDER, CASE, ASPECT, PERSON, INCLUSIVITY, IMPERSONALS, WORD ORDER, …
Analytical data: INCLUSIVITY. DEF: a distinction in the pronominal paradigm between a first person plural form which includes the hearer and a form which excludes the hearer
Analytical data: PRIMARY DATA: English (Indo-European)
1 SG  I
2 SG  you
3 SG  he/she/it
1 PL  we
2 PL  you
3 PL  they
We will surely meet again, Sarah and I
We will surely meet again, you and I
Analytical data: PRIMARY DATA: Gapapaiwa (Oceanic)
1 SG    taku
2 SG    tam
3 SG    tuna
1 INCL  tota
1 EXCL  tokai
2 PL    tami
3 PL    ti
Analytical data: CATEGORY: Inclusivity. English: NOT EXPRESSED (NO); Gapapaiwa: EXPRESSED (YES)
Inclusivity:
NO:                   1+2 = 1+2+3 = 1+3
Only inclusive:       1+2 = 1+2+3
Minimal inclusive:    1+2 vs. 1+2+3 = 1+3
Augmented inclusive:  1+2+3 vs. 1+2 = 1+3
Inclusive/exclusive:  1+2 = 1+2+3 vs. 1+3
Minimal augmented:    1+2 vs. 1+2+3 vs. 1+3
Analytical data: Variable: Inclusivity. Values: Not expressed; Only inclusive; Minimal inclusive; Augmented inclusive; Inclusive/exclusive; Minimal augmented
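A minimal sketch of how these six values could be assigned from primary data, using the syncretism patterns listed under Inclusivity (1+2, 1+2+3, 1+3). The function name `classify_inclusivity` and the use of `None` for a combination without a dedicated form are assumptions made for illustration. So is the idea that Gapapaiwa tota covers both 1+2 and 1+2+3, since the paradigm only lists one inclusive form.

```python
from typing import Optional


def classify_inclusivity(min_incl: Optional[str],
                         aug_incl: Optional[str],
                         excl: Optional[str]) -> str:
    """Return the Inclusivity value for one language.

    The arguments are the pronoun forms for 1+2, 1+2+3 and 1+3.
    None marks a combination without a dedicated form (an assumption).
    """
    if min_incl is None or aug_incl is None:
        raise ValueError("pattern not covered by the six values")
    if excl is None:
        if min_incl == aug_incl:
            return "Only inclusive"        # 1+2 = 1+2+3
        raise ValueError("pattern not covered by the six values")
    if min_incl == aug_incl == excl:
        return "Not expressed"             # 1+2 = 1+2+3 = 1+3
    if min_incl == aug_incl:
        return "Inclusive/exclusive"       # 1+2 = 1+2+3 vs. 1+3
    if aug_incl == excl:
        return "Minimal inclusive"         # 1+2 vs. 1+2+3 = 1+3
    if min_incl == excl:
        return "Augmented inclusive"       # 1+2+3 vs. 1+2 = 1+3
    return "Minimal augmented"             # 1+2 vs. 1+2+3 vs. 1+3


english = classify_inclusivity("we", "we", "we")
gapapaiwa = classify_inclusivity("tota", "tota", "tokai")
```

With the paradigms given earlier, English comes out as "Not expressed" and Gapapaiwa as "Inclusive/exclusive".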
Variables & Values: Domain (e.g. pronominal marking); variables: Gender, Person, Number, Inclusivity
Variables & Values: Variable: Morphological_Alignment. Values: Accusative; Ergative; Active; Neutral; Hierarchical; Tripartite. NOMINAL. STATISTICS.
Variables & Values: scale of variable types:
NOMINAL: Val_1, Val_2, Val_3, … EX: Alignment; Gender; Person; …
ORDINAL: Val_1 < Val_2 < Val_3 < … EX: Number (SG < DU < TRI < PL)
INTERVAL: Val_1 < Val_2 < Val_3 < …, with a distance of 1 between adjacent values. EX: Case_distinctions (1 < 2 < 3 < 4 < … < 52)
Variables & Values: scale of variable types: NOMINAL: QUALITATIVE - DIRECTLY OBSERVED ORDINAL: QUANTITATIVE - DERIVED INTERVAL:
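The scale type of a variable limits which statistics make sense for it. The sketch below illustrates that with one summary per scale: mode for nominal, lower median for ordinal, mean for interval. This pairing follows general statistical practice and is not taken from the slides; the names `Scale` and `summarize` are hypothetical.

```python
from enum import Enum
from statistics import mean, mode


class Scale(Enum):
    NOMINAL = "nominal"
    ORDINAL = "ordinal"
    INTERVAL = "interval"


def summarize(scale, observations, order=None):
    """Return one summary value suited to the scale type (illustrative)."""
    if scale is Scale.NOMINAL:
        # Unordered labels: only frequencies are meaningful, so use the mode.
        return mode(observations)
    if scale is Scale.ORDINAL:
        # Ordered labels: rank by position in `order`, take the lower median.
        ranked = sorted(observations, key=order.index)
        return ranked[(len(ranked) - 1) // 2]
    # Interval values are numbers, so an arithmetic mean is defined.
    return mean(observations)


alignment = summarize(Scale.NOMINAL, ["ACC", "ERG", "ACC"])
number = summarize(Scale.ORDINAL, ["SG", "PL", "DU"],
                   order=["SG", "DU", "TRI", "PL"])
cases = summarize(Scale.INTERVAL, [2, 4, 6])
```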
Codebook: list of variables plus values (preliminary):
Var_1: Val_1, Val_2, … , Val_i
Var_2: Val_1, Val_2, … , Val_j
Var_3: Val_1, Val_2, … , Val_k
…
Var_v: Val_1, Val_2, … , Val_n
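A codebook of this shape can be kept as a simple mapping from variable names to permitted values and used to check data entry. The sketch below fills it with the two variables spelled out earlier (Inclusivity and Morphological_Alignment). The names `CODEBOOK` and `check_value` are hypothetical.

```python
# Preliminary codebook: variable name -> list of permitted values.
CODEBOOK = {
    "Inclusivity": [
        "Not expressed", "Only inclusive", "Minimal inclusive",
        "Augmented inclusive", "Inclusive/exclusive", "Minimal augmented",
    ],
    "Morphological_Alignment": [
        "Accusative", "Ergative", "Active",
        "Neutral", "Hierarchical", "Tripartite",
    ],
}


def check_value(variable, value):
    """Return True if `value` is listed for `variable` in the codebook."""
    if variable not in CODEBOOK:
        raise KeyError(f"unknown variable: {variable}")
    return value in CODEBOOK[variable]


ok = check_value("Inclusivity", "Only inclusive")
not_ok = check_value("Morphological_Alignment", "Dative")
```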
Data collecting: Languages of the world: n ≈ 7000; SAMPLE (50 - 500)
Data collecting / Data Matrix: languages as rows, variables as columns; cell V_3_3 holds the value of Var_3 for Language_3
            Var_1  Var_2  Var_3  …  Var_k
Language_1
Language_2
Language_3                V_3_3
Language_4
Language_5
…
Language_n
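The matrix layout can be sketched as a list of rows, one per language, with one cell per variable. The helper names below are hypothetical. Indices are 1-based only to mirror the V_3_3 notation, and `None` stands for a cell that has not been coded yet.

```python
def empty_matrix(n_languages, k_variables):
    """Create an n x k matrix in which every cell is still uncoded (None)."""
    return [[None] * k_variables for _ in range(n_languages)]


def set_cell(matrix, language, variable, value):
    """Store `value` for the given language row and variable column (1-based)."""
    matrix[language - 1][variable - 1] = value


def get_cell(matrix, language, variable):
    """Read the value for the given language row and variable column (1-based)."""
    return matrix[language - 1][variable - 1]


matrix = empty_matrix(5, 4)
set_cell(matrix, 3, 3, "V_3_3")
value = get_cell(matrix, 3, 3)
```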
Values to Variable: ‘NAME’: Inclusivity (MNEMONIC). Values: Not expressed; Only inclusive; Minimal inclusive; Augmented inclusive; Inclusive/exclusive; Minimal augmented
Values to Value Labels: Variable: Inclusivity
Value                 Label (numeric)  Label (mnemonic)
Not expressed         1                NO
Only inclusive        2                OnlyIncl
Minimal inclusive     3                MinIncl
Augmented inclusive   4                AugIncl
Inclusive/exclusive   5                InExcl
Minimal augmented     6                MinAug
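One way to hold both label sets is a single table from which lookups in either direction are derived. This is a sketch only: the mnemonics are written without spaces, and the names `LABELS`, `to_label` and `from_label` are hypothetical.

```python
# value -> (numeric label, mnemonic label), in the order used above
LABELS = {
    "Not expressed": (1, "NO"),
    "Only inclusive": (2, "OnlyIncl"),
    "Minimal inclusive": (3, "MinIncl"),
    "Augmented inclusive": (4, "AugIncl"),
    "Inclusive/exclusive": (5, "InExcl"),
    "Minimal augmented": (6, "MinAug"),
}


def to_label(value, mnemonic=True):
    """Return the mnemonic (default) or numeric label for a value."""
    numeric, short = LABELS[value]
    return short if mnemonic else numeric


def from_label(label):
    """Return the full value for a numeric or mnemonic label."""
    for value, (numeric, short) in LABELS.items():
        if label == numeric or label == short:
            return value
    raise KeyError(f"unknown label: {label}")


short = to_label("Inclusive/exclusive")
number = to_label("Inclusive/exclusive", mnemonic=False)
```

Mnemonic labels keep the matrix readable during data entry, while numeric labels are what some statistics packages expect on import.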
Data Matrix:
LANG      FAMILY    BWO   ALIGN  INCLUS
Abipon    Amerind   SVO   ACC    UnifWe
Abkhaz    Caucas    SOV   ERG    InExcl
Abun      Indo_Pac  SVO   NEUT   UnifWe
Acehnese  Austric   FREE  ACT    InExcl
Data Matrix: export / import [diagram: the data matrix (LANG, FAMILY, BWO, ALIGN, INCLUS) is passed between tools]: Data entry (Excel; FileMaker); Data retrieval (Access); Sampling; Data Analysis (SPSS; MatLab); Report (Word)
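The export and import steps between tools usually go through a plain-text exchange format. The sketch below assumes CSV (the slides do not name a format) and uses Python's standard `csv` module in place of the listed programs. It writes a subset of the matrix columns, reads it back, retrieves rows by condition, and counts values as a first analysis step. All function names are hypothetical.

```python
import csv
import io
from collections import Counter

FIELDS = ["LANG", "BWO", "ALIGN", "INCLUS"]
ROWS = [
    {"LANG": "Abipon", "BWO": "SVO", "ALIGN": "ACC", "INCLUS": "UnifWe"},
    {"LANG": "Abkhaz", "BWO": "SOV", "ALIGN": "ERG", "INCLUS": "InExcl"},
    {"LANG": "Abun", "BWO": "SVO", "ALIGN": "NEUT", "INCLUS": "UnifWe"},
    {"LANG": "Acehnese", "BWO": "FREE", "ALIGN": "ACT", "INCLUS": "InExcl"},
]


def export_csv(rows):
    """Data entry side: serialize the matrix as CSV text with a header row."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def import_csv(text):
    """Retrieval/analysis side: parse CSV text back into a list of dicts."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text, newline=""))]


def retrieve(rows, **conditions):
    """Return the rows whose columns match all given values."""
    return [r for r in rows if all(r[k] == v for k, v in conditions.items())]


rows = import_csv(export_csv(ROWS))
incl_excl = [r["LANG"] for r in retrieve(rows, INCLUS="InExcl")]
order_counts = Counter(r["BWO"] for r in rows)
```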
Data Matrix: complications. 1 language 1 value: Variable: Adjective - Noun order. English: (1) a. A small contribution b. *A contribution small. AN_Order = AN
BUT: Spanish: (2) a. Una contribución pequeña b. Una pequeña contribución. AN_Order = AN+NA
Data Matrix: complications. 1 language 1 value: Variable: Main Clause Order. Polish: MainClOrder = SOV+SVO+VSO+VOS+OVS+OSV. Proliferation of values: unanalyzable
Data Matrix: scales
AN_basic  AN_alt
AN        NA
AN        IRREL
NA        AN
NA        IRREL
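The two-column coding above avoids composite values such as AN+NA by splitting one observation into a basic and an alternative order. A minimal sketch follows; the function name `encode_an_order` is hypothetical. Treating NA as the basic order for Spanish is an assumption made for the example, since the slides give only AN+NA.

```python
ORDERS = {"AN", "NA"}


def encode_an_order(basic, alternative=None):
    """Split an adjective-noun order observation into two simple variables.

    `basic` is the dominant order; `alternative` is a second attested order,
    or None when there is none, in which case AN_alt is coded IRREL.
    """
    if basic not in ORDERS:
        raise ValueError(f"unknown order: {basic}")
    if alternative is not None:
        if alternative not in ORDERS or alternative == basic:
            raise ValueError(f"invalid alternative order: {alternative}")
    return {
        "AN_basic": basic,
        "AN_alt": alternative if alternative is not None else "IRREL",
    }


english = encode_an_order("AN")        # only AN is grammatical
spanish = encode_an_order("NA", "AN")  # assumption: NA basic, AN alternative
```

Each column now has a small closed set of values, so the variable stays analyzable even when a language allows more than one order.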
Data Matrix: conditions. Columns: AN_1, AN_cond1, AN_2, AN_cond2 [table; cell values in extraction order: AN, abstract, NA, concrete, AN, IRREL, NA, human, AN, NA, IRREL, nonhuman, IRREL]
Data Matrix: missing values
LG      Subj_Agr  Numb_Agr  Gender_Agr
Abipon  YES       PL        NO
Adzera  NO        IRR
Aghem   YES       DUPL      UNKN
…
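When counting values, the two kinds of missing data are better kept apart from coded values and from each other. The sketch below reads IRR as "irrelevant (the variable does not apply)" and UNKN as "unknown"; both readings are assumptions, as is the function name `tally`.

```python
from collections import Counter

MISSING = {"IRR", "UNKN"}  # assumed: irrelevant vs. unknown


def tally(values):
    """Count coded values and missing-value codes separately."""
    coded = Counter(v for v in values if v not in MISSING)
    missing = Counter(v for v in values if v in MISSING)
    return coded, missing


# Numb_Agr column for Abipon, Adzera, Aghem as given above
coded, missing = tally(["PL", "IRR", "DUPL"])
coverage = sum(coded.values()) / 3
```

Percentages can then be computed over the coded cases only, and the number of IRR and UNKN cells reported alongside them.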
Existing data resources: appendices of books / articles (Greenberg 1963; Hawkins 1985; Nichols 1992); projects (EuroTyp; WALS), with an underlying database; online data resources (LTRC; Ethnologue (SIL))
Data management tools: Windows: Excel; Access. SIL: Shoebox; Toolbox. FileMaker Pro