“ALEXANDRU IOAN CUZA” UNIVERSITY OF IAŞI
FACULTY OF COMPUTER SCIENCE
The Semantics and Pragmatics of Natural Language
Daniela GÎFU
http://profs.info.uaic.ro/~daniela.gifu/
Course 1
The General Presentation
Main Concepts
1. Natural Language
- used by human beings for communication...
- sign, system, symbols, rule set (or grammar)
2. Semantics
- word meaning, causes of word change...
3. Pragmatics
- how language is used by a speaker in a given context, with the intention of acting in a particular way and producing certain effects on the interlocutor...
Natural Language Processing – a subdomain of Artificial Intelligence and Linguistics
1. Thematic Areas
- Linguistics
  - mathematical linguistics
  - computational linguistics
- Formal Language
- Linguistic and Language Processing
- The grammatical structure of utterances: the sentence, constituents, phrases, classifications and structural rules, syntactic processing...
- Parser
- Semantics & Pragmatics
Mathematical linguistics
- the study of mathematical structures and methods that are of importance to linguistics
→ Phonetics, → Phonology, → Morphology, → Syntax, → Semantics, and… → Sociolinguistics, → Language Acquisition.
Computational linguistics
- the scientific and engineering discipline concerned with understanding written and spoken language from a computational perspective. Example tasks:
- detecting synonymy (Grigonytė et al., 2010);
- developing WordNet (Gala and Mititelu, 2013), (Iftene and Balahur, 2007)...;
- word sense disambiguation, WSD (Yang, H. et al., 2010), (Lefever and Hoste, 2010), (Tufiș, 2002)...;
- semantic annotation (Garcia et al., 2012)...;
- reconstructing a diachronic morphology (Cristea et al., 2007/2012);
- diachronic text classification (Mihalcea and Năstase, 2012; Popescu and Strapparava, 2015), etc.
Formal language
1. Symbol
- a character, an abstract entity that has no meaning by itself
Ex: letters, digits and special characters
2. Alphabet
- a finite set of symbols
- often denoted by Σ
Ex: B = {0, 1} says B is an alphabet of two symbols, 0 and 1
C = {a, b, c} says C is an alphabet of three symbols, a, b and c
Formal language
3. String (or word)
- a finite sequence of symbols from an alphabet
Ex: 01110 and 111 are strings over the alphabet B above
aaabccc and b are strings over the alphabet C above
4. Language
- a set of strings over an alphabet
5. Formal language (or simply language)
- a set L of strings over some finite alphabet Σ
- described using formal grammars
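The definitions above can be made concrete with a minimal Python sketch. The alphabets B and C come from the slides; the helper names and the sample language "strings over B with an even number of 1s" are assumptions chosen only for illustration.

```python
from itertools import product

# Alphabets from the slides: finite sets of symbols.
B = {"0", "1"}
C = {"a", "b", "c"}

def is_string_over(s, alphabet):
    """True if every symbol of s belongs to the alphabet (the empty string qualifies)."""
    return all(symbol in alphabet for symbol in s)

def strings_up_to(alphabet, max_len):
    """Enumerate all strings over the alphabet with length 0..max_len."""
    for length in range(max_len + 1):
        for symbols in product(sorted(alphabet), repeat=length):
            yield "".join(symbols)

def in_even_ones(s):
    """Membership test for a hypothetical language over B:
    the set of strings containing an even number of 1s."""
    return is_string_over(s, B) and s.count("1") % 2 == 0

print(is_string_over("01110", B))    # a string over B
print(is_string_over("aaabccc", C))  # a string over C
print(list(strings_up_to(B, 2)))     # all strings over B of length <= 2
print(in_even_ones("0110"))          # membership in the sample language
```

A language here is represented by its membership test rather than by an explicit set, since most languages of interest contain infinitely many strings.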
Linguistic and Language Processing
1. Linguistics
- Science of language. Includes:
  1. Sounds (phonology)
  2. Word formation (morphology)
  3. Sentence structure (syntax)
  4. Meaning (semantics) and understanding (pragmatics)…
2. Levels of linguistic analysis
- Higher level → Speech Recognition (SR)
- Lower levels → Natural Language Processing (NLP)
Levels of Linguistic Analysis
Acoustic signal
SR
- Phonetics – production and perception of speech → Phones
- Phonology – sound patterns of language → Letters, strings
NLP
- Lexicon – dictionary of words in a language → Morphemes
- Morphology – word formation and structure → Words
- Syntax – sentence structure → Phrases & sentences
- Semantics – intended meaning → Meaning out of context
- Pragmatics – understanding from external info → Meaning in context
Steps of NLP
1. Morphological and Lexical Analysis
- Lexicon
- Morphology – identification, analysis and description of the structure of words
- Words – the smallest units of syntax
- Syntax – the rules / principles that govern the sentence structure of any language
- Lexical analysis – dividing text into paragraphs, sentences and words
2. Syntactic analysis
- Analysis of the words in a sentence to determine the grammatical structure of the sentence
Ex: “Boy the go the store” – correct?
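The first two steps can be sketched in a few lines of Python. This is a deliberately naive illustration, not a real analyzer: the splitting rules are regular-expression heuristics, and the tiny lexicon and the single sentence pattern (DET N V [P] DET N) are hypothetical, chosen only so that the slide's example can be tested.

```python
import re

def split_paragraphs(text):
    """Lexical analysis, step 1: split raw text into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]

def split_sentences(paragraph):
    """Step 2: naive split after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", paragraph) if s.strip()]

def split_words(sentence):
    """Step 3: return word tokens, dropping punctuation."""
    return re.findall(r"\w+(?:'\w+)?", sentence)

# Hypothetical toy lexicon: word -> part-of-speech tag.
TOY_LEXICON = {
    "the": "DET", "boy": "N", "store": "N",
    "go": "V", "goes": "V", "to": "P",
}

# Hypothetical toy grammar: one allowed tag sequence, DET N V [P] DET N.
TOY_PATTERN = re.compile(r"DET N V (P )?DET N")

def is_well_formed(sentence):
    """Syntactic analysis (toy): tag each word, then match the tag sequence."""
    tags = [TOY_LEXICON.get(w.lower(), "UNK") for w in split_words(sentence)]
    return TOY_PATTERN.fullmatch(" ".join(tags)) is not None

text = "The boy goes to the store. He buys bread!\n\nIt rains."
for paragraph in split_paragraphs(text):
    for sentence in split_sentences(paragraph):
        print(split_words(sentence))

print(is_well_formed("The boy goes to the store."))
print(is_well_formed("Boy the go the store"))
```

Under this toy grammar the slide's example is rejected because its tag sequence starts with a noun instead of a determiner. A real parser would use a full grammar and lexicon instead of one fixed pattern.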
Steps of NLP
3. Semantic Analysis
- Derives an absolute (dictionary definition) meaning from the context
- The structures created by the syntactic analyzer are assigned meaning. A mapping is made between the syntactic structure and objects in the task domain.
Ex: “Colourless green ideas…” – correct?
4. Discourse Integration
- The meaning of an individual sentence may depend on the sentences that precede it and may influence the meaning of the sentences that follow it.
Ex: the word “it” in the sentence “you wanted it” depends on the prior discourse context.
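The dependence of “it” on prior discourse can be illustrated with a toy Python sketch. It assumes a crude "most recently mentioned noun" heuristic and a hypothetical noun list; real coreference resolution is far more involved, and nothing here is the method used in the course.

```python
import re

# Hypothetical list of nouns the toy resolver can recognise.
KNOWN_NOUNS = {"ticket", "book", "car"}

def resolve_it(previous_sentences, sentence):
    """Replace lowercase 'it' with the most recently mentioned known noun.

    If the prior discourse mentions no known noun, the sentence is
    returned unchanged, because the referent cannot be determined.
    """
    antecedent = None
    for earlier in previous_sentences:
        for word in re.findall(r"[a-z]+", earlier.lower()):
            if word in KNOWN_NOUNS:
                antecedent = word  # later mentions override earlier ones
    if antecedent is None:
        return sentence
    return re.sub(r"\bit\b", "the " + antecedent, sentence)

print(resolve_it(["I bought a ticket."], "You wanted it."))
print(resolve_it([], "You wanted it."))
```

The second call shows the point of the slide: without prior discourse, the sentence alone does not say what “it” refers to.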
Steps of NLP
5. Pragmatic analysis
- Derives knowledge from external commonsense information
- Means understanding the purposeful use of language in situations, particularly those aspects of language which require world knowledge
- What was said is reinterpreted to determine what was actually meant.
Ex: “Do you know what time it is?” – should be interpreted as a request.
Semantics and pragmatics (S & P)
1. S & P
- two stages of analysis concerned with getting at the meaning of a sentence;
- 1st – S – a partial representation of the meaning, based on the possible syntactic structure(s) of the sentence and the meanings of the words in that sentence;
- 2nd – P – the meaning based on contextual and world knowledge.
Semantics and pragmatics (S & P)
1. Example of the difference: “He asked for the boss”.
We can work out that:
  1. Someone (who is male) asked for someone who is a boss.
  2. We can’t say who these people are or why the first person wanted the second.
  3. If we know something about the context (including the last few sentences spoken/written), we may be able to work these things out.
  4. Maybe the previous sentence was: “Fred had just been sacked”.
  5. From our general knowledge, bosses generally sack people, and if people want to speak to those who sacked them, it is generally to complain about it.
  6. We could then really start to get at the meaning of the sentence: “Fred wants to complain to his boss about getting sacked”.
Homework:
1. Each student has to present a paper on text clustering that will guide the final project, taken from https://aclweb.org/anthology/ and published between 2010 and 2016.
Venues:
LREC (Language Resources and Evaluation Conference)
ACL (Association for Computational Linguistics)
EACL (European Chapter of the Association for Computational Linguistics)
Coling (International Conference on Computational Linguistics)
Other references…
• Hamid Palangi, Li Deng, Yelong Shen, Jianfeng Gao, Xiaodong He, Jianshu Chen, Xinying Song, Rabab Ward. (2015). Deep Sentence Embedding Using Long Short-Term Memory Networks: Analysis and Application to Information Retrieval. https://arxiv.org/pdf/1502.06922.pdf
• Kate Cohen, Fredrik Johansson, Lisa Kaati, and Jonas Mork. (2014). Detecting Linguistic Markers for Radical Violence in Social Media. Terrorism and Political Violence 26, no. 1: 246-256.
• Joel Brynielsson, Andreas Horndahl, Fredrik Johansson, Lisa Kaati, Christian Martenson, and Pontus Svenson. (2013). Harvesting and Analysis of Weak Signals for Detecting Lone-Wolf Terrorists. Security Informatics 2, no. 11, accessed May 15, 2016, http://www.securityinformatics.com/content/2/1/11
• Alexander V. Mamishev and Murray Sargent. (2013). Creating Research and Scientific Documents Using Microsoft Word. Microsoft Press, Redmond, WA.
• Sean M. Gerrish and David M. Blei. (2010). A language-based approach to measuring scholarly impact. In Proceedings of the International Conference on Machine Learning.
• Alexander V. Mamishev and Sean D. Williams. (2010). Technical Writing for Teams: The STREAM Tools Handbook. Wiley-IEEE Press, Hoboken, NJ.
• Jonas Mueller, Aditya Thyagarajan. (2016). Siamese Recurrent Architectures for Learning Sentence Similarity. In Proceedings of AAAI-16.
• Tim Rocktäschel, Edward Grefenstette, Karl Moritz Hermann, Tomáš Kočiský, Phil Blunsom. (2015). Reasoning about Entailment with Neural Attention. In Proceedings of ICLR, http://arxiv.org/abs/1509.06664
• Xiaofeng Wang, Matthew S. Gerber, and Donald E. Brown. (2012). Automatic Crime Prediction using Events Extracted from Twitter Posts. SBP, LNCS 7227: 231-238.
• Yaser Abu-Mostafa, Malik Magdon-Ismail, Hsuan-Tien Lin. (2012). Learning From Data. amlbook.com.
• Jiaming Xu, Peng Wang, Guanhua Tian, Bo Xu, Jun Zhao, Fangyuan Wang, Hongwei Hao. (2015). Short Text Clustering via Convolutional Neural Networks. In Proceedings of NAACL-HLT 2015, 62–69.
• Trevor Hastie, Robert Tibshirani, Jerome Friedman. (2008). The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd ed., Springer.
Final project: Implementing a tool for text clustering, including a diachronic perspective (demo & Web resource) - SEMANTRIA model
1. Mixed teams (linguists + computer scientists)
- Building the corpus:
  http://www.bbc.com/news – English
  http://www.e-ziare.ro/ – Romanian
- NER
- Topic Detection – LDA
- Domain & subdomain detection
- Sentiment Analysis
- Automatic News Source Detection
- Interface Construction
Thank you!