Automated Generalization of Phrasal Paraphrases from the Web
Weigang Li, Ting Liu
Information Retrieval Lab, Harbin Institute of Technology, China
2005-10-14

Outline
- Motivation and Goal
- Our Generalization method
- Evaluation method
- Experiments
- Conclusions

IWP 2005, Weigang Li et al., Automated Generalization of Phrasal Paraphrases from the Web

Motivation and Goal (1/3)
- Why paraphrase templates?
  - Strong representation capacity
  - Can generate many paraphrase examples
- Two challenges
  - Representation of templates
  - Acquisition of templates

Motivation and Goal (2/3)
- A paraphrase example
  - 苹果的价格是多少? (What's the price of the apples?)
  - 苹果多少钱一千克? (How much are the apples per kilogram?)
- A paraphrase template using POS information
  - [noun] 的价格是多少? (What's the price of the [noun]?)
  - [noun] 多少钱一千克? (How much is the [noun] per kilogram?)

Motivation and Goal (3/3)
- Using the template to generate an example
  - 笔记本的价格是多少? (What's the price of the notebook?)
  - 笔记本多少钱一千克? (How much is the notebook per kilogram?)
- Disadvantages of existing representation methods
  - POS information over-generalizes: any noun fits the slot, even one that makes the paraphrase implausible, as in the second sentence above
  - NE (named entity) representation is limited

Generalization method
- A new representation method
  - Semantic codes
  - Combined with a semantic dictionary
- A new acquisition method
  - Using an existing search engine

A template and an example
- A paraphrase example
  - 在我看来 <=> 我觉得 (In my view/mind <=> I feel)
- A paraphrase template using our method
  - 在 [Aa02A01=] 看来 (In [Aa02A01=] view/mind)
  - [Aa02A01=] 觉得 ([Aa02A01=] feel)

Chinese Semantic Dictionary (TongYiCiLin)
- A semantic class example: Ab03A01=
  青年人 青年 小伙子 青少年 后生 弟子 子弟 初生之犊 年青人 小伙 小青年 年轻人
- Position of the class in the hierarchy
  - The first layer: …
  - The second layer: …
  - The third layer: Ab03
  - The fourth layer: Ab03A
  - The fifth layer: Ab03A01=

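Since a semantic class is simply a list of words sharing one code, a template slot labeled with that code can be instantiated by substituting each member word. The following is a minimal sketch under stated assumptions, not the authors' implementation: the `SEMANTIC_CLASSES` dictionary, the `instantiate` helper, and the bracketed slot syntax are hypothetical, and the class holds only three of the member words listed on the slide.

```python
from typing import Dict, List, Tuple

# Toy excerpt of one semantic class (three of the words shown on the slide).
SEMANTIC_CLASSES: Dict[str, List[str]] = {
    "Ab03A01=": ["青年人", "青年", "小伙子"],
}


def instantiate(
    template_pair: Tuple[str, str],
    code: str,
    classes: Dict[str, List[str]],
) -> List[Tuple[str, str]]:
    """Fill the slot "[code]" in both sides of a template with each class member."""
    left, right = template_pair
    slot = f"[{code}]"
    return [
        (left.replace(slot, word), right.replace(slot, word))
        for word in classes.get(code, [])
    ]


# Hypothetical usage with a template of the form shown in this deck.
pairs = instantiate(("在[Ab03A01=]看来", "[Ab03A01=]觉得"), "Ab03A01=", SEMANTIC_CLASSES)
for left, right in pairs:
    print(left, "<=>", right)
```

Each member word yields one paraphrase example, which is why a single template can generate many examples.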
Encoding table of Cilin (EV)

Encoding bit   1     2        3 4     5        6 7           8
Example        A     b        03      A        01            =
Attribute      Big   Middle   Small   groups   Atom groups
Layer          1     2        3       4        5

A semantic class example: Ab03A01=
青年人 青年 小伙子 青少年 后生 弟子 子弟 初生之犊 年青人 小伙 小青年 年轻人

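The encoding table can be read as a fixed-width layout: one character each for layers 1 and 2, two digits for layer 3, one letter for layer 4, two digits for layer 5, and a final marker character. A minimal sketch of splitting a code along those boundaries follows; the names `CilinCode`, `parse_cilin_code`, and `layer_prefix` are hypothetical helpers introduced for illustration, and the field names are assumptions.

```python
from typing import NamedTuple


class CilinCode(NamedTuple):
    layer1: str  # encoding bit 1
    layer2: str  # encoding bit 2
    layer3: str  # encoding bits 3-4
    layer4: str  # encoding bit 5
    layer5: str  # encoding bits 6-7
    marker: str  # encoding bit 8, e.g. "=" or "#"


def parse_cilin_code(code: str) -> CilinCode:
    """Split an 8-character code such as "Ab03A01=" into its fields."""
    if len(code) != 8:
        raise ValueError(f"expected 8 characters, got {len(code)}: {code!r}")
    return CilinCode(code[0], code[1], code[2:4], code[4], code[5:7], code[7])


def layer_prefix(code: str, layer: int) -> str:
    """Return the part of the code that identifies the given layer (1 to 5)."""
    end = {1: 1, 2: 2, 3: 4, 4: 5, 5: 8}[layer]
    return code[:end]


parsed = parse_cilin_code("Ab03A01=")
print(parsed)
print([layer_prefix("Ab03A01=", k) for k in range(1, 6)])
```

Truncating a code to a shallower layer gives a coarser class, so the same helper could be used to compare generalization at different depths.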
Overview of Generalization Method
1. A seed phrasal paraphrase example
   - 在我看来 <=> 我觉得 (In my view/mind <=> I feel)
   - The slot word is "我" (I)
2. Getting the examples using the SE
   - Query: 在…看来
     - 在外资基金经理看来内地市场“风光无限” (In the view of the fund manager…)
     - 在这个小伙子看来, 只要资本家赚钱高兴... (In the view of the younger…)
   - Query: …觉得
     - 那个男孩觉得你很可爱 (That boy feels...)
     - 现在的年轻人觉得美丽! (The younger feels...)
3. Extending the slot word
   - 在…看来: [经理 (manager), 小伙子 (younger) …]
   - …觉得: [男孩 (boy), 年轻人 (younger) …]
4. Mapping the two word sets to their semantic code sets
   - 在…看来: [经理 (manager) -- Af10B07#, 小伙子 (younger) -- Ab03A01= …]
   - …觉得: [男孩 (boy) -- Dd15D04#, 年轻人 (younger) -- Ab03A01= …]
5. Intersection operation on the two semantic code sets
   - Ab03A01= …
6. Generalizing to the final template
   - 在 [Ab03A01=, ...] 看来
   - [Ab03A01=, …] 觉得

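The mapping and intersection steps of this overview can be sketched with plain sets. The sketch below assumes the slot words have already been collected from search-engine results; `TOY_LEXICON` holds only the four word-to-code pairs shown on the slide, and `map_to_codes` and `generalize` are hypothetical names, so this illustrates the idea rather than the authors' system.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Word -> semantic codes, restricted to the pairs shown on the slide.
TOY_LEXICON: Dict[str, Set[str]] = {
    "经理": {"Af10B07#"},
    "小伙子": {"Ab03A01="},
    "男孩": {"Dd15D04#"},
    "年轻人": {"Ab03A01="},
}


def map_to_codes(words: Iterable[str], lexicon: Dict[str, Set[str]]) -> Set[str]:
    """Union of the semantic codes of all words found in the lexicon."""
    codes: Set[str] = set()
    for word in words:
        codes |= lexicon.get(word, set())
    return codes


def generalize(
    left_words: Iterable[str],
    right_words: Iterable[str],
    lexicon: Dict[str, Set[str]],
) -> Set[str]:
    """Keep only the codes that occur in the slot of both phrases."""
    return map_to_codes(left_words, lexicon) & map_to_codes(right_words, lexicon)


# Slot words extracted for "在…看来" and "…觉得" respectively.
shared = generalize(["经理", "小伙子"], ["男孩", "年轻人"], TOY_LEXICON)
templates: List[Tuple[str, str]] = [
    (f"在[{code}]看来", f"[{code}]觉得") for code in sorted(shared)
]
print(templates)
```

With the slide's data, only the code shared by 小伙子 and 年轻人 survives the intersection, which matches the final template shown in the overview.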
Advantages of our generalization method
- Resolves the problem mentioned above
  - 笔记本的价格是多少? (What's the price of the notebook?)
  - 笔记本多少钱一千克? (How much is the notebook per kilogram?)
- "notebook" will not belong to any semantic code in the template, so the implausible example is not generated

Evaluation method
- Three metrics
  - Reasonability
    - Strict_Reasonability = S / N
    - Loose_Reasonability = (L + S) / N
  - Precision
    - Precision = R / (4 * n)
  - Coverage
    - Surface_Coverage = M / NS
    - Semantic_Coverage = Map(A) / (Map(NS-M) + Map(A))

Reasonability
- Strict_Reasonability = S / N
- Loose_Reasonability = (L + S) / N
- Procedure
  1. One template generates N paraphrase examples.
  2. Every phrase in the paraphrase examples is used as a query for the SE.
  3. S is the number of examples in which both phrases get complete matching results.
  4. L is the number of examples in which only one phrase gets complete matching results.

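The two reasonability scores can be computed directly once each generated example has been checked against the search engine. This is a minimal sketch assuming the check has already been done and is summarized as the number of phrases (0, 1, or 2) per example that received a complete match; the function name `reasonability` and that input format are assumptions.

```python
from typing import List, Tuple


def reasonability(matched_phrases: List[int]) -> Tuple[float, float]:
    """Return (Strict_Reasonability, Loose_Reasonability).

    matched_phrases[i] is the number of phrases (0, 1 or 2) in example i
    for which the search engine returned a complete match.
    """
    n_examples = len(matched_phrases)  # N
    if n_examples == 0:
        raise ValueError("at least one paraphrase example is required")
    s = sum(1 for c in matched_phrases if c == 2)  # both phrases matched
    l = sum(1 for c in matched_phrases if c == 1)  # only one phrase matched
    return s / n_examples, (l + s) / n_examples


strict, loose = reasonability([2, 2, 1, 0])
print(strict, loose)
```

By construction the loose score is never lower than the strict score, since it counts every example the strict score counts plus the partially matched ones.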
Precision
- Precision = R / (4 * n)
- Procedure
  1. One template generates N paraphrase examples.
  2. Sample n examples at random from the N examples.
  3. Use the 2n phrases in the sampled examples as queries for the SE.
  4. For every phrase, manually extract the first 2 matching sentences.
  5. Replace the corresponding phrase in each sentence to generate the 4 * n candidate sentences.
  6. R is the number of candidate sentences that retain the meaning.

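The denominator 4 * n follows from the procedure: n sampled examples, two phrases per example, and two retrieved sentences per phrase. A minimal sketch of the final computation is shown below, assuming the manual judgements are already available as booleans; the function name `precision` and its input format are assumptions, and the judgement values are made up purely for illustration.

```python
from typing import List


def precision(retains_meaning: List[bool], n: int) -> float:
    """Precision = R / (4 * n) for n sampled paraphrase examples.

    retains_meaning holds one manual judgement per candidate sentence
    (True if the sentence keeps its meaning after phrase replacement).
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if len(retains_meaning) > 4 * n:
        raise ValueError("more judgements than the 4 * n candidate sentences")
    r = sum(1 for ok in retains_meaning if ok)  # R
    return r / (4 * n)


# Illustrative judgements only: n = 2 examples -> 8 candidate sentences.
judgements = [True, True, False, True, True, True, False, True]
score = precision(judgements, n=2)
print(score)
```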
Coverage
- Surface_Coverage = M / NS
- Semantic_Coverage = Map(A) / (Map(NS-M) + Map(A))
- [Diagram: the sets NS, M, and A, with Map(A) and Map(NS-M) + Map(A)]

Experiments
- Templates and the number of their corresponding examples

Paraphrase template no.          1       2       3       4
Number of instantiated examples
  Cilin 3                     2696   13032    1057    3004
  Cilin 4                     1815    6354     587    2229
  Cilin 5                      478    3011     177     429

Results
[Table: Reasonability (%) (Strict_R, Loose_R), Precision (%), and Coverage (%) (Surface_C, Semantic_C); the cell values are not recoverable]

Conclusions
- A new representation method
  - Multiple semantic codes
- A new acquisition method
  - Web
- A new evaluation method
  - Three measures combined with the Web

Future work
- Generalizing templates with two slot words
- Combining with the NE representation method
- More reasonable evaluation methods

Thanks for your attention!