Tree-structured Conditional Random Fields for Semantic Annotation
Jie Tang, Mingcai Hong, Juanzi Li, and Bangyong Liang
Knowledge Engineering Group (KEG), Department of Computer Science and Technology, Tsinghua University
Nov. 5, 2006


Outline
• Motivation and Problem Description
• Related Work
• Our Approach
• Experimental Results
• Future Work & Summary


Introduction
• The Semantic Web requires annotating existing web content according to particular ontologies
• Applications of semantic annotation
  – Personal profile annotation
  – Product information annotation
  – Image annotation
  – Company annual report annotation
  – …


Example of Semantic Annotation
Task: 1) Identifying target entities & relations  2) Populating the ontology base

October 14, 2002, 4:00 a.m. PT. For years, Microsoft Corporation CEO Bill Gates railed against the economic philosophy of open-source software with Orwellian fervor, denouncing its communal licensing as a "cancer" that stifled technological innovation. Today, Microsoft claims to "love" the open-source concept, by which software code is made public to encourage improvement and development by outside programmers. Gates himself says Microsoft will gladly disclose its crown jewels--the coveted code behind the Windows operating system--to select customers. "We can be open source. We love the concept of shared source," said Bill Veghte, a Microsoft VP. "That's a super-important shift for us in terms of code access." Richard Stallman, founder of the Free Software Foundation, countered saying…

Extracted metadata (schema: Person with Name, Title, workIn; Organization with Name):
  – Person#1: Bill Gates, Title: CEO, workIn: Organization#1 (Microsoft)
  – Person#2: Richard Stallman, Title: Founder, workIn: Organization#2 (Free Software Foundation)
  – Person#3: Bill Veghte, Title: VP, workIn: Organization#1 (Microsoft)


Hierarchical Semantic Annotation
• Hierarchical dependency (example from a company annual report):

3. Company Directorate Info
Company directorate secretary: Haokui Zhou
Representative of directorate: He Zhang
Address: No. 583-14, Road Linling, Shanghai, China  Zipcode: 200030
Email: ajcoob@mail2.online.sh.cn  Phone: 021-64396600  Fax: 021-64392118

4. Company Registration Info
Company registration address: No. 838, Road Zhang Yang, Shanghai, China  Zipcode: 200122
Company office address: No. 583-14, Road Linling, Shanghai, China  Zipcode: 200030
Email: ajcorp@online.sh.cn  Phone: 021-64396654

Metadata schema:
  Company Basic Info --has_directorate_info--> Company Directorate Info (secretary, representative, address, zipcode, email, phone, fax)
  Company Basic Info --has_registration_info--> Company Registration Info (reg_address, reg_zipcode, office_address, office_zipcode, email, phone)

Questions:
- How to make use of the dependencies in annotation?
- How to formalize a unified model?


Outline
• Motivation and Problem Description
• Related Work
• Our Approach
• Experimental Results
• Future Work & Summary


Related Work—Semantic Annotation
• Annotation using Rule Learning
  – Learning annotation rules
  – E.g., Ciravegna (2001), Handschuh et al. (2002), and Popov et al. (2003)
• Annotation using Classification
  – Formalizing the annotation problem as that of classification
  – E.g., Hammond, Sheth, and Kochut (2002)
• Annotation using Sequential Labeling
  – Sequential labeling can describe dependencies between targeted entities
  – E.g., Reeve (2004)


Related Work—Information Extraction
• Classification Models
  – E.g., Cortes and Vapnik (1995), Collins (2002), and Finn (2004)
• Dependent Models
  – E.g., Ghahramani and Jordan (1997), McCallum et al. (2000), and Lafferty et al. (2001)
• Non-linear Dependent Models
  – E.g., Sutton et al. (2004), Zhu et al. (2005), and Bunescu and Mooney (2004)


Outline
• Motivation and Problem Description
• Related Work
• Our Approach
• Experimental Results
• Future Work & Summary


Our Approach
• Hierarchical Semantic Annotation
  – Information is organized as a tree structure
  – E.g., HTML, XML
• Tree-structured Conditional Random Fields (TCRFs)
  – Modeling hierarchical dependencies in a tree-structured graph (which may contain cycles once sibling dependencies are added)
  – Performing parameter estimation by maximizing the log-likelihood objective function
  – Using the Tree-based Reparameterization (TRP) algorithm for inference during parameter estimation


Linear Conditional Random Fields
3. Company Directorate Info
Company directorate secretary: Haokui Zhou
Representative of directorate: He Zhang
Address: No. 583-14, Road Linling, Shanghai, China  Zipcode: 200030
Email: ajcoob@mail2.online.sh.cn  Phone: 021-64396600  Fax: 021-64392118

4. Company Registration Info
Company registration address: No. 838, Road Zhang Yang, Shanghai, China  Zipcode: 200122
Company office address: No. 583-14, Road Linling, Shanghai, China  Zipcode: 200030
Email: ajcorp@online.sh.cn  Phone: 021-64396654


Linear Conditional Random Fields
[Figure: a linear-chain CRF assigns one label per token; "Zipcode: 200030 Email: ajcoob@..." is labeled O, Z1, O, E1 and "Zipcode: 200030 Email: ajcorp@..." is labeled O, Z2, O, E2, with dependencies only between neighboring labels. The standard linear-chain formula is restated below.]
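The linear-chain formula itself is not preserved in this dump; for reference, the standard linear-chain CRF the slide refers to is usually written as:

```latex
% Standard linear-chain CRF (textbook form, not copied from the slide)
p(\mathbf{y} \mid \mathbf{x}) = \frac{1}{Z(\mathbf{x})}
  \exp\!\Big( \sum_i \sum_j \lambda_j\, t_j(y_{i-1}, y_i, \mathbf{x}, i)
            + \sum_i \sum_k \mu_k\, s_k(y_i, \mathbf{x}, i) \Big)
```

Here t_j are edge (transition) features, s_k are vertex (state) features, and Z(x) normalizes over all label sequences. Dependencies exist only between neighboring labels, which is exactly the limitation the tree-structured model below addresses.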


Tree-structured CRFs (TCRFs)
• In a TCRF, the dependencies are organized as a tree structure


Modeling with TCRFs
[Figure: the document is modeled as a tree. A root node A has child nodes D (3. Company Directorate Info) and R (4. Company Registration Info); the leaves under each section are the token-level labels, e.g. O, Z1, O, E1 over "Zipcode: 200030 Email: ajcoob@..." and O, Z2, O, E2 over "Zipcode: 200030 Email: ajcorp@...".]


TCRF Model: [formula not preserved in this dump; a sketch follows below]
Annotation: [formula not preserved]
How to estimate the parameters?
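The slide's formulas did not survive the dump. As a sketch consistent with the rest of the deck (vertices V, parent-child and sibling edges E, parameters Θ = {λ, μ}), written in the standard CRF form rather than the slide's exact notation:

```latex
% TCRF distribution (sketch); t_j are edge features, s_k are vertex features
p(\mathbf{y} \mid \mathbf{x}) = \frac{1}{Z(\mathbf{x})}
  \exp\!\Big( \sum_{(u,v)\in E} \sum_j \lambda_j\, t_j(y_u, y_v, \mathbf{x})
            + \sum_{v\in V} \sum_k \mu_k\, s_k(y_v, \mathbf{x}) \Big)

% Annotation picks the most probable labeling of the tree
\mathbf{y}^{*} = \arg\max_{\mathbf{y}} \; p(\mathbf{y} \mid \mathbf{x})
```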


Parameter Estimation
(1) With training data D = {(x(i), y(i))}, the log-likelihood objective function is maximized, where Θ = {λ1, λ2, …; μ1, μ2, …}.
(2) The derivative of the objective function with respect to a parameter λj requires the marginal probabilities over the cliques: p(y|x), the parent-child marginals p(yp, yc|x) and p(yc, yp|x), and the sibling marginals p(ys, ys'|x). Here:
  – f denotes both the edge feature t and the vertex feature s;
  – c (clique) denotes both an edge e and a vertex v;
  – λ denotes both kinds of parameters, λ and μ.
(3) With the objective function and its derivative, any gradient-based method (e.g., L-BFGS) can be used to solve the optimization problem and thus estimate the parameters. (The objective and its derivative are written out below.)
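The formulas on this slide are not preserved. A hedged reconstruction, using the standard conditional log-likelihood form that matches the description above (regularization omitted):

```latex
% Log-likelihood over the training data D = {(x^(i), y^(i))}
L(\Theta) = \sum_i \log p_\Theta\big(\mathbf{y}^{(i)} \mid \mathbf{x}^{(i)}\big)

% Derivative w.r.t. a parameter \lambda_j; f_j ranges over edge features t and
% vertex features s, and c ranges over the corresponding cliques (edges and vertices)
\frac{\partial L}{\partial \lambda_j}
  = \sum_i \sum_c f_j\big(\mathbf{y}^{(i)}_c, \mathbf{x}^{(i)}\big)
  - \sum_i \sum_c \sum_{\mathbf{y}_c} p_\Theta\big(\mathbf{y}_c \mid \mathbf{x}^{(i)}\big)\, f_j\big(\mathbf{y}_c, \mathbf{x}^{(i)}\big)
```

The second (expectation) term is why the clique marginals p(yp, yc|x) and p(ys, ys'|x) are needed; computing them is the job of TRP on the following slides.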


Calculating the Marginal Probabilities
• Tree-based Reparameterization (TRP)
  – TRP is based on the fact that any exact algorithm for optimal inference on trees actually computes marginal distributions for pairs of neighboring nodes.
• TRP Algorithm
  – Step 1: Initialization
  – Step 2: Updates
    a) Generating a spanning tree
    b) Propagation on the spanning tree
    c) Stop if the termination conditions are met
  (A rough message-passing sketch follows below.)
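No pseudocode for TRP appears in the slides. As a rough illustration of the message-passing machinery it relies on, here is a minimal sum-product routine for a pairwise model. Note that this is plain loopy belief propagation, a simpler stand-in: TRP proper restricts each round of updates to the edges of one spanning tree (steps 2a/2b above) and performs exact propagation there. The dictionary-of-potentials interface is an assumption of this sketch.

```python
import numpy as np

def sum_product_marginals(node_pot, edge_pot, n_iters=50, tol=1e-6):
    """Approximate single-node marginals on a pairwise model by loopy
    sum-product message passing (a simplified stand-in for TRP).

    node_pot : dict  v -> 1-D array of unary potentials, e.g. exp(s(x_v, y_v))
    edge_pot : dict  (u, v) -> 2-D array of pairwise potentials exp(t(x, y_u, y_v)),
               one entry per undirected edge, indexed as [y_u, y_v]
    """
    # one message per directed edge, initialised uniformly
    msgs, nbrs = {}, {v: [] for v in node_pot}
    for (u, v) in edge_pot:
        nbrs[u].append(v)
        nbrs[v].append(u)
        msgs[(u, v)] = np.ones(len(node_pot[v]))
        msgs[(v, u)] = np.ones(len(node_pot[u]))

    def pot(u, v):
        # pairwise potential oriented as [y_u, y_v]
        return edge_pot[(u, v)] if (u, v) in edge_pot else edge_pot[(v, u)].T

    for _ in range(n_iters):
        delta = 0.0
        for (u, v) in list(msgs):
            # product of u's unary potential and messages into u from all neighbours but v
            belief_u = node_pot[u].copy()
            for w in nbrs[u]:
                if w != v:
                    belief_u = belief_u * msgs[(w, u)]
            new_msg = pot(u, v).T @ belief_u      # marginalise over y_u
            new_msg = new_msg / new_msg.sum()     # normalise for numerical stability
            delta = max(delta, float(np.abs(new_msg - msgs[(u, v)]).max()))
            msgs[(u, v)] = new_msg
        if delta < tol:
            break

    # node belief = unary potential times all incoming messages, normalised
    marginals = {}
    for v in node_pot:
        b = node_pot[v].copy()
        for w in nbrs[v]:
            b = b * msgs[(w, v)]
        marginals[v] = b / b.sum()
    return marginals
```

In a TRP iteration, the inner loop over `msgs` would be limited to the directed edges of the chosen spanning tree, with the potentials on the remaining edges held fixed.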


TRP—Step 1: Initialization
[Figure: every vertex and every edge of the graph over labels y1…y6 (observations x1…x6) gets an initial potential:
  T0_v = k·exp(s(x_v, y_v)) for each vertex v = 1, …, 6
  T0_uv = k·exp(t(x, y_u, y_v)) for each edge, here (1,2), (1,4), (2,3), (2,5), (3,6), (4,5), (5,6)]


TRP—Step 2: a) Generating a spanning tree
• Methods: edge cutting and edge adding
[Figure: a spanning tree over y1…y6 obtained by removing some of the edges of the original graph.]


TRP—Step 2: b) Propagation
[Figure: exact sum-product propagation is performed on the chosen spanning tree over y1…y6.]


After The First Iteration
[Figure: the vertex potentials have been reparameterized to T1_1, …, T1_6; the potentials on the edges left out of the spanning tree (here (1,4) and (3,6)) keep their previous values T0_14 and T0_36.]


Annotation
[Figure: at annotation time the labels are unknown. The token sequences "Zipcode: 200030 Email: ajcoob@..." (under 3. Company Directorate Info) and "Zipcode: 200030 Email: ajcorp@..." (under 4. Company Registration Info) are to be labeled by the trained TCRF.]


Annotation (cont.)
• Viterbi algorithm (a minimal sketch follows below)
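The slide only names the algorithm. As a minimal sketch for the linear-chain case (the TCRF version runs the analogous max-product recursion over the tree rather than a chain), a Viterbi decoder might look like this; the array-based interface is an assumption of this sketch:

```python
import numpy as np

def viterbi(log_unary, log_trans):
    """Most likely label sequence for a linear chain.

    log_unary : (T, K) array, log-potential of label k at position t
    log_trans : (K, K) array, log-potential of moving from label i to label j
    Returns a length-T list of label indices.
    """
    T, K = log_unary.shape
    score = np.full((T, K), -np.inf)
    back = np.zeros((T, K), dtype=int)
    score[0] = log_unary[0]
    for t in range(1, T):
        # best previous label for every current label
        cand = score[t - 1][:, None] + log_trans        # (K, K)
        back[t] = cand.argmax(axis=0)
        score[t] = cand.max(axis=0) + log_unary[t]
    # follow back-pointers from the best final label
    path = [int(score[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```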


Outline
• Motivation and Problem Description
• Related Work
• Our Approach
• Experimental Results
• Future Work & Summary


Experimental Setup
Data sets:
  Data Set   #Docs   #Annotation Tasks
  Synthetic  62      4
  Real       3726    10
  (the slide also lists the ontology used for each data set, not preserved in this dump)
Baselines: (1) SVM  (2) Linear-CRF
Features:
  Category          Features
  Edge features     f(yp, yc), f(yc, yp), f(ys, ys)
  Vertex features   {wi}, {wp}, {wc}, {ws}, {wp, wi}, {wc, wi}, {ws, wi}
(An illustrative feature-function sketch follows.)
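To make the feature notation concrete, here is a hedged sketch of how such binary indicator features are typically instantiated. Reading w_p, w_c, w_s as words taken from the parent, child, and sibling nodes is an assumption, and all label/word values below are made up for illustration:

```python
def edge_feature(y_parent, y_child, target=("DIRECTORATE_INFO", "EMAIL")):
    """Indicator edge feature f(y_p, y_c): fires for one particular
    parent/child label pair (label names here are hypothetical examples)."""
    return 1.0 if (y_parent, y_child) == target else 0.0

def vertex_feature(word, label, target=("200030", "ZIPCODE")):
    """Indicator vertex feature pairing a word with a label, in the spirit of
    the {w_i} template; templates such as {w_p, w_i} would additionally
    condition on a word from a neighbouring node (assumed meaning of w_p)."""
    return 1.0 if (word, label) == target else 0.0
```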


Annotation Results on Synthetic Data


Annotation Results on Real Data (Prec. / Rec. / F1, %)

Annotation Task          SVM                     CRF                     TCRF
                         Prec.  Rec.   F1        Prec.  Rec.   F1        Prec.  Rec.   F1
Company_Chinese_Name     88.82  89.40  89.11     82.10  80.69  81.37     84.34  92.72  88.33
Company_English_Name     90.51  95.33  92.86     71.68  80.14  75.66     89.26  88.67  88.96
Legal_Representative     94.84  97.35  96.08     92.86  96.60  94.66     94.84  97.35  96.08
Company_Secretary        99.29  93.33  96.22     91.65  96.99  94.23     77.96  96.67  86.31
Secretary_Email          57.14   8.89  15.39     69.94  56.53  62.34     73.86  97.01  83.87
Registered_Address       98.66  96.71  97.68     94.75  87.20  90.80     84.05  90.13  86.98
Office_Address           70.41  97.54  81.78     77.41  87.06  81.94     86.93  89.86  88.37
Company_Email             0.00   0.00   0.00     84.57  85.64  85.09     95.20  90.84  92.97
Newspaper               100.0   99.34  99.67     94.51  91.97  93.21     98.69 100.0   99.34
Accounting_Agency        83.15  95.63  88.95     73.81  56.77  62.73     79.57  97.19  87.50
Average                  78.28  77.35  75.77     83.33  81.96  82.20     86.47  94.04  89.87


Time Complexity
Method   Training    Annotation
SVM      96 s        30 s
CRF      5 m 25 s    5 s
TCRF     50 m 40 s   50 s
Tested on a computer with two 2.8 GHz P4 CPUs and 3 GB memory.


Outline
• Motivation and Problem Description
• Related Work
• Our Approach
• Experimental Results
• Future Work & Summary


Questions
• How to reduce the computational cost?
  – Parallelization
  – Incorporation of constraints from ontologies
• How to incorporate other types of dependencies into the CRF model?
  – E.g., multiple dimensions
  – Long-distance dependencies
  – …
• How to identify entities & relations in a unified model?


Summary
• Investigated the problem of hierarchical semantic annotation
• Proposed Tree-structured Conditional Random Fields (TCRFs) for incorporating hierarchical dependencies
• Employed Tree-based Reparameterization (TRP) to compute the marginal probabilities needed for parameter estimation
• Our approach significantly outperforms the baseline methods (SVM and linear-chain CRF)


Thanks!
HP: http://keg.cs.tsinghua.edu.cn/persons/tj/