Overview of Relational Markov Networks Colin Evans

Relational Schema

Relational Data – an Instantiation

Problem
• How do we find an optimal assignment of labels I.y to a set of variables I.x in a specific instantiation I?
• We need an objective function p(I.y | I.x) that indicates the quality of an assignment of labels.
• The objective function should take into account the relational structure of the data.

Relational Clique Templates
• A template C is an algorithm for selecting a subset of nodes requiring labels, essentially a logical query.
  – “Find all label attributes A, B with pages C, D where C has.Label A and D has.Label B and C links.To D”
• A specific subset is called a clique, I.xc.
• A potential function fc(I.yc | I.xc), where I.yc is a subset of I.y, is associated with the template.
• An example potential function for the above template:
  – A = B → 1
  – A ≠ B → 0
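
The link template and its example potential can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's implementation: the page names, the `links` list, the label values, and the function names `link_template` and `potential` are all hypothetical.

```python
# Hypothetical toy instantiation: three pages and the directed links between them.
links = [("p1", "p2"), ("p2", "p3"), ("p1", "p3")]

def link_template(links):
    """Select one clique per link: the pair of pages whose labels interact."""
    return [(src, dst) for src, dst in links]

def potential(label_a, label_b):
    """Example potential from the slide: 1 if the two labels agree, 0 otherwise."""
    return 1 if label_a == label_b else 0

# Apply the template to the instantiation, then score one candidate labeling.
cliques = link_template(links)
labels = {"p1": "faculty", "p2": "faculty", "p3": "student"}
values = [potential(labels[c], labels[d]) for c, d in cliques]

print(cliques)
print(values)
```

Each tuple in `cliques` plays the role of one clique selected by the template, and `values` holds the potential of each clique under the candidate labeling.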

Markov Networks
• Given a set of templates C and the set of cliques C(I) they select in an instantiation I, we can construct a Markov Network by connecting the nodes within each clique.

Markov Networks
• Given a Markov Network, we have the following joint probability distribution (JPD):
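
In the notation of these slides, the standard joint distribution of a Markov network is a normalized product of clique potentials. The form below is written under that assumption rather than copied from the slide; the partition function Z is the usual normalizer and is a name introduced here.

```latex
p(I.y \mid I.x) = \frac{1}{Z(I.x)} \prod_{C \in \mathbf{C}} \; \prod_{c \in C(I)} f_C(I.y_c \mid I.x_c),
\qquad
Z(I.x) = \sum_{I.y'} \; \prod_{C \in \mathbf{C}} \; \prod_{c \in C(I)} f_C(I.y'_c \mid I.x_c)
```

The outer product ranges over templates and the inner product over the cliques each template selects, so every clique from the same template shares one potential function.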

Potential Functions
• A weight wc associated with each clique template C is inserted to balance the contribution of each potential function.
• These weights need to be learned.
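
One common way to insert such weights is a log-linear form, where the score of a labeling is the exponential of a weighted sum of per-template counts. The sketch below assumes that form, which the slides do not state explicitly. The pages, the `has_keyword` attribute, the second "keyword" template, and the weight values are hypothetical, and normalization is done by brute force, which is only feasible for a toy example.

```python
import math
from itertools import product

# Hypothetical toy instantiation.
pages = ["p1", "p2", "p3"]
links = [("p1", "p2"), ("p2", "p3")]
has_keyword = {"p1": True, "p2": False, "p3": False}  # assumed observed attribute
LABELS = ["faculty", "student"]

# One weight per template (values chosen arbitrarily for illustration).
weights = {"link": 1.0, "keyword": 2.0}

def feature_counts(y):
    """Sum each template's potential over all of its cliques."""
    agree = sum(1 for a, b in links if y[a] == y[b])
    keyword = sum(1 for p in pages if has_keyword[p] and y[p] == "faculty")
    return {"link": agree, "keyword": keyword}

def score(y):
    """Unnormalized score: exp of the weighted sum of template counts."""
    counts = feature_counts(y)
    return math.exp(sum(weights[t] * counts[t] for t in weights))

all_labelings = [dict(zip(pages, ys)) for ys in product(LABELS, repeat=len(pages))]

def probability(y):
    """Normalize by enumerating every labeling (exponential in the number of pages)."""
    z = sum(score(other) for other in all_labelings)
    return score(y) / z

best = max(all_labelings, key=score)
print(best, round(probability(best), 3))
```

Raising a template's weight increases the influence of its cliques on the final labeling, which is the balancing role the slide describes.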

How do we learn the weights?
• Gradient Descent
• Perceptron Learning
• Other optimization methods
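
As an illustration of the perceptron option, the sketch below applies a structured-perceptron-style update: predict the best labeling under the current weights, then move each weight by the difference between the template counts of the true labeling and those of the prediction. It assumes a linear score over per-template counts. The toy data, the "keyword" template, and the function names are hypothetical, and prediction is done by brute-force enumeration.

```python
from itertools import product

# Hypothetical toy training instantiation with known labels.
pages = ["p1", "p2", "p3"]
links = [("p1", "p2"), ("p2", "p3")]
has_keyword = {"p1": True, "p2": False, "p3": False}  # assumed observed attribute
LABELS = ["student", "faculty"]
true_labels = {"p1": "faculty", "p2": "faculty", "p3": "faculty"}

def feature_counts(y):
    """Sum each template's potential over all of its cliques."""
    agree = sum(1 for a, b in links if y[a] == y[b])
    keyword = sum(1 for p in pages if has_keyword[p] and y[p] == "faculty")
    return {"link": agree, "keyword": keyword}

def predict(weights):
    """Return the highest-scoring labeling by enumerating all candidates."""
    candidates = (dict(zip(pages, ys)) for ys in product(LABELS, repeat=len(pages)))
    return max(
        candidates,
        key=lambda y: sum(weights[t] * f for t, f in feature_counts(y).items()),
    )

def train(epochs=10):
    """Perceptron-style learning: update weights only when the prediction is wrong."""
    weights = {"link": 0.0, "keyword": 0.0}
    for _ in range(epochs):
        guess = predict(weights)
        if guess == true_labels:
            break
        gold, pred = feature_counts(true_labels), feature_counts(guess)
        for t in weights:
            weights[t] += gold[t] - pred[t]
    return weights

weights = train()
print(weights, predict(weights) == true_labels)
```

Gradient-based learning would replace the hard prediction with expected template counts under the model, which requires inference over the network rather than a single argmax.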

How do we find an optimal labeling?
• Model the relationships of the cliques as a Markov Network and use the sum-product algorithm.
  – Problem: sum-product is only proven to converge on trees. On graphs with cycles it is an approximation with no general convergence guarantee.
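
The tree case can be checked directly on a small chain, which is the simplest tree. The sketch below runs sum-product message passing on a three-node chain and compares the resulting marginals with brute-force enumeration. The potentials are made-up numbers and the function names are hypothetical.

```python
from itertools import product

LABELS = [0, 1]
# Hypothetical potentials: one unary table per node, one shared pairwise table.
unary = [[1.0, 3.0], [1.0, 1.0], [2.0, 1.0]]
pairwise = [[2.0, 1.0], [1.0, 2.0]]  # favours equal neighbouring labels
n = len(unary)

def sum_product_marginals():
    """Exact node marginals on a chain via forward and backward messages."""
    fwd = [[1.0, 1.0] for _ in range(n)]  # message into node i from the left
    bwd = [[1.0, 1.0] for _ in range(n)]  # message into node i from the right
    for i in range(1, n):
        for b in LABELS:
            fwd[i][b] = sum(
                fwd[i - 1][a] * unary[i - 1][a] * pairwise[a][b] for a in LABELS
            )
    for i in range(n - 2, -1, -1):
        for a in LABELS:
            bwd[i][a] = sum(
                bwd[i + 1][b] * unary[i + 1][b] * pairwise[a][b] for b in LABELS
            )
    marginals = []
    for i in range(n):
        belief = [fwd[i][v] * unary[i][v] * bwd[i][v] for v in LABELS]
        total = sum(belief)
        marginals.append([b / total for b in belief])
    return marginals

def brute_force_marginals():
    """Reference marginals obtained by enumerating every joint labeling."""
    marginals = [[0.0, 0.0] for _ in range(n)]
    z = 0.0
    for ys in product(LABELS, repeat=n):
        weight = 1.0
        for i, y in enumerate(ys):
            weight *= unary[i][y]
        for i in range(n - 1):
            weight *= pairwise[ys[i]][ys[i + 1]]
        z += weight
        for i, y in enumerate(ys):
            marginals[i][y] += weight
    return [[m / z for m in row] for row in marginals]

bp = sum_product_marginals()
exact = brute_force_marginals()
print(bp)
print(exact)
```

On this chain the two results agree up to floating-point error. On a graph with cycles the same message updates would have to be iterated, and agreement with the exact marginals is no longer guaranteed.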

Other Issues
• How does this method work if you have a large body of “correctly” labeled relational data and only wish to apply a small number of labels?
  – Email classification is a good example of this.
  – Complexity of assigning a label goes down, but we still use relationships to determine the label.
• Could one template “dominate” in a specific data set? Does there need to be a factor which normalizes the contribution of a template if it produces too many cliques?
• How do you test the “usefulness” of a template?