Friday 29 January 2016

Walking on Metapaths

What are metapaths?

To answer that I first need to explain what are heterogeneous graphs. Then we can go for metapaths.

Heterogeneous graphs

First of all, let us consider homogeneous graphs. Well, they are simple graphs where all the nodes are of same type and all the edges are of same type. How heterogeneous graphs differ is that here the nodes of the graph can be of different types and all the edges of the graph need not be of the same type as well.

How does heterogeneous graphs help?

As one might observe homogeneous graphs are inadequate to depict many real-life scenarios. Hence, we cannot express all the information contained in the network via homogeneous graphs effectively.
A typical advantage of heterogeneous networks can be seen in case of citation networks.

Figure 1 : Citation network


Here as we can see the nodes are of varied types : Term, Paper, Venue, Author. The node "term" represents keywords used in a paper. Other node are self-explanatory. These nodes are obviously entirely different entities with respect to citation networks. As a result, depicting them via homogeneous graphs will not be very intuitive. Hence, heterogeneous graphs are obviously a better option.

Now with this basic knowledge of heterogeneous graphs, we can finally answer the initial question, "what are metapaths ?"
Figure 2 : Metapaths



In layman terms, metapaths are just paths in a heterogeneous network. 

The point is that the edges in a heterogeneous network are also of different types. For example: a Author to Paper edge represents that the author has contributed to the paper. Similarly, Paper to Author edge represents that opposite direction of the relation. Paper to Venue denotes how that the Paper was submitted at that Venue. Similarly a Venue to Paper represents the opposite direction of the relation. What is interesting is that a Paper node can have exactly one edge to a Venue node while a Venue node can have many edges to different Paper nodes. Thus, the edges represent real-life relation between the nodes. 

Now, what metapaths do is to extend this real-life relations into further depth. For example, in Figure 2, the metapath APV represents a indirect relation between a author and a venue. So, metapaths are used to do information extraction in heterogeneous networks. 

An application - Recommending papers to be cited

In more formal terms, the user gives the main topic of his research ( one can call it the abstract ) and a few optional keywords and the output is a list of ranked papers could potentially be cited
given user’s input. This is more or less the "problem statement".

Let us divide and conquer the "problem". 

Identifying the relations or edge types in the graph

Table 1: Relations in the heterogeneous network
The relations are quite intuitive. So, let us go into assigning the edge weights. Consider the example of edge from Paper "p" to Author "a". There can be many ways of doing this, but the first thing that pops up in our minds is obviously 1/(Number of co-authors of Paper "a"). Similarly, for the weight from an Author "a" to Paper "p" is 1/(Number of papers authored by Author "a").
For the time being let us not go into how more complex edge weights are obtained. I will provide the link to reference paper at the end for those who are interested.

Extracting seed papers for recommendation

Given a set of top ranked papers for better predictions. What we do is select the top few ranked papers and use those papers and their edges to recommend other papers to the user. One thing to be noted is that the number of seed set papers may vary for different metapaths. What this does effectively is kind of a "Pseudo Relevance Feedback".


How do we actually predict?

Till now, we have a complex heterogeneous network and lot of jargons. So, let us see how actually we quantify the similarity between two papers or more generally two nodes in the heterogeneous network.

One more point, this similarity is different for different metapaths. For one metapath say "m", the similarity between two nodes given by equation 1.
Equation 1
where t is a tour from a_i to a_j along the metapath "m" and RW (t) is the random walk probability of the tour t. Similarly we can extend this to a set of nodes as given by equation 3.

Equation 2

That was quite simple. We have similarity values between two nodes across different metapaths. So, all that is left around is doing the final predictions. We just combine the all the similarity values for two nodes across different metapaths. Equation 3 gives an example.

Equation 3
There can be many metapaths. The reference paper has used 18 metapaths ( Those who are interested can see the reference I will be giving).

Now, we have a lot of parameters, alphas to be optimized. We can use co-ordinate ascent to handle that ( Look up wikipedia for this).

Thus, we can use metapaths for recommending papers.

This was quite a short and simple explanation. Please go through the following reference for deeper understanding.

References: dl.acm.org/citation.cfm?id=2661965