In the one hand, link scoring is the task of measuring the strength of existing links. On the other hand, link prediction is the task of estimating the the likelihood of existence of a link between a pair of disconnected nodes. The techniques applied in both problems are rather similar, and are based on the idea that the likelihood of two nodes interacting can be evaluated based on their neighborhood. Scoring linksIn NOESIS, links can be scored using the LinkScorer class. The constructor of this class takes the network for which the links will be scored and the algorithm that will be used to compute the scores. The supported link scoring algorithms are described below. For example, to use the resource allocation algorithm, first we need to instantiate it as follows:NodePairMeasureTask measure = new KatzScore(network); LinkScorer scorer = new LinkScorer(network, measure);Once the scorer has been instantiated, links can be scored using the compute method. The following example shows how to score the link between a pair of nodes with indices 15 and 28: double score = scorer.compute(15, 28); Predicting linksTo perform link prediction using NOESIS, we have to instantiate the link predictor that we want to use. For example, if we want to compute the likelihood of link between a pair of nodes in a given network using random walks, we would instantiate RandomWalkScore as follows: NodePairMeasureTask predictor = new RandomWalkScore(network); The instantiated predictor provides the call method, which computes and returns a matrix. This matrix contains the computed scores, so that the (i,j)element is the score associated to the pair of nodes with indices i and j. An example of how to use this method is provided in the following code snippet: Matrix matrix = predictor.call(); New linkrelated algorithms can be created by extending the NodePairMeasureTask class. Extending this class requires implementing the compute method, which must return the matrix of computed scores. An example is shown below: public class MyLinkMeasure extends NodePairMeasureTask { public MyLinkMeasure(Network<?, ?> network) { The implemented class can be used as link scorer and as link predictor, as shown in the previous sections. On the one hand, to compute a score proportional to the likelihood of link existence between a pair of nodes with indices 12 and 24, we would use the call method as follows: MyLinkMeasure measure = new MyLinkMeasure(network); Matrix scores = measure.call(); double score = scores.get(12, 14); On the other hand, this algorithm can be used for link scoring by creating and using a link scorer as follows: MyLinkScorer measure = new MyLinkScorer(network); LinkScorer scorer = new LinkScorer(network, measure); double score = scorer.compute(15, 28); The score variable contains the score computed for the link between the pair of nodes with indices 15 and 28. Supported link scorers and predictorsThe algorithms currently supported by NOESIS for link scoring and prediction are introduced in this section. These algorithms have been divided into two categories according to the amount of information in the network that they consider. In addition, examples of how to use these algorithms are provided. Local approachesLocal algorithms only use neighborhood information in the network. As a result of this, these algorithms have high performance and scalability. Common neighbors scoreThe common neighbors score defines the similarity between pairs of nodes as the number of shared neighbors between both nodes. The following example shows how to use the CommonNeighborsScore class, which implements this metric. Network network = ... CommonNeighborsScore commonNeighborsScorer = new CommonNeighborsScore(network); Matrix scores = commonNeighborsScorer.call(); References:
AdamicAdar scoreThis metric defines the similarity between a pair of nodes according to the number of shared neighbors. However, in contrast to the common neighbors score, the AdamicAdar score weighs the contribution of each shared neighbor inversely proportional to the logarithm of its degree. An example of how to use this scorer is shown below: Network network = ... AdamicAdarScore adamicAdarScorer = new AdamicAdarScore(network); Matrix scores = adamicAdarScorer.call(); References:
Resource allocation scoreThis score, which is motivated by the resource allocation process that takes place in complex networks, models the transmission of units of resources between pairs of unconnected nodes through their neighbors. This metric is similar to the AdamicAdar score, but the weighting is computed directly according to the degree. The following example shows how to use the resource allocation score: Network network = ... ResourceAllocationScore ra = new ResourceAllocationScore(network); Matrix scores = ra.call(); References:
Jaccard scoreThe Jaccard score is computed as the size of the intersection between the set of neighbors of both nodes, divided by the size of their union. An example of how to use the JaccardScore class to compute this metric is shown below: Network network = ... JaccardScore jaccard = new JaccardScore(network); Matrix scores = jaccard.call(); Preferential attachment scoreThe node degrees follow a power law distribution in many real world networks, which results in the widely studied scalefree networks. The preferential attachment score is derived from results in the study of scalefree networks, and measures the interaction between a pair of nodes as the product of their degrees. Therefore, this metric prioritizes links between pairs of highdegree nodes. The following example shows how to use the preferential attachment score: Network network = ... PreferentialAttachmentScore pa = new PreferentialAttachmentScore(network); Matrix scores = pa.call(); References:
Salton scoreThis score, also known as the cosine similarity, is closely related to the Jaccard index. It is defined as the size of the intersection between the sets of neighbors of both nodes, divided by the root of the product of their size. The Salton score can be computed using the SaltonScore class, as shown below: Network network = ... SaltonScore salton = new SaltonScore(network); Matrix scores = salton.call(); References:
Sørensen scoreThis score, which was initially proposed to compare the similarity between different ecological community data samples, is closely related to the Jaccard score. This metric is proportional to the number of shared neighbors, divided by the sum of the degree of both nodes. Some authors suggest that the Sørensen score is less sensitive to outliers than the Jaccard score. The following code snippet shows how to compute the Sørensen score: Network network = ... SorensenScore sorensen = new SorensenScore(network); Matrix scores = sorensen.call(); References:
Hub promoted scoreThis score, which was proposed as a result of studying modularity in metabolic networks, models behaviors that lead to hierarchically structured networks. This similarity score avoids link formation between hub nodes and promotes link formation between lowdegree nodes and hubs. This leads to small highly internally connected groups of nodes that are also highly isolated from other groups. This metric can be used as follows: Network network = ... HubPromotedScore hp = new HubPromotedScore(network); Matrix scores = hp.call(); References:
Hub depressed scoreThis metric has the opposite goal of the hub promoted score. It promotes link formation between hubs and between lowdegree nodes, but not between hubs and lowdegree nodes. This score, which is implemented by the HubDepressedScore, can be used as follows: Network network = ... HubDepressedScore hd = new HubDepressedScore(network); Matrix scores = hd.call(); References:
Local LeichtHolmeNewman scoreThis score is defined as the ratio of length two paths between a pair of nodes and a value proportional to the expected number of length two paths between them. This score can be computed as shown below: Network network = ... LocalLeichtHolmeNewmanScore llhn = new LocalLeichtHolmeNewmanScore(network); Matrix scores = llhn.call(); References:
Global approachesIn contrast to local methods, global methods take into account the complete network topology. This allows using more sophisticated approaches, but also more inefficient and with worse scalability. Katz scoreThis score is computed as the sum of the influence of all possible paths between pairs of nodes. Path influence is incrementally penalized proportionally to path length according to a damping factor. This damping factor, with a value between 0 and 1, controls path length penalization. A larger damping factor increases the influence of longer paths. The following example shows how to compute the Katz score for a given graph using a damping factor of 0.2: Network network = ... KatzScore katz = new KatzScore(network, 0.2); Matrix scores = katz.call(); References:
Global LeichtHolmeNewman scoreSimilar to the Katz score, the global LeichtHolmeNewman score measures node similarity proportional to the number of paths between nodes. This technique also introduces a damping factor, which controls the penalization of paths according to their length. The main different with the Katz score is that the global LeichtHolmeNewman score approximates the number of paths by computing the largest eigenvalue of the network adjacency matrix. The code listed below shows how to compute the global LeichtHolmeNewman score using a damping factor of 0.15: Network network = ... GlobalLeichtHolmeNewmanScore glhn = new GlobalLeichtHolmeNewmanScore(network, 0.15); Matrix scores = glhn.call(); References:
Random walk scoreA random walk on the graph is defined as Markov chain of randomly selected nodes. Starting from a given node, a theoretical walker randomly jumps from one node to another through links. Random walks have been widely used to study lots of stochastic processes in many fields of study. Random walks can be used to compute the probability of reaching a node from another. This probability is used by the random walk score to estimate the similarity between pairs of nodes. An example of how to use the RandomWalkScore class, which implements this metric, is shown below: Network network = ... RandomWalkScore rw = new RandomWalkScore(network); Matrix scores = rw.call(); References:
Random walk with restart scoreRandom walk with restart is an extension of the original random walk algorithm. In this new model, the random walker moves to adjacent nodes with a probability given by an alpha parameter. However, with complementary probability 1alpha, the random walker returns to the starting node. This metric is closely related to the PageRank algorithm, described in the Structural Network Properties section. An example of how to compute this metric with an alpha value of 0.8, which implies a restart probability equal to 0.2, is shown in the following fragment of code: Network network = ... RandomWalkWithRestartScore rwr = new RandomWalkWithRestartScore(network, 0.8); Matrix scores = rwr.call(); References:
Flow propagation scoreRandom walkbased metrics require the normalization of the network adjacency matrix. Previously described approaches required row normalization, so the sum of the elements in each row is equal to 1. The flow propagation score is a variant of Random walk with restart score that replaces the normalized adjacency matrix with the normalized Laplacian matrix. The following example shows how to compute flow propagation in a given network using an alpha value of 0.85: Network network = ... FlowPropagationScore fp = new FlowPropagationScore(network, 0.85); Matrix scores = fp.call(); References:
Pseudoinverse Laplacian scoreThe Laplacian matrix is an alternative graph representation widely used in spectral graph theory. The MoorePenrose pseudoinverse of the Laplacian matrix can be computed using singular value decomposition (SVD). This matrix is a kernel, which means that it can be used as a node similarity matrix. The pseudoinverse Laplacian score is computed as the node similarity given by the pseudoinverse Laplacian matrix, divided by the root of the product of the selfsimilarity of both nodes. This metric can be computed using the PseudoinverseLaplacianScore class as shown below: Network network = ... PseudoinverseLaplacianScore pl = new PseudoinverseLaplacianScore(network); Matrix scores = pl.call(); References:
Average commute time scoreThe average commute time score is equal to the average number of steps that a random walker, starting from a given node, takes to reach other given node for the first time and go back to the starting node. This number is computed using the pseudoinverse of
the Laplacian matrix. The average commute time score for a given network can be computed as follows: Network network = ... AverageCommuteTimeScore act = new AverageCommuteTimeScore(network); Matrix scores = act.call(); References:
Random forest kernel scoreThe random forest kernel measures the accessibility between pairs of nodes. This score is based on the matrixtree theorem, which states that the number of spanning trees in a graph can be estimated using the Laplacian matrix. This number can be used to estimate the likelihood of link existence between every pair of nodes in a given network. The following example shows how to compute this score using the RandomForestKernelScore class: Network network = ... RandomForestKernelScore rfk = new RandomForestKernelScore(network); Matrix scores = rfk.call(); References:

Wiki >