Question 1 [2 points: 0.2/each]: Use your own language to briefly explain the following concepts:



  1. Common neighbors for node similarity:

  2. Adamic/Adar for node similarity:

  3. Jaccard’s coefficient for node similarity:

  4. Random walk:

  5. Stationary Distribution:

  6. PageRank score:

  7. Rooted PageRank:

  8. Katz Score:

  9. Network Transition Matrix:

  10. Low rank approximation for node similarity:
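The neighborhood-based measures in items 1–3 can be illustrated concretely with networkx; the toy graph below is for illustration only and is not one of the assignment's figures:

```python
import networkx as nx

# Small illustrative graph (not from the assignment's figures).
G = nx.Graph([("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d")])

u, v = "a", "d"
cn = sorted(nx.common_neighbors(G, u, v))           # common neighbors of u and v
jac = next(nx.jaccard_coefficient(G, [(u, v)]))[2]  # |CN| / |union of neighborhoods|
aa = next(nx.adamic_adar_index(G, [(u, v)]))[2]     # sum over CN of 1/log(degree)

print(cn)   # -> ['b', 'c']
print(jac)  # -> 1.0, since both neighborhoods are exactly {b, c}
print(aa)   # -> 2 / log(3), since b and c each have degree 3
```

The same three calls, given a different pair list, score any candidate link in a larger network.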




Question 2 [1.5 points]:
In the network shown in Figure 1:


· Please calculate the Betweenness Centrality scores [0.5 pt], Closeness Centrality scores [0.5 pt], and eigenvector-based centrality scores [0.5 pt] for nodes a, b, c, d, and e, respectively (please show your solutions).



Figure 1







Question 3 [2 pts]:
Given four web pages with the link structure shown in Figure 2:


1. Use “Power Iteration” (a.k.a. simple iteration) to calculate the PageRank score of each website. (Report the results of the first three iterations, with the initial PageRank score of each node set to 1/n = 0.25.) [1 pt]


2. Please also use the eigenvector-based approach to calculate the PageRank score of each web page (show your solutions). [1 pt]
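Since Figure 2 is not reproduced in this text, the power-iteration procedure of part 1 can be sketched on a hypothetical four-page link structure; the edge dictionary below is an assumption, not the actual figure, but the same loop applies to the real graph:

```python
import numpy as np

# Hypothetical link structure standing in for Figure 2 (assumption only).
edges = {"A": ["B", "C", "D"], "B": ["D"], "C": ["A", "D"], "D": ["A", "C"]}
pages = sorted(edges)
idx = {p: i for i, p in enumerate(pages)}

# Column-stochastic matrix M: M[i, j] = 1/outdeg(j) if page j links to page i.
M = np.zeros((len(pages), len(pages)))
for src, outs in edges.items():
    for dst in outs:
        M[idx[dst], idx[src]] = 1.0 / len(outs)

r = np.full(len(pages), 0.25)   # initial PageRank scores set to 1/n
for it in range(3):             # report the first three iterations
    r = M @ r                   # one power-iteration step
    print(f"iteration {it + 1}:", dict(zip(pages, np.round(r, 4))))
```

The eigenvector-based approach of part 2 instead solves M r = r, i.e. r is the principal eigenvector of M (e.g. via np.linalg.eig), normalized so the scores sum to 1.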


Figure 2



Question 4 [1 pt]:
In the network shown in Figure 3, please use rooted PageRank to calculate the similarity between each pair of nodes. At each step, the random walker has a probability 1 - a (where a = 0.2) of returning to the original (root) node. (Please show your solutions.)
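Rooted PageRank is personalized PageRank with the teleport mass concentrated on the root node. A minimal sketch with networkx follows; the graph is illustrative (Figure 3 is not reproduced here), and the mapping of the assignment's "return with probability 1 - a" onto networkx's damping factor alpha = a is an assumption that should be checked against the lecture's convention:

```python
import networkx as nx

# Illustrative graph standing in for Figure 3 (assumption, not the real figure).
G = nx.Graph([(1, 2), (1, 3), (2, 3), (3, 4)])

def rooted_pagerank(G, root, a=0.2):
    # Walker follows an edge with probability a and teleports back to the
    # root with probability 1 - a (personalization mass on the root only).
    return nx.pagerank(G, alpha=a, personalization={root: 1.0})

scores = rooted_pagerank(G, root=1)
print(scores)  # scores[v] = rooted-PageRank similarity of node v to node 1
```

Running this once per root node fills in one row of the pairwise similarity matrix the question asks for.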



Figure 3



Question 5 [2 pts]:
The network in Figure 4 shows connections between 8 individuals in a small community. For node pairs (1, 7) and (1, 6), please use the following measures to calculate their similarity (or distance) values and conclude which pair is more likely to form a link.


a. Jaccard’s Coefficient (0.25 pt)


b. Adamic/Adar (0.25 pt)


c. Preferential attachment (0.25 pt)


d. Katz (with b=0.05) (0.25 pt)


e. SimRank score with C=1 (please show the SimRank score after the 2nd iteration). (1.0 pt)
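For parts c and d, preferential attachment is simply the product of the two node degrees, and the Katz series with a small b has the closed form (I - bA)^(-1) - I. The edge list below is an assumed reconstruction of Figure 4's 8-node community and should be verified against the actual figure before trusting the numbers:

```python
import numpy as np
import networkx as nx

# Assumed reconstruction of Figure 4 (verify against the figure before use).
edges = [(1, 2), (1, 3), (1, 4), (1, 5), (2, 3), (2, 4), (2, 8),
         (3, 4), (3, 5), (3, 6), (4, 6), (4, 7), (5, 7), (6, 7), (7, 8)]
G = nx.Graph(edges)
A = nx.to_numpy_array(G, nodelist=sorted(G))  # rows/cols ordered as nodes 1..8

b = 0.05
# Katz: sum over path lengths l of b^l * (A^l) = (I - bA)^(-1) - I,
# which converges because b is below 1/lambda_max(A).
katz = np.linalg.inv(np.eye(len(A)) - b * A) - np.eye(len(A))

for u, v in [(1, 7), (1, 6)]:
    pa = G.degree(u) * G.degree(v)   # preferential attachment
    print((u, v), "Katz:", round(katz[u - 1, v - 1], 5), "PA:", pa)
```

Whichever pair scores higher on a given measure is the pair that measure predicts is more likely to form a link.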



Figure 4

Question 6 [2 pts]:
In the network shown in Figure 5:


1. Please report the graph-distance similarity matrix of the network [0.25 pt], and create a 9x9 matrix in which each cell denotes the number of common neighbors shared between the corresponding row and column nodes [0.25 pt].


2. Apply Singular Value Decomposition to the sum of the above two matrices. Report the U, S, and V matrices [0.25 pt]. Explain the meaning of the U matrix, and describe how to use U to represent network nodes in different dimensions [0.25 pt].


3. Use the U matrix to represent each node in a 1-dimensional space and a 2-dimensional space, respectively (draw a plot or chart) [0.5 pt].


4. Calculate the Euclidean distance between all node pairs (report the Euclidean distance matrix) [0.25 pt]. Find which two nodes have the largest distance and which two have the smallest distance, and explain why [0.25 pt].
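The pipeline in parts 1–4 can be sketched with numpy. Since Figure 5 is not reproduced here, a placeholder 9-node graph stands in for the real network; only the graph definition would need to change:

```python
import numpy as np
import networkx as nx

G = nx.cycle_graph(9)                     # placeholder for Figure 5's 9 nodes

D = np.array(nx.floyd_warshall_numpy(G))  # graph-distance matrix
A = nx.to_numpy_array(G)
C = A @ A                                 # (A^2)[i, j] = # common neighbors of i, j
np.fill_diagonal(C, 0)                    # diagonal of A^2 is the degree, not needed
M = D + C                                 # sum of the two required matrices

U, S, Vt = np.linalg.svd(M)
# Row i of U holds node i's coordinates; keeping the first k columns gives a
# k-dimensional representation of the node (k = 1 or 2 for the plots).
X2 = U[:, :2]                             # 2-D embedding of each node
dist = np.linalg.norm(X2[:, None, :] - X2[None, :, :], axis=-1)
print(np.round(dist, 3))                  # Euclidean distance matrix in 2-D
```

The largest and smallest off-diagonal entries of the distance matrix identify the most and least separated node pairs in the embedded space.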

Figure 5







Question 7 [1.5 pt]:
The “karate.gml” file in the Dataset folder (in Canvas) is a social network with 34 vertices and 78 edges (the Zachary karate network: http://konect.uni-koblenz.de/networks/ucidata-zachary). Use Python or R (or another programming language) to implement the following tasks. Please submit your code and the results as a notebook.

1) Find the shortest-path distance between all node pairs, and report the diameter of the network [0.25 pt]

2) Highlight the longest shortest path(s) of the network (if there are multiple such paths, please use a different color for each path) [0.25 pt]

3) Report the betweenness centrality score of each node and of each edge. Report which node and which edge have the largest betweenness centrality scores, and highlight this node and edge in different colors in a plot. [0.25 pt]

4) Report the eigenvector centrality score of each node, and draw a plot showing the network with node size proportional to each node’s eigenvector centrality score [0.25 pt]

5) Report the Jaccard similarity between each pair of nodes [0.25 pt], and the Adamic/Adar similarity between each pair of nodes [0.25 pt], as a table or a heatmap.
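A minimal Python sketch for these tasks, using the copy of the Zachary network bundled with networkx instead of parsing karate.gml (plotting calls are omitted; results should match the GML file up to node labeling):

```python
import networkx as nx

G = nx.karate_club_graph()   # 34 nodes, 78 edges

# 1) all-pairs shortest-path lengths and the diameter
sp = dict(nx.all_pairs_shortest_path_length(G))
diameter = nx.diameter(G)

# 3) node and edge betweenness, and their maximizers
node_bc = nx.betweenness_centrality(G)
edge_bc = nx.edge_betweenness_centrality(G)
top_node = max(node_bc, key=node_bc.get)
top_edge = max(edge_bc, key=edge_bc.get)

# 4) eigenvector centrality (scale node sizes by these values when plotting)
ec = nx.eigenvector_centrality(G)

# 5) Jaccard and Adamic/Adar similarity for every node pair
pairs = [(u, v) for u in G for v in G if u < v]
jaccard = list(nx.jaccard_coefficient(G, pairs))
adamic = list(nx.adamic_adar_index(G, pairs))

print("diameter:", diameter)
print("max betweenness node:", top_node, "edge:", top_edge)
```

For task 2, the longest shortest paths are the node pairs whose distance in sp equals the diameter; nx.shortest_path(G, u, v) recovers a concrete path to highlight.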



Answered Same Day: Feb 27, 2021
Abr Writing answered on Feb 28 2021
Notebook

Importing the necessary libraries:

    import networkx as nx
    from collections import defaultdict
    import copy
    import matplotlib.pyplot as plt
    import numpy as np
    from mpl_toolkits.mplot3d import Axes3D
    from matplotlib import cm

Question 1

1. Common neighbors for node similarity: simply the number of neighbors the two nodes share, |Γ(u) ∩ Γ(v)|, where Γ(u) is the set of neighbors of node u.

2. Adamic/Adar for node similarity: the Adamic/Adar index is defined as Σ_{w ∈ Γ(u) ∩ Γ(v)} 1 / log|Γ(w)|, so common neighbors with few connections of their own count more heavily.

3. Jaccard’s coefficient for node similarity: the number of common neighbors of the two nodes divided by the number of their distinct neighbors taken together, |Γ(u) ∩ Γ(v)| / |Γ(u) ∪ Γ(v)|.

4. Random walk: a stochastic process that traces a random path through some mathematical space; on a network, each step moves from the current node to a randomly chosen neighbor.

5. Stationary distribution: a distribution over states that is invariant in time; it is the distribution a dynamic process (such as a random walk) approaches after infinitely many steps and then retains.

6. PageRank score: a score that reflects the quality and quantity of the links pointing to a web page; it equals the stationary probability that a random surfer is at that page.

7. Rooted PageRank: a difficulty with measures based on hitting time and commute time is their sensitive dependence on parts of the graph far away from x and y, even when x and y are connected by very short paths. Rooted PageRank counteracts this by letting the random walk from x to y periodically "reset", returning to x with a fixed probability α at each step; in this way, distant parts of the graph are almost never explored.

8. Katz score: a measure that directly sums over the collection of paths between two vertices, exponentially damped by length so that short paths count more heavily. It is a variant of the shortest-path measure: the more paths there are between two vertices, and the shorter those paths are, the stronger the connection.

9. Network transition matrix: the matrix whose entry (i, j) gives the probability that a random walker at node i moves to node j in one step; it is the adjacency matrix with each row normalized by the node's degree, so every row sums to 1.

10. Low-rank approximation for node similarity: approximating the adjacency (or similarity) matrix by a matrix of much lower rank, e.g. by keeping only the largest singular values of its SVD; similarities computed from the reduced representation suppress noise in the observed links.

Question 2

    G = nx.Graph()
    nodes = ['a', 'b', 'c', 'd', 'e']
    G.add_nodes_from(nodes)
    G.add_edge('a', 'b')
    G.add_edge('a', 'c')
    G.add_edge('b', 'c')
    G.add_edge('b', 'd')
    G.add_edge('c', 'd')
    G.add_edge('d', 'e')
    pos = nx.spring_layout(G)
    nx.draw(G, pos, with_labels=True)
    plt.show()

Betweenness centrality scores:

    nx.betweenness_centrality(G)
    {'a': 0.0, 'b': 0.16666666666666666, 'c': 0.16666666666666666, 'd': 0.5, 'e': 0.0}

Closeness centrality scores:

    nx.closeness_centrality(G)
    {'a': 0.5714285714285714, 'b': 0.8, 'c': 0.8, 'd': 0.8, 'e': 0.5}

Eigenvector-based centrality scores:

    nx.eigenvector_centrality(G)
    {'a': 0.40669315164993997, 'b': 0.5370767614119433, 'c': 0.5370767614119433, 'd': 0.4747503533204841, 'e': 0.17974951217077526}

Question 3

    G = nx.DiGraph()
    G.add_nodes_from(['A', 'B', 'C', 'D'])
    G.add_edges_from([
        ('A', 'B'), ('A', 'C'), ('A', 'D'), ('B', 'D'),
        ('C', 'A'), ('C', 'D'), ('D', 'A'), ('D', 'C')
    ])
    pos = nx.circular_layout(G)
    nx.draw(G, pos, with_labels=True)
    plt.show()

Part 1:

    nx.pagerank(G)
    {'A': 0.28895928662720183, 'B': 0.11937184094720205, 'C': 0.26023278702231667, 'D': 0.3314360854032792}

Part 2:

    nx.pagerank_numpy(G)
    {'A': 0.2889592882178486, 'B': 0.11937179832839044, 'C': 0.26023234143595697, 'D': 0.331436572017804}

Question 4

    G = nx.DiGraph()
    G.add_nodes_from(['A', 'B', 'C', 'D', 'E', 'F', 'G'])
    G.add_edges_from([
        ('A', 'B'), ('A', 'C'), ('A', 'D'), ('A', 'E'), ('B', 'A'), ('B', 'D'),
        ('B', 'E'), ('C', 'A'), ('C', 'F'), ('C', 'G'), ('D', 'A'), ('D', 'B'),
        ('E', 'A'), ('E', 'B'), ('F', 'A'), ('F', 'C'), ('G', 'A'), ('G', 'C')
    ])
    pos = nx.circular_layout(G)
    nx.draw(G, pos, with_labels=True)
    plt.show()

    def similarityRank(G, r=0.8, max_iter=100):
        sim_old = defaultdict(list)
        sim = defaultdict(list)
        for n in G.nodes():
            sim[n] = defaultdict(int)
            sim[n][n] = 1
            sim_old[n] = defaultdict(int)
            sim_old[n][n] = 0
        # recursively calculate the similarity rank between each pair of nodes
        for iter_ctr in range(max_iter):
            if _is_converge(sim, sim_old):
                break
            sim_old = copy.deepcopy(sim)
            for u in G.nodes():
                for v in G.nodes():
                    if u == v:
                        continue
                    s_uv = 0.0
                    for n_u in G.neighbors(u):
                        for n_v in G.neighbors(v):
                            s_uv += sim_old[n_u][n_v]
                    sim[u][v] = (r * s_uv /
                                 (len(list(G.neighbors(u))) * len(list(G.neighbors(v)))))
        return sim

    def _is_converge(s1, s2, eps=1e-4):
        for i in s1.keys():
            for j in s1[i].keys():
                if abs(s1[i][j] - s2[i][j]) >= eps:
                    return False
        return True

    similarityRank(G)

The resulting pairwise similarity scores (rounded to four decimal places):

            A       B       C       D       E       F       G
    A   1.0000  0.4045  0.3027  0.3546  0.3546  0.3538  0.3538
    B   0.4045  1.0000  0.3619  0.3843  0.3843  0.3696  0.3696
    C   0.3027  0.3619  1.0000  0.3801  0.3801  0.3654  0.3654
    D   0.3546  0.3843  0.3801  1.0000  0.5618  0.4138  0.4138
    E   0.3546  0.3843  0.3801  0.5618  1.0000  0.4138  0.4138
    F   0.3538  0.3696  0.3654  0.4138  0.4138  1.0000  0.5210
    G   0.3538  0.3696  0.3654  0.4138  0.4138  0.5210  1.0000

Question 5

    G = nx.Graph()
    G.add_nodes_from([1, 2, 3, 4, 5, 6, 7, 8])
    G.add_edges_from([
        (1, 2), (1, 3), (1, 4), (1, 5), (2, 3), (2, 4), (2, 8),
        (3, 4), (3, 5), (3, 6), (4, 6), (4, 7), (5, 7), (6, 7), (7, 8)
    ])
    pos = nx.circular_layout(G)
    nx.draw(G, pos, with_labels=True)

    preds = nx.jaccard_coefficient(G, [(1, 7), (1, 6)])
    for u, v, p in preds:
        print("Jaccard's coefficient (%d, %d) -> %.8f" % (u, v, p))
    print()

    preds = nx.adamic_adar_index(G, [(1, 7), (1, 6)])
    for u, v, p in preds:
        print("Adamic Adar Index (%d, %d) -> %.8f" % (u, v, p))
    print()

    preds = ...