Link prediction

The file at the bottom of this page (datasets.zip) collects 22 networks from different sources and application domains. These networks were carefully selected to cover a wide range of properties, including different sizes, average degrees, clustering coefficients, and heterogeneity indices. A summary of the structural properties of the networks we used in our experiments can be found in the table below.

UPG is a power distribution network. HPD, YST, and CEG are biological networks. ERD, KNH, LDG, SMG, ZWL, HTC, CGS, CDM, NSC, and GRQ are co-authorship networks for different fields of study. HMT, FBK, and ADV are social networks. UAL is an airport traffic network. EML is a network of individuals who shared emails. PGP is an interaction network of users of the Pretty Good Privacy algorithm. BUP is a network of political blogs. Finally, INF is a network of face-to-face contacts in an exhibition.

Network properties (from left to right): network name, number of nodes (|V|), number of edges (|E|),
average degree (<k>), average clustering coefficient (C), average shortest path length (ASPL),
diameter (D), heterogeneity (H), and degree assortativity (r).
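
As an illustration of how the properties listed above can be computed, here is a minimal sketch in plain Python. It is not the code used to produce the table. The function name summarize_network and the edge-list input are hypothetical, the network is assumed to be undirected, simple, and connected, and heterogeneity is assumed to be H = <k^2>/<k>^2, a definition commonly used in the link prediction literature but not stated on this page.

```python
from collections import deque
from itertools import combinations


def summarize_network(edges):
    """Structural properties of an undirected, connected, simple graph.

    `edges` is an iterable of (u, v) pairs (hypothetical input format).
    """
    # Adjacency sets; self-loops are ignored and duplicate edges collapse.
    adj = {}
    for u, v in edges:
        if u != v:
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)

    n = len(adj)
    degree = {u: len(nb) for u, nb in adj.items()}
    m = sum(degree.values()) // 2
    avg_k = 2 * m / n

    # Heterogeneity, ASSUMED here to be <k^2> / <k>^2.
    heterogeneity = (sum(k * k for k in degree.values()) / n) / avg_k ** 2

    # Average of the local clustering coefficients (0 for degree < 2).
    def local_clustering(u):
        k = degree[u]
        if k < 2:
            return 0.0
        links = sum(1 for a, b in combinations(adj[u], 2) if b in adj[a])
        return 2 * links / (k * (k - 1))

    clustering = sum(local_clustering(u) for u in adj) / n

    # Breadth-first search from every node gives all shortest path lengths.
    total, diameter = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) != n:
            raise ValueError("graph is not connected")
        total += sum(dist.values())
        diameter = max(diameter, max(dist.values()))
    aspl = total / (n * (n - 1))

    # Degree assortativity: Pearson correlation between the degrees at the
    # two ends of each edge, counting every edge in both directions.
    pairs = [(degree[u], degree[v]) for u in adj for v in adj[u]]
    mean = sum(x for x, _ in pairs) / len(pairs)
    var = sum((x - mean) ** 2 for x, _ in pairs)
    cov = sum((x - mean) * (y - mean) for x, y in pairs)
    assortativity = cov / var if var > 0 else float("nan")

    return {
        "nodes": n, "edges": m, "avg_degree": avg_k,
        "clustering": clustering, "aspl": aspl, "diameter": diameter,
        "heterogeneity": heterogeneity, "assortativity": assortativity,
    }


# Toy example: a triangle (0, 1, 2) with a pendant node 3 attached to node 2.
stats = summarize_network([(0, 1), (1, 2), (0, 2), (2, 3)])
```

The all-pairs breadth-first search is quadratic in the number of nodes, so for the larger networks in the collection a graph library would likely be a better choice than this sketch.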

Data sources

  • Bu, D., Zhao, Y., Cai, L., Xue, H., Zhu, X., Lu, H., Zhang, J., Sun, S., Ling, L., Zhang, N., et al., 2003. Topological structure analysis of the protein-protein interaction network in budding yeast. Nucleic Acids Research 31, 2443-2450.
  • Isella, L., Stehlé, J., Barrat, A., Cattuto, C., Pinton, J.F., Van den Broeck, W., 2011. What's in a crowd? Analysis of face-to-face behavioral networks. Journal of Theoretical Biology 271, 166-180.
  • Krebs, V., 2008. A network of books about recent US politics sold by the online bookseller 
  • Leskovec, J., Kleinberg, J., Faloutsos, C., 2007. Graph evolution: Densification and shrinking diameters. ACM Transactions on Knowledge Discovery from Data (TKDD) 1, 2.
  • Massa, P., Salvetti, M., Tomasoni, D., 2009. Bowling alone and trust decline in social network sites, Eighth IEEE International Conference on Dependable, Autonomic and Secure Computing, DASC'09, pp. 658-663.
  • McAuley, J.J., Leskovec, J., 2012. Learning to discover social circles in ego networks, NIPS'2012, pp. 548-556.
  • Newman, M.E., 2001. The structure of scientific collaboration networks. Proceedings of the National Academy of Sciences 98, 404-409.
  • Newman, M.E., 2006. Finding community structure in networks using the eigenvectors of matrices. Physical Review E 74, 036104.
  • Peri, S., Navarro, J.D., Amanchy, R., Kristiansen, T.Z., Jonnalagadda, C.K., Surendranath, V., Niranjan, V., Muthusamy, B., Gandhi, T., Gronborg, M., et al., 2003. Development of human protein reference database as an initial platform for approaching systems biology in humans. Genome Research 13, 2363-2371.
  • Serrano, M.A., Boguña, M., Pastor-Satorras, R., Vespignani, A., 2007. Correlations in complex networks. Large scale structure and dynamics of 520 complex networks: From information technology to finance and natural sciences, 35-66.
  • Watts, D.J., Strogatz, S.H., 1998. Collective dynamics of 'small-world' networks. Nature 393, 440-442.

Fernando Berzal,
Jun 10, 2015, 3:05 AM