Graph datasets

The following graphs and networks are used in our experiments and evaluations. All graphs are provided in the DOT format.
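As a rough illustration of what a weighted graph in DOT form can look like, the sketch below serializes a tiny undirected graph using only the Python standard library. The function name `to_dot` and the attribute name `weight` are assumptions made for this example; the actual files may use different attribute names, so inspect a file before relying on them.

```python
def to_dot(vertex_weights, edge_weights, name="G"):
    """Serialize an undirected weighted graph as DOT text.

    vertex_weights: dict mapping vertex label -> numeric weight
    edge_weights:   dict mapping (u, v) label pairs -> numeric weight
    The attribute name "weight" is an assumption for this sketch.
    """
    def quote(label):
        # DOT quoted identifiers only need embedded double quotes escaped.
        return '"' + str(label).replace('"', '\\"') + '"'

    lines = [f"graph {name} {{"]
    for v, w in sorted(vertex_weights.items()):
        lines.append(f"  {quote(v)} [weight={w}];")
    for (u, v), w in sorted(edge_weights.items()):
        # "--" denotes an undirected edge in DOT.
        lines.append(f"  {quote(u)} -- {quote(v)} [weight={w}];")
    lines.append("}")
    return "\n".join(lines)


example = to_dot({"A": 3, "B": 2}, {("A", "B"): 2})
print(example)
```

Files of this shape can be rendered with Graphviz or loaded by graph libraries that read DOT.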

| Dataset | Description | Source |
|---|---|---|
| GD collaboration | Co-authorship network for the International Symposium on Graph Drawing, 1994-2015. The vertices represent authors, weighted by the total number of papers each published at the conference. An edge connects two authors if they published at least one paper together; the edge weight is the number of such joint papers. | Computer Science Bibliography |
| SODA collaboration | Co-authorship network for the Symposium on Discrete Algorithms, 1990-2015. | Computer Science Bibliography |
| CHI collaboration | Co-authorship network for the International Conference on Human Factors in Computing Systems, 1990-2014. | Computer Science Bibliography |
| ICML collaboration | Co-authorship network for the International Conference on Machine Learning, 1988-2014. | Computer Science Bibliography |
| KDD collaboration | Co-authorship network for the Conference on Knowledge Discovery and Data Mining, 1994-2014. | Computer Science Bibliography |
| TVCG collaboration | Co-authorship network for the IEEE Transactions on Visualization and Computer Graphics, 1995-2015. | Computer Science Bibliography |
| Recipes | The recipe-ingredient network contains 381 unique cooking ingredients extracted from 56,498 recipes. Edges are weighted by the co-occurrence of ingredients in recipes. | Ahn et al., 2011 |
| Trade | The network contains trade relationships between 211 countries. Edges are weighted by the normalized combined import/export volume between pairs of countries. | |
| Colors | The graph is constructed from the 954 most common RGB monitor colors, as defined by several hundred thousand participants in the xkcd color name survey. Edge weights are the distances in RGB space between the corresponding pairs of colors. | xkcd.com |
| Books | A network of 3,204 popular books. The edges are obtained by a breadth-first traversal of Amazon's "Customers Who Bought This Item Also Bought" links, starting from the root node, George Orwell's 1984. | Gansner et al., 2010 |
| LastFM | A network of 2,588 popular bands and musicians crawled from last.fm. The edges correspond to similarities between the musicians, as suggested by last.fm. | Gansner et al., 2009 |
| Universities | A network of US universities and their average SAT scores. The vertices are universities, and edges are constructed from similarities in their admissions data. | Fiske Guide to Colleges |
| ALGO | The network of topics of research papers published in Algorithmica, 1950-2013. The vertices are prominent words and phrases extracted from the paper titles. The edges represent similarities between topics, computed from their co-occurrence in titles. | MoCS |
| IPL | The network of topics of research papers published in Information Processing Letters, 1950-2013. The vertices are prominent words and phrases extracted from the paper titles. The edges represent similarities between topics, computed from their co-occurrence in titles. | MoCS |
| SOCG | The network of topics of research papers published in the Symposium on Computational Geometry, 1950-2013. The vertices are prominent words and phrases extracted from the paper titles. The edges represent similarities between topics, computed from their co-occurrence in titles. | MoCS |
| SODA | The network of topics of research papers published in the Symposium on Discrete Algorithms, 1950-2013. The vertices are prominent words and phrases extracted from the paper titles. The edges represent similarities between topics, computed from their co-occurrence in titles. | MoCS |
| TARJAN | The network of topics of research papers published by Robert Endre Tarjan. The vertices are prominent words and phrases extracted from the paper titles. The edges represent similarities between topics, computed from their co-occurrence in titles. | MoCS |
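The weighting scheme described for the collaboration networks (vertex weight = number of papers by an author, edge weight = number of joint papers) can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used to build the datasets: the function name `coauthorship_graph` and the input layout (one list of author names per paper) are hypothetical, and the sample papers are made up.

```python
from collections import Counter
from itertools import combinations


def coauthorship_graph(papers):
    """Build vertex and edge weights from per-paper author lists.

    papers: iterable of author-name lists, one list per paper
            (a hypothetical input layout for this sketch).
    Returns (vertex_weights, edge_weights), where edge keys are
    alphabetically ordered (author, author) pairs.
    """
    vertex_weights = Counter()
    edge_weights = Counter()
    for authors in papers:
        # Deduplicate and sort so each pair has one canonical key.
        unique = sorted(set(authors))
        vertex_weights.update(unique)                 # one paper per author
        edge_weights.update(combinations(unique, 2))  # one joint paper per pair
    return vertex_weights, edge_weights


# Made-up sample data: three papers.
papers = [["Ann", "Bob"], ["Ann", "Bob", "Cy"], ["Cy"]]
vw, ew = coauthorship_graph(papers)
print(dict(vw))
print(dict(ew))
```

Single-author papers contribute to a vertex weight but create no edges, which matches the description that an edge exists only when two authors share at least one paper.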