The UCI Network Data Repository is an effort to facilitate the scientific study of networks. Feel free to browse and download the currently available datasets. For more information about networks and the terms used to describe the datasets, click Getting Started.

« Back to data set listing

Social Network from TopCoder's Epidemic Contest

(Data set visualization unavailable)
Basic Description:
4846609 nodes
42851237 edges
Undirected
Static
Unweighted
Summary:
The graph is from an online social networking site. Sampled subgraphs were used for a TopCoder.com's contest in April 2008. Participating programmers suggested solutions for optimally slowing the spread of a simulated epidemic along each of the provided network.
Background Information:
The graphs used in for the Epidemic problem at TopCoder.com are all sampled from an original graph with 4846609 nodes and 42851237 edges, for an average degree of about 17.7. The example graphs can be found in examples.gz. The original graph is also available.
Schema:
The first line in the file contains the degrees of all the nodes, in order. Each subsequent line contains the edge list for one node. For compactness, edges only appear in one edge list. 
Processing Information:
The sampling was done by first selecting M, the number of vertices to sample, between 100 and 100,000. A single seed vertex will by selected at random from the entire graph. To add each subsequent vertex, a random vertex will be chosen from those already in the sample and one of that vertex's friends will then be chosen at random. If that node has already been added, a new random vertex and a new random friend will be chosen until an unadded vertex is found. The edge set will consist of all edges in the original graph, both of whose endpoints are in the node set after all M nodes have been selected.

The C code used to sample the original graph can be downloaded from http://www.cs.cornell.edu/~lars/sample.c.

In the examples file, each line represents example (in order). If the line is scanned one integer at a time, where the function int() consumes the next integer on the line, the following code will produce the example graph:

    nodes = int();
    for(i = 0; i<nodes; i++){
        degree = int();
        for(j = 0; j<degree; j++){
            v = int();
            addEdge(i,v);
        }
    }
Attribution/License Information:
Data set provided by Lars Backstrom.
Original website:
http://www.topcoder.com/contest/problem/Epidemic/graph.html
Citation:
@MISC{topcoder,
  title = "Social Network from TopCoder's Epidemic Contest",
  author = "",
  note = "\url{http://networkdata.ics.uci.edu/data.php?d=topcoder}",
  abstract = "The graph is from an online social networking site.  Sampled subgraphs were used for a TopCoder.com's contest in April 2008.  Participating programmers suggested solutions for optimally slowing the spread of a simulated epidemic along each of the provided network."
}   
Related Literature
We hope to maintain a list of papers that use or reference this data set, hosted at CiteULike.com. If you know of others, feel free to add to the list.

Suggest Edits