Contest webpage:
The WilmaScope graph visualisation system was used to find a 3D layout for the graphs generated. Colin Murray implemented a fast, interactive force-directed layout engine for Wilma based on the FADE algorithm [6]. Our extensions to the algorithm provide improved layouts for scale-free networks.
3DS Max 6 was used to generate the 3D images and animations. Adel Ahmed created the scripts for parsing the Wilma xml files and generating the 3DS Max models and animations.
Traditionally, social networks are modeled as random networks where the degrees of nodes in the network follow a bell-shaped Poisson distribution. However, random network models fail to explain the existence of hubs and cliques in the network. The reason is that random network models don't capture two important factors in the formation of social networks: growth and preferential attachment [1, 2]. These two mechanisms help to explain the existence of hubs and cliques: as new nodes appear, they tend to connect to the more connected nodes, which results in the power-law distribution of the node degree. Such network is also called scale-free network which is abundant in nature and society, describing such diverse systems as the WWW, chemical network of cell and collaboration network. For the contest data, we have constructed citation network, cocitation network and coauthor network from the raw data. Statistics in their degree distribution show that they conform to the definition of scale-free network. For scale-free network, hubs in the network have special importance [3, 4], for example under malicious "attack" [5] (it may assume different meaning under different context). So in our layout, it's reasonable to emphasize those hubs in the networks.
To visualise this graph we used a "Two and a Half Dimensional" approach, constraining nodes to lie in layers (or parallel planes). Each node in the citation network represents one paper. The assignment of nodes to layers was done as follows:
To find a suitable layout for this reasonably large graph (with XXX nodes) in an interactive time we implemented a force directed algorithm using a spatial decomposition method [6] to reduce the complexity of calculating the repulsive forces between every node (normally this would be O(n^2) time per iteration).
In our movie we show an extensive tour of our various views of the co-author network, focusing on these cliques and clusters and highlighting prominant authors.
Using the first of the above research methods we identified the following research areas:
We then created a graph (network) of citations between research areas in different years. That is, we created a graph G=(V,E) where each node (u in V) represents all the papers published in a particular year in one of the above research areas and each directed edge ((u,v) in E) indicates a citation in a paper u of a paper in v.
Before visualising this graph we created additional edges between pairs of nodes representing sequential years in the same research area. Thus all nodes in each research area formed a chain ("worm"), ordered by year of publication. An embedding in 3D space was found for the graph using a force directed method, with edges in each worm having much stiffer springs than the springs representing citation edges. The positions of the nodes are constrained, such that all nodes for a given year must lie in the same plane. The planes corresponding to each year are arranged parallel to one another, and perpendicular to the z-axis, such that the top most plane (when viewed from the side) corresponds to the most recent year (2004) and the bottom most corresponds to the earliest (1993 and earlier).
Finally, the scene was rendered with spherical nodes of radius proportional to the number of papers published in the corresponding research area and year. Edges linking adjacent nodes in each worm are represented by cylinders with end radii matching that of their end nodes, creating the smooth "worm" effect. The research area worms are given distinctive colours, uniform within each worm, except where no paper was published in a particular year. In this last case the worm segment is given a darker shade. Citation edges between worms were represented with a 3D arrow of radius and colour intensity proportional to the number of citations represented.
In our interactive tool (WilmaScope) we use a cross-sectional viewer to study citations from individual years. In the movie (produced with 3DSMax), we show an animation, stepping through the citation edges emanating from a particular year. This revealed an interesting problem in the data set, namely citations from earlier papers to newer papers. We discuss this in more detail below.
A strange phenomenon has also become evident. A number of upward pointing edges have appeared. These can only be citations of future papers! Closer investigation revealed that many of these were due to older papers being reprinted in compilations and the year of publication for these papers being set to that of the more recently published volume, for example:
S. G. Eick and G. J. Wills. "Navigating large networks with hierarchies". Proceedings Visualization'93, IEEE, October 1993.
Was reprinted in "Readings in information visualization using vision to think" in 1999, and this is the definitive entry for article id="acm300752", even though this id is cited by papers from as early as 1994.
This struck us as an example of the power of visualisation. The problem in the data-set was made blatantly obvious by the visualisation, but had we been simply performing textual queries on a database it may not have become evident unless we specifically designed a query to look for it.
To visualise this graph we, again, used a "Two and a Half Dimensional" approach, constraining nodes to lie in layers (or parallel planes). The assignment of nodes to layers was done as follows:
Edges connecting author nodes to research area nodes were rendered as spikes on the surface of the author node, pointing towards the research area node. The colour of each spike matches the colour of the research area node and the length indicates the number of papers the author has published in that particular research area. For example see Image 3.1.1. Thus, an authors contribution to a specific research area is shown locally, by the various spikes, and globally by the author's position relative to the research area node. Initially we showed this relationship with an edge rendered as a line between the author and research area nodes, but we felt that the spike showed the same information with less clutter.
When arranged this layered graph with two styles of nodes. The first uses planar layers, producing a "mountain range" style effect. Important, high degree nodes literally stand out above less well connected nodes.
The alternative layout arranges the layers as concentric, spherical shells, with the import high degree nodes, on the outer-most layer. This spherical alternative possibly offers an alternative method for navigating the data set, with a natural "focus and context" metaphor. Mostly, however, this was done for aesthetic appeal.
This is an extreme close-up of G. Robertson, so it's difficult to see much of the rest of the graph. One notable point is that G. Robertson is in the top-most layer, indicating he has published papers with at least 20 other authors.
In our movie we show an extensive tour of our various views of the author network, focusing on these cliques and clusters and highlighting prominant authors.