Hive Panel of CRAN Package Dependency Network

Martin Krzywinski, author of the circular genome visualization tool Circos, proposed hive plots in 2010. The most significant difference between hive plots and traditional network layouts is that the design is driven by meaningful properties of the network (vertex degree, connectivity, centrality, etc.) rather than by aesthetics. This makes the graph interpretable and simplifies the presentation of relational data.

We selected 27 representative packages and highlight three of them in each hive plot, which gives a 3x3 hive panel. Each plot in the panel represents a specific research field. Each node of the network is placed on the axes according to its degree: the green axis represents out-degree, the orange axis represents in-degree, and the purple axis combines in- and out-degree. On each axis, nodes farther out have higher degrees. The white connections in the background show the overall connectivity of the network: nodes with higher out-degrees are heavily depended on by nodes across the whole range of degrees, and the brighter parts of the arcs tend to indicate potential clusters.
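The degree-to-axis mapping described above can be sketched in Python. This is only an illustration, not the code used to produce the panel: the function name `axis_positions`, the toy edge list, the use of a plain sum for the combined axis, and the scaling of each axis by its maximum degree are all assumptions. Following the text, where nodes with high out-degree are the ones heavily depended on, an edge is written as `(provider, dependent)`.

```python
from collections import Counter


def axis_positions(edges):
    """Return {node: (out_pos, in_pos, total_pos)}, each scaled to [0, 1].

    Each edge is (provider, dependent): the dependent package depends on
    the provider, so a provider's out-degree counts its dependents.
    """
    out_deg = Counter()
    in_deg = Counter()
    nodes = set()
    for provider, dependent in edges:
        out_deg[provider] += 1
        in_deg[dependent] += 1
        nodes.update((provider, dependent))

    # Assumption: the combined axis uses the sum of in- and out-degree.
    total = {n: out_deg[n] + in_deg[n] for n in nodes}

    # Scale by the maximum degree so the outermost node sits at 1.0.
    max_out = max(out_deg.values(), default=0) or 1
    max_in = max(in_deg.values(), default=0) or 1
    max_total = max(total.values(), default=0) or 1

    return {
        n: (out_deg[n] / max_out, in_deg[n] / max_in, total[n] / max_total)
        for n in sorted(nodes)
    }


# Hypothetical toy network, not real CRAN dependency data.
edges = [("core", "pkgA"), ("core", "pkgB"), ("core", "pkgC"), ("pkgA", "pkgC")]
positions = axis_positions(edges)
for name, (out_pos, in_pos, total_pos) in positions.items():
    print(f"{name}: out={out_pos:.2f} in={in_pos:.2f} total={total_pos:.2f}")
```

In this toy network, `core` lands at the outer end of the out-degree axis because every other package depends on it, which is the pattern the text describes for widely used packages.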

Meanwhile, in each plot we highlight three packages of interest from one research field, each in its own color, to reveal their specific connection patterns. In the first panel, the green connections represent the lattice package, a fundamental package for graphics in R that is heavily depended on by packages of all degrees. The purple connections represent the rgl package. It has few dependencies of its own, but many more packages depend on it, and they are distributed more discretely along the orange axis than those of lattice. The orange connections represent the gplots package, which contains various miscellaneous tools for plotting. Its dependency pattern points to a different role from the previous two: it is more of a handy plotting toolset than a core package. The upper-right panel shows three data import/export packages: DBI, RODBC, and RSQLite. Strikingly, although they play different roles in the community, their dependency patterns are almost the same, apart from small differences in their degrees. The central panel, which highlights the finance-related packages fBasics, fOptions, and fGarch, reveals similar features.
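The highlighting idea can be sketched as follows: every edge starts with the white background color, and edges that touch a highlighted package take that package's color. The function name `color_edges`, the toy edges, and the rule that the first matching package wins when an edge touches two highlighted packages are assumptions made for illustration. The package-to-color pairing is the one given for the graphics panel.

```python
def color_edges(edges, highlights, background="white"):
    """Return (source, target, color) triples for drawing.

    An edge takes the color of the first highlighted package that is one
    of its endpoints; all other edges keep the background color.
    """
    colored = []
    for src, dst in edges:
        color = background
        for pkg, pkg_color in highlights.items():
            if pkg in (src, dst):
                color = pkg_color
                break
        colored.append((src, dst, color))
    return colored


# Colors for the graphics panel, as listed in the text.
highlights = {"lattice": "green", "rgl": "purple", "gplots": "orange"}

# Hypothetical toy edges, not real CRAN dependency data.
edges = [("lattice", "pkgA"), ("pkgA", "pkgB"), ("pkgB", "rgl")]
colored = color_edges(edges, highlights)
for src, dst, color in colored:
    print(f"{src} -> {dst}: {color}")
```

Drawing the background edges first and the colored ones on top would keep the highlighted patterns visible against the white connections.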

Hive plots are considerably more informative and comprehensive than conventional hairball-style visualizations, especially for large networks. You can discover more interesting patterns in the other panels yourself.

The selected packages (ordered by panel 11, 12, 13, 21, 22 ...) are:

Graphics: lattice / rgl / gplots (Green / Purple / Orange)
Programming: tools / rJava / Rcpp
Data Import/Export: DBI / RODBC / RSQLite
GUI Dev Tools & Framework: tcltk / gWidgets / Rcmdr
Finance: fBasics / fOptions / fGarch
Machine Learning: e1071 / rpart / randomForest
Regression Analysis: car / leaps / quantreg
Spatial and Geo Statistics: sp / maps / fields
Time Series Analysis: forecast / timeDate / tseries
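As a small aid for reading the panel order, the sketch below maps the nine fields to two-digit panel labels. It assumes the labels are row then column, counted from the top left in row-major order. That reading agrees with the description, which places Data Import/Export in the upper-right panel and Finance in the central panel. The variable names are illustrative only.

```python
# Research fields in the order listed above.
fields = [
    "Graphics",
    "Programming",
    "Data Import/Export",
    "GUI Dev Tools & Framework",
    "Finance",
    "Machine Learning",
    "Regression Analysis",
    "Spatial and Geo Statistics",
    "Time Series Analysis",
]

# Assumption: label = row digit followed by column digit, row-major, 3 columns.
panels = {f"{i // 3 + 1}{i % 3 + 1}": name for i, name in enumerate(fields)}

for label, name in panels.items():
    print(label, name)
```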

For more details, check

Posted Dec 23, 2011
Tags community, CRAN, Dependency, hive panel, hive plot, network, software
Tools GIMP, linnet, R