Introduction & Concepts

Beyond the persistent "hair-ball" visualization

Rapid technological advances in modern biology have enabled biologists to conduct massively parallel experiments, which generate abundant data on interacting protein pairs, co-expressed gene pairs, and similar pairwise relationships.


Networks (nodes = elements, edges = connections) are typically used to represent these binary-relationship datasets so that they can be interpreted visually and mined for useful biological information.
However, these representations often appear as "hair-balls", with a large number of extremely tangled edges, and cannot be visually interpreted. Therefore, an interactive, multi-scale navigation method for large and complicated biological networks is badly needed!

 

Have you ever used an online mapping service such as Google Maps? Such services provide appropriately abstracted views at any magnification and let the user interactively investigate regions of interest by zooming and panning.

 

Analogously, NaviCluster was developed as an interactive, multi-scale navigation tool for large and complicated biological networks.


Please proceed to Download & Run to start using NaviCluster now!

Figure: Hair-ball visualization (15,447 nodes and 1,722,708 edges). This figure was obtained from VisANT's website.

How does NaviCluster work?

NaviCluster automatically and rapidly abstracts any portion of interest in a large network to an immediately interpretable level, using two clustering algorithms that work in sequence:

  1. An ultrafast graph clustering technique that abstracts networks of about 100,000 nodes in a second by iteratively grouping densely connected portions.
  2. A biological-property-based clustering technique that takes advantage of the biological information often provided for biological entities (e.g., Gene Ontology terms).

After passing the network through the two clustering components, NaviCluster presents an abstracted view and lets researchers flexibly choose nodes/clusters. For datasets with ~100,000 nodes, these processes can be completed in a few seconds on a typical PC with a ~2 GHz CPU and ~1 GB of memory.

Figure: NaviCluster's workflow

Ultrafast Graph Clustering:

First, NaviCluster abstracts the whole network using the ultrafast graph clustering component. It detects topologically dense, connected regions, which may correspond to biologically meaningful clusters, such as protein complexes. It rapidly identifies clusters in huge networks of about 100,000 nodes within a few seconds.
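As a rough illustration of "iteratively grouping densely connected portions", the sketch below uses a greedy, modularity-based local-moving heuristic. This page does not specify which algorithm NaviCluster actually uses, so the technique, the function name `local_moving`, and the `max_rounds` parameter are all assumptions made for illustration only.

```python
from collections import Counter, defaultdict


def local_moving(edges, max_rounds=100):
    """Greedy modularity-based grouping (illustrative, not NaviCluster's code)."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # self-loops are ignored in this sketch
            adj[u].add(v)
            adj[v].add(u)
    m = sum(len(ns) for ns in adj.values()) / 2   # number of distinct edges
    degree = {n: len(ns) for n, ns in adj.items()}
    community = {n: n for n in adj}               # every node starts alone
    total = dict(degree)                          # summed degree per community

    for _ in range(max_rounds):
        moved = False
        for node in sorted(adj):
            current = community[node]
            total[current] -= degree[node]        # take the node out
            links = Counter(community[n] for n in adj[node])

            def gain(c):
                # modularity gain (up to a constant factor) of joining c
                return links[c] - total[c] * degree[node] / (2 * m)

            best = current
            for candidate in sorted(links):
                if gain(candidate) > gain(best):
                    best = candidate
            total[best] += degree[node]           # put the node back
            if best != current:
                community[node] = best
                moved = True
        if not moved:
            break

    clusters = defaultdict(list)
    for node, c in community.items():
        clusters[c].append(node)
    return sorted(sorted(members) for members in clusters.values())


# Two triangles joined by a single bridge edge (c, d)
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("c", "d"), ("d", "e"), ("e", "f"), ("d", "f")]
print(local_moving(edges))  # [['a', 'b', 'c'], ['d', 'e', 'f']]
```

Each triangle is densely connected internally and only weakly connected to the other, so the heuristic keeps them as two separate clusters.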

 

Property-Based Clustering:

Second, if the abstraction is still insufficient because of the characteristics of the biological network, the property-based clustering component further abstracts the cluttered visualization to a level sufficient for visual interpretation.

 

This component automatically groups clusters with similar biological properties by using property information, such as Gene Ontology (GO) terms, that is often assigned to biological entities. The new clusters from the property-based clustering are used in the next component instead of those generated by the ultrafast graph clustering component, thereby reducing the number of clusters on the screen.
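One way to picture grouping by property similarity is to treat each cluster as a set of GO terms and repeatedly merge the two most similar sets. The sketch below does this with Jaccard similarity; the similarity measure, the merge strategy, the names `jaccard` and `merge_by_property`, and the sample annotations are assumptions for illustration, not NaviCluster's documented method.

```python
from itertools import combinations


def jaccard(a, b):
    """Share of terms two sets have in common (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def merge_by_property(cluster_terms, target):
    """Merge the most similar pair of clusters until `target` groups remain."""
    groups = {(name,): set(terms) for name, terms in cluster_terms.items()}
    while len(groups) > max(target, 1):
        # sorted keys make tie-breaking deterministic
        a, b = max(combinations(sorted(groups), 2),
                   key=lambda pair: jaccard(groups[pair[0]], groups[pair[1]]))
        groups[tuple(sorted(a + b))] = groups.pop(a) | groups.pop(b)
    return sorted(groups)


# Hypothetical annotations: cluster name -> set of GO term IDs
cluster_terms = {
    "c1": {"GO:0006412", "GO:0003735"},
    "c2": {"GO:0006412", "GO:0003735", "GO:0005840"},
    "c3": {"GO:0006260"},
    "c4": {"GO:0006260", "GO:0003677"},
}
print(merge_by_property(cluster_terms, target=2))
# [('c1', 'c2'), ('c3', 'c4')]
```

Here four graph clusters collapse into two groups, which mirrors how the property-based step reduces the number of clusters on the screen.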

Abstracted View with User Interface:

Third, the resultant clusters/nodes are immediately displayed with meta-edges and property edges. A meta-edge represents the number of edges between members of two clusters, and a property edge represents the similarity between the properties of two clusters.
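The meta-edge count described above can be sketched as a simple tally over the original edges. The function name `meta_edges` and the `membership` mapping (node to cluster label) are assumptions introduced for this example.

```python
from collections import Counter


def meta_edges(edges, membership):
    """Count original edges that run between each pair of distinct clusters."""
    counts = Counter()
    for u, v in edges:
        cu, cv = membership[u], membership[v]
        if cu != cv:  # edges inside one cluster do not form a meta-edge
            counts[tuple(sorted((cu, cv)))] += 1
    return dict(counts)


edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("c", "d"), ("b", "e"),
         ("d", "e"), ("e", "f"), ("d", "f")]
membership = {"a": "A", "b": "A", "c": "A", "d": "B", "e": "B", "f": "B"}
print(meta_edges(edges, membership))  # {('A', 'B'): 2}
```

Only the two edges crossing from cluster A to cluster B contribute, so the single meta-edge between them carries a count of 2.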

 

If the number of clusters is smaller than the number preferred by the user (specified through #Clusters in the Property-Based Clustering Panel), the biggest cluster is recursively split until the number of clusters equals #Clusters, or until splitting one more cluster would make the number of clusters exceed #Clusters.
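The splitting rule can be sketched as follows, assuming each cluster is stored as a nested list of its sub-clusters (with plain strings as individual nodes) and that "biggest" means the most member nodes. The representation, the names `size` and `expand_to_target`, and the choice to stop before an overshooting split are interpretations of the text above, not confirmed implementation details.

```python
def size(cluster):
    """Number of individual nodes inside a (possibly nested) cluster."""
    if isinstance(cluster, str):
        return 1
    return sum(size(child) for child in cluster)


def expand_to_target(clusters, target):
    """Split the biggest cluster until `target` is reached or would be exceeded."""
    clusters = list(clusters)
    while len(clusters) < target:
        splittable = [c for c in clusters if not isinstance(c, str)]
        if not splittable:
            break  # only single nodes are left
        biggest = max(splittable, key=size)
        if len(clusters) - 1 + len(biggest) > target:
            break  # one more split would overshoot #Clusters
        i = clusters.index(biggest)
        clusters[i:i + 1] = list(biggest)  # replace cluster by its sub-clusters
    return clusters


top_level = [[["a", "b"], ["c", "d"], "e"], ["f", "g"]]
print(expand_to_target(top_level, target=4))
# [['a', 'b'], ['c', 'd'], 'e', ['f', 'g']]
print(len(expand_to_target(top_level, target=3)))  # 2 (a split would give 4)
```

With a target of 4, the five-node cluster is split into its three sub-clusters. With a target of 3, the same split would produce four clusters, so the view stays at two.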

 

While showing the abstracted view, the interface allows researchers to interactively zoom, move laterally beyond cluster boundaries, focus on an arbitrary set of clusters/nodes, etc. Any subset of the entire network of particular interest to the researcher can be fed into the clustering components and the abstracted view of that subset is displayed.
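Feeding a subset back into the clustering components amounts to working on the part of the network induced by the selected nodes. A minimal sketch, in which the function name `induced_subnetwork` and the edge-list representation are assumptions:

```python
def induced_subnetwork(edges, selected):
    """Keep only the edges whose two endpoints are both in the selection."""
    selected = set(selected)
    return [(u, v) for u, v in edges if u in selected and v in selected]


edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("c", "d"), ("d", "e"), ("e", "f"), ("d", "f")]
print(induced_subnetwork(edges, {"a", "b", "c", "d"}))
# [('a', 'b'), ('b', 'c'), ('a', 'c'), ('c', 'd')]
```

The resulting edge list could then be clustered again in the same way as the full network.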

Figure: abstracted view

Beginners, please read the step-by-step tutorials in How to Use.

For an explanation of each component in NaviCluster, please proceed to User Interface.

Examples of navigating some datasets can be found in Examples.


Last Updated:  1/29/2011