Background: Partitioning, or clustering, is commonly applied to simplify large networks, but for metabolic networks the underlying mass balance constraints must also be taken into account. A previously published algorithm based on random walk analysis achieves this, but cannot by itself accommodate the user's aims and priorities in the partitioning process. The Netsplitter application addresses this with a two-stage strategy. In the first stage, the random walk algorithm identifies metabolite nodes that have the potential to partition the network when cut, while the user retains full interactive control over which of them are selected in each iterative round. In the second stage, the user can choose individual subnets to be merged together again, based on information about the content of, and interconnections between, subnets, which Netsplitter supplies in graphic and tabular form.
Results: Netsplitter partitions genome scale metabolic networks efficiently, with reduced fragmentation compared to commonly used connectivity based splitting; the merging step improves this further. The program produces a complete SBML specification of each subnet. For the bacterium M. pneumoniae, the network of 189 reactions and 229 metabolites is reduced to 12 subnets, each of which corresponds closely to recognised biochemical functionality. Application to the network for the mouse, M. musculus, shows that the approach works equally well for a network with more than 2000 reactions and metabolites, which is partitioned into 162 subnets, reduced by merging to 102 subnets. Again the logical partitioning reflects biochemical reality, including cellular compartments. Subjective assessments of qualitative improvements are confirmed by calculated values of the objective partitioning efficacy index.
Conclusions: Netsplitter is a flexible tool that combines interactive user participation in partitioning with targeted assembly of subnets whose size and composition suit the purpose of a particular study. Because network structure is displayed efficiently in both matrix and layout forms, the user can maintain the overview needed for partitioning and merging decisions, even for the large and complex genome scale networks of eukaryotic organisms.
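
For orientation, the connectivity based splitting used as the comparison baseline in Results can be sketched as follows. This is a minimal illustration of the generic idea only (cut every metabolite that takes part in more than a chosen number of reactions, then collect the connected groups of reactions that remain); it is not Netsplitter's random walk algorithm. The function name `split_by_connectivity`, the `max_degree` threshold and the toy network are assumptions made for this example.

```python
from collections import defaultdict


def split_by_connectivity(reactions, max_degree):
    """Hypothetical helper: split a network by cutting highly connected metabolites.

    reactions: dict mapping a reaction id to the set of metabolite ids it uses.
    max_degree: metabolites taking part in more reactions than this are cut.
    Returns a list of sets of reaction ids, one set per subnet.
    """
    # Degree of a metabolite = number of reactions it takes part in.
    degree = defaultdict(int)
    for mets in reactions.values():
        for m in mets:
            degree[m] += 1
    kept = {m for m, d in degree.items() if d <= max_degree}

    # Union-find over reactions: two reactions sharing a kept metabolite
    # end up in the same subnet.
    parent = {r: r for r in reactions}

    def find(r):
        while parent[r] != r:
            parent[r] = parent[parent[r]]  # path halving
            r = parent[r]
        return r

    first_seen = {}
    for r, mets in reactions.items():
        for m in mets & kept:
            if m in first_seen:
                parent[find(r)] = find(first_seen[m])
            else:
                first_seen[m] = r

    groups = defaultdict(set)
    for r in reactions:
        groups[find(r)].add(r)
    return sorted(groups.values(), key=lambda s: sorted(s))


# Toy network (invented for illustration): ATP links otherwise separate pathways.
toy = {
    "R1": {"A", "B", "ATP"},
    "R2": {"B", "C"},
    "R3": {"D", "E", "ATP"},
    "R4": {"E", "F", "ATP"},
}
subnets_cut = split_by_connectivity(toy, max_degree=2)   # ATP (degree 3) is cut
subnets_uncut = split_by_connectivity(toy, max_degree=3)  # nothing is cut
```

With the threshold at 2 the toy network falls into two subnets; with the threshold at 3 it stays in one piece. A single global threshold of this kind ignores mass balance and user priorities, which is the limitation the two-stage strategy described above is meant to address.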