Robust disease module mining via enumeration of diverse prize-collecting Steiner trees

Abstract

Disease module mining methods (DMMMs) extract subgraphs that constitute candidate disease mechanisms from molecular interaction networks such as protein-protein interaction (PPI) networks. Irrespective of the employed models, DMMMs typically include non-robust steps in their workflows, i.e., the computed subnetworks vary when running the DMMMs multiple times on equivalent input. This lack of robustness has a negative effect on the trustworthiness of the obtained subnetworks and is hence detrimental to the widespread adoption of DMMMs in the biomedical sciences. To overcome this problem, we present a new DMMM called ROBUST (robust disease module mining via enumeration of diverse prize-collecting Steiner trees). In a large-scale empirical evaluation, we show that ROBUST outperforms competing methods in terms of robustness, scalability and, in most settings, functional relevance of the produced modules, measured via KEGG gene set enrichment scores and overlap with DisGeNET disease genes. A Python 3 implementation and scripts to reproduce the results reported in this paper are available on GitHub: https://github.com/bionetslab/robust, https://github.com/bionetslab/robust-eval. Supplementary data are available at Bioinformatics online.

Publication
Bioinformatics