Informatik 2011: 41st Annual Conference of the Gesellschaft für Informatik e.V. (GI), Berlin. Workshop article.

Integration and Exploration of Biological Networks

Hendrik Mehlhorn, Falk Schreiber

Abstract: Biological networks play a crucial role in solving complex problems in the life sciences. Modern wet-lab techniques such as GC/MS, multidimensional protein gels, and microarrays produce a continuously growing number of biological data sets from different domains such as metabolomics, proteomics, and transcriptomics. Achieving the aims of life science projects, such as developing new drugs or increasing the yield of crop plants, requires a deep understanding of the complex interactions between biological entities from different domains. However, the integration of the underlying biological networks has not yet been solved satisfactorily, owing to ambiguities and a lack of conventions. Many databases integrate data from different sources, but they are often limited to a few organisms or data domains, so a comprehensive view of integrated biological networks is not, or only partly, possible. As a result, specific analyses have to be carried out independently on the basis of the available data sets and common biological networks. To integrate biological networks from different sources and domains, an identifier mapping is needed to infer corresponding and related entities across networks, and exploration methods are needed to support investigation of the data. We present methods and an easy-to-use prototype (based on the Vanted system) for integrating and visualizing biological networks using various data sources. The idea is to employ biological network data from different sources and domains, such as metabolic pathways, protein-protein interaction networks, signal transduction pathways, and gene regulatory networks. The identifier mappings come from an easily extensible set of integrated IDMappers, which are managed by the identifier mapping framework BridgeDB.
An IDMapper is a generic source of mapping information that can be implemented in several forms, such as a web service, an SQL database, or a flat file. The variety of possible IDMapper implementations and their easy integration allow the tool to be extended quickly to meet further demands. The set of all identifier mappings constitutes the identifier mapping graph. Powerful management of this graph, including the handling of identifier synonyms and transitive identifier mapping paths, affords flexible integration of biological networks. The integrated networks can be visualized completely or partially, according to various filtering operations. Through targeted mapping and filtered visualization of integrated biological networks, the user can prepare custom systems biology analyses and publication-ready figures.
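
To illustrate the idea of an identifier mapping graph with synonyms and transitive mapping paths, the following is a minimal Python sketch. It is not the prototype's implementation and does not use the BridgeDB or Vanted APIs: the class `MappingGraph`, the helper `flat_file_mapper`, the `max_hops` parameter, and all identifiers (`g1`, `p1`, `e1`, ...) are hypothetical names chosen for illustration. Each mapper is modelled as a callable that yields identifier pairs, standing in for a flat file, database, or web service.

```python
from collections import deque


def flat_file_mapper(text, source_a, source_b):
    """Hypothetical mapper: parses tab-separated identifier pairs from a string."""
    def pairs():
        for line in text.strip().splitlines():
            left, right = line.split("\t")
            yield (source_a, left), (source_b, right)
    return pairs


class MappingGraph:
    """Undirected graph whose nodes are (source, identifier) tuples."""

    def __init__(self):
        self._edges = {}

    def add_mapping(self, a, b):
        # A mapping (or a synonym relation) is stored as an undirected edge.
        self._edges.setdefault(a, set()).add(b)
        self._edges.setdefault(b, set()).add(a)

    def add_mapper(self, mapper):
        # Any callable that yields identifier pairs can be plugged in.
        for a, b in mapper():
            self.add_mapping(a, b)

    def map_id(self, start, target_source, max_hops=3):
        """Breadth-first search for identifiers of target_source that are
        reachable from start via at most max_hops mapping edges."""
        seen = {start}
        queue = deque([(start, 0)])
        hits = set()
        while queue:
            node, hops = queue.popleft()
            if node[0] == target_source and node != start:
                hits.add(node[1])
            if hops == max_hops:
                continue  # do not follow longer transitive paths
            for nxt in self._edges.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, hops + 1))
        return hits


# Toy data (made up): gene -> protein and protein -> enzyme mappings.
graph = MappingGraph()
graph.add_mapper(flat_file_mapper("g1\tp1\ng2\tp2", "gene", "protein"))
graph.add_mapper(flat_file_mapper("p1\te1", "protein", "enzyme"))
graph.add_mapping(("gene", "g1"), ("gene", "g1_alias"))  # synonym

# Transitive path: g1_alias -> g1 -> p1 -> e1 (three hops).
enzymes = graph.map_id(("gene", "g1_alias"), "enzyme")
```

In this sketch, limiting `max_hops` is one simple way to control how far transitive paths are followed; longer paths find more related entities but also raise the risk of linking entities that are only loosely related.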