ITM Probe: analyzing information flow in protein networks
Aleksandar Stojmirovic
and Yi-Kuo Yu*
National Center for Biotechnology Information
National Library of Medicine
National Institutes of Health
Bethesda, MD 20894
United States
Abstract
Summary:
Founded upon diffusion with damping, ITM Probe is an application for modelling information flow in protein
interaction networks without prior restriction to the sub-network of interest. Given a context consisting of desired
origins and destinations of information, ITM Probe returns the set of most relevant proteins with weights and a
graphical representation of the corresponding sub-network. With a click, the user may send the resulting protein list
for enrichment analysis to facilitate hypothesis formation or confirmation.
Availability:
ITM Probe web service and documentation can be found at
www.ncbi.nlm.nih.gov/CBBresearch/qmbp/mn/itm_probe
Contact:
yyu@ncbi.nlm.nih.gov
*to whom correspondence should be addressed
1 Introduction
Protein interaction networks are presently under intensive research (Bader et al., 2008). Recently, a number of authors
have applied the concept of random walk (with truncation) to extract biologically relevant information from protein
interaction networks (Nabieva et al., 2005; Tu et al., 2006; Suthram et al., 2008). These approaches, however, do not
model information loss/leakage that naturally occurs in all networks. For example, in cellular networks, proteases
constantly degrade proteins, diminishing the strength of information propagation. We have recently developed a
mathematical framework to model information flow in interaction networks with a novel ingredient, damping/aging of
information (Stojmirovic and Yu, 2007). Implementing the theory, we have constructed a web application, ITM Probe,
which also contains a new model of information propagation: information channel.
ITM Probe models information flow in a protein interaction network through discrete random walks. Unlike classical
random walks, our model allows the walker a certain probability to dissipate or damp (that is, to leave the network)
at each step. Each walk, simulating a possible information path, terminates either by dissipation or by reaching a
boundary node.
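A single damped walk of this kind can be simulated directly. The following is a minimal sketch, not the ITM Probe implementation: it assumes one uniform survival probability `mu` per step (so `1 - mu` is the dissipation probability) and a hypothetical three-node toy graph; the function name `simulate_walk` is illustrative only.

```python
import random


def simulate_walk(adjacency, start, boundary, mu, rng):
    """Follow one damped random walk.

    adjacency: dict mapping node -> dict of neighbor -> positive edge weight
    boundary:  set of nodes at which a walk terminates on arrival
    mu:        assumed uniform probability of surviving each step
    Returns (path, reason) with reason 'dissipated' or 'boundary'.
    """
    path = [start]
    node = start
    while True:
        neighbors = list(adjacency[node])
        # The walker leaves the network with probability 1 - mu at each step.
        if not neighbors or rng.random() >= mu:
            return path, "dissipated"
        weights = [adjacency[node][n] for n in neighbors]
        node = rng.choices(neighbors, weights=weights)[0]
        path.append(node)
        if node in boundary:
            return path, "boundary"


# Hypothetical toy path graph a - b - c, with c acting as a boundary node.
toy = {"a": {"b": 1.0}, "b": {"a": 1.0, "c": 1.0}, "c": {"b": 1.0}}
rng = random.Random(0)
no_survival = simulate_walk(toy, "a", {"c"}, mu=0.0, rng=rng)
no_damping = simulate_walk(toy, "a", {"c"}, mu=1.0, rng=rng)
```

With `mu=0.0` the walk dissipates before its first step, and with `mu=1.0` it can only end at the boundary node.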
We distinguish two types of boundary nodes: sources (emitting information) and sinks (absorbing information).
ITM Probe offers three models: absorbing, emitting and channel. For any network node, the corresponding weight
returned by the emitting model is the expected number of visits to that node by a random walk originating at given
source(s). The absorbing model, on the other hand, returns the likelihood of a random walk starting at that node to
terminate at sink(s). The channel model combines the emitting and absorbing models: it contains both sources and
sinks as boundary and reports the expected numbers of visits to any network node from random walks originating at
sources and terminating at sinks.
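The three models can be illustrated on a toy graph. The following is a minimal sketch under stated assumptions rather than the ITM Probe implementation: it assumes a single uniform damping factor `mu`, row-normalized edge weights, and that a walk reaching any boundary node terminates there. The function names and the four-node path graph are hypothetical. Under these assumptions the channel weight of a node is the product of its emitting and absorbing values, which follows from the Markov property.

```python
def damped_transitions(adjacency, mu):
    """Row-normalize edge weights and scale by mu (survival probability per step)."""
    return {
        node: {nbr: mu * w / sum(nbrs.values()) for nbr, w in nbrs.items()}
        for node, nbrs in adjacency.items()
    }


def emitting_visits(P, source, boundary, tol=1e-12):
    """Expected visits to each non-boundary node by a walk leaving `source`."""
    transient = [n for n in P if n not in boundary]
    # Probability mass landing on transient nodes after the first step.
    current = {n: P[source].get(n, 0.0) for n in transient}
    visits = dict.fromkeys(transient, 0.0)
    while sum(current.values()) > tol:
        nxt = dict.fromkeys(transient, 0.0)
        for node, mass in current.items():
            visits[node] += mass
            for nbr, p in P[node].items():
                if nbr in nxt:  # mass reaching a boundary node terminates
                    nxt[nbr] += mass * p
        current = nxt
    return visits


def absorbing_probabilities(P, sinks, boundary, tol=1e-12):
    """Probability that a walk started at each non-boundary node ends at a sink."""
    transient = [n for n in P if n not in boundary]
    prob = dict.fromkeys(transient, 0.0)
    while True:
        updated = {}
        for node in transient:
            total = 0.0
            for nbr, p in P[node].items():
                if nbr in sinks:
                    total += p
                elif nbr in prob:
                    total += p * prob[nbr]
            updated[node] = total
        change = max((abs(updated[n] - prob[n]) for n in transient), default=0.0)
        prob = updated
        if change < tol:
            return prob


# Hypothetical path graph a - b - c - d with source a and sink d.
path = {"a": {"b": 1}, "b": {"a": 1, "c": 1}, "c": {"b": 1, "d": 1}, "d": {"c": 1}}
P = damped_transitions(path, mu=0.5)
boundary = {"a", "d"}
H = emitting_visits(P, "a", boundary)            # visits from the source
F = absorbing_probabilities(P, {"d"}, boundary)  # chance of ending at the sink
channel = {n: H[n] * F[n] for n in H}            # visits by source-to-sink walks
```

For a pure emitting or absorbing calculation, pass only the sources or only the sinks as `boundary`.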
Each selection of boundary nodes and dissipation rates provides the biological context for the information
transmission modelled. Small dissipation allows random walks to explore nodes farther away from their origin, while
large dissipation causes most walks to evaporate quickly. For the channel model, dissipation controls how much a random
walk can deviate from the shortest path from sources to sinks. We call the set of most significant nodes, in terms of
the weights returned, an Information Transduction Module (ITM).
2 Usage
Both the absorbing and emitting models navigate neighborhoods of selected nodes and illuminate the protein
complexes associated with them. However, the absorbing model can reveal relatively distant 'leaf' nodes linked to a sink
by a nearly unique path, while the emitting model favors highly connected clusters. The channel model is suited for
discovery of potential pathways linking proteins of interest or biological functions associated with them. Using
multiple sources may reveal the potential points of crosstalk between information channels, while a solution of multiple
sinks chosen according to a set of competing hypotheses may suggest the most biologically plausible pathways among
many possible ones.
Every model of ITM Probe requires an interaction graph, the boundary nodes (sources and/or sinks) and the
damping factors as input. The damping factors may be specified directly or by setting the desired average path-length
(emitting/channel model) or the average likelihood of absorption at sinks (absorbing model).
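One way to turn a desired average path length into a damping factor is a bisection search. The sketch below is an illustration under assumptions, not ITM Probe's procedure: it assumes a uniform survival probability `mu`, an undirected connected graph, a single source, and it defines the path length as the expected number of visits to non-source nodes before the walk dissipates or returns to the source. All names and the toy graph are hypothetical.

```python
def mean_path_length(adjacency, source, mu, tol=1e-12, max_steps=100000):
    """Expected number of visits to non-source nodes by a damped walk that
    starts at `source` and stops on dissipation or on returning to it."""
    degree = sum(adjacency[source].values())
    current = {
        nbr: mu * w / degree
        for nbr, w in adjacency[source].items()
        if nbr != source
    }
    length = 0.0
    for _ in range(max_steps):
        mass = sum(current.values())
        if mass <= tol:
            break
        length += mass
        nxt = {}
        for node, node_mass in current.items():
            degree = sum(adjacency[node].values())
            for nbr, w in adjacency[node].items():
                if nbr != source:  # returning to the source ends the walk
                    nxt[nbr] = nxt.get(nbr, 0.0) + node_mass * mu * w / degree
        current = nxt
    return length


def damping_for_length(adjacency, source, target, iterations=60):
    """Bisection for the mu whose mean path length matches `target`.

    Relies on the mean path length increasing monotonically with mu."""
    low, high = 0.0, 1.0
    if target >= mean_path_length(adjacency, source, high):
        raise ValueError("target length is not reachable on this graph")
    for _ in range(iterations):
        mid = (low + high) / 2.0
        if mean_path_length(adjacency, source, mid) < target:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


# Hypothetical path graph a - b - c - d with source a.
path = {"a": {"b": 1}, "b": {"a": 1, "c": 1}, "c": {"b": 1, "d": 1}, "d": {"c": 1}}
mu = damping_for_length(path, "a", target=2.0)
achieved = mean_path_length(path, "a", mu)
```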
Although our mathematical framework can be applied to any directed graph, our web service presently supports
only the yeast (Saccharomyces cerevisiae) physical interaction networks derived from the BioGRID (Stark et al., 2006)
database. We offer three yeast networks: Full, Reduced and Directed. The Full network consists of all interactions
from the BioGRID as an undirected graph, while the Reduced consists only of those interactions that are from
low-throughput experiments (that is, from publications reporting fewer than 300 interactions) or are reported by at least
two independent publications. The Directed network is derived from Reduced by turning all interactions labelled as
'Biochemical activity' into directed links (bait → prey).
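The filtering rule for the Reduced network can be sketched as follows. This is an illustration only: the record layout `(protein_a, protein_b, publication_id)`, the function name and the toy data are assumptions, not the BioGRID file format or the authors' code. The toy example lowers the threshold from 300 to 3 so that the rule is visible on a handful of records.

```python
from collections import defaultdict


def reduce_network(records, max_per_publication=300, min_publications=2):
    """Keep interactions from low-throughput publications or with repeated support.

    records: iterable of (protein_a, protein_b, publication_id) tuples.
    Returns a set of undirected interactions, each a frozenset of proteins.
    """
    per_publication = defaultdict(set)  # publication -> interactions it reports
    supporting = defaultdict(set)       # interaction -> publications reporting it
    for a, b, pub in records:
        pair = frozenset((a, b))
        per_publication[pub].add(pair)
        supporting[pair].add(pub)

    kept = set()
    for pair, pubs in supporting.items():
        low_throughput = any(
            len(per_publication[p]) < max_per_publication for p in pubs
        )
        if low_throughput or len(pubs) >= min_publications:
            kept.add(pair)
    return kept


# Hypothetical records; P1 and P4 count as high-throughput under the toy threshold.
records = [
    ("A", "B", "P1"), ("A", "C", "P1"), ("C", "D", "P1"),
    ("B", "A", "P2"),
    ("D", "E", "P3"),
    ("A", "C", "P4"), ("X", "Y", "P4"), ("X", "Z", "P4"),
]
reduced = reduce_network(records, max_per_publication=3)
```

Here A-B and D-E survive because a low-throughput publication reports them, and A-C survives because two publications report it.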
To assist in silico investigations on the impact of knocking out certain genes, ITM Probe allows users to specify nodes
to exclude from the network. Furthermore, it is known (Steffen et al., 2002) that proteins with a large number of
non-specific interaction partners might overtake the true signaling proteins in the information flow modeling. Therefore,
ITM Probe by default excludes from the yeast networks the proteins that may provide undesirable shortcuts, such as
cytoskeleton proteins, histones and chaperones. The user may choose to lengthen or shorten this list.
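Excluding nodes amounts to deleting them, together with their edges, before any model is run. A minimal sketch, assuming the network is held as a dictionary of neighbor dictionaries; the function name and the toy graph with its `hub` node are hypothetical.

```python
def exclude_nodes(adjacency, excluded):
    """Return a copy of the graph without the excluded nodes and their edges."""
    excluded = set(excluded)
    return {
        node: {nbr: w for nbr, w in nbrs.items() if nbr not in excluded}
        for node, nbrs in adjacency.items()
        if node not in excluded
    }


# Hypothetical graph in which 'hub' offers a shortcut between a and b.
graph = {
    "a": {"b": 1, "hub": 1},
    "b": {"a": 1, "hub": 1},
    "hub": {"a": 1, "b": 1},
}
pruned = exclude_nodes(graph, ["hub"])
```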
Output and analysis
ITM Probe outputs a list of the top ranking nodes together with an image of the sub-network consisting of these nodes
(Fig. 1). Images are produced using the Graphviz suite (Gansner and North, 2000). Each protein listed is linked to its
full description in several external databases. The number of nodes to be listed can be specified directly by the user or
determined automatically from the model results through a criterion such as participation ratio (Stojmirovic and Yu,
2007) or the cutoff value. The resulting weights for all nodes can be downloaded in the CSV format for further
analysis.
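Automatic selection by participation ratio can be sketched as below. The sketch assumes the common definition of the participation ratio, the squared sum of the weights divided by the sum of their squares, which may differ in detail from the criterion ITM Probe applies; the function names and weights are hypothetical.

```python
def participation_ratio(weights):
    """Effective number of nodes carrying the weight: (sum w)^2 / sum w^2."""
    values = list(weights)
    total = sum(values)
    squares = sum(v * v for v in values)
    return total * total / squares if squares > 0 else 0.0


def top_nodes(weights_by_node, count=None):
    """Rank nodes by weight; default to the rounded participation ratio."""
    if count is None:
        count = max(1, round(participation_ratio(weights_by_node.values())))
    ranked = sorted(weights_by_node.items(), key=lambda item: -item[1])
    return [node for node, _ in ranked[:count]]


# Hypothetical model output: two dominant nodes and two minor ones.
weights = {"n1": 4.0, "n2": 4.0, "n3": 1.0, "n4": 1.0}
selected = top_nodes(weights)
```

The ratio equals the number of nodes when all weights are equal and drops to one when a single node carries all the weight.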
Each ITM image can be rendered and saved in multiple formats (SVG, PNG, JPEG, EPS and PDF). For each
rendering, the users can choose which aspects of results to display, the color map and the scale for presentation
(linear or logarithmic). When multiple boundary points are specified, it is possible to obtain an overview of all of their
contributions simultaneously by selecting the color mixture scheme (Fig. 1). In this case, each source (channel/emitting
model) or sink (absorbing model) is assigned a basic CMY (cyan, magenta or yellow) color and the coloring of each
displayed node is a result of mixing the colors corresponding to its source- or sink-specific values for each of the
boundary points.
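A CMY mixture of this kind might be computed as in the sketch below. The exact mixing rule used by ITM Probe is not specified here, so this is an assumption: each boundary-specific value is scaled linearly to an ink amount between 0 and 1 and the three inks are converted to RGB by subtraction from white. The function name and scale are hypothetical.

```python
def cmy_mix(values, scale):
    """Mix up to three boundary-specific values into one hex RGB color.

    values: weights for one node, in cyan, magenta, yellow order
    scale:  positive weight that maps to full ink (assumed linear scaling)
    """
    inks = [min(max(v / scale, 0.0), 1.0) for v in values][:3]
    inks += [0.0] * (3 - len(inks))  # unused channels carry no ink
    # Subtractive mixing: cyan removes red, magenta green, yellow blue.
    red, green, blue = (round(255 * (1.0 - ink)) for ink in inks)
    return "#{:02x}{:02x}{:02x}".format(red, green, blue)


only_first = cmy_mix([1.0, 0.0, 0.0], scale=1.0)    # pure cyan
only_second = cmy_mix([0.0, 1.0, 0.0], scale=1.0)   # pure magenta
first_and_second = cmy_mix([1.0, 1.0], scale=1.0)   # cyan + magenta gives blue
```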
While it is possible to specify any proteins in the network as sources and sinks, not every context produces
biologically meaningful results. To facilitate biological interpretation of the users' results, we have locally implemented
a Gene Ontology (GO) (Ashburner et al., 2000) enrichment tool based on GO::TermFinder of Boyle et al. (2004). It
compares a given input list of proteins to the lists annotated with GO terms and finds those GO terms that statistically
best explain the input list. Every ITM Probe results page contains a query form allowing the user to specify the number
of the top ranking proteins to consider for GO term enrichment analysis.
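Term enrichment of this kind is commonly scored with a hypergeometric tail probability. The sketch below shows that calculation for a single term; it is an assumption about the general approach rather than a description of the locally implemented tool, it omits multiple-testing correction, and the counts are hypothetical.

```python
from math import comb


def hypergeom_pvalue(population, annotated, sample, hits):
    """P(X >= hits) when `sample` proteins are drawn without replacement from
    `population` proteins, `annotated` of which carry the GO term."""
    upper = min(annotated, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / comb(population, sample)


# Hypothetical counts: 10 proteins in total, 3 annotated with the term,
# and all 3 proteins in the input list carry it.
p_all_hits = hypergeom_pvalue(population=10, annotated=3, sample=3, hits=3)
p_any = hypergeom_pvalue(population=10, annotated=3, sample=3, hits=0)
```

A small value indicates that the term is over-represented in the input list relative to the background.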
Example
Histone acetyltransferases remodel chromatin by acetylating histone octamers and hence may play an important role
in transcription activation (Sterner and Berger, 2000). To explore the interface between them and the RNA Polymerase
II core in yeast, we choose three histone acetyltransferases (Hat1p, Gcn5p, Elp3p) as sources and a catalytic subunit
Rpo21p of RNA Polymerase II as a sink for the channel model (Fig. 1). From the color mixing image it appears
that Elp3p and Gcn5p interact with Rpo21p through a wide channel of proteins, while Hat1p seems to be remote
from Rpo21p. This prompts the hypothesis that Hat1p is not directly involved in transcription activation. Enrichment
analysis, using the 16 nodes (shown in magenta in Fig. 1) mostly visited from Hat1p, shows that Hat1p and
these nodes participate mainly in DNA replication and only indirectly in transcription regulation, thus reinforcing the
hypothesis. Similar analysis on the nodes associated with Elp3p indicates the interaction is almost exclusively through
the elongator complex. The nodes associated with Gcn5p are less specific, indicating a more generic interface, but are
all involved in mRNA transcription.
Figure 1: An example ITM from running the ITM Probe channel model.
3 Outlook
We plan to include interaction networks from additional organisms once their coverage/quality becomes comparable
to that of the yeast networks. In principle, the analysis from ITM Probe can be integrated with existing partial
knowledge to form a broad picture of possible communication paths in cellular processes. The concept of
context-specific analysis may find applications beyond biological networks.
Acknowledgments
This work was supported by the Intramural Research Program of the National Library of Medicine at the National
Institutes of Health. ITM Probe implementation relies on a variety of open source projects, which we acknowledge on our
website.
References
Ashburner, M. et al. (2000). Gene Ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat
Genet, 25, 25-29.
Bader, S. et al. (2008). Interaction networks for systems biology. FEBS Lett, 582(8), 1220-4.
Boyle, E. I. et al. (2004). GO::TermFinder--open source software for accessing gene ontology information and finding
significantly enriched gene ontology terms associated with a list of genes. Bioinformatics, 20, 3710-3715.
Gansner, E. R. and North, S. C. (2000). An open graph visualization system and its applications to software
engineering. Software: Practice and Experience, 30(11), 1203-1233.
Nabieva, E. et al. (2005). Whole-proteome prediction of protein function via graph-theoretic analysis of interaction
maps. Bioinformatics, 21 Suppl 1, 302-310.
Stark, C. et al. (2006). BioGRID: a general repository for interaction datasets. Nucleic Acids Res, 34(Database issue),
D535-9.
Steffen, M. et al. (2002). Automated modelling of signal transduction networks. BMC Bioinformatics, 3, 34.
Sterner, D. E. and Berger, S. L. (2000). Acetylation of histones and transcription-related factors. Microbiol Mol Biol
Rev, 64(2), 435-459.
Stojmirovic, A. and Yu, Y.-K. (2007). Information flow in interaction networks. J Comput Biol, 14(8), 1115-43.
Suthram, S. et al. (2008). eQED: an efficient method for interpreting eQTL associations using protein networks. Mol.
Syst. Biol., 4, 162.
Tu, Z. et al. (2006). An integrative approach for causal gene identification and gene regulatory pathway inference.
Bioinformatics, 22, e489-496.