PY52CH05-Denby ARI 28 April 2014 12:49

Review in Advance first posted online on May 5, 2014. (Changes may still occur before final publication online and in print.)

Annu. Rev. Phytopathol. 2014.52. Downloaded from www.annualreviews.org by North Carolina State University on 05/16/14. For personal use only.

Network Modeling to Understand Plant Immunity

Oliver Windram,1 Christopher A. Penfold,2 and Katherine J. Denby2,3

1 Department of Life Sciences, Imperial College London, SL5 7PY, United Kingdom; email: [email protected]

2 Warwick Systems Biology Centre, University of Warwick, CV4 7AL, United Kingdom; email: [email protected], [email protected]

3 School of Life Sciences, University of Warwick, CV4 7AL, United Kingdom

Annu. Rev. Phytopathol. 2014. 52:5.1–5.20

The Annual Review of Phytopathology is online at phyto.annualreviews.org

This article’s doi: 10.1146/annurev-phyto-102313-050103

Copyright © 2014 by Annual Reviews. All rights reserved

Keywords

plant-pathogen interaction, protein-protein interactome, network inference, systems biology, gene regulatory networks

Abstract

Deciphering the networks that underpin complex biological processes using experimental data remains a significant, but promising, challenge, a task made all the harder by the added complexity of host-pathogen interactions. The aim of this article is to review the progress in understanding plant immunity made so far by applying network modeling algorithms and to show how this computational/mathematical strategy is facilitating a systems view of plant defense. We review the different types of network modeling that have been used, the data required, and the type of insight that such modeling can provide. We discuss the current challenges in modeling the regulatory networks that underlie plant defense and the future developments that may help address these challenges.

INTRODUCTION

The plant defense response, like many biological processes, consists of multiple physiological, molecular, and metabolic changes mediated by a complex web of regulatory interactions. These regulatory interactions occur at different biological levels (e.g., phosphorylation and translational and transcriptional changes), in different cells or tissues, and over different temporal scales. These regulatory interactions are also influenced by the pathogen, again on a varying spatial and temporal level, with the sum of these parts leading to a successful or unsuccessful infection. The widespread changes in a plant during pathogen infection [for example, thousands of genes changed in expression in response to Botrytis cinerea (64) and Pseudomonas syringae (58) infection] make unraveling the regulatory interactions a mammoth task. More than 600 transcription factors (TFs) changed in expression during infection by B. cinerea (64), and these represent just a single level of regulatory function. Hence, computational and mathematical approaches to this problem are vital. Network topologies and regulatory behavior are not intuitively discernible from individual connections. With the help of algorithms, the scale and complexity of the huge amounts of data being generated can be reduced to facilitate biological insight. Such algorithms can identify complex patterns, generate hypotheses about individual connections, and switch points within the network to enable emergent properties of the network to be understood. Crucially, they can predict the most informative experiments to conduct, helping to make the task of unraveling plant immunity more tractable and driving novel insight.

THE IDEAL NETWORK MODEL

Developing mathematical models to describe a system depends, in large part, on the existing knowledge about that system and the depth and breadth of available data. If there is good understanding of the system, then bottom-up approaches can be used, in which all current understanding of the system is encoded in complex mathematical models (such as coupled ordinary differential equations) that allow predictions under novel conditions. In cases in which predictions diverge from experimental observation, new connections or features can be introduced to the model as additional terms or equations, allowing refinement over successive iterations. Bottom-up approaches have, for example, proven very successful at deepening our understanding of the circadian clock (36, 50). Nonetheless, these types of approaches require a considerable amount of prior knowledge about the system and a significant amount of biological data to fit free parameters of the model. Indeed, even where large amounts of data are available, pinning some parameter values down may still prove impossible, and multiple models may explain the data equally well. When little is known about the system, or when the system is very large with a significant fraction of unknown quantities, top-down modeling is often applied. This type of approach attempts to infer properties of the system (such as network topologies or parameters that can capture the dynamics of interactions over time) from genome-scale “omics” data sets in order to identify interesting features that may be tested using more targeted experiments. In general, top-down approaches attempt to infer very many unknowns from limited data and are therefore likely to yield a high proportion of false positives or false negatives. A significant number of theoretical methods, algorithms, and tools have been developed for this type of network inference and have been reviewed and/or benchmarked (e.g., see 14, 24, 26, 40, 48, 62).
The diversity of methods is due, in part, to the difficulty of the task at hand and the type of data available in individual cases. It seems likely that no one approach will address all challenges that we might face in deciphering complex biological systems, and different approaches are likely to be better suited in different situations.
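
As a minimal sketch of the bottom-up style described above, the following integrates a hypothetical two-gene model, in which a regulator X activates a target Y, written as a pair of coupled ordinary differential equations. The equations, parameter values, and function names are illustrative assumptions and are not taken from any model cited in this review.

```python
def simulate(k_x=1.0, d_x=0.5, k_y=2.0, K=1.0, d_y=1.0, dt=0.01, t_end=50.0):
    """Forward-Euler integration of a hypothetical two-gene model.

    dX/dt = k_x - d_x * X                 (constant synthesis, linear decay)
    dY/dt = k_y * X / (K + X) - d_y * Y   (X activates Y with saturation)
    """
    x, y = 0.0, 0.0
    trajectory = [(0.0, x, y)]
    steps = int(round(t_end / dt))
    for i in range(1, steps + 1):
        dx = k_x - d_x * x
        dy = k_y * x / (K + x) - d_y * y
        x += dt * dx
        y += dt * dy
        trajectory.append((i * dt, x, y))
    return trajectory


traj = simulate()
t_final, x_final, y_final = traj[-1]
# The analytical steady state of this toy model is X* = k_x / d_x and
# Y* = k_y * X* / ((K + X*) * d_y); the simulation should approach it.
```

In a real bottom-up study the free parameters (here k_x, d_x, k_y, K, d_y) would be fitted to data, which is where the identifiability problems mentioned above arise.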

Windram

·

Penfold

·

Denby

Changes may still occur before final publication online and in print

PY52CH05-Denby

ARI

28 April 2014

12:49

The ideal in silico network model should be concise but able to capture a number of key parameters or features of the actual biological system across different molecular, temporal, and spatial levels. In the context of plant immunity, we would like a model that can accurately simulate infection, facilitating the identification of critical cellular components, and their interactions, that govern the phenotypic outcome. Complicating matters further, in the context of pathogen infection, this model should include components from pathogen and host as well as the interaction between the two, which is no mean feat. Knowledge of key components and their behaviors during biotic challenge will, in turn, help reveal how and where artificial intervention would be most beneficial for improving crop resilience to pathogens.

PROTEIN-PROTEIN INTERACTION NETWORK MODELS

A major component of plant defense is transcriptional modulation of a large proportion of the genome (58, 64). However, multiple forms of post-transcriptional modification heavily influence protein activity, with modifications such as phosphorylation and ubiquitination as well as response to calcium actively regulating defense responses (11, 12, 44). Moreover, proteomics studies are revealing extensive protein networks responsible for propagating initial pathogen detection signals to regulators driving transcriptome flux (52, 53). Perhaps the simplest type of informative network model results from combining pairwise protein-protein interaction data. However, a major challenge in plant systems pathology, and indeed many other areas of systems biology, is inferring protein-protein interactions and understanding the dynamic regulation of the proteome under different conditions.

Figure 1 Network topology. (a) An undirected network. (b) A directed network. Inward connections are edges to a node; outward connections are edges from a node. In this example, the hashed node has an inward connectivity of one, and an outward connectivity of two. The grey, hashed, and white nodes form a feed-forward loop where the grey node regulates both the hashed and white nodes, and the hashed node also regulates the white node. (c) A small-world network in which most nodes have a similar number of connections. The mean path length (average number of steps along the shortest path between all pairs of nodes) is influenced by the network size, giving the network a small-world property. (d) A scale-free network, typical of many biological networks, which is populated by nodes with varying connectivity, i.e., highly connected hubs down to nodes with single links. This reduces the mean path length between nodes compared with other small-world networks, such as that seen in panel c (4).

Large-Scale Assessment of Protein-Protein Interaction

Protein-protein interactions represent a major feature of the proteome that facilitates information propagation within the system itself and communication with other regulatory subsystems, such as the transcriptome and metabolome. The Arabidopsis genome is currently predicted to encode ∼27,000 proteins (28) representing around 729 million potential pairwise interactions. Given current methodology [typically yeast-two-hybrid (Y2H) testing], full interactome analysis in multiple plants is just not possible. Generation of an informative protein-protein interaction map for plant defense, and plant science in general, requires a prioritization pipeline to direct screening efforts along with an improvement in screening throughput. Computational prediction of protein-protein interactions also plays an important role. Despite overwhelming interaction possibilities, our existing knowledge of biological networks indicates the plant proteome is likely to exhibit the classical small-world network property of sparse connections but short connectivity distances between nodes overall. Furthermore, biological networks tend to also be scale-free, characterized by a small number of highly connected interaction hubs with even shorter connectivity distances between nodes overall (4) (see Figure 1 for examples of network types discussed in this paper). Information gleaned from protein-protein interaction networks inferred from Y2H data suggests that plant protein-protein interaction networks possess these important network characteristics, reiterating their biological relevance and highlighting the usefulness of this experimental approach (2). This impressive study identified ∼6,200 binary interactions between ∼2,300 proteins (the AI-1 network) in a screen covering all pairwise interactions for approximately 30% of the Arabidopsis genome. Through use of a manually curated gold standard of 118 interactions and a random set of 146 protein pairs, consortium members were able to establish a precision rate of ∼80% for this data set. Importantly, these data revealed significantly higher coexpression correlation between interacting pairs compared
with noninteracting control genes. This may be a particularly useful property that could be exploited to help focus protein-protein interaction screening using expression data. Analysis of the Y2H-derived network and of a network derived from literature-curated data revealed the objective power of the undirected experimental approach. The literature-curated data resulted in a network with overall longer connection distances and lower connectivity between interconnected clusters, i.e. lacking small-world properties typical of biological networks. This bias in the literature data is postulated to arise as a consequence of hypothesis-driven research (tending toward independent focal points) providing support for the use of broad high-throughput screens. Of specific relevance to plant immunity, the AI-1 network suggests that the transcriptional corepressor TOPLESS (TPL) plays a broad role in hormone signaling. TPL had been shown to interact with AUX/IAA proteins in the auxin signaling pathway and with the adapter NINJA in the jasmonate signaling pathway. In the AI-1 network, TPL interactions included components of salicylic acid (SA) and ethylene signaling as well as direct interactions with JAZ repressor proteins. Mukhtar and colleagues (45) published an extension of the AI-1 network that included interactions of 552 host immune proteins and effector proteins from the pathogens P. syringae and Hyaloperonospora arabidopsidis with the previously screened 8,000 Arabidopsis open reading frames. This resulted in a plant-pathogen immune network, PPIN-1. This study identified a large number of novel Arabidopsis protein-pathogen effector interactions, provided evidence that pathogen effectors target a limited number of host immune proteins, and demonstrated that effectors from very distantly related pathogens interact with the same host proteins. 
Theoretical analysis of scale-free networks has indicated that highly connected hubs are a weak point in such networks and that perturbing these hubs dramatically affects flux throughout the network (1). Interestingly, given that a major aim of pathogen effectors is to inhibit a successful immune response, pathogen effectors were shown to target highly connected host immune proteins (with the PPIN-1 network significantly more connected than the AI-1 network). A further key observation from analysis of this immune interaction network was that the majority of pathogen effectors were not directly connected to immune receptors but were indirectly interacting via another protein, supporting the guard hypothesis (22) as a widespread mechanism (see Figure 2). The ability of the PPIN-1
network to predict gene function was determined by testing the role of 17 proteins targeted by effectors from both P. syringae and H. arabidopsidis. Knockout lines of 15 of these proteins (including 7 interaction hubs) exhibited altered susceptibility to pathogen infection. These findings illustrate the biological insight generated by this network, identifying effector targets and network hubs as important focal points for crop protection efforts. In a targeted approach, Seo et al. (55) built a rice defense interactome by probing Y2H cDNA library pools with known regulators of defense against Xanthomonas oryzae pv. oryzae, the pathogen responsible for bacterial blight in rice. This biotic stress network was highly interconnected with an abiotic stress network generated by the same authors, and, as with the AI-1 network, the components of this stress interactome network were enriched for correlated or anticorrelated expression. The network could successfully predict gene function; mutant lines of 17 network nodes were tested and 9 showed altered immunity against X. oryzae. Although Y2H represents a convenient system for the screening of binary protein-protein interactions, it cannot determine the context of the interaction and cannot detect the interactions dependent on post-translational modifications. Popescu and colleagues (53) investigated calmodulin binding activities using protein microarrays in the presence of calcium. Calmodulins, which initiate rapid signaling events during the plant immune response, bind calcium, causing conformational changes that alter their interaction with other proteins. Using three calmodulins and four calmodulin-like proteins as probes, these researchers identified interactions in the presence of calcium between the probe proteins and 1,113 Arabidopsis proteins present on the array (53). 
The work revealed that although many of the probe proteins (calmodulins and calmodulin-like proteins) bound multiple targets, the majority of interactions were specific to a small number of probe proteins, indicating specificity of binding and hence function. Incorporating such condition-dependent networks and extracting condition-dependent information from large-scale protein-protein interaction networks are key challenges for the future. Membrane-protein interactions represent an important component of plant defense, especially in the initial stages of pathogen detection, which is managed by a number of receptor-like kinases (22). These represent a particular challenge for high-throughput analysis. Split ubiquitin systems have been used to investigate membrane-protein interactions in Arabidopsis (27). Numerous receptor-like kinases functioned as hubs within the network, highlighting the important role these components play in disseminating signal perception information.
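
The topological terms used in this section (connectivity, hubs, and mean path length) can be made concrete with a small sketch. The code below computes node connectivity and mean shortest-path length for two hypothetical toy networks, a hub with four partners and a simple chain; the networks and function names are illustrative assumptions, not data from the studies discussed.

```python
from collections import deque


def build_graph(edges):
    """Adjacency sets for an undirected network given (node, node) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def degrees(graph):
    """Connectivity (number of interaction partners) of every node."""
    return {node: len(neighbours) for node, neighbours in graph.items()}


def mean_path_length(graph):
    """Average shortest-path length over all connected node pairs (BFS)."""
    total, pairs = 0, 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nxt in graph[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


# Hypothetical toy networks with five nodes each.
star = build_graph([("H", "P1"), ("H", "P2"), ("H", "P3"), ("H", "P4")])
chain = build_graph([("P1", "P2"), ("P2", "P3"), ("P3", "P4"), ("P4", "P5")])

hub_degree = degrees(star)["H"]
star_mpl = mean_path_length(star)
chain_mpl = mean_path_length(chain)
```

With the same number of nodes and edges, the network organized around a hub has the shorter mean path length, which is the intuition behind the short connectivity distances attributed to hub-containing networks above.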

In Silico Network Inference

In an attempt to populate the large areas of the protein-protein interaction space that currently remain unexplored, a number of computational methods have been developed to predict protein-protein interactions. One approach is based on the notion that protein domain-domain interactions are conserved across species boundaries, thus facilitating protein-protein interaction inference based on experimental data derived in different species. On the basis of this principle, protein-protein interaction networks have been predicted in rice (69), Brassica (65), and Arabidopsis (18). These methods generate network structures very similar to those identified using experimental techniques, typically scale-free and small-world in structure. However, this classical network structure breaks down as thresholding based on interaction evidence confidence is applied. That is, networks constructed using only interactions with more than one line of evidence tend to exhibit sparser connectivity with longer average connection paths between nodes than networks without this thresholding applied. These properties are similar to those of interaction networks derived from literature-curated information, which appear biased by hypothesis-driven research that focuses on selected nodes with prior evidence (2).
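
The domain-based principle and the effect of evidence thresholding can be illustrated with a minimal sketch. Here a protein pair is predicted to interact if any of their domains are recorded as interacting, and the prediction is kept only if it is supported by a minimum number of independent evidence sources. All identifiers, the data layout, and the function name are hypothetical; the published pipelines (18, 65, 69) are considerably more elaborate.

```python
from itertools import combinations


def predict_interactions(protein_domains, known_domain_pairs, min_evidence=1):
    """Predict protein pairs whose domains are recorded as interacting.

    protein_domains: dict protein -> set of domain identifiers
    known_domain_pairs: dict frozenset({d1, d2}) -> set of evidence sources
    min_evidence: minimum number of distinct evidence sources required
    """
    predictions = {}
    for p, q in combinations(sorted(protein_domains), 2):
        evidence = set()
        for d1 in protein_domains[p]:
            for d2 in protein_domains[q]:
                evidence |= known_domain_pairs.get(frozenset((d1, d2)), set())
        if len(evidence) >= min_evidence:
            predictions[(p, q)] = evidence
    return predictions


# Hypothetical proteins, domains, and evidence sources.
protein_domains = {"protA": {"domX"}, "protB": {"domY"}, "protC": {"domZ"}}
known_domain_pairs = {
    frozenset(("domX", "domY")): {"species1", "species2"},
    frozenset(("domX", "domZ")): {"species1"},
}

loose = predict_interactions(protein_domains, known_domain_pairs, min_evidence=1)
strict = predict_interactions(protein_domains, known_domain_pairs, min_evidence=2)
```

Raising min_evidence removes edges from the predicted network, which is the mechanism by which thresholding produces the sparser, longer-path networks described above.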

[Figure 2 legend keys: Hpa effectors; Psy effectors; interact with Hpa effectors; interact with effectors from two pathogens; interact with Psy effectors; NB-LRRs; defense; RLKs; immune interactions]

In one of the first attempts to reconstruct a cross-species interactome, researchers have used a computational approach to predict protein-protein interactions between host and pathogen proteomes (35). Li and collaborators (35) predicted 3,072 interactions between 1,442 Arabidopsis host proteins and 119 proteins from Ralstonia solanacearum. These potential interactions may mediate pathogen manipulation of host defense or facilitate pathogen recognition by the plant. The authors combined the predicted cross-species interactions with experimentally derived Arabidopsis protein-protein interactions and used cluster analysis to identify modules within the network. This highlighted the fact that pathogen effector targets tend to function as bottlenecks in the network, i.e., they link highly connected modules with different biological functions. In directed networks, bottleneck nodes are thought to control the flow of information and be critical points in a network (66). It remains to be seen how many of the computational Arabidopsis-Ralstonia interactions are true in vivo interactions and hence whether the pathogen effectors are truly targeting bottleneck nodes. It is clear though that computational predictions along with experimental data (45) create novel opportunities for plant defense network modeling. Such information offers the potential to generate integrated network models that capture the behavior of both pathogen and plant systems, revealing how they may interact and influence each other.
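
One common way to quantify how strongly a node acts as a bottleneck is betweenness centrality, the number of shortest paths between other nodes that pass through it. The sketch below implements Brandes' algorithm for an unweighted, undirected network and applies it to a hypothetical toy network of two modules joined by a single bridging node. We use this measure for illustration only and do not claim that it is the exact procedure used in the studies cited above.

```python
from collections import deque


def build_graph(edges):
    """Adjacency sets for an undirected network given (node, node) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def betweenness(graph):
    """Betweenness centrality (Brandes' algorithm), unweighted and undirected.

    Each unordered pair of nodes is counted once.
    """
    score = {v: 0.0 for v in graph}
    for s in graph:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}
        dist = {v: -1 for v in graph}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest nodes back towards s.
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Every unordered pair was visited from both of its end points.
    return {v: c / 2.0 for v, c in score.items()}


# Hypothetical toy network: two triangular modules linked through node "B".
toy = build_graph([
    ("a1", "a2"), ("a1", "a3"), ("a2", "a3"),
    ("c1", "c2"), ("c1", "c3"), ("c2", "c3"),
    ("a1", "B"), ("B", "c1"),
])
scores = betweenness(toy)
```

In this toy example the bridging node has only two partners yet carries every shortest path between the two modules, so a bottleneck need not be a hub.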

The Dynamic Proteome and Quantitative Proteomics

Protein-protein interaction network models will ultimately reveal a whole array of interaction possibilities within the proteome. However, as with many interactive systems, the overall function of the network is highly dependent on the actual proteome complement in a cell, and the proteome can vary significantly between different conditions and cellular compartments. The proteome complement under particular circumstances can be approximated through transcriptome and polysome profiling, but given the variation in protein half-lives, quantitative snapshots of the actual proteome itself are likely to be far more informative. The field of quantitative proteomics is actively pursuing methods to address this question, and such data will dramatically improve protein-protein interaction network models and their ability to accurately capture interactions occurring in vivo. Elmore et al. (16) used a quantitative proteomics approach to show that activation of the resistance gene RPS2 induced changes in expression of more than 1,000 proteins within the plasma membrane–enriched fraction. Such changes dramatically affect the topology of the protein-protein interaction network, and integrating this quantitative information will help refine the regulatory network operating during plant defense.

Figure 2 Plant-pathogen interaction network. Nodes represent open reading frames encoding either pathogen effectors [Hyaloperonospora arabidopsidis (Hpa; purple) and Pseudomonas syringae (Psy; green)], known components of the plant immune response [N-terminal domain of NB-LRR (nucleotide-binding–leucine-rich repeat) proteins (red), cytoplasmic domain of receptor-like kinases (RLKs) (pink), and other proteins involved in defense (blue)], or interacting proteins (gray nodes). It can be clearly seen that effectors from the two very different pathogens target a common set of Arabidopsis proteins and that the direct targets of most pathogen effectors are a set of proteins that are themselves directly interacting with immune receptors, i.e., proteins that form bridges between the effectors and immune receptors. Image adapted with permission from Mukhtar et al. (45).

RELEVANCE NETWORK MODELS

Extensive collections of gene expression data are currently publicly available and capture snapshots of profiles from different tissues, developmental stages, and environmental conditions.
Usually, individual experiments provide limited prospects for network inference; however, combining these data can significantly improve the potential for biological insight. The simplest methods for inferring networks from collections of expression data are based on using pairwise evaluations of similarity between gene profiles to produce static gene coexpression networks (reviewed in 60). The most common metrics used to compute coexpression are correlation coefficients, such as Pearson’s (PCC) or Spearman’s. Other methods use mutual information (MI) [a notable example being the Algorithm for the Reconstruction of Accurate Cellular Networks (ARACNe) (41)], which enables nonlinear relationships between gene expression profiles to be captured. Because pairwise scores may be quickly and efficiently computed, this type of approach can be applied to biological systems with many genes. The resulting coexpression networks are undirected networks (i.e., links between genes do not have a direction or causality) with edges representing similar expression of nodes (genes) across a series of expression data sets, judged by a threshold of correlation or other distance metric. It should be noted that this type of network cannot distinguish direct from indirect interactions; for example, if expression of two coregulated genes B and C is correlated with expression of their regulator, gene A, they are, by definition, correlated with one another and therefore connected within the network. The data sets used tend to be large collections of static (single time point) data from a variety of conditions and a variety of mutant genotypes. These are the types of expression analyses most frequently carried out and publicly available. Coexpression networks can be condition independent or condition dependent.
Although condition-independent approaches tend to use all available expression data, condition-dependent networks use expression data sets that are related to a specific context, whether this is the developmental stage, specific tissues, or environmental conditions [for example, SeedNet (5) successfully uses data sets specifically concerned with germination]. Obtaining useful coexpression networks is highly dependent upon having sufficient data; hence, the feasibility of a condition-dependent network depends on the number of suitable data sets available. Recently, Zheng et al. (68) used coexpression network inference to investigate plant immunity. They used four transcriptome data sets of citrus infected with Candidatus Liberibacter asiaticus bacterium. These transcriptome profiles were generated from leaves inoculated with the pathogen, and covered early and late time points. Correlation matrices were calculated for each data set using PCC and were combined together via a weighted sum (according to the number of samples in the data set) to produce a weighted correlation matrix. This matrix was used to construct a network model in which nodes represent probes on the array and edges represent the weighted sum of coexpression. The network contained 3,705 nodes and 56,287 edges, with edges included if the absolute value of the weighted PCC was greater than 0.9. By mapping to Arabidopsis IDs, nodes in the network were annotated with Gene Ontology (GO) terms (19), and a subnetwork consisting of nodes associated with SA response was extracted. Although many of the reported findings could have been achieved by relatively simple analysis of transcriptome data, the network modeling enabled hub genes, potentially key components of the process, to be identified and novel genes to be linked to defense.
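
The weighted-sum construction described above can be sketched in a few lines. The code computes a PCC for every gene pair in each data set, combines the values with weights proportional to the number of samples, and keeps an edge when the absolute weighted PCC exceeds a threshold. The normalization of the weights, the toy expression values, and the function names are our assumptions for illustration; only the general scheme and the 0.9 threshold come from the description of Zheng et al. (68).

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def weighted_coexpression(datasets, threshold=0.9):
    """Combine per-data-set PCCs with sample-size weights and threshold them.

    datasets: list of dicts, gene -> list of expression values (one per sample)
    Returns {(gene_a, gene_b): weighted PCC} for |weighted PCC| > threshold.
    """
    genes = sorted(datasets[0])
    total_samples = sum(len(d[genes[0]]) for d in datasets)
    edges = {}
    for i, a in enumerate(genes):
        for b in genes[i + 1:]:
            r = sum(len(d[a]) / total_samples * pearson(d[a], d[b])
                    for d in datasets)
            if abs(r) > threshold:
                edges[(a, b)] = r
    return edges


# Hypothetical expression values for four genes in two data sets.
dataset_1 = {"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8],
             "g3": [4, 3, 2, 1], "g4": [1, 3, 2, 4]}
dataset_2 = {"g1": [1, 2, 3], "g2": [1, 2, 3],
             "g3": [3, 2, 1], "g4": [2, 1, 3]}

edges = weighted_coexpression([dataset_1, dataset_2])
```

Note that g2 and g3 are linked purely because both track g1, a small instance of the inability of coexpression networks to separate direct from indirect relationships.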
The main use of coexpression networks is in gene discovery; however, several groups have attempted to construct regulatory networks using large collections of static expression data. One of the earliest attempts to infer an Arabidopsis regulatory network was carried out by Carrera et al. (10) using publicly available genome-wide expression data. Using an extensive collection of expression data from a variety of conditions, tissues, and mutant genotypes, the authors inferred regulatory relationships between TFs in the Arabidopsis genome. This was achieved first by identifying the global network topology by computing MI between gene pairs. In a subsequent step, gene expression was assumed to correspond to a linear differential equation under steady state, such that the expression of a gene could be expressed as the weighted sum of the expression
of connected genes, with individual parameters fitted from the data. The resulting transcriptional network overall is weakly connected, with most genes regulated by fewer than three TFs. However, the subnetworks corresponding to biotic stress (in this paper, genes annotated with the terms “response to other organism,” “immune response,” and “systemic acquired resistance”) show much greater connectivity. Specifically, Carrera et al. (10) found lower inward connectivity (edges to a node; Figure 1b) and higher outward connectivity (edges from a node; Figure 1b), indicating that expression of genes involved in these processes is controlled by a relatively small number of master regulators. Indeed, one of the most highly outward-connected nodes in the network is ERF1, encoding a key defense regulator. Interestingly, subnetworks corresponding to abiotic stress responses are similarly highly connected, suggesting that this network topology enables suitably robust and rapid responses to changing environmental conditions. Feed-forward loops (Figure 1b), a motif thought to confer robustness on a network (29), are more common in subnetworks associated with stress responses than in nonstress response modules. It is worth bearing in mind that this network was built from condition-independent expression data, yet subnetworks representing the immune response are some of the most highly connected in the network. This suggests coordinated transcriptional responses are a marked feature of the immune response. A similar genome-wide approach was carried out by Ma et al. (37) using a graphical Gaussian model (GGM) to infer causal regulatory relationships between genes. GGMs represent a distinct improvement over correlation-based approaches in that they use partial correlations, rather than pairwise correlation coefficients, to indicate connections. Partial correlations measure the degree of association between two genes with the effect of other genes removed.
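
The effect of conditioning can be seen in a minimal numerical sketch using three hypothetical genes, in which a regulator A drives both B and C. The code uses the standard first-order partial correlation formula, r_BC.A = (r_BC - r_AB r_AC) / sqrt((1 - r_AB^2)(1 - r_AC^2)), which conditions on a single gene; a GGM conditions on all other genes simultaneously, so this is a simplification. The expression values are constructed by us so that the gene-specific noise is exactly uncorrelated with A.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_correlation(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical regulator profile and gene-specific noise patterns chosen to
# be uncorrelated with A and with each other.
A = [1, 2, 3, 4, 5, 6, 7, 8]
noise_b = [1, -1, -1, 1, 1, -1, -1, 1]
noise_c = [1, -1, 1, -1, -1, 1, -1, 1]
B = [a + e for a, e in zip(A, noise_b)]
C = [a + e for a, e in zip(A, noise_c)]

r_bc = pearson(B, C)                         # strong pairwise correlation
r_bc_given_a = partial_correlation(B, C, A)  # vanishes once A is accounted for
```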
For instance, if TF A regulates genes B and C, expression of B and C is highly correlated. However, using partial correlation we can effectively remove the influence of A, revealing that no direct correlation exists between B and C. Overall, this allows for a better estimation of direct relationships through the removal of indirect ones. This GGM approach again used extensive publicly available genome-wide expression data, and the topology of the network revealed relatively low connectivity overall, with a few major hubs predicted to regulate a large number of genes. Although subnetworks relating to plant immunity were identified, no further analysis of these was carried out. Recently, Ma et al. (38) used information on promoter motifs to inform analysis of their genome-wide coexpression network in an attempt to elucidate the underlying regulatory network. This is based on the premise that coexpression exhibited by genes in the network may be mediated by a common regulator. For every gene in the network, the likelihood of it being regulated by a specific motif was calculated based on the enrichment of that motif in the gene promoter [1 kb upstream of the 5′ untranslated region (UTR) or translational start site if no 5′ UTR was annotated] and whether the position of the motif was biased toward the transcriptional start site. Genes with a high probability of regulation by a specific motif are extracted from the genome-wide network to form a subnetwork in which modules of highly connected genes are identified as expression modules. Expression modules are predicted to be regulated by a specific motif that indicates at least the TF family that the regulator belongs to, and members of this family can be tested experimentally for binding to module gene promoters. One advantage of this type of analysis is that the regulating TF does not need to be part of the coexpression network, or even regulated at the transcriptional level, to be identified.
However, although this method is based on the latest understanding of TF-binding motifs, it is quite feasible that a single motif could drive regulation of a gene; such motifs might not be identified using the enrichment criterion. Sato et al. (54) make use of dimensionality reduction techniques using expression data from a range of Arabidopsis mutants to build a static gene regulatory network model of the plant immune response. Dimensionality reduction methods (which include principal component analysis) enable the data to be described in fewer dimensions while capturing most of the variation. A network was
constructed by assigning mutants to a set of similar mutants with weighted edges that represent the similarity of expression profiles between two nodes (with nodes being mutated genes). After removing the major similarities in the data, the method is applied recursively to the residual information, helping to uncover weaker regulatory relationships. The network does not infer causality, but, by using expression data from a range of mutant genotypes, edges do imply a type of shared regulation, not just coexpression. The expression of 571 plant defense genes was profiled in 22 mutants and wild-type plants 6 hours after inoculation with the avirulent bacterial pathogen P. syringae pv. tomato DC3000 AvrRpt2. The network inference algorithm joins genes whose mutants cause similar effects on the expression profile compared with wild type. From this network approach, edges represent some type of regulatory relationship between the two nodes (mutated genes). This relationship may be that the two genes coregulate a group of genes, that one gene regulates the other (i.e., is upstream in a regulatory pathway), that both genes are coregulated by another, or combinations of these. Links are positive when the change in expression is in the same direction in both mutants and negative when the changes are opposite. However, the network edges do not have direction and hence causality; they simply represent a regulatory link between the two nodes. The nodes in the network are limited to the 22 genes corresponding to the mutants, with the additional genes in the expression profiles being used to assess the connection between two nodes; no novel defense genes can be included in the network.
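The idea of joining mutants whose effects on expression are similar, with a sign on each link, can be sketched as follows. This is a deliberately simplified stand-in that uses plain Pearson correlation between mutant-minus-wild-type profiles; it does not reproduce the recursive dimensionality-reduction method of Sato et al. (54). The mutant names, the five reporter genes, the expression values, and the 0.8 threshold are all hypothetical.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical log2 expression of five reporter genes after inoculation.
wild_type = [5.0, 6.0, 7.0, 4.0, 8.0]
mutants = {
    "mutA": [3.0, 4.5, 7.1, 5.5, 8.0],
    "mutB": [3.2, 4.0, 6.9, 5.0, 8.2],
    "mutC": [6.5, 7.4, 7.0, 2.8, 7.9],
}

# Effect of each mutation: mutant minus wild type, gene by gene.
effects = {name: [m - w for m, w in zip(profile, wild_type)]
           for name, profile in mutants.items()}

def signed_edges(effects, threshold=0.8):
    """Link two mutants when their effect profiles are strongly (anti)correlated.

    A positive weight means the two mutations shift expression in the same
    direction; a negative weight means they shift it in opposite directions.
    Edges are undirected and carry no causal information.
    """
    edges = {}
    for a, b in combinations(sorted(effects), 2):
        r = pearson(effects[a], effects[b])
        if abs(r) >= threshold:
            edges[(a, b)] = r
    return edges

edges = signed_edges(effects)
for (a, b), weight in edges.items():
    print(a, b, "positive" if weight > 0 else "negative", round(weight, 2))
```

Note that the nodes are the mutated genes, while the profiled genes only serve to measure how similar two mutations are, which matches the limitation described above.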
However, a key advantage of this approach is that the network components themselves do not need to be transcriptionally regulated (it is their effects on transcription that are measured), an important point, as although the immune response leads to large-scale transcriptional reprogramming, there are many non-transcriptionally regulated components. It is also likely that the use of a defined focused set of genes for expression profiling (571 immune response genes) prevents detection of indirect effects of the mutation propagating throughout the network and may enhance network inference ability. Certainly, the network produced by the authors could predict almost all the validated regulatory interactions between the nodes known in the literature. The key findings with respect to the immune response were that the defense network is highly interconnected [mirroring findings from Carrera et al. (10)] and that the network is characterized by negative relationships between different signaling sectors, i.e., modules in the network involved in the same signaling pathway. Negative relationships were absent within a signaling sector but were the predominant relationship between sectors, leading the authors to predict that switching between sectors plays a key role in minimizing the fitness costs of defense and adapting the defense response to the specific invading pathogen. Several new nonintuitive regulatory links between sectors were predicted, and mutual negative regulation was confirmed between SA signaling and early microbe-associated molecular pattern (MAMP) responses. Crucially, this paper demonstrated that one not particularly extensive data set combined with network modeling can predict multiple regulatory interactions that have taken many years of more traditional research to elucidate. Expression profiling of genetic perturbations appears to provide sufficient information to generate accurate network models in a cost-effective manner.
Given that the method is effectively a dimensionality reduction and regression application, it should be computationally efficient even in much larger systems. Recently, a novel approach to coexpression networks was developed (9), again using a relatively small data set. The authors took an expression data set of leaf profiles at the same developmental stage from three accessions of Arabidopsis grown in six different labs. After controlling for the effects of the lab and accession on gene expression, the residual variation (presumably arising from subtle changes in the environment) was used to infer a codifferential expression network using ENIGMA (39). Essentially, expression profiles are discretized into upregulated, downregulated, and unchanged, according to the log-fold change. Codifferential expression subsequently measures
if two genes are significantly upregulated or downregulated together over various treatments, according to combinatorial statistics. This network performed as well as a network inferred from a same-sized data set of genome-wide expression after the usual perturbations (e.g., stress treatment) and was able to predict the involvement of ILL6, an amidohydrolase, in the jasmonate response. This study is one of the first to show that networks containing meaningful biological information can be generated from limited data sets making use of natural variation, meaning that network modeling approaches are more feasible for nonmodel organisms, e.g., many crop plants.
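The discretization and codifferential expression steps described above can be sketched in a few lines. This is an illustration under stated assumptions, not the ENIGMA implementation: the fold-change threshold of 1.0, the hypergeometric tail used as the "combinatorial statistic", and the two hypothetical genes (`gene_x`, `gene_y`) with made-up log2 fold changes are all choices made here for brevity.

```python
from math import comb

def discretize(log_fold_changes, threshold=1.0):
    """Map log-fold changes to +1 (up), -1 (down), or 0 (unchanged)."""
    return [1 if v >= threshold else -1 if v <= -threshold else 0
            for v in log_fold_changes]

def hypergeom_tail(k, K, n, N):
    """P(overlap >= k) when K of N conditions are marked for one gene and
    n of N for the other, assuming the two genes are independent."""
    return sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(K, n) + 1)) / comb(N, n)

def codifferential_pvalue(lfc_a, lfc_b, direction=1, threshold=1.0):
    """P-value for two genes sharing `direction` (+1 up, -1 down) in at least
    as many conditions as observed."""
    states_a = discretize(lfc_a, threshold)
    states_b = discretize(lfc_b, threshold)
    in_a = [s == direction for s in states_a]
    in_b = [s == direction for s in states_b]
    overlap = sum(a and b for a, b in zip(in_a, in_b))
    return hypergeom_tail(overlap, sum(in_a), sum(in_b), len(in_a))

# Hypothetical log2 fold changes for two genes across ten conditions.
gene_x = [1.8, 0.1, 2.2, -0.2, 1.5, 0.0, 1.9, 0.3, -0.1, 1.2]
gene_y = [1.4, 0.2, 1.9, 0.1, 1.7, -0.3, 1.1, 0.0, 0.2, 1.6]

p_up = codifferential_pvalue(gene_x, gene_y, direction=1)
print(f"co-upregulation p-value: {p_up:.4f}")
```

Because the test works on discretized states, it depends only on when two genes change together, not on the magnitude of the change.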

FUNCTIONAL ASSOCIATION NETWORK MODELS

Extending the guilt-by-association strategy of coexpression networks, approaches pioneered by Insuk Lee and Edward Marcotte have demonstrated that integrating different types of data into a unified model provided a platform with impressive predictive properties (30–33). The method generates an undirected functional association network, with weighted edges providing a measure of interaction strength between the two linked nodes. Functional association networks typically integrate multiple data sources (including simple pairwise genetic interaction, protein-protein interaction, and coexpression data), and hence edges do not represent one specific type of interaction. Here, different networks are empirically assigned a likelihood according to their ability to recapitulate a particular set of pathways (for example, a pathway in a metabolic database), which then provides a level playing field on which to leverage the multiple data types within a Bayesian setting, allowing a probabilistic interpretation of the integrated data. Although this allows a multitude of differing data types to be combined, it requires some degree of knowledge about the system with which to objectively assign individual likelihoods. In multiple organisms (including nematodes, yeast, mice, and, more recently, plants), these researchers have shown that this guilt-by-association framework was highly effective at identifying novel components of a biological process. Genes closely linked to characterized genes with known roles in cellular processes were consistently shown to be involved in processes similar to their previously validated neighbors. Furthermore, highly connected nodes within these networks are also more likely to be essential for organism viability (31). A drawback of these networks is that they rely on extensive publicly available collections of data and hence are restricted to commonly studied organisms.
The first plant functional association network produced was AraNet in 2010 (30). This network drew from multiple diverse data sets and used orthology to incorporate additional information from other organisms, including yeast, fly, worm, and human. This genome-wide network has the power to predict gene functions (in terms of biological process) using the guilt-by-association technique. AraNet predicted gene functions for nearly 5,000 genes of unknown function that lacked any annotation, and the authors validated the predicted functions for two out of three genes tested. However, there were challenges. AraNet was less able to predict involvement in plant-specific processes (presumably because there is less data available for these), and epistasis may mask the involvement of a gene in a process, leading to false-negative results. In Arabidopsis, this probabilistic functional gene network has not been used for prediction of immune responses (at least in the published literature), but a similar network model has successfully predicted immune response genes in rice (33). Although fewer genome-wide data, essential for functional gene networks, are available for rice, the cross-species integration employed in the approach by Lee et al. (33) enabled a successful network model (RiceNet) to be built. Although exploiting the knowledge from orthology, the inclusion of rice data sets makes RiceNet more accurate than simply using AraNet to transfer linkages to rice genes. The authors targeted the immune response to test the predictive power of their network. They used a set of 15 defense-related genes with known disease-resistance
phenotypes as a query and identified genes that were associated with this query set in the network. Taking the genes most highly connected to the query genes and those with no known function in immunity, 14 candidate genes were selected from more than 800 with a predicted role in defense. These fourteen were prioritized by testing for protein-protein interaction with at least one component of the Xa21 [a rice resistance (R) gene] interactome (55). Three of the five genes tested had a role in regulation of Xa21-mediated disease resistance. This is despite Xa21 not being in the network (it is not present in Oryza sativa), indicating the conservation of signaling events downstream of R-gene activity. Excitingly, the functional gene network has cross-species predictive power, accurately predicting gene function in maize, another monocotyledonous plant. Given the expanded genome sizes of most crop plants, it may be necessary to combine a prioritization step with the functional gene network analysis, but this approach provides a powerful tool for annotating genes of unknown function and for discovering genes involved in a particular biological process.
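The query-based guilt-by-association step described above (start from genes with a known phenotype, then rank their network neighbors) can be sketched as below. The scoring rule used here, the sum of edge weights linking a candidate to the query set, is an assumption made for illustration and may differ from the scoring used for AraNet or RiceNet. The gene identifiers (`q1`..`q3`, `g1`..`g3`) and edge weights are hypothetical.

```python
from collections import defaultdict

def rank_candidates(edges, query_genes, top=3):
    """Rank non-query genes by the total weight of their edges to query genes.

    `edges` maps an undirected gene pair (a, b) to a confidence weight.
    Edges between two query genes, or between two non-query genes, do not
    contribute to any score.
    """
    query = set(query_genes)
    scores = defaultdict(float)
    for (a, b), weight in edges.items():
        if a in query and b not in query:
            scores[b] += weight
        elif b in query and a not in query:
            scores[a] += weight
    # Highest score first; ties broken alphabetically for reproducibility.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:top]

# Hypothetical weighted functional association network.
edges = {
    ("q1", "g1"): 2.5, ("q2", "g1"): 1.5, ("q1", "g2"): 0.5,
    ("q3", "g3"): 3.0, ("g2", "g3"): 4.0, ("q1", "q2"): 2.0,
}
ranked = rank_candidates(edges, ["q1", "q2", "q3"])
for gene, score in ranked:
    print(gene, score)
```

In practice the top-ranked candidates would then pass through a prioritization step, such as the protein-protein interaction filter described above, before experimental testing.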

DYNAMIC GENE REGULATORY NETWORK MODELS

Coexpression network models (and their variants) tend to be useful for predicting function of genes and identifying novel components of a (largely unknown) biological process, and approaches such as that of Carrera et al. (10) and Sato et al. (54) have extended this to predicting regulatory relationships. However, these approaches are still of limited use in predicting causal and directed regulatory relationships between genes. Dynamic gene regulatory network models predict specific gene-gene interactions in which the upstream gene either directly or indirectly regulates the expression or activity of the downstream gene. These networks can range from small parameterized models (e.g., the Arabidopsis clock model found in Reference 50) that incorporate both transcriptional and post-transcriptional regulatory events to large-scale transcriptional network models inferred from genome-wide data sets (64). Small-scale networks are often modeled using ordinary differential equations, requiring detailed kinetic data and parameters for network interactions, with edges representing direct interactions. However, large-scale transcriptional networks are often modeled using dynamic Bayesian networks, with edges reflecting direct or indirect regulation, depending on the algorithm. Dynamic gene regulatory networks are crucial for predicting the effects of genetic perturbation on flow through the network, and ultimately on phenotype, and hence for synthetic biology approaches to crop design. A recent paper by Naseem et al. (46) generated a network model that could be used for dynamic simulations from an initial Boolean network. In Boolean modeling, the data for the individual genes are typically discretized to represent the gene being either on or off, with interactions between components capturing logic relationships, e.g., AND, IF, and OR links. Naseem et al. (46) focused on hormone signaling during the plant immune response.
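A toy example may clarify what a Boolean model and an in silico perturbation look like. The wiring below is hypothetical and chosen only to show AND/NOT logic and synchronous updating; it is not the published network of Naseem et al. (46), and the node names (`auxin_signal`, `SA_signal`, `PR1`) are used purely as labels.

```python
def step(state):
    """Synchronously update every node from the current state."""
    return {
        # Inputs are held fixed throughout a simulation.
        "pathogen": state["pathogen"],
        "auxin": state["auxin"],
        "cytokinin": state["cytokinin"],
        # Hypothetical logic rules (assumptions for illustration only).
        "auxin_signal": state["auxin"] and not state["cytokinin"],
        "SA_signal": state["pathogen"] and not state["auxin_signal"],
        "PR1": state["SA_signal"],
    }

def simulate(inputs, max_steps=20):
    """Start with all internal nodes off and iterate to a fixed point."""
    state = {"auxin_signal": False, "SA_signal": False, "PR1": False, **inputs}
    for _ in range(max_steps):
        nxt = step(state)
        if nxt == state:
            break
        state = nxt
    return state

# In silico experiments: the same infection under three hormone treatments.
infected = {"pathogen": True, "auxin": False, "cytokinin": False}
with_auxin = {**infected, "auxin": True}
with_both = {**infected, "auxin": True, "cytokinin": True}

for label, inputs in [("infection only", infected),
                      ("plus auxin", with_auxin),
                      ("plus auxin and cytokinin", with_both)]:
    print(label, "-> PR1 on" if simulate(inputs)["PR1"] else "-> PR1 off")
```

Each call to `simulate` plays the role of one experiment, so many perturbations can be screened cheaply before choosing which to test in planta.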
It is well known that multiple mechanisms of cross talk exist between different hormone signaling pathways acting synergistically, additively, or antagonistically (49). Unraveling the ultimate effect of these complex interactions on the immune response and disease resistance is a difficult challenge that will require modeling and simulation. Perturbing one component at a time without a network model framework limits the insight that can be achieved from experiments. Simulations effectively enable the researcher to perform multiple experiments in silico and determine the most informative experiments for in vivo validation. Naseem et al. (46) built a Boolean network using interactions reported in the literature. The resulting network consisted of 105 nodes and 163 edges. This includes biochemical pathways as well as signaling information and actions of pathogen effectors. Interestingly, this is one of the few examples of a network model where attempts have been made to combine pathogen and
plant components, a challenge we discuss below. This Boolean logic network was then converted into a dynamical system, which allows dynamic simulation of the network after perturbations of specific components. The transformation is achieved without detailed information about the kinetics of each connection, using simplified assumptions according to whether a gene is regulated by activators, inhibitors, or combinations of both. Many parameters that would conventionally be set according to detailed kinetic data are fixed to default values. The model therefore does not reflect the exact dynamics of the system; however, it does capture a large degree of information about it and enables the authors to run dynamic simulations after perturbing different network components, i.e., in silico experiments. The model captures different levels of interaction and regulation (e.g., hormone levels, mitogen-activated protein kinase activity, and TF expression) as well as several pathogen effector proteins. As with all network modeling, many hypotheses are generated, but the main insights from this work concerned the role of cytokinin in immunity. Previous reports have demonstrated both positive and negative effects of cytokinin on SA signaling in a dose-dependent manner (3). The network simulations indicated that cytokinin signaling was not activated in response to virulent P. syringae infection, but pretreatment with cytokinin induced expression of the SA marker gene PR-1 and enhanced resistance to the virulent pathogen. Modeling suggested (and experiments agreed) that cytokinin does not influence early events in the immune response and that the positive interaction of cytokinin and SA signaling is downstream of SA synthesis. Auxin is known to have a negative effect on immunity, with exogenous application increasing susceptibility to P. syringae. 
Network simulations demonstrated that the balance between auxin and cytokinin impacts immunity and that this balance exerts its effects via SA signaling. The network modeling of Naseem et al. (46) predicted relationships and interactions between known components of plant immunity. However, we are far from identifying all the components of the plant defense response, with many of the known interactions not being direct. There is a need for network models that extend beyond known components and can place novel components in the context of regulatory interactions. Ideally, such dynamic gene regulatory networks would be genome wide, but the data requirements (in terms of time points and replicates) for modeling so many genes simultaneously are well beyond what is currently possible. Hence, strategies are needed to enable broader genome-wide interactions to be inferred from the data currently being generated. Windram et al. (64) used a dynamic Bayesian network approach to identify a large-scale transcriptional network that mediates the Arabidopsis response to B. cinerea infection. The model was inferred from a high-resolution time series of expression data of Arabidopsis leaves following B. cinerea infection (25 time points, one time point every two hours). Time series data is very powerful for network inference and enables a single data set to be used to generate the initial network model. Genes with similar expression profiles were first grouped together into clusters with the mean of the individual clusters computed to create a representative profile. A network of interactions was then inferred between these representative profiles using the causal structure identification (CSI) algorithm. The authors also included the expression of a B. cinerea housekeeping gene in the modeling (as a proxy for pathogen growth) to look for predicted effects of the regulatory network on the pathogen. 
The CSI algorithm attempts to explain the expression of genes by the expression of other genes at previous time points. It evaluates all potential parents (i.e., regulators) of a gene and pairs of parents (i.e., combined effect on expression of the downstream gene) and attempts to identify the function governing the regulation. The network model generated by CSI predicts directed regulatory links between different clusters of coexpressed genes; hence, although the CSI algorithm attempts to infer causal relationships, the use of clusters of genes as nodes means the inferred network is not truly a causal one.

Figure 3 Part of a gene regulatory network model mediating the Arabidopsis response to Botrytis cinerea infection. The network was inferred from time-series expression data using a mean profile for each cluster of coexpressed genes as input; hence, nodes represent a cluster of genes. The growth profile of the pathogen was also included. The expression profile of a cluster is shown on top of the relevant node. Colored boxes indicate enrichment of transcription factor (TF)-binding motifs in promoters of genes in the cluster nodes, with TFs from the corresponding binding family (NAC, WRKY, and MYB) highlighted in the same color. TFs present in selected clusters are shown. Enrichment of TF-binding motifs within the network enabled specific regulatory hypotheses to be generated. For example, direct regulation by ANAC055 of two downstream clusters. Adapted with permission from Windram et al. (64).
Node labels in the figure: TGA3, ABF1, Botrytis growth, ANAC055, WRKY75, MYB2, AtERF1, MYB54. Legend: motif enrichment (NAC, WRKY, MYB).

Obviously, it is not possible from just the inferred network model and expression data to identify the gene within a particular cluster that is a regulator or target of a regulator; however, combining the network model with additional bioinformatics analysis enables specific hypotheses to be generated (Figure 3). In a transcriptional network, TFs are likely to be doing much of the regulation; hence, TFs within clusters were identified. A single cluster was predicted to be upstream of B. cinerea growth, and two TFs were present in this cluster. One of these was selected and knockout mutants of this gene, TGA3, were found to be more susceptible to B. cinerea infection, confirming a role for TGA3 in the defense response (64). Although TGA3 had been implicated in defense against biotrophic pathogens (23), it had not been previously shown to affect susceptibility to necrotrophic pathogens. TFs present in clusters predicted to regulate a number of downstream clusters would be other candidates for this gene discovery approach, the advantage being that not only do
networks of this type enable novel components of the defense response to be identified but they also make concurrent predictions about their regulation. Regulatory interactions between clusters were turned into precise hypotheses for experimental testing by determining the enrichment of TF-binding motifs in the promoters of genes in a cluster. If a target cluster contained promoters enriched for the WRKY-binding motif, then WRKY TFs in the upstream network cluster would be predicted to be regulating these genes. Often a single member of a TF family would be present in a cluster, simplifying the experimental testing. The authors did not test the regulatory network predictions, but these predictions would form the basis of validating the network using techniques such as yeast one hybrid (Y1H) and chromatin immunoprecipitation-sequencing (ChIP-Seq). One clear advantage of inferring a regulatory network model from high-resolution time series expression data is that not only does it predict precise regulatory relationships but it can also be readily applied to a wide range of species. Such inference requires one extensive data set (a minimum of 12 time points and at least three replicates), but unlike relevance or functional network inference, it does not require the availability of large numbers of data sets, making it feasible for nonmodel organisms.
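The parent-search idea described above for CSI (explain a target profile at time t+1 from single candidate regulators, or pairs of them, at time t) can be illustrated with a heavily simplified sketch. The linear least-squares fit and the BIC-style penalty used here are stand-ins chosen for brevity; they are not the CSI algorithm itself. The cluster names (`c1`, `c2`, `c3`, `target`), the simulated 25-point time series, and the coefficients are hypothetical.

```python
import math
import random
from itertools import combinations

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def residual_ss(parent_series, target_next):
    """Fit target(t+1) on parents(t) plus an intercept; return the residual sum of squares."""
    rows = [[1.0] + [p[t] for p in parent_series] for t in range(len(target_next))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, target_next)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((y - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, y in zip(rows, target_next))

def best_parents(series, target, max_parents=2):
    """Score every parent set of size 1..max_parents; lower score is better."""
    y = series[target][1:]                     # target at t+1
    n = len(y)
    candidates = [g for g in series if g != target]
    scored = {}
    for size in range(1, max_parents + 1):
        for parents in combinations(candidates, size):
            rss = residual_ss([series[p][:-1] for p in parents], y)  # parents at t
            scored[parents] = n * math.log(rss / n) + (size + 1) * math.log(n)
    return min(scored, key=scored.get), scored

# Simulated mean profiles for three clusters over 25 time points.
random.seed(1)
T = 25
series = {name: [random.gauss(0, 1) for _ in range(T)] for name in ("c1", "c2", "c3")}
# The target cluster is driven by c1 (activation) and c2 (repression) one step earlier.
profile = [0.0]
for t in range(T - 1):
    profile.append(0.8 * series["c1"][t] - 0.5 * series["c2"][t] + random.gauss(0, 0.05))
series["target"] = profile

best, scores = best_parents(series, "target")
print("best parent set:", best)
```

Because the search enumerates every single parent and every pair, its cost grows quickly with the number of candidate regulators, which is one reason for modeling cluster means rather than individual genes.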

CHALLENGES IN MODELING PLANT IMMUNITY

An important concept in all systems biology–driven fields is the iteration between modeling, simulation, and prediction; experimental testing of model predictions; and model refinement based on test results. Many published studies of plant immunity to date present this first cycle of model construction and testing. An immediate future goal is the incorporation of new experimental information into existing models to help improve their accuracy. Dynamic Bayesian networks are well suited to this purpose with their ability to incorporate prior information. Network modeling can make a distinctive and valuable contribution to understanding plant-pathogen interactions and key control points that determine infection outcome. However, significant challenges remain, and several new developments would greatly enhance the usefulness of network models. First, there is a lack of protein-DNA binding information. Accurate system level quantification of protein-DNA interaction, such as TF binding to promoter sequences and subsequent influence on gene expression, is critical for validating predictions generated by numerous regulatory network inference methods. Multiple theoretical methods have been developed that seek to identify conserved motifs within promoter regions either within groups of coexpressed genes or across species in an attempt to establish some understanding of their coregulation (for example, see 6, 38). The main problem is establishing the identity of TFs that bind different motifs and their condition-specific activity; only a few hundred plant motif–TF pairs are described among various databases (21, 34, 42, 43). Furthermore, in most of these cases the specificity of binding (in terms of which family members bind with a particular motif) is not known. This is a distinct deficit of data, given that the Arabidopsis genome is currently predicted to encode more than 2,000 TFs (67).
ChIP-Seq data provides genome-wide binding information for TFs and, combined with expression profiling, is a powerful technique for constructing regulatory networks. Recently, such a network was elucidated for the human pathogen Mycobacterium tuberculosis and was used to predict genes with a key role in persistence of the pathogen asymptomatically in hosts (17). Y1H techniques can be used in a high-throughput manner to identify TF binding to screened promoters (61). Although this technique does not indicate whether a particular TF binds to a promoter in vivo or under what conditions/in which tissues, combining condition-independent Y1H data with network modeling techniques can predict the condition-specific binding profile of a promoter (20).

Related to this, an angle relatively unexplored is comparative network analysis, be this between different conditions (e.g., infection by different pathogens) or between different species. Plant pathogens use diverse strategies to successfully infect plants, and the response of the host is fine-tuned in response to the pathogenic signals it receives. Many signaling components are involved in responses to both biotrophic and necrotrophic pathogens, with the relative activation of different modules thought to determine the infection outcome. Given the availability of suitable data sets, networks capturing the host response to different pathogens can be inferred and compared to highlight switch points within the networks (i.e., from defense against a biotrophic or necrotrophic pathogen) and can shed light on how diverse attack strategies are implemented. Algorithms, such as hierarchical CSI (47), can jointly infer networks for different conditions by sharing common information from the data sets. A hyperparent network is derived from common information but can be overridden or supplemented with condition-specific connections in the individual networks for each condition. One advantage of this joint inference is that information from data sets not exactly matching each condition can be used to strengthen predictions, and points at which networks differ between conditions are easily identified. Joint inference modeling has been used to predict a transcriptional switch in the Arabidopsis network mediating responses to cold and osmotic shock (47). This approach may also be useful in elucidating immune regulatory networks in different species, treating the hyperparent as an ancestral network. One of the key theoretical challenges is how to infer dynamic regulatory networks on a genome-wide basis. Thousands of genes change in expression during the immune response, and an ideal transcriptional network model would be able to capture these changes.
As outlined above, one approach to this is to model profiles representing coexpressed clusters of genes, but this means the resulting network is not strictly causal. An alternative approach is to focus on modeling TFs, as these are the proteins mostly responsible for transcriptional regulation. We have recently combined variational Bayesian state space modeling (7) with a probabilistic Metropolis-Hastings wrapper to generate a network model for all TFs differentially expressed during B. cinerea infection (C.L. Hill & C.A. Penfold, unpublished data). This TF-only network can be subsequently expanded using groups of coexpressed or coregulated genes from methods such as EDISA (56) and Wigwams (51). An additional challenge is exploiting predictive metabolic modeling (57) to understand the impacts of synthesizing defense-specific compounds on the cellular system and how metabolism changes during the immune response. The accumulation of many compounds is induced or repressed during infection (15, 25). Predictive metabolic models exist for Arabidopsis (13, 63) but have not yet been used to examine metabolism during pathogen infection, although a study of glucosinolates (antiherbivory compounds) using such a model predicted significant energetic costs in producing these defense metabolites (8). Linking metabolic models to regulatory network models will be a vital step toward being able to re-engineer regulatory circuits and manipulate the outcome of infection. Finally, accurate modeling of the plant immune system is only half the challenge. Generating a network model that is capable of accurately predicting the infection outcome and simulating the phenotypic outcome after perturbations in host and/or the pathogen will require integration of pathogen function and regulation control. Recently, a gene regulatory network was inferred from time series expression data captured simultaneously from host (mouse) and pathogen (Candida albicans) (59). 
This network predicted novel interactions between fungal and host genes, one of which was validated. Binding of mouse Ptx3, a soluble pattern recognition receptor, led to downregulation of Hap3 target genes in the pathogen in a Hap3-dependent manner. Such cross-species networks are likely to identify key hubs of the host immune response as well as potential targets for chemical control of pathogens.

DISCLOSURE STATEMENT

The authors are not aware of any affiliations, memberships, funding, or financial holdings that might be perceived as affecting the objectivity of this review.

ACKNOWLEDGMENTS

O.W. would like to acknowledge support from the Grand Challenges in Ecosystems and the Environment Initiative at Imperial College. C.P. acknowledges support from EPSRC grant EP/I036575/1. K.D. is part of the BBSRC-funded grant Plant Response to Environmental Stress Arabidopsis (BB/F005806/1). The authors would like to thank Dr. Yinyin Yuan for her helpful discussions.

