Journal of Theoretical Biology 382 (2015) 216–222


Phylogenetic tree and community structure from a Tangled Nature model

Osman Canko *, Ferhat Taşkın, Kamil Argın

Department of Physics, Erciyes University, Kayseri, Turkey

HIGHLIGHTS

• Phylogenetic trees (pt) estimated from genomes (or morphologies) of extant species cannot be compared with the real pt, which is at best imperfectly known from the fossil record.
• One way to assess the accuracy of common estimation methods, such as ML or NJ, would be to apply them to data from in silico evolution models, for which the pt is exactly known.
• The communities of the quasi-evolutionary stable strategies are very highly connected and there are no obvious fragmented subgroups among the species in a habitat.

ARTICLE INFO

Article history: Received 28 January 2015; received in revised form 25 June 2015; accepted 7 July 2015; available online 16 July 2015

Keywords: Phylogenetic tree; Maximum likelihood method; Neighbor-joining method; Food-web; Modularity

ABSTRACT

In evolutionary biology, the taxonomy and origination of species are widely studied subjects. An estimation of the evolutionary tree can be done via available DNA sequence data. The calculation of the tree is made by well-known and frequently used methods such as maximum likelihood and neighbor-joining. In order to examine the results of these methods, an evolutionary tree is pursued computationally by a mathematical model, called Tangled Nature. A relatively small genome space is investigated due to computational burden and it is found that the actual and predicted trees are in reasonably good agreement in terms of shape. Moreover, the speciation and the resulting community structure of the food-web are investigated by modularity. © 2015 Elsevier Ltd. All rights reserved.

1. Introduction

The discovery of inheritance, which is one of the fundamental principles of biology, caused a revolution in evolutionary systematics (Huxley, 1940; Mayr, 1942; Simpson, 1961; Hennig, 1966). These systematics are based upon a hierarchical layout which is generated from the relationships among groups of living organisms and is very important for interpreting evolutionary processes. Evolutionary systematics have been studied at both the species level and above the species level (de Queiroz and Donoghue, 1988). Mayr (1969) and Simpson (1961), who worked at the species level, made a significant contribution to the categorization of species. Hennig (1966), who held the tenet of common descent, came to the conclusion that there were higher taxa than the species level (de Queiroz and Donoghue, 1990) and changed the concept of evolution in taxonomy (Queiroz and Gauthier, 1992).

* Corresponding author. E-mail address: [email protected] (O. Canko).

http://dx.doi.org/10.1016/j.jtbi.2015.07.005
0022-5193/© 2015 Elsevier Ltd. All rights reserved.

In the last few decades, studies of taxonomy have been accelerated by taking advantage of computers. The increase in the capacity of computers has permitted us to study longer and more numerous DNA sequences, which are necessary to approach the real phylogenetic tree. From this perspective, many models have been developed to obtain the best phylogenetic tree. In most models, there are two main groups of approaches to constructing the phylogenetic tree (Saitou and Imanishi, 1989). The first group involves searching all of the possible phylogenetic trees and selecting the best one according to certain criteria, such as maximizing the probability of evolution. The maximum-parsimony (MP) (Eck and Dayhoff, 1966) and the maximum-likelihood (ML) (Felsenstein, 1981) methods are in this group. The second group involves building the best tree by analyzing the distances between nucleotide sequences. The neighbor-joining (NJ) method (Saitou and Nei, 1987) is a well-known example of this group. The ML method finds the best possible phylogenetic tree according to the probability of the transitions (or evolution) which occurred in the nucleic acid sequences. The topology and branch

lengths of the tree are of major importance in the ML method. Finding the tree topology and branch lengths by direct search is not a suitable approach, because every possible tree topology would have to be examined and, for each topology, the branch lengths that maximize the likelihood would have to be determined. The number of possible topologies, however, becomes enormous when the number of species (tips or nodes) is large. Felsenstein (1978) found a procedure to remove this difficulty. He started with two species and then added the other species successively. Hence, the number of topologies to be examined is systematically reduced. Even though this procedure does not guarantee the maximum likelihood value for the tree being constructed, it gives results, in practice, with acceptable computational complexity. Since the ML method sets up an algorithm to find the branch lengths rather than using a direct search, the likelihood values of some trees can be equivalent due to the pulley principle. The branch lengths are altered at each step of the algorithm until the highest likelihood value is found. Although the ML method requires considerable computational time for large genome sequences, the trees it predicts agree well with the phenomenological tree.

The NJ method builds the best tree by using the distances (nucleotide differences) between the species (or nucleotide sequences). The distance matrix is established from the nucleotide sequences, and the initial tree is an unresolved, star-like tree. Afterwards, the distance matrix is modified: the difference between each pair of sequences is corrected by the average divergence of each of the two sequences from all other sequences. The two sequences which have the smallest value in the modified distance matrix are joined in a single node, which is regarded as the ancestor of these two sequences. The two descendant sequences are replaced by this single node in the distance matrix and the distance matrix is modified again. The iteration runs N − 3 times, where N is the number of species (or sequences). The NJ method is fast and gives a unique topology for the best tree because the tree is constructed from local mathematical relations.

The phylogenetic tree of related species carries implicit information on how the species evolved and adapted to their environment throughout different time periods. However, the accuracy of the estimated trees is seldom investigated. The central question is whether or not the predicted phylogenetic tree is correct and reliable, because the methods for obtaining the phylogenetic tree only use the DNA sequences of species whose life forms are observed today. The main contribution of this study is to compare a simulated tree with the evolutionary trees estimated by the above-mentioned statistical methods. For this purpose, the actual phylogenetic or evolutionary tree is produced from an individual-based model. The simulation model considered here is called the Tangled Nature (TaNa) model (Christensen et al., 2002; Hall et al., 2002), which emphasizes the co-evolution of individuals. The TaNa model has proven successful in reproducing evolutionary phenomena seen in nature, such as punctuated equilibrium, a gradually decreasing extinction rate, increasing diversity and power-law lifetimes (Christensen et al., 2002; Hall et al., 2002; Rikvold and Zia, 2003; Rikvold and Sevim, 2007).

The remaining part of the paper is organized as follows. The TaNa model is briefly explained in Section 2. In Section 3, the phylogenetic tree of the model is created from the simulation result and transitional forms are depicted in it. Then, the trees of the ML and the NJ methods are constructed using the Molecular Evolutionary Genetics Analysis (MEGA) program (Tamura et al., 2011). In Section 4, the interaction network among the species seen in the phylogenetic tree is investigated, and the last section is devoted to the summary and conclusion.
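The neighbor-joining iteration described above can be sketched in a few lines. The block below is only an illustration under stated assumptions: it uses the commonly quoted Q-criterion form of the "modified distance matrix", it may differ in detail from the implementation in MEGA, and the function name `neighbor_joining_splits` and the five-taxon distance matrix are hypothetical. It performs N − 3 joins and returns the non-trivial splits (bipartitions of the tips) of the resulting unrooted tree.

```python
from itertools import combinations


def neighbor_joining_splits(names, dist):
    """Run N - 3 neighbor-joining steps and return the non-trivial splits.

    names: list of tip labels; dist: symmetric matrix of pairwise distances.
    Each split is reported as the set of tips on the side that does not
    contain names[0], so the result does not depend on how ties are broken.
    """
    all_tips = frozenset(names)
    # Every active node is identified by the set of tips below it.
    nodes = [frozenset([name]) for name in names]
    d = {}
    for i, a in enumerate(nodes):
        for j, b in enumerate(nodes):
            if i != j:
                d[a, b] = float(dist[i][j])

    splits = set()
    while len(nodes) > 3:
        n = len(nodes)
        # Total divergence of each node from all other active nodes.
        r = {a: sum(d[a, b] for b in nodes if b != a) for a in nodes}
        # Modified distance: Q(a, b) = (n - 2) d(a, b) - r(a) - r(b).
        a, b = min(combinations(nodes, 2),
                   key=lambda p: (n - 2) * d[p] - r[p[0]] - r[p[1]])
        u = a | b  # new internal node, the inferred ancestor of a and b
        rest = [k for k in nodes if k not in (a, b)]
        for k in rest:
            d[u, k] = d[k, u] = 0.5 * (d[a, k] + d[b, k] - d[a, b])
        nodes = rest + [u]
        splits.add(all_tips - u if names[0] in u else u)
    return splits


if __name__ == "__main__":
    # Hypothetical example with five taxa (not data from the paper).
    taxa = ["a", "b", "c", "d", "e"]
    distances = [[0, 5, 9, 9, 8],
                 [5, 0, 10, 10, 9],
                 [9, 10, 0, 8, 7],
                 [9, 10, 8, 0, 3],
                 [8, 9, 7, 3, 0]]
    for split in neighbor_joining_splits(taxa, distances):
        print(sorted(split))
```

Reporting splits rather than a rooted nesting reflects the fact that NJ yields an unrooted tree; the same split sets are what a split-distance comparison between two trees operates on.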


2. The model

The TaNa model is an individual-based stochastic model of evolutionary ecology. As in the case of a DNA sequence, the species are represented by binary strings whose elements are purine and pyrimidine. Because of the computational burden, the genome length is small (only 30 bits) in comparison to a real genome, and when a mutation occurs (a change in the genome sequence), a new species appears in the system. In other words, genetic variety, namely phenotype, is ignored. The offspring probability, i.e., the reproduction ability or fitness of an individual i, is given as

$$P_i(t) = \frac{\exp[H(n_i, t)]}{1 + \exp[H(n_i, t)]} \in [0, 1], \qquad (1)$$

where the weight function, H, is given by

$$H(n_i, t) = \frac{1}{c N(t)} \sum_{j=0}^{2^L - 1} J_{ij}\, n_j(t) - \mu N(t). \qquad (2)$$

Here c specifies the constant interaction strength and $N(t)$ is the total population at time t. The pair interaction term between species, $J_{ij}$, is non-zero with probability 0.25. The non-zero elements of the fixed interaction matrix, $J_{ij}$, are drawn at the beginning of the simulation from a random distribution whose range is $[-1, +1]$. Self-interaction, namely cannibalism, is ignored ($J_{ii} = 0$). If individual i interacts with individuals at position j, as either prey or predator, the occupancy, $n_j(t)$, makes a contribution to the weight function. Since $n_j(t)$ is the total population of species j, the normalization, $n_j(t)/N(t)$, corresponds to the population density. $\mu$ characterizes the physical environment and determines the average sustainable total population size of the habitat; it corresponds to the inverse of the Verhulst carrying capacity.

A time step of the model consists of the following dynamics: first, a randomly chosen individual is killed with a constant probability, $p_{\mathrm{kill}}$. At the reproduction step following this annihilation event, a randomly selected individual reproduces asexually with the offspring probability of Eq. (1). The successful individual gives two offspring before it dies, and the genes of each offspring are exposed to a low mutation rate, $p_{\mathrm{mut}}$. The occurrence of a mutation depends only on the current state of the genome, as in a memoryless Markov process. One generation consists of $N(t)/p_{\mathrm{kill}}$ time steps.

The model evolving by the above steps finds quasi-evolutionary stable strategies (qESS), and these long periods are interrupted by short, hectic periods of evolutionary activity. This feature is seen in Fig. 1. The simulation starts with a population at a randomly assigned position in the genome space, and a rapid diversification occurs by mutations to the neighboring sites. A relatively stable ecosystem is

Fig. 1. Time series of occupation of genome space. A dot is placed for each of the occupied positions in the genome space. The genotypes are enumerated in an arbitrary way along the y-axis. Parameters are $c = 0.5$, $L = 30$, $p_{\mathrm{mut}} = 0.002$, $p_{\mathrm{kill}} = 0.2$ and $\mu = 0.0002$.


formed in which mutual interactions shape a (meta)stable dynamical system, the first qESS. Although two more qESS structures are clearly seen in Fig. 1, we have focused on a generation near the end of the first qESS for the phenomenological tree. The analysis of species diversification and the phenomenological tree is discussed in the following section.
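The update rule of this section can be illustrated with a minimal sketch. It is not the authors' code, and several choices are assumptions made only for the illustration: non-zero couplings are drawn uniformly from [−1, +1], each ordered pair (i, j) is drawn independently, the couplings are generated lazily in a dictionary (`_couplings`, a hypothetical stand-in for the stored interaction matrix), and the mutation rate is applied independently to every bit. Parameter values follow the caption of Fig. 1.

```python
import math
import random
from collections import Counter

L = 30          # genome length in bits
C = 0.5         # interaction strength c
MU = 0.0002     # inverse carrying capacity
P_KILL = 0.2    # killing probability per time step
P_MUT = 0.002   # mutation probability (assumed here to act per bit)
THETA = 0.25    # probability that a coupling J_ij is non-zero

_couplings = {}  # lazily filled cache of J_ij values (illustrative only)


def coupling(i, j, rng):
    """J_ij: zero on the diagonal, otherwise non-zero with probability THETA."""
    if i == j:
        return 0.0
    if (i, j) not in _couplings:
        # Assumption: uniform values on [-1, +1] for the non-zero couplings.
        _couplings[i, j] = rng.uniform(-1.0, 1.0) if rng.random() < THETA else 0.0
    return _couplings[i, j]


def weight(i, pop, rng):
    """Eq. (2): H = (1 / (c N)) * sum_j J_ij n_j - mu N."""
    n_total = sum(pop.values())
    interaction = sum(coupling(i, j, rng) * n_j for j, n_j in pop.items())
    return interaction / (C * n_total) - MU * n_total


def p_offspring(i, pop, rng):
    """Eq. (1): logistic function of the weight H."""
    return 1.0 / (1.0 + math.exp(-weight(i, pop, rng)))


def pick(pop, rng):
    """Choose one individual uniformly at random and return its genome label."""
    return rng.choices(list(pop), weights=list(pop.values()))[0]


def remove_one(pop, genome):
    """Remove a single individual of the given genome from the population."""
    pop[genome] -= 1
    if pop[genome] == 0:
        del pop[genome]


def mutate(genome, rng):
    """Flip each of the L bits independently with probability P_MUT."""
    for bit in range(L):
        if rng.random() < P_MUT:
            genome ^= 1 << bit
    return genome


def time_step(pop, rng):
    """One killing attempt followed by one reproduction attempt."""
    if rng.random() < P_KILL:
        remove_one(pop, pick(pop, rng))
    if not pop:
        return
    parent = pick(pop, rng)
    if rng.random() < p_offspring(parent, pop, rng):
        remove_one(pop, parent)          # the parent is replaced ...
        for _ in range(2):               # ... by two possibly mutated offspring
            pop[mutate(parent, rng)] += 1


if __name__ == "__main__":
    rng = random.Random(0)
    population = Counter({712889241: 1000})  # hypothetical starting size
    for _ in range(10000):
        time_step(population, rng)
    print(len(population), "species,", sum(population.values()), "individuals")
```

Storing the population as a mapping from genome label to occupancy keeps the sum in Eq. (2) restricted to occupied positions, which is what makes the 2^L-term sum tractable in practice.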

3. Phylogenetic tree

The phylogenetic tree gives a lot of information about inferred evolutionary relationships among biological species. Nevertheless, each phylogenetic tree method predicts a tree based on its own algorithm, so the verifiability of these methods is of importance. Evolution requires both a very long time and exceptionally rare events, which prevents carrying out an in vitro experiment. Moreover, the incompleteness of the fossil record is another problem in verifying the uniqueness and truthfulness of the phylogenetic tree. A simulation is a very useful tool for reproducing the evolutionary process from basic and general ingredients, such as reproduction, mutation and natural selection, and for testing the predicted tree.

The simulation begins with a randomly selected genome sequence whose decimal label and population size are 712889241 and $N^{*} = -(1/\mu)\log[p_{\mathrm{kill}}/(1 - p_{\mathrm{kill}})] = 6931$, respectively. The species are labeled by decimal numbers corresponding to their binary genome sequences. This initial species lives for 83 generations, and two important species, labeled 779998105 and 712823705, appear at the end of the first generation. Even though these species disappear at 99 and 103 generations, respectively, the long-lived species originate from these two species. Four main branches of the tree, namely the red–green and pink–blue colored lines in Fig. 2a, originate from 779998105 and 712823705,

Fig. 2. (a) Actual phylogenetic tree of 187 species. The vertical category axis simply displays the label of species arbitrarily without a scaled increment or decrement from the originating species, 712889241. The distinct branching is revealed by four different colors for 45 species. (b, c) The blown-up figures of the first era of origination. (For interpretation of the references to color in this figure, the reader is referred to the web version of this paper.)

respectively. At the beginning of the simulation, a high level of evolutionary activity is encountered, and origination and extinction rates are very high until a qESS structure is established. Many transitional species appear and disappear in this relatively short, hectic period of co-evolution, see Fig. 2b and c. For the purpose of clarity, the other short-lived species are not depicted in Fig. 2 if they are not ancestors or predecessors of a species which lives at the generation in focus, 40,000. In other words, flashing species have been ignored because these species generally originate from the main species that are the backbone of the qESS and they cannot sustain their existence in the environment by themselves. There are 187 species at generation 40,000, but only 45 of them live for more than 35,000 generations, strictly speaking almost the whole qESS period; see Table 2 in the Appendix for the full list. The sequences of the species in Fig. 2 are numbered from S1 to S45 from bottom to top in Table 2. While relatively short-lived species are represented by black lines in the actual tree, they are not taken into account in the phylogenetic tree calculations. Hence, four distinct branches are classified by the simulated phylogeny of Table 2 and are represented by different colors according to the originating individual. Even though the red and green species originate from a common ancestor, the genome of the green species is phylogenetically close to the pink and blue ones in terms of Hamming distance (number of differing positions), as seen in Table 2. It should also be mentioned that a disadvantage of the TaNa model is the occurrence of back and perpetual mutations, since the ergodicity of the Markov process makes the original genomes quite quickly accessible from the mutant genomes when the genome length is small. The dotted cyan line in the inset of Fig. 2b is an illustrative example of a back mutation. For consistency, the back mutations are discarded, the branches of the tree are treated as non-intersecting, and all the species are linked to their first ancestor whenever they originate.

The ML method is a widely used method to comprehend and predict the phylogenetic tree of evolution. It rests on three steps. Firstly, a model is proposed for nucleotide sequence change. Then, different hypotheses about evolutionary history are evaluated in terms of the probability with which the hypothesized history would give rise to the observed data. Finally, the hypothesis with the highest probability or likelihood is chosen. The advantages of the ML method are that it generally has low variance, its estimation is least affected by sampling error and it is robust to violations of the assumptions in the proposed model. However, maximum likelihood is computationally too expensive to perform for more than a few sequences because it requires searching all possible tree topologies (Strimmer, 1997). In this work the analysis contains 45 different genome sequences, and the evolutionary analysis of the phylogenetic tree was performed by using the MEGA 5 program (Tamura et al., 2011). The evolutionary history was derived by using the ML method based on the Kimura 2-parameter model (Kimura, 1980). The tree with the highest log likelihood (−310.2721) is given in Fig. 3. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (100 replicates) is shown next to the branches. The initial tree(s) for the heuristic search was obtained by applying the NJ method to a matrix of pairwise distances estimated using the ML approach. The tree is drawn to scale, with branch lengths measured in the number of substitutions per site (0.05). All ambiguous positions were removed for each sequence pair.

The tree of the ML method is not, on its own, adequate for phylogenetic tree construction. When constructing a tree, multiple methods are frequently used and compared. Sometimes, local topology is better estimated by the NJ method than by the ML method. However, the drawback of the NJ method is that it does not

Fig. 3. The phylogenetic tree constructed by the ML method. Numbers appearing at the internal nodes indicate bootstrap values based on 100 replicates. The species are indicated by colored circles according to the original tree. (For interpretation of the references to color in this figure caption, the reader is referred to the web version of this paper.)

test alternative topologies, i.e., instead of exploring all of tree-space, it begins with an unrooted star-like tree and finds a unique tree. Furthermore, it does not take into account all the information in a multiple alignment and only uses pairwise distances. The tree calculated by the NJ method is seen in Fig. 4. In Figs. 3 and 4, the horizontal axes of the trees show the amount of genetic variation, whereas the vertical axes do not have any physical meaning and are just used to separate the species in an orderly way. In the trees, the state of the interior nodes is not known; it is evaluated from the external tips of the known species (S1, S2, …, S45) via the likelihood of the shape of the tree. The numbers appearing at the top of the branches represent the degree of reliability of the node, which is calculated from the bootstrap test. The branches carry small numbers because the genome strings are so short that the small Hamming distance between two strings allows a node to jump quite readily to a different spot in the tree. For example, relatively big numbers are encountered at the internal nodes among S1, S2, and S19, since the Hamming distance between S1 and S2, and between S2 and S19, is just one. On the other hand, the Hamming distance between S1 and all the other red symbolized species is two, except for S3. However, S3 is not joined to any of these internal nodes because the Hamming distance

Fig. 4. The tree predicted by the NJ method. Bootstrap supports in the NJ analysis appear at the top of the internal branches. The species are indicated by colored circles according to the original tree. (For interpretation of the references to color in this figure caption, the reader is referred to the web version of this paper.)

between S3 and all the other red symbolized species is also one, except for S2. A similar property is seen among S30, S34, and S35. This explains why both methods wrongly embedded the S30 species among the blue symbolized species. The low numbers appearing at the internal nodes of the blue colored species can again be attributed to the short Hamming distances; for instance, the Hamming distance between S33 and all the other blue symbolized species, and also S31, is at most one. In order to avoid the short genome length problem, one remedy could be to generate the huge fixed interaction matrix used in the model from a mathematical relation instead of storing it in RAM. In this way, the efficient use of computer resources would permit the handling of distinctly separated, very long genomes. Despite these difficulties, both methods properly connect the species from S1 to S19 (represented by red dots at the tips) into common branches. Furthermore, they capture the historical relationship between species S30 and S31 (colored by pink and green dots at the tips, respectively) and from S32 to S45 (symbolized by blue dots at the tips). However, both methods wrongly joined S30 and S35 to a common ancestor. The similarity

Table 1. Comparison values of the trees produced by TOPD/FMTS software (Puigbo et al., 2007).

Method\Model | TaNa–ML | TaNa–NJ | ML–NJ
Percentage of taxa in common | 95.6 | 100.0 | 95.6
Nodal distance (pruned/unpruned) | (5.89 ± 0.13/6.15 ± 0.14) | (5.52/5.52) | (5.19 ± 0.14/5.42 ± 0.15)
Split distance [differents/possibles] | 0.99 ± 0.01 [79.50 ± 0.87/80] | 0.98 [82/84] | 0.92 ± 0.01 [73.50 ± 0.87/80]
Disagreement [taxa disagree/all taxa] | [38.00 ± 1.00/43] | [43/45] | [37.75 ± 0.43/43]
Nodal distance random (pruned/unpruned) | (4.39 ± 0.27/4.58 ± 0.29) | (4.44 ± 0.32/4.44 ± 0.32) | (4.37 ± 0.30/4.56 ± 0.32)
Split distance random [differents/possibles] | 0.99 ± 0.01 [79.53 ± 0.85/80] | 0.99 ± 0.01 [83.54 ± 0.89/84] | 0.99 ± 0.02 [79.44 ± 1.17/80]
Disagreement random [taxa disagree/all taxa] | [41.30 ± 1.85/43] | [43.48 ± 1.77/45] | [41.46 ± 2.01/43]

between the actual tree and the tree obtained by the NJ method in terms of S31 is better than that obtained by the ML method.

After the above qualitative discussion, a numerical analysis complements the comparison of the trees. For this reason, we have used the TOPD/FMTS (TOPological Distance/From Multiple To Single) software (Puigbo et al., 2007). TOPD/FMTS includes different methods and gives comparison values of unrooted trees as well as the percentage of overlapping taxa. In Table 1, the nodal method begins with a pairwise distance matrix produced by counting the number of nodes that separate each taxon from the other taxa. The nodal distance is the root-mean-squared distance between the matrices of the two trees. A distance of zero indicates identical trees, and the deviation from zero measures the difference between the trees. If the trees have overlapping leaf sets without containing exactly the same taxa, appropriately pruned trees can also be compared. The disagree method identifies taxa whose phylogenetic positions disagree between the two trees. A taxon with a disagreeing position is discarded and the gain in the split distance is calculated. The taxon which gives the highest gain is removed for the following iteration. The procedure continues until the split distance is zero. The software also provides an assessment of whether the similarity between trees is better than random. As a result, the percentage of taxa in common and the nodal and split distances between the TaNa model and the NJ method are better (lower distances) than those between the TaNa model and the ML method.
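The Hamming distances discussed in this section can be computed directly from the decimal labels, since each label encodes a 30-bit genome. The short sketch below does this for the three labels quoted at the start of the section and also evaluates the initial population size, assuming that "log" denotes the natural logarithm and using the parameter values of Fig. 1. The function names are chosen only for this illustration.

```python
import math


def hamming(label_a: int, label_b: int) -> int:
    """Number of differing bits between two genomes given as decimal labels."""
    return bin(label_a ^ label_b).count("1")


def genome_bits(label: int, length: int = 30) -> str:
    """Binary genome string of the given length for a decimal label."""
    return format(label, f"0{length}b")


def stationary_population(mu: float, p_kill: float) -> float:
    """N* = -(1/mu) ln[p_kill / (1 - p_kill)], with the natural logarithm."""
    return -math.log(p_kill / (1.0 - p_kill)) / mu


founder, first, second = 712889241, 779998105, 712823705
print(genome_bits(founder))
print(hamming(founder, first), hamming(founder, second), hamming(first, second))
# → 1 1 2
print(round(stationary_population(mu=0.0002, p_kill=0.2)))
# → 6931
```

For these labels, each of the two early species differs from the initial genome by a single bit, and the two differ from each other by two bits.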

4. Network structure

Another interesting relation can be inferred from the food-web structure of the phylogenetic tree. Even though species which evolved from a common ancestor have closely related genomes, the diet of a species, as well as the environmental conditions, strongly affects the selective pressure. Functionality among species can be grouped into communities, which can be derived from interspecies interactions such as antagonistic and mutualistic ones. For example, the stability of an ecological community is very sensitive to omnivores, by virtue of the fact that the extinction of a predator triggers secondary extinctions via indirect effects (Solé and Montoya, 2001). If a co-evolutionary avalanche emerges from the intrinsic dynamics of biology, the long passive period (qESS) is ended by sudden bursts of activity without the need for a disaster, such as a meteorite impact (Bak and Sneppen, 1993). In order to detect the community structure, Newman and Girvan (2004) defined modularity by the following formula, which reflects the quality of a partition of a network into groups of densely connected vertices (or nodes):

$$Q = \sum_i \left( e_{ii} - a_i^2 \right), \qquad (3)$$

where $e_{ij}$ is one-half of the fraction of edges that connect the vertices in group i to those in group j, and $a_i = \sum_j e_{ij}$, i.e., the total fraction of the edges ending at vertices in group i. Modularity measures the fraction of

Fig. 5. Four communities are separated by the dashed circles. There is a relatively high density of edges within, rather than among, the groups. The radius of a node is proportional to its population ($n_i$) by $50 \times \log(10 \times n_i)$.

the edges within communities in excess of the expected value in the randomized network. The randomized network (or null model) is constructed by randomly rewiring the half-edges among the species. Therefore, modularity is a quantity that denotes an excess of interactions relative to the null model. Non-zero values of modularity correspond to a deviation from the random network, and modularity values ranging from 0.3 to 0.7 indicate that there is an obvious division (Newman and Girvan, 2004). In this work, an unweighted, undirected network is produced from the non-zero values of the fixed interaction matrix. In other words, the edges correspond to the interactions between species, and $e_{ij} = e_{ji} = 1/2$ if an interaction exists, otherwise zero. The modularity optimization is performed by using the algorithm of Blondel et al. (2008). The numerical calculation resulted in the value 0.19 for Q, which means that the food-web cannot be split sufficiently into compartments. Four subgroups of the network are shown in Fig. 5. There is a relatively high density of edges within, rather than among, the subgroups. However, no obvious division among the subgroups can be seen in Fig. 5 due to the low modularity value. The in silico evolution model finds that the qESS communities are almost entirely mutualistic and very highly connected (Rikvold, 2007). The total number of connections among the species is 275; however, the expected number of connections in the complete graph should be 495, since one-fourth of the possible interactions are assumed to be non-zero in the model. We see that the extant species constituting the backbone of the qESS support themselves with a low interaction rate and are tightly connected to each other. On the other hand, it is seen that the modularity values do not increase as the system ages (Canko et al., 2015). There is still a sufficient number of edges among the communities to prevent a good division. If different habitats ($\mu_i$ and $\mu_j$) are considered,

Table 2. The labels of the 45 persistent species and their genome sequences. The species are numbered from S1 to S45, in order from the bottom to the top of Fig. 1.

the habitat boundaries could contribute to the compartments. It is seen from Fig. 5 that the sizes of the communities are close to each other. The four communities found here are consistent with the analyses of five real food webs, which give a number of compartments ranging from one to six (Krause et al., 2003). The evolutionary tree is not confirmed by the food-web structure; namely, the physiology of two species that originate from a common ancestor could be different in order to maintain their existence in the habitat.
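Eq. (3) can be evaluated for any given partition of an unweighted, undirected network. The sketch below only computes Q for a supplied partition; it does not perform the optimization of Blondel et al. (2008), and the function name `modularity` and the small example graph are hypothetical, not the food-web of Fig. 5.

```python
def modularity(edges, community):
    """Newman-Girvan modularity Q = sum_i (e_ii - a_i^2).

    edges: list of undirected edges (u, v), each listed once.
    community: mapping from vertex to the label of its group.
    """
    m = len(edges)
    groups = sorted(set(community.values()))
    e = {(g, h): 0.0 for g in groups for h in groups}
    for u, v in edges:
        g, h = community[u], community[v]
        # Each edge contributes one half to e_gh and one half to e_hg,
        # so an edge inside a group adds a full 1/m to e_gg.
        e[g, h] += 0.5 / m
        e[h, g] += 0.5 / m
    q = 0.0
    for g in groups:
        a_g = sum(e[g, h] for h in groups)  # fraction of edge ends in group g
        q += e[g, g] - a_g ** 2
    return q


if __name__ == "__main__":
    # Hypothetical example: two triangles joined by a single edge.
    example_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
    two_groups = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
    print(modularity(example_edges, two_groups))
```

Placing every vertex in a single group gives Q = 0 by construction, which is the reference point against which a value such as 0.19 is judged.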

5. Summary and conclusion

In this work, the actual phylogenetic tree is compared with the trees predicted by the ML and NJ methods. The real tree is produced by an in silico model in which a species produces two offspring whose genomes are subject to a constant mutation rate. In the model, a new mutant species generally disappears very quickly, but sometimes it can sustain its existence under selection pressure. Meanwhile, in the speciation processes passing through the transitional forms, the ancestor and descendant can live together until one of them becomes extinct. Accordingly, they can be found in the same fossil record and be dated with the same age. However, both the ML and NJ methods terminate the ambiguous or unknown ancient species, and the branching of the interior node emerges at this point. Furthermore, these methods do not predict the transitional forms or who the common ancestor is; they only evaluate the likelihood of a given tree. Even though these difficulties can be regarded as a troublesome mathematical problem, the trees that the methods produce from the known species at the tips have a good similarity to the correct tree topology and are quite reasonable for the four branches. The similarities between the original and predicted trees are also supported by the TOPD/FMTS software. In particular, the tree and its branching

obtained from the NJ method match the actual tree better than those obtained from the ML method. We also observed that the evolved species can reside in different compartments of the food-web. The connections between communities are still intense, i.e., each species is closely linked to all others, which prevents them from properly distributing into different compartments. In other words, the low value of modularity implies that there is no explicit compartment within a habitat. The food-web resolution also affects the discovery of compartments (Krause et al., 2003).

Acknowledgments

This research was supported by The Scientific and Technological Research Council of Turkey (TÜBİTAK) under Grant no. 111T735 and by Erciyes University Research Funds under Grant no. FDA2013-4638.

Appendix

See Table 2.

References

Bak, P., Sneppen, K., 1993. Phys. Rev. Lett. 71, 4083.
Blondel, V.D., Guillaume, J.-L., Lambiotte, R., Lefebvre, E., 2008. J. Stat. Mech.: Theory Exp. 2008, P10008.
Canko, O., Taskin, F., Argin, K., 2015. Appl. Soft Comput., submitted for publication.
Christensen, K., Hall, M., di Collobiano, S.A., Jensen, H.J., 2002. J. Theor. Biol. 216, 73.
de Queiroz, K., Donoghue, M.J., 1988. Cladistics 4, 317.
de Queiroz, K., Donoghue, M.J., 1990. Cladistics 6, 61.
Eck, R.V., Dayhoff, M.O., 1966. Atlas of Protein Sequence and Structure. National Biomedical Research Foundation, Silver Springs, MD.
Felsenstein, J., 1978. Syst. Biol. 27, 27.
Felsenstein, J., 1981. J. Mol. Evol. 17, 368.
Hall, M., Christensen, K., di Collobiano, S.A., Jensen, H.J., 2002. Phys. Rev. E 66, 011904.
Hennig, W., 1966. Phylogenetic Systematics. University of Illinois Press, Urbana.
Huxley, J.S., 1940. The New Systematics. Oxford University Press, Oxford.
Kimura, M., 1980. J. Mol. Evol. 16, 111.
Krause, A.E., Frank, K.A., Mason, D.M., Ulanowicz, R.E., Taylor, W.W., 2003. Nature 426, 282.
Mayr, E., 1942. Systematics and the Origin of Species. Columbia University Press, New York.
Mayr, E., 1969. Biol. J. Linn. Soc. 1, 311.
Newman, M., Girvan, M., 2004. Phys. Rev. E 69, 026113.
Puigbo, P., Garcia-Vallve, S., McInerney, J.O., 2007. Bioinformatics 23, 1556.
Queiroz, K., Gauthier, J., 1992. Annu. Rev. Ecol. Syst. 23, 449.
Rikvold, P.A., Sevim, V., 2007. Phys. Rev. E 75, 051920.
Rikvold, P.A., Zia, R.K.P., 2003. Phys. Rev. E 68, 031913.
Rikvold, P.A., 2007. J. Math. Biol. 55, 653.
Saitou, N., Imanishi, T., 1989. Mol. Biol. Evol. 6, 514.
Saitou, N., Nei, M., 1987. Mol. Biol. Evol. 4, 406.
Simpson, G.G., 1961. Principles of Animal Taxonomy. Columbia University Press, New York.
Solé, R., Montoya, J., 2001. Proc. R. Soc. B: Biol. Sci. 268, 2039.
Strimmer, K.S., 1997. Maximum Likelihood Methods in Molecular Phylogenetics. Herbert Utz Verlag, Munchen.
Tamura, K., Peterson, D., Peterson, N., Stecher, G., Nei, M., Kumar, S., 2011. Mol. Biol. Evol. 28, 2731.
