Advanced Review

TOPOFOLD, the designed modular biomolecular folds: polypeptide-based molecular origami nanostructures following the footsteps of DNA

Vid Kočar,1 Sabina Božič Abram,1 Tibor Doles,1,2 Nino Bašić,3 Helena Gradišar,1,2 Tomaž Pisanski3,4 and Roman Jerala1,2∗

Biopolymers, the essential components of life, are able to form many complex nanostructures, and proteins in particular are the material of choice for most cellular processes. Owing to numerous cooperative interactions, rational design of new protein folds remains extremely challenging. An alternative strategy is to design topofolds: nanostructures built from polypeptide arrays of interacting modules that define their topology. Over the course of the last several decades DNA has successfully been repurposed from its native role of information storage to a smart nanomaterial used for nanostructure self-assembly of almost any shape, which is largely because of its programmable nature. Unfortunately, polypeptides do not possess the straightforward complementarity of nucleic acids. However, a modular approach can nevertheless be used to assemble polypeptide nanostructures, as was recently demonstrated on a single-chain polypeptide tetrahedron. This review focuses on the current state-of-the-art in the field of topological polypeptide folds. It starts with a brief overview of the field of structural DNA and RNA nanotechnology, from which it draws parallels and possible directions of development for the emerging field of polypeptide-based nanotechnology. The principles of the topofold strategy and the unique properties of such polypeptide nanostructures in comparison to native protein folds are discussed. Reasons for the apparent absence of such folds in nature are also examined. Physicochemical versatility of amino acid residues and cost-effective production make polypeptides an attractive platform for designed functional bionanomaterials. © 2014 Wiley Periodicals, Inc.

How to cite this article:

WIREs Nanomed Nanobiotechnol 2014. doi: 10.1002/wnan.1289

∗ Correspondence to: [email protected]

1 Department of Biotechnology, National Institute of Chemistry, Ljubljana, Slovenia
2 Excellent NMR - Future Innovation for Sustainable Technologies Centre of Excellence, Ljubljana, Slovenia
3 Faculty of Mathematics and Physics, University of Ljubljana, Ljubljana, Slovenia
4 Faculty of Mathematics, Natural Sciences and Information Technologies, University of Primorska, Koper, Slovenia

Conflict of interest: The authors have declared no conflicts of interest for this article.

NATURAL BIOPOLYMER NANOSTRUCTURES

Biological systems are based on the organized exchange of components and energy as well as on structured biopolymers. With the exception of lipids, most biopolymeric assemblies are produced by the self-assembly of covalently linked monomers of amino acids, nucleotides, and monosaccharides. While polysaccharides are composed of monosaccharide units, their polymerization is accomplished by consecutive enzymatic reactions, often performed


wires.wiley.com/nanomed


[Figure 1: two bar charts covering the years 1993–2013. (a) Year growth of total protein PDB structures (vertical axis in thousands). (b) Year growth of unique protein folds by CATH.]

FIGURE 1 | Number of determined tertiary structures of proteins and number of different protein folds. (a) Growth in the number of released Protein Data Bank (PDB) structures per year over the last two decades, http://www.pdb.org/pdb/statistics/contentGrowthChart.do?content=total&seqid=100 (accessed March 28 2014) (dark gray: number of released PDB structures per year, light gray: total number of PDB structures). (b) Growth in the number of folds available in the PDB according to the CATH classification, http://www.pdb.org/pdb/statistics/contentGrowthChart.do?content=foldcath (accessed March 28 2014)1 (dark gray: number of new folds per year, light gray: total number of unique folds). The growth in the number of new protein folds has stopped in recent years.

in different cellular compartments, which limits the variability and versatility of their acquired three-dimensional (3D) shapes. Nucleic acid polymers, on the other hand, arise by replication of an existing linear polynucleotide sequence, which enables not only precise and scalable encoding of heritable information but also evolutionary variability through mutations, such as point mutations, indels, and large segment rearrangements. Polypeptides, the polymers of amino acids, are encoded by the linear sequence of nucleotides and produced by the complex cellular machinery that performs transcription and translation. The variability of monomeric units, 4 nucleotides in the case of DNA and 20 naturally occurring amino acids in proteins, allows for an extremely high combinatorial diversity that encodes a wide range of nanoscale structures and can host different functions within the defined 3D scaffold.

Nature, by and large, uses folded proteins as carriers of functional properties, such as catalysis, binding of small molecules, or interactions with other biopolymers, such as proteins, nucleic acids, carbohydrates, or membranes. Despite some native structural variability, the main function of nucleic acids is to store and transmit information encoded within the linear nucleotide sequence, in the context of a myriad of proteins involved in regulation, duplication, and translation of the nucleotide sequence, and even in their own synthesis.

Folding of proteins into versatile nanostructures is defined by a large number of weak cooperative short- and long-range interactions encoded by the amino acid sequence of polypeptides. A delicate balance between the entropic and enthalpic contributions results in relatively weak net stabilizing interactions. As a result of such a complex interplay of interactions, it is still very challenging to predict the protein tertiary structure from the primary structure based only on the aforementioned principles, despite the fact that all the information required by nature for the correct folding of proteins is encoded in the amino acid sequence itself. The problem of designing entirely new protein folds distinguishable from those observed in nature is, however, even more challenging.

Currently there are over 100,000 experimentally determined protein structures in public databases, which are classified into approximately 1300 different protein folds (Figure 1). Interestingly, the number of protein folds has plateaued in the last couple of years, even though more new protein structures are being determined each year than ever before. Although proteins that are difficult to crystallize may have eluded structure determination, this paucity of newly discovered protein folds indicates that nature seems to have utilized a limited number of folds. A scientific quest for designing novel protein folds might therefore shed new light on our current understanding of protein folding.

A quintessential problem of protein folding prediction is the apparent lack of straightforward relations between the amino acid sequence and its tertiary structure. In contrast, the structure of DNA is determined mainly by the complementarity of Watson–Crick base pairs. Somewhat less obvious sequence-structure relations are characteristic of RNA, which is transcribed from DNA by RNA polymerases in the form of single-stranded chains. RNA molecules, such as tRNA, subunits of ribosomes or aptamers (Figure 2), can also fold into complex and compact tertiary structures. Structures of aptamers, for example, are as difficult to predict as structures of the folded protein domains, as their folds are

also stabilized by the numerous weak interactions. The programmable nature of nucleic acids, however, led to the development of structural DNA and RNA nanotechnology based on modular pairwise interactions as opposed to natural compact folds (Figure 2), whose state-of-the-art currently by far precedes the use of alternative biopolymers, such as polypeptides, polysaccharides, or any other man-made nanomaterials. Watson–Crick base pairing greatly simplified folding predictions by reducing the problem to the sequence of nucleotides rather than using a computationally more expensive approach when considering all atoms. In the following section we address the current status of nucleic acid-based nanotechnology leading to inspiration and opportunities for polypeptide-based nanoassemblies. More comprehensive examinations of the field and its future challenges and perspectives have been published elsewhere in several reviews.4–6

[Figure 2: panels (a)–(d) arranged in columns labeled DNA and Protein and rows labeled Compact fold and Modular fold.]

FIGURE 2 | Compact native and modular topological folds of nucleic acids and proteins. Nucleic acids and proteins can form compact folded structures, stabilized by multiple interactions of monomeric units. Both polymers can also be engineered into topological folds defined by the pairwise interactions between modules. Examples of compact native structures and designed modular folds for nucleic acids and proteins are shown. The cobalamin riboswitch aptamer adopts a complex compact tertiary structure (4FRG) (a), stabilized by interactions, similar as the compact fold of a protein, in this example lysozyme (4I8S) (b). A DNA tetrahedron composed by the hierarchical self-assembly2 (c) and a single-chain polypeptide tetrahedron3 (d) with a hollow core have been designed by the modular approach.

DNA NANOSTRUCTURES: THE FOREFRONT OF NANOASSEMBLY

Nadrian Seeman pioneered the field of structural DNA nanotechnology over three decades ago with a proposal to use stably branched asymmetric Holliday junctions to prepare two-dimensional (2D) and 3D lattices.7 Despite the latter not being realized until only a few years ago, his initial proposal had broad and far-reaching consequences for a newly emerged field of research. Several strategies for DNA nanoassembly were reported over the years (Figure 3), according to which constructs can be categorized as:

1. ‘multi-strand assembly’ adopting structure only upon interaction with other strands, resulting in finite-sized 2D or 3D objects (Figure 3(a) and (b));

2. ‘hierarchical assembly’ composed of repeating structural units (modules) connected via sticky end or T-junction cohesion (Figure 3(c)–(e));

3. ‘scaffold-based assembly’ where up to several hundreds of oligonucleotides arrange around a long single-stranded DNA scaffold, where they act as staples that guide the folding of the scaffold into a 2D or 3D object (Figure 3(f)); and

4. ‘single-strand assembly’ whose structure relies on intramolecular interactions (Figure 3(g)).

[Figure 3: schematic panels grouped as multi-strand assembly (a, b), hierarchical assembly (c–e), scaffolded assembly (f), and single-strand assembly (g); annealing steps are marked ‘Anneal’.]

FIGURE 3 | Strategies for DNA nanoassembly. (a) Multi-strand assembly of two-dimensional (2D) structures using single-stranded tiles (SSTs).8 SSTs are color-coded to present pairwise associations between corresponding binding domains. This does not imply sequence identity. Each SST has a unique sequence and the final structure is fully addressable. Assembly of three-dimensional (3D) structures was also demonstrated using SSTs. (b) Multi-strand assembly of a DNA tetrahedron.2 (c–e) Structural motifs for hierarchical nanoassembly. (c) A paranemic crossover (PX)9 tile (left) and a double crossover (DX)10 tile (right). In contrast to PX tiles, DX tiles are topologically entangled. (d) Comparison of an asymmetric Holliday junction7 (left) and a branched four-point star motif (right), which offers higher rigidity at the point of branching. (e) A DNA tetrahedron11 assembled from three-point star motifs with sticky ends. (f) Scaffold-based assembly: short oligonucleotides are designed so as to guide the folding of a long single-stranded DNA into a predetermined planar or 3D object of nanodimensions.12 (g) Single-strand assembly approach relies exclusively on intramolecular bonding to guide the folding of a DNA molecule. Models presented in (a) and (c–f) were obtained from the library of parts using NanoEngineer-1 3D CAD software and the final images were processed using QuteMol13 molecular visualization system.

Multi-strand Assembly

The basic constituent building blocks of multi-strand assembly are short single-stranded oligonucleotides that comprise multiple binding domains, pairwise complementary to binding domains of other oligonucleotides in the designed structure. Although one can regard them as modular building blocks,8,14 they differ from the tiles described in the next section as they adopt no structural preferences on their own. They obtain the defined structure through Watson–Crick associations between matching binding domains. Thus, cohesion results in structure and vice versa.

The first multi-strand nanoassemblies emerged as 3D geometrical objects, where DNA helices and junctions were mapped to the edges and vertices of convex polyhedra, respectively.2,15,16 Such DNA cages were envisioned as transport vehicles for targeted drug delivery in nanomedicine.17 The initial ligation-closure experiments involving the synthesis of a DNA cube15 and a truncated DNA octahedron18 showed that branched junctions were flexible and therefore geometrically not well defined.15 This led to the realization that the resulting nanostructures could be described as topological rather than geometrical species, as in solution they could adopt multiple tertiary structures.5 In contrast, convex polyhedra whose faces are all triangles, such as a tetrahedron,2 a bipyramid,16 etc., exploit essentially the principle of tensional integrity.19 In such cases the structure’s geometric form is balanced by intrinsically opposing forces of rigid helices pushing outward in the axial direction and single-stranded linkers in DNA junctions pulling the ‘edges’ together.

Recently Yin and coworkers8,14 demonstrated a colossal scale-up in terms of size and complexity of multi-stranded designs. Using hundreds of synthetic 32-mer oligonucleotides constructed as concatenates of four 8-mer binding domains, they were able to produce over a hundred different 2D and 3D shapes. Because the four binding domains conceptually serve the same role as sticky ends in hierarchical assembly, the authors referred to the 32-mers as single-stranded tiles (SSTs). Owing to the unique primary (1°) structure of each constituent SST, different shapes are constructed by simple omission of SSTs from the one-pot assembly mix, akin to cutting out a certain figure from a piece of paper or carving a statue from a block of stone.
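As a rough illustration of the SST idea, the following Python sketch splits a 32-mer into four 8-nt binding domains, reports which domains of two tiles are Watson–Crick complementary, and selects a shape by omitting tiles from a canvas. The sequences, the helper names (`split_domains`, `pairing_domains`, `select_tiles`) and the toy 2 x 2 canvas are hypothetical and are not taken from the work of Yin and coworkers; real SST design also involves sequence optimization, which is ignored here.

```python
# Toy model of single-stranded tiles (SSTs); all sequences are made up.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the Watson-Crick partner of seq, read 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def split_domains(sst, domain_len=8):
    """Split a 32-mer into four consecutive 8-nt binding domains."""
    if len(sst) != 4 * domain_len:
        raise ValueError("expected four domains of equal length")
    return [sst[i:i + domain_len] for i in range(0, len(sst), domain_len)]


def pairing_domains(sst_x, sst_y):
    """List (i, j) such that domain i of x is complementary to domain j of y."""
    return [(i, j)
            for i, a in enumerate(split_domains(sst_x))
            for j, b in enumerate(split_domains(sst_y))
            if a == reverse_complement(b)]


def select_tiles(canvas, shape):
    """Shape by omission: keep only the tiles whose positions are in shape."""
    return {pos: name for pos, name in canvas.items() if pos in shape}


# Hypothetical tiles: domain 2 of tile_y is built to pair with domain 1 of tile_x.
tile_x = "ACGTACGA" + "GGATCCTA" + "TTGACCAG" + "CATGGTCA"
tile_y = "TTTTGGGG" + "CACACACA" + reverse_complement("GGATCCTA") + "GAGAGAGA"
print(pairing_domains(tile_x, tile_y))

# A 2 x 2 canvas of tiles; leaving out one position yields an L-shaped subset.
canvas = {(0, 0): "t00", (0, 1): "t01", (1, 0): "t10", (1, 1): "t11"}
print(sorted(select_tiles(canvas, {(0, 0), (1, 0), (1, 1)}).values()))
```

The point of the sketch is that every tile is addressable by its unique sequence, so a shape is specified simply by the set of tiles that are included in the mix.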

Hierarchical Assembly

The main telltale characteristic of hierarchical nanoassemblies is the presence of two nonoverlapping transitions in the temperature profile obtained by temperature ramping, as apparent from the work of Winfree and coworkers.20 This reflects a seemingly two-step process in the self-assembly of hierarchical designs (Figure 4). The high-temperature transition (or multiple partially overlapping transitions in the case of structurally more complex modules21) corresponds to the assembly of oligonucleotides into topologically predefined modules, such as tiles, helical bundles or branched tiles, while the low-temperature transition corresponds to higher-order associations between modules through sticky end22 or T-junction23 hydrogen bonding. Thus, the former and the latter might be defined as a structure- and a cohesion-dependent transition phase, respectively. Owing to the nonoverlapping temperature windows of these transitions, one could easily select a step-wise or hierarchical nanoassembly approach, where the building blocks (modules) are annealed separately at higher temperatures and then combined to form larger assemblies via annealing at lower temperatures,24 instead of the more popular one-pot approach.20,21

[Figure 4: plot of 260-nm absorbance versus temperature (°C; axis ticks at 30, 50, and 70) during annealing, with three marked stages: 1, denatured state; 2, formation of modules; 3, sticky ends.]

FIGURE 4 | Characteristic temperature profile of hierarchical DNA nanoassemblies. The most distinguishable feature of hierarchical approaches is the bimodal nature of the first derivative of the temperature profile obtained by thermal annealing. This reflects two thermodynamically uncoupled processes: sequence-dependent binding of oligonucleotides into structurally defined modules, such as the three-point star motif (step 2) and their association into more complex assemblies by means of sticky end (step 3) or T-junction hydrogen bonding.

Following the initial successes using double crossover tiles to prepare 2D lattices,10 the molecular toolbox of available DNA modules expanded greatly in terms of structural diversity and complexity. Several different tiles based on the formation of multiple crossovers between parallel helices were reported,9,25 flexible DNA junctions were upgraded into reinforced and much sturdier branched tiles with different orders of connectivity,24,26 and helical bundles with varying numbers of helices were used to form DNA nanowires,27 to name a few. Additionally, DNA wires and lattices with varying surface patterns and cavities of different shapes and sizes were reported to successfully organize proteins24 and metallic nanoparticles.28 However, such potentially infinite lattices were usually limited to under several micrometers in dimension and were prone to structural errors, which arose from stoichiometric imbalances between constituent oligonucleotides and mispriming events during cohesion.4,5

In addition to the construction of 2D arrays, hierarchical nanoassemblies consisting of branched tiles were used to prepare objects that can be mapped to the edges of polyhedral structures like an icosahedron,29 tetrahedron,11 dodecahedron,11 and a buckyball.11 Studies of the influence of single-stranded poly(dT) linker lengths on geometric properties of polyhedra and their one-step assembly yields have shown that an optimal linker length must be found for each particular DNA nanostructure. The group of Chengde Mao showed that the linker length in branched tiles is intimately related to structural rigidity, which in turn tips the scale toward preferential assembly of a particular nanostructure. The imposed rigidity of the three-point branched tile using three-nucleotide poly(dT) linkers led to the assembly of nanostructures with larger polyhedral angles, such as dodecahedrons and buckyballs, in contrast to the formation of tetrahedrons when five-nucleotide linkers were used.11 In the case of five-point branched tiles, the change of only one base in the length of the linker (from five to four nucleotides) meant the difference between successful assembly of the icosahedron and the formation of 2D DNA pseudo-crystals.29 Molecular dynamics (MD) simulations have shown that longer linkers are able to explore more conformations within the structure and allow larger rotational adjustments along the edges in order to prevent deformation of hydrogen bonds at the margins of dsDNA edges.30
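The two-transition temperature profile that characterizes hierarchical assembly (Figure 4) can be mimicked numerically. The sketch below sums two independent two-state sigmoids and locates the maxima of the first derivative. The sigmoid model, the function names, and all parameter values (midpoints near 35 °C and 60 °C, widths, amplitudes) are assumptions chosen for illustration; they are not fitted to any data from the cited studies.

```python
import math


def fraction_melted(temp_c, tm_c, width_c):
    """Two-state sigmoid: fraction of a structure melted at temp_c."""
    return 1.0 / (1.0 + math.exp(-(temp_c - tm_c) / width_c))


def absorbance(temp_c, transitions):
    """Sum the contributions of independent transitions.

    transitions is a list of (tm_c, width_c, amplitude) tuples.
    """
    return sum(amp * fraction_melted(temp_c, tm, width)
               for tm, width, amp in transitions)


def derivative_peaks(transitions, t_min=20.0, t_max=90.0, step=0.1):
    """Return temperatures where dA/dT has a pronounced local maximum."""
    n = int(round((t_max - t_min) / step)) + 1
    temps = [t_min + i * step for i in range(n)]
    signal = [absorbance(t, transitions) for t in temps]
    # Central finite difference; deriv[k] belongs to temps[k + 1].
    deriv = [(signal[k + 2] - signal[k]) / (2 * step) for k in range(n - 2)]
    floor = 0.05 * max(deriv)  # ignore numerical ripples in the flat tails
    return [round(temps[k + 1], 1)
            for k in range(1, len(deriv) - 1)
            if deriv[k] > floor and deriv[k - 1] < deriv[k] >= deriv[k + 1]]


# Hypothetical parameters: cohesion melts near 35 C, modules near 60 C.
profile = [(35.0, 1.5, 0.4), (60.0, 2.0, 1.0)]
peaks = derivative_peaks(profile)
print(peaks)
```

Because the two midpoints are far apart relative to the transition widths, the derivative shows two separate peaks, which is the numerical counterpart of annealing modules first and joining them at a lower temperature.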

Scaffold-Based Assembly

The emergence of scaffold-based approaches fueled an immense expansion of DNA nanotechnology over the last decade. Although Yan et al.31 were the first to prepare a finite-sized 2D DNA lattice constructed from tiles assembled along a single-stranded DNA molecule, which served as a guide for position-specific incorporation of oligonucleotides, it was only after Rothemund’s paper on what later became known as DNA origami12 that this strategy became the prevailing one in structural DNA nanotechnology. The DNA origami technique enabled efficient one-pot assembly of a long single-stranded scaffold using approximately 200 in silico-designed oligonucleotides to guide its folding into a preselected 2D shape. The most characteristic traits of the DNA origami technique, which made it ‘the next best thing’ in structural DNA nanotechnology since the inception of the field itself,7 are the previously unimaginable high yields of self-assembly (over 90% for 2D structures), high structural fidelity with few errors, high tolerance to stoichiometric imbalances (it works well for a 2- to 10-fold excess of oligonucleotides over the scaffold strand) and enzymatic stability.12,32 Position-specific decorations of fully addressable DNA origami structures were presented,33 and several design tweaks expanded the scaffold-based assembly into the third dimension.34,35 More detailed overviews of the DNA origami technique, its impact and numerous applications have been published in several other reviews.36,37
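The figure of approximately 200 staples can be rationalized with simple arithmetic. The sketch below assumes a 7249-nt scaffold (the length of the M13mp18 genome) and 32-nt staples; neither number is stated in this review, and both are used only for illustration, as are the function names.

```python
def staple_count(scaffold_nt, staple_nt):
    """Number of staples needed to pair with (almost) the whole scaffold."""
    return scaffold_nt // staple_nt


def staple_amount_pmol(scaffold_pmol, n_staples, excess):
    """Total staple material for a one-pot mix at a given excess per staple."""
    return scaffold_pmol * excess * n_staples


# Assumed values: 7249-nt scaffold, 32-nt staples.
n = staple_count(7249, 32)
print(n)

# Total staple amount per 1 pmol of scaffold at 2- and 10-fold excess.
for excess in (2, 10):
    print(excess, staple_amount_pmol(1.0, n, excess))
```

Under these assumptions the estimate lands slightly above 200 staples, in line with the order of magnitude quoted above.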

Single-Strand Assembly

It was suggested that polyhedra be designed as long single strands of DNA that fold by means of intramolecular hydrogen bonding, which would enable replication using PCR or in vivo amplification. This approach was first attempted by Joyce and coworkers,38 who constructed a 1.7-kb single-stranded DNA that folded into an octahedron. Although the authors used five 40-mer oligonucleotides to assist with the assembly, which amounts to approximately 25% of base pairs present in the final structure, the majority of pairing arose intramolecularly via paranemic crossovers, resulting in an octahedron with double helical edges. By definition, paranemic crossovers do not involve intertwining of DNA strands; therefore, neither knots nor catenanes were formed.38

The first discrete single-stranded DNA nanostructure with exclusively intramolecular interactions was the tetrahedron constructed by the group of Yan.39 Because of topological restrictions imposed by the antiparallel nature of DNA, their design consisted of five edges composed of single DNA helices and a particular double helical edge. This approach was necessary, considering that a topology of a completely antiparallel stable double trace of a tetrahedron does not exist (see the mathematical analysis below). In contrast, at least one topological solution for a triangular prism with all antiparallel edges does exist, as was demonstrated by He et al.40 Both examples show that long single strands of DNA are able to escape kinetic traps during folding in order to self-assemble into complex 3D structures, such as sequence-encoded polyhedra, which raises the question whether this procedure could be applied to the construction of other polyhedral DNA cages as well.

Because of entirely intramolecular associations, the single-stranded assembly approach eliminates experimental errors related to stoichiometric imbalances. However, this fact alone does not guarantee higher yields because of the more complex folding. Thus, one should remain mindful of the possible folding pathways with local energetic minima acting as kinetic traps while designing nucleic acid or polynucleotide sequences. Considering the faster and simpler synthesis of DNA, single-stranded DNA nanoassembly might serve as a prototyping tool for polypeptide nanoassembly in terms of topological studies and the effects of topology on folding. One major drawback of such an analysis might be the topological restrictions imposed by the exclusively antiparallel orientation of DNA strands in the double helix, which means that only a subset of all possible topological solutions could be probed this way. Nevertheless, it could still capture some important features of the modular polypeptide design.
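The topological constraint mentioned above can be explored with a small exhaustive search. The sketch below looks for a closed walk on a polyhedral graph that traverses every edge exactly once in each direction (a fully antiparallel double trace) and never makes an immediate U-turn. Forbidding U-turns is used here as a simplified stand-in for the stability condition; that simplification, the function name, and the vertex labeling are assumptions of this sketch rather than the formal definitions used in the mathematical analysis.

```python
def antiparallel_double_trace(edges):
    """Search for a closed walk using each edge once per direction, no U-turns.

    Returns the walk as a list of vertices (first == last), or None.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    n_arcs = 2 * len(edges)
    start = edges[0][0]
    used = set()      # directed arcs already traversed
    walk = [start]

    def extend(prev, cur):
        if len(used) == n_arcs:
            # The walk must close, and closing must not create a U-turn.
            return cur == start and walk[1] != prev
        for nxt in adj[cur]:
            if nxt == prev or (cur, nxt) in used:
                continue
            used.add((cur, nxt))
            walk.append(nxt)
            if extend(cur, nxt):
                return True
            walk.pop()
            used.remove((cur, nxt))
        return False

    return walk if extend(None, start) else None


# Tetrahedron (complete graph on 4 vertices) and triangular prism.
tetrahedron = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
prism = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5),
         (0, 3), (1, 4), (2, 5)]

print(antiparallel_double_trace(tetrahedron))
print(antiparallel_double_trace(prism))
```

Under these assumptions the search finds no solution for the tetrahedron but does find one for the triangular prism, consistent with the statements above. The graphs are small enough for brute force; larger polyhedra would need a more careful algorithm.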

RNA: More Than a Double Helix

RNA emerged as an alternative to DNA in nanotechnology applications because of its unique structural and physicochemical properties. The sugar moiety in RNA differs from that of DNA by an additional 2′-hydroxyl group. This forces RNA to adopt an A-type conformation of the double helix, which results in increased thermal stability. Still, the presence of the 2′-hydroxyl makes RNA more chemically labile and prone to hydrolysis. RNA is also easily degraded by ribonucleases that are ubiquitously present in the environment, which heightens the need for good laboratory practice. Yet, there are many advantages of using RNA instead of DNA. In principle, RNA follows base-pairing rules similar to those of DNA, leading to an antiparallel double helix with A:U and G:C hydrogen bonding. The letter U stands here for uracil, a demethylated thymine analog, which is used by RNA as one of the four nucleobases. Even so, in comparison to DNA, many more conformations are accessible to RNA, which can be the result of intramolecular folding, a larger tolerance for mismatched duplexes as well as numerous posttranscriptional modifications of bases, adding diversity to the final structure. More than 200 distinct folds6 were estimated for RNA molecules, which opens up opportunities for nanotechnology applications.

The observation that large natural RNA molecules consist of mosaic structures and are able to retain their form and function even after the re-association of disjointed parts gave rise to the architectonic approach, which forms the basis of RNA nanotechnology. RNA architectonics relies on combinatorial assembly of tecto-RNA units available in vast libraries of structural and interacting motifs. These were obtained through data mining of existing NMR and X-ray structures of natural molecules as well as via in silico design. Structural motifs are RNA folds with given geometrical properties, such as the RA-motif41 that forms a 90° angle or a four-way junction.42 Structural variety of RNA nanoassemblies using tecto-RNA is thus limited by the geometric properties of currently available parts.

An alternative approach to RNA nanodesign is similar to DNA nanotechnology approaches. Here we start with a distinct 3D structure and optimize the primary structure using positive and negative design in order to accomplish the desired folding. This is contrary to the usual approaches associated with the protein or RNA folding problems where we start from an amino acid or ribonucleotide sequence. Jaeger and coworkers43 successfully demonstrated construction of an RNA cube built from several strands of artificially designed RNA. Such control is possible only with in silico support for RNA folding predictions, which remains quite challenging owing to the more promiscuous nature of RNA in terms of base complementarity.
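The looser complementarity of RNA can be illustrated by counting base pairs in an antiparallel duplex with and without non-canonical pairs. The sketch below treats the G:U wobble pair as the only non-canonical pair; the review does not name specific non-canonical pairs, so that choice, the function name, and the two short sequences are assumptions made for illustration.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}  # assumed non-canonical pair for this sketch


def paired_positions(strand, partner, allow_wobble=False):
    """Count base pairs in an antiparallel duplex of two equal-length strands.

    Both strands are written 5'->3', so the partner is read in reverse.
    """
    if len(strand) != len(partner):
        raise ValueError("strands must have equal length")
    allowed = WATSON_CRICK | WOBBLE if allow_wobble else WATSON_CRICK
    return sum((x, y) in allowed for x, y in zip(strand, reversed(partner)))


# Hypothetical 5-mers: the partner pairs perfectly except for one G:U position.
strand = "GGAUC"
partner = "GAUCU"
print(paired_positions(strand, partner))
print(paired_positions(strand, partner, allow_wobble=True))
```

Allowing extra pairs increases the number of sequences that can bind a given strand, which is one simple way to see why orthogonal RNA modules are harder to design than DNA modules.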

Challenges for Nucleic Acid Nanotechnology

An important obstacle toward broader application and industrial use of DNA nanotechnology is the cost of DNA synthesis. Many of the aforementioned approaches rely on the assembly of short oligonucleotides, nowadays mostly obtained through solid-phase chemical synthesis,44 which ipso facto limits the amount of material obtained at low costs. The proof of concept for the potentially inexpensive in vivo synthesis of complex DNA nanostructures has been published by Joyce and coworkers.38 Their approach resulted in double-stranded amplification, so an additional step of obtaining single-stranded DNA was unavoidable. Yan and coworkers demonstrated that single-stranded concatenates incorporating simple nanostructures, such as the paranemic crossover motif (PX)45,46 and a four-arm junction,46 are readily amplified by means of rolling circle amplification45 or the use of phagemids for in vivo replication.46 To further demonstrate the potential for replication of more complex DNA nanostructures, they designed and successfully replicated in vivo a 286-nucleotide-long single-stranded chain that folds into a nanoscale tetrahedron.39 The reported yield was 50 pmol of the tetrahedron from 250 mL of bacterial culture, which amounts to just less than 20 ng of material per mL of culture.

Recent experiments by Sobczak et al.47 demonstrated narrow transition-temperature ranges for folding scaffolded DNA objects, which seems to corroborate the notion that such folding is cooperative.12 Additionally, they established a protocol for isothermal refolding at a fixed temperature chosen from the lower boundary of the transition-temperature range for a particular DNA origami structure.47 This finding opens up possibilities for in vivo DNA origami assembly; however, expressing hundreds of oligonucleotides (e.g., by means of reverse transcription) inside a single cell to achieve in vivo assembly for biological applications or to reduce the costs of synthesis (over $1000 per DNA origami design) is a daunting task that has yet to be undertaken.

An additional disadvantage of DNA nanostructures is the lack of the range of functionalities available to amino acids, which could be ameliorated by the incorporation of different chemical groups or even protein domains into the DNA nanostructure scaffold. Most types of DNA assembly allow selected functionalities, conjugated or non-covalently bound to the oligonucleotides, to be introduced into the DNA nanostructures at a resolution of a few nanometers.24,33
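The yield quoted above for the in vivo replicated tetrahedron can be checked with a short calculation. The sketch assumes an average mass of 330 g/mol per nucleotide of single-stranded DNA, a common rough estimate that is not given in this review; the function name is likewise hypothetical.

```python
AVG_NT_MASS = 330.0  # g/mol per ssDNA nucleotide; rough estimate (assumption)


def yield_ng_per_ml(amount_pmol, length_nt, culture_ml, nt_mass=AVG_NT_MASS):
    """Convert a molar yield of an ssDNA strand into ng per mL of culture."""
    molar_mass = length_nt * nt_mass                    # g/mol
    mass_ng = amount_pmol * 1e-12 * molar_mass * 1e9    # pmol -> mol -> g -> ng
    return mass_ng / culture_ml


# 50 pmol of a 286-nt strand from 250 mL of culture, as reported above.
value = yield_ng_per_ml(50, 286, 250)
print(round(value, 1))
```

With this assumed nucleotide mass the result is roughly 19 ng per mL, which agrees with the statement of just less than 20 ng per mL of culture.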

DESIGNED PROTEIN SELF-ASSEMBLIES

Proteins play an indispensable role in catalysis, molecular recognition, cellular interactions, cytoskeletal structure, and many other processes. The protein folding problem, even in a simplified version, such as sequence threading or the reduced hydrophilic-hydrophobic interaction simplification, represents an NP-complete problem,48 which means that reducing the number of interacting elements has a very strong effect on the computational complexity of the problem. Challenges in the design of new protein folds, stabilized by the same type of packing found in natural proteins, led to the consideration of designing protein folds based on interacting modules and to the exploration of alternative protein folding approaches. Reduction of the folding problem to the interaction of only several dozen elements would greatly simplify the task of structural design. Such a simplification can be achieved by the construction of folds composed of orthogonal pairwise interacting segments. The key to this approach is sufficient orthogonality of the building modules. This problem is solved quite easily for nucleic acids, where the orthogonality between modules of nucleotide sequences is defined simply by Watson–Crick base pairing rules.

Natural protein structures are modular at the level of folding domains. This modularity is an important driving force of protein evolution through combinations of independently folding domains. Larger polypeptide-based assemblies rely mainly on non-covalent interactions between protein subunits, including oligomers as well as larger polymers, e.g., cytoskeletal protein assemblies or molecular machines. The first strategies for designed protein nanostructures relied on the interaction of linked natural protein oligomerizing domains. Several proteins composed of two smaller protein domains with different oligomerization states can self-assemble non-covalently into discrete assemblies or protein lattices.49–52 Further designed nanostructures were prepared by engineering surfaces of native protein domains for specific non-covalent interactions and were analyzed in recent reviews.53–55 Structures of many pairwise interacting protein domains are available; however, folded protein domains are relatively large and therefore less suitable for the design of complex nanostructures as they occupy most of the space within the designed cages. This type of assembly is also heavily dependent on symmetry,50,54 although advanced protein modeling could also support construction of asymmetric assemblies.
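To give a feel for why even the reduced hydrophilic-hydrophobic description of folding becomes expensive, the sketch below exhaustively folds a short sequence on a 2D square lattice. It assumes the common HP convention of an energy of −1 for every pair of hydrophobic (H) residues that are lattice neighbors but not chain neighbors; the lattice, the scoring, the function name, and the example sequences are assumptions of this sketch and are not taken from the cited work.

```python
def fold_hp(seq):
    """Enumerate all self-avoiding walks for seq on the square lattice.

    Returns (lowest energy found, number of conformations examined).
    """
    moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    best = {"energy": 0, "count": 0}

    def contacts(path):
        """Count H-H lattice contacts between residues not adjacent in chain."""
        index = {p: i for i, p in enumerate(path)}
        n = 0
        for i, (x, y) in enumerate(path):
            if seq[i] != "H":
                continue
            for dx, dy in moves:
                j = index.get((x + dx, y + dy))
                if j is not None and j > i + 1 and seq[j] == "H":
                    n += 1
        return n

    def grow(path):
        if len(path) == len(seq):
            best["count"] += 1
            best["energy"] = min(best["energy"], -contacts(path))
            return
        x, y = path[-1]
        for dx, dy in moves:
            nxt = (x + dx, y + dy)
            if nxt not in path:
                grow(path + [nxt])

    grow([(0, 0)])
    return best["energy"], best["count"]


# The number of conformations grows quickly with chain length.
for seq in ("HPPH", "HPHPPH", "HPHPPHHPHH"):
    print(len(seq), fold_hp(seq))
```

The number of conformations grows roughly geometrically with chain length, so brute force is hopeless for real proteins, whereas a fold specified by a few dozen pairwise interacting modules has far fewer elements to arrange.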

Design of Orthogonal Building Blocks for Protein Self-Assemblies

Single polypeptide chains are potentially more suitable as building blocks compared with much larger interacting folded domains. In the most basic case single peptides forming 𝛼-helices or 𝛽-strands have been designed to self-assemble into globular or fibrous structures.56–59 One of the most suitable modular building elements, which can facilitate the rational design of new protein folds, is the coiled-coil motif. Coiled-coils are found in approximately 8% of naturally occurring proteins. Well understood rules for the formation of such structural elements and their substantial selectivity render coiled-coils an invaluable tool in the design of novel protein structures. Coiled-coils are composed of two or more helices that are wound around each other in either parallel or antiparallel orientation. They are characterized by a periodic heptad repeat with residue positions labeled as abcdefg (Figure 5(a)). Each heptad spans two helical turns measuring approximately 1 nm in length, which provides the variability in the length of each building module. Association of coiled-coil chains is governed by hydrophobic interactions between amino acids in positions a and d, resulting in a hydrophobic core, and by electrostatic interactions among oppositely charged residues at positions e and g or e and e′ (and g and g′), depending on the orientation. The specificity of interactions between heptads somewhat resembles base-pair specificity in DNA. The advantage of coiled-coil dimers in comparison to the DNA duplex is their ability to oligomerize in either parallel or antiparallel orientation, offering distinct benefits for structural design.

The rules for coiled-coil formation, their oligomerization state, and interaction partner specificity have already been well established,60,61 resulting in the ability to design de novo coiled-coils.62–64 Application of the combinatorial pattern of electrostatic interactions and negative design-based hydrophobic interactions allows for an extension of the set of designed peptides beyond natural orthologous pairs, providing a valuable toolkit for designed polypeptide-based self-assembly.65,66 Other peptide-specific interacting structural elements, such as 𝛽-strands and hairpins in particular, could offer an alternative type of building elements for future polypeptide nanostructures. However, the current lack of knowledge to efficiently predict novel orthogonal interacting segments of this type renders their use less appropriate for nanoassembly of complex and geometrically well-defined nanostructures that require many such orthogonal building elements.

Multiple-Module Polypeptide Assembly

Association of coiled-coils can lead to the formation of nanofibers,67–69 membranes,70 nanotubes,71,72 nanostructured films,73 spherical structures,74 and responsive hydrogels.75,76 Linearly connected, interacting coiled-coil domains form fibrils based on staggered overlapping interactions. On the basis of principles similar to those of the tethered oligomerizing folded domains, linked coiled-coil forming pairs were used as building elements. Formation of 2D or 3D structures requires at least three interacting partners. Tethering of dimerizing to trimerizing coiled-coil peptides led to the assembly of spherical cages.77 The oligomerization of building block units with a small intrinsic curvature resulted in a hexagonal lattice that closed to form unilamellar spherical cage-like particles measuring around 100 nm in diameter. The size of the cage-like particles could be controlled by altering individual coiled-coils, demonstrating opportunities to vary properties of assemblies by modifying the characteristics of modular building blocks.

FIGURE 5 | Coiled-coil dimer-based design of topological modular protein folds. (a) Interactions underlying the stability and specificity of a coiled-coil dimer. The positions of seven amino acid residues (heptad repeat) are denoted by abcdefg. Positions ‘a’ and ‘d’ are typically occupied by hydrophobic residues, forming a hydrophobic core. Positions ‘e’ and ‘g’ are frequently occupied by charged residues that participate in the interhelical electrostatic interactions. Positions ‘b’, ‘c’, and ‘f’ can be chemically modified to introduce the desired function into the coiled-coil assembly. The specificity and orthogonality of the desired coiled-coil combination can be improved by negative design, by introducing polar asparagine at the ‘a’ position that most favorably interacts with another Asn at the opposing chain, in order to maximize the difference between the designed and unwanted chain pairing. (b) Self-assembly of a polypeptide fold based on the concatenated coiled-coil-forming segments (green, yellow, violet). Topological fold from a single-polypeptide chain relies on the specific interactions between concatenated segments to pair in a selected orientation with their complementary interacting segments within the same chain. The topology of the self-assembled polypeptide chain is defined by the orientation and sequential arrangement of each coiled-coil pair.
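The heptad register described for Figure 5(a) can be made concrete with a short Python sketch. It assumes, purely for illustration, that a segment starts at position a, and it uses a made-up 28-residue sequence (IEALKQE repeated four times) that is not taken from this review; the function names are likewise hypothetical.

```python
HEPTAD = "abcdefg"

def heptad_register(sequence, start="a"):
    """Pair every residue with its heptad position, beginning at `start`."""
    offset = HEPTAD.index(start)
    return [(res, HEPTAD[(offset + i) % 7]) for i, res in enumerate(sequence)]

def positions(sequence, wanted, start="a"):
    """Return the residues that fall on the heptad positions listed in `wanted`."""
    return "".join(res for res, pos in heptad_register(sequence, start)
                   if pos in wanted)

# Hypothetical four-heptad segment, for illustration only.
segment = "IEALKQE" * 4

core = positions(segment, "ad")      # residues expected in the hydrophobic core
charged = positions(segment, "eg")   # residues available for electrostatic pairing
n_heptads = len(segment) // 7

print(core, charged, n_heptads)
```

In this toy sequence the a and d positions carry Ile and Leu and the e and g positions carry Lys and Glu, which mirrors the division of roles described in the caption.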

Topological Fold from a Single-Polypeptide Chain

While the oligomerizing peptide design strategy offers some distinct advantages, mainly because of its simplicity in design and synthesis, it is limited to symmetric structures. An important drawback is also the lower stability of modularly assembled structures compared with single-chain proteins. This is owing to the entropic cost associated with the intermolecular assembly of a unique 3D structure from several independent subunits. Intermolecular assemblies are concentration-dependent and produce stable structures only at relatively high concentrations, as opposed to folding of single-chain protein sequences.78 Self-assembly from a single polypeptide chain also supports formation of the most homogeneous particles.

Recently a modular approach was applied to the design of polypeptide-based nanostructures based on a single chain of self-interacting coiled-coil forming peptides. Intramolecular folding of modular interacting chains provides for a new strategy of polypeptide assembly which requires at least six coiled-coil forming segments (three pairs) to form 2D (Figure 5(b)) and 12 segments (six interacting pairs) (Figure 6(a) and (b)) to form 3D polypeptide folds. In this fold, the rigid edges of the polyhedron are formed from the dimers of the two interacting segments. Tension between the edges in combination with the topological arrangement of interacting segments stabilizes the tertiary structure. Gradisar et al.3 demonstrated the formation of a polypeptide nanoscale tetrahedron that self-assembles from a single polypeptide chain comprising 12 concatenated coiled-coil–forming segments connected by flexible peptide linkers (Figure 6). Although the building block segments do not form a regular structure before they dimerize with interacting segments, the chain can be viewed as rigid segments with flexible peptide linkers that act as hinges. This approach relies on the ability of those segments to pair in the defined orientation with their complementary interacting segments within the same polypeptide chain. The selected shape of the polyhedral cage is deconstructed into the edges composed of orthogonal coiled-coil dimers followed by threading the polypeptide chain through all the coiled-coil forming segments in a single path, interlocking the chain into a stable structure. The topology of the self-assembled polypeptide chain is defined by the pairing orientation of each coiled-coil pair and their precise sequential arrangement. Since the beginning and the end of the polypeptide encompassing the polyhedron coincide at the same vertex, the polypeptide chain can be cyclically permuted. All single-chain paths have to start and end in the same vertex, which has been used to demonstrate the correct folding by the reconstitution of the split fluorescent protein.3

FIGURE 6 | Topological design of the self-assembling polypeptide tetrahedron. (a) Twelve coiled-coil-forming peptide segments (marked as arrows) were concatenated in a defined order, connected by the tetrapeptide linker segments (SerGlyProGly, blue circle). Four parallel and two antiparallel orthogonal peptide pairs were used to construct a tetrahedron-forming polypeptide chain based on the publication Gradisar et al.3 (b) A schematic representation of the polypeptide path forming tetrahedron. (c) Molecular model of the tetrahedral fold. The edges are formed by orthogonal coiled-coil pairs. (d) Tetrahedral particles visualized by transmission electron microscopy (TEM). Samples of self-assembled polypeptide were negatively stained after Ni-NTA-coated nanogold beads were bound to one vertex of the tetrahedron via the hexahistidine tag. The representative tetrahedron-like structures from TEM image and projections of a tetrahedron model are presented.

In case of a tetrahedron, graph theory analysis showed that it cannot be constructed exclusively from either parallel or antiparallel dimeric segments but requires a combination of both (Figure 7). Theory suggests that the tetrahedron could self-assemble from a single-polypeptide chain in the topology of three distinct topomeres built either from three parallel and three antiparallel, or four parallel and two antiparallel coiled-coil pairs. The topological solution comprising four parallel and two antiparallel coiled-coil dimeric edges was selected for the experimental realization (Figure 6). Three of the parallel pairs were selected from the designed orthogonal coiled-coil forming set,65 each composed of four heptads and designed based on the known coiled-coil stability and selectivity principles. In addition to the designed parallel hetero-dimers, one parallel homodimer based on the natural GCN479 and two antiparallel homodimers62,80 were used. The tetrapeptide Ser-Gly-Pro-Gly was successful as the flexible linker to connect the consecutive coiled-coil forming segments. The purpose of this linker is to prevent the continuity of the helix across two segments and to provide the flexibility needed for the formation of edges, with three of the linkers meeting at each vertex.

FIGURE 7 | Mathematical topological solutions of self-assembling tetrahedron from a single chain. Three distinct topomeres built either from four parallel and two antiparallel (a) or three parallel and three antiparallel coiled-coil pairs (b, c) are possible.

The annealed tetrahedral polypeptide had substantially greater stability than its coiled-coil segment constituents, with a midpoint of unfolding at 3 M GdnHCl, as expected because of the cooperativity. This topological fold comprised a large cavity within the folded polypeptide cage, which might be useful for different applications, such as cargo encapsulation or formation of artificial nanocompartments for catalysis (Figure 8). The designed polypeptide tetrahedron has edges comprised of four heptad repeats, which resulted in edges approximately 5 nm in length (Figure 6(c)), as confirmed by dynamic light scattering (DLS) analysis and atomic force microscopy (AFM) and transmission electron microscopy (TEM) imaging. Nanogold beads or other protein domains can be attached to the tetrahedron at the vertex, allowing the size of the tetrahedron to be additionally estimated by electron microscopy of the coupled nanogold beads (Figure 6(d)).

FIGURE 8 | Comparison of the distribution of hydrophobic residues in a natural protein and in a topological protein fold. (a) In the majority of natural protein folds, a single hydrophobic core stabilizes the protein fold. In contrast to native proteins, polypeptide topofolds do not have a discrete hydrophobic core since their structure is defined by the topological arrangement of interacting segments. (b) A large cavity can be observed in the core of such proteins.

Reshaping of the internal cavity volume could be achieved in two ways: (1) by increasing the length of coiled-coil segments that form polyhedral edges or (2) by increasing the number of faces via building more complex polyhedra. The former would require the use of coiled-coil building blocks extended by one or more heptad repeats, while the latter requires an increased number of available orthogonal coiled-coil pairs and has yet to be explored. Expansion of the complexity of polypeptide polyhedra raises the need for a more comprehensive topological scrutiny. Supercoiling of longer coiled-coils, for example, might lead to the formation of topological knots. Further efforts therefore should be dedicated toward higher designability of TOPOFOLD structures through selection of optimal stable double traces and intelligent intramolecular pairwise rearrangements with respect to coiled-coil stabilities in order to avoid kinetic or topological traps that can compete with the anticipated structure.

MATHEMATICAL ANALYSIS OF SINGLE-CHAIN TOPOLOGICAL FOLDS

Design of polyhedral shapes from the natural modular biopolymeric elements using the reduced representation in the form of concatenated rigid elements with flexible hinges can be supported by different mathematical tools to analyze the number and different properties of the possible paths. From the topological point of view, arbitrary polyhedra with dimeric modules mapped to their edges can in principle always be realized by a single chain of concatenated dimerizing segments with a defined order of interacting segments. This holds true as long as we have sufficient orthogonal parallel and antiparallel modules available for construction. This mathematical proof, however, does not mean that highly complex topological polypeptide folds could necessarily be implemented in reality. By contrast, the number of theoretically achievable single-chain nucleic acid-based polyhedra is limited by the fact that DNA can only form antiparallel and not parallel dimers.
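One way to see why such a single chain exists is a standard graph-theoretic fact: if every edge of a connected polyhedral graph is doubled, all vertex degrees become even, so a closed walk that uses each edge exactly twice exists. The Python sketch below finds one such walk for the tetrahedron with Hierholzer's algorithm and labels each edge as parallel or antiparallel from its two traversal directions. It is an illustration only: the vertex numbering and function names are hypothetical, the walk is not checked against the additional requirements this review places on topofolds (no strand crossing and a single circuit around each vertex), and it is not the enumeration procedure used by the authors.

```python
from collections import defaultdict

def double_trace(edges, start):
    """Closed walk from `start` that uses every edge exactly twice."""
    adj = defaultdict(list)   # vertex -> list of (neighbour, edge-copy id)
    used = []                 # used[i] becomes True once edge copy i is walked
    for u, v in edges:
        for _ in range(2):    # double every edge
            copy_id = len(used)
            used.append(False)
            adj[u].append((v, copy_id))
            adj[v].append((u, copy_id))
    stack, walk = [start], []
    while stack:              # iterative Hierholzer
        v = stack[-1]
        while adj[v] and used[adj[v][-1][1]]:
            adj[v].pop()      # discard copies already walked from the other end
        if adj[v]:
            w, copy_id = adj[v].pop()
            used[copy_id] = True
            stack.append(w)
        else:
            walk.append(stack.pop())
    return walk[::-1]

def classify_edges(walk):
    """Label each edge by whether its two traversals run in the same direction."""
    directions = defaultdict(list)
    for u, v in zip(walk, walk[1:]):
        directions[frozenset((u, v))].append((u, v))
    return {edge: "parallel" if d[0] == d[1] else "antiparallel"
            for edge, d in directions.items()}

tetrahedron = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
walk = double_trace(tetrahedron, start=0)
kinds = classify_edges(walk)
print(walk)
print(sorted(kinds.values()))
```

The walk visits 13 vertices (12 segments) and returns to its starting vertex, in line with the observation that the two ends of the chain meet at the same vertex.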

A Probabilistic Argument Why Topofolds Are Rare in Nature

In this section we use the general mathematical terminology to establish general results on the design of modular topological folds. It deals specifically with concatenated modular segments that can form parallel or antiparallel dimers within segments of the same chain. The limitation to antiparallel pairing of nucleic acids limits the results for nucleic acids to a small subset of all topological polyhedra. The questions that we pose are what is the number of combinations and how probable is the random arrangement of interacting segments into a chain that could fold into the selected topological fold.

A protein that is composed of a single-stranded polypeptide with 2m peptide constituents (joined with flexible linkers between each two consecutive segments) that are pairwise joined as m dimers in a polyhedral skeleton will be called a topofold. We impose two additional requirements. Namely, the strand is not allowed to cross itself at any vertex of the polyhedron and the angles around each vertex must form a single circuit, therefore prohibiting any vertex of a polyhedron from falling apart. It can be shown that in principle any polyhedron P may be assembled as a topofold. The analysis of probabilities of forming the segment that can fold into a selected polyhedron may provide arguments to the question why topofolds seem to be very rare or absent from the natural protein sequences. Here we give a probabilistic argument why topofolds have to be designed rather than randomly generated by shuffling interacting segments.

We are given a polyhedron P. Let n be the number of its vertices, m the number of its edges, and f the number of its faces. By the well-known Euler polyhedral formula we have n − m + f = 2. For tetrahedron T we have n = 4, m = 6 and consequently f = 4. In a single string self-assembly the length of the path is 2m containing m pairs. Hence the string may be modeled as a path graph $P_{2m+1}$, where edges of this graph represent peptides. In the case of tetrahedron, this is $P_{13}$.

The sequence of edges is given as a string of literals $w = s_1 s_2 \ldots s_{2m}$, where each $s_i$, possibly marked with a prime (′), represents an edge and is a symbol from some alphabet of constituents S. Each symbol s from S appears exactly twice in w, either with a prime or without it. A prime means that the peptide binds in reversed direction with its counterpart, i.e., primes are used to distinguish antiparallel dimers from parallel ones. As usual, pairs of symbols are called dimers and we distinguish parallel and antiparallel dimers. Let p be the number of parallel dimers and let q be the number of antiparallel dimers. Then we have m = p + q.
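The string notation can be illustrated with a small Python sketch that reads such a string and counts parallel and antiparallel dimers. It assumes the prime is typed as an ASCII apostrophe, and the function names are hypothetical; the example strings are the triangle string abcabc and the tetrahedron string edbef′cb′afdac used in this review.

```python
from collections import Counter

def split_symbols(word):
    """Split a string such as "edbef'cb'afdac" into symbols, keeping primes attached."""
    symbols = []
    for ch in word:
        if ch == "'":
            symbols[-1] += ch
        else:
            symbols.append(ch)
    return symbols

def dimer_counts(word):
    """Return (p, q): the numbers of parallel and antiparallel dimers in `word`."""
    symbols = split_symbols(word)
    occurrences = Counter(s.rstrip("'") for s in symbols)
    if any(count != 2 for count in occurrences.values()):
        raise ValueError("every symbol must appear exactly twice")
    # A pair is antiparallel when one of its two occurrences carries a prime.
    antiparallel = {s.rstrip("'") for s in symbols if s.endswith("'")}
    q = len(antiparallel)
    p = len(occurrences) - q
    return p, q

for word in ("abcabc", "edbef'cb'afdac"):
    p, q = dimer_counts(word)
    m = p + q
    print(word, "m =", m, "p =", p, "q =", q, "path graph vertices =", 2 * m + 1)
```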


Finally we may have homo- and hetero-dimers. Let h denote the number of homo-dimers and let k denote the number of hetero-dimers. Again it holds m = h + k. Let us have the 2m ingredients S that may form a self-assembling polyhedron P. We may think that S is obtained from a valid self-assembling string by taking it apart. Two natural questions arise:

1. What is the number b(P, p, h) of all strings of length 2m that can be formed from these ingredients? This number is independent of P. In particular, it is independent of the number of parallel or antiparallel dimers. It only depends on the number of homo- and hetero-dimers.
2. What is the number a(P, p, h) of self-assembling strings, i.e., strings w formed out of S in such a way that w will (at least in theory) self-assemble into the polyhedron P?

For the question 1 we give the following answer. If all the dimers are hetero-dimers, h = 0, the answer is clearly $b(P, p, 0) = (2m)!$. However, in general, if there are h homo-dimers, the answer is $b(P, p, h) = (2m)!/2^h$. We may view this number in the following way: We start with a self-assembling string, take it apart and glue the edges in an arbitrary way back into a single string. By putting less strict conditions on strings w, its total number may be even larger. Consider all strings of length 2m composed of symbols from S with an arbitrary number of repetitions and not requiring that all symbols are actually present, and denote this number by B(P, p, h). If we consider S to be a set, then its size is given by |S| = 2m − h. The total number of strings with described properties is $B(P, p, h) = (2m - h)^{2m}$. This number attains a maximum in case h = 0, i.e., there are no homo-dimers, and attains a minimum in case h = m, i.e., all dimers are homo-dimers. Clearly,

$$m^{2m} \le B(P, p, h) \le (2m)^{2m}.$$

The answer for 2 is much more complicated.
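The two counts for question 1 are easy to evaluate numerically. The following Python sketch (function names are hypothetical) computes b and B from m and h and prints the values for the tetrahedron, m = 6.

```python
from math import factorial

def b_strings(m, h):
    """Orderings of the 2m segments when h of the m pairs are homo-dimers."""
    return factorial(2 * m) // 2 ** h

def B_strings(m, h):
    """Strings of length 2m over the 2m - h distinct segments, repetition allowed."""
    return (2 * m - h) ** (2 * m)

m = 6  # tetrahedron
for h in range(m + 1):
    print(h, b_strings(m, h), B_strings(m, h))
```

For small cases the first formula can be confirmed by brute force, for example by listing all distinct orderings of one homo-dimer and one hetero-dimer (m = 2, h = 1).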
The theory behind this enumeration problem was outlined in the paper by Fijavž et al.81 Actually, we may prove that if a sequence w self-assembles into polyhedron P then any of its cyclic shifts as well as their reversals self-assemble into the same polyhedron. For instance, one may show that abcabc self-assembles into a parallel triangle 𝚫. There are six shifts, but only three of them are distinct, namely abcabc, bcabca, and cabcab. If we reverse those strings we obtain three more: cbacba, acbacb, and bacbac. But all of them are essentially the same: by permuting the symbols we can obtain any of them from any other. The set of all strings that self-assemble into P can be partitioned into classes, such that two strings belong to the same class if one can be obtained from the other by a combination of cyclic shift and substitution of symbols. We will only keep one member from each class and call it a template. The set of all templates for P will be denoted $\mathcal{T}_P$. It turns out that abcabc is the only template for triangle 𝚫. This example however does not reflect the most general situation. For a tetrahedron T there exist three templates:

• edbef′cb′afdac
• bed′bcfe′cadf′a
• edbcfe′b′afdac′

If a symbol s appears as both s and s′ it represents an antiparallel dimer, otherwise it represents a parallel dimer. This description does not make a difference between homo- and hetero-dimers. The first string has four parallel pairs (p = 4) while the other two have three parallel pairs (p = 3). It was actually the first string that was realized in the study by Gradisar et al.3 The second and third sequences are topologically distinct yet they can be realized with the same set of ingredients. Based on the number p we can partition the set $\mathcal{T}_T$ into two disjoint sets $\mathcal{T}_{T,4}$ and $\mathcal{T}_{T,3}$. In general $\mathcal{T}_P = \bigcup_p \mathcal{T}_{P,p}$, where $\mathcal{T}_{P,p}$ may be empty sets if polyhedron P admits no templates with p parallel edges. For each template of tetrahedron T we have in theory 2m = 12 strings (obtained by 12 cyclic shifts). For all three templates combined we would have 36 possibilities. But this number is too large. Namely, if you shift the second string by four places to the right, the original string can be obtained by substitution a → c, b → a, c → b, d → f, e → d, f → e. Instead of 12 we only get four essentially distinct strings by shifting bed′bcfe′cadf′a.
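The count of essentially distinct shifts can be checked with a small Python sketch. It relabels the symbols of each shifted string in order of first appearance, so that strings differing only by a substitution of symbols compare equal. Two assumptions are made here that the text does not spell out: primes are typed as ASCII apostrophes, and only the parallel or antiparallel character of a pair matters, not which of its two occurrences carries the prime. The function names are hypothetical.

```python
def split_symbols(word):
    """Split a template into symbols, keeping primes (apostrophes) attached."""
    symbols = []
    for ch in word:
        if ch == "'":
            symbols[-1] += ch
        else:
            symbols.append(ch)
    return symbols

def canonical(symbols):
    """Relabel symbols by first appearance; keep only the pair type (anti/parallel)."""
    antiparallel = {s.rstrip("'") for s in symbols if s.endswith("'")}
    labels = {}
    form = []
    for s in symbols:
        base = s.rstrip("'")
        labels.setdefault(base, len(labels))
        form.append((labels[base], base in antiparallel))
    return tuple(form)

def shift_right(symbols, d):
    """Cyclic shift of the symbol list by d places to the right."""
    d %= len(symbols)
    return symbols[-d:] + symbols[:-d] if d else list(symbols)

def distinct_shifts(word):
    """Number of cyclic shifts that are not equivalent under symbol substitution."""
    symbols = split_symbols(word)
    return len({canonical(shift_right(symbols, d)) for d in range(len(symbols))})

def automorphism_count(word):
    """Number of shifts d for which the shifted string is equivalent to the original."""
    symbols = split_symbols(word)
    reference = canonical(symbols)
    return sum(canonical(shift_right(symbols, d)) == reference
               for d in range(len(symbols)))

for word in ("abcabc", "edbef'cb'afdac", "bed'bcfe'cadf'a", "edbcfe'b'afdac'"):
    print(word, distinct_shifts(word), automorphism_count(word))
```

Under these assumptions the sketch gives one class of shifts for abcabc and four for bed′bcfe′cadf′a, in agreement with the counts stated above.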
We will call two strings equivalent if one can be obtained from the other by substitution of symbols. Let $\mathrm{sh}_d(w)$ denote the cyclic shift of string w by d symbols to the right. The mapping $\mathrm{sh}_d$ is called an automorphism if $\mathrm{sh}_d(w)$ is equivalent to w. Note that $\mathrm{sh}_0(w) = \mathrm{sh}_{2m}(w) = w$. So the formula for the number of possibilities goes like this:

$$a(P, p, h) = \sum_{T \in \mathcal{T}_{P,p}} p!\, q!\, 2^{k}\, \frac{2m}{|\mathrm{Aut}(T)|},$$

where $\mathcal{T}_{P,p}$ denotes the set of all templates with exactly p parallel dimers. Aut(T) is the group of all automorphisms of a template string T. For convenience we will write

$$\alpha(P, p) = \sum_{T \in \mathcal{T}_{P,p}} \frac{2m}{|\mathrm{Aut}(T)|},$$

so the previous formula can be rewritten as

$$a(P, p, h) = p!\, q!\, 2^{k}\, \alpha(P, p) = p!\,(m - p)!\, 2^{m-h}\, \alpha(P, p).$$

The probability π(P, p, h) is the quotient

$$\pi(P, p, h) = \frac{a(P, p, h)}{b(P, p, h)} = \frac{p!\,(m - p)!\, 2^{m-h}\, \alpha(P, p)}{(2m)!/2^{h}} = \frac{p!\,(m - p)!\, 2^{m}\, \alpha(P, p)}{(2m)!}.$$

Parameter h has disappeared from the formula. This means that the quotient is independent of h. It only depends on the number of parallel dimers. Let us give an example. For the triangle (m = 3) there is only one template (see above) with p = 3, therefore $\alpha(\Delta, 3) = 6/6 = 1$ as |Aut(abcabc)| = 6. Hence, we obtain $a(\Delta, 3, h) = 3!\,0!\,2^{3-h}$. Moreover $b(\Delta, 3, h) = 6!/2^{h}$. Therefore:

$$\pi(\Delta, 3, h) = \frac{3!\,0!\,2^{3-h}}{6!/2^{h}} = \frac{3!\,2^{3}}{6!} = \frac{1}{15} \approx 0.067.$$

Let us do it for tetrahedron T:

$$\mathcal{T}_{T,4} = \{\mathtt{edbef'cb'afdac}\}, \qquad |\mathrm{Aut}(\mathtt{edbef'cb'afdac})| = 1, \qquad \alpha(T, 4) = \frac{12}{1} = 12,$$

$$\mathcal{T}_{T,3} = \{\mathtt{bed'bcfe'cadf'a},\ \mathtt{edbcfe'b'afdac'}\}, \qquad |\mathrm{Aut}(\mathtt{bed'bcfe'cadf'a})| = 3, \qquad |\mathrm{Aut}(\mathtt{edbcfe'b'afdac'})| = 1,$$

$$\alpha(T, 3) = \frac{12}{3} + \frac{12}{1} = 16,$$

$$\pi(T, 4, h) = \frac{4!\,2!\,2^{6}\,\alpha(T, 4)}{12!} \approx 7.7 \times 10^{-5}, \qquad \pi(T, 3, h) = \frac{3!\,3!\,2^{6}\,\alpha(T, 3)}{12!} \approx 7.7 \times 10^{-5}.$$

Tables 1 and 2 show a summary for the previous two examples. Further examples are given in Table 3. We may define this quotient in another way using B(P, p, h) instead of b(P, p, h). The quotient Π(P, p, h) = a(P, p, h)/B(P, p, h) depends on p and h and can be written as

$$\Pi(P, p, h) = \frac{p!\,(m - p)!\, 2^{m-h}\, \alpha(P, p)}{(2m - h)^{2m}},$$

where 0 ≤ p, h ≤ m. By using calculus one can prove that the maximum for Π(P, p, h) is attained for h = m. In this case

$$\Pi(P, p, m) = \frac{p!\,(m - p)!\, \alpha(P, p)}{m^{2m}},$$

where 0 ≤ p ≤ m.

TABLE 1 Summary for the Triangle

| Polyhedron | Template T | n | m | p | $\lvert\mathrm{Aut}(T)\rvert$ | a(P, p, h) | b(P, p, h) | π(P, p, h) |
|---|---|---|---|---|---|---|---|---|
| Triangle | abcabc | 3 | 3 | 3 | 6 | $6 \times 2^{3-h}$ | $720/2^{h}$ | 0.067 |

TABLE 2 Summary for the Tetrahedron

| Polyhedron | Template T | n | m | p | $\lvert\mathrm{Aut}(T)\rvert$ | a(P, p, h) | b(P, p, h) | π(P, p, h) |
|---|---|---|---|---|---|---|---|---|
| Tetrahedron | edbef′cb′afdac | 4 | 6 | 4 | 1 | $576 \times 2^{6-h}$ | $12!/2^{h}$ | $7.7 \times 10^{-5}$ |
| | bed′bcfe′cadf′a | 4 | 6 | 3 | 3 | $576 \times 2^{6-h}$ | $12!/2^{h}$ | $7.7 \times 10^{-5}$ |
| | edbcfe′b′afdac′ | 4 | 6 | 3 | 1 | $576 \times 2^{6-h}$ | $12!/2^{h}$ | $7.7 \times 10^{-5}$ |
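The arithmetic behind the triangle and tetrahedron examples can be reproduced with exact fractions. The Python sketch below takes the automorphism group orders quoted in the text (6 for the triangle template, and 1, 3, and 1 for the three tetrahedron templates) as inputs rather than recomputing them; the function names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def alpha(m, aut_orders):
    """alpha(P, p): sum of 2m / |Aut(T)| over the templates with p parallel dimers."""
    return sum(Fraction(2 * m, order) for order in aut_orders)

def a_count(m, p, h, aut_orders):
    """Number of self-assembling strings: p! (m-p)! 2^(m-h) alpha(P, p)."""
    return factorial(p) * factorial(m - p) * 2 ** (m - h) * alpha(m, aut_orders)

def b_count(m, h):
    """Number of all strings built from the ingredients: (2m)! / 2^h."""
    return Fraction(factorial(2 * m), 2 ** h)

def pi_quotient(m, p, h, aut_orders):
    """First-model probability a / b."""
    return a_count(m, p, h, aut_orders) / b_count(m, h)

# Triangle: one template with |Aut| = 6.
print(pi_quotient(3, 3, 0, [6]))                      # exact value of the quotient
# Tetrahedron: p = 4 has one template, p = 3 has two.
print(alpha(6, [1]), alpha(6, [3, 1]))
print(float(pi_quotient(6, 4, 0, [1])), float(pi_quotient(6, 3, 0, [3, 1])))
# The quotient does not change with h.
print({pi_quotient(6, 4, h, [1]) for h in range(7)})
```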

The nonequivalent templates were listed by a computer program. The algorithm starts with faces of the given polyhedron and then explores all possible gluings of faces, keeping only those with a one-face embedding and discarding the equivalent occurrences. Because no topofolds have been detected in the natural proteins identified so far, we have chosen two simple models of random composition of linear chains with predefined orthogonal segments. Our calculations confirm that the sequence that is compatible with a selected topofold is nontrivial and occurs at the increasingly small probability for larger polyhedra (Table 3).


TABLE 3 Probabilities for Several Polyhedra for Both Models

| Polyhedron P | m | p | $\lvert\mathcal{T}_{P,p}\rvert$ | α(P, p) | π(P, p, h) | $\max_p \pi(P, p, h)$ | Π(P, p, m) | $\max_p \Pi(P, p, m)$ |
|---|---|---|---|---|---|---|---|---|
| Triangle | 3 | 3 | 1 | 1 | 0.067 | 0.067 | 0.0082 | 0.0082 |
| Tetrahedron | 6 | 3 | 2 | 16 | 7.7e-5 | 7.7e-5 | 2.6e-7 | 2.6e-7 |
| | | 4 | 1 | 12 | 7.7e-5 | | 2.6e-7 | |
| Four-sided pyramid | 8 | 0 | 2 | 24 | 1.2e-5 | 1.2e-5 | 3.4e-9 | 3.4e-9 |
| | | 3 | 6 | 96 | 8.5e-7 | | 2.5e-10 | |
| | | 4 | 9 | 120 | 8.5e-7 | | 2.5e-10 | |
| | | 5 | 4 | 64 | 5.6e-7 | | 1.6e-10 | |
| | | 6 | 7 | 96 | 1.7e-6 | | 4.9e-10 | |
| Three-sided prism | 9 | 0 | 3 | 36 | 1.0e-6 | 1.0e-6 | 8.7e-11 | 8.7e-11 |
| | | 3 | 4 | 72 | 2.5e-8 | | 2.1e-12 | |
| | | 4 | 7 | 108 | 2.5e-8 | | 2.1e-12 | |
| | | 5 | 12 | 216 | 5.0e-8 | | 4.1e-12 | |
| | | 6 | 14 | 192 | 6.6e-8 | | 5.5e-12 | |
| Three-sided bipyramid | 9 | 3 | 9 | 126 | 4.4e-8 | 1.2e-7 | 3.6e-12 | 9.7e-12 |
| | | 4 | 6 | 108 | 2.5e-8 | | 2.1e-12 | |
| | | 5 | 1 | 18 | 4.1e-9 | | 3.5e-13 | |
| | | 6 | 4 | 72 | 2.5e-8 | | 2.1e-12 | |
| | | 7 | 10 | 144 | 1.2e-7 | | 9.7e-12 | |
| Octahedron | 12 | 12 | 26 | 512 | 1.6e-9 | 1.6e-9 | 3.1e-15 | 3.1e-15 |
| Cube | 12 | 4 | 12 | 288 | 1.8e-12 | 3.4e-12 | 3.5e-18 | 6.4e-18 |
| | | 6 | 34 | 768 | 2.6e-12 | | 5.0e-18 | |
| | | 8 | 22 | 528 | 3.4e-12 | | 6.4e-18 | |

Note that probabilities of the second model (upper bound $\max_p \Pi(P, p, m)$ given in the last column) are in each case much smaller than the corresponding probabilities of the first model, $\max_p \pi(P, p, h)$. All polyhedra can be realized by proteins but only four-sided pyramid and three-sided prism admit totally antiparallel realizations as required by DNA self-assembly.
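The second-model column of Table 3 follows from m, p, and α(P, p) alone. As a hedged check, the Python sketch below evaluates Π for a few rows of the table, taking the tabulated α values as given, and confirms numerically that Π grows with h and is largest at h = m; the function name and the dictionary layout are hypothetical.

```python
from math import factorial

def Pi_quotient(m, p, h, alpha):
    """Second-model quotient: p! (m-p)! 2^(m-h) alpha / (2m - h)^(2m)."""
    return (factorial(p) * factorial(m - p) * 2 ** (m - h) * alpha
            / (2 * m - h) ** (2 * m))

# m and {p: alpha(P, p)} copied from Table 3.
rows = {
    "Triangle": (3, {3: 1}),
    "Tetrahedron": (6, {3: 16, 4: 12}),
    "Cube": (12, {4: 288, 6: 768, 8: 528}),
}

for name, (m, alphas) in rows.items():
    values = {p: Pi_quotient(m, p, m, a) for p, a in alphas.items()}
    best_p = max(values, key=values.get)
    print(name, {p: f"{v:.1e}" for p, v in values.items()}, "max at p =", best_p)

# Dependence on h for the tetrahedron with p = 4: the value increases up to h = m.
by_h = [Pi_quotient(6, 4, h, 12) for h in range(7)]
print(by_h == sorted(by_h))
```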

PARALLELS AND DIFFERENCES BETWEEN NATURAL AND TOPOLOGICAL POLYPEPTIDE FOLDS

Tertiary structure of protein folds in natural proteins is stabilized by a continuous hydrophobic core, which glues together the secondary structure or less regular segments of the polypeptide chain. Long-range interactions and cooperativity are important for the unique folding of proteins. The topology of single-chain modular polypeptide assemblies is in contrast uniquely defined only by the sequential order of coiled-coil segments in the chain and the lack of one segment or their scrambled order prevents the correct self-assembly. Many different coiled-coil building blocks can be selected for each segment, provided we maintain the orthogonality within the selected set for each concatenated chain. As described above, only a tiny fraction of all possible arrangements of segments from the selected set can result in the correct fold. Similar to the modular topofolds, many natural proteins share the same fold, yet they may not exhibit any statistically significant sequence identity that would indicate the common evolutionary origin, demonstrating that folding determinants are restricted to a limited number of positions82 and that the convergent evolution led to the repeated discovery of the same stable protein fold in nature. Long-range interactions between amino acid residues, which are separated in the sequence, direct the packing of segments into the tertiary structure of natural protein folds. Although this is particularly apparent for 𝛽-strands, where all interactions between the 𝛽-strand-forming segments are nonlocal, long-range interactions are also essential for packing of helices and non-regular segments of chains. This being said, long-range interactions between 𝛼-helical segments of modular topological folds lead to packing via mechanisms resembling that of the packing of 𝛽-strands. A notable distinction, however, is the lack of trans contacts between distinct edges because of structural restrictions imposed on the fold through the principle of tensional integrity (Figure 9).

What could be the reason that topofolds have not been observed in natural proteins? The possible reasons we have considered include functionality, stability, folding, and evolvability. Lack of functionality is very unlikely to be the reason for the apparent absence of this type of structure as the ability to interact could be incorporated into this type of protein. Reconstitution of the fluorescent protein triggered by the correct folding of single-chain tetrahedron has already been demonstrated.3 Moreover, the unique solvent-accessible cavity inside polypeptide polyhedra should make them suitable for many additional features different from natural proteins, such as encapsulation of molecules and possibly also engineering of catalytic sites.

[Figure 9, panels (a) and (b): contact maps; both axes show the residue position in the sequence (Seq.).]

FIGURE 9 | Contact map of designed tetrahedron in comparison to folded natural protein. Long-range interactions occur between the interacting coiled-coil forming segments of the topological protein fold, in this case a tetrahedron (a), in comparison to the natural protein fold(s) comprising primarily 𝛽-strands, in this case cyan fluorescent protein (b), evaluated by the contact map. Parallel and antiparallel orientations of interacting segments can be observed in both maps as lines parallel and perpendicular to the diagonal.
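The pattern described for the contact map in Figure 9 can be sketched from a template string alone. The Python example below builds an idealized list of residue contacts for the tetrahedron template edbef′cb′afdac, pairing residues in register for parallel dimers and in reverse register for antiparallel ones. Everything numerical here is an assumption made for illustration: segments are taken to be 28 residues long (four heptads) with 4-residue linkers, residue numbering starts at 0 with no flanking tags, primes are typed as ASCII apostrophes, and the function names are hypothetical. It is not a reconstruction of the actual map in the figure.

```python
def split_symbols(template):
    """Split a template into symbols, keeping primes (apostrophes) attached."""
    symbols = []
    for ch in template:
        if ch == "'":
            symbols[-1] += ch
        else:
            symbols.append(ch)
    return symbols

def ideal_contacts(template, seg_len=28, linker_len=4):
    """Map each dimer to (is_antiparallel, list of (i, j) residue contacts)."""
    step = seg_len + linker_len
    first_seen = {}
    contacts = {}
    for index, symbol in enumerate(split_symbols(template)):
        base = symbol.rstrip("'")
        if base not in first_seen:
            first_seen[base] = (index, symbol)
            continue
        first_index, partner = first_seen[base]
        antiparallel = symbol.endswith("'") or partner.endswith("'")
        start_i, start_j = first_index * step, index * step
        pairs = []
        for t in range(seg_len):
            # In-register pairing for parallel dimers, reversed for antiparallel.
            u = seg_len - 1 - t if antiparallel else t
            pairs.append((start_i + t, start_j + u))
        contacts[base] = (antiparallel, pairs)
    return contacts

contacts = ideal_contacts("edbef'cb'afdac")
for name, (antiparallel, pairs) in sorted(contacts.items()):
    kind = "antiparallel" if antiparallel else "parallel"
    print(name, kind, pairs[0], pairs[-1])
```

For a parallel pair the difference j − i is constant, which gives a line parallel to the diagonal, while for an antiparallel pair the sum i + j is constant, which gives a line perpendicular to it.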

Concerning the stability of designed polypeptide nanostructures we found that the designed tetrahedron is reasonably stable, comparable to average natural proteins; therefore the thermodynamic stability of topological protein folds is probably not problematic. Stability against proteolytic degradation will have to be examined; however, it is likely that modifications in the design could make this type of protein more or less prone to proteolysis, depending on the requirements for selected applications.

The ability of topological proteins to fold under native conditions may represent a limitation. Indeed, similar to the large majority of DNA-based nanostructures, the designed tetrahedron was only folded in vitro by a slow annealing from denaturing conditions. In bacteria, the tetrahedral polypeptide was produced in the form of insoluble aggregates, which may be either because of the high concentration and formation of intermolecular interactions or, alternatively, because of the slow kinetics of folding. A highly connected protein network has indeed been observed at higher refolding concentrations. Fast transfer of the denatured polypeptide to the native refolding conditions resulted in the formation of incomplete or misfolded structures, so this may be an important concern. It is however likely that this limitation may be overcome by more careful design of the stability and the orthogonality of the building elements or even by engineering of the folding pathway.

The last and perhaps the most plausible explanation is that this type of fold may be difficult to evolve by random mutations as the absence or wrong positioning of even a single segment may completely prevent formation of the correct topological fold and consequently the functionality. We cannot dismiss the argument that nature may not have sampled all possible protein folds as some folds are clearly easier to evolve by random mutations or are, on the other hand, more resistant to mutations. In nature, protein domains are combined in a limited number of ways. Only a small fraction of the possible domain combinations actually exist in genomes. Vogel et al. identified around 1400 two- and three-domain combinations that are conserved in many different multidomain proteins.83 The sequential N- to C-terminal order of natural domain combinations tends to be strongly conserved as a result of functional or evolutionary reasons.84 Although the tetrahedron is the smallest 3D polyhedron, a smaller 2D object should be significantly easier to evolve. An example of such a fold is a trigon composed of six parallel coiled-coil forming segments that can form three dimeric coiled-coils. These could arise by a duplication of segments in succession of three homodimeric segments. Finally, it may also be possible that this type of protein fold is yet to be discovered among natural proteins.

OUTLOOKS FOR THE (BRIGHT) FUTURE OF PROTEIN ORIGAMI

Compared to the elaborate field of structural DNA nanotechnology, the field of designed modular polypeptide nanotechnology is at its very beginning. Only two of the four reported strategies implemented in DNA nanotechnology have been used for construction of polypeptide-based nanostructures, i.e., the multi-chain assembly approach combining multiple interacting peptides and the single-chain assembly approach. In contrast to DNA origami, which uses hundreds of different components, peptide-based assembly has so far been limited to only several distinct tethered peptides employing high symmetry for the assembly. The single-chain polypeptide assembly approach, however, does not rely on intrinsic symmetry of its constituent building blocks, and each edge and vertex can in fact be uniquely addressed. Scaffold-based assembly is probably less attractive for polypeptide-based nanostructures than it is for DNA nanostructures because it requires a very large orthogonal set of building elements and a very long polypeptide scaffold. The hierarchical assembly approach remains to be tested for larger polypeptide-based assemblies, which are likely to require several chains to be combined, akin to the DNA-based assemblies. This could be achieved by assembly of several pre-folded subunits via specific interacting domains, which could be either coiled-coils or other protein domains. Hierarchical assembly will likely overcome the obstacles to folding larger assemblies and will, in addition, decrease the number of required orthogonal building elements, because some of the elements can be reused in several different subunits. The platform of polypeptide topofolds offers numerous challenges and opportunities, and it is difficult to list them all.
Some of the more important ones are: expansion of the toolbox of orthogonal parallel and antiparallel coiled-coils, both in number and in length; precise design of linker regions and vertex structures; inclusion of tri-, tetra-, and other oligomeric coiled-coils besides dimers; construction of complex polyhedra beyond the tetrahedron; introduction of (multiple) functionalities into nanostructures in order to drive the field toward applications; in vivo folding; exploration of designed DNA-protein hybrids; regulated assembly/disassembly of polypeptide nanostructures by chemical, biological, or physical signals; design of folding pathways; development of better modeling tools; hierarchical polypeptide assembly; and investigation of building blocks other than coiled-coils. These challenges may require a substantial amount of time and effort, but given the potential rewards, the field is likely to grow and push its current boundaries. Given the exciting recent development of the field of DNA nanostructures, which 30 years after its inception is more vigorous than ever, we have every reason to be optimistic.


REFERENCES

1. Sillitoe I, Cuff AL, Dessailly BH, Dawson NL, Furnham N, Lee D, Lees JG, Lewis TE, Studer RA, Rentzsch R, et al. New functional families (FunFams) in CATH to improve the mapping of conserved functional sites to 3D structures. Nucleic Acids Res 2013, 41:D490–D498.
2. Goodman RP, Schaap IA, Tardin CF, Erben CM, Berry RM, Schmidt CF, Turberfield AJ. Rapid chiral assembly of rigid DNA building blocks for molecular nanofabrication. Science 2005, 310:1661–1665.
3. Gradisar H, Bozic S, Doles T, Vengust D, Hafner-Bratkovic I, Mertelj A, Webb B, Sali A, Klavzar S, Jerala R. Design of a single-chain polypeptide tetrahedron assembled from coiled-coil segments. Nat Chem Biol 2013, 9:362–366.
4. Lin C, Liu Y, Rinker S, Yan H. DNA tile based self-assembly: building complex nanoarchitectures. Chem Phys Chem 2006, 7:1641–1647.
5. Seeman NC. Nanomaterials based on DNA. Annu Rev Biochem 2010, 79:65–87.
6. Jaeger L, Chworos A. The architectonics of programmable RNA and DNA nanostructures. Curr Opin Struct Biol 2006, 16:531–543.
7. Seeman NC. Nucleic acid junctions and lattices. J Theor Biol 1982, 99:237–247.
8. Wei B, Dai M, Yin P. Complex shapes self-assembled from single-stranded DNA tiles. Nature 2012, 485:623–626.
9. Shen Z, Yan H, Wang T, Seeman N. Paranemic crossover DNA: a generalized Holliday structure with applications in nanotechnology. J Am Chem Soc 2004, 126:1666–1674.
10. Winfree E, Liu F, Wenzler LA, Seeman NC. Design and self-assembly of two-dimensional DNA crystals. Nature 1998, 394:539–544.
11. He Y, Ye T, Su M, Zhang C, Ribbe AE, Jiang W, Mao CD. Hierarchical self-assembly of DNA into symmetric supramolecular polyhedra. Nature 2008, 452:198–201.
12. Rothemund PW. Folding DNA to create nanoscale shapes and patterns. Nature 2006, 440:297–302.
13. Tarini M, Cignoni P, Montani C. Ambient occlusion and edge cueing for enhancing real time molecular visualization. IEEE Trans Vis Comput Graph 2006, 12:1237–1244.
14. Ke Y, Ong LL, Shih WM, Yin P. Three-dimensional structures self-assembled from DNA bricks. Science 2012, 338:1177–1183.
15. Chen J, Seeman N. The synthesis from DNA of a molecule with the connectivity of a cube. Nature 1991, 350:631–633.
16. Erben CM, Goodman RP, Turberfield AJ. A self-assembled DNA bipyramid. J Am Chem Soc 2007, 129:6992–6993.
17. Juul S, Iacovelli F, Falconi M, Kragh SL, Christensen B, Frøhlich R, Franch O, Kristoffersen EL, Stougaard M, Leong KW, et al. Encapsulation and release of an active enzyme in the cavity of a self-assembled DNA nanocage. ACS Nano 2013, 7:9724–9734.
18. Zhang Y, Seeman N. Construction of a DNA-truncated octahedron. J Am Chem Soc 1994, 116:1661–1669.
19. Fuller R. Tensile-integrity structures. US Patent 3,063,521, 1962.
20. Schulman R, Winfree E. Synthesis of crystals with a programmable kinetic barrier to nucleation. Proc Natl Acad Sci U S A 2007, 104:15236–15241.
21. LaBean T, Yan H, Kopatsch J, Liu F, Winfree E, Reif JH, Seeman NC. Construction, analysis, ligation, and self-assembly of DNA triple crossover complexes. J Am Chem Soc 2000, 122:1848–1860.
22. Mathieu F, Liao S, Kopatsch J, Wang T. Six-helix bundles designed from DNA. Nano Lett 2005, 5:661–665.
23. Hamada S, Murata S. Substrate-assisted assembly of interconnected single-duplex DNA nanostructures. Angew Chem 2009, 121:6952–6955.
24. Park SH, Pistol C, Ahn SJ, Reif JH, Lebeck AR, Dwyer C, LaBean TH. Finite-size, fully addressable DNA tile lattices formed by hierarchical assembly procedures. Angew Chem Int Ed Engl 2006, 45:735–739.
25. Li X, Yang X, Qi J, Seeman N. Antiparallel DNA double crossover molecules as components for nanoconstruction. J Am Chem Soc 1996, 118:6131–6140.
26. He Y, Chen Y, Liu H, Ribbe AE, Mao C. Self-assembly of hexagonal DNA two-dimensional (2D) arrays. J Am Chem Soc 2005, 127:12202–12203.
27. Park SH, Barish R, Li H, Reif JH, Finkelstein G, Yan H, LaBean TH. Three-helix bundle DNA tiles self-assemble into 2D lattice or 1D templates for silver nanowires. Nano Lett 2005, 5:693–696.
28. Sharma J, Chhabra R, Liu Y, Ke Y, Yan H. DNA-templated self-assembly of two-dimensional and periodical gold nanoparticle arrays. Angew Chem Int Ed Engl 2006, 45:730–735.
29. Zhang C, Su M, He Y, Zhao X, Fang PA, Ribbe AE, Jiang W, Mao C. Conformational flexibility facilitates self-assembly of complex DNA nanostructures. Proc Natl Acad Sci U S A 2008, 105:10665–10669.
30. Oteri F, Falconi M, Chillemi G, Andersen FF, Oliveira CLP, Pedersen JS, Knudsen BR, Desideri A. Simulative analysis of a truncated octahedral DNA nanocage family indicates the single-stranded thymidine linkers as the major player for the conformational variability. J Phys Chem C 2011, 115:16819–16827.
31. Yan H, LaBean TH, Feng L, Reif JH. Directed nucleation assembly of DNA tile complexes for barcode-patterned lattices. Proc Natl Acad Sci U S A 2003, 100:8103–8108.


32. Mei Q, Wei X, Su F, Liu Y, Youngbull C, Johnson R, Lindsay S, Yan H, Meldrum D. Stability of DNA origami nanoarrays in cell lysate. Nano Lett 2011, 11:1477–1482.
33. Chhabra R, Sharma J, Ke Y, Liu Y, Rinker S, Lindsay S, Yan H. Spatially addressable multiprotein nanoarrays templated by aptamer-tagged DNA nanoarchitectures. J Am Chem Soc 2007, 129:10304–10305.
34. Douglas SM, Marblestone AH, Teerapittayanon S, Vazquez A, Church GM, Shih WM. Rapid prototyping of 3D DNA-origami shapes with caDNAno. Nucleic Acids Res 2009, 37:5001–5006.
35. Han D, Pal S, Nangreave J, Deng Z, Liu Y, Yan H. DNA origami with complex curvatures in three-dimensional space. Science 2011, 332:342–346.
36. Kuzuya A, Komiyama M. DNA origami: fold, stick, and beyond. Nanoscale 2010, 2:310–322.
37. Saaem I, LaBean TH. Overview of DNA origami for molecular self-assembly. WIREs: Nanomed Nanobiotechnol 2013, 5:150–162.
38. Shih WM, Quispe JD, Joyce GF. A 1.7-kilobase single-stranded DNA that folds into a nanoscale octahedron. Nature 2004, 427:618–621.
39. Li Z, Wei B, Nangreave J, Lin C, Liu Y, Mi Y, Yan H. A replicable tetrahedral nanostructure self-assembled from a single DNA strand. J Am Chem Soc 2009, 131:13093–13098.
40. He X, Dong L, Wang W, Lin N, Mi Y. Folding single-stranded DNA to form the smallest 3D DNA triangular prism. Chem Commun (Camb) 2013, 49:2906–2908.
41. Chworos A, Severcan I, Koyfman AY, Weinkam P, Oroudjev E, Hansma HG, Jaeger L. Building programmable jigsaw puzzles with RNA. Science 2004, 306:2068–2072.
42. Laing C, Jung S, Iqbal A, Schlick T. Tertiary motifs revealed in analyses of higher order RNA junctions. J Mol Biol 2009, 393:67–82.
43. Afonin K, Bindewald E, Yaghoubian A, Voss N, Jacovetty E, Shapiro BA, Jaeger L. In vitro assembly of cubic RNA-based scaffolds designed in silico. Nat Nanotechnol 2010, 5:676–682.
44. Caruthers MH. Gene synthesis machines: DNA chemistry and its uses. Science 1985, 230:281–285.
45. Lin C, Wang X, Liu Y, Seeman NC, Yan H. Rolling circle enzymatic replication of a complex multi-crossover DNA nanostructure. J Am Chem Soc 2007, 129:14475–14481.
46. Lin C, Rinker S, Wang X, Liu Y, Seeman NC, Yan H. In vivo cloning of artificial DNA nanostructures. Proc Natl Acad Sci U S A 2008, 105:17626–17631.
47. Sobczak JP, Martin TG, Gerling T, Dietz H. Rapid folding of DNA into nanoscale shapes at constant temperature. Science 2012, 338:1458–1461.
48. Berger B, Leighton T. Protein folding in the hydrophobic-hydrophilic (HP) model is NP-complete. J Comput Biol 1998, 5:27–40.
49. Padilla JE, Colovos C, Yeates TO. Nanohedra: using symmetry to design self assembling protein cages, layers, crystals, and filaments. Proc Natl Acad Sci U S A 2001, 98:2217–2221.
50. Sinclair JC, Davies KM, Venien-Bryan C, Noble ME. Generation of protein lattices by fusing proteins with matching rotational symmetry. Nat Nanotechnol 2011, 6:558–562.
51. Doles T, Bozic S, Gradisar H, Jerala R. Functional self-assembling polypeptide bionanomaterials. Biochem Soc Trans 2012, 40:629–634.
52. Lai YT, Cascio D, Yeates TO. Structure of a 16-nm cage designed by using protein oligomers. Science 2012, 336:1129.
53. Lai YT, Tsai KL, Sawaya MR, Asturias FJ, Yeates TO. Structure and flexibility of nanoscale protein cages designed by symmetric self-assembly. J Am Chem Soc 2013, 135:7738–7743.
54. King NP, Sheffler W, Sawaya MR, Vollmar BS, Sumida JP, Andre I, Gonen T, Yeates TO, Baker D. Computational design of self-assembling protein nanomaterials with atomic level accuracy. Science 2012, 336:1171–1174.
55. Bozic S, Doles T, Gradisar H, Jerala R. New designed protein assemblies. Curr Opin Chem Biol 2013.
56. Potekhin SA, Melnik TN, Popov V, Lanina NF, Vazina AA, Rigler P, Verdini AS, Corradin G, Kajava AV. De novo design of fibrils made of short α-helical coiled coil peptides. Chem Biol 2001, 8:1025–1032.
57. Lamm MS, Rajagopal K, Schneider JP, Pochan DJ. Laminated morphology of nontwisting β-sheet fibrils constructed via peptide self-assembly. J Am Chem Soc 2005, 127:16692–16700.
58. Marini DM, Hwang W, Lauffenburger DA, Zhang SG, Kamm RD. Left-handed helical ribbon intermediates in the self-assembly of a β-sheet peptide. Nano Lett 2002, 2:295–299.
59. de la Paz ML, Goldie K, Zurdo J, Lacroix E, Dobson CM, Hoenger A, Serrano L. De novo designed peptide-based amyloid fibrils. Proc Natl Acad Sci U S A 2002, 99:16052–16057.
60. Woolfson DN. The design of coiled-coil structures and assemblies. Adv Protein Chem 2005, 70:79–112.
61. Mason JM, Muller KM, Arndt KM. Considerations in the design and optimization of coiled coil structures. Methods Mol Biol 2007, 352:35–70.
62. Gurnon DG, Whitaker JA, Oakley MG. Design and characterization of a homodimeric antiparallel coiled coil. J Am Chem Soc 2003, 125:7518–7519.
63. Kammerer RA, Steinmetz MO. De novo design of a two-stranded coiled-coil switch peptide. J Struct Biol 2006, 155:146–153.


64. Zaccai NR, Chi B, Thomson AR, Boyle AL, Bartlett GJ, Bruning M, Linden N, Sessions RB, Booth PJ, Brady RL, et al. A de novo peptide hexamer with a mutable channel. Nat Chem Biol 2011, 7:935–941.
65. Gradisar H, Jerala R. De novo design of orthogonal peptide pairs forming parallel coiled-coil heterodimers. J Pept Sci 2011, 17:100–106.
66. Thomas F, Boyle AL, Burton AJ, Woolfson DN. A set of de novo designed parallel heterodimeric coiled coils with quantified dissociation constants in the micromolar to sub-nanomolar regime. J Am Chem Soc 2013, 135:5161–5166.
67. Ryadnov MG, Woolfson DN. Engineering the morphology of a self-assembling protein fibre. Nat Mater 2003, 2:329–332.
68. Ryadnov MG, Woolfson DN. Fiber recruiting peptides: noncovalent decoration of an engineered protein scaffold. J Am Chem Soc 2004, 126:7454–7455.
69. Dong H, Paramonov SE, Hartgerink JD. Self-assembly of α-helical coiled coil nanofibers. J Am Chem Soc 2008, 130:13691–13695.
70. Peng X, Jin J, Nakamura Y, Ohno T, Ichinose I. Ultrafast permeation of water through protein-based membranes. Nat Nanotechnol 2009, 4:353–357.
71. Zhang S. Fabrication of novel biomaterials through molecular self-assembly. Nat Biotechnol 2003, 21:1171–1178.
72. Ueda M, Makino A, Imai T, Sugiyama J, Kimura S. Rational design of peptide nanotubes for varying diameters and lengths. J Pept Sci 2011, 17:94–99.
73. Knowles TPJ, Oppenheim TW, Buell AK, Chirgadze DY, Welland ME. Nanostructured films from hierarchical self-assembly of amyloidogenic proteins. Nat Nanotechnol 2010, 5:204–207.
74. Gour N, Mondal S, Verma S. Synthesis and self-assembly of a neoglycopeptide: morphological studies and ultrasound-mediated DNA encapsulation. J Pept Sci 2011, 17:148–153.
75. Petka WA, Harden JL, McGrath KP, Wirtz D, Tirrell DA. Reversible hydrogels from self-assembling artificial proteins. Science 1998, 281:389–392.
76. Banwell EF, Abelardo ES, Adams DJ, Birchall MA, Corrigan A, Donald AM, Kirkland M, Serpell LC, Butler MF, Woolfson DN. Rational design and application of responsive α-helical peptide hydrogels. Nat Mater 2009, 8:596–600.
77. Fletcher JM, Harniman RL, Barnes FR, Boyle AL, Collins A, Mantell J, Sharp TH, Antognozzi M, Booth PJ, Linden N, et al. Self-assembling cages from coiled-coil peptide modules. Science 2013, 340:595–599.
78. Park S, Cochran J. Protein Engineering and Design. Boca Raton, FL: CRC Press; 2009.
79. Lumb KJ, Carr CM, Kim PS. Subdomain folding of the coiled coil leucine zipper from the bZIP transcriptional activator GCN4. Biochemistry 1994, 33:7361–7367.
80. Taylor CM, Keating AE. Orientation and oligomerization specificity of the Bcr coiled-coil oligomerization domain. Biochemistry 2005, 44:16246–16256.
81. Fijavž G, Pisanski T, Rus J. Strong traces model of self-assembly polypeptide structures. MATCH Commun Math Comput Chem 2014, 71:199–212.
82. Friedberg I, Margalit H. Persistently conserved positions in structurally similar, sequence dissimilar proteins: roles in preserving protein fold and function. Protein Sci 2002, 11:350–360.
83. Vogel C, Berzuini C, Bashton M, Gough J, Teichmann SA. Supra-domains: evolutionary units larger than single protein domains. J Mol Biol 2004, 336:809–823.
84. Han J-H, Batey S, Nickson AA, Teichmann SA, Clarke J. The folding and evolution of multidomain proteins. Nat Rev Mol Cell Biol 2007, 8:319–330.
