Evolutionary Anthropology 23:50–55 (2014)

Crotchets & Quiddities

What Works Works. But What Works? Genomes As Works in Progress

KENNETH M. WEISS

One of the most important debts that we owe to evolutionary theory is its rejection of Lamarck’s idea that when individuals strive to achieve some objective, their success is thereby written into the inheritance they pass on to their offspring. Instead, we recognize more clearly than even Darwin did (he hedged in his ‘pangenesis’ theory of inheritance1) that variation arises randomly with respect to any function it may have, and is screened by natural selection. Darwin’s view was that selection is a kind of Newtonian force, like gravity, that works even on the tiniest scale to preserve what’s favorable and remove what isn’t.2 In this sense, Darwin was an ‘adaptationist’, as are many biologists to this day. But how accurate is that view of life? In fact, it’s not so obviously true as is widely thought. For starters, it has been difficult to see how a sizeable fraction of observed biological variation could plausibly make any difference to selection. Instead, the evolutionary lesson may simply be that what works does work. Rather than a stringent screen for ‘survival of the fittest,’ a more apt description of selection may be a more tolerant ‘failure of the frail’ that weeds out what really doesn’t work. However, much of the remaining variation proliferates in a selectively neutral way; that is, without any effect on fitness.2,3

Ken Weiss is Evan Pugh Professor of Anthropology and Genetics at Pennsylvania State University. Email: [email protected]

© 2014 Wiley Periodicals, Inc.

DOI: 10.1002/evan.21365 Published online in Wiley Online Library (wileyonlinelibrary.com).

The central problem in such disputes is that what Darwin saw, and what we see today, is a snapshot of a very slow process that took place in the past. We have to supply the animation, inferring its nature by comparison, guesswork, and population genetics theory that only loosely constrains what might happen. Darwin could consider only physical traits, but if those traits are relevant to evolution they must be inherited. Since we can now ‘see’ DNA, we can ask what works at this fundamental causal level of life. If strong adaptationist ideas are right, we should be able to show that the elements in our genome are the result of selective constraints. We should not find anything that hasn’t been adaptively useful.

This is no mean task, since a human genome is more than 3,000,000,000 nucleotides long. We have about 20,000 protein-coding genes, but these together comprise only about 2% of the total genome; the remaining DNA is found in between these regions (Wikipedia: human genome). For several decades, the prevailing theory was that DNA is a necklace of protein codes strung together on a thread of nucleotides that didn’t do anything and were cutely dismissed as ‘junk DNA’.

But such an assertion raises the interesting question of how one could ever allege that something has no function. The answer, if you’re an adaptationist, is simply to accept, or rather assume, that there’s no such thing as a trait that has no adaptive value: what is here must have been favored by natural selection’s eagle eye. In this view, the cost of maintaining DNA, watching over its integrity, and producing copies of it in our billions of continually replicating cells must impose a huge energy burden, all to maintain a large functional vacuum of unused nucleotides that nature should abhor. Think of how much less you’d have to eat and compete for, and how much more nimble you’d be when a lion was after you, if you didn’t have to make literally miles and miles of useless DNA and haul it around all the time! Surely selection would not have tolerated such a burden.

On the other hand, if you’re a skeptical neutralist, you’re happy to see evidence of functionless DNA. After all, stuff just happens, and the very act of searching for, and purging, useless DNA would itself impose a burdensome cost, a game not worth the candle. From that viewpoint, intergenic DNA is just a sea of flotsam in among the ships. But this view, too, is not accurate, because numerous functions have clearly been identified over the last quarter-century in noncoding DNA (that is, DNA that does not directly code for protein), and this discovery has led us to be more circumspect about using the dismissive term ‘junk’. And that’s where our story begins.

ALONG CAME ENCODE

Now that whole genome sequences from humans and many other species are available, we’re able to investigate systematically what this DNA does. A few years ago, a project called ENCODE, an acronym for Encyclopedia of DNA Elements, was launched “to identify all functional elements in the human genome sequence” (Fig. 1 and http://www.genome.gov/10005107).4 In late 2012, to great fanfare, ENCODE released its latest results.

Figure 1. ENCODE project logo. Source: www.genome.gov/10005107. [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]

Three major approaches are taken to identify functional DNA elements: experimental, evolutionary, and statistical. In turn, these triangulate our approach to the problem by searching for mechanism, variation, and pattern. Mechanisms of the action of a specified bit of DNA can be experimentally identified. Variation is the result of evolutionary history, whereby a mechanism has been produced; once explained in one region of DNA, that sequence can be statistically analyzed within and between species to find similar patterns elsewhere in the genome where related function can be inferred. But life is not so straightforward, and we have a difficult time disentangling the empirical from the theoretical. This is easily seen by what is called the C-paradox.

THE C-PARADOX

There is essentially no correlation between the size of a species’ genome, denoted by C for the number of nucleotides it contains, and the biological complexity of the species (see Wikipedia: C-value enigma). Even closely related species can have genomes that differ in size by orders of magnitude (powers of 10). Among the more dramatic examples are lungfish, with more than 300 times as much DNA as pufferfish, yet both have similar organs, body organization, and physiology. From an adaptationist perspective, this hits painfully close to home because, for example, human genomes are similar in size to mouse genomes and more than 40 times smaller than that of a lungfish. It would be unacceptably humbling to think that we’re 40 times simpler than fish! Not only that, but despite how spectacularly better we (think we) are than they, primate genomes are all basically identical in C-value (and very similar in sequence).5

On the other hand, if this DNA has no function, then we don’t need any explanation relating genomic and organizational complexity. We don’t have to ask how the pufferfish can get away with so little DNA or how the lungfish can have evolved while lumbering around with so much. However, if even airlines charge for overweight baggage, it’s fair to ask how such a cost-free burden can exist in life. If it can, then it would be possible for the extra DNA to have been a sequence reservoir, which, by fortuitous mutation from time to time, could acquire adaptively advantageous function.

Even a strident pan-adaptationist should have no problem in principle with the possibility that functionless DNA baggage may be too difficult or costly for the organism’s cells to seek and purge. Although one might view it as adaptationists’ ultimate escape hatch, they could argue that ignoring functionless DNA is itself a form of adaptation! Perhaps it just so happened that pufferfish ancestors had some genome-cleaning mechanism and kept a clean house, while lungfish didn’t. These are interesting points to contemplate, but they’re moot unless we can know for sure whether noncoding DNA really has no other, more legitimate function.

The word ‘gene’ was coined in 1909 by the Danish botanist Wilhelm Johannsen, to refer to the physical units of heredity, whatever they were. The molecular identity of DNA elements that code for protein was worked out over the following half-century, but the word has subsequently become a kind of semantic albatross hanging around our necks. That’s because we now know of many kinds of genomic elements having location and activity that are experimentally replicable, but that are not directly protein-coding (see Box 1). We know what some of those elements do, but not everything in genomes is so clear. The noncoding RNA (ncRNA) and regulatory element categories are where simple explanations end and food-fights begin. This matters, because the way in which we evaluate what genomes do affects how we understand evolution itself.
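
The genome-size comparisons quoted in this section are easy to check with a little arithmetic. The sketch below is only an illustration: the genome sizes in it are rough, commonly quoted haploid figures assumed for the purpose (they do not come from this article), and the names GENOME_SIZE_BP and size_ratio are hypothetical.

```python
# Approximate haploid genome sizes in base pairs.
# ASSUMPTION: these are rough, commonly quoted round figures chosen for
# illustration; they are not taken from the article and vary by species
# and by assembly.
GENOME_SIZE_BP = {
    "pufferfish": 0.4e9,
    "mouse": 2.7e9,
    "human": 3.1e9,
    "lungfish": 130e9,
}


def size_ratio(larger, smaller, sizes=GENOME_SIZE_BP):
    """Return how many times bigger one genome is than another."""
    return sizes[larger] / sizes[smaller]


if __name__ == "__main__":
    pairs = [
        ("lungfish", "pufferfish"),
        ("lungfish", "human"),
        ("human", "mouse"),
    ]
    for larger, smaller in pairs:
        print(f"{larger} / {smaller}: {size_ratio(larger, smaller):.1f}x")
```

Under these assumed sizes the lungfish genome comes out more than 300 times the size of the pufferfish genome and more than 40 times the size of the human genome, while human and mouse differ by well under a factor of two. None of these ratios tracks any obvious difference in organismal complexity, which is the paradox.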

WHAT’S IN A NAME?

Poor Juliet thought that names didn’t make the named, but what’s in a name doesn’t always smell as sweet. The ENCODE reports4 were accompanied by an excessive, if perhaps nowadays typically orchestrated, amount of self-congratulatory trumpeting by investigators, journals, and popular media. In this case, the name is ‘function’, a word I’ve been using rather casually; perhaps one needs to be more careful. The ENCODE papers named many different genomic functions such as those described in Box 1, and reported finding instances of them all around the genome, asserting that the bulk of the genome is, in fact, not just a junkyard bystander, as had been assumed. The impression given was that these discoveries would revolutionize everything except sliced bread (and, who knows, maybe that, too!). And, indeed, their very naming of all these functional sites caused troubles that would worry even a Montague or a Capulet.

For example, if one isolates all the RNA from a given type of cell, sequences each molecule, and aligns the RNA sequence against the genome DNA sequence, it turns out that a high fraction (some have estimated half or even more of our DNA) is transcribed into RNA. The patterns are replicable, not just laboratory or cellular artifacts, and there is clear evidence of biologically important activity for some of the RNAs.7 This lends credence to the idea that there may be functions for the rest. But what functions?

A few months after the ENCODE release, things hit the proverbial fan with a strong satirical blast by Dan Graur and colleagues8 and a less inflammatory one by Ford Doolittle,9 as well as others. Vitriolic exchanges went viral on the web. To try to


Box 1. A Few Types of Non-protein-coding Units in Genomes

Centromeres and telomeres - There are some jumbled sequences near the centromere of each chromosome, which are used in the process by which chromosomes are sorted when cells divide. These are largely made of fragments of sequence found elsewhere in the genome, so seem to have no additional function in the centromere. Similarly, the ends of chromosomes have highly repeated sequence elements called telomeres. These protect the chromosome from being damaged or eaten away by the chemical environment of the cell. So they have at least this function, and they have also been associated with biological aging.

Introns - Most genes consist of protein coding regions interspersed with noncoding regions of DNA. The whole gene is transcribed into RNA, but the introns are spliced out in the making of mature messenger RNA. Some elements of introns are involved as splicing signals, but most introns seem to have little if any other function.

Regulatory elements - There are short sequences, generally
