Proc. Natl. Acad. Sci. USA Vol. 89, pp. 7526-7530, August 1992 Plant Biology

Classification and evolution of α-amylase genes in plants (cereals/phylogeny/polymerase chain reaction/signature regions)

NING HUANG*, G. LEDYARD STEBBINS, AND RAYMOND L. RODRIGUEZ† Department of Genetics, University of California, Davis, CA 95616

Contributed by G. Ledyard Stebbins, April 24, 1992

ABSTRACT The DNA sequences for 17 plant genes for α-amylase (EC 3.2.1.1) were analyzed to determine their phylogenetic relationship. A phylogeny for these genes was obtained using two separate approaches, one based on molecular clock assumptions and the other based on a comparison of sequence polymorphisms (i.e., small and localized insertions) in the α-amylase genes. These polymorphisms are called "α-amylase signatures" because they are diagnostic of the gene subfamily to which a particular α-amylase gene belongs. Results indicate that the cereal α-amylase genes fall into two major classes: AmyA and AmyB. The AmyA class is subdivided into the Amy1 and Amy2 subfamilies previously used to classify α-amylase genes in barley and wheat. The AmyB class includes the Amy3 subfamily to which most of the α-amylase genes of rice belong. Using polymerase chain reaction and oligonucleotide primers that flank one of the two signature regions, we show that the AmyA and AmyB gene classes are present in approximately equal amounts in all grass species examined except barley. The AmyB (Amy3 subfamily) genes in the latter case are comparatively underrepresented. Additional evidence suggests that the AmyA genes appeared recently and may be confined to the grass family.

α-Amylase (EC 3.2.1.1) plays a key role in the metabolism of the plant by hydrolyzing starch in the germinating seed and in other tissues. This is accomplished primarily through the α-1,4 endoglycolytic cleavage of amylose and amylopectin, the principal components of starch granules in plant cells. Because of its importance to cereal seed germination and malting, the genetic basis and biochemical mechanism of this process have been the subject of study for many years (1). More recent descriptions of the biology and biochemistry of plant α-amylases have been reviewed by Akazawa et al. (2) and Fincher (3). Protein sequence comparisons of α-amylases from plants, animals, and microbes reveal four highly conserved domains that correspond to sites necessary for enzyme structure and/or function (4). A subsequent comparison of 11 different cereal α-amylase proteins revealed the same four sites plus three additional sites (5). Two of these correspond to intron splice sites, whereas the third may represent a duplication of the calcium binding site, essential for enzyme stability. These results clearly indicate a common ancestry for the α-amylases and conservation of critical sequences over evolutionary time. Although three α-amylase isozymes have been detected in germinating rice seeds (6, 7), it is now known that the rice α-amylase isozymes are encoded by a family of 10 genes located on five different chromosomes (8-10). Multiple α-amylase genes are also known to exist in barley (11) and wheat (12). In barley, seven genes encoding α-amylase isozymes with high isoelectric points (pI) map to chromosome 1, whereas four low pI genes map to chromosome 6 (11, 13). In hexaploid wheat, 12-14 α-Amy1 genes reside on homeologous chromosomes 6A, 6B, and 6D, whereas 10 or 11 α-Amy2 genes are found on chromosomes 7A, 7B, and 7D. The wheat α-Amy3 genes (14, 15) map to the homeologues of chromosome 5.

Different experimental approaches to the study of cereal α-amylases have led to different nomenclatures and classification schemes (16). Consequently, some confusion has arisen as more protein and nucleotide sequences have accumulated in electronic data bases. For example, in barley, genes for the high pI isozyme have been alternatively called Amy1 (17) and Amy2 (18). On the other hand, genes for the low pI isozyme are called either Amy2 (17) or Amy1 (18). Since zymograms do not resolve the α-amylase isozymes of rice into distinct high and low pI groups (2, 19), it has been difficult to relate these isozymes to those observed in barley and wheat. Subsequently, Huang et al. (20) used DNA-DNA hybridization to classify 30 rice genomic clones into five hybridization groups. These five groups were eventually consolidated into three subfamilies (Amy1, Amy2, and Amy3) on the basis of protein comparisons to other wheat and barley α-amylase genes (8, 20-22). Other investigators have used the high and low pI nomenclature to describe their cloned rice α-amylase genes (23, 24). Since the cereal α-amylase genes appear to be evolutionarily related, we believe an understanding of their phylogeny can be used to establish a consistent and informative nomenclature applicable to other monocot and perhaps dicot α-amylase genes.

For this paper, we examined the phylogenetic relationships of several plant α-amylase genes with and without the rate-constancy assumption of the molecular clock hypothesis. On the basis of this analysis, the cereal α-amylase genes could be divided into two major classes: AmyA and AmyB. The AmyA class consists of the Amy1 and Amy2 subfamilies, whereas the AmyB class consists of the Amy3 subfamily. All grasses examined so far have both the AmyA and AmyB gene classes. The AmyB genes contain some structural characteristics of the prototype α-amylase gene, which may relate them to a common ancestor for α-amylase in other angiosperms.

MATERIALS AND METHODS Preparation of Plant Genomic DNA. Plant DNAs suitable for polymerase chain reaction (PCR) were isolated using the CTAB procedure (25) as modified by Rogers and Bendich (26). DNA was isolated from the following tissues: 10-day etiolated young leaves of rice (cv. M202), 7-day etiolated young leaves of barley (cv. Klages and cv. Himalaya), 10-day etiolated young leaves of ryegrass, leaf tissue from Zebrina pendula (wandering jew), and Vigna radiata (mung bean) sprouts. DNA for rye, oats, Triticum aestivum (hexaploid wheat), and Triticum urartu (diploid wheat) was generously provided by H. Zhang and J. Dvorak (Agronomy Department, University of California, Davis), whereas maize DNA was provided by

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

*Present address: International Rice Research Institute, P.O. Box 933, 1099 Manila, The Philippines.
†To whom reprint requests should be addressed.


J. Callis (Biochemistry and Biophysics Department, University of California, Davis). Brassica and potato DNA were gifts from C. Quiros and J. Hu (Department of Vegetable Crops, University of California, Davis). DNA Sequences of Rice, Barley, Wheat, and Mung Bean α-Amylase Genes. DNA sequences of barley, wheat, and mung bean α-amylase genes were obtained from published reports (see Fig. 1 legend) or from the GenBank data base (Release 70). Partial DNA sequences were not included in this study. Where sequences were identical or nearly identical, only one was chosen. Unpublished sequences for two wheat genes (W-Amy1/13 and W-Amy2/54) were kindly provided by D. Baulcombe and M. Lazarus. Rice α-amylase gene sequences were previously determined by this laboratory and are available from GenBank. The DNA sequences were aligned by PILEUP and edited by LINEUP of the GCG Sequence Analysis Software Package (27). Aligned sequences were used to estimate the synonymous and nonsynonymous nucleotide substitution rates using a program developed by Li et al. (28). Nonsynonymous nucleotide substitutions were taken as a measure of genetic distance and the UPGMA program was used to construct the phylogenetic tree shown in Fig. 1 (29). PCR Amplification of the Signature Region. PCR primers were synthesized on a Cyclone Plus DNA synthesizer (MilliGen/Biosearch, Novato, CA). The 22-mer (no. 139, 5'-CGGCA GGAGC TCGTC AACTG GG-3') anneals 5' to the signature region, whereas the 24-mer (no. 138, 5'-GAGCA TGCCC TTGGT GGTGA AGTC-3') is located 3' to the signature region (Fig. 3). The expected PCR products are 84 and 87 base pairs (bp) for AmyA genes and 78 bp for AmyB genes. The design of these primers was based on the following two criteria: (i) both primers correspond to highly conserved regions in the α-amylase genes and (ii) the PCR products for the AmyA and AmyB genes could be easily resolved by polyacrylamide gel electrophoresis.
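As a quick sanity check on the two primers described above, the following minimal Python sketch reports the length and GC content of each primer and its reverse complement. The helper names (`gc_count`, `reverse_complement`) are introduced here for illustration only and are not part of the original methods; the sequences are those printed in the text with the spacing removed.

```python
# Primer sequences as printed in the text (5' to 3'), spaces removed.
PRIMERS = {
    "139": "CGGCAGGAGCTCGTCAACTGGG",    # 22-mer, anneals 5' to the signature region
    "138": "GAGCATGCCCTTGGTGGTGAAGTC",  # 24-mer, located 3' to the signature region
}

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def gc_count(seq):
    """Number of G or C bases in seq (illustrative helper)."""
    return sum(base in "GC" for base in seq)


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (illustrative helper)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        gc = gc_count(seq)
        print(
            f"primer {name}: {len(seq)} nt, GC {gc}/{len(seq)}, "
            f"reverse complement {reverse_complement(seq)}"
        )
```

Because primer 138 lies 3' to the signature region, its reverse complement is the form that would appear on the coding strand of an alignment such as Fig. 3.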
The two mung bean α-amylase gene primers were based on previously published sequence information (30). PCR amplification of total genomic DNA from various plant species was performed according to previously published procedures (8). Fifty to 500 ng of genomic DNA (depending on the genome size) was used for PCR amplification, whereas only 50 pg of cloned α-amylase gene DNA was used for control amplifications. PCRs were carried out in 50 μl containing genomic DNA, 0.4 μM primer 138, 0.4 μM primer 139, 0.2 mg of bovine serum albumin per ml, 1 mM each dNTP, and 1 unit of Taq polymerase (United States Biochemical) in Taq polymerase buffer [20 mM Tris, pH 8.4/5 mM MgCl2/2.5 mM KCl/15 mM (NH4)2SO4]. DNA denaturation was at 94°C for 1 min, primer annealing was at 55°C for 1 min, and primer extension was at 72°C for 2 min. The reaction was run for 30 cycles before a final 7-min primer extension at 72°C. The PCR products were resolved by electrophoresis on 10% polyacrylamide gels run in 1× TAE (40 mM Tris-acetate/2 mM EDTA) buffer and the separated DNA bands were visualized by ethidium bromide


Table 1. Synonymous (Ks) and nonsynonymous (KA) substitutions per site between mung bean (MB), rice, barley, and wheat

Comparison      Amy gene     No.   KA               Ks
Monocot/dicot
  MB/barley                  6     0.265 ± 0.019    NC
  MB/rice                    7     0.266 ± 0.019    NC
  MB/wheat                   3     0.264 ± 0.019    NC
  MB/Amy1                    5     0.249 ± 0.018    NC
  MB/Amy2                    5     0.278 ± 0.019    NC
  MB/Amy3                    6     0.267 ± 0.019    NC
Monocot/monocot
  Rice/barley   Amy1         3     0.093 ± 0.010    0.461 ± 0.053
                Amy2         3     0.124 ± 0.012    0.671 ± 0.076
  Rice/wheat    Amy1         1     0.096 ± 0.010    0.461 ± 0.052
                Amy2         1     0.122 ± 0.012    0.559 ± 0.064
                Amy3         5     0.182 ± 0.015    0.782 ± 0.086
                Amy1/Amy2    2     0.194 ± 0.016    0.752 ± 0.092
  Wheat/barley  Amy1         3     0.025 ± 0.005    0.129 ± 0.028
                Amy2         3     0.042 ± 0.006    0.207 ± 0.030

See Fig. 1 for sequence sources. Values are presented as mean ± SD. NC, not calculated.
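The correlation between Ks and KA described in the Results can be explored with the monocot/monocot means listed in Table 1. The sketch below is illustrative only: it computes a Pearson coefficient over the eight tabulated mean values, which is an assumption about the input, since the text does not state which comparisons entered the reported coefficient of 0.982. The result therefore need not reproduce that figure exactly. The function name `pearson` is introduced for this example.

```python
import math

# (KA, Ks) mean values for the monocot/monocot rows of Table 1.
PAIRS = [
    (0.093, 0.461),  # rice/barley, Amy1
    (0.124, 0.671),  # rice/barley, Amy2
    (0.096, 0.461),  # rice/wheat, Amy1
    (0.122, 0.559),  # rice/wheat, Amy2
    (0.182, 0.782),  # rice/wheat, Amy3
    (0.194, 0.752),  # Amy1/Amy2
    (0.025, 0.129),  # wheat/barley, Amy1
    (0.042, 0.207),  # wheat/barley, Amy2
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    ka = [pair[0] for pair in PAIRS]
    ks = [pair[1] for pair in PAIRS]
    print(f"Pearson r between KA and Ks: {pearson(ka, ks):.3f}")
```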

staining.

RESULTS Phylogenetic Relationship of α-Amylase Genes by Sequence Homology. DNA sequences of 17 α-amylase genes from rice, barley, wheat, and mung bean were aligned and the number of nucleotide substitutions at synonymous (Ks) and nonsynonymous (KA) sites was estimated based on this alignment (28). Because synonymous nucleotide substitution sites were saturated between monocot and dicot genes, these values were not calculated (Table 1). Based on the nonsynonymous substitution rate, it is clear that the mung bean α-amylase gene is equidistant from the Amy1, Amy2, and Amy3 genes of barley, rice, and wheat, indicating equal rates of α-amylase gene evolution in these three species. The Ks and KA values were also determined for the monocot α-amylase genes (Table 1) and, as reported previously (31), Ks values were much higher than KA values. A strong, positive correlation (correlation coefficient = 0.982) was observed between these sets of values. Because Ks values for monocot/dicot α-amylase genes were not available (Table 1), KA values were used to construct the phylogenetic tree shown in Fig. 1. Assuming that the pattern of gene evolution is divergent, all monocot α-amylase genes were found to share a common ancestor with the mung bean α-amylase gene, indicating that monocot and dicot α-amylase genes were derived from a single progenitor. Since the divergence of monocot/dicot plants, the

FIG. 1. Phylogenetic relationship of α-amylase genes from rice (R), barley (B), wheat (W), and mung bean. DNA sequences for the exons of 17 α-amylase genes were aligned and a phylogenetic tree was constructed (horizontal axis: nonsynonymous substitutions per site). DNA sequence information was obtained from the following sources: RAmy2A (22), Amy32b (32), CloneE (33), gKAmy155 and gKAmy141 (17), Amy2/54 and Amy1/13 (generously provided by D. Baulcombe and M. Lazarus), Amy6-4 and Amy46 (13), RAmy1A (20), RAmy3A, RAmy3B, and RAmy3C (21), RAmy3D and RAmy3E (8), mung bean (30), Amy3/33 (14).
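The UPGMA clustering used to build the tree in Fig. 1 can be sketched in a few lines of Python. This is a minimal illustration of the general algorithm, not the program the authors ran; the function name `upgma`, the taxon labels, and the toy distances are hypothetical and are not taken from the α-amylase data.

```python
def upgma(labels, distance):
    """Cluster `labels` by UPGMA.

    `distance` maps frozenset({a, b}) to the distance between a and b.
    Returns a list of merges (left, right, height), where height is half
    the average distance between the two clusters being joined.
    """
    size = {label: 1 for label in labels}  # cluster -> number of leaves
    d = dict(distance)                     # working copy of the distances
    merges = []
    while len(size) > 1:
        pair = min(d, key=d.get)           # closest pair of clusters
        a, b = sorted(pair, key=str)       # deterministic left/right order
        merged = (a, b)
        merges.append((a, b, d[pair] / 2))
        for other in size:
            if other in pair:
                continue
            # Size-weighted mean equals the average over all leaf pairs.
            d[frozenset((merged, other))] = (
                size[a] * d[frozenset((a, other))]
                + size[b] * d[frozenset((b, other))]
            ) / (size[a] + size[b])
        # Drop every distance that involves the two clusters just merged.
        d = {k: v for k, v in d.items() if a not in k and b not in k}
        size[merged] = size.pop(a) + size.pop(b)
    return merges


# Hypothetical toy distances for four taxa (not data from this study).
TOY_LABELS = ["A", "B", "C", "D"]
TOY_DISTANCE = {
    frozenset(("A", "B")): 0.2,
    frozenset(("A", "C")): 0.6,
    frozenset(("B", "C")): 0.6,
    frozenset(("A", "D")): 1.0,
    frozenset(("B", "D")): 1.0,
    frozenset(("C", "D")): 1.0,
}

if __name__ == "__main__":
    for left, right, height in upgma(TOY_LABELS, TOY_DISTANCE):
        print(f"join {left} + {right} at height {height:.3f}")
```

UPGMA assumes a constant rate of change along all lineages, which is the rate-constancy assumption of the molecular clock hypothesis.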


Table 2. Synonymous (Ks) and nonsynonymous (KA) substitutions per site between rice and barley

Gene      L      Ks               KA
Amy1      453    0.461 ± 0.053    0.093 ± 0.010
Amy2      453    0.671 ± 0.076    0.124 ± 0.012
ADH2      379    0.564 ± 0.065    0.090 ± 0.012
Waxy      611    0.704 ± 0.063    0.091 ± 0.009
Lectin    228    0.594 ± 0.108    0.227 ± 0.024
CAB2      266    0.812 ± 0.107    0.119 ± 0.015

Values are presented as mean ± SD. L, number of codons compared. Sequences were obtained from GenBank Release 70.

α-amylase gene in the monocot lineage leading to the grasses (Poaceae) expanded into a multigene family consisting of three subfamilies: Amy1, Amy2, and Amy3. To estimate the rate of α-amylase gene evolution relative to other nuclear genes, Ks and KA values for six different rice and barley genes were calculated (Table 2). The values for the α-amylase genes (Amy1 and Amy2) were found to be comparable to other nuclear plant genes shown in Table 2 and to those estimated by Wolfe et al. (31). Use of Sequence Polymorphisms to Determine the Phylogeny of Cereal α-Amylase Genes. DNA sequence comparisons revealed a possible phylogenetic relationship based on the rate-constancy assumption of the molecular clock hypothesis. Alternatively, a phylogeny for the α-amylase genes was derived based on the analysis of two polymorphic regions (small, localized insertions) in the aligned sequences. Insertions in DNA are generally considered to be rare events compared to nucleotide substitutions and tend to be maintained after gene duplication (34). Therefore, genes sharing the same insertion are probably derived from a common ancestor and should be classified into the same group. Examination of the aligned α-amylase DNA sequences revealed two distinct regions that are phylogenetically informative (Figs. 2 and 3). These regions of sequence polymorphisms are called "α-amylase signature regions" since they can be used to define the subfamily of a particular cereal α-amylase gene without the need to determine genetic distance. The first of these regions (α-amylase signature I) is located around the junction between the signal peptide and the amino-terminal end of the mature peptide (Fig. 2). On the basis of sequence differences in this region, the α-amylase genes could be organized into three groups consistent with the Amy1, Amy2, and Amy3 subfamilies currently used to describe the wheat α-amylase genes. This signature region has the following characteristics: TCC GGG --- --- for the Amy1 genes, TCC GGC CAT --- for the Amy2 genes, and ----GCN for the Amy3 genes. Although rice α-amylase genes RAmy3A and RAmy3D are at variance with other genes in the Amy3 subfamily, their classification as such is supported by genetic distance measurements (Fig. 1) and sequence characteristics in the α-amylase signature II region (Fig. 3). The α-amylase signature II region is located between nucleotide positions 769 and 855 of the consensus sequence (Fig. 3). In this region, sequences are highly conserved, as revealed in the consensus sequence. Based on the sequence polymorphisms found in these two regions, the α-amylase genes can be sorted into two groups: those genes without insertions (AmyB) and those genes with either a 6- or 9-bp insertion (AmyA). The genes in the AmyA group can be further divided into Amy1 and Amy2, depending on the sequence polymorphisms shown in Fig. 2. This classification scheme is in complete agreement with a phylogeny based on genetic distance (Fig. 1).

FIG. 2. Partial DNA sequence alignment of the α-amylase genes. DNA sequences of 17 α-amylase genes from rice, barley, wheat, and mung bean were aligned using the PILEUP command of the GCG DNA Sequence Analysis Software (27). Gaps were inserted only between codons and in multiples of three. Sources of DNA sequences are indicated in the legend to Fig. 1.

Barley Contains Amy3 Gene: The AmyB Class. Prior to our research, Amy3-type genes had been reported only in wheat (14). No gene belonging to this subfamily had been reported for barley

FIG. 3. Partial DNA sequence alignment of the α-amylase genes (signature II region, consensus nucleotide positions 769-855). Sequences were aligned as described in the legend to Fig. 2. The primers for DNA amplification of AmyA and AmyB genes are shown below the consensus sequence. The monocot primers were synthesized based on the consensus sequence, whereas the dicot primers were synthesized based only on the mung bean cDNA sequence. Nucleotide coordinates are based on the consensus sequence starting from the A of the first AUG codon and excluding introns.

or any other member of the grass family. If our assumption of a common α-amylase ancestor is correct, one would predict that the barley genome should contain at least one Amy3 gene. To test this possibility we took advantage of the signature region shown in Fig. 3 and used flanking primers to search for an Amy3-type gene in barley genomic DNA. According to this strategy, one would predict PCR products of 78 bp for the AmyB genes (which include members of the Amy3 subfamily) and 84-bp and 87-bp products for the AmyA genes. The validity of this strategy was confirmed by PCR amplification of the appropriate positive controls (Fig. 4). For example, when primers 138 and 139 were used to amplify two rice α-amylase cDNA clones, one corresponding to an AmyA gene (20) and the other corresponding to an AmyB gene (8), PCR products of about 87 bp (Fig. 4, lane 2) and 78 bp (Fig. 4, lane 3) could be observed. Furthermore, when these same primers were used to amplify two barley cDNA clones, one corresponding to an Amy1 gene and the other to an Amy2 gene, PCR products of about 84 bp (Fig. 4, lane 6) and 87 bp (Fig. 4, lane 5) were obtained. Under the electrophoresis conditions used in this study, the 84-bp and 87-bp products comigrate as one band when run together in the same lane or when amplified from genomic DNA. This can be seen when genomic DNA from rice and wheat (T. aestivum) was amplified. The PCR products of AmyA genes (84 bp and 87 bp) and AmyB genes (78 bp) could be clearly seen in those lanes containing rice (Fig. 4, lane 4) and wheat (Fig. 4, lane 8) amplification products. These results demonstrate that the primers and the amplification conditions used were suitable for detecting AmyB genes in the genomes of barley and other species. The amplification of barley genomic DNA (Fig. 4, lane 7) produced two bands that matched those produced by the rice and wheat genomic DNAs and the two rice cDNA clones. The lower of these bands corresponds to the 78-bp product characteristic of the AmyB gene class. The faintness of this band is probably due to the low copy number of this gene class in the barley genome. AmyA and AmyB Genes Present in the Grass Family. To determine if AmyA and AmyB genes are present in other members of the grass family, we used PCR amplification of the signature II region to examine genomic DNAs from eight species (Fig. 5). All of them exhibited the 84- and 87-bp and the 78-bp bands characteristic of the AmyA and AmyB gene classes, respectively. Except in the case of barley, all genomic DNAs produced AmyA and AmyB bands of approximately equal intensity. Assuming the efficiency of amplification is the same for both gene classes, these species appear to have about equal numbers of AmyA and AmyB genes. In the case of barley, the AmyA class (which contains the Amy1 and Amy2 subfamily genes) appears to be the predominant gene class. When this analysis was expanded to include genomic DNA from other members of the grass family, PCR products


FIG. 5. PCR amplification of the signature regions of selected monocot plant genomic DNA. Lane 1, rice; lane 2, barley cv. Klages; lane 3, barley cv. Himalaya; lane 4, wheat (hexaploid); lane 5, corn; lane 6, rye; lane 7, ryegrass; lane 8, oat; lane 9, wheat (diploid); and lane M, molecular size markers (see Fig. 4).

indicative of AmyA and AmyB gene classes were observed (Fig. 5). However, when genomic DNAs from distantly related monocots such as Z. pendula or dicot species (e.g., mung bean) were amplified, only the AmyB class genes were detected (Fig. 6). In the case of potato genomic DNA, no bands were observed. In the latter two instances, primers based on the mung bean α-amylase gene sequence (Fig. 3) were used for PCR. The absence of PCR products in the case of potato DNA suggests nucleotide sequence divergence in one or both of the primer regions. Similar negative results were obtained with genomic DNA from Brassica and Arabidopsis (data not shown). Presumably, as more sequence information on dicot α-amylases becomes available, it should be possible to design new primer sets that will enable us to extend this investigation to a wider range of species.
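The band-size logic behind the PCR survey can be summarized in a short Python sketch. It assumes exact product sizes are known, whereas on the gels described here the 84-bp and 87-bp products comigrate; the function names `classify_product` and `classes_present` are hypothetical and introduced only for illustration.

```python
# Expected product sizes for primers 138 and 139, as stated in the text.
AMYB_PRODUCT_BP = 78         # no insertion in the signature II region
AMYA_PRODUCTS_BP = (84, 87)  # 6-bp or 9-bp insertion relative to AmyB


def classify_product(size_bp):
    """Map a PCR product size (bp) to a gene class (illustrative helper)."""
    if size_bp == AMYB_PRODUCT_BP:
        return "AmyB"
    if size_bp in AMYA_PRODUCTS_BP:
        return "AmyA"
    return "unclassified"


def classes_present(band_sizes_bp):
    """Sorted list of gene classes indicated by a set of band sizes."""
    found = {classify_product(size) for size in band_sizes_bp}
    found.discard("unclassified")
    return sorted(found)


if __name__ == "__main__":
    # Band sizes expected for a grass genome carrying both classes.
    print(classes_present([78, 84, 87]))
    # Only the 78-bp product, as described for Z. pendula and mung bean.
    print(classes_present([78]))
```

The size difference between an AmyA product and the 78-bp AmyB product (6 or 9 bp) corresponds to the length of the insertion in the signature II region.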

DISCUSSION α-Amylase Genes in Plants Consist of Two Classes: AmyA and AmyB. Based on isozyme and DNA sequence analysis, barley and wheat α-amylase genes can be classified into two groups: the high pI, type A or Amy1 group and the low pI, type B or Amy2 group (13, 15, 17, 36). Recently, an additional subfamily of α-amylase genes (Amy3) has been identified in wheat. The genes in Amy3 are much less homologous to the wheat Amy1 and Amy2 genes than these genes are to each other (14). One of the wheat Amy3 genes, Amy3/33, is only expressed at low levels in immature seeds and no orthologous gene has been detected in barley (14). Our studies indicate that 6 of the 10 rice α-amylase genes belong to the Amy3 subfamily (8, 9, 21). Unlike the wheat Amy3 gene, however, members of this gene subfamily in rice are expressed in germinating seeds as well as in other tissues of the plant (8). From the data shown in Figs. 1-3, it is clear that the α-amylase genes can be divided into two main classes (AmyA and AmyB) and three subfamilies (Amy1, Amy2, and Amy3). The ability to classify the cereal α-amylase genes in this way allowed us to conduct a cursory survey of the distribution of these gene classes in other grasses using PCR (Figs. 3-5). These results indicate that AmyA and AmyB genes are present in all grass species examined. Furthermore, the relative proportion of genes in these two classes is approximately equal with the exception of barley. In the latter case, AmyB gene(s) were present in low copy number. It is not clear why the distribution of genes within the AmyA and AmyB classes varies so drastically between rice and barley (Figs. 1 and 4). It is tempting to speculate that this asymmetry is due to the habitats or growth conditions for which rice and

FIG. 4. Amplification of rice, barley, and wheat genomic DNA using primers 138 and 139. The same PCR conditions were used for all template DNAs. The absence of bands in lane 1 (no DNA) indicates that the PCRs were free of contaminating templates and artifact bands. Lane 2, RAmy1A (8); lane 3, RAmy3D (8); lane 4, rice genomic DNA; lane 5, CloneE (33); lane 6, pM/C (35); lane 7, barley genomic DNA; lane 8, wheat genomic DNA; and lane 9, molecular size markers: pBluescript KS- digested with restriction enzyme Hae III.

FIG. 6. Amplification of signature region of genomic DNA of Z. pendula and mung bean. The DNA of Z. pendula was amplified with the primers used in Figs. 4 and 5. The mung bean primers are shown in Fig. 3. The PCR products of rice (lane 1), Z. pendula (lane 2), mung bean (lane 3), and potato (lane 4) were resolved on a 10% polyacrylamide gel. Band positions of 84-87 bp and 78 bp are indicated.


barley were selected (e.g., tropical/subtropical vs. temperate climates and relatively slow vs. rapid germination). However, careful investigation of other grass genera will be required before such a hypothesis can be safely formulated. The purpose of maintaining multiple isozymes for α-amylase is not known, although slight differences in the enzymatic properties may make one isozyme better suited to a particular substrate or intracellular environment than another. This notion is supported by studies indicating that wheat α-amylase isozymes I and II adsorb and degrade starch granules at different rates (37, 38). Moreover, we have observed that the rice α-amylase isozyme encoded by the RAmy1A gene has a low pH optimum, whereas the RAmy3D isozyme is active over a broad pH range (M. Terashima and R.L.R., unpublished observation). We believe such enzymatic differences may reflect adaptive changes in the physiology and morphology of each species in response to the climatic and edaphic requirements of their respective habitats. It remains to be seen whether differences in the distribution of genes within the AmyA and AmyB classes are associated with these adaptive changes. Evolution of Plant α-Amylase Genes. On the basis of nucleotide sequence information, the phylogeny of the α-amylase genes shows that monocot and dicot α-amylase genes are derived from a common ancestor. Therefore, the α-amylase genes in the monocot lineage must have resulted from a duplication to form AmyA and AmyB genes. The PCR results (Figs. 4-6) suggest that AmyA genes are restricted to the grass family and arose from a duplication of an AmyB gene after the grass family separated from the ancestors of the Commelinaceae, to which Z. pendula belongs. Subsequent duplications of the AmyA and AmyB genes produced the Amy1, Amy2, and Amy3 genes that comprise the subfamilies shown in Fig. 1. Since all Amy3 genes, as well as the mung bean α-amylase gene, lack the 6- and 9-bp insertions diagnostic of the AmyA class (Fig. 3), we assume that the ancestral gene that preceded the monocot-dicot divergence also lacked these insertions. Furthermore, the evolution of the AmyA class can be explained by a 9-bp insertion event before the separation of rice and wheat/barley to produce an Amy2-type gene, followed by a 3-bp deletion event to produce an Amy1-type gene. This would explain the presence of the 9-bp insertion in the Amy1 and Amy2 genes of rice. That the 3-bp deletion occurred in the Amy1 genes before the divergence of barley and wheat is supported by the observation that all known Amy1 genes in barley and wheat contain this 3-bp deletion. The possibility that the Amy1 genes evolved from Amy3 genes by a 6-bp insertion, followed by an additional 3-bp insertion to produce the Amy2 genes, cannot be ruled out at this time. Further analysis of the signature regions in other subfamilies, such as the Arundinoideae, Chloridoideae, and the Panicoideae, should answer this question. The picture emerging for the cereal α-amylases raises several questions of interest to evolutionists and molecular biologists. For example, why and how has this diversity evolved? Has it aided grasses in occupying the enormous range of habitats that are characteristic of this family? How do differences in the distribution of α-amylase genes correlate with molecular data on plant ribosomal and chloroplast genes? The experimental approach described in this paper should provide answers to these and other important questions in molecular evolution.

We thank Michael Clegg and John Gillespie for critical comments on this manuscript and Steve Reinl and John Chandler for their expert technical assistance.

1. Brown, H. T. & Morris, G. H. (1890) J. Chem. Soc. 57, 458.

2. Akazawa, T., Mitsui, T. & Hayashi, M. (1988) Biochem. Plants 14, 465-492.
3. Fincher, G. B. (1989) Annu. Rev. Plant Physiol. Plant Mol. Biol. 40, 305-346.
4. Nakajima, R., Imanaka, T. & Aiba, S. (1986) Appl. Microbiol. Biotechnol. 23, 355-360.
5. O'Neill, S. D., Kumagai, M. H., Majumdar, A., Huang, N., Sutliff, T. D. & Rodriguez, R. L. (1990) Mol. Gen. Genet. 221, 235-244.
6. Miyata, S. & Akazawa, T. (1982) Plant Physiol. 70, 147-153.
7. Daussant, J., Miyata, S., Mitsui, T. & Akazawa, T. (1983) Plant Physiol. 71, 88-95.
8. Huang, N., Koizumi, N., Reinl, S. & Rodriguez, R. L. (1990) Nucleic Acids Res. 18, 7007-7014.
9. Ranjhan, S., Litts, J. C., Foolad, M. & Rodriguez, R. L. (1991) Theor. Appl. Genet. 82, 481-488.
10. Rodriguez, R. L., Huang, N., Sutliff, T. D., Ranjhan, S., Karrer, E. & Litts, J. (1992) Rice Genetics II: Proceeding of the Second International Rice Genetics Symposium (Int. Rice Res. Inst., Los Banos, Philippines), pp. 417-429.
11. Muthukrishnan, S., Gill, B. S., Swegle, M. & Chandra, G. R. (1984) J. Biol. Chem. 259, 13637-13639.
12. Gale, M. D., Law, C. N., Chojecki, A. J. & Kempton, R. A. (1983) Theor. Appl. Genet. 64, 309-316.
13. Khursheed, B. & Rogers, J. C. (1988) J. Biol. Chem. 263, 18953-18960.
14. Baulcombe, D. C., Huttly, A. K., Martienssen, R. A., Barker, R. F. & Jarvis, M. G. (1987) Mol. Gen. Genet. 209, 33-40.
15. Huttly, A. K., Martienssen, R. A. & Baulcombe, D. C. (1988) Mol. Gen. Genet. 214, 232-240.
16. MacGregor, E. A. & MacGregor, A. W. (1987) CRC Crit. Rev. Biochem. 5, 129-142.
17. Knox, C. A. P., Sonthayanon, B., Chandra, G. R. & Muthukrishnan, S. (1987) Plant Mol. Biol. 9, 3-17.
18. Aoyagi, K., Sticher, L., Wu, M. & Jones, R. L. (1990) Planta 180, 333-340.
19. Akazawa, T. & Hara-Nishimura, I. (1985) Annu. Rev. Plant Physiol. 36, 441-472.
20. Huang, N., Sutliff, T. D., Litts, J. C. & Rodriguez, R. L. (1990) Plant Mol. Biol. 14, 655-668.
21. Sutliff, T. D., Huang, N., Litts, J. C. & Rodriguez, R. L. (1991) Plant Mol. Biol. 16, 579-591.
22. Huang, N., Reinl, S. J. & Rodriguez, R. L. (1992) Gene 111, 223-228.
23. Ou-Lee, T., Turgeon, R. & Wu, R. (1988) Proc. Natl. Acad. Sci. USA 85, 6366-6369.
24. Yu, S., Tai, Y., Goldman, S., Chuu, Y., Ou-Lee, T. & Wu, R. (1990) in Structure and Function of Nucleic Acids and Proteins, eds. Wu, Y. & Wu, C. (Raven, New York), pp. 287-295.
25. Murray, M. G. & Thompson, W. F. (1980) Nucleic Acids Res. 8, 4321-4325.
26. Rogers, S. O. & Bendich, A. J. (1985) Plant Mol. Biol. 5, 69-76.
27. Devereux, J., Haeberli, P. & Smithies, O. (1984) Nucleic Acids Res. 12, 387-395.
28. Li, W., Wu, C. & Luo, C. (1985) Mol. Biol. Evol. 2, 150-174.
29. Nei, M. (1987) Molecular Evolutionary Genetics (Columbia Univ. Press, New York), pp. 287-326.
30. Koizuka, N., Tanaka, Y. & Morohashi, Y. (1990) Plant Physiol. 94, 1488-1491.
31. Wolfe, K. H., Sharp, P. M. & Li, W. (1989) J. Mol. Evol. 29, 208-211.
32. Whittier, R. F., Dean, D. A. & Rogers, J. C. (1987) Nucleic Acids Res. 15, 2515-2535.
33. Rogers, J. C. & Milliman, C. (1983) J. Biol. Chem. 258, 8169-8174.
34. Smith, T. F., Waterman, M. S. & Fitch, W. M. (1981) J. Mol. Evol. 18, 38-46.
35. Rogers, J. C. (1985) J. Biol. Chem. 260, 3731-3738.
36. Jacobsen, J. V. & Higgins, T. J. V. (1982) Plant Physiol. 70, 1647-1653.
37. Sargeant, J. G. (1978) Starch 30, 160-163.
38. Sargeant, J. G. (1979) in The Biochemistry of Cereals, eds. Laidman, D. L. & Wyn, R. G. (Academic, New York), pp. 339-343.
