MINI REVIEW

Mammalian glycosyltransferases: genomic organization and protein structure

David H.Joziasse Department of Medical Chemistry, Vrije Universiteit, Van der Boechorststraat 7, NL-1081 BT Amsterdam, The Netherlands

Key words: genomic organization/glycosyltransferases/mammals/protein structure

© Oxford University Press

Introduction

The glycosyltransferases (GTs) form a group of ≥100 functionally related, membrane-bound enzymes that are involved in the biosynthesis of the oligosaccharide portions of glycoproteins and glycolipids. GTs catalyse the following general reaction:

XDP-Gly + HO-acceptor → Gly-O-acceptor + XDP

in which XDP-Gly is a nucleotide sugar. The acceptor substrate may be an oligosaccharide (either free or conjugated), a lipid or a protein. Many GTs recognize identical donor or acceptor substrates, and it was anticipated that they would exhibit substantial protein similarity. Surprisingly, examination of the first cloned GT cDNAs showed very little homology between the enzymes. However, the mammalian GTs of which the primary sequence is known share a common membrane topology [reviewed in Paulson and Colley (1989)]. A single hydrophobic region serves as a non-cleavable signal-anchor sequence and links a short amino-terminal cytoplasmic tail to a large carboxy-terminal catalytic domain. The latter domain is positioned within the lumen of the endoplasmic reticulum (ER) or the Golgi apparatus. Therefore, the GTs classify as type II transmembrane proteins. The Golgi GTs appear to share this topology with other Golgi enzymes like the murine processing α-mannosidase II (Moremen and Robbins, 1991). In a lower

Genomic organization

So far, six mammalian GT genes have been cloned and characterized (Table I). A comparison of their intron-exon structures shows that two types of genomic organization can be distinguished (Figure 1). The α2,6-sialyltransferase (α2,6-ST), β1,4-galactosyltransferase (β1,4-GT) and α1,3-galactosyltransferase (α1,3-GT) genes have their coding sequence distributed over several exons, whereas the α1,3-fucosyltransferase (α1,3-FT) and β1,2-N-acetylglucosaminyltransferase (β1,2-GNT I) genes encode this entire sequence on a single exon. Coding sequence exons of sialyltransferase and the galactosyltransferases code for 12-230 amino acids, with an average of 69 amino acid residues. This number is significantly larger than the average for coding sequence exons (45 amino acid residues) as determined by Traut (1988). The difference is the result of the presence of at least one unusually long coding exon in each of the GT genes. For example, exon 9 from the α1,3-GT gene encodes 230 amino acid residues, which corresponds to the major part of the catalytic domain. It is evident that the size of the single coding sequence exon in the α1,3-FT and β1,2-GNT I genes is also much above average. The β1,2-GNT I exon encodes 445 amino acid residues [1335 nucleotides (nt)], the 'myeloid type' α1,3-FT exon 405 (1215 nt) and the 'plasma type' α1,3-FT exon 374 (1122 nt). In the GT genes described so far, a single exon (2-3 kb in length) contains the 3' portion of the coding sequence including the translation termination signal, together with all of the relatively long 3' untranslated region. In addition, most of the genes contain one or more exons that carry 5' untranslated sequence. These exons may be located >15 kb (α1,3-GT; Joziasse et al., 1992) to 40 kb (α2,6-ST; Wen et al., 1992) upstream of the exon that contains the translational initiation site. In the case of α2,6-ST, this extends the total length of the gene to at least 80 kb of genomic DNA.
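The exon sizes quoted in this section can be checked with a little arithmetic. The following Python sketch is an illustration only: it takes the residue counts and the Traut (1988) average from the text, assumes three nucleotides per codon with the stop codon not counted, and uses function and variable names that are invented here.

```python
# Coding exon sizes in amino acid residues, as quoted in the text.
CODING_EXON_RESIDUES = {
    "alpha1,3-GT exon 9": 230,
    "beta1,2-GNT I": 445,
    "alpha1,3-FT (myeloid type)": 405,
    "alpha1,3-FT (plasma type)": 374,
}

# Average coding exon size from Traut (1988), as cited in the text.
AVERAGE_EXON_RESIDUES = 45


def coding_nucleotides(residues):
    """Nucleotides encoding `residues` amino acids (3 per codon, stop codon excluded)."""
    return 3 * residues


def fold_over_average(residues, average=AVERAGE_EXON_RESIDUES):
    """How many times larger an exon is than the average coding exon."""
    return residues / average


if __name__ == "__main__":
    for name, residues in CODING_EXON_RESIDUES.items():
        print(f"{name}: {residues} residues, {coding_nucleotides(residues)} nt, "
              f"{fold_over_average(residues):.1f}x the average exon")
```

For the β1,2-GNT I exon this gives 3 × 445 = 1335 nt, in agreement with the figure quoted in the text.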
The sizes of other GT genes are of the same order (Table I). To date, little information is available on the regulatory elements of GT genes. Best characterized in this respect is the

Downloaded from http://glycob.oxfordjournals.org/ at Carleton University on December 1, 2014

In recent years, several glycosyltransferase genes and cDNAs have been cloned and characterized. Although the glycosyltransferases seem to share the same general architecture, there is only little sequence similarity between the various enzymes. Moreover, a comparison of the organization of the genes shows that there is no common pattern of intron-exon structure. In addition, there seems to be little or no correlation between glycosyltransferase exons and protein domains. Taken together, these observations suggest that many of the glycosyltransferase genes evolved independently. So far, only two glycosyltransferase gene families have been described. These families may have evolved by exon-shuffling, or by gene duplication and subsequent divergence. For specific glycosyltransferases, mechanisms such as alternative splicing and alternative promoter usage play a role in the production of multiple protein isoforms from a single gene. These isoenzymes may differ in their enzymatic properties or cellular localization.

eukaryote such as yeast, the glycosyltransferases and processing glycosidases show a similar membrane orientation. Both an ER/Golgi-located α-mannosidase (Camirand et al., 1991) and GDP-Man:Manα-R α1,2-mannosyltransferase (Hausler and Robbins, 1992) are type II transmembrane proteins. The reaction catalysed by a GT involves two substrates and a varying number of cofactors. Therefore, one would anticipate that the enzymes consist of several distinct functional units. It is of interest to know whether such units correspond to separate exons and, if so, whether there is any indication that exon-shuffling played a role in the evolution of GTs. This review summarizes our present knowledge about GT genes, and will focus on the correlation between genomic organization and protein structure, and on the mechanisms by which multiple enzyme isoforms are generated from a single GT gene.


Table I. Glycosyltransferase genomic sequences. The number of exons that contain (part of) the protein coding sequence is indicated, together with the estimated total length in kb of the gene (which includes the intron sequences)

Enzyme                                                       Number of exons   Length of the gene (kb)
Murine α1,3-galactosyltransferase (a)                        6                 >35
Human α1,3-galactosyltransferase (b,c) (partial sequence)    >2                n.d.
Murine β1,4-galactosyltransferase (d)                        6                 ≥50
Rat α2,6-sialyltransferase (e,f,g)                           5                 ≥80
Human β1,2-GlcNAc-transferase I (h)                          1                 n.d.
Human α1,3-fucosyltransferase (i) ('myeloid type')           1                 n.d.
Human α1,3-fucosyltransferase (j) ('plasma type')            1                 n.d.

n.d., not determined. (a) Joziasse et al., 1992. (b) Joziasse et al., 1991. (c) Larsen et al., 1990. (d) Hollis et al., 1989. (e) Wang et al., 1990. (f) Svensson et al., 1990. (g) Wen et al., 1992. (h) Hull et al., 1991. (i) Lowe et al., 1991. (j) Weston et al., 1992.

rat α2,6-ST gene, of which the cell type-specific regulation of expression has been studied in detail (O'Hanlon et al., 1989; Paulson et al., 1989; Svensson et al., 1990, 1992; Wang et al., 1990). Transcription of the gene appears to be under the control of at least three different promoters. One of these promoters is active in all rat tissues examined, whereas the other two regulate transcription in a tissue-specific fashion, as in liver and kidney (Svensson et al., 1990, 1992; Wang et al., 1990; Wen et al., 1992). Recently, it was reported that binding sites for various liver-restricted transcription factors are part of the liver-specific promoter (Svensson et al., 1990, 1992; Wang et al., 1990). Binding of these factors appears to stimulate the expression in liver of a characteristic 4.3 kb transcript.

For other GT genes, there are indirect indications for the existence of multiple promoters, but none of these promoters has been characterized yet. The occurrence of two sets of transcripts for β1,4-GT indicates that there are at least two transcriptional start sites (Shaper et al., 1988; Russo et al., 1990) and is suggestive of the presence of two different promoters in the β1,4-GT gene. Both start sites are used in virtually all tissues, with the exception of murine male germ cells (Shaper et al., 1990). In addition, there is preliminary evidence that points to the existence of a third, germ cell-specific promoter in the mouse β1,4-GT gene (Shaper et al., 1990; Harduin-Lepers et al., 1991). Similarly, the production of two sets of α1,3-GT transcripts in calf thymus could suggest that in this tissue the transcription of the gene is driven by two separate promoters (Joziasse et al., 1989). The gene that encodes human β1,2-GNT I may also be under the control of more than one promoter: one that precedes the upstream exon(s) and a second promoter in the intron preceding the 3'-terminal exon (Hull et al., 1991).

Most likely, each GT consists of several distinct structural and functional domains. Recently, a number of primary sequences of GTs have been deduced from cloned cDNAs. From these studies, a general blueprint of the GT protein structure is starting to emerge. In contrast, little is known so far about the nature of the functional units of the proteins.

Structural units

To date, only the gross structural features of the GTs have been identified. Interestingly, at this level of resolution there is a remarkable similarity between the enzymes [reviewed in Paulson and Colley (1989)], which could suggest that in the Golgi apparatus they face the same steric constraints. As depicted in Figure 2, all enzymes are composed of the same type of elements which, moreover, show little variation in size from enzyme to enzyme (Paulson and Colley, 1989; Kumar et al., 1990; Yamamoto et al., 1990; Lowe et al., 1991; Sarkar et al., 1991). Shared structural features between the GTs include the presence, close to the amino terminus of the protein, of a single hydrophobic sequence of sufficient length to span the lipid bilayer. This hydrophobic segment of 16-19 amino acid residues is likely to serve a double function: it acts as a non-cleavable signal-anchor sequence, but in addition it contains Golgi targeting or retention signals (see below). One end of the transmembrane segment is linked to an amino-terminal cytoplasmic tail that ranges in size from 6 to 27 amino acid residues. The other end is linked to the carboxy-terminal catalytic domain via the so-called 'stem region' of the molecule. Based on the amino acid sequence of the amino terminus of soluble, catalytically active enzyme species, the stem region of α2,6-ST and β1,4-GT comprises at least 35-37 amino acid residues. The primary structure of the stem region is not conserved between the GTs, but there are multiple shared features that suggest that this region is a flexible, accessible peptide segment largely devoid of secondary structure. Additionally, for specific GTs the stem region contains proteolytic sites which upon cleavage yield an enzymatically active, soluble form of the protein lacking both the signal-anchor sequence and the short cytoplasmic tail. The catalytic domain, finally, may form a compact, globular unit that seems to be protease-resistant. It represents an independently folding, enzymatically active protein domain, which typically is ~300 amino acids in length. Both this part of the enzyme molecule and the stem region are subject to post-translational modifications such as phosphorylation (Strous et al., 1987) and glycosylation.

Functional units

Theoretically, several functional units may be distinguished in a GT molecule. First, GTs possess binding sites for both the nucleotide-sugar donor (XDP-Gly) and the acceptor substrate. Usually, the latter binding site accommodates more than only the terminal sugar. It often recognizes a larger portion of the acceptor that may include the penultimate sugar residue and sugars beyond, remote from the site of reaction (Joziasse et al., 1985). Specific GTs contain binding sites for dolichol-sugars instead of nucleotide sugars (Albright et al., 1989). As yet, there exists no clear picture of the substrate binding sites of the GTs. Analogous to the carbohydrate-binding domain of another series of sugar-binding proteins, the lectins,


Protein domain structure of GTs

[Figure 1: schematic intron-exon maps of the GT genes, with panels labelled α1,3-GT, β1,4-GT, α2,6-ST and α1,3-FT; exons are numbered, the kidney-specific exons K1, K2 and K3 are marked, and translational initiation sites are labelled ATG.]

Fig. 1. Organization of GT genes. The intron-exon structure of five GT genes is represented. Black boxes represent protein coding sequence exons, open boxes 5'- and 3'-untranslated exonic sequence. The translational initiation site is indicated (ATG). Alternative promoter usage in rat kidney produces three different transcripts from the α2,6-ST gene. These transcripts contain one or two of the exons (K1, K2, K3) that localize between exon 3 and exon 4 on the α2,6-ST genomic map, and are absent from the constitutively expressed transcript. The transcripts that incorporate K2 contain an in-frame translational start site. The schematic of α1,3-FT is representative both of the 'myeloid' and the 'plasma' type gene sequences. Data are derived from the references listed in Table I.

[Figure 2: schematic of a typical GT protein showing the cytoplasmic tail (6-27 residues), signal-anchor sequence (16-19 residues), stem region (35-45 residues) and catalytic domain (~300 residues), aligned with the exon structures of α1,3-GT, β1,4-GT, α2,6-ST, β1,2-GNT I and α1,3-FT.]

Fig. 2. Alignment of GT exons and protein domains. The protein domain structure of a typical GT is represented schematically by a rectangle which is subdivided to show the major structural elements of the protein. The amino (N) and carboxy terminus (C) are indicated. The size of the stem region is a minimum estimate based on the amino acid sequence of the amino terminus of soluble forms of the proteins. The lower part of the figure shows the exon structure of the six GT genes that have been described to date. Data are derived from the references listed in Table I.

one would predict that in GTs the glycoconjugate-binding domains may represent a fold or pocket, made up by a number of conserved residues that are dispersed over long segments of an otherwise variable peptide sequence (Quiocho, 1986; Drickamer, 1988; Holt, 1991; Quesenberry and Drickamer, 1991; Weis et al., 1991). This is consistent with the observation that within the binding domains of sugar-binding proteins there are often short regions of higher sequence similarity (Holt, 1991). Such regions could represent areas of contact between carbohydrate and the polypeptide, and may determine binding specificity. Within this context, it was noted that α1,3-GT and β1,4-GT contain small regions of similarity, which in each of the proteins may be part of a larger homologous UDP-Gal binding domain of ~120 amino acid residues (Joziasse et al., 1989; Yadav and Brew, 1990). Not every galactosyltransferase contains these elements, however, and a relatively small number of conserved amino acid residues may actually be sufficient to determine the nucleotide sugar specificity of the binding site. In this respect, it is relevant that a limited number of specific substitutions along a stretch of ~90 amino acid residues are required to change the UDP-Gal binding site of the blood group B transferase into a UDP-GalNAc binding site (Yamamoto and Hakomori, 1990).

In GTs, donor and acceptor sugar binding sites may be more or less independent subdomains of the catalytic domain (Holt, 1991). It is unknown whether these binding sites are in close proximity, perhaps physically overlapping, or strictly separate. In line with the above analysis, even for β1,4-GT, the only GT that has been characterized in molecular detail with respect to its functional domains (Strous, 1986, and references therein; Aoki et al., 1990; Yadav and Brew, 1990, 1991), there exists no clear-cut model of the substrate binding sites. It was suggested from photoaffinity labelling and site-directed mutagenesis experiments that donor and acceptor binding sites in this enzyme partly overlap (Aoki et al., 1990). However, these experiments do not necessarily imply a close proximity along the linear sequence, and other reports indeed suggest that the UDP-Gal binding site of β1,4-GT is structurally separate from regions involved in acceptor binding (Yadav and Brew, 1990).


Is there a correlation between GT exons and protein structure?

In recent years, the issue of whether proteins evolved by the 'shuffling' of exon-encoded peptide 'units' has generated substantial debate (Gilbert, 1978; Gilbert et al., 1986; Traut, 1988; Baron et al., 1991; Dorit and Gilbert, 1991; Patthy, 1991a,b). If exons represent original building blocks of a protein, one would anticipate finding a correlation between exons and the units of protein structure and function in present-day proteins. A study of GT gene organization that detects such a correlation not only would shed more light on the relationship between the GTs and their evolution, but might also contribute to the debate on the importance of 'exon-shuffling'.

Are one or more of the structural and functional units of the GTs represented by separate exons in the corresponding gene? As pointed out above, GTs share the same structural elements and may have related three-dimensional structures. Nevertheless, this similarity is not reflected in their genomic organization (Figure 2) as there is no common pattern of intron-exon structure between the GT genes. The β1,2-GNT I and α1,3-FT genes have single-exonic protein coding sequences, whereas the β1,4-GT, α1,3-GT and α2,6-ST genes are divided into multiple exons. Even between the three multi-exon genes there appears to be no similarity in organization. The α1,3-GT gene encodes the catalytic domain in a single exon, whereas five exons are needed to encode this domain in the β1,4-GT gene. Moreover, even though the overall architecture of the amino-terminal portion of the various GTs is remarkably similar (see above), in both the α2,6-ST and the β1,4-GT gene the cytoplasmic tail, the transmembrane segment and the stem region are contained in a single exon, whereas the α1,3-GT gene is organized quite differently. For α1,3-GT, the putative stem region of the enzyme is distributed over two or perhaps three separate exons, and a single exon specifies the transmembrane segment, together with the cytoplasmic tail (Figure 2). As such, the α1,3-GT gene may contain the only example so far of a protein domain-specific exon. The above analysis indicates that so far there is no clear correlation between GT exons and protein domains.

What can we conclude from this lack of correlation? First, it is likely that both the single-exonic and the multi-exonic GT sequences do encode multi-domain proteins, but that the former genes have evolved as single entities, without introns. This would imply that, in the evolution of the α1,3-FTs and β1,2-GNT I, exon-shuffling did not play a role. Independent evolution would also be consistent with the observed lack of primary sequence similarity between most GTs. Second, it is possible that during evolution the original pattern of GT exon organization has been obscured by more recent events, such as the insertion or deletion of introns. Once inserted, these introns may have contributed to the process of exon-shuffling to yield some of the present-day GT genes (Rogers, 1990; Palmer and Logsdon, 1991; Patthy, 1991a). An example in which a catalytic domain of >250 amino acid residues has served as a building block in different GTs will be discussed below. Third, it is likely that a number of examples of correlation between exons and protein units are not yet fully appreciated because of the lack of information on the organization of the catalytic domain of the GTs. Therefore, attempts to correlate GT exons with functional protein units will require a detailed structure-function analysis involving the clarification of the three-dimensional structure of the GTs by X-ray analysis, site-directed mutagenesis studies coupled with in vitro expression, and chemical modification of specific amino acid residues.

Glycosyltransferase gene families

Recently, a number of GT sequences have been reported that are structurally related and can be arranged into two gene families (Yamamoto et al., 1990, 1991; Joziasse et al., 1991; Kumar et al., 1991; Lowe et al., 1991; Weston et al., 1992). Two major mechanisms for the generation of such a family are the process of gene duplication and subsequent divergence, and the exchange of exons between different protein families (exon-shuffling).


In addition to the acceptor sugar binding site, some GTs contain a binding site specific for the aglycon portion of the acceptor. For example, two different N-acetylgalactosaminide α2,6-sialyltransferases have been detected that show a striking difference in their aglycon specificity (Bergh and Van den Eijnden, 1983). More recently, a β1,4-GalNAc-transferase has been described that recognizes a tripeptide motif on glycoprotein hormones (Smith and Baenziger, 1992). This type of enzyme will generate specific carbohydrate modifications on a limited subset of acceptor substrates only.

Furthermore, some GTs contain binding sites for activator or modifier proteins. The binding of α-lactalbumin to β1,4-GT results in an increased affinity of the enzyme for glucose, resulting in a shift to the production of the milk disaccharide lactose. The binding site of α-lactalbumin is located in part of the 'stem region' and in the amino-terminal half of the catalytic domain (amino acid residues 93-250), and is structurally separate from the UDP-Gal and acceptor binding domains (Lee et al., 1983; Navaratnam et al., 1988; Yadav and Brew, 1991).

The amino-terminal hydrophobic sequence anchors the GTs and other Golgi enzymes into the membrane. In addition, the same polypeptide segment provides signals for intracellular targeting, as has been shown recently for β1,4-GT and for α2,6-ST (Munro, 1991; Nilsson et al., 1991; Teasdale et al., 1992). The transmembrane portion of the GTs by itself is sufficient to target a reporter protein that is normally secreted to the Golgi apparatus with high efficiency. Additional targeting signals seem to be present in regions directly flanking the transmembrane segment and may represent separate functional elements.

It seems that the main function of the cytoplasmic amino-terminal tail of the GTs is to provide positive charges that are important for the protein to acquire the proper membrane topology (Parks and Lamb, 1991; Sakaguchi et al., 1992). The short peptide usually carries 1-3 lysine or arginine residues, which results in the sequence flanking the amino-terminal end of the transmembrane domain having a net positive charge over the sequence flanking the carboxy-terminal end.

Lastly, many of the GTs are dependent on divalent cations for activity and possess specific binding sites for such ions. An enzyme like β1,4-GT contains both high- and low-affinity metal ion-binding sites (O'Keeffe et al., 1980; Navaratnam et al., 1988). One of these sites is involved in maintaining the structural integrity of the protein, whereas the second is associated with UDP-Gal binding. Metal has to bind to the former site prior to other substrates, and prior to the binding of metal to the latter site (O'Keeffe et al., 1980).
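The charge asymmetry across the transmembrane segment described above can be made concrete with a small calculation. The Python sketch below is illustrative only: the sequence is invented, the 15-residue flanking window is an arbitrary assumption, the function names are hypothetical, and only lysine/arginine (+1) and aspartate/glutamate (-1) are counted.

```python
POSITIVE = set("KR")  # lysine, arginine
NEGATIVE = set("DE")  # aspartate, glutamate


def net_charge(segment):
    """Net charge of a peptide segment, counting only K/R (+1) and D/E (-1)."""
    return sum((aa in POSITIVE) - (aa in NEGATIVE) for aa in segment)


def flank_charge_difference(sequence, tm_start, tm_end, window=15):
    """Charge of the N-terminal flank minus charge of the C-terminal flank.

    `tm_start` and `tm_end` are 0-based, end-exclusive indices of the
    transmembrane segment. `window` (an assumed value) limits how many
    residues on each side are considered.
    """
    n_flank = sequence[max(0, tm_start - window):tm_start]
    c_flank = sequence[tm_end:tm_end + window]
    return net_charge(n_flank) - net_charge(c_flank)


# Hypothetical type II protein fragment, invented for illustration:
# a 6-residue cytoplasmic tail, a 17-residue hydrophobic segment and
# the first 10 residues of a stem region.
TAIL = "MKRKSG"
ANCHOR = "LLVALLVAILLGVLALF"
STEM = "SDEPTQESNG"
TOY_SEQUENCE = TAIL + ANCHOR + STEM

if __name__ == "__main__":
    start = len(TAIL)
    end = start + len(ANCHOR)
    print(flank_charge_difference(TOY_SEQUENCE, start, end))
```

A positive difference means that the amino-terminal flank carries the larger net positive charge, which is the situation the text describes for the cytoplasmic tail.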


possibly the result of promoter switching (Russo et al., 1990; Shaper et al., 1990). For β1,4-GT, alternative promoter usage leads to the production of two sets of mRNAs that differ by ~200 bp in length. Translation of these mRNAs yields two forms of the β1,4-GT protein, which differ by 13 amino acids at the amino terminus. The functional significance of this observation is unclear at present, although it has been suggested that the two forms differ in intracellular targeting (Shaper et al., 1990; Lopez et al., 1991).

The occurrence of split genes and multiple promoters offers various possibilities of generating protein isoforms from a single gene and thus confers a potential advantage on the cell. An example is provided by the α1,3-GT gene: multiple α1,3-GT protein forms are produced from a single genomic locus through alternative splicing (Joziasse et al., 1992). This process generates, in virtually all mouse tissues, four different transcripts that differ in the length of the stem region that they encode. Whether this difference confers different enzymatic properties upon the various enzyme isoforms is as yet unknown. Recently, it has been reported that transcription from a kidney-specific promoter, in combination with alternative splicing, leads to the production of three different α2,6-ST mRNAs in rat kidney (O'Hanlon et al., 1989; Wang et al., 1990; Wen et al., 1992). Two of these transcripts contain in-frame start codons and upon translation would yield proteins with a molecular mass of 22-25 kDa (cf. Figure 1) that lack the signal-anchor sequence and most of the stem region (Svensson et al., 1990; Wang et al., 1990; Wen et al., 1992). It will be of interest to determine whether such proteins have any sialyltransferase activity and, if so, to examine their intracellular localization. In contrast, the α1,3-FTs are encoded on only a single exon (Weston et al., 1992). Here, gene duplication rather than alternative splicing may have provided a spectrum of closely related enzyme forms with slightly different enzymatic properties (Macher et al., 1991; Weston et al., 1992).

Glycosyltransferase isoforms

Over the years, various types of GT protein isoforms have been reported. These isoenzymes may differ in their enzymatic properties or in their subcellular localization. Usually, GTs are found intracellularly, anchored into the internal membranes of the cell, but in addition secreted forms of the enzymes have been detected. In a number of instances, GTs have been reported to occur at the cell surface, where they may function in cell-cell adhesion (Roseman, 1970). GT protein isoforms can be generated from a single gene by several different mechanisms. First, secreted forms of the GTs are produced by proteolysis in the Golgi apparatus at multiple protease-sensitive sites within the 'stem' region of the protein. The resulting soluble forms of the GTs lack the amino-terminal tail, the transmembrane sequence and part of the 'stem' region. Taken together, this sequence amounts to 63 amino acids for α2,6-ST (Weinstein et al., 1987) and to 77 amino acids for human β1,4-GT (Masri et al., 1988). The soluble GT species are responsible for the enzyme activity detected in milk and in body fluids such as serum. Another interesting mechanism for the production of GT isoforms is the use of multiple transcriptional start sites,

Future prospects

A further analysis of GT genes should reveal evolutionary relationships and common patterns in genomic organization. Furthermore, the cell type-specific regulation of the expression of many GT genes remains to be examined. As more GT genes are cloned, the characterization of their promoter elements will help to understand how the production of cell type-specific carbohydrate structures is regulated. Finally, one may anticipate that knowledge of exon organization will eventually help to identify functional units in the catalytic domain of the GTs.

Acknowledgements

The author wishes to thank Dr D.H. van den Eijnden and Dr I. van Die for helpful advice in preparing the manuscript.

Abbreviations

α1,3-FT, GDP-Fuc:[Galβ1,4]GlcNAcβ1-R α1,3-fucosyltransferase (E.C. 2.4.1.152); β1,2-GNT I, UDP-GlcNAc:Manα1,3Manβ1-R β1,2-N-acetylglucosaminyltransferase (E.C. 2.4.1.101); GT, glycosyltransferase; α1,3-GT,


The various α1,3-FTs form a family of related genes that includes the Lewis α1,3/4-fucosyltransferase, in addition to the 'myeloid type' and the 'plasma type' α1,3-FT (Kukowska-Latallo et al., 1990; Lowe et al., 1991; Weston et al., 1992). The members of this family are similar over most of their length and two of the genes are syntenic on human chromosome 19 (Weston et al., 1992). These observations suggest that the α1,3-FT gene family probably arose by gene duplication and subsequent divergence.

For α1,3-GT, a group of related genes has been described that includes both α1,3-GT functional genes and pseudogenes, and also the blood group A and B transferase genes (Joziasse et al., 1991; Larsen et al., 1990; Yamamoto et al., 1990, 1991). A sequence similarity was found between the blood group transferases on the one hand, and α1,3-GT on the other. As described above for the α1,3-FTs, this similarity may be the result of a gene duplication. Consistent with this interpretation is the shared chromosomal localization of α1,3-GT and the blood group transferases on human chromosome 9q33-34 (Shaper et al., 1992). Alternatively, these observations might represent the first example of exon-shuffling in GT evolution. The sequence similarity between α1,3-GT and the blood group A and B transferases is limited to exons 8 and 9 of murine α1,3-GT, which are likely to encode the catalytic domain of the enzyme. The junction between the similar region and the unrelated regions occurs precisely at the boundary between exons 7 and 8 of α1,3-GT (Joziasse et al., 1992). The amino-terminal sequence of the proteins, as encoded by α1,3-GT exons 4-7 (amino acids 1-95), turned out to be entirely different. These data suggest that from an ancestral gene, a portion encoding much of the catalytic domain of the protein (~275 amino acid residues) has been reused in a number of distinct GT genes. If this hypothesis is correct, one would predict the existence of a phase-2 intron-exon boundary at the equivalent position in the blood group A and B transferase genes, since all intron-exon boundaries in α1,3-GT are of the phase-2 type (Patthy, 1987). Future studies on other multi-exon GT genes may produce additional examples of exon-recycling.
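The intron phase argument above can be illustrated with a short calculation. In this Python sketch, the phase of an intron is taken as the number of coding nucleotides preceding it, modulo 3 (phase 0 falls between codons, phase 1 after the first nucleotide of a codon, phase 2 after the second). The exon lengths are invented for illustration and are not the actual α1,3-GT values; the function names are likewise hypothetical.

```python
def intron_phases(coding_exon_lengths):
    """Return the phase (0, 1 or 2) of each intron between coding exons.

    `coding_exon_lengths` lists the number of coding nucleotides in each
    exon, in 5' to 3' order. An intron's phase is the cumulative coding
    length upstream of it, modulo 3.
    """
    phases = []
    upstream = 0
    for length in coding_exon_lengths[:-1]:  # no intron follows the last exon
        upstream += length
        phases.append(upstream % 3)
    return phases


def same_phase_boundaries(coding_exon_lengths):
    """True if all introns of the gene share a single phase."""
    return len(set(intron_phases(coding_exon_lengths))) <= 1


# Hypothetical coding exon lengths (nt), chosen so that every intron is phase 2.
HYPOTHETICAL_EXONS = [86, 36, 63, 138, 691]

if __name__ == "__main__":
    print(intron_phases(HYPOTHETICAL_EXONS))
    print(same_phase_boundaries(HYPOTHETICAL_EXONS))
```

An exon flanked by introns of the same phase can be inserted, deleted or exchanged without disturbing the reading frame, which is why a matching phase at the equivalent boundary would be consistent with, though not proof of, exon-shuffling.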

UDP-Gal:Galβ1,4GlcNAc-R α1,3-galactosyltransferase (E.C. 2.4.1.124; E.C. 2.4.1.151); β1,4-GT, UDP-Gal:GlcNAcβ1-R β1,4-galactosyltransferase (E.C. 2.4.1.38); α2,6-ST, CMP-NeuAc:Galβ1,4GlcNAc-R α2,6-sialyltransferase (E.C. 2.4.99.1); nt, nucleotides.

Received on April 14, 1992; accepted on May 8, 1992

Downloaded from http://glycob.oxfordjournals.org/ at Carleton University on December 1, 2014
