Nucleic Acids Research, Vol. 18, No. 23 6935

© 1990 Oxford University Press

Genomic analysis of the major bovine milk protein genes

David W. Threadgill+ and James E. Womack*

Department of Veterinary Pathology, Texas A&M University, College Station, TX 77843, USA

Received August 27, 1990; Revised and Accepted October 16, 1990

ABSTRACT

The genomic arrangement of the major bovine milk protein genes has been determined using a combination of physical mapping techniques. The major milk proteins consist of the four caseins, αs1 (CASAS1), αs2 (CASAS2), β (CASB), and κ (CASK), as well as the two major whey proteins, α-lactalbumin (LALBA) and β-lactoglobulin (LGB). A panel of bovine × hamster hybrid somatic cells analyzed for the presence or absence of bovine-specific restriction fragments revealed the genes coding for the major milk proteins to reside on three chromosomes. The four caseins were assigned to syntenic group U15 and localized to bovine chromosome 6 at q31-33 by in situ hybridization. LALBA segregated with syntenic group U3, while LGB segregated with U16. Pulsed-field gel electrophoresis confirmed genetic mapping results indicating tight linkage of the casein genes. The four genes reside on less than 200 kb of DNA in the order CASAS1-CASB-CASAS2-CASK. Multiple restriction fragment length polymorphisms were also found at the six loci in three breeds of cattle.

INTRODUCTION

Six proteins, comprising 95% of the total protein in bovine milk, have previously been classified according to their biochemical characteristics into the four caseins, αs1, αs2, β, and κ (CASAS1, CASAS2, CASB, and CASK, respectively), and the two major whey proteins, α-lactalbumin (LALBA) and β-lactoglobulin (LGB) (1). Milk proteins, synthesized and excreted by the mammary epithelial cells during lactation, provide the suckling calf with a source of minerals and amino acids (2). The caseins raise the calcium and phosphate concentrations in milk by forming stable micelles with these minerals. The calcium-sensitive caseins, αs1, αs2, and β, so called because they are precipitated in the presence of low concentrations of calcium, are stably maintained in a micelle suspension as a result of their interaction with κ-casein (3). Colloidal calcium phosphate (CaPO4), which accounts for approximately 5% of the weight of casein micelles, is sequestered by the micelles through interactions with the clustered serine phosphate residues of the calcium-sensitive caseins (4). Kappa-casein plays the crucial role of stabilizing the casein micelles and, when cleaved at a specific phenylalanine-methionine bond by chymosin, causes initiation of micelle aggregation, resulting in curd formation (3). Beta-casein, in addition to determining the surface properties of casein micelles, is essential for curd formation when the milk is clotted by chymosin digestion (5). The α-caseins appear to be less important for curd formation than β- or κ-casein since they are apparently lacking in several species. Instead, the α-caseins probably determine the capacity of casein micelles for colloidal CaPO4 transport (5,6). Curd formation is physiologically important since it ensures the retention of milk proteins in the stomach of the infant, thus allowing digestion to occur (1). Similarly, casein aggregation plays an important role in forming the curd used for making cheese (7).

Alpha-lactalbumin, which influences lactose synthesis by modifying the substrate specificity of galactosyl-transferase, is important to milk synthesis since lactose, an impermeable disaccharide, is the major osmole of milk (8). The role of β-lactoglobulin is not as clear since milk from many mammalian species appears to be devoid of it (9). Amino acid sequence similarities with human serum retinol-binding protein indicate that β-lactoglobulin may bind and transport retinol or other small hydrophobic ligands to the infant (10). Beta-lactoglobulin can bind small hydrophobic ligands, including retinol, which is normally supplied in milk and is essential to the growth of the newborn.

Since milk proteins from other species have historically been defined according to physical and chemical properties shared with the bovine milk proteins (6,11), and because of the expanding importance of the milk proteins to the biotechnology and dairy industries (12,13), a combination of physical mapping techniques has been used to characterize the relationship between the major bovine milk protein genes.
Additionally, a systematic search for restriction fragment length polymorphisms (RFLPs) at the six loci in three breeds of cattle has been performed.

MATERIALS AND METHODS

Hybridization Probes

The probes for all six genes were bovine cDNA sequences cloned into the Pst I site of pBR322. The CASAS1 probe was the 1.2 kb insert of pBαs1C184, while the CASK probe was the 900 bp insert of pBκC371 (6). The probes for CASAS2, the 550 bp insert of pBαs2C23, and CASB, the 1 kb insert of pBβC468, have also been described (5). The LALBA probe was the 700 bp insert

* To whom correspondence should be addressed

+ Present address: Department of Genetics, Case Western Reserve University, 2109 Adelbert Rd, Cleveland, OH 44106, USA

of pBα-LA5 (14), and the probe for LGB was the 800 bp cDNA fragment from pBβL13 (15). The probes were labeled by random priming (16) with either [α-32P]dCTP to specific activities greater than 10^9 DPM/µg for Southern hybridization, or with [3H]dCTP, [3H]dATP, and [3H]TTP for in situ hybridizations to specific activities greater than 10^8 DPM/µg.

Preparation of Genomic DNA

The hybrid somatic cells, prepared by fusion of bovine peripheral leukocytes with the HPRT-deficient Chinese hamster cell line E-36, have been previously described (17). Bovine blood samples were obtained from the Texas Agricultural Experiment Station at MacGregor, Texas. High molecular weight DNA was prepared from hybrid somatic cells and white cells as described (18). The DNA samples were digested with restriction endonucleases, electrophoresed, and blotted to nylon membranes (Zetabind, CUNO, Meridian, CT) according to standard procedures (19).

Southern Hybridization

Prehybridizations and hybridizations were done as previously reported (20). Filters were washed to a final stringency of 0.1× SSC and 0.1% SDS at 65°C before being exposed to Kodak XAR-5 film with one Dupont Cronex Lightning Plus intensifying screen at -70°C for one to seven days.

Pulsed-Field Gel Electrophoresis

High molecular weight bovine fibroblast and leukocyte DNA was prepared in agarose blocks and digested with restriction endonucleases as described (21). Approximately 1 µg of DNA was used per lane. The preparation of lambda oligomers for size standards has also been described (21). Field inversion gel electrophoresis (FIGE) was run with 1% agarose gels in 0.5× TBE at 150 volts and 15°C for 96 hours without buffer recirculation. DNA samples were run into the gels for 15 min before the pulsed field conditions were initiated. The beginning forward pulse was 0.3 sec, with an ending pulse of 180 sec and a constant 3:1 forward to reverse ratio. Clamped-homogeneous electric field (CHEF) gel electrophoresis was performed with 1% agarose gels in 0.5× TBE at 190 volts and 14°C. Switch intervals of 30 sec for 13 hours were followed by 80 sec switch intervals for 8 hours. Ethidium bromide stained gels were briefly exposed to ultraviolet light before being transferred to Zetabind nylon membranes as described (19).

In Situ Hybridization

In situ hybridization was essentially according to Naylor et al. (22) except that washes were at 41°C. Fifty ng of each of the four casein probes were pooled to give a final probe concentration of 200 ng/ml. After hybridization, the slides were coated with Kodak NTB-2 emulsion and exposed at 4°C in the dark for four to six days. Bromo-deoxyuridine (BrDU) substituted chromosomes from primary bovine fibroblasts were G-banded after hybridization by the fluorescence plus Giemsa (FPG) technique (23). Chromosomes were identified by comparison with published standards (24).

RESULTS AND DISCUSSION

Syntenic Assignments

Genetic linkage between several of the genes coding for the major bovine milk proteins has previously been demonstrated by using electrophoretic variants of the proteins (25,26,27,28,29). CASAS1, CASB, and CASK showed very tight linkage, while LGB has been reported to be distantly linked to the casein gene complex (28,29). However, the linkage relationships of LALBA and CASAS2 to the other genes have not been established. Using a panel of hybrid somatic cells and cloned complementary DNAs, the genomic relationship between these genes was determined by segregation analysis. The four caseins recognized different Eco RI restriction fragments, as determined by hybridization to test filters, and were thus hybridized concurrently to the hybrid somatic cell DNAs (Fig. 1). The four casein genes segregated concordantly with phosphoglucomutase-2 (PGM2) in 35 independent hybrid cell lines, indicating that they reside on a single unidentified bovine chromosome, labeled syntenic group U15 (Table 1). This supports findings in rabbits and mice where the caseins have been shown to be syntenic (30,31) and in pigs and mice where linkage between the caseins has been demonstrated (32,33).

Nucleic and amino acid sequence comparisons have indicated that the three calcium-sensitive caseins evolved from a common primordial gene (5,6). These conclusions, despite the high level of sequence divergence, were based on the similar leader sequences exhibited by all four caseins as well as the conserved sequences clustered around the serine phosphate residues of the calcium-sensitive caseins. The κ-casein gene, however, is believed to have evolved from the same primordial gene as the γ-chain gene of fibrinogen (FGG) (6), the serum protein which, when cleaved by thrombin, serves an analogous function in the blood clotting cascade as κ-casein does in milk curd formation. Interestingly, FGG is syntenic with PGM2 in humans, but is asyntenic with CASK and PGM2 in cattle and mice (31,34,35).

The genes coding for the two major whey proteins, LALBA and LGB, segregated discordantly with each other as well as with the casein genes (Fig. 1 and Table 1). All restriction fragments revealed by LALBA were concordant with glyceraldehyde-3-phosphate dehydrogenase (GAPD) of bovine syntenic group U3, while LGB segregated with the ABL oncogene of syntenic group U16. The mapping of LALBA, as well as hypothesized LALBA pseudogenes (36), to U3, which also contains the stomach lysozymes, supports its hypothesized evolutionary origin from a primordial lysozyme gene (37). The mapping of LGB asyntenic with the caseins discounts the


Figure 1: Hybridization to hybrid somatic cell DNAs. Bovine (Bo), hamster (H), and hybrid somatic cell (1-3) DNAs were digested with Eco RI. Lanes from test filters containing bovine DNA were hybridized with probes for AS1 (CASAS1), AS2 (CASAS2), B (CASB), and K (CASK). No cross-hybridization was observed with hamster DNA. Size markers are Hind III digested λ DNA. Individual hybrid lanes are scored for the presence (+) or absence (-) of bovine specific hybridization.


proposed linkage between these genes (28,29). No homologue of LGB is known to exist in other species with well-characterized genomes (9). However, several non-milk-borne proteins do show significant similarity to LGB, including a human uterine protein, pp14 (38,39).

An interesting association does exist between bovine syntenic group U16 and the terminus of human chromosome 9q. Since LGB is genetically linked to the bovine J blood group (25,28), this blood group can also be assigned to syntenic group U16. Other loci which have been previously assigned to U16 include the ABL oncogene and argininosuccinate synthetase (ASS) (20). These two loci reside on HSA 9q along with the human ABO blood group (40). Several similarities between the J and ABO blood groups suggest that their syntenic association with ABL and ASS may indicate a common evolutionary origin. Both blood groups have four phenotypes, produced by two antigenic alleles and one null allele. Additionally, natural anti-J antiserum of cattle cross-reacts with human A antigen (41). However, the relationship between the ABO locus, which codes for a glycosyltransferase (42), and the J locus cannot be definitively established until J has been molecularly characterized. These results indicate that any human homologue of LGB would be expected to reside at the q-terminus of human chromosome 9.

Long-Range Physical Map

Since all four casein loci reside on a single bovine chromosome, the physical organization of the gene complex was investigated using pulsed-field gel electrophoresis. In order to minimize any confusion caused by restriction fragment length polymorphisms (RFLPs), blood and fibroblast DNA samples from a Holstein cow homozygous for several polymorphic restriction sites within the casein genes were used to create a restriction map around the gene complex (see RFLP section). Two enzymes were found to be informative with DNA from a primary fibroblast cell line.

Sal I revealed that CASAS1 and CASB reside on a 150 kb restriction fragment while CASAS2 is on a 125 kb fragment and CASK is on a 100 kb fragment (Fig. 2A). Fortuitously, this cell line produced natural partial digestions due to methylation at CpG dinucleotides in the restriction enzyme recognition sequences. This physically linked CASAS1, CASB, and CASAS2 on a 275 kb fragment and also indicated that the entire casein gene complex is present on a single 375 kb Sal I restriction fragment, with CASK residing at one end. The order within the complex, however, could not be resolved, so another enzyme, Sma I, which also produced restriction fragment variation indicative of natural partial digestions, was used to further refine the map. This allowed the assignment of CASAS1 to the other end of the complex since none of the restriction fragments produced by CASAS1 were identical with those produced by the other genes. The physical distance between CASAS2 and CASK could also be reduced to 75 kb. Additionally, CASB, CASAS2, and CASK were found to reside on a 175 kb Sma I restriction fragment. This predicts that the entire complex resides on less than 220 kb of DNA and supports and extends previous genetic studies indicating the order to be CASAS1-CASB-CASAS2-CASK (25,26,27,28,29).

Support for this finding was provided by partial digestions of a leukocyte DNA sample using limiting concentrations of Apa I. While CASAS1 and CASB are physically linked on 55 kb and 110 kb partial digestion fragments, CASAS1, CASB, and CASAS2 reside on 195 kb and 210 kb partial fragments and all four loci are present on a 260 kb fragment. When aligned with the maps produced with Sal I and Sma I, the Apa I map indicates that the complex must reside on less than 200 kb of DNA (Fig. 2B).
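The Sal I reasoning in this section can be illustrated with a minimal sketch. It assumes, for illustration only, that every partial-digestion product is the sum of adjacent complete-digestion fragments; the function names are hypothetical and are not part of the reported methods. The fragment sizes (150, 125, and 100 kb) and the partial products (275 and 375 kb) are those given in the text.

```python
from itertools import accumulate, permutations

def contiguous_sums(sizes):
    """Sizes (kb) of every run of adjacent fragments in the given order."""
    sums = set()
    for start in range(len(sizes)):
        sums.update(accumulate(sizes[start:]))
    return sums

def consistent_orders(fragments, partials):
    """Fragment orders whose adjacent runs account for all observed partials."""
    orders = []
    for order in permutations(fragments):
        sizes = [fragments[name] for name in order]
        if set(partials) <= contiguous_sums(sizes):
            orders.append(order)
    return orders

# Complete Sal I fragments (kb) and the partial products reported in the text.
SAL1_FRAGMENTS = {"CASAS1+CASB": 150, "CASAS2": 125, "CASK": 100}
SAL1_PARTIALS = [275, 375]

for order in consistent_orders(SAL1_FRAGMENTS, SAL1_PARTIALS):
    print(" - ".join(order))
```

Under this assumption, only orders that place the CASK fragment at one end can produce the 275 kb product, while the relative order of the other two fragments stays unresolved, which matches the reason given for turning to Sma I.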

Chromosomal Localizations

Knowing that the four casein loci reside within a tightly linked multi-gene complex, a mixture of the four probes was used to

TABLE 1. Concordancy analysis.

Columns +/+ and -/- are the concordant classes, +/- and -/+ are the discordant classes, and % is percent concordant.

Syntenic           CAS/Marker              LALBA/Marker             LGB/Marker
group (a)      +/+ -/- +/- -/+    %    +/+ -/- +/- -/+    %    +/+ -/- +/- -/+    %
U1 (PGD)         0  19   0  13   59     11  13   5   1   80      7   9   9   4   55
U2 (SOD2)        0  17   0  19   47     12  10   6   6   65      8   7   9   8   47
U3 (GAPD)        0  15   0  16   48      6  13   0   0  100     11   9   4   4   71
U4 (MPI)         0  30   0   6   83      4  15  14   1   56      2  13  15   3   45
U5 (PKM2)        0  22   0  14   61      5   8  13   7   39      3   7  14   9   30
U6 (PGM1)*       0  42  10   2   78      1  15  17   1   47      2  16  15   0   55
U7 (LDHA)*       2  38   8   4   77      2  16  16   0   53      1  15  16   1   48
U8 (MDH2)        0  28   0   8   78      3  12  14   4   45      3  12  14   4   45
U9 (GPI)*        1  37   9   6   72      0  13  18   3   38      0  13  17   3   39
U10 (SOD1)       0  22   0  14   61     12  14   6   2   76      9  12   8   4   64
U11 (ITPA)       0  17   0  12   59      8   9   7   4   61      9  10   6   2   70
U12 (ACY1)       0  24   0  12   67      5  10  13   6   44      2   7  15   9   27
U13 (HOX1)*      1  36   8   7   71      1  13  18   1   42      1  15  15   1   50
U14 (GSR)        0  27   0   9   75      5  13  13   3   53      4  12  13   4   48
U15 (PGM2)*      6  29   0   0  100      0  16  18   0   47      0  16  17   0   48
U16 (ABL)        0  21   0  17   55     11  10   9   6   58     17  18   0   0  100
U17 (IDH1)       0  30   0   6   83      5  15  13   1   59      3  13  14   3   48
U18 (ACO1)*      2  40   8   3   79      2  15  16   1   50      2  15  15   1   53
U19 (CAT)        0  23   0  13   64      9  13   9   3   65     10  14   7   2   73
U20 (GLO1)       0  27   0   9   75      4  11  14   5   44      5  12  12   4   52
U21 (GH)*        2  34   6   2   82      0  16  20   0   44      0  18  17   0   51
U22 (AMH)*       1  43  10   2   79      0  15  20   0   43      0  18  17   0   51
U23 (ALDH2)      0  14   0  19   42      9   4   8  10   42      8   5   6  11   43
U24 (MOS)        0  22   0  13   63     10  13   9   3   66      7  12  10   5   56

(a) Early passage hybrid cells were used in syntenic groups marked by asterisks for the concordancy analysis with the casein genes because of the extensive segregation of these syntenic groups in later passages.
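The percent concordant values in Table 1 can be reproduced with a short sketch. It assumes that percent concordance is the share of scored hybrid lines in which gene and marker are both present (+/+) or both absent (-/-); the text does not state the formula, but this definition agrees with the tabulated values. The function name is hypothetical, and the counts are copied from four rows of Table 1.

```python
def percent_concordant(both_present, both_absent, gene_only, marker_only):
    """Percent of scored hybrid lines in which gene and marker agree."""
    total = both_present + both_absent + gene_only + marker_only
    if total == 0:
        raise ValueError("no hybrid cell lines were scored")
    return 100.0 * (both_present + both_absent) / total

# (+/+, -/-, +/-, -/+) counts copied from Table 1.
TABLE1_ROWS = {
    ("CAS", "U15 (PGM2)"): (6, 29, 0, 0),
    ("CAS", "U16 (ABL)"): (0, 21, 0, 17),
    ("LALBA", "U3 (GAPD)"): (6, 13, 0, 0),
    ("LGB", "U16 (ABL)"): (17, 18, 0, 0),
}

for (gene, marker), counts in TABLE1_ROWS.items():
    print(f"{gene} vs {marker}: {percent_concordant(*counts):.0f}%")
```

A gene is assigned to the syntenic group whose marker shows complete concordance, which is why the caseins go with PGM2 (U15), LALBA with GAPD (U3), and LGB with ABL (U16).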


Figure 2: Hybridization to pulsed-field gels and restriction maps of the casein gene complex. (A) Hybridization with the casein probes to digested Holstein fibroblast (Sal I and Sma I) or leukocyte (Apa I) DNA separated by FIGE (Sal I and Sma I) or CHEF gel electrophoresis (Apa I). Natural partial digestions of Sal I and Sma I are evidenced by multiple hybridizing restriction fragments even with excess enzyme. The filters were hybridized consecutively with the four probes with stripping in between to remove the hybridization signal. Identically migrating fragments are connected by arrows. Apa I enzyme units are noted above the ethidium bromide stained gel. An Apa I site is present within CASB. The sizes of the hybridizing bands indicated on the left of each figure are in kb. The size standards were lambda multimers (λ) (50 kb steps) or yeast chromosomes (Y) (strain 334). (B) Restriction maps around the casein gene complex. The order and sizes of restriction fragments produced by each enzyme are indicated along the horizontal maps. Fragments observed with each enzyme are shown below their respective map. Estimated size of the complex is shown at the bottom by aligning the three maps with CASAS1 assigned to position 0.


accumulation of grains is found on bovine chromosome 2. The discrepancy may have been caused by only scoring grains on the three easily identified metacentric sheep chromosomes, none of which carries the homolog to bovine chromosome 6. The chromosome carrying LGB and syntenic group U16 has not been identified; however, syntenic group U3, containing LALBA, has recently been assigned to bovine chromosome 5 (44).


Figure 3: In situ hybridization with the casein gene probes. (A) Bovine metaphase chromosomes hybridized with a probe mixture consisting of the four casein probes. Chromosomes were identified by post-hybridization FPG banding. Chromosome 6 homologues are labeled and an arrow points to a specific hybridization signal. (B) Standardized bovine idiogram showing the distribution of silver grains from 64 metaphase spreads. A significant peak is present at 6q31-33. The other grains are randomly distributed.

localize the gene complex to a specific bovine chromosome by in situ hybridization. Their chromosomal location could not be determined by synteny analysis due to the difficulty in identifying a partial set of acrocentric bovine chromosomes within the rodent background of the hybrid somatic cells. Metaphase spreads were collected from fibroblast cultures which had been propagated in the presence of bromo-deoxyuridine during the early part of the cell cycle. This allowed the chromosomes to be easily identified after hybridization by induction of differential FPG (fluorescence plus Giemsa) banding (Fig. 3A). When the silver grains revealed by autoradiography were scored on 64 metaphase spreads, 48 of the 158 total grains (30%) were located on chromosome 6 (Fig. 3B). A prominent peak was obvious at the q31-33 region, which contained 38 of the 48 chromosome 6 grains (79%). Chromosome 6 is easily identified by the prominent, subtelomeric Giemsa-positive band. This assignment differs from a previous report suggesting that the casein gene complex is on sheep chromosome 2q (43), which is cytogenetically similar to bovine chromosome 2. No data were presented, but it is interesting to note that when the grains on bovine chromosome 6 are removed from the idiogram in Fig. 3B, an apparent

Restriction Fragment Length Polymorphisms

Early investigations into the variability of the milk proteins revealed a high level of electrophoretic polymorphism in various breeds of cattle (45,46,47). After the primary amino acid sequences were elucidated, the relationship between the electrophoretic variants and specific amino acid substitutions was determined (47). This laid the groundwork for phylogenetic studies of cattle breeds based on allelic frequencies (48), in addition to identifying correlations between electrophoretic variants of the milk proteins and economically important traits such as susceptibility to mastitis infection (49) or curd firmness, which influences the quality and quantity of cheese production (50).

Six DNA samples from each of three different cattle breeds were randomly selected in order to determine the extent of detectable restriction fragment length polymorphisms (RFLPs) at the milk protein loci. The three breeds were selected for their diverse origins and included Brahman (Bos indicus), Hereford (Bos taurus, beef breed), and Jersey (Bos taurus, dairy breed). The 18 DNA samples were digested with four restriction enzymes, Eco RI, Hind III, Msp I, and Taq I.

All four enzymes revealed RFLPs at the CASAS1 locus. Msp I had a simple two-allele pattern with the fragment labeled A in Fig. 4A gaining a restriction site to produce fragments B and C. Eco RI produced five polymorphic restriction fragments, while Taq I revealed six and Hind III revealed two polymorphic fragments. Eco RI and Hind III produced monomorphic restriction patterns for CASAS2. However, Msp I and Taq I were polymorphic at CASAS2, producing six and three variant restriction fragments, respectively (Fig. 4B). CASB and CASK, although monomorphic with Eco RI, were polymorphic with Hind III, Msp I, and Taq I. Hind III and Taq I produced simple two-allele RFLPs at CASB with fragments A in Fig. 4C gaining restriction sites to produce fragments B and C, while Msp I revealed three variant restriction fragments. CASK had a two-allele RFLP with Hind III with fragment A gaining a restriction site to produce fragments B and C (Fig. 4D). Msp I and Taq I each revealed four polymorphic restriction fragments. Both enzymes apparently revealed three alleles, A, B, and C+D.

LGB was polymorphic with Msp I and Taq I but monomorphic with Eco RI and Hind III. Msp I produced three polymorphic restriction fragments (Fig. 5A), while Taq I produced a single variant restriction fragment. Individuals either had the 1.9 kb Taq I fragment or were lacking it. LALBA was polymorphic with all four restriction enzymes. Eco RI and Msp I produced seven and eight polymorphic restriction fragments, respectively (Fig. 5B). Hind III and Taq I each produced two distinct restriction fragment patterns, possibly indicating haplotype-specific alleles.

In general, the Brahman appeared to exhibit much more variation at the milk protein loci than did either the Hereford


Figure 4: RFLPs identified at the four casein loci, CASAS1 (A), CASAS2 (B), CASB (C), and CASK (D). Only representative individuals are shown. The polymorphic restriction fragments are labeled on the right and their sizes (in kb) are on the left of each figure.


Figure 5: RFLPs identified at the two major whey protein loci, LGB (A) and LALBA (B). Only representative individuals are shown. The polymorphic restriction fragments are labeled on the right and their sizes (in kb) are on the left of each figure.

TABLE 2. Distribution of polymorphic restriction fragments. Number of individuals: Brahman (n=6), Hereford (n=6), Jersey (n=6).

Restriction Fragment

CASASI

Eco RI

A B C

D E Hin d III

A

5 5 6

B

1

C

1

A B

0 5

A

B I

Msp

Taq I

CASAS2

I

Msp

Hin d III

A B C

Msp

I

A B C

Taq I

A B C

CASK

Hin d III

A B C

Msp

I

A B C

Taq I

D A B C

LGB

Msp

I

D A B C

LALBA

Taq I Eco RI

D E A A B C

Hin d III

D E F G A B C

Msp

I

A B C

D E F

Taq I

G H A B C

D E

6 6 6 0 0 6 0 0 0 0 0 1 6 0 0 6 0 0 0 6 0 6 6

6 6

1

1

6 3

0 6 6 5 6 6 6 6 0 0 5 0 6 6 6 0 0 6 6

0 6 6 3 4 4 3 4 0 0 3 0 4 4 6 0 0 6 6

4 6 0 6 6 0 6 0 0 0 6 2 5 1 6 6 0 0 6 0 6 0 0 6

0 6 0 6 6 0 6 0 0 0 6

3 5 4 2 5 3 3 1 5 5 3 3

C

CASB

6 6 0

2 0 6 6 3 6

3 0 3 5 5 6 0 0 0 6 2 2 6

1

D E F A B D E F A B

0 0

3 3 6 0 3 0 0 0 4 6 0 0 6 0 0 0 6 0 6 6 0 6 0

C

C

Taq I

1

2 5 4 4

1

6 6

1

0 0 2 2 4 4 5 6 1 6 1 6 2 5 6 5 5 1 0 6 6 2 5 3 5 5 5 1 6 6 1

0

6 0 6 6 0 0 6 0 6 0 0 6

or Jersey (Table 2). Twenty-seven restriction fragments were restricted to the Brahman. Only eight fragments were restricted to a Bos taurus breed. Ten fragments differed between Hereford and Jersey. The Jersey had five fragments not seen in the Hereford, while the Hereford also had five fragments not seen in the Jersey. Only four of the identified RFLPs at the whey protein genes were variant in a Bos taurus breed. Whether or not this trend is due to actual variation differences between the breeds will require an extensive RFLP analysis.

Several of the RFLPs, two at CASK and one at LALBA, could be related to known protein polymorphisms. Two amino acid substitutions in the CASK-A allele produce CASK-B, Thr to Ile and Asp to Ala at positions 136 and 148, respectively (47). Comparison of amino and nucleic acid sequences within these regions reveals that both substitutions create Hind III and Taq I restriction sites. Therefore, the Hind III and Taq I RFLPs, A and B+C and A, B, and C+D, respectively (Fig. 4D), are probably identical to the polymorphism described for the protein. While the polymorphic restriction site recognized by Hind III/Taq I is generated by the nucleic acid base substitution producing the protein polymorphism (51), the Pst I RFLP reported by Rando et al. (52), distinguishing CASK-A from CASK-B, recognizes a different restriction site near the nucleic acid substitution.

The other RFLP with an associated protein polymorphism was at LALBA. Alpha-lactalbumin A (LALBA-A) differs from LALBA-B by a single amino acid substitution at position 10 (47). The Gln to Arg substitution is produced by an A to G transition which also creates an Msp I restriction site in the LALBA gene. Since LALBA-A is restricted to African and Asian breeds, the Msp I restriction fragment G at LALBA, which was limited to the Asian Brahman breed (Fig. 5B and Table 2), was apparently produced by the gain of a restriction site in fragment D, with lane 4 being heterozygous for this restriction site. This Msp I RFLP probably corresponds to the known protein polymorphism. However, pedigreed animals segregating this and the other RFLPs, in conjunction with sequence comparisons between alleles, will be required to unambiguously associate known milk protein phenotypes with the identified RFLPs.
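How a single base substitution can create a restriction site, as described for the Msp I RFLP at LALBA, can be sketched as follows. The two short sequences are invented for illustration and are not the LALBA sequence; they only assume a Gln codon (CAG) preceded by a C that changes to an Arg codon (CGG) through an A to G transition. The recognition sequences are the standard ones for the four enzymes used in this study, and the function names are hypothetical.

```python
# Recognition sequences of the four enzymes used in the RFLP survey.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
    "MspI": "CCGG",
    "TaqI": "TCGA",
}

def enzymes_cutting(sequence):
    """Names of enzymes whose recognition sequence occurs in the sequence."""
    sequence = sequence.upper()
    return {name for name, site in RECOGNITION_SITES.items() if site in sequence}

def sites_gained(reference, variant):
    """Enzymes that cut the variant allele but not the reference allele."""
    return enzymes_cutting(variant) - enzymes_cutting(reference)

# Hypothetical alleles: ...ACC CAG... (Gln) versus ...ACC CGG... (Arg).
reference_allele = "ATGACCCAGTGTGAG"
variant_allele = "ATGACCCGGTGTGAG"

print(sites_gained(reference_allele, variant_allele))
```

In this toy case the transition turns CCAG into CCGG, so only Msp I gains a site, and on a Southern blot the larger fragment would be replaced by two smaller ones in carriers of the variant allele.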

CONCLUSIONS

The arrangement of the genes coding for the major bovine milk proteins has significant implications for breeding programs aimed at developing individuals with pre-determined genotypes. The tight physical linkage of the caseins indicates that rather than using selective breeding programs to produce individuals with desired casein haplotypes, either pre-existing or genetically engineered individuals with the haplotypes would be much more efficient at introducing the desired alleles into breeding populations. This route has recently become feasible with the advent of transgenic animals and yeast artificial chromosome technology, which now allow entire gene complexes to be manipulated in the laboratory before being introduced into the genome (53). Additionally, the entire complex could be manipulated to produce pharmacologically important compounds in the milk, as has been reported with the whey proteins in sheep (54). This could include multimeric proteins which need to be expressed in defined stoichiometric quantities in order to form active multimers. Likewise, cattle, or related species like sheep, could also be altered to become factories for proteins with specific amino acid compositions to improve the dietary value of milk in highly utilized breeds adapted to local environments in underdeveloped countries (55,56).

The close proximity of the casein genes supports the hypothesis of common hormonal regulation for the entire complex (57). This is particularly relevant in light of the previously mentioned interests in using milk protein systems in transgenic animals as pharmaceutical factories. If regulatory elements are shared between the casein genes, greater understanding of the organization of the genes relative to their common regulatory regions will be required in order to assure adequate levels of expression in transgenic animals. Finally, the RFLP search indicates that the casein gene complex and the two whey protein loci are sufficiently polymorphic to act as anchor loci during the construction of a bovine genetic map.

ACKNOWLEDGEMENTS

We thank E. Owens for excellent technical assistance, D. Gallagher for help in chromosome identification, L. Skow for allowing access to his field-inversion system, R. Fries for providing the bovine idiogram, and A. Mackinlay and L. Schuler for providing the probes. This work was supported in part by USDA grants 86-CRCR-1-2202 and 87-CRCR-1-2437 and by the Institute of Biosciences and Technology.

REFERENCES
1. Jenness, R. (1985) In Larson, B. L. (ed.), Lactation. Iowa State University, Ames, Iowa, pp. 164-197.
2. Larson, B. L. (1979) J. Dairy Res. 46, 161-174.
3. Mackinlay, A. G. and Wake, R. G. (1971) In McKenzie, H. A. (ed.), Milk Proteins: Chemistry and Molecular Biology. Academic Press, New York, Vol. 2, pp. 175-215.
4. Sleigh, R. W., Sculley, T. B. and Mackinlay, A. G. (1979) J. Dairy Res. 46, 337-342.
5. Stewart, A. F., Bonsing, J., Beattie, C. W., Shah, F., Willis, I. M. and Mackinlay, A. G. (1987) Mol. Biol. Evol. 4, 231-241.
6. Stewart, A. F., Willis, I. M. and Mackinlay, A. G. (1984) Nucleic Acids Res. 12, 3895-3907.
7. Pearse, M. J., Linklater, P. M., Hall, R. J. and Mackinlay, A. G. (1986) J. Dairy Res. 53, 381-390.
8. Kuhn, N. J. (1983) In Mepham, T. B. (ed.), Biochemistry of Lactation. Elsevier, Amsterdam, pp. 159-176.
9. Perviaz, S. and Brew, K. (1985) Science 228, 335-337.
10. Godovac-Zimmermann, J. (1988) Trends Biol. Sci. 13, 64-66.
11. Jenness, R. (1979) J. Dairy Res. 46, 197-210.
12. Gordon, K., Lee, E., Vitale, J. A., Smith, A. E., Westphal, H. and Hennighausen, L. (1987) Biotechnology 5, 1183-1187.
13. Simons, J. P., McClenaghan, M. and Clark, A. J. (1987) Nature 328, 530-532.
14. Hurley, W. L. and Schuler, L. A. (1987) Gene 61, 119-122.
15. Willis, I. M., Stewart, A. F., Caputo, A., Thompson, A. R. and Mackinlay, A. G. (1982) DNA 1, 375-386.
16. Feinberg, A. P. and Vogelstein, B. (1983) Anal. Biochem. 132, 6-13.
17. Womack, J. E. and Moll, Y. D. (1986) J. Hered. 77, 2-7.
18. Blin, N. and Stafford, D. W. (1976) Nucleic Acids Res. 3, 2303-2308.
19. Maniatis, T., Fritsch, E. F. and Sambrook, J. (1982) Molecular Cloning: A Laboratory Manual. Cold Spring Harbor University Press, Cold Spring Harbor, New York.
20. Threadgill, D. W. and Womack, J. E. (1990) Genomics, in press.
21. van Ommen, G. J. B. and Verkerk, J. M. H. (1986) In Davies, K. E. (ed.), Human Genetic Diseases: A Practical Approach. IRL Press, Oxford, pp. 113-133.
22. Naylor, S. L., Mcgill, J. R. and Zabel, B. U. (1987) In Gottesman, M. M. (ed.), Methods in Enzymology. Academic Press, New York, Vol. 151, pp. 279-292.
23. Modi, W. S., Nash, W. G., Ferrari, A. C. and O'Brien, S. J. (1987) Gene Analysis Tech. 4, 75-85.
24. DiBerardino, D., Hayes, H., Fries, R. and Long, S. (1990) Cytogenet. Cell Genet. 53, 65-79.
25. Hines, H. C., Kiddy, C. A., Brum, E. W. and Arave, C. W. (1969) Genet. 62, 401-412.
26. Matyukov, V. S. and Urnyshev, A. P. (1980) Genetika 16, 572-574.
27. Mercier, J. C., Grasclaude, F. and Ribadeau-Dumas, B. (1972) Milchwissenschaft 27, 402-408.
28. Hines, H. C., Zikakis, J. P., Heanlein, G. F. W., Kiddy, C. A. and Trowbridge, C. L. (1981) J. Dairy Science 64, 71-76.
29. Nikiforov, V. S., Lyubimova, Z. P., Matyukov, V. S., Snopova, A. A., Tarasevich, L. F., Putyatova, V. G. and Stepanyuk, E. V. (1985) Genetika 21, 839-844.
30. Gellin, J., Echard, G., Yerle, M., Dalens, M., Chevalet, C. and Gillois, M. (1985) Cytogenet. Cell Genet. 39, 220-223.
31. Gupta, P. J., Rosen, J. M., D'Eustachio, P. and Ruddle, F. H. (1982) J. Cell Biol. 93, 199-204.
32. Glasnak, V. (1968) Comp. Biochem. Physiol. 25, 355-357.
33. Geissler, E. N., Cheng, S. V., Gusella, J. F. and Housman, D. E. (1988) Proc. Natl. Acad. Sci. USA 85, 9635-9639.
34. Zhang, N. and Womack, J. E. (1990) Proc. Texas Genet. Soc. 17, 22.
35. Shows, T. B., Ruddle, F. H. and Roderick, T. H. (1969) Biochem Genet. 3, 25.
36. Soulier, S., Mercier, J. C., Vilotte, J. L., Anderson, J., Clark, A. J. and Provot, C. (1989) Gene 83, 331-338.
37. Prager, E. M. and Wilson, A. C. (1988) J. Molec. Evol. 27, 326-335.
38. Julkunen, M., Seppala, M. and Janne, O. A. (1988) Proc. Natl. Acad. Sci. USA 85, 8845-8849.
39. Ali, S. and Clark, A. J. (1988) J. Mol. Biol. 199, 415-426.
40. Smith, M. and Simpson, N. E. (1989) Cytogenet. Cell Genet. 51, 202-225.
41. Zaleski, M. B., Dubiski, S., Niles, E. G. and Cunningham, R. K. (1983) Immunogenetics. Pitman, Boston, pp. 266-269.
42. Yamamoto, F., Clausen, H., White, T., Marken, J. and Hakomori, S. (1990) Nature 345, 229-233.
43. Mercier, J. C., Gaye, P., Soulier, S., Hue-Delahaie, D. and Vilotte, J. L. (1985) Biochimie 67, 959-971.
44. Fries, R., Threadgill, D. W., Hediger, R., Gunawardana, A., Blessing, M., Jorcano, J. L., Stranzinger, G. and Womack, J. E. submitted.
45. Aschaffenburg, R. and Drewry, J. (1955) Nature 176, 218.
46. Aschaffenburg, R. (1968) J. Dairy Res. 35, 447-460.
47. Eigel, W. N., Butler, J. E., Ernstrom, C. A., Farrell, H. M., Harwalker, V. R., Jenness, R. and Whitney, R. M. (1984) J. Dairy Science 67, 1599-1631.
48. Manwell, C. and Baker, C. M. A. (1980) Anim. Blood Groups Biochem. Genet. 11, 151-162.
49. Osterhoff, D. R., Ward-Cox, I. S. and Giesecke, W. H. (1973) J. S. Afr. Vet. Ass. 44, 47-51.
50. Bynum, D. G. and Olson, N. F. (1982) J. Dairy Science 65, 2281-2290.
51. Rogne, S., Lien, S., Vegarud, G., Steine, T., Langsrud, T. and Alestrom, P. (1989) Anim. Genet. 20, 317-321.
52. Rando, A., DiGregorio, P. and Masina, P. (1988) Anim. Genet. 19, 51-54.
53. Pachnis, V., Pevny, L., Rothstein, R. and Costantini, F. (1990) Proc. Natl. Acad. Sci. USA 87, 5109-5113.
54. Clark, A. J., Ali, S., Archibald, A. L., Bessos, H., Brown, P., Harris, S., McClenaghan, M., Prowse, C., Simmons, J. P., Whitelow, C. B. A. and Wilmut, I. (1989) Genome 31, 950-955.
55. Puigserver, A. J., Sen, L. C., Clifford, A. J., Fenney, R. E. and Whitaker, J. R. (1978) Adv. Exp. Med. Biol. 105, 587-612.
56. Bremel, R. D., Yom, H. C. and Bleck, G. T. (1989) J. Dairy Sci. 72, 2826-2833.
57. Rosen, J. M. (1987) In Neville, M. C. and Daniel, C. W. (eds.), The Mammary Gland. Plenum, New York, pp. 301-322.
