MOLECULAR AND CELLULAR BIOLOGY, Nov. 1991, p. 5735-5745

Vol. 11, No. 11

0270-7306/91/115735-11$02.00/0

nit4, a Pathway-Specific Regulatory Gene of Neurospora crassa, Encodes a Protein with a Putative Binuclear Zinc DNA-Binding Domain

GWO-FANG YUAN, YING-HUI FU, AND GEORGE A. MARZLUF*

Department of Biochemistry and Molecular Genetics, The Ohio State University, Columbus, Ohio 43210

Received 19 April 1991/Accepted 1 August 1991

nit4, a pathway-specific regulatory gene in the nitrogen circuit of Neurospora crassa, is required for the expression of nit-3 and nit-6, the structural genes which encode nitrate and nitrite reductase, respectively. The complete nucleotide sequence of the nit4 gene has been determined. The predicted NIT4 protein contains 1,090 amino acids and appears to possess a single Zn(II)2Cys6 binuclear-type zinc finger, which may mediate DNA binding. Site-directed mutagenesis studies demonstrated that cysteine and other conserved amino acid residues in this possible DNA-binding domain are necessary for nit4 function. A stretch of 27 glutamines, encoded by a CAGCAA repeating sequence, occurs in the C terminus of the NIT4 protein, and a second glutamine-rich domain occurs further upstream. A NIT4 protein deleted for the polyglutamine region was still functional in vivo. However, nit-4 function was abolished when both the polyglutamine region and the glutamine-rich domain were deleted, suggesting that the glutamine-rich domain might function in transcriptional activation. The homologous regulatory gene from Aspergillus nidulans, nirA, encodes a protein whose amino-terminal half has approximately 60% amino acid identity with NIT4 but whose carboxy terminus is completely different. A hybrid nit-4-nirA gene was constructed and found to function in N. crassa.

When primary nitrogen sources such as glutamine, glutamate, and ammonium are limited, Neurospora crassa can use a variety of secondary nitrogen sources, including nitrate, nitrite, purines, amino acids, and proteins. The utilization of secondary nitrogen sources requires the expression of a set of unlinked structural genes which are controlled in parallel by common regulatory genes and metabolic signals. This regulatory system, designated the nitrogen control circuit, is one of several global metabolic regulatory circuits of N. crassa (32). In the nitrogen circuit, a negative-acting regulatory gene, nmr, prevents the expression of various nitrogen catabolic enzymes when primary nitrogen sources are available. Nitrate reductase and other nitrogen-related enzymes are constitutively expressed in nmr mutants, even in the presence of sufficient primary nitrogen sources to fully repress synthesis of these enzymes in wild type (12, 22). In contrast, the major nitrogen regulatory gene, nit-2, acts in a positive manner to turn on the expression of structural genes in the nitrogen circuit under conditions of nitrogen limitation (17). The nit-2 gene encodes a protein which possesses a single zinc finger domain that is essential for function and which binds in a sequence-specific mode to the promoter regions of nitrogen-regulated structural genes (20, 21). Thus, nit-2 and nmr, the major control genes of the nitrogen circuit, together mediate nitrogen catabolic repression in N. crassa.

An additional feature of the nitrogen regulatory circuit is that pathway-specific control genes mediate specific induction of enzymes by substrates or intermediates. For example, nit4, a pathway-specific regulatory gene, controls the nitrate-induced expression of nit-3 and nit-6, which encode nitrate and nitrite reductase, respectively (32, 41). The nit4 gene appears to regulate the expression of structural genes at the transcriptional level, since no nit-3 gene transcripts could be detected in a nit4 mutant (18). An interesting feature is that the nit-3 gene is constitutively expressed in certain nit-3 mutants even in the complete absence of inducer, which suggests that nitrate reductase may play an autoregulatory role (19, 42). One possible mechanism for this autogenous regulation is that nitrate reductase directly interacts with a nit4-encoded protein and modulates its activation function or prevents its import into the nucleus (19).

The nit4 gene has been isolated and demonstrated to be expressed to yield a constitutive, but very low abundance, transcript (16). In this study, the entire nucleotide sequence of the nit4 gene and its flanking regions has been determined. nit4 can be translated to give a protein of approximately 120,000 Da. The NIT4 protein contains a putative DNA-binding domain consisting of a Zn(II)2Cys6-type single zinc finger motif near its amino terminus and also possesses two glutamine-rich regions near its carboxy terminus, which could constitute trans-acting domains. NIT4 shows considerable homology to NIRA, a similar regulatory protein of Aspergillus nidulans.

* Corresponding author.

MATERIALS AND METHODS

Strains. The N. crassa wild-type strain 740R231A and nit4 mutant (allele 2994) strain were obtained from the Fungal Genetics Stock Center (University of Kansas Medical Center). Cultures were grown in Vogel's liquid medium with shaking at 30°C as described previously (11).

nit4 deletions and transformation assay. pNIT4B was constructed from pNIT4 (16) by removing a ClaI site. The ApaI and ClaI sites in the polylinker of pBluescript vector of pNIT4B were used to make a series of deletion clones by the exonuclease III-mung bean nuclease method (24). After double digestion of pNIT4B with ApaI and ClaI, unidirectional deletion clones were produced by digestion from the 5' overhang of ClaI site with exonuclease III, which is not active with the 3' overhang of ApaI site. The single-strand


region was then removed by mung bean nuclease. The deletion clones and also mutants obtained from oligonucleotide-directed mutagenesis were assayed for function by transformation into nit4 mutant protoplasts, with appropriate positive and negative controls (43). It should be noted that with N. crassa, the vast majority of transformants arise from ectopic integration of the exogenous DNA, either as a single copy or in very low copy number, with homologous recombination being relatively rare (17, 18, 20). Thus, transformation assays determine whether the manipulated gene being tested is itself functional or nonfunctional. The transformation assay tests for complementation of a nit4 mutant to allow the use of nitrate as a nitrogen source. The nit4 mutant strain (allele 2994) used as the host for the transformation assays has a frameshift mutation and thus can possess only a truncated, inactive protein.

DNA sequencing and primer extension. DNA sequences were determined by the dideoxy-chain termination method (39) with [α-32P]dATP and Sequenase (United States Biochemical Corp.). Alkaline-denatured DNA prepared as minipreps (4) was used as template. Band compressions were eliminated when necessary by using dITP in place of dGTP. Alternatively, the heat-resistant enzyme Taq polymerase was used instead of Sequenase to read through compression regions by carrying out the sequencing reactions at 70°C. Primer extension experiments were performed by annealing a 5'-end-labelled 17-mer oligonucleotide primer, which hybridized at position 15, with 20 µg of poly(A)+ RNA at 30°C in 80% formamide overnight. Moloney murine leukemia virus reverse transcriptase (Bethesda Research Laboratories, Inc., Gaithersburg, Md.) was used to extend the products at 37°C for 1 h.

Poly(A)+ RNA and cDNA isolation. N. crassa total RNA was isolated from mycelia by the method of Reinert et al. (38). The poly(A)+ RNA fraction was isolated by oligo(dT)-cellulose chromatography (1). The cDNA library was constructed as described previously (22). Four rounds of plaque hybridization with random-primer-labelled (13) pNIT4B probe were required to isolate nit4 cDNA clones. Purified λ DNA was digested with EcoRI, and the cDNA inserts were subcloned into pBluescript KS+ vector and sequenced as described above.

Oligonucleotide-directed mutagenesis. The Kunkel method (30) was used for site-directed mutagenesis. Since the 4.7 kb of the cloned nit4 gene and the flanking region is too large to be efficiently mutated, a smaller fragment which encodes the Zn(II)2Cys6 finger and the adjacent regions of nit4 was subcloned into pUC119 vector. After mutagenesis and confirmation by sequencing, each mutated region was used to replace the corresponding fragment of a nit4+ gene.

Construction of a nit4-nirA hybrid gene. A region of the nirA gene which encodes the Aspergillus NIRA protein amino acid residues 491 to 892 was amplified via the polymerase chain reaction (PCR) technique. One of the primers used for the PCR, TTGCCGAAGGAATTCGAGCCG, corresponds to the coding sequence for NIRA amino acid residues 491 to 497 except that the CTC codon for Leu-493 was changed to the Phe codon, TTC, in order to generate an EcoRI site. This substituted Phe (residue 493) corresponds exactly to Phe-517 of NIT4; an EcoRI site occurs at this location in the nit4 gene. The other PCR primer annealed to the end of NIRA coding region and contained a BamHI site to facilitate the cloning step. The primers were added to A. nidulans genomic DNA, and PCR was performed with an Ericomp Thermal Cycler as recommended by Perkin-Elmer Cetus (Norwalk, Conn.). The amplified nirA fragment was


blunt ended with T4 DNA polymerase and cloned into the EcoRV site of pBluescript KS. The nirA fragment was then isolated by treating the plasmid DNA with EcoRI and BamHI enzymes and inserted into plasmid pC13 (16), from which an EcoRI-BamHI DNA fragment was removed. The fragment which was eliminated from pC13 encodes the NIT4 protein C-terminal 573 amino acids, which includes a glutamine-rich and a polyglutamine domain. The resulting plasmid contains a hybrid nit4-nirA gene. After sequencing to ensure that the proper construction had been achieved, the hybrid nit4-nirA gene was assayed for function by transformation into a Neurospora nit4 mutant host strain.

Computer-aided sequence analysis. IBI Pustell sequence analysis software (International Biotechnologies, New Haven, Conn.) was used to manipulate and analyze nucleotide sequences. The predicted NIT4 protein was compared with sequences of the Protein Identification Resource data base (version 19; National Biomedical Research Foundation), which contains 10,528 entries with 2,802,543 residues.

RESULTS

nit4 nucleotide sequence. A restriction map of nit4 and the strategy used to sequence it are shown in Fig. 1. The entire 4.7-kb region which encompasses the nit4 gene and flanking DNA was determined; both strands were sequenced with overlapping deletion clones. The complete nucleotide sequence is displayed in Fig. 2. A hexanucleotide sequence, CAGCAA, is repeated 13 times near the 3' end of the nit4 gene.

Intron sequence of the nit4 gene. A computer analysis of the nit4 nucleotide sequence revealed two long open reading frames, suggesting the presence of an intron. The presence of a small intron was established by comparing the DNA sequences of the genomic and a cDNA clone. This intron consists of only 59 bases and possesses N. crassa intron consensus sequence (35) at its 5' and 3' splicing sites and at the internal branch point site (Fig. 2).

Structural analysis of the nit4 gene. The transcription direction of nit4 (Fig. 1) was determined by conducting Northern (RNA) blot hybridizations with single-stranded DNA probes (data not shown). The nit4 gene is transcribed to give an mRNA of approximately 3.5 kb. Primer extension analysis revealed that the transcriptional initiation of nit4 occurs at a single site located 37 bases upstream of the first AUG codon (designated +1) which begins a long open reading frame (Fig. 2). The 3' end of nit4 mRNA was identified by comparing the sequences of genomic and cDNA clones. A possible poly(A) signal AATAAA (15) and the polyadenylation site of the nit4 transcript occur 104 and 129 bp, respectively, downstream of the UGA stop codon (Fig. 2). The first AUG codon begins a long open reading frame (Fig. 2) and is closely followed by two additional in-frame AUG codons. Translation from the initial AUG codon to the UGA stop codon yields a protein of 1,090 amino acids, with a calculated molecular weight of 120,236 and a pI of 6.3. The translated NIT4 protein contains a stretch of 27 glutamine residues near its C terminus, which is encoded by the CAGCAA repeated sequence. Nine glycines in a stretch, which are encoded by a GGC repeated sequence, occur adjacent to the polyglutamine domain. A glutamine-rich domain (25% glutamine in an interval of 105 amino acids) occurs further upstream of the polyglutamine region. The region of the NIT4 protein from residues 53 to 81 is rich in cysteine and may constitute a Zn(II)2Cys6-type zinc cluster (see below).
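The feature sizes quoted in this section can be cross-checked against the nucleotide coordinates given in the Fig. 2 legend. The following is a minimal sketch of that arithmetic, assuming the legend's coordinates are inclusive and numbered from the A of the start codon (+1); the dictionary keys and function names are illustrative and not part of the original analysis.

```python
# Inclusive nucleotide coordinates as quoted in the Fig. 2 legend.
# Assumption: both endpoints are included and +1 is the A of the start ATG.
FEATURES = {
    "intron": (1587, 1645),
    "glutamine_rich_domain": (2322, 2636),
    "glycine_run": (2991, 3017),
    "polyglutamine": (3051, 3131),
}


def span_length(start, end):
    """Length in nucleotides of an inclusive coordinate span."""
    return end - start + 1


def codon_count(start, end):
    """Number of whole codons in a coding span; rejects spans not divisible by 3."""
    length = span_length(start, end)
    if length % 3:
        raise ValueError(f"span {start}-{end} is {length} nt, not a whole number of codons")
    return length // 3


intron_nt = span_length(*FEATURES["intron"])
gln_rich_codons = codon_count(*FEATURES["glutamine_rich_domain"])
gly_codons = codon_count(*FEATURES["glycine_run"])
polyq_codons = codon_count(*FEATURES["polyglutamine"])

# Thirteen CAGCAA hexanucleotide repeats supply two glutamine codons each.
repeat_codons = 13 * len("CAGCAA") // 3

print(intron_nt, gln_rich_codons, gly_codons, polyq_codons, repeat_codons)
```

Under these assumptions the spans agree with the text: a 59-base intron, a 105-residue glutamine-rich interval, nine glycine codons, and 27 codons for the polyglutamine stretch. The 13 CAGCAA repeats account for 26 of those 27 codons, so one further glutamine codon presumably lies within the same 81-nucleotide span.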

FIG. 1. nit4 gene structure, restriction map, and sequencing strategy. The direction of transcription is indicated by the bold arrow, which represents the 3.5-kb nit4 transcript. The predicted NIT4 protein, consisting of 1,090 amino acid residues, is represented by the cross-hatched box; the black box indicates the single intron. The short horizontal arrows indicate the strategy for sequencing both strands, which was accomplished with deletion clones and oligonucleotide primers. Restriction sites and length of the genomic segment sequenced are shown.

Analysis of deletion clones of nit4. Deletion clones which truncated the 5' and 3' ends of the cloned nit4 gene were constructed and transformed into the nit4 mutant strain to localize the nit4 gene boundaries. The subclone pNIT4B, which contains the entire nit4 gene, and the deletion clones pB75 and pC13 complemented the nit4 mutant, but the other deletion clones shown in Fig. 3 were all nonfunctional. These results indicate that the CAGCAA multiple repeated sequence, which encodes the polyglutamine domain, can be eliminated and still yield a functional nit4 gene product. However, when the deletion extended from the C terminus into the glutamine-rich domain, nit4 function was lost. It seemed possible that the polyglutamine domain and the glutamine-rich domain might be redundant, and that a functional NIT4 protein would have to possess one or the other but not both of these regions. To examine this possibility, an internal deletion was made in frame so that the resulting NIT4 protein possessed the polyglutamine but not the glutamine-rich region. This mutant lacked function in vivo, which suggests that the NIT4 glutamine-rich region is essential, whereas the polyglutamine region is dispensable. Comparison of deletions pC9 and pC13 shows that several hundred bases upstream of the initial AUG codon are necessary for nit4 function.

A possible binuclear zinc DNA-binding domain. Several regions of the putative NIT4 protein display homology to the yeast GAL4 protein. A possible single Zn(II)2Cys6-type zinc cluster was identified in NIT4 by its significant homology to several fungal regulatory proteins, including one Neurospora protein, QA-1F (2), one Aspergillus protein, QUTA (3), and several yeast proteins, such as GAL4, LAC9, and PPR1 (14, 25, 26, 28, 29, 33, 45) (Fig. 4). This zinc finger motif and the immediate downstream basic region represent the DNA-binding domain of GAL4 and several other fungal regulatory proteins. To investigate whether this region of the NIT4 protein is important for its function, 16 different mutants were constructed by site-directed mutagenesis. Mutants were obtained which led to substitutions for cysteine residues, conserved basic amino acids, a conserved proline residue, and a nonconserved alanine (Fig. 5). The functional activity of each nit4 mutant was assayed by its ability to transform a nit4 mutant strain, in parallel with a nit4+ gene as a positive control. All substitutions for the two cysteine residues examined abolished nit4 function, even the conservative change of Cys-73 to a serine. As expected, conversion of the codon for Cys-73 to a stop codon gave rise to a nonfunctional nit4 gene. The mutants in which the conserved lysine (residue 60) was changed to either glutamate or glutamine were also nonfunctional. In contrast, mutants in which a nonconserved amino acid (alanine 66) was changed to glycine and even the more radical substitution, aspartate, retained normal nit4 function. Proline 68 corresponds to a conserved residue which gave a zinc-dependent conditional mutation when altered in GAL4; similarly, we found that mutant nit4 genes with three different substitutions of Pro-68 displayed partial function in vivo (Fig. 5). Three mutants which affected residues in the basic region downstream of the zinc finger motif were also obtained; one mutant which caused substitutions for Arg-91 and Lys-92 was nonfunctional in vivo, whereas two mutants which altered Arg-96 and Lys-98 retained partial function. Witte and Dickson (46) showed that three of four variant LAC9 proteins with amino acid substitutions located on the carboxy-terminal side of the single zinc finger were nonfunctional and concluded that the zinc finger and adjacent basic region were responsible for DNA binding.

A NIT4-NIRA hybrid protein is functional. The NIT4 protein of N. crassa and the corresponding NIRA protein of A. nidulans show marked amino acid identity throughout the amino-terminal half of the proteins but are completely different throughout their carboxy-terminal halves (see below). Unlike NIT4, the carboxy-terminal half of NIRA lacks any glutamine-rich regions. Since this region of NIT4 seemed to be essential for function, we wondered whether the completely different carboxy-terminal part of NIRA could substitute for this region of NIT4. PCR was used to clone the 3' end of the Aspergillus nirA gene, which was then used to replace the corresponding 3' end of the cloned nit4 gene, as described in Materials and Methods. One primer used in the PCR introduced an EcoRI site such that the fusion occurred in a homologous region of NIT4 and NIRA; e.g., Lys-491 of NIRA corresponds to Lys-515 of NIT4 (Fig. 5B). Following the point of fusion, the two proteins display considerable

FIG. 2. Nucleotide sequence of nit4 and its flanking regions. The AUG translation start codon which initiates the long open reading frame is numbered +1. The translated amino acid sequence of the NIT4 protein is shown beneath the DNA sequence. The vertical arrows indicate the 5' and 3' mRNA termini which were identified by primer extension and comparing sequences of cDNA and genomic DNA. An AATAAA sequence element (at position 3496), a possible poly(A) signal, is boxed. A single small intron is located at positions 1587 to 1645. The consensus sequences of 5' and 3' splicing sites and the internal branch site are underlined. The putative Zn(II)2Cys6 zinc finger is underlined. A CAGCAA repeated sequence, which encodes the polyglutamine domain, occurs at 3051 to 3131. Nine glycines in a stretch, encoded by a GGC repeated sequence, are encoded by nucleotides 2991 to 3017. The glutamine-rich domain is encoded by nucleotides 2322 to 2636.

FIG. 3. 5' and 3' borders of the nit4 gene, determined by use of deletion mutants. The open box indicates the NIT4 protein-coding region. The 5' untranslated region is shown with a dashed line. The intron is indicated as a small black box, and the Zn(II)2Cys6 zinc finger, near the amino terminus, is represented by a dotted box. The glutamine-rich and polyglutamine domains near the carboxyl terminus are shown as hatched and cross-hatched boxes, respectively. Deletion clones pB75 and pC13 possess function, as assayed by transformation, whereas the other clones are inactive. The number of amino acids left in each nit4 deletion clone is shown. +, functional in transformation assay; -, nonfunctional.

FIG. 4. (A) Proposed Zn(II)2Cys6 zinc finger structure in the putative NIT4 protein. The six cysteine residues are shown coordinating two zinc atoms to form a cloverleaflike structure. (B) Homology among the six cysteine-type zinc finger regions of several fungal regulatory proteins. The conserved cysteine and other amino acid residues are blocked. (C) Amino acid sequence of the basic region immediately downstream of each zinc finger motif. Basic amino acid residues are circled.
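The six-cysteine spacing and the downstream basic residues described for NIT4 can be located with a simple pattern scan. The sketch below is illustrative only. The segment (residues 49 to 98) was transcribed by hand from the translated sequence of Fig. 2 and should be checked against the published figure. The relaxed spacer ranges in `ZN2CYS6` are an assumption chosen for this example, not a pattern defined by the authors. All names are hypothetical.

```python
import re

# Residues 49-98 of the predicted NIT4 protein.
# Assumption: hand-transcribed from the Fig. 2 translation; verify before reuse.
SEGMENT_START = 49
SEGMENT = "VSTACIACRRRKSKCDGALPSCAACASVYGTECIYDPNSDHRRKGVYREK"

# Exact GAL4 spacing as quoted in the Discussion: Cys-X2-Cys-X6-Cys-X6-Cys-X2-Cys-X6-Cys.
GAL4_SPACING = re.compile(r"C.{2}C.{6}C.{6}C.{2}C.{6}C")

# Relaxed six-cysteine pattern; the spacer ranges are an illustrative assumption.
ZN2CYS6 = re.compile(r"(C).{2}(C).{6}(C).{5,12}(C).{2}(C).{6,8}(C)")


def find_cysteines(segment, first_residue):
    """Return protein coordinates of the six cysteines in the first motif match."""
    match = ZN2CYS6.search(segment)
    if match is None:
        return []
    return [first_residue + match.start(i) for i in range(1, 7)]


def basic_residues(segment, first_residue, start, end):
    """Return coordinates of Lys/Arg residues between start and end (inclusive)."""
    return [
        pos
        for pos, aa in enumerate(segment, first_residue)
        if start <= pos <= end and aa in "KR"
    ]


cysteines = find_cysteines(SEGMENT, SEGMENT_START)
basic = basic_residues(SEGMENT, SEGMENT_START, 82, 98)

print(cysteines)
print(basic)
```

With the segment as transcribed, the scan places the cysteines at residues 53, 56, 63, 70, 73, and 81, and the lysine and arginine residues of the downstream region at 90, 91, 92, 96, and 98. These agree with the positions named in the mutagenesis results. The final spacer in this transcription is seven residues long, which is why the sketch uses a range rather than the exact GAL4 spacing.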

homology for approximately 130 amino acids, following which they are completely distinct. The NIT4-NIRA hybrid protein consists of 517 residues identical to NIT4 at its amino terminus, followed by 397 residues of NIRA (Fig. 5C). This hybrid gene (which contains the nit-4 promoter region) functioned in vivo in N. crassa and transformed a nit4 mutant with an efficiency essentially identical to that obtained with a wild-type nit4 gene. The apparent requirement of NIT4 for a glutamine-rich region appears to be compensated for by some domain in the carboxy-terminal portion of NIRA. The intact nirA gene does not appear able to complement a nit-4 mutant, but it may not be expressed in N. crassa because of different promoter requirements.

[Figure 5 graphic not reproduced. Legible in panel B: NIRA, AAG GAA CTC GAG (K E L E); PCR primer, AAG GAA TTC GAG, with the EcoRI site marked; NIT4, AAG GAA TTC GAA (K E F E). Legible in panel C: NIRA, 892 residues; NIT4, 1,090 residues; NIT4/NIRA hybrid, 914 residues (517 + 397).]

FIG. 5. Site-directed mutagenesis and hybrid gene construction. (A) Amino acid substitutions in the zinc finger and downstream basic region. Numbers indicate the amino acid residue (relative to the amino-terminal methionine, +1). Most mutants have a single amino acid substitution; the connected arrows indicate cases in which a mutant had two amino acid changes. Results of transformation assays: +, functional; (+), partially functional (transformants grow slowly); -, nonfunctional. (B) Region of gene fusion. The fusion occurred between the coding regions for residues Lys-491 to Glu-494 of NIRA and the corresponding residues, Lys-513 to Glu-516, of NIT4 (see Fig. 6). Shown are 12 bases of the PCR primer used which indicate the creation of an EcoRI site with the substitution of Phe (as found in NIT4) for Leu-493 of NIRA. (C) Diagram of the encoded NIRA, NIT4, and hybrid NIT4-NIRA proteins. The black boxes illustrate the glutamine-containing regions of NIT4 which are missing entirely in the hybrid protein. The nit-4-nirA gene encodes a protein composed of 517 residues of NIT4 fused to the carboxy-terminal 397 residues of NIRA.

DISCUSSION

Pan and Coleman (36, 37) found that the DNA-binding domain of the GAL4 protein has the sequence Cys-X2-Cys-X6-Cys-X6-Cys-X2-Cys-X6-Cys and forms a Zn(II)2Cys6 binuclear metal complex. This same distinctive cysteine-rich motif is found in NIT4 and other fungal regulatory proteins, and in each case it is followed immediately by a basic region (Fig. 4). This class of single-zinc-cluster DNA-binding proteins regulates diverse types of metabolic pathways: carbohydrate metabolism (GAL4, LAC9, and MAL63) (23, 29, 45), quinate utilization (QA-1F and QUTA) (2, 3), and nitrogen catabolism (NIT4 and DAL81) (5). Both the zinc cluster and the basic region on its carboxy-terminal side are necessary for DNA binding of GAL4 and LAC9 proteins (8, 46). The amino acids downstream of the zinc fingers of GAL4, LAC9, and PPR1 are necessary for specific recognition of target DNA sequences.

We used site-directed mutagenesis to determine whether amino acid residues in the putative zinc finger region of the NIT4 protein were required for nit4 function. When the third and fifth cysteine residues of the possible Zn(II)2Cys6 structure were individually replaced by another small uncharged amino acid, glycine and serine, respectively, nit4 function was lost. Similarly, when lysine 60 of NIT4, which is conserved at this position in the single zinc finger motif of this class of regulatory proteins, was substituted by either a glutamate or a glutamine residue, nit4 function was abolished. In contrast, when alanine 66 of NIT4, which occurs at a position occupied by different amino acids in various single zinc finger proteins, was substituted by glycine or by aspartate, the mutated NIT4 protein was functional. The homology of the NIT4 protein to the zinc finger domains of other regulatory proteins is striking; this feature and the finding that changing individual amino acid residues in this region abolishes nit4 function strongly imply that NIT4 possesses a single Zn(II)2Cys6 zinc finger which, with the adjacent basic region, constitutes a DNA-binding domain. The finding that nit4 mutants which specify NIT4 proteins with substitutions of certain amino acids in this region, e.g., Ala-66 and Pro-68, retain function, whereas nearby substitutions of amino acids predicted to be essential cause a complete loss of function, suggests that stability of the mutant proteins is probably not an important factor; rather, these results indicate that loss of function reflects a requirement for specific amino acids at certain positions in this domain of the NIT4 protein.
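The residue numbering used in the mutagenesis discussion can be checked against the sequence with a short sketch. The 30-residue string below is residues 51 to 80 of NIT4 as transcribed from the Fig. 6 alignment; the extraction of that figure is noisy, so the string is an assumption rather than an authoritative sequence, and the function names are illustrative only. The sketch lists the cysteines in the window, counts the residues between them, and looks up positions 60, 66, and 68.

```python
# NIT4 residues 51-80, transcribed from the Fig. 6 alignment.
# Assumption: the transcription is free of OCR errors in this window.
NIT4_51_80 = "TACIACRRRKSKCDGALPSCAACASVYGTE"
FIRST_RESIDUE = 51  # protein position of the first character above


def residue_at(position: int) -> str:
    """Return the one-letter code at a 1-based protein position."""
    return NIT4_51_80[position - FIRST_RESIDUE]


def cysteine_positions() -> list[int]:
    """List the 1-based protein positions of every cysteine in the window."""
    return [FIRST_RESIDUE + i for i, aa in enumerate(NIT4_51_80) if aa == "C"]


def spacers(positions: list[int]) -> list[int]:
    """Count the residues lying between each pair of consecutive cysteines."""
    return [b - a - 1 for a, b in zip(positions, positions[1:])]


if __name__ == "__main__":
    cys = cysteine_positions()
    print("cysteine positions:", cys)
    print("residues between consecutive cysteines:", spacers(cys))
    for pos in (60, 66, 68):
        print(f"residue {pos}: {residue_at(pos)}")
```

Under this transcription the cysteines fall at positions 53, 56, 63, 70, and 73, with spacers of 2, 6, 6, and 2 residues, which agrees with the Cys-X2-Cys-X6-Cys-X6-Cys-X2-Cys portion of the GAL4 motif, and positions 60, 66, and 68 read K, A, and P, as named in the text. The window stops at residue 80 and does not attempt to locate the sixth cysteine.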

A common feature of regulatory factors is the presence of distinct domains for DNA binding and transcriptional activation (34). GAL4 contains two acidic regions which have been shown to be responsible for transcriptional activation (31). NIT4 contains only relatively short acidic regions but contains two regions which are extremely rich in glutamine. Similarly, the yeast DAL81 protein, which controls purine catabolism, contains a single Zn(II)2Cys6 zinc finger and two polyglutamine regions (5). A glutamine-rich motif in the mammalian SP1 regulatory protein has been demonstrated to function in activation of gene expression (9, 10). We found that nit4 function was lost when the glutamine-rich domain was deleted from the NIT4 protein. Thus, the glutamine-rich domain of NIT4 might function in activating gene expression. Many proteins contain polyglutamine regions, such as the TATA box-binding protein, transcription factor TFIID (27), the Drosophila neurogenesis protein NOTCH (44), and the yeast SSN6 protein (40), but the function of these polyglutamine regions, if any, is still unknown.

[Figure 6 alignment not reproduced: NIT4 (1,090 residues) aligned with NIRA (892 residues).]

FIG. 6. Homology between the NIT4 and NIRA proteins. NIT4 amino acid residue numbers are given above the sequence. Stars indicate identical amino acid residues. The Zn(II)2Cys6 zinc finger and the glutamine-rich domain are underlined. The polyglutamine domain is double underlined. Acidic regions are blocked with dashed lines. Closed boxes indicate the highly homologous regions.

An alternative explanation for the lack of function observed with deletions in nit4 genes which result in loss of segments of the NIT4 protein must be considered. Such mutant proteins could be unstable and degraded in vivo, obviously resulting in a null phenotype; in this case, the region deleted from the protein might be important for stability but lack any specific function. Since no procedure is yet available to detect the NIT4 protein, e.g., with specific antibodies, the possibility that mutant proteins might be unstable cannot be dismissed.

The nirA gene of A. nidulans (6), which corresponds to nit4 of N. crassa, has recently been sequenced, which makes possible a direct comparison of the NIT4 and NIRA regulatory proteins (Fig. 6). The NIRA protein consists of 892 amino acids and thus is somewhat shorter than NIT4, which contains 1,090 amino acids. In a stretch of approximately 600 amino acids, NIRA shows a high degree of homology to the corresponding region of NIT4 (residues 45 to 655), with 59% amino acid identity (Fig. 6). Of particular significance is the fact that 90% amino acid identity occurs between NIT4 and NIRA in the putative zinc cluster and immediately adjacent residues (Fig. 6). This strong homology in the putative DNA-binding domain suggests that the DNA recognition elements for NIT4 and for NIRA may be very similar. Indeed, transformation experiments have demonstrated that the nit4 gene of N. crassa can substitute for nirA in A. nidulans mutants which lack NIRA function (23a). In a region downstream of the zinc cluster, NIT4 (residues 335 to 418) has considerable homology to the corresponding segments of NIRA and of other fungal regulatory proteins which possess a single Zn(II)2Cys6 zinc cluster motif (7). NIT4 and NIRA possess short homologous acidic regions (Fig. 6), which have been implicated as activation domains of some trans-acting proteins (31).
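The identity figure quoted for the zinc cluster region can be illustrated with a position-by-position comparison. The two strings below are NIT4 residues 46 to 90 and the aligned NIRA segment as transcribed from the Fig. 6 alignment; both the transcription and the choice of this 45-residue window are assumptions made for illustration, not the window the authors used for their calculation.

```python
# NIT4 residues 46-90 and the aligned NIRA segment, transcribed from Fig. 6.
# Assumptions: no OCR errors survive in these strings, and the window
# boundaries are an illustrative choice, not the authors' definition.
NIT4_SEGMENT = "RRCVSTACIACRRRKSKCDGALPSCAACASVYGTECIYDPNSDHR"
NIRA_SEGMENT = "RRCVSTACIACRRRKSKCDGNLPSCAACSSVYHTTCVYDPNSDHR"


def count_identical(a: str, b: str) -> int:
    """Count aligned positions at which the two sequences share a residue."""
    if len(a) != len(b):
        raise ValueError("aligned segments must have equal length")
    return sum(x == y for x, y in zip(a, b))


def percent_identity(a: str, b: str) -> float:
    """Return identical positions as a percentage of the alignment length."""
    return 100.0 * count_identical(a, b) / len(a)


if __name__ == "__main__":
    matches = count_identical(NIT4_SEGMENT, NIRA_SEGMENT)
    pct = percent_identity(NIT4_SEGMENT, NIRA_SEGMENT)
    print(f"{matches} of {len(NIT4_SEGMENT)} positions identical ({pct:.1f}%)")
```

For this window, 40 of 45 positions are identical (about 89%), in line with the roughly 90% identity reported for the putative zinc cluster and adjacent residues. A different window or a corrected transcription would shift the value slightly.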
The carboxy-terminal 300 to 400 amino acids of NIT4 and NIRA are completely different from each other. NIT4 contains two segments which are extremely rich in glutamine; in contrast, NIRA lacks any such glutamine-rich regions and instead is more acidic in this part of the protein. It seems possible that NIT4 and NIRA bind DNA in a very similar fashion but may employ different mechanisms to activate structural gene transcription. We constructed a hybrid gene which encodes a protein that is identical to NIT4 for its entire amino-terminal half and is NIRA throughout its carboxy-terminal half; this mosaic gene functions in N. crassa. If the carboxy-terminal portions of these proteins are required for gene activation, as seems likely, the transcriptional apparatus of N. crassa is able to respond to the distinct Aspergillus NIRA protein carboxy-terminal domain.

The nirA gene contains four introns, whereas nit4 has but a single intron, which occurs in exactly the same position as the fourth intron of nirA, both interrupting a conserved histidine CAC codon (CA-intron-C) and possessing related 5' and 3' splice sites. It is interesting to note that the 3' end of the nit4-nirA hybrid gene includes one intron of the nirA gene; obviously, Neurospora cells are able to recognize and correctly splice out this intron of Aspergillus origin.

NIT2, the major regulatory protein of the nitrogen circuit of N. crassa, binds to specific core elements, TATCTA, of the nit-3 and alc structural gene promoters (21). Neither NIT2 nor NIT4 alone elicits any detectable transcription of the nit-3 gene; however, when both of these positive-acting regulatory proteins are present, nit-3 is turned on to a high level of expression (18). It seems probable that the NIT4 protein also binds to specific recognition elements in the 5' promoter region of nit-3. NIT2 and NIT4 may each contribute certain activation domains, e.g., acidic regions of NIT2 and glutamine-rich regions of NIT4, which only in combination strongly enhance transcription of nit-3.

ACKNOWLEDGMENTS

This research was supported by Public Health Service grant GM-23367 from the National Institutes of Health. Oligonucleotides used in this work were prepared by Richard Swenson and Jane Tolley, Ohio State Biochemical Instrument Center. We thank Gertraud Burger, Claudio Scazzocchio, and B. Franz Lang for sharing the nucleotide sequence of the nirA gene prior to its publication, and we thank Elizabeth Oakley for the gift of Aspergillus genomic DNA.

REFERENCES

1. Aviv, H., and P. Leder. 1972. Purification of biologically active globin messenger RNA by chromatography on oligothymidylic acid-cellulose. Proc. Natl. Acad. Sci. USA 69:1408-1412.
2. Baum, J. A., R. Geever, and N. H. Giles. 1987. Expression of qa-1F activator protein: identification of upstream binding sites in the qa gene cluster and localization of the DNA binding domain. Mol. Cell. Biol. 7:1256-1266.
3. Beri, R. K., H. Whittington, C. F. Roberts, and A. R. Hawkins. 1987. Isolation and characterization of the positively acting regulatory gene QUTA from Aspergillus. Nucleic Acids Res. 15:7991-8001.
4. Birnboim, H. C., and J. Doly. 1979. A rapid alkaline extraction procedure for screening recombinant plasmid DNA. Nucleic Acids Res. 7:1513-1523.
5. Bricmont, P. A., J. R. Daugherty, and T. G. Cooper. 1991. The DAL81 gene product is required for induced expression of two differently regulated nitrogen catabolic genes in Saccharomyces cerevisiae. Mol. Cell. Biol. 11:1161-1166.
6. Burger, G., J. Tilburn, and C. Scazzocchio. 1991. Molecular cloning and functional characterization of the pathway-specific regulatory gene nirA, which controls nitrate assimilation in Aspergillus nidulans. Mol. Cell. Biol. 11:795-802.
7. Chasman, D. I., and R. D. Kornberg. 1990. GAL4 protein: purification, association with GAL80 protein, and conserved domain structure. Mol. Cell. Biol. 10:2916-2923.
8. Corton, J. C., and S. A. Johnston. 1989. Altering DNA-binding specificity of GAL4 requires sequences adjacent to the zinc finger. Nature (London) 340:724-727.
9. Courey, A. J., D. A. Holtzman, S. P. Jackson, and R. Tjian. 1989. Synergistic activation by the glutamine-rich domains of human transcription factor Sp1. Cell 59:827-836.
10. Courey, A. J., and R. Tjian. 1988. Analysis of Sp1 in vivo reveals multiple transcriptional domains, including a novel glutamine-rich activation motif. Cell 55:887-898.
11. Davis, R. H., and F. deSerres. 1970. Genetic and microbial research techniques for Neurospora crassa. Methods Enzymol. 17A:79-143.
12. Dunn-Coleman, N. S., A. B. Tomsett, and R. H. Garrett. 1981. The regulation of nitrate assimilation in Neurospora crassa: biochemical analysis of the nmr-1 mutants. Mol. Gen. Genet. 182:234-239.
13. Feinberg, A. P., and B. Vogelstein. 1983. A technique for radiolabeling DNA restriction endonuclease fragments to high specific activity. Anal. Biochem. 132:6-13.
14. Friden, P., and P. Schimmel. 1987. LEU3 of Saccharomyces cerevisiae encodes a factor for control of RNA levels of a group of leucine-specific genes. Mol. Cell. Biol. 7:2708-2717.
15. Friedman, D. I., and M. J. Imperiale. 1987. RNA 3' end formation in the control of gene expression. Annu. Rev. Genet. 21:453-488.
16. Fu, Y.-H., J. Y. Kneesi, and G. A. Marzluf. 1989. Isolation of nit4, the minor nitrogen regulatory gene which mediates nitrate induction in Neurospora crassa. J. Bacteriol. 171:4067-4070.
17. Fu, Y.-H., and G. A. Marzluf. 1987. Characterization of nit-2, the major nitrogen regulatory gene of Neurospora crassa. Mol. Cell. Biol. 7:1691-1696.
18. Fu, Y.-H., and G. A. Marzluf. 1987. Molecular cloning and analysis of the regulation of nit-3, the structural gene for nitrate reductase in Neurospora crassa. Proc. Natl. Acad. Sci. USA 84:8243-8247.
19. Fu, Y.-H., and G. A. Marzluf. 1988. Metabolic control and autogenous regulation of nit-3, the nitrate reductase structural gene of Neurospora crassa. J. Bacteriol. 170:657-661.
20. Fu, Y.-H., and G. A. Marzluf. 1990. nit-2, the major nitrogen regulatory gene of Neurospora crassa, encodes a protein with a putative zinc finger DNA-binding domain. Mol. Cell. Biol. 10:1056-1065.
21. Fu, Y.-H., and G. A. Marzluf. 1990. nit-2, the major positive-acting nitrogen regulatory gene of Neurospora crassa, encodes a sequence-specific DNA-binding protein. Proc. Natl. Acad. Sci. USA 87:5331-5335.
22. Fu, Y.-H., J. L. Young, and G. A. Marzluf. 1988. Molecular cloning and characterization of a negative-acting nitrogen regulatory gene of Neurospora crassa. Mol. Gen. Genet. 214:74-79.
23. Giniger, E., S. M. Varnum, and M. Ptashne. 1985. Specific DNA binding of GAL4, a positive regulatory protein of yeast. Cell 40:767-774.
23a. Hawker, K. L., P. Montague, G. A. Marzluf, and J. R. Kinghorn. 1991. Heterologous expression and regulation of the Neurospora crassa nit4 pathway-specific regulatory gene for nitrate assimilation in Aspergillus nidulans. Gene 100:237-240.
24. Henikoff, S. 1984. Unidirectional digestion with exonuclease III creates targeted breakpoints for DNA sequencing. Gene 28:351-359.
25. Johnston, M. 1987. A model fungal gene regulatory mechanism: the GAL genes of Saccharomyces cerevisiae. Microbiol. Rev. 51:458-476.
26. Kammerer, B., A. Guyonvarch, and J. C. Hubert. 1984. Yeast regulatory gene PPR1. I. Nucleotide sequence, restriction map and codon usage. J. Mol. Biol. 180:239-250.
27. Kao, C. C., P. M. Lieberman, M. C. Schmidt, Q. Zhou, R. Pei, and A. J. Berk. 1990. Cloning of a transcriptionally active human TATA binding factor. Science 248:1646-1649.
28. Kari, P., K. S. Kim, S. Kogan, and L. Guarente. 1989. Functional dissection and sequence of yeast HAP1 activator. Cell 56:291-301.
29. Kim, J., and C. A. Michels. 1988. The MAL63 gene of Saccharomyces encodes a cysteine-zinc finger protein. Curr. Genet. 14:319-323.
30. Kunkel, T. A. 1985. Rapid and efficient site-specific mutagenesis without phenotypic selection. Proc. Natl. Acad. Sci. USA 82:488-492.
31. Ma, J., and M. Ptashne. 1987. Deletion analysis of GAL4 defines two transcriptional activating segments. Cell 48:847-853.
32. Marzluf, G. A. 1981. Regulation of nitrogen metabolism and gene expression in fungi. Microbiol. Rev. 45:437-461.

33. Messenguy, F., E. Dubois, and F. Descamps. 1986. Nucleotide sequence of the ARGRII regulatory gene and amino acid sequence homologies between ARGRII, PPRI and GAL4 regulatory proteins. J. Biochem. 157:77-81.
34. Mitchell, P. J., and R. Tjian. 1989. Transcriptional regulation in mammalian cells by sequence-specific DNA binding proteins. Science 245:371-378.
35. Orbach, M. J., E. B. Porro, and C. Yanofsky. 1986. Cloning and characterization of the gene for β-tubulin from a benomyl-resistant mutant of Neurospora crassa and its use as a dominant selectable marker. Mol. Cell. Biol. 6:2452-2461.
36. Pan, T., and J. E. Coleman. 1989. The DNA binding domain of GAL4 forms a binuclear metal ion complex. Biochemistry 29:3023-3029.
37. Pan, T., and J. E. Coleman. 1990. GAL4 transcription factor is not a "zinc finger" but forms a Zn(II)2Cys6 binuclear cluster. Proc. Natl. Acad. Sci. USA 87:2077-2081.
38. Reinert, W. R., V. B. Patel, and N. H. Giles. 1981. Genetic regulation of the qa gene cluster in Neurospora crassa: induction of qa messenger ribonucleic acid and dependency on qa-1 function. Mol. Cell. Biol. 1:829-835.
39. Sanger, F., S. Nicklen, and A. R. Coulson. 1977. DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74:5463-5467.
40. Schultz, J., L. Marshall-Carlson, and M. Carlson. 1990. The N-terminal TPR region is the functional domain of SSN6, a nuclear phosphoprotein of Saccharomyces cerevisiae. Mol. Cell. Biol. 10:4744-4756.
41. Sorger, G. J., and N. H. Giles. 1965. Genetic control of nitrate reductase in Neurospora crassa. Genetics 52:777-788.
42. Tomsett, A. B., and R. H. Garrett. 1981. Biochemical analysis of mutants defective in nitrate assimilation in Neurospora crassa: evidence for autogenous control by nitrate reductase. Mol. Gen. Genet. 184:183-190.
43. Vollmer, S. J., and C. Yanofsky. 1986. Efficient cloning of genes of Neurospora crassa. Proc. Natl. Acad. Sci. USA 83:4869-4873.
44. Wharton, K. A., B. Yedvobnick, V. G. Finnerty, and S. Artavanis-Tsakonas. 1985. Opa: a novel family of transcribed repeats shared by the Notch locus and other developmentally regulated loci in D. melanogaster. Cell 40:55-62.
45. Witte, M. M., and R. C. Dickson. 1988. Cysteine residues in the zinc finger and amino acids adjacent to the finger are necessary for DNA binding by the Lac9 regulatory protein of Kluyveromyces lactis. Mol. Cell. Biol. 8:3726-3733.
46. Witte, M. M., and R. C. Dickson. 1990. The C6 zinc finger and adjacent amino acids determine DNA-binding specificity and affinity in the yeast activator protein LAC9 and PPR1. Mol. Cell. Biol. 10:5128-5137.
