J. Mol. Biol. (1990) 212, 135-142

Rubredoxin Reductase of Pseudomonas oleovorans

Structural Relationship to Other Flavoprotein Oxidoreductases Based on One NAD and Two FAD Fingerprints

Gerrit Eggink¹†, Henk Engel¹, Gert Vriend²‡, Peter Terpstra¹ and Bernard Witholt¹

¹Department of Biochemistry, Groningen Biotechnology Center, University of Groningen, and ²Department of Chemical Physics, University of Groningen, Nijenborgh 16, 9747 AG Groningen, The Netherlands

(Received 4 August 1989; accepted 12 October 1989)

The oxidation of alkanes to alkanols by Pseudomonas oleovorans involves a three-component enzyme system: alkane hydroxylase, rubredoxin and rubredoxin reductase. Alkane hydroxylase and rubredoxin are encoded by the alkBFGHJKL operon, while previous studies indicated that rubredoxin reductase is most likely encoded on the second alk cluster: the alkST operon. In this study we show that alkT encodes the 41 × 10³ Mr rubredoxin reductase, on the basis of a comparison of the expected amino acid composition of AlkT and the previously established amino acid composition of the purified rubredoxin reductase. The alkT sequence revealed significant similarities between AlkT and several NAD(P)H and FAD-containing reductases and dehydrogenases. All of these enzymes contain two ADP binding sites, which can be recognized by a common βαβ-fold or fingerprint, derived from known structures of cofactor binding enzymes. By means of this amino acid fingerprint we were able to determine that one ADP binding site in rubredoxin reductase (AlkT) is located at the N terminus and is involved in FAD binding, while the second site is located in the middle of the sequence and is involved in the binding of NAD or NADP. In addition, we derived from the sequences of FAD binding reductases a second amino acid fingerprint for FAD binding, and we used this fingerprint to identify a third amino acid sequence in AlkT near the carboxy terminus for binding of the flavin moiety of FAD. On the basis of the known architecture and relative spatial orientations of the NAD and FAD binding sites in related dehydrogenases, a model for part of the tertiary structure of AlkT was developed.

1. Introduction

Pseudomonas oleovorans carries the catabolic plasmid OCT, which enables it to metabolize C6-C12 n-alkanes as a sole source of energy and carbon (Chakrabarty et al., 1973). The enzyme system involved in the first step of alkane oxidation has been isolated and characterized by Coon and his coworkers. n-Alkanes are oxidized by the three-component alkane hydroxylase complex in an NADH-dependent reaction. The first component of the enzyme system is the membrane-bound 41 × 10³ Mr alkane hydroxylase, which is a monooxygenase (McKenna & Coon, 1970; Ruettinger et al., 1977). The 19 × 10³ Mr rubredoxin transfers electrons to the alkane hydroxylase (Peterson et al., 1967; Peterson & Coon, 1968). The third component is the cytoplasmic rubredoxin reductase, a 55 × 10³ Mr FAD-containing enzyme, which reduces rubredoxin at the expense of NADH (Ueda et al., 1972).

The three components are encoded by the alk regulon, which is localized on the OCT plasmid and consists of two distinct gene clusters: the alkBFGHJKL operon and the alkR locus (Fennewald et al., 1979; Owen et al., 1984; Eggink et al., 1987b). The 7.5 kb§ alkBFGHJKL operon codes for alkane hydroxylase (alkB), rubredoxin 1 (alkF),

† Author to whom all correspondence should be addressed.

‡ Present address: European Molecular Biology Laboratory (EMBL), Meyerhofstr. 1, D-6900 Heidelberg, Germany.

§ Abbreviations used: kb, 10³ bases or base-pairs; GRMAN, human glutathione reductase.

0022-2836/90/020135-08 $03.00/0 © 1990 Academic Press Limited


The numbers in the boxes indicate the apparent molecular masses (× 10⁻³) of the alk proteins determined by SDS/polyacrylamide gel electrophoresis.

rubredoxin 2 (alkG), aldehyde dehydrogenase (alkH) and alkanol dehydrogenase (alkJ and possibly alkK) (Eggink et al., 1987a; Kok et al., 1989a,b; see Fig. 1). Rubredoxin 2 is the functional electron carrier, while no clear function has been ascribed to rubredoxin 1. The alkJ and alkK cistrons encode 58 × 10³ Mr and 59 × 10³ Mr proteins, respectively. From genetic data it was concluded that AlkJ is involved in alkanol dehydrogenation, but the role of AlkK remains unclear. Neither gene product is part of the alkane hydroxylase complex, since both alkJ and alkK can be deleted from the operon without affecting the alkane hydroxylase activity (Eggink et al., 1987b). Thus, unlike alkane hydroxylase and rubredoxin, rubredoxin reductase is not encoded on the alkBFGHJKL operon.

In our previous study on the genetic organization of the alkR locus we concluded that rubredoxin reductase might be encoded by this second gene cluster of the alk system (Eggink et al., 1988). The alkR locus was found to code for two proteins: a 99 × 10³ Mr regulatory protein (alkS), which positively regulates the expression of the alkBFGHJKL operon, and a 48 × 10³ Mr protein (alkT) (Eggink et al., 1988; see Fig. 1). The latter is not involved in regulation of alkBFGHJKL expression, but it is essential for alkane hydroxylase activity. In this paper we present evidence that alkT does in fact encode rubredoxin reductase. Comparison of the nucleotide sequence of alkT with known sequences shows similarities to other FAD-containing reductases, especially with the putidaredoxin reductase of the Pseudomonas putida cytochrome P-450cam system.

2. Materials and Methods

(a) Recombinant DNA techniques

Plasmid DNA was isolated by the alkaline lysis method described by Birnboim & Doly (1979). Restriction endonucleases and bacteriophage T4 DNA ligase were obtained from Boehringer-Mannheim GmbH (Mannheim, West Germany) and Bethesda Research Labs GmbH (Neu Isenburg, West Germany) and used under conditions recommended by the suppliers. Analysis of plasmid DNA by restriction digestions followed by agarose gel electrophoresis was carried out as described by Maniatis et al. (1982). Nucleotide sequencing was done with the dideoxy chain termination method described by Sanger et al. (1977), with bacteriophages M13mp18 and M13mp19 as vectors for HaeIII, PvuII, BglII, HpaI and SalI-generated alk DNA fragments. pGEc228 (Eggink et al., 1988) was used as a source of alkR DNA. The sequence strategy is shown in Fig. 2. Both strands of the coding region of alkT were completely sequenced.

Figure 2. Sequencing strategy for the alkT coding region. The thin line on top of the Figure represents the area coding for the 48 × 10³ Mr protein AlkT as determined by analysis in minicells (Eggink et al., 1988). The boxes indicate the open reading frames found. The arrows in the lower part of the Figure indicate the direction and extent of sequencing from each restriction site. Only restriction sites used for sequencing are shown. BglII (B), HaeIII (H), HpaI (Hp), PvuII (P), SalI (S).

(b) Calculations

The nucleotide and amino acid sequences were analyzed on the VAX 11/750 using the NAQ and PSQ data-base search programs from the Protein Identification Resource System (Dayhoff et al., 1983), the Swiss-Prot Protein Data Base (EMBL, Heidelberg) release 8, the FASTP sequence alignment program (Lipman & Pearson, 1985) and the alignment program described by Argos (1987).
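Sequence similarities in this paper are reported both as an aligned score and as an S.D. score above the mean of scores obtained with scrambled sequences (see Table 2 and Figure 4). The idea behind such a shuffle test can be sketched as below. This is not FASTP or RDF: the scoring function is a deliberately crude stand-in (best ungapped diagonal, counting identities only), and every name in the block is an illustrative assumption.

```python
import random
import statistics


def best_diagonal_identities(a, b):
    """Highest number of identical residues over all ungapped offsets of b against a."""
    best = 0
    for offset in range(-(len(b) - 1), len(a)):
        matches = sum(
            1
            for i, residue in enumerate(b)
            if 0 <= i + offset < len(a) and a[i + offset] == residue
        )
        best = max(best, matches)
    return best


def sd_score(query, target, runs=200, seed=0):
    """Observed score expressed in standard deviations above the mean score
    obtained when the target sequence is scrambled `runs` times."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = best_diagonal_identities(query, target)
    residues = list(target)
    random_scores = []
    for _ in range(runs):
        rng.shuffle(residues)  # same composition, random order
        random_scores.append(best_diagonal_identities(query, "".join(residues)))
    mean = statistics.mean(random_scores)
    sd = statistics.pstdev(random_scores)
    if sd == 0:
        return float("inf") if observed > mean else 0.0
    return (observed - mean) / sd


# Amino-terminal AlkT stretch compared with itself: far above the shuffled background.
peptide = "IVVVGAGTAGVNAAFWL"
print(sd_score(peptide, peptide) > 3)
```

Scrambling keeps the amino acid composition fixed, so a high S.D. score reflects residue order rather than composition alone.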

3. Results and Discussion

(a) Nucleotide sequence of alkT

The alkT cistron lies downstream from the regulatory gene alkS on a 4.9 kb SalI fragment. Both cistrons were previously identified in minicell experiments using a set of overlapping subclones. alkT was identified as a cistron encoding a 48 × 10³ Mr protein. The position of alkT, as indicated in Fig. 2, was also deduced from these minicell experiments (Eggink et al., 1988). DNA sequencing of the BglII-SalI fragment was carried out by the dideoxy nucleotide chain termination method (Sanger et al., 1977). The sequencing strategy used is depicted in Figure 2. The complete nucleotide sequence of the 2.1 kb BglII-SalI fragment is shown in Figure 3. Two large open reading frames were identified. The first open reading frame (ORF1) is the carboxy-terminal part of a larger open reading frame, which encodes the 99 × 10³ Mr regulatory protein AlkS. The second open reading frame, starting 52 base-pairs downstream from ORF1, codes for a 385 amino acid residue polypeptide with a molecular weight of 41,086. The position and length of this open reading frame are in very good agreement with the position of the alkT cistron on the 4.9 kb SalI fragment as deduced from minicell experiments (Eggink et al., 1988).
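The step from an open reading frame to its deduced polypeptide can be sketched as follows. This is a minimal illustration and not the software used in the study: the function name `translate_orf` is hypothetical, and the demo fragment is the first ten codons of alkT as far as they can be read from Figure 3, with a stop codon appended for the example.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_orf(dna, start=0):
    """Translate from `start` up to the first stop codon or the end of the sequence."""
    dna = dna.upper()
    protein = []
    for i in range(start, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon ends the reading frame
            break
        protein.append(aa)
    return "".join(protein)


# First ten codons of alkT (assumed reading of Figure 3) plus an added stop codon.
fragment = "ATGGCAATCGTTGTTGTTGGCGCTGGTACA" + "TAG"
print(translate_orf(fragment))  # MAIVVVGAGT
```

The translated stretch already contains the Gly-x-Gly-x-x-Gly run that recurs in the amino-terminal fingerprint discussed later in the paper.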

Figure 3. Nucleotide sequence of the alkT coding region and the 3′ region of alkS. Nucleotides are numbered from the BglII site on the top. The underlined nucleotides refer to a putative Shine & Dalgarno (1975) sequence.

At seven base-pairs upstream from the ATG codon, a sequence (GGAGA) was found that has homology to the ribosome binding sites of other Pseudomonas genes (Deretic et al., 1987), including those found for cistrons of the P. oleovorans alkBAC operon (Kok et al., 1989a,b). The 53% A+T content of the coding region of alkT was identical with that found for the alkBFGH genes (Kok et al., 1989b). The low G+C content (47%) of these genes is in contrast with the high G+C content of the P. putida chromosome (60 to 63%) (Mandel, 1966), genes of the P. putida TOL plasmid (Nakai et al., 1983; Inouye et al., 1986, 1988) and genes on the OCT-related IncP-2 plasmid CAM (Unger et al., 1986). The codon usage of alkT, with preference for A and T in the third base position, is similar to the alkBFGH genes (Kok et al., 1989b) and differs strongly from the codon composition of other Pseudomonas genes that show preferential usage of G and C in the wobble base position.

Table 1
Comparison of amino acid composition of rubredoxin reductase, based on protein analysis (Ueda et al., 1972), and of AlkT, based on nucleotide sequence analysis

Amino acid    Rubredoxin reductase (%)†    AlkT (%)
Asn (Asx)      7.2                          3.9
Asp                                         3.1
Thr            5.5                          6.0
Ser            6.7                          6.7
Gln (Glx)     10.1                          2.3
Glu                                         7.8
Gly            8.6                          8.0
Ala           11.9                         11.7
Cys            1.6                          1.6
Val            9.4                          9.6
Met            1.6                          1.8
Ile            6.7                          7.0
Leu           10.2                          9.6
Tyr            2.5                          2.6
Phe            2.0                          1.8
Lys            4.9                          4.9
His            0.4                          0.8
Arg            5.3                          5.4
Trp            0.6                          0.3
Pro            4.1                          4.4

† Asn = Asx and Gln = Glx.

(b) Function of AlkT in alkane hydroxylation

In a previous paper (Eggink et al., 1988) we showed that alkT is absolutely required for alkane utilization in P. oleovorans but not involved in transcriptional activation of alkBFGHJKL expression. Alkane hydroxylation is the only essential plasmid-encoded biochemical activity in alkane utilization; all other necessary functions are also encoded by the chromosome. We concluded therefore that alkT must code for a component of the alkane hydroxylase system. Since alkane hydroxylase (alkB) and rubredoxin (alkG) have already been identified on the 4.5 kb proximal part of the alkBFGHJKL operon (Kok et al., 1989a,b), which in combination with the alkR locus is sufficient for alkane hydroxylation, it seems likely that alkT codes for rubredoxin reductase.

Table 2
Similarity of the AlkT amino acid sequence to oxidoreductases containing FAD and NAD(P) as cofactor

Match sequence                                                                      Aligned score   S.D. score
Dihydrolipoamide dehydrogenase (Azotobacter vinelandii) (Westphal & de Kok, 1988)   140             9.58
Mercuric reductase (Neurospora crassa) (Barrineau et al., 1984)                     132             9.65
Mercuric reductase (Pseudomonas aeruginosa) (Brown et al., 1983)                    131             9.14
Mercuric reductase (Shigella flexneri) (Misra et al., 1985)                         131             3024
Dihydrolipoamide dehydrogenase (Escherichia coli) (Stephens et al., 1984)           127             9.76
Dihydrolipoamide dehydrogenase (pig) (Otulakowski & Robinson, 1987)                 126             9.12
Dihydrolipoamide dehydrogenase (human) (Otulakowski & Robinson, 1987)               125             8.94
Dihydrolipoamide dehydrogenase (P. putida) (Burns et al., 1989)                     108             6.30
Glutathione reductase (E. coli) (Greer & Perham, 1986)                              91              6.35
NADH dehydrogenase (E. coli) (Young et al., 1981)                                   17              4.70
Glutathione reductase (human) (Krauth-Siegel et al., 1982)                          62              2.92
Mercuric reductase (Staphylococcus aureus) (Laddaga et al., 1987)                   51              1.42
Thioredoxin reductase (E. coli) (Russel & Model, 1988)                              41              1.12
p-Hydroxybenzoate hydroxylase (P. fluorescens) (Wijnands et al., 1986)              36              -0.94

Alignments were calculated using the programs FASTP and RDF (Lipman & Pearson, 1985) with ktup = 1 and 200 random runs. Only the aligned scores and the S.D. score above the random mean are tabulated. In all cases the test sequence is the amino acid sequence of AlkT and the listed sequence has been scrambled.

Figure 4. Homology between the amino-terminal part of putidaredoxin reductase (CamA) and AlkT. The published sequence of putidaredoxin reductase (Unger et al., 1986) was aligned with the sequence of AlkT using the programs FASTP and RDF (Lipman & Pearson, 1985). This gives 51% identity in a 45 amino acid residue overlap. Aligned score is 110 (ktup = 1), S.D. score above mean is 15.4 (50 random runs). The single dots indicate conservative replacements.

The nucleotide sequence of alkT enabled us to confirm this conclusion. First, the amino acid composition deduced from the sequence was compared with the amino acid composition determined by Ueda et al. (1972) for the purified P. oleovorans rubredoxin reductase. Although the coding capacity of alkT (41 × 10³ Mr) is considerably lower than the apparent molecular weight found for the purified rubredoxin reductase (55 × 10³ Mr), the similarity between the amino acid compositions is striking and highly significant (Table 1). As a measure of the similarity, the sum of the squared differences of the amino acid percentages was calculated. For the comparison of AlkT and rubredoxin reductase the sum is 1.35. Among all of the other proteins in the Swiss-Prot Protein Sequence Data Base, the protein that comes closest to AlkT in amino acid composition is rat pyruvate kinase, with a similarity measure of 22.
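The similarity measure used here is the sum, over amino acid classes, of the squared difference between two percentage compositions. The sketch below illustrates it on a handful of entries taken from Table 1. The function names are illustrative assumptions, and because only a subset of residues is included, the result is not expected to reproduce the published value of 1.35.

```python
def pool_amides(composition):
    """Merge Asn+Asp into Asx and Gln+Glu into Glx, matching the pooled
    classes reported for the purified protein in Table 1."""
    pooled = dict(composition)
    for pooled_name, members in (("Asx", ("Asn", "Asp")), ("Glx", ("Gln", "Glu"))):
        present = [m for m in members if m in pooled]
        if present:
            pooled[pooled_name] = pooled.get(pooled_name, 0.0) + sum(
                pooled.pop(m) for m in present
            )
    return pooled


def composition_distance(comp_a, comp_b):
    """Sum of squared percentage differences over the classes both inputs share."""
    a, b = pool_amides(comp_a), pool_amides(comp_b)
    shared = a.keys() & b.keys()
    return sum((a[k] - b[k]) ** 2 for k in shared)


# Subset of Table 1 (percentages); the remaining residues are omitted for brevity.
protein_analysis = {"Asx": 7.2, "Thr": 5.5, "Gly": 8.6, "Leu": 10.2}
alkt_deduced = {"Asn": 3.9, "Asp": 3.1, "Thr": 6.0, "Gly": 8.0, "Leu": 9.6}

print(round(composition_distance(protein_analysis, alkt_deduced), 2))  # 1.01
```

Pooling is needed because the sequence-derived composition lists Asn and Asp (and Gln and Glu) separately, while the protein analysis reports only Asx and Glx.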

(c) Relationship of AlkT to other dehydrogenases and reductases

The amino acid sequence of AlkT was compared to all of the sequences in the Swiss-Prot Data Base (release 8) using the program FASTP. The results are summarized in Table 2. A significant similarity was found with several reductases and dehydrogenases. The sequence of the N-terminal 53 amino acid residues of putidaredoxin reductase (Unger et al., 1986) is homologous to AlkT (Fig. 4).

All reductases in Table 2, including the enzymes that show no significant overall similarity to AlkT, belong to a class of enzymes that contain two ADP binding sites: one is located at the N terminus and is involved in FAD binding, while the second site is located in the middle of the sequence and is involved in the binding of NAD or NADP. Rubredoxin reductase fits this pattern, in that it also contains one molecule of FAD and requires NADH as a cofactor (Ueda et al., 1972). The precise location of these sites can be determined on the basis of a consensus sequence (fingerprint) involved in ADP binding (Wierenga et al., 1986). This amino acid fingerprint is derived from known structures of cofactor binding enzymes and predicts whether a particular amino acid sequence can fold into a βαβ-unit with binding properties for the ADP moiety of NAD(P) or FAD. It consists of 11 rules for amino acids that should occur at conserved positions in an amino acid stretch (see Fig. 5). The length of the peptides found to date varies from 29 to 31 residues, because the loop between the α-helix and the second β-strand is variable. The three glycine residues as well as the acidic residue at the end of the fingerprint are strictly conserved in NAD or FAD binding βαβ-units.

Wierenga et al. (1986) concluded from their survey that all fingerprints with a score of 11, and most likely all fingerprints with a score of 10 as well, fold as ADP binding βαβ-units. A score of 9 out of 11 gives less certainty about the existence of an ADP binding βαβ-fold. However, if the two deviating amino acid residues are not replaced by charged residues, a peptide with a score of 9 could be a candidate for an ADP binding βαβ-folding unit. In all of the sequences from the enzymes listed in Table 2 both fingerprints can be identified, except in p-hydroxybenzoate hydroxylase, where the NADP binding site is not clear from the primary sequence. The putative second βαβ-fold has also not been identified in the three-dimensional structure (Schreuder et al., 1988).

We identified two potential ADP binding sites in the sequence of AlkT (see Fig. 5). One is located internally (residues 144 to 172) with a score of 11 out of 11. The second potential βαβ-fold is located at the amino terminus (residues 2 to 34) with a score of 9 out of 11. The two deviating amino acids are replaced by uncharged residues. Given the location of the NAD(P) and FAD sites in the reductases for which structures have been determined (Karplus & Schulz, 1987; Schierbeek et al., 1989), we conclude that AlkT probably binds NADH at the internal βαβ-fold, while FAD is bound to the amino-terminal βαβ-fold. The fact that AlkT shows homology to a number of reductases and dehydrogenases (Table 2) and that it contains sequences characteristic for other FAD and NADH binding enzymes reinforces the identification of AlkT as rubredoxin reductase.

Figure 5. Potential ADP binding folds in rubredoxin reductase (AlkT) and putidaredoxin reductase (CamA). AlkT sequences and the sequence of CamA were aligned with a consensus sequence that can fold in a βαβ-structure with ADP binding properties (Wierenga et al., 1986). The loop represents an amino acid stretch with variable length. The consensus symbols stand for the following residue sets: K, R, H, S, T, Q or N; A, I, L, V, M or C; G; D or E. Asterisks indicate agreement with the consensus sequence.

(d) The FAD binding site: a second fingerprint

Table 3 shows that the alignment of the enzymes that bind FAD and NAD(P) reveals a third conserved region in addition to the two ADP binding βαβ-fingerprints proposed by Wierenga et al. (1986). This short conserved region was first identified by Russel & Model (1988), who compared the sequence of thioredoxin reductase with some disulfide oxidoreductases. They did not, however, ascribe a specific function to this homologous region. When we used the fingerprint proposed in Table 3 to scan the Swiss-Prot Data Base (release 8), all the FAD binding reductases (Table 3) and no other proteins were extracted. Therefore this fingerprint alone is very characteristic of the oxidoreductases that bind FAD and can be used as a predictor for the function of newly sequenced proteins.
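A database scan of this kind can be sketched with a regular expression. The pattern below is one reading of the consensus in Table 3: T or M, four arbitrary residues, then a hydrophobic and aromatic block ending in Gly-Asp. The exact residue sets, the function name and the scrambled control peptide are assumptions made for illustration, not the authors' search procedure.

```python
import re

# Assumed reading of the Table 3 fingerprint:
# [T/M] x x x x [I/V/A/L] [Y/W/F] [I/V/A/L] [I/V/A] G D
FAD_FINGERPRINT = r"[TM].{4}[IVAL][YWF][IVAL][IVA]GD"


def find_fad_fingerprints(sequence):
    """Return (1-based start, matched peptide) for every occurrence,
    using a lookahead so that overlapping matches are also reported."""
    scanner = re.compile("(?=(" + FAD_FINGERPRINT + "))")
    return [(m.start() + 1, m.group(1)) for m in scanner.finditer(sequence)]


# Peptides from Table 3, plus a scrambled control that is purely hypothetical.
examples = {
    "thioredoxin reductase (E. coli)": "TSIPGVFAAGD",
    "glutathione reductase (human)": "TNVKGIYAVGD",
    "p-hydroxybenzoate hydroxylase (P. fluorescens)": "MQHGRLFLAGD",
    "scrambled control (hypothetical)": "GDTSIPAVGFA",
}
for name, peptide in examples.items():
    print(name, find_fad_fingerprints(peptide))
```

The three published peptides match over their full length, while the scrambled control yields no hit. Scanning a full-length sequence would report the residue number at which each match starts.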

The structural basis for this conserved region is clear from the three-dimensional structure of three of the FAD binding enzymes: p-hydroxybenzoate hydroxylase (Schreuder et al., 1988), human glutathione reductase (Karplus & Schulz, 1987) and lipoamide dehydrogenase from Azotobacter vinelandii (Schierbeek et al., 1989). For all three structures it is known that the conserved Asp (Fig. 6) forms hydrogen bonds with the O-3 group of the ribityl chain of the flavin moiety of FAD. Inspection of the structure of glutathione reductase and lipoamide dehydrogenase shows that the sequence from Thr to Asp forms a beta-sheet that turns at the Asp. Superposition of lipoamide dehydrogenase and human glutathione reductase (GRMAN) on the basis of this stretch of 11 amino acids gives a very good overall three-dimensional alignment of both enzymes. The glycine (Table 3, position 330, GRMAN) is conserved because of a turn in the Cα chain and to accommodate the PO4 group of FAD. The hydrophobic residues at positions 326, 327 and 329 (Table 3, GRMAN) are conserved because of the hydrophobic environment. The aromatic amino acid (Table 3, position 327, GRMAN) is at the outside of the molecule and could function as a plug to shield the FAD from the aqueous environment. The threonine at position 321 (Table 3, GRMAN) can form hydrogen bonds with other parts of the molecule.

Because the NAD and FAD binding regions are so strongly conserved in the primary sequence, it is likely that all oxidoreductases from this class will have similar three-dimensional structures in these regions. Therefore, glutathione reductase or lipoamide dehydrogenase could be used as a promising template for modeling the structure of similar oxidoreductases. On the basis of this template, rubredoxin reductase is expected to consist of three spatially oriented amino acid segments as illustrated in Figure 6. Segment A (amino acids 2 to 34), the FAD binding βαβ-fold; segment B (amino acids 144 to 172), the NAD binding βαβ-fold; segment C (amino acids 265 to 275), the beta-sheet binding the flavin moiety of FAD. These segments are linked by a 110 amino acid residue sequence from segment A to B, an 87 residue sequence from segment B to C, and a 110 residue carboxy-terminal sequence from amino acid 275 to 385.

Figure 6. Model for the relative position of the NAD and FAD binding regions in rubredoxin reductase. A, The amino-terminal βαβ-fold binding FAD. B, NAD binding βαβ-fold. C, The beta-sheet pointing to FAD. In this sheet the side-chain of Asp331 is shown pointing to the O-3 of the FAD ribityl part (see also Table 3). This Figure has been produced from an alignment of rubredoxin reductase and human glutathione reductase, and the co-ordinates of human glutathione reductase (Brookhaven Protein Databank code 3GRS) with the program WHATIF (G. Vriend, unpublished results). The Cα chain is represented by ribbons.

Table 3
Amino acid sequence involved in FAD binding

Protein                                           Sequence       Location
                                                            ↓
Amino acid fingerprint                            TxxxxIYIIGD
                                                  M    VWVV
                                                       AFAA
                                                       L L
Rubredoxin reductase (P. oleovorans)              TSDTSIYAIGD    265-275
Thioredoxin reductase (E. coli)                   TSIPGVFAAGD    276-286
Dihydrolipoamide dehydrogenase (E. coli)          TNVPHIFAIGD    302-312
Dihydrolipoamide dehydrogenase* (A. vinelandii)   TSVPGVYAIGD    308-318
Dihydrolipoamide dehydrogenase (P. putida)        TSMHNVWAIGD    296-306
Dihydrolipoamide dehydrogenase (pig)              TKIPNIYAIGD    310-320
Dihydrolipoamide dehydrogenase (human)            TKIPNIYAIGD    310-320
Glutathione reductase (E. coli)                   TNIEGIYAVGD    293-303
Glutathione reductase* (human)                    TNVKGIYAVGD    321-331
Mercuric reductase (P. aeruginosa)                TSNPNIYAAGD    393-403
Mercuric reductase (Staphylococcus aureus)        TSNNRIYAAGD    378-388
Mercuric reductase (Shigella flexneri)            TSVEHIYAAGD    396-406
Mercuric reductase (Neurospora crassa)            TSVEHIYAAGD    396-406
NADH dehydrogenase (E. coli)                      TRDPDIYAIGD    301-314
p-Hydroxybenzoate hydroxylase* (P. fluorescens)   MQHGRLFLAGD    276-286

The conserved aspartate (arrow) forms hydrogen bonds with the O-3 of the ribityl moiety of the bound FAD cofactor. This has been verified in the known 3-dimensional structures of the proteins indicated (*). For further explanation, see the text. For references, see Table 2.

(e) Rubredoxin reductase and putidaredoxin reductase

Putidaredoxin reductase is a component of the camphor 5-exohydroxylase system from P. putida. This well-characterized three-component system is analogous to alkane hydroxylase and consists of cytochrome P-450cam, putidaredoxin and the FAD-containing putidaredoxin reductase. Gunsalus (1968) has shown that putidaredoxin reductase can replace rubredoxin reductase in assays in vitro. A similarity was also found in the amino acid compositions of putidaredoxin reductase (Tsai et al., 1971) and rubredoxin reductase (Ueda et al., 1972). Both are rich in hydrophobic amino acid residues, particularly valine, leucine and isoleucine. To the extent that their sequences can be compared, they show about 50% homology (Fig. 4). Finally, the amino acid fingerprint for a possible βαβ-fold for FAD binding found at the amino-terminal part of rubredoxin reductase is also present in putidaredoxin reductase (Fig. 5). The score for this fingerprint was 10 out of 11.

An important difference between the two proteins is the localization of the corresponding genes. Since each reductase is part of a three-component enzyme system, the corresponding genes might be expected to be organized in single operons, thus allowing coordinate expression of the entire complex. This is true for the putidaredoxin reductase gene camA, which is part of the cam operon. It is not true, however, for the rubredoxin reductase gene alkT, which is not part of the alkBFGHJKL operon. Interestingly, although their localizations differ, both camA and alkT are expressed poorly compared to the other two components of their respective enzyme complexes. In the case of alkT this may be related to low promoter activity, since there is no reason for high levels of expression of the regulator gene alkS, which precedes alkT. In the case of camA, poor expression is probably due to the rare start codon GTG (Unger et al., 1986).

Thus, it appears that for both the P-450cam and the alkane hydroxylase systems, the reductases are very active compared to the hydroxylases, thereby permitting low intracellular levels of the reductases.

Birnboim, H. C. & Doly, J. (1979). Nucl. Acids Res. 7, 1513-1523. Brown, N. L., Ford, S.
