Volume 2 number 2 February 1975

Nucleic Acids Research

Primary sequence of the 16S ribosomal RNA of Escherichia coli

Chantal Ehresmann1, Patrick Stiegler1, George A. Mackie2, Robert A. Zimmermann2,3, J.P. Ebel1, and Peter Fellner1,4

1 Laboratoire de Biochimie, Institut de Biologie Moléculaire et Cellulaire du CNRS, 15, rue Descartes, F-67000 Strasbourg, France

Received 13 January 1975

ABSTRACT

Recent progress in the nucleotide sequence analysis of the 16S ribosomal RNA from E. coli is described. The sequence which has been partially or completely determined so far encompasses 1520 nucleotides, i.e. about 95% of the molecule. Possible features of the secondary structure are suggested on the basis of the nucleotide sequence, and data on sequence heterogeneities, repetitions and the location of modified nucleotides are presented. In the accompanying paper, the use of the nucleotide sequence data in studies of the ribosomal protein binding sites is described.

INTRODUCTION

Ribosomal RNA appears to play an essential role in establishing and maintaining the structure of ribosomes and, consequently, in their function1. We have undertaken the study of the primary sequence of the 16S ribosomal RNA of Escherichia coli with the hope that this knowledge will aid the understanding of the secondary and tertiary structures of the 16S RNA, of protein-RNA interactions within the ribosome, of ribosome assembly, and ultimately, of the three-dimensional structure of the ribosome itself. Recent progress has enabled us to determine the primary sequence of about 95% of this molecule. In this paper, we discuss the properties of the primary sequence and its implications for a possible secondary structure of the RNA. The relation of the primary and secondary structures of the 16S RNA to the specificity of ribosomal RNA-ribosomal protein interactions is reviewed in the accompanying paper2.

The Primary Sequence

We have previously described the elucidation of roughly 70% of the primary structure of the 16S RNA3-5. In all cases, [32P] 16S RNA (or 30S ribosomal subunits) was digested partially, the fragments were separated, and their sequences were determined by methods developed by Sanger, Brownlee and Barrell6. In many experiments, 16S RNA was incubated under conditions normally employed for ribosomal reconstitution, then chilled, and treated with T1 ribonuclease. Products of this digestion were then separated by sucrose gradient centrifugation into several peaks, which were resolved further by electrophoresis in polyacrylamide gels containing urea. Alternatively, the 30S subunit was digested with T1 ribonuclease and the resulting RNA fragments were fractionated by gel electrophoresis. These and other techniques of partial digestion are described elsewhere. Recently, we have also used ribosomal proteins S4, S20, and the combination of S6, S8, S15 and S18 to protect regions of the 16S RNA which are otherwise highly susceptible to ribonuclease attack, and hence seldom recoverable in useful quantities. The combination of these methods has enabled us to extend the sequences known so far. In addition, a systematic check of earlier results has led to the reordering of several blocks within the sequence relative to one another, although only minor revision of the sequences within these blocks was necessary (see below). Initial uncertainties in the order of a number of the larger fragments were caused by difficulties in obtaining and in satisfactorily purifying subfragments from several portions of the molecule. The primary sequence of the 16S RNA molecule so obtained, and the nomenclature of the various sections within it, are presented in Figure 1. The sequences of the T1 ribonuclease derived oligonucleotides from a large part of the 16S RNA molecule, as well as the association of several blocks of sequences within these regions, have recently been confirmed by Santer and Santer7.

The sequence of the extreme 5'-end of the 16S RNA molecule, previously incomplete, has been established largely through the analyses of fragments obtained from the digestion of intact 30S subunits. Nonetheless, a few minor ambiguities remain. The order of sections H'-H-Q' and Q-R around F has been reversed, as it is now known that the overlaps originally proposed were deduced from what appeared to be a single fragment but has since been found to be an equimolar mixture of fragments of equal length from sections H' and F. The isolation of overlapping sequences linking section I'" directly to section C'" has led to the elimination of section N, which probably represented a contaminant arising from the 23S RNA. In addition, earlier conclusions regarding the partial order of the oligonucleotides within section C'', predicted on the grounds of the putative overlap with section N, have been discarded. As it still remains difficult to obtain fragments covering section C'" by any method of partial digestion, the sequence proposed for this region must be considered tentative.
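The role that overlapping fragments play in ordering sections can be illustrated with a minimal sketch. It is not the procedure used in this work, which relied on fingerprinting of radioactive digests; it only shows the logic of joining two fragments when the end of one matches the start of the other. The function names, the `min_len` threshold and the two fragment sequences are hypothetical and chosen purely for illustration.

```python
def longest_overlap(left: str, right: str, min_len: int = 3) -> int:
    """Return the length of the longest suffix of `left` that is also a
    prefix of `right`, or 0 if no overlap of at least `min_len` exists."""
    max_len = min(len(left), len(right))
    for k in range(max_len, min_len - 1, -1):
        if left.endswith(right[:k]):
            return k
    return 0


def merge(left: str, right: str, min_len: int = 3):
    """Join two fragments through their overlap; return None when the
    fragments share no sufficiently long overlap."""
    k = longest_overlap(left, right, min_len)
    if k == 0:
        return None
    return left + right[k:]


# Hypothetical fragments (not taken from Fig. 1) sharing the run AAACG.
fragment_a = "AUGGCUAAACG"
fragment_b = "AAACGUCCACG"
joined = merge(fragment_a, fragment_b)
```

Short overlaps can arise by chance, and raising `min_len` makes a join more conservative. This mirrors the caution expressed in the text: an apparent overlap deduced from a mixture of fragments, or from a contaminant, can suggest an order that later proves incorrect.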

Suggested overlaps between sections K' and K, sections D' and D, and sections O' and A, which were based on their reproducible association in the absence of urea, have proven to be spurious. In fact, section K' is contiguous with section C'2, while sections O and O' have been found to intervene between sections D' and D.

In the previous plan, the precise arrangement of several blocks of the sequence within the 3'-half of the 16S RNA was not established. Through the analysis of extensive overlapping fragments, the order of the sections in this part of the molecule has been resolved. This work will be described fully in a manuscript to be submitted elsewhere. In a few places, segments of the sequence are known to be contiguous, but the overlapping fragments obtained thus far have been too large to yield the precise primary structure at the point of continuity. Such stretches are underlined in Fig. 1.

Recently, Uchida et al. have studied the sequences of the products arising upon complete T1 ribonuclease digestion of the 16S RNA. Although most of the oligonucleotide sequences agree with those previously published by us4, significant differences in 3 large products, and small differences involving single nucleotides, usually C residues, in 12 other products, are evident. This is discussed in more detail in the legend to Fig. 1. We believe that their versions of the 3 oligonucleotides in which significant differences are found are likely to be correct, and we have used their versions in Fig. 1. However, in the remaining cases, while we believe that the methods of Uchida et al., using U2 RNase digestion, are more reliable in 5 instances, we think that our previous findings are likely to be correct in at least 5 other cases. It should be emphasised that these differences are small, and our finding that the 16S RNA contains both versions of one oligonucleotide at one position suggests that certain of the other differences might also be real differences in the 16S RNAs studied in the two laboratories. It should also be noted that these small differences require little change to be made in the secondary structure set out in Fig. 2, and in a few cases increase its possible stability. Some very recent results by Noller47 are discussed in the legend to Fig. 1.

The nucleotide sequence shown in Fig. 1 contains about 1520 base residues. The continuity of a sequence of approximately 690 nucleotides, running from the 5'-terminus of the molecule to the end of section C'1, has been established, as has the continuity of a segment containing about 760 nucleotides which extends from the beginning of section D' to the 3'-terminus of the molecule. As yet, no formal overlaps between the extremities of the central 74-residue fragment K'-C'2 and the two long terminal fragments have been found, and the number of intervening bases is unknown. However, all except two of the characteristic T1 RNase oligonucleotides in the 16S RNA are believed to be contained within the sequence presented. If any unidentified lin-

Figure 1. A Plan of the Nucleotide Sequence of the 16S RNA.

This plan represents the nucleotide sequence of about 95% of the 16S RNA molecule. We have found that the sequence at the extreme 3'-terminus agrees with that recently proposed by Shine and Dalgarno43, but differs from the one originally suggested by Santer and Santer44. We have taken account of the recent re-investigation of the sequences of the T1 RNase oligonucleotides by Uchida et al.45 in the above plan (see text). We believe that the significant changes they have made in the oligonucleotides 31, 3 and 2 (our numbering system4) using U2 RNase digestion are likely to be correct, and we have used their results accordingly. The other differences between their results and our previously reported findings4 are small, usually a C residue. Upon re-examination of our data, we believe that the partial venom phosphodiesterase digestion methods which we used are likely to be more reliable than the U2 RNase methods, especially for some of the pyrimidine-rich oligonucleotides, and we believe that our analyses of spots 61, 28a, 21, 37 and 30a are probably correct. However, the results using partial venom phosphodiesterase digests were less clear for oligonucleotides 25, 7, 19, 5 and 8, and we have used the results of Uchida et al. for these sequences. In the case of oligonucleotides 14 and 56b, the U2 RNase digestion methods should be more accurate, and we have employed the data of Uchida et al., but we nevertheless note that our pancreatic RNase digests of these oligonucleotides indicate one additional C residue to be present in each case. In the case of oligonucleotide 29b, we find both their sequence and our originally published sequence, differing by a single C residue, in our 16S RNA as a heterogeneity (see table 1). This raises the possibility that some of the other small differences between our results and those of Uchida et al. represent genuine differences between the 16S RNAs from E. coli MRE 600 and the B and K strains used by Uchida et al.
Large capitals mark the beginning of different sections operationally defined by large continuous subfragments isolated after partial digestion. Underlined segments indicate regions where precise overlaps have not yet been obtained, although the sequences in question are known to be contiguous (see the text). Asterisks denote positional heterogeneities in the sequence (see the text and table 1), while parentheses enclose those segments where the oligonucleotide composition is known, but not ordered.

Very recently, Noller47 has reported studies of some nucleotide sequences within the 16S RNA by kethoxal modification, which affords a powerful method of examining sequences in some areas of the molecule. Although his results are largely in agreement with those reported here, he noted two places where his observations indicate that two neighbouring oligonucleotides have been transposed in the above sequence. In the first case, in D, we originally suggested the sequence UAAACGUCCACG (unpublished). His results show that the order of UAAACG and UCCACG is reversed. He also finds that UCCCGUUAAG rather than UUAAGUCCCG is likely to be the correct sequence in section K. Our proposed sequence of this area is tentative (shown by underlining in the above figure) and the results presented by Noller are likely to be correct, especially since, as he points out, the T1 and pancreatic RNase products from these two regions do not distinguish between the different possibilities.
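The last point can be checked for the first of the two regions with a short sketch. It assumes the standard specificities of the two enzymes (T1 RNase cleaves on the 3' side of G; pancreatic RNase cleaves on the 3' side of the pyrimidines C and U) and treats the twelve-residue stretch in isolation, ignoring flanking sequence, terminal phosphates and modified nucleotides. The function names are hypothetical.

```python
from collections import Counter


def digest(seq: str, cut_after: str) -> list:
    """Split `seq` on the 3' side of every residue listed in `cut_after`."""
    products, current = [], ""
    for base in seq:
        current += base
        if base in cut_after:
            products.append(current)
            current = ""
    if current:  # trailing residues with no cleavage site
        products.append(current)
    return products


def t1_digest(seq: str) -> list:
    """Complete T1 RNase digest: cleavage after G."""
    return digest(seq, "G")


def pancreatic_digest(seq: str) -> list:
    """Complete pancreatic RNase digest: cleavage after C or U."""
    return digest(seq, "CU")


original = "UAAACGUCCACG"  # order originally suggested for this stretch
revised = "UCCACGUAAACG"   # order with the two oligonucleotides exchanged

# Compare the products as multisets, since a fingerprint gives no order.
same_t1 = Counter(t1_digest(original)) == Counter(t1_digest(revised))
same_pancreatic = (
    Counter(pancreatic_digest(original)) == Counter(pancreatic_digest(revised))
)
```

Both comparisons come out equal for this stretch: each ordering yields UAAACG and UCCACG with T1 RNase, and the same set of pancreatic RNase products. Complete digests therefore cannot decide between the two orders, which is why an independent method such as kethoxal modification is informative here.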
