JCM Accepted Manuscript Posted Online 24 February 2016 J. Clin. Microbiol. doi:10.1128/JCM.02664-15 Copyright © 2016, American Society for Microbiology. All Rights Reserved.

Role of Clinicogenomics in Infectious Disease Diagnostics and Public Health Microbiology

Lars F. Westblade, Ph.D.1, Alex van Belkum, Ph.D.2, Adam Grundhoff, Ph.D.3,4, George M. Weinstock, Ph.D.5, Eric G. Pamer, M.D.6, Mark J. Pallen, M.D.7, Wm. Michael Dunne, Jr., Ph.D.8#

Department of Pathology and Laboratory Medicine, Weill Cornell Medical College, New York, NY, USA1; bioMérieux, Inc., La Balme, France2; Heinrich Pette Institute, Leibniz Institute for Experimental Virology, Hamburg, Germany3; German Center for Infection Research, Partner Site Hamburg-Lübeck-Borstel, Hamburg, Germany4; The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA5; Memorial Sloan-Kettering Cancer Center, New York, NY, USA6; Warwick Medical School, University of Warwick, Coventry, UK7; bioMérieux, Inc., Durham, NC, USA8

To whom correspondence should be addressed:
Wm. Michael Dunne, Jr., Ph.D.
100 Rodolphe Street,
Durham, NC, 27712, USA
E-mail: [email protected]

Key words: Clinicogenomics, Next-Generation Sequencing, Infectious Disease Diagnostics, Public Health

Abstract

Clinicogenomics is the exploitation of genome sequence data for diagnostic, therapeutic, and public health purposes. Central to this field is the high-throughput DNA sequencing of genomes and metagenomes. The role of clinicogenomics in infectious disease diagnostics and public health microbiology was the topic of discussion during a recent symposium (session 161) presented at the 115th general meeting of the American Society for Microbiology held this past spring in New Orleans, LA, USA. What follows is a collection of the most salient and promising aspects from each presentation at the symposium.

Introduction

The explosion of microbiome research is driven by high-throughput DNA sequencing technologies, so-called “next-generation sequencing” (NGS), that allow the genomic content of entire microbial communities (bacterial, viral, and eukaryotic organisms) to be described. Although much of this work is aimed at describing the structure of “commensal” communities, the methodology works equally well to identify pathogens in clinical samples. The key concept in using NGS methodology is that detection of microbes is independent of culture and is not limited to targets used for polymerase chain reaction (PCR) assays. Rather, it is a process of generating large-scale sequence data sets that adequately sample a specimen for microbial content and then applying computational methods to resolve the sequences into individual species, genes, pathways, or other features.

Most microbiome analyses have focused on describing bacterial content, and this is usually performed by sequencing the 16S rRNA gene. PCR primers with degenerate sequences are used to amplify all or part of the 16S rRNA gene from a broad range of species in the sample. The mix of amplicons generated from different organisms in the community is then sequenced, and the abundance of each species is determined by the number of sequences found for its respective 16S rRNA gene. Although this is useful for defining communities, it also affords the identification of pathogens with unique 16S rRNA sequences.
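The counting step described above can be illustrated with a minimal sketch. It assumes that each read has already been assigned a taxon label by some upstream classifier; the function name `relative_abundance` and the example labels are hypothetical and are not taken from any pipeline named in this review.

```python
from collections import Counter


def relative_abundance(assignments):
    """Return {taxon: fraction of reads} from a list of per-read taxon labels."""
    counts = Counter(assignments)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {taxon: n / total for taxon, n in counts.items()}


# Hypothetical per-read labels for one specimen (illustrative only).
reads = ["Bacteroides"] * 6 + ["Prevotella"] * 3 + ["Escherichia"] * 1
abundance = relative_abundance(reads)

# Report taxa from most to least abundant.
for taxon, fraction in sorted(abundance.items(), key=lambda kv: -kv[1]):
    print(f"{taxon}: {fraction:.2f}")
```

Note that the result is a proportion of sequenced amplicons, not an absolute count of organisms in the specimen.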

The sensitivity and specificity of this method are determined in large part by the NGS technology. Before NGS, the full-length 16S rRNA gene was sequenced with the high-quality, 700-base-long reads of Sanger, or chain termination, sequencing (sometimes referred to as “first-generation” sequencing technology). This was laborious and expensive, and deep sampling was not possible. When NGS became available, most work was done on the FLX sequencing instrument (a second-generation sequencing technology) from 454 Life Sciences (Roche Diagnostics, Indianapolis, IN). This permitted only 400-base-long sequencing reads, so only a portion of the 16S rRNA gene was sequenced. The 16S rRNA gene has nine hypervariable regions that provide much of the specificity in species identification. With 454 sequencing, typically only three of these regions could be sequenced. Nevertheless, this allowed detection to the genus level of most taxa. This methodology can correctly identify pathogens in stool samples from patients with diarrhea, as compared to culture results (GW unpublished results). In addition, in 15% of the samples this NGS approach identified an additional pathogen that was not reported by the diagnostic laboratory.

Recently, 16S rRNA gene sequencing has moved to the MiSeq and HiSeq sequencing instruments from Illumina (San Diego, CA). This is in part due to the closing of the 454 Life Sciences company and the higher data production and lower cost of the Illumina instruments. These instruments produce shorter reads (100-300 bases) and thus further limit the amount of the 16S rRNA gene that can be sampled, often to a single hypervariable region. However, organism identification is possible as a result of shotgun sequencing of several hypervariable regions.

A new alternative to Illumina has been developed using the Pacific Biosciences RSII sequencing platform (PacBio, Menlo Park, CA), which is often referred to as a third-generation sequencing technology. With PacBio sequencing, much longer sequence reads are possible and full-length 16S rRNA gene sequencing can now be accomplished at higher data output, lower cost, and much greater convenience than was possible with Sanger sequencing. This methodology is still more expensive than Illumina’s platform but bodes well for continued improvement in the use of 16S rRNA gene sequencing for microbiome analysis.

The alternative to focusing on the 16S rRNA gene for microbiome analysis is shotgun sequencing of the sample so that all parts of the genome are sequenced. Whereas the 16S rRNA gene is only found in bacteria, shotgun sequencing is agnostic, and archaea, viruses, and eukaryotic microbes are also sampled. This is often referred to as metagenomic shotgun sequencing since all genomes (the metagenome) are sequenced. This approach requires many more sequencing reads than 16S rRNA gene sequencing to adequately sample the genomes, and thus only the sequencing platforms that produce the most data are used (Illumina HiSeq and NextSeq instruments). This methodology is significantly more expensive than 16S rRNA gene sequencing, which has also limited its use. However, metagenomic shotgun sequencing also allows antibiotic resistance genes to be detected, as well as virulence factors and other features that could help distinguish a pathogen at the strain level from non-pathogenic members of a species. Shotgun sequencing is also used for analysis of RNA, either to identify RNA viruses or for transcriptional analysis. In this case, complementary DNA is generated and then NGS is performed. Metagenomic transcription analysis is particularly noteworthy as this method determines which organisms are actively growing and/or whether a gene of interest (e.g., an antibiotic resistance determinant) is expressed and thus contributes to the organism’s phenotype.

Although use of metagenomic shotgun sequencing is limited by the output and cost required, trends in DNA sequencing technology continue to emphasize instruments that are smaller, faster, and lower in cost. The MinION™ instrument from Oxford Nanopore Technologies (Oxford, UK) is a handheld sequencing instrument that, although still in the development phase, has been used to sequence bacterial and viral samples (1,2). Thus one can expect continued development in this area and more frequent use of these methods in routine diagnostic microbiology in the future.

Unbiased Infectious Disease Diagnostics

Conventional diagnostic methods such as PCR, serology, or microbial culture have been validated and standardized over decades, and continue to represent the gold standard for infectious disease diagnostics. However, while generally cost-effective and robust, these methods share a common limitation: they represent targeted detection approaches and require an accurate initial hypothesis as to the type of pathogen(s) that may be present in a sample of interest. Their narrow scope, especially for PCR- and serology-based methods, is likely one of the reasons why conventional diagnostic tests fail to detect a causative agent in a significant number of cases (3-5). Recently established mass spectrometry-based approaches are less biased, but in most cases still require culture of the infectious agent, thus precluding identification of viruses or other pathogens that are difficult to grow in culture. In contrast, with the advent of NGS technologies it is now possible to perform direct sequencing of DNA or RNA isolated from primary diagnostic material. Hence, metagenomic shotgun sequencing has the potential to fundamentally improve infectious disease diagnostics by allowing broad-range detection of bacterial, viral, fungal, or parasitic agents in a single assay (Figure 1) (6-10). Moreover, it opens up the exciting possibility of detecting pathogen sequences with only distant homology to existing database entries, or even of identifying entirely novel infectious agents.

In recent years, the steadily decreasing cost of NGS infrastructure and reagents and the development of increasingly simplified library preparation workflows have made the establishment of NGS platforms in clinical labs technically feasible. However, a number of challenges still hinder the widespread use of this technique in infectious disease diagnostics. One of the most fundamental requirements is the development of analysis software that is streamlined towards the needs of diagnostic laboratories. Although a number of open-source analysis pipelines for NGS-based pathogen detection are available, their use often requires a significant degree of bioinformatic expertise that is typically not available in clinical laboratories. To facilitate clinically actionable diagnostics, appropriate software solutions must also strike a reasonable balance between analytical depth and processing time, and deliver results within hours rather than days (or even weeks). Furthermore, whereas samples subject to truly hypothesis-free clinical diagnostics will require pathogen identification across all taxa, the majority of existing pipelines are designed with an emphasis on either viral or bacterial sequences. Currently available commercial software solutions are likewise limited to the analysis of amplicon sequencing of conserved bacterial genes (e.g., 16S rRNA gene) and therefore are generally unable to detect viral, fungal, or parasitic agents. One of the few publicly available pipelines that has been specifically designed for use in clinical diagnostics is SURPI, a platform for the unbiased detection of infectious agents in shotgun sequencing data that has been used to identify viral or bacterial agents in primary diagnostic material (11-13). Clearly, further refinement of this and other pipelines, preferably with a graphical user interface that facilitates interpretation by non-informatics personnel, will be a pivotal requirement for the future implementation of NGS in infectious disease diagnostics.

At present, there is also a profound lack of harmonization and universally recognized standards for NGS-based microbial diagnostics, a fact which is not surprising given that NGS is still a relatively young technique. While a number of studies have proven the technique’s ability to identify diverse pathogens directly from clinical material, and in some instances in a clinically actionable timeframe (11-16), substantially more empirical data will have to be collected to address a number of open questions. For example, given that shotgun sequencing usually only recovers snippets of genomic information rather than whole genomes, what are the requirements to call the presence of a specific infectious agent to a given taxonomic level? Since it is often not possible to unequivocally assign fragments to a single species, and since current second-generation high-throughput DNA sequencers utilize PCR amplification and thus can only deliver relative rather than absolute abundance values, how should one arrive at a reasonably meaningful abundance estimation for individual infectious agents? How should one deal with potential contaminants, especially those nucleic acids which are frequently introduced via library preparation kits (17)? Considering that not only the choice of the sequencing platform, but also library preparation methods as well as sample matrix composition can have a dramatic impact on the ability to recover infectious agent sequences, what are the read depths at which different diagnostic sample entities should be sequenced, and what are the limits of detection that should be expected for individual pathogens? Resolving these questions and other issues will not only take time, but also require a significant number of systematic multi-center studies with large sample cohorts. Establishment of novel databases that are rigorously annotated and provide either primary read or assembled contig sequences together with clinical metadata would also be an invaluable resource, as they would greatly facilitate the identification of ‘unusual’ sequence signatures that could indicate the presence of putative pathogens, even if such sequences do not exhibit any recognizable homology to taxonomically classified infectious agents.

Given the number of issues that still need to be addressed, conventional methods for routine diagnostics are unlikely to be completely replaced by unbiased NGS anytime soon. For the investigation of challenging clinical cases or outbreak samples, however, it has already become an invaluable complement to conventional tests. In view of its tremendous potential and the rapid technological developments, including steadily increasing throughput of second-generation sequencers and the availability of the first third-generation sequencing units that are small enough to be taken into the field (1), it is clear that unbiased NGS will become an essential instrument in the toolbox of clinical infectious disease diagnostics.

Antimicrobial Susceptibility Testing Using Next-Generation Methods

Over the past century, antimicrobial susceptibility testing (AST) has been dominated by phenotypic approaches. Assays are largely based on the detection of microbial growth. These strategies utilize solid or liquid culture media in which the concentration of antimicrobial agent is adjusted to permit definition of minimum bactericidal or bacteriostatic (collectively, inhibitory) concentrations. Formats for such measurements include agar dilution, broth microdilution (BMD), antibiotic gradient diffusion, selective chromogenic media, and ultimately, automated systems such as the Beckman Coulter MicroScan Walkaway™ (Brea, CA, US), the Becton, Dickinson and Company Phoenix™ (Sparks, MD, US), and the bioMérieux VITEK® 2 (Marcy l'Étoile, France).

Recently, new approaches have been adapted to growth-based AST technology, and most deal with innovative means of distinguishing growing from inhibited/dead microorganisms. These include the use of microfluidics (nanodrop BMD), mass spectrometry (including MALDI-TOF), cantilever technology, micro-calorimetrics, nuclear magnetic resonance and magnetic bead rotation, real-time microscopy, and intrinsic fluorescence, to name a few (for a recent review see [18]). All these approaches are promising and beyond the proof-of-principle stage, but none has entered the current in vitro diagnostic market.

Whether nucleic acid-based methods can serve as a proxy for growth-based AST methods has yet to be thoroughly vetted for many clinically relevant species (19). These methods excel in resistance gene detection, but equating a resistance gene to an actual minimum inhibitory concentration value is still a work in progress. This may change as high-throughput genomics, including NGS and transcriptomics, becomes increasingly accessible, with transcriptomic analysis of stress marker expression (e.g., the SOS response) potentially offering an opportunity to relate molecular AST with phenotypic susceptibility data (20).

Recent studies exploring the potential value of NGS for AST have shown that associations between phenotypic resistance profiles (antibiograms) and genotypic resistance predicted from whole genome sequencing (WGS) data can be accurately defined. Using genome sequence information, an inventory of all known antibiotic resistance determinants, including mutations within protein-coding and -noncoding regions (e.g., regulatory elements), can be obtained (21). This generates a global view of the bacterial “resistome” that can be used to assess the presence/absence of such genes and mutations in de novo microbial genome sequences. When comparing the Staphylococcus aureus resistome to a comprehensive reference antibiogram for a development set of >1,000 strains and an equally sized validation set, the documented percentages of major errors (ME: predicted to be resistant but phenotypically susceptible) and very major errors (VME: predicted to be susceptible but phenotypically resistant) associated with genotypic antibiotic resistance prediction were 0.2% and 1.1%, respectively (unpublished data). This is in the same range as, or better than, that demonstrated for commercial AST systems. Additional studies have demonstrated the applicability of this approach for other organisms, but for species that are genetically more heterogeneous than S. aureus, the levels of ME and VME were higher (22). At present, from a routine laboratory workflow and regulatory standpoint, automated AST systems are better suited for clinical diagnostics; however, with ever decreasing overheads and further maturation of resistome databases, WGS AST may become increasingly competitive and pervasive in the clinical management of patients (23). In addition, these approaches could promote the discovery and characterization of new and emerging antibiotic resistance mechanisms, which will improve the reliability of WGS AST, and could stimulate the discovery of novel antibiotics.
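As an illustration of how ME and VME percentages of this kind can be tallied, the following minimal sketch compares genotypic predictions against phenotypic reference calls. The choice of denominators (ME relative to phenotypically susceptible isolates, VME relative to phenotypically resistant isolates) is an assumption based on common AST evaluation practice and is not stated in the text; the function name `error_rates` and the example data are hypothetical.

```python
def error_rates(pairs):
    """Compute (ME rate, VME rate) from (genotype_call, phenotype_call) pairs.

    Each call is 'R' (resistant) or 'S' (susceptible).
    ME:  genotype 'R' but phenotype 'S'.
    VME: genotype 'S' but phenotype 'R'.
    Denominators are an assumption: susceptible isolates for ME,
    resistant isolates for VME.
    """
    me = vme = n_susceptible = n_resistant = 0
    for genotype, phenotype in pairs:
        if phenotype == "S":
            n_susceptible += 1
            if genotype == "R":
                me += 1
        elif phenotype == "R":
            n_resistant += 1
            if genotype == "S":
                vme += 1
    me_rate = me / n_susceptible if n_susceptible else float("nan")
    vme_rate = vme / n_resistant if n_resistant else float("nan")
    return me_rate, vme_rate


# Hypothetical isolate-by-drug results (illustrative only).
calls = (
    [("S", "S")] * 95   # concordant susceptible
    + [("R", "S")] * 5  # major errors
    + [("R", "R")] * 48 # concordant resistant
    + [("S", "R")] * 2  # very major errors
)
me_rate, vme_rate = error_rates(calls)
print(f"ME: {me_rate:.1%}, VME: {vme_rate:.1%}")
```

Other denominators (for example, all isolates tested) are also used in practice, so the convention should be stated whenever such figures are reported.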

Despite the obvious optimism surrounding NGS AST platforms, several important aspects must be addressed prior to their routine implementation in the clinical setting: i) establishment of tightly regulated genomic databases, which will need continuous updating and perhaps supplementation with phenotypic, metabolomic, clinical, and outcome data to accommodate the emergence of antimicrobial resistance; ii) implementation of robust, reproducible testing methodologies that generate data in a clinically actionable time frame; iii) development of interpretative guidelines specific for these data (24); iv) approval by various regulatory bodies; and v) the expense of such testing compared to phenotypic AST. Clearly, there must be extensive collaboration between academic, corporate, and regulatory bodies to ensure NGS-based AST moves into practice to combat the frightening frequency with which multi- and pan-drug-resistant isolates are recovered (25). Importantly, WGS AST will also provide the identity of the offending microorganism, its virulence potential, and epidemiological typing.

Human Microbiome as a Diagnostic and Prognostic Marker of Disease

With the advent of benchtop high-throughput DNA sequencing platforms and accessible computational tools, definition of the composition and abundance of microbes and their genomes (i.e., the microbiome) in a given anatomical environment has been greatly facilitated. Utilizing these high-throughput DNA sequencing platforms, numerous studies have linked the structure of the microbiome, in particular the gastrointestinal microbiome, with human disease, including obesity (26), type 2 diabetes (27), bacterial infection (28), and cancer (29), and with malnutrition (30) and the metabolism of drugs (31). Consequently, survey of an individual’s microbiome using high-throughput DNA sequencing methodologies could be diagnostic for a given disorder and, possibly, prognostic of the likely disease course. However, to account for the extensive microbial variation within and between individuals, it is essential that these data be controlled by comparison with microbiome data obtained from healthy and diseased persons spanning a wide geographic and ethnic range.

The mammalian gastrointestinal microbial flora performs a number of key functions, not least the development of the immune system (32) and protection against colonization by antibiotic-resistant microorganisms (33). Administration of antibiotics can perturb this fragile ecological niche (34), resulting in colonization with antibiotic-resistant organisms or enhanced risk of intestinal infection with Clostridium difficile (33). Microbes that undergo marked expansion in the intestine as a result of antibiotic exposure have been associated with invasive bloodstream infection. To explore a possible relationship between dense intestinal colonization and bloodstream invasion in humans, investigators performed NGS of DNA extracted from fecal specimens obtained from subjects undergoing allogeneic hematopoietic stem cell transplantation (allo-HSCT) (28). Remarkably, domination of the gut by a single predominant antibiotic-resistant species, vancomycin-resistant Enterococcus faecium, preceded bloodstream invasion in this cohort. Enterococci, streptococci, and various Proteobacteria, which include members of the family Enterobacteriaceae, were found to undergo expansion in the gut following antibiotic treatment. Enterococcal intestinal domination was associated with prior metronidazole administration, and increased the risk of vancomycin-resistant Enterococcus bacteremia nine-fold. Similarly, proteobacterial domination resulted in a five-fold increase in the risk of Gram-negative bacteremia, while proteobacterial dominance was reduced 10-fold by fluoroquinolone treatment.

In an extension of this work, the diversity of the intestinal microbiota was demonstrated to be predictive of mortality in allo-HSCT recipients (35). By analyzing the microbiota of fecal specimens collected from 80 subjects at the time of stem cell engraftment, it was possible to stratify subjects into high, intermediate, and low microbial diversity groups. Strikingly, overall survival three years after allo-HSCT was 36%, 60%, and 67% for the low, intermediate, and high diversity groups, respectively, implying that high intestinal microbial diversity is prognostic of favorable clinical outcomes. Additionally, commensal members of the families Lachnospiraceae and Actinomycetaceae were associated with survival, while Gram-negative bacteria from the phylum Proteobacteria were positively correlated with mortality.
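To make the notion of stratifying subjects by microbial diversity concrete, the following minimal sketch computes a diversity index from per-taxon read counts and assigns a group. The text does not state which index or cut-offs the cited study used; the Shannon index, the cut-off values, and the function names here are assumptions chosen purely for illustration.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def stratify(index, low_cut, high_cut):
    """Assign a diversity group; the cut-offs are hypothetical parameters."""
    if index < low_cut:
        return "low"
    if index < high_cut:
        return "intermediate"
    return "high"


# Hypothetical per-taxon read counts for two fecal specimens.
dominated = [97, 1, 1, 1]   # one taxon dominates the community
even = [25, 25, 25, 25]     # four taxa equally represented

for name, counts in (("dominated", dominated), ("even", even)):
    h = shannon_index(counts)
    print(name, round(h, 3), stratify(h, low_cut=0.5, high_cut=1.0))
```

A community dominated by one taxon scores near zero, whereas an evenly distributed community of the same richness scores ln(4), which is why intestinal domination and low diversity tend to coincide.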

Exposure to antibiotics is related to C. difficile infection (33,36), a major cause of infectious diarrhea in hospitalized patients (37). To combat this public health threat, high-throughput DNA sequencing of the fecal microbiota of mice and hospitalized patients treated with antibiotics was utilized to identify C. difficile resistance-associated bacterial species (36). The species with the strongest resistance correlation was Clostridium scindens, which dramatically reduced C. difficile infection, and attendant weight loss and mortality, in an animal model when transferred alone or as part of a microbial consortium post-antibiotic exposure. The mechanism of C. difficile inhibition centers on the C. scindens-dependent conversion of primary into secondary bile acids in the cecum and colon. These data suggest C. scindens offers promise as an alternative treatment option for C. difficile-mediated intestinal disease.

In addition to its capacity as a marker for intestinal disease, the gut microbiome has potential as a diagnostic and prognostic marker for systemic diseases, such as rheumatoid arthritis (38). To identify and validate microbial species allied with rheumatoid arthritis, high-throughput 16S rRNA gene sequencing of DNA extracted from 114 stool specimens obtained from patients with rheumatoid arthritis and controls was performed (39). In the setting of untreated new-onset rheumatoid arthritis, Prevotella copri was considerably more abundant than in healthy individuals, suggesting that P. copri could play a role in the pathogenesis of rheumatoid arthritis. The increase in Prevotella correlated with a reduction in Bacteroides and a loss of reportedly beneficial microbes. Similarly, the gut microbiota of patients with psoriatic arthritis and skin psoriasis was observed to be less diverse than that of healthy controls (40). Whereas some genera were less abundant in both conditions, psoriatic arthritis patients had a lower abundance of reportedly beneficial microbes. Taken together, these data suggest that interrogation of the gut microbiome could be of diagnostic and prognostic utility for arthritis and other systemic ailments.

The Role of Clinicogenomics in Public Health Microbiology

Over the past 50+ years, public health microbiology (“public health microbiology version 1.0”) has been constrained by complex and labor-intensive workflows and protocols for microbial culture, identification, growth-based phenotypic susceptibility testing, and strain typing (41). Recently, high-throughput DNA sequencing, particularly bench-top sequencing, has brought many new opportunities to this field (42-45) and allows bacterial genomics to be integrated into what might be called “public health microbiology version 2.0” (v2.0) through whole-genome sequencing (WGS) of cultured isolates to provide simultaneous information on organism identity, epidemiology, and antimicrobial therapy (Figure 2).

As a practical example of public health microbiology v2.0, a recent case study describes how WGS was applied to a protracted hospital outbreak of multi-drug-resistant Acinetobacter baumannii in Birmingham, England (46). The results showed that the outbreak strain was distinct from previously genome-sequenced strains and enabled the identification of seven major genotypic clusters within the outbreak. WGS also allowed the investigative team to rule out 17 initially suspicious isolates as unrelated to the outbreak strain. Analysis of genomic data documented within-host diversity in several patients, including mixtures of unrelated strains and within-strain genetic diversity. Using WGS data and conventional epidemiology, the study team was able to reconstruct potential transmission events that linked all but seven of the patients and could also associate patient isolates with those recovered from the environment. WGS focused attention on a contaminated bed and on a burns unit as sources and sites of transmission, catalyzing improvements in decontamination protocols. This approach has also been adopted for the WGS of Mycobacterium tuberculosis isolates (47).

To fast forward into the near future (public health microbiology v2.1), it is plausible that culture of bacterial isolates might in some settings be replaced by shotgun metagenomic sequencing of clinical samples. There are several potential advantages of “diagnostic metagenomics” (10): it represents a one-size-fits-all approach to all bacteria that contrasts with the need for so many different laboratory media and atmospheric conditions in conventional bacteriology; it avoids the onerous optimization of target-specific assays needed for amplification- or probe-based diagnosis; and it is unbiased and open-ended, i.e., not restricted to finding only what one expected to find. A second case study highlights this approach: metagenomics was applied to fecal samples obtained from patients with diarrhea during the 2011 outbreak of Shiga-toxin-producing Escherichia coli (STEC) O104:H4 in Germany (16). The investigative team obtained the genome of the STEC outbreak strain from ten samples at greater than ten-fold coverage and from over two dozen samples at greater than one-fold coverage. In several samples, they found an increased coverage of the Shiga toxin bacteriophage genome relative to other STEC sequences. From some samples, they recovered sequences from Clostridium difficile, Campylobacter jejuni, and Salmonella enterica, and from one, they recovered sequences from the emerging human pathogen Campylobacter concisus, illustrating the ability of metagenomics to deliver unexpected results.
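The fold-coverage figures quoted above can be related to read counts with the standard approximation that mean coverage equals the number of mapped reads multiplied by read length, divided by genome size. The sketch below applies that formula; the genome size, read length, and read counts are round illustrative numbers and are not taken from the cited study.

```python
import math


def mean_coverage(n_reads, read_length, genome_size):
    """Mean fold coverage = (reads mapped to the genome * read length) / genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return n_reads * read_length / genome_size


def reads_needed(target_coverage, read_length, genome_size):
    """Smallest number of pathogen-derived reads that reaches the target coverage."""
    return math.ceil(target_coverage * genome_size / read_length)


# Illustrative values only: a 5 Mb bacterial genome and 150-base reads.
genome_size = 5_000_000
read_length = 150

coverage = mean_coverage(400_000, read_length, genome_size)
needed = reads_needed(10, read_length, genome_size)
print(f"400,000 mapped reads give about {coverage:.1f}-fold coverage")
print(f"10-fold coverage needs at least {needed:,} mapped reads")
```

Only reads that derive from the organism of interest count toward this total, so in a fecal metagenome the overall sequencing depth must be far higher than the figure returned by `reads_needed`.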

Metagenomic analysis has also been applied to the recovery of M. tuberculosis genomes from both historical and contemporary human samples, and the results have shown that mixed infections were common in 18th-century Europe. Further, in a proof-of-principle study, the same process was used to identify and characterize pathogenic mycobacteria in modern sputum samples (48-50). There have been several other recent proof-of-principle studies demonstrating the utility of this diagnostic approach (13,15,51,52).

We can envisage an even more ambitious vision for public health microbiology v3.0, in which long-read single-molecule nanopore sequencing will enable an integrated approach to “macromolecular monitoring”, combining analysis of DNA, RNA, and proteins shed in urine and feces together with characterization of informational macromolecules circulating in the bloodstream to provide information not just on infection but also on, for example, cancer and the health of the fetus or of organ transplants (53-57).

However, there will be a need for a new computational infrastructure to cope with the demands of big data in clinical microbiology, including a role for cloud computing (58), illustrated by the CLIMB (CLoud Infrastructure for Microbial Bioinformatics) project supported by the UK’s Medical Research Council (59).

Conclusion

Based on the discussions above, next-generation sequencing will steadily work its way into routine diagnostic use within clinical and public health laboratories over the coming years. This prediction, albeit not entirely for the near future, is based on the universality of the science, i.e., its applicability to the diagnosis of infectious processes and the detection of resistance markers in an unbiased fashion for all manner of microorganisms, be they viral, bacterial, fungal, or parasitic. Furthermore, it will make it possible to monitor changes in the human (or animal) microbiome that forecast potential risk for, or the existence of, other noninfectious disease processes, thus allowing earlier intervention or avoidance, and perhaps even alternative treatment modalities.

While most of this review centers on the use of NGS and all the analytical permutations that have been developed in conjunction with it, we can likely expect more user-friendly distillations of these studies (e.g., multiplex PCR assays) to appear in clinical laboratories in the near future. And this road will provide a fascinating journey indeed.

References

1. Quick J, Ashton P, Calus S, Chatt C, Gossain S, Hawker J, Nair S, Neal K, Nye K, Peters T, De Pinna E, Robinson E, Struthers K, Webber M, Catto A, Dallman TJ, Hawkey P, Loman NJ. 2015. Rapid draft sequencing and real-time nanopore sequencing in a hospital outbreak of Salmonella. Genome Biol 16:114.

2. Judge K, Harris SR, Reuter S, Parkhill J, Peacock SJ. 2015. Early insights into the potential of the Oxford Nanopore MinION for the detection of antimicrobial resistance genes. J Antimicrob Chemother 70:2775-2778.

3. Ambrose HE, Granerod J, Clewley JP, Davies NW, Keir G, Cunningham R, Zuckerman M, Mutton KJ, Ward KN, Ijaz S, Crowcroft S, Brown DW, and U.K.A.o.E.S. group. 2011. Diagnostic strategy used to establish etiologies of encephalitis in a prospective cohort of patients in England. J Clin Microbiol 49:3576-3583.

4. Denno DM, Shaikh N, Stapp JR, Qin X, Hutter CM, Hoffman V, Mooney JC, Wood KM, Stevens HJ, Jones R, Tarr PI, Klein EJ. 2012. Diarrhea etiology in a pediatric emergency department: a case control study. Clin Infect Dis 55:897-904.

5. Louie JK, Hacker JK, Gonzales R, Mark J, Maselli JH, Yagi S, Drew WL. 2005. Characterization of viral agents causing acute respiratory infection in a San Francisco University Medical Center during the influenza season. Clin Infect Dis 41:822-828.

6. Barzon L, Lavezzo E, Costanzi G, Franchin E, Toppo S, Palu G. 2013. Next-generation sequencing technologies in diagnostic virology. J Clin Virol 58:346-350.

7. Chiu CY. 2013. Viral pathogen discovery. Curr Opin Microbiol 16:468-478.

8. Dunne WM Jr, Westblade LF, Ford B. 2012. Next-generation and whole genome sequencing in the diagnostic clinical microbiology laboratory. Eur J Clin Microbiol Infect Dis 31:1719-1726.

9. Miller RR, Montoya V, Gardy JL, Patrick DM, Tang P. 2013. Metagenomics for pathogen detection in public health. Genome Med 5:81.

10. Pallen MJ. 2014. Diagnostic metagenomics: potential applications to bacterial, viral, and parasitic infections. Parasitology 141:1856-1862.

11. Greninger AL, Naccache SN, Messacar K, Clayton A, Yu G, Somasekar S, Federman S, Stryke D, Anderson C, Yagi S, Messenger S, Wadford D, Xia D, Watt JP, van Haren K, Dominguez SR, Glaser C, Aldrovandi G, Chiu CY. 2015. A novel outbreak enterovirus D68 strain associated with acute flaccid myelitis cases in the USA (2012-2014): a retrospective cohort study. Lancet Infect Dis 15:671-682.

12. Naccache SN, Federman S, Veeraraghavan N, Zaharia M, Lee D, Samayoa E, Bouquet J, Greninger AL, Luk KC, Enge B, Wadford DA, Messenger SL, Genrich GL, Pellegrino K, Grard G, Leroy E, Schneider BS, Fair JN, Martinez MA, Isa P, Crump JA, DeRisi JL, Sittler T, Hackett J Jr, Miller S, Chiu CY. 2014. A cloud-compatible bioinformatics pipeline for ultrarapid pathogen identification from next-generation sequencing of clinical samples. Genome Res 24:1180-1192.

13. Wilson MR, Naccache SN, Samayoa E, Biagtan M, Bashir H, Yu G, Salamat SM, Somasekar S, Federman S, Miller S, Sokolic R, Garabedian E, Candotti F, Buckley RH, Reed KD, Meyer TL, Seroogy CM, Galloway R, Henderson SL, Gern JE, DeRisi JL, Chiu CY. 2014. Actionable diagnosis of neuroleptospirosis by next-generation sequencing. N Engl J Med 370:2408-2417.

14. Fischer N, Indenbirken D, Meyer T, Lutgehetmann M, Lellek H, Spohn M, Aepfelbacher M, Alawi M, Grundhoff A. 2015. Evaluation of unbiased next-generation sequencing of RNA (RNA-seq) as a diagnostic method in influenza virus-positive respiratory samples. J Clin Microbiol 53:2238-2250.

15. Fischer N, Rohde H, Indenbirken D, Gunther T, Reumann K, Lutgehetmann M, Meyer T, Kluge S, Aepfelbacher M, Alawi M, Grundhoff A. 2014. Rapid metagenomic diagnostics for suspected outbreak of severe pneumonia. Emerg Infect Dis 20:1072-1075.

16. Loman NJ, Constantinidou C, Christner M, Rohde H, Chan JZ, Quick J, Weir JC, Quince C, Smith GP, Betley JR, Aepfelbacher M, Pallen MJ. 2013. A culture-independent sequence-based metagenomics approach to the investigation of an outbreak of Shiga-toxigenic Escherichia coli O104:H4. JAMA 309:1502-1510.

17. Salter SJ, Cox MH, Turek EM, Calus ST, Cookson WO, Moffatt MF, Turner P, Parkhill J, Loman NJ, Walker AW. 2014. Reagent and laboratory contamination can critically impact sequence-based microbiome analysis. BMC Biol 12:87.

18. van Belkum A, Dunne WM Jr. 2013. Next-generation antimicrobial susceptibility testing. J Clin Microbiol 51:2018-2024.

19. Cangelosi GA, Meschke JS. 2014. Dead or alive: molecular assessment of microbial viability. Appl Environ Microbiol 80:5884-5891.

20. Barczak AK, Gomez JE, Kaufmann BB, Hinson ER, Cosimi L, Borowsky ML, Onderdonk AB, Stanley SA, Kaur D, Bryant KF, Knipe DM, Sloutsky A, Hung DT. 2012. RNA signatures allow rapid identification of pathogens and antibiotic susceptibilities. Proc Natl Acad Sci U S A 109:6217-6222.

21. Walker TM, Kohl TA, Omar SV, Hedge J, Del Ojo Elias C, Bradley P, Iqbal Z, Feuerriegel S, Niehaus KE, Wilson DJ, Clifton DA, Kapatai G, Ip CL, Bowden R, Drobniewski FA, Allix-Béguec C, Gaudin C, Parkhill J, Diel R, Supply P, Crook DW, Smith EG, Walker AS, Ismail N, Niemann S, Peto TE; Modernizing Medical Microbiology (MMM) Informatics Group. 2015. Whole-genome sequencing for prediction of Mycobacterium tuberculosis drug susceptibility and resistance: a retrospective cohort study. Lancet Infect Dis. 2015 Jun 23. pii: S1473-3099:62-66.

22. Holt KE, Wertheim H, Zadoks RN, Baker S, Whitehouse CA, Dance D, Jenney A, Connor TR, Hsu LY, Severin J, Brisse S, Cao H, Wilksch J, Gorrie C, Schultz MB, Edwards DJ, Nguyen KV, Nguyen TV, Dao TT, Mensink M, Minh VL, Nhu NT, Schultsz C, Kuntaman K, Newton PN, Moore CE, Strugnell RA, Thomson NR. 2015. Genomic analysis of diversity, population structure, virulence, and antimicrobial resistance in Klebsiella pneumoniae, an urgent threat to public health. Proc Natl Acad Sci U S A 112:E3574-E3581.

23. Wright GD. 2007. The antibiotic resistome: the nexus of chemical and genetic diversity. Nat Rev Microbiol 5:175-186.

24. Kahlmeter G. 2015. The 2014 Garrod Lecture: EUCAST - are we heading towards international agreement? J Antimicrob Chemother 70:2427-2439.

25. Nathan C, Cars O. 2014. Antibiotic resistance--problems, progress, and prospects. N Engl J Med 371:1761-1763.

26. Turnbaugh PJ, Hamady M, Yatsunenko T, Cantarel BL, Duncan A, Ley RE, Sogin ML, Jones WJ, Roe BA, Affourtit JP, Egholm M, Henrissat B, Heath AC, Knight R, Gordon JI. 2009. A core gut microbiome in obese and lean twins. Nature 457:480-484.

27. Qin J, Li Y, Cai Z, Li S, Zhu J, Zhang F, Liang S, Zhang W, Guan Y, Shen D, Peng Y, Zhang D, Jie Z, Wu W, Qin Y, Xue W, Li J, Han L, Lu D, Wu P, Dai Y, Sun X, Li Z, Tang A, Zhong S, Li X, Chen W, Xu R, Wang M, Feng Q, Gong M, Yu J, Zhang Y, Zhang M, Hansen T, Sanchez G, Raes J, Falony G, Okuda S, Almeida M, LeChatelier E, Renault P, Pons N, Batto JM, Zhang Z, Chen H, Yang R, Zheng W, Li S, Yang H, Wang J, Ehrlich SD, Nielsen R, Pedersen O, Kristiansen K, Wang J. 2012. A metagenome-wide association study of gut microbiota in type 2 diabetes. Nature 490:55-60.

28. Taur Y, Xavier JB, Lipuma L, Ubeda C, Goldberg J, Gobourne A, Lee YJ, Dubin KA, Socci ND, Viale A, Perales MA, Jenq RR, van den Brink MR, Pamer EG. 2012. Intestinal domination and the risk of bacteremia in patients undergoing allogeneic hematopoietic stem cell transplantation. Clin Infect Dis 55:905-914.

29. Ahn J, Sinha R, Pei Z, Dominianni C, Wu J, Shi J, Goedert JJ, Hayes RB, Yang L. 2013. Human gut microbiome and risk for colorectal cancer. J Natl Cancer Inst 105:1907-1911.

30. Smith MI, Yatsunenko T, Manary MJ, Trehan I, Mkakosya R, Cheng J, Kau AL, Rich SS, Concannon P, Mychaleckyj JC, Liu J, Houpt E, Li JV, Holmes E, Nicholson J, Knights D, Ursell LK, Knight R, Gordon JI. 2013. Gut microbiomes of Malawian twin pairs discordant for kwashiorkor. Science 339:548-554.

31. Haiser HJ, Gootenberg DB, Chatman K, Sirasani G, Balskus EP, Turnbaugh PJ. 2013. Predicting and manipulating cardiac drug inactivation by the human gut bacterium Eggerthella lenta. Science 341:295-298.

32. Cebra JJ. 1999. Influences of microbiota on intestinal immune system development. Am J Clin Nutr 69:1046S-1051S.

33. Buffie CG, Pamer EG. 2013. Microbiota-mediated colonization resistance against intestinal pathogens. Nat Rev Immunol 13:790-801.

34. Dethlefsen L, Huse S, Sogin ML, Relman DA. 2008. The pervasive effects of an antibiotic on the human gut microbiota, as revealed by deep 16S rRNA sequencing. PLoS Biol 6:e280.

35. Taur Y, Jenq RR, Perales MA, Littmann ER, Morjaria S, Ling L, No D, Gobourne A, Viale A, Dahi PB, Ponce DM, Barker JN, Giralt S, van den Brink M, Pamer EG. 2014. The effects of intestinal tract bacterial diversity on mortality following allogeneic hematopoietic stem cell transplantation. Blood 124:1174-1182.

36. Buffie CG, Bucci V, Stein RR, McKenney PT, Ling L, Gobourne A, No D, Liu H, Kinnebrew M, Viale A, Littmann E, van den Brink MR, Jenq RR, Taur Y, Sander C, Cross JR, Toussaint NC, Xavier JB, Pamer EG. 2015. Precision microbiome reconstitution restores bile acid mediated resistance to Clostridium difficile. Nature 517:205-208.

37. Rupnik M, Wilcox MH, Gerding DN. 2009. Clostridium difficile infection: new developments in epidemiology and pathogenesis. Nat Rev Microbiol 7:526-536.

38. Scher JU, Abramson SB. 2011. The microbiome and rheumatoid arthritis. Nat Rev Rheumatol 7:569-578.

39. Scher JU, Sczesnak A, Longman RS, Segata N, Ubeda C, Bielski C, Rostron T, Cerundolo V, Pamer EG, Abramson SB, Huttenhower C, Littman DR. 2013. Expansion of intestinal Prevotella copri correlates with enhanced susceptibility to arthritis. Elife 2:e01202.

40. Scher JU, Ubeda C, Artacho A, Attur M, Isaac S, Reddy SM, Marmon S, Neimann A, Brusca S, Patel T, Manasson J, Pamer EG, Littman DR, Abramson SB. 2015. Decreased bacterial diversity characterizes the altered gut microbiota in patients with psoriatic arthritis, resembling dysbiosis in inflammatory bowel disease. Arthritis Rheumatol 67:128-139.

41. Didelot X, Bowden R, Wilson DJ, Peto TE, Crook DW. 2012. Transforming clinical microbiology with bacterial genome sequencing. Nat Rev Genet 13:601-612.

42. Loman NJ, Constantinidou C, Chan JZ, Halachev M, Sergeant M, Penn CW, Robinson ER, Pallen MJ. 2012. High-throughput bacterial genome sequencing: an embarrassment of choice, a world of opportunity. Nat Rev Microbiol 10:599-606.

43. Pallen MJ, Loman NJ. 2011. Are diagnostic and public health bacteriology ready to become branches of genomic medicine? Genome Med 3:53.

44. Pallen MJ, Loman NJ, Penn CW. 2010. High-throughput sequencing and clinical microbiology: progress, opportunities and challenges. Curr Opin Microbiol 13:625-631.

45. Robinson ER, Walker TM, Pallen MJ. 2013. Genomics and outbreak investigation: from sequence to consequence. Genome Med 5:36.

46. Halachev MR, Chan JZ, Constantinidou CI, Cumley N, Bradley C, Smith-Banks M, Oppenheim B, Pallen MJ. 2014. Genomic epidemiology of a protracted hospital outbreak caused by multidrug-resistant Acinetobacter baumannii in Birmingham, England. Genome Med 6:70.

47. Heart of England NHS Foundation Trust. 2014. TB genomics service pilot project. http://www.heftpathology.com/item/tb-genomics-pilot-scheme.html

48. Chan JZ, Sergeant MJ, Lee OY, Minnikin DE, Besra GS, Pap I, Spigelman M, Donoghue HD, Pallen MJ. 2013. Metagenomic analysis of tuberculosis in a mummy. N Engl J Med 369:289-290.

49. Doughty EL, Sergeant MJ, Adetifa I, Antonio M, Pallen MJ. 2014. Culture-independent detection and characterisation of Mycobacterium tuberculosis and M. africanum in sputum samples using shotgun metagenomics on a benchtop sequencer. PeerJ 2:e585.

50. Kay GL, Sergeant MJ, Zhou Z, Chan JZ, Millard A, Quick J, Szikossy I, Pap I, Spigelman M, Loman NJ, Achtman M, Donoghue HD, Pallen MJ. 2015. Eighteenth-century genomes show that mixed infections were common at time of peak tuberculosis in Europe. Nat Commun 6:6717.

51. Andersson P, Klein M, Lilliebridge RA, Giffard PM. 2013. Sequences of multiple bacterial genomes and a Chlamydia trachomatis genotype from direct sequencing of DNA derived from a vaginal swab diagnostic specimen. Clin Microbiol Infect 19:E405-E408.

52. Hasman H, Saputra D, Sicheritz-Ponten T, Lund O, Svendsen CA, Frimodt-Moller N, Aarestrup FM. 2014. Rapid whole-genome sequencing for detection and characterization of microorganisms directly from clinical samples. J Clin Microbiol 52:139-146.

53. Acharya S, Edwards S, Schmidt J. 2015. Research highlights: nanopore protein detection and analysis. Lab Chip 15:3424-3427.

54. Ayub M, Stoddart D, Bayley H. 2015. Nucleobase recognition by truncated alpha-hemolysin pores. ACS Nano 9:7895-7903.

55. Daly KP. 2015. Circulating donor-derived cell-free DNA: a true biomarker for cardiac allograft rejection? Ann Transl Med 3:47.

56. Ignatiadis M, Dawson SJ. 2014. Circulating tumor cells and circulating tumor DNA for precision medicine: dream or reality? Ann Oncol 25:2304-2313.

57. Liao GJ, Gronowski AM, Zhao Z. 2013. Non-invasive prenatal testing using cell-free fetal DNA in maternal circulation. Clin Chim Acta

58. Drake N. 2015. How to catch a cloud. Nature 522:115-116.

59. CLIMB Consortium. 2015. Cloud Infrastructure for Microbial Bioinformatics. http://www.climb.ac.uk

Figure Legends

Figure 1. Next-generation sequencing for clinical infectious disease diagnostics.

(A) Schematic depiction of diagnostic NGS workflows. Nucleic acids isolated from primary diagnostic material are directly queried by either shotgun or amplicon sequencing. Amplicon sequencing uses PCR amplification with primers that target conserved regions (e.g., the bacterial 16S rRNA gene). Clustered amplicon sequences are then compared to appropriate databases (e.g., Greengenes or SILVA) to identify clusters of so-called “operational taxonomic units” (OTUs) at different taxonomic levels. Amplicon sequencing is sensitive, fast, and cost-effective, but because it relies on specific PCR primers, it is also strongly biased compared to random shotgun sequencing. Shotgun sequencing reads are usually first aligned to the human (or an appropriate animal host) genome to eliminate reads of host origin (digital subtraction). The remaining reads are then either directly mapped to sequence databases or first assembled into longer contiguous sequences (contigs) that are subsequently aligned to the database. De novo assembly considerably increases computational overhead and analysis time, but at the same time also significantly decreases classification bias by facilitating the identification of pathogens that exhibit little or no sequence homology to known infectious agents.

(B) Whereas the term ‘metagenomics’ in its literal sense suggests the analysis of full genome sequences, the throughput of current NGS technologies usually only allows partial recovery of individual infectious agent genomes, especially in complex diagnostic samples (e.g., stool or respiratory samples). Thus, diagnostic NGS requires bioinformatic approaches that sort sequence fragments (or tags) into taxonomic bins to evaluate the composition of clinical samples.
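The shotgun branch of the workflow in panel A and the binning step in panel B can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the pipeline used by any of the cited studies: exact shared k-mer counting stands in for read alignment, and the reference sequences, taxon labels, and function names are all invented for the example.

```python
from collections import Counter
from typing import Dict, Iterable, Set


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def classify(read: str, index: Dict[str, Set[str]], k: int) -> str:
    """Assign a read to the reference sharing the most k-mers with it."""
    read_kmers = kmers(read, k)
    best_label, best_hits = "unclassified", 0
    for label, ref_kmers in index.items():
        hits = len(read_kmers & ref_kmers)
        if hits > best_hits:
            best_label, best_hits = label, hits
    return best_label


def bin_reads(reads: Iterable[str], references: Dict[str, str],
              host_label: str = "host", k: int = 8) -> Counter:
    """Discard host reads (digital subtraction), then count reads per bin."""
    index = {label: kmers(seq, k) for label, seq in references.items()}
    bins: Counter = Counter()
    for read in reads:
        label = classify(read, index, k)
        if label == host_label:
            continue  # read of host origin: subtract it
        bins[label] += 1
    return bins


# Invented reference sequences (hypothetical, for illustration only).
references = {
    "host": "ACGTACGTTAGCTAGCTAGGATCCGATCGATCGTACGATCG",
    "taxon_A": "TTGACCGGTAACCGGTTAACCGGATATTCCGGAATTGGCCA",
    "taxon_B": "GGGCCCAAATTTGGGCCCAAATTTCGCGCGATATATCGCGA",
}

# Simulated reads: slices of the references plus one read matching nothing.
reads = [
    references["host"][8:24],
    references["taxon_A"][0:16],
    references["taxon_A"][10:26],
    references["taxon_B"][5:21],
    "N" * 16,
]

print(bin_reads(reads, references))
```

In practice the host-subtraction and classification steps would rely on dedicated aligners and curated databases; the sketch only shows how reads are removed and then tallied into taxonomic bins.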

Figure 2. Progressive integration of genomics and metagenomics into public health microbiology. As time progresses, we anticipate that the 19th-century techniques of microscopy and culture will give way to sequence-based approaches, which will also lead to closer integration with the rest of laboratory medicine.

[Figure 2 panel text. Public Health Microbiology 1.0: microscopy, culture, susceptibility; onerous and complex workflow for phenotypic characterisation of isolates. Public Health Microbiology 2.0: whole-genome sequencing; identification, epidemiology, and susceptibilities of cultured isolates. Public Health Microbiology 2.1: diagnostic metagenomics; culture-independent diagnosis of infection using a bench-top sequencer. Public Health Microbiology 3.0: macromolecular monitoring; nanopore sequencing of DNA, RNA, and proteins to monitor infection, cancer, and the health of the microbiome, fetus, and transplants.]

Lars Westblade, Ph.D., is an Assistant Professor in Pathology and Laboratory Medicine at Weill Cornell Medical College and the Associate Director for Microbiology at New York-Presbyterian Hospital (Weill Cornell Campus). Prior to joining Weill Cornell Medical College, he was an Assistant Professor at Emory University. He completed his training in Medical and Public Health Microbiology under the direction of Dr. Michael Dunne and Dr. Carey-Ann Burnham at the Washington University in St. Louis School of Medicine. Dr. Westblade is a Diplomate of the American Board of Medical Microbiology.
