A fair comparison.

Correspondence

To the Editor: Recently, Paulson et al.1 introduced a normalization method, reporting that it improves clustering of metagenomic abundance data, which is very important for many applications in the fast-growing area of microbiome research. However, in our view, the perceived improvement is due to a postprocessing procedure that is preferentially combined with some, but not all, normalizations included in their method comparison, rather than to the proposed normalization itself.

Paulson et al.1 compared their normalization method to three existing ones using a data set from a study of microbial communities in the mouse gut and concluded that their method, called cumulative-sum scaling (CSS), “substantially improved” the separation between two known clusters present in the data1. As the authors kindly provided us with the source code, we were able to reproduce their first figure (Supplementary Fig. 1). However, this was possible only when we applied a logarithm transformation to


the data normalized with their CSS method but not to the data normalized by the other methods. Combining the log transformation with each of the normalizations shows that differences in cluster separation are due mainly to this additional transformation and not to the normalization itself (Fig. 1). Thus, conceptually simpler methods, such as relative-abundance normalization (also called total-sum scaling (TSS)), should not be dismissed on these grounds.

To understand the large effect of the log transformation on this comparison, it is important to note that it is nonlinear, a feature that can fundamentally change the distribution of the data (by reducing skew, for example). Because the transformation is undefined for input values ≤0, one typically adds a small value (pseudocount) to non-negative input data to avoid log(0). However, owing to the nonlinearity of the log, this value also affects the transformation result (Supplementary Fig. 2). Paulson et al.1 set the pseudocount to 1 as a way to preserve zero counts. However, as the four normalizations compared produce output values whose ranges differ by several orders of magnitude, the same pseudocount may not be optimal for all of them. It should instead be chosen to ensure a consistent treatment: for instance, by setting it to a value smaller than the minimum abundance value before transformation (Supplementary Fig. 2 and Supplementary Note).

Methodological improvements are crucial in highly complex fields such as metagenomics. We feel, however, that in a comparison of different approaches, it is important to minimize the potential confounding sources by ensuring equal treatment of all methods under study.
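The interaction between the pseudocount and the scale of the normalized values can be illustrated with a minimal sketch. It is not the code used for Figure 1 or by Paulson et al.; the counts are made up, the base-2 logarithm is an assumption (the text specifies only a log transformation), and taking one-tenth of the smallest nonzero value is an arbitrary way of choosing a pseudocount "smaller than the minimum abundance value".

```python
import math

def total_sum_scale(counts):
    """Relative-abundance (TSS) normalization: divide each count by the sample total."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no counts")
    return [c / total for c in counts]

def log_transform(values, pseudocount):
    """Return log2(value + pseudocount); the pseudocount keeps the log defined at zero."""
    return [math.log2(v + pseudocount) for v in values]

def spread(values):
    """Range of the transformed values, used here as a crude measure of retained contrast."""
    return max(values) - min(values)

# Hypothetical counts for four taxa in one sample (illustrative only).
counts = [0, 3, 40, 957]
tss = total_sum_scale(counts)  # relative abundances, all between 0 and 1

# Pseudocount 1 applied to raw counts: zeros map to 0 and contrast is kept.
raw_z1 = log_transform(counts, 1.0)

# The same pseudocount applied to relative abundances: every value lands
# between log2(1) and log2(2), so nearly all contrast is lost.
tss_z1 = log_transform(tss, 1.0)

# Assumed rule: pseudocount one order of magnitude below the smallest nonzero value.
z = min(v for v in tss if v > 0) / 10
tss_small_z = log_transform(tss, z)

print("spread, raw counts, z = 1:", spread(raw_z1))
print("spread, TSS, z = 1:", spread(tss_z1))
print("spread, TSS, z =", z, ":", spread(tss_small_z))
```

In this toy case the same pseudocount leaves the raw counts well separated but compresses the relative abundances into less than one log unit, which is the kind of unequal treatment the letter argues should be avoided when methods are compared.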


Paul I Costea, Georg Zeller, Shinichi Sunagawa & Peer Bork


Note: Any Supplementary Information and Source Data files are available in the online version of the paper (doi:10.1038/nmeth.2897).

COMPETING FINANCIAL INTERESTS
The authors declare no competing financial interests.

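[Figure 1. Panels a–d: MDS plots (axes: first and second MDS coordinate) for CSS (z = 1), DESeq (z = 0.01), TMM (z = 1) and total sum (z = 0.00001); samples labeled Western diet or LF-PP diet. Panel e: class posterior log ratio for Western and LF-PP samples under CSS, DESeq, TMM and total sum.]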

© 2014 Nature America, Inc. All rights reserved.


Figure 1 | Clustering analysis of different normalization methods. (a–d) First two principal coordinates of multidimensional-scaling (MDS) analysis of mouse stool data normalized by CSS (a), DESeq size factors (b), trimmed mean of M-values (TMM) (c) and total-sum scaling (d). The pseudocount (z) used with the log transformation is indicated in parentheses (Supplementary Note). Colors indicate clinical phenotype (diet). LF-PP, low-fat, plant polysaccharide–rich diet. All normalizations separate samples by diet. (e) Class posterior probability log ratio for Western diet obtained from linear discriminant analysis. Each box corresponds to the distribution of leave-one-out posterior probability of assignment to the ‘Western’ cluster across normalization methods. Samples were optimally distinguished by phenotypic similarity regardless of the method of normalization used. This figure corresponds to Figure 1 in Paulson et al.1 (see also Supplementary Fig. 1).

European Molecular Biology Laboratory, Heidelberg, Germany.
e-mail: [email protected]

1. Paulson, J.N., Stine, O.C., Bravo, H.C. & Pop, M. Nat. Methods 10, 1200–1202 (2013).

Paulson et al. reply: Costea et al.1 challenge the fairness of the results presented in the first figure of our paper2, which explored the effect of normalization and transformation procedures on clustering analysis of marker-gene survey

nature methods | VOL.11 NO.4 | APRIL 2014 | 359
