Journal of Chromatography A, 1335 (2014) 81–103


Review

Modern chromatographic and mass spectrometric techniques for protein biopharmaceutical characterization

Koen Sandra ∗, Isabel Vandenheede, Pat Sandra

Research Institute for Chromatography (RIC), President Kennedypark 26, 8500 Kortrijk, Belgium

Article info

Article history:
Received 11 October 2013
Received in revised form 27 November 2013
Accepted 29 November 2013
Available online 10 December 2013

Keywords: Chromatography; Mass spectrometry; Biopharmaceutical; Monoclonal antibody; Therapeutic protein

Abstract

Protein biopharmaceuticals such as monoclonal antibodies and therapeutic proteins are currently in widespread use for the treatment of various life-threatening diseases including cancer, autoimmune disorders, diabetes and anemia. The complexity of protein therapeutics far exceeds that of small molecule drugs; hence, unraveling this complexity represents an analytical challenge. The current review provides the reader with state-of-the-art chromatographic and mass spectrometric tools available to dissect primary and higher order structures, post-translational modifications, purity and impurity profiles and pharmacokinetic properties of protein therapeutics. © 2013 Elsevier B.V. All rights reserved.

Contents

1. The protein biopharmaceutical landscape
2. Characteristics and features of protein biopharmaceuticals
3. Analysis of protein biopharmaceuticals
   3.1. Tools used and characteristics revealed at the protein level
        3.1.1. Liquid chromatography
        3.1.2. Mass spectrometry
   3.2. Tools used and characteristics revealed at the peptide level
        3.2.1. Liquid chromatography
        3.2.2. Mass spectrometry
   3.3. Tools used and characteristics revealed at the glycan level
        3.3.1. Liquid chromatography
        3.3.2. Mass spectrometry
   3.4. Tools used and characteristics revealed at the amino acid level
4. Conclusion
Acknowledgements
References

1. The protein biopharmaceutical landscape

Since the commercial introduction in 1982 of recombinant human insulin for the treatment of diabetes, hundreds of protein biopharmaceuticals, classified as either therapeutic proteins or monoclonal antibodies (mAbs), have been approved by the

∗ Corresponding author. Tel.: +32 56 204031. E-mail address: [email protected] (K. Sandra). 0021-9673/$ – see front matter © 2013 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.chroma.2013.11.057


regulatory agencies [1]. Today the global protein therapeutics market is worth over 100 billion dollars and is evolving toward a 20% share of the total pharmaceutical market. It is expected that, within the current decade, more than 50% of new drug approvals will be biologics [1–3]. Although therapeutic proteins presently dominate monoclonal antibodies in terms of overall sales, the latter are the fastest growing class of therapeutics [3]. Nowadays, around 30 monoclonal antibodies are marketed; nine displayed blockbuster status in 2010, and five of the ten top-selling biopharmaceuticals in 2009 were monoclonal antibodies


K. Sandra et al. / J. Chromatogr. A 1335 (2014) 81–103

namely infliximab (Remicade), bevacizumab (Avastin), rituximab (Rituxan/Mabthera), adalimumab (Humira) and trastuzumab (Herceptin) [1,3,4]. The antibody fusion protein etanercept (Enbrel), together with first and second generation erythropoietins (EPO) (Epogen/Aranesp), next-generation insulin (Lantus) and granulocyte colony stimulating factor (G-CSF) (pegfilgrastim – Neulasta), completed the top ten in 2009 [1]. While these top-selling biopharmaceuticals are successfully being applied in the treatment of diseases with a high incidence such as cancer, autoimmune disorders, diabetes and anemia, a diverse set of biomolecules directed toward rare genetic diseases has also been introduced. The monoclonal antibody eculizumab (Soliris), introduced in 2007 for managing ultra-rare paroxysmal nocturnal hemoglobinuria (PNH), and the replacement enzymes acid-α-glucosidase (Myozyme and Lumizyme) and idursulfase (Elaprase), for the treatment of Pompe disease and Hunter's syndrome, respectively, are good examples of so-called orphan drugs [1].

Over the years biopharmaceuticals have been substantially engineered to optimize their efficacy and safety profiles [5]. Following the original introduction of EPO in 1989, second and third generation variants appeared on the market in 2001 and 2007, respectively. Compared to the original product, representing the recombinant version of human EPO, serum half-life has been substantially improved by introducing two additional N-glycosylation sites (second generation) and by conjugation to polyethylene glycol (third generation) [6]. Similarly, driven by the need to reduce immunogenicity and increase efficacy, therapeutic antibodies have evolved from purely murine to chimeric, humanized and human sequences [3,7].
The antibody market is further expected to be reshaped by various next-generation formats such as bispecific mAbs, antibody–drug conjugates (ADC), antibody mixtures, antibody fragments (Nanobodies, fragment antigen binding – Fab) and brain penetrant mAbs, next to glyco-engineered formats [1,3,7]. Recent years witnessed the introduction of the first bispecific mAb (catumaxomab) and antibody–drug conjugate (brentuximab vedotin) [1,3,4].

With the patents of the first generation therapeutic proteins expired, the last decade experienced the approval of the first biosimilar versions [1,2,8]. Furthermore, the knowledge that the top-selling monoclonal antibodies will become open to the market in the coming years has resulted in an explosion of biosimilar versions in development [1,9,10]. The biosimilar market holds great potential but is simultaneously confronted with major hurdles. This stems from the fact that, as opposed to generic versions of small molecules, exact copies of recombinant proteins cannot be produced due to differences in the cell clone and manufacturing processes used. Even innovator companies experience lot-to-lot variability [11], and process changes can have drastic effects, as experienced by Genzyme in an attempt to upscale the production of acid-α-glucosidase (Myozyme) from a 160 L to a 2000 L fermentor. The glycosylation profile of the newly produced enzyme had changed substantially and authorities considered it as not being similar. The product is now marketed separately as Lumizyme [8]. In Europe, 14 biosimilars have been approved, including two recombinant human growth hormones (hGH – somatropins), seven recombinant G-CSF (filgrastims) and five recombinant EPO products [1]. Interestingly, glycoprofiles of the latter products appeared to be sufficiently similar to the reference medicines to allow their approval by European regulators.
Thus far, no follow-on biologics have appeared on the US market; however, new regulations currently in effect will facilitate their approval over the coming years [1,2].

It is clear that biopharmaceuticals have reshaped the pharmaceutical landscape. The sector is highly dynamic and evolves so rapidly that it is challenging to keep up with. Readers interested in the biopharmaceutical market trends are referred to the excellent

yearly and four-yearly overview articles in Nature Biotechnology [1,12–18].

2. Characteristics and features of protein biopharmaceuticals

As opposed to small molecule drugs, protein biopharmaceuticals are large, heterogeneous and subject to a variety of enzymatic and chemical modifications during expression, purification and long-term storage. Their complexity is well illustrated by the humanized monoclonal antibody trastuzumab (trade name Herceptin), on the market since 1998 for the treatment of HER2 positive metastatic breast cancer and recombinantly produced in Chinese hamster ovary (CHO) cells (Fig. 1). This tetrameric antibody is composed of two heavy and two light polypeptide chains connected through four interchain disulfide bridges. Twelve intrachain disulfide bridges, four within each heavy and two within each light chain, furthermore guarantee its structural integrity. The expected formula based on the cDNA sequence used to transfect the host cell is C6460H9972N1724O2014S44 (1328 amino acids), corresponding to an average molecular weight of 145,422 Da (calculation based on estimated atomic weights from organic sources [19]). This is roughly 250 times larger than the chemically synthesized worldwide top-selling drug atorvastatin (Lipitor) with molecular formula C33H35FN2O5. Interestingly, the expected molecular weight of trastuzumab has never been experimentally observed due to co- and post-translational modifications (PTMs) taking place in the cell. During its passage through the endoplasmic reticulum and Golgi apparatus, complex bi-antennary glycans are sculptured onto a conserved N-glycosylation site (with Asn-Xxx-Ser/Thr consensus sequence) located in the constant region of the heavy chain. A dozen different sugar species have been measured, of which four are highly dominant (G0, G0F, G1F and G2F) (Fig. 1).
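The molecular weights quoted for trastuzumab and atorvastatin can be reproduced approximately from their molecular formulas. The following is a minimal sketch, not the authors' calculation: it assumes abridged standard atomic weights rather than the organic-source values of ref. [19], so a difference of about 1–2 Da at the antibody level is expected. The function names `parse_formula` and `average_mass` are illustrative.

```python
import re

# Abridged standard atomic weights in Da (an assumption of this sketch; the
# review uses organic-source values [19], which differ slightly).
ATOMIC_WEIGHT = {"C": 12.011, "H": 1.008, "N": 14.007,
                 "O": 15.999, "S": 32.06, "F": 18.998}


def parse_formula(formula: str) -> dict:
    """Split a simple formula such as 'C33H35FN2O5' into element counts."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def average_mass(formula: str) -> float:
    """Average molecular mass in Da, summed over all elements in the formula."""
    return sum(ATOMIC_WEIGHT[el] * n for el, n in parse_formula(formula).items())


trastuzumab = average_mass("C6460H9972N1724O2014S44")  # non-modified sequence
atorvastatin = average_mass("C33H35FN2O5")

print(f"trastuzumab (sequence only): {trastuzumab:,.0f} Da")
print(f"atorvastatin: {atorvastatin:.1f} Da")
print(f"size ratio: {trastuzumab / atorvastatin:.0f}x")
```

With these atomic weights the antibody comes out within a few Da of the 145,422 Da stated in the text, and the ratio to atorvastatin is of the order of 250 to 260.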
Given the combination of two heavy chains and two N-glycosylation sites in a functional molecule, various glycoforms exist at the protein level, typically annotated as e.g. G0F/G0F, G0F/G1F, G1F/G1F, etc. [9,20,21]. The non-glycosylated and singly glycosylated variants have also been observed at low percentages (observations by the authors of this review). During cell culture production, host cell carboxypeptidases act on the antibody, resulting in the removal of lysine residues from the C-terminus of the heavy chain [22]. In the case of trastuzumab manufacturing, this action is almost driven to completion with 99% cleavage of the two heavy chain lysines [9,23]. As a result, and depending on the production batch [21], the molecular formulas of the two main species are C6560H10132N1728O2090S44 (G0F/G0F) and C6566H10142N1728O2095S44 (G0F/G1F) with average molecular weight values of 148,057 and 148,219 Da, respectively. Other modifications reported on the antibody are cyclization of the heavy chain N-terminal glutamic acid with the formation of pyroglutamic acid, oxidation of methionine residues and deamidation/isomerization of asparagine and aspartate residues. Their origins are likely chemically driven during manufacturing and storage and their relative percentages range from 0.5% to 14% [23,24]. Taking all this together, a substantial number of species occur despite the fact that only one protein is actually cloned. Additionally, some minor aggregation ([...] 1000 bar. Chromatography under very high pressure is described as ultra-high pressure or performance LC (UHPLC) [53]. Besides faster analysis, another prominent feature of UHPLC is that columns packed with sub 2 µm porous particles offer higher efficiency (up to 100,000 plates) in shorter analysis times compared to HPLC. Elevated temperature LC, pioneered by Horvath [54], is also used intensively now to further reduce analysis time by decreasing the viscosity of the mobile phase.
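The link between particle size, efficiency and pressure mentioned above can be made concrete with two textbook scaling relations. The sketch below is illustrative only and rests on assumptions that are not taken from the review: a reduced plate height h of about 2 for a well-packed column (so that N ≈ L/(h·dp)), and back pressure scaling with 1/dp² at equal column length, linear velocity and viscosity. The column length of 150 mm and the 5 µm reference particle are hypothetical example values.

```python
def plate_count(length_mm: float, particle_um: float,
                reduced_plate_height: float = 2.0) -> float:
    """Estimate plate number N = L / (h * dp); h = 2 is an assumed typical value."""
    length_um = length_mm * 1000.0
    return length_um / (reduced_plate_height * particle_um)


def relative_pressure(particle_um: float, reference_um: float) -> float:
    """Back pressure relative to a reference particle size, assuming dP ~ 1/dp^2
    at equal column length, linear velocity and mobile phase viscosity."""
    return (reference_um / particle_um) ** 2


# Hypothetical comparison: 150 mm columns packed with 5 um and 1.7 um particles.
for dp in (5.0, 1.7):
    print(f"dp = {dp} um: N ~ {plate_count(150, dp):,.0f} plates, "
          f"pressure x{relative_pressure(dp, 5.0):.1f} relative to 5 um")
```

Under these assumptions, moving from 5 µm to 1.7 µm particles roughly triples the plate count of a column of the same length while raising the pressure requirement by close to an order of magnitude, which is consistent with the need for instrumentation that tolerates very high pressures.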
More recently, superficially porous or core-shell particles were reintroduced. Initially, the particle sizes were 2.6–2.7 µm (shell of 0.5 µm), giving much faster mass transfer compared to sub 2 µm porous particles. As a consequence, the efficiency of both particle types is nearly the same, but the core-shell particles can be operated at ca. 1/3 of the column back pressure of sub 2 µm porous particles, i.e. at conventional LC pressures (400 bar). The mass transfer kinetics of the sub 2 µm porous particles and sub 3 µm core-shell particles have been compared by Gritti and Guiochon [55]. Very recently, sub 2 µm superficially porous particles have been developed and commercialized. Although very promising (N ∼ L/dp for 1 µm?), their performance for the analysis of biopharmaceuticals still has to be evaluated in depth and their robustness proven. Moreover, further instrumental developments related to reductions in band dispersion may be required to fully exploit their characteristics with respect to mass transfer kinetics and frictional heat [56]. Kinetic performances of particles have typically been studied in the RPLC mode and very striking differences are noted for some of the other modes used in biopharmaceutical analysis. The efficiencies attainable in IEX and SEC are substantially lower than in RPLC.

3.1.1.2. Selectivity: chromatographic modes.

The physico-chemical diversity of proteins (charge, isoelectric point, hydrophobicity, size) makes them well suited to be separated by nearly every liquid-based separation mode. Different selectivities are offered by reversed-phase liquid chromatography (RPLC), hydrophobic interaction chromatography (HIC), hydrophilic interaction chromatography (HILIC), size exclusion chromatography (SEC) and ion exchange chromatography (IEX). They all have unique features for separations of proteins and in the end provide complementary information.
• Reversed-phase liquid chromatography (RPLC)

RPLC finds many applications in the characterization of protein biopharmaceuticals and is applied at protein, peptide, amino acid and even glycan level. The efficiency is superior to all other chromatographic modes. Its robustness furthermore makes it well suited for use in a routine environment. Mobile phases typically consist of water, acetonitrile and 0.1% trifluoroacetic acid (TFA) and proteins are eluted according to their hydrophobicity on columns packed with wide-pore (300 Å) porous or superficially porous particles with different chemistries (phenyl, cyanopropyl, propyl – C3, butyl – C4, octyl – C8, octadecyl – C18) using an acetonitrile gradient. The separation mechanism is based on a combination of solvophobic and electrostatic interactions, the latter governed by the interaction of TFA with basic side chains

(arginine, lysine, histidine) and the N-terminus [47]. To cope with the slow diffusion of proteins in liquids, which results in broad peaks, temperature is often used as a variable. Column temperatures as high as 90 °C are not uncommon in protein separations [57]. Next to better kinetic efficiencies, increased column temperatures lead to higher recoveries (reduced carry-over), a substantial problem encountered with large and hydrophobic proteins. One has to be aware, however, of the risk of thermal degradation at elevated temperatures under acidic conditions. Protein adsorption can furthermore be limited by using less hydrophobic ligands and by using iso- or n-propyl alcohol as organic modifier (possibly in a blend with acetonitrile).

Post-translational modifications in monoclonal antibodies could clearly be highlighted using a wide pore C8 column operated at 75 °C with TFA as ion-pairing reagent and acetonitrile as organic modifier [58]. To cope with the increasing baseline due to the higher absorbance of TFA in acetonitrile, 0.11% TFA was added to solvent A while 0.09% TFA was added to solvent B. The light and heavy chain obtained following disulfide bond reduction and cysteine alkylation, as well as the Fab and Fc fragments obtained following papain cleavage, could readily be separated. Variants originating from N-terminal cyclization (or non-cyclization), deamidation and isomerization, oxidation and clipping could be revealed. Interestingly, the clipping observed was at an asparagine proline bond, which is known to be susceptible to acid cleavage at higher temperature. Lowering the column temperature to 40 °C reduced the clipping, leading to the conclusion that it represents an artifact of the method. In order to obtain good recovery in the separation of intact monoclonal antibodies, acetonitrile has been replaced by isopropyl and n-propyl alcohol [59].
Degradation products of an intact mAb could clearly be separated, as well as heterogeneities related to disulfide bridges. The same group also explored the possibilities of a diphenyl column in antibody separations and was able to resolve a cysteinylated and non-cysteinylated variant at the intact mAb level and 5 site-specific oxidations in the heavy chain [60]. In another interesting study, RPLC at the intact mAb level revealed buried unpaired cysteines in the light chain [61]. These cysteine redox status variants represented a major source of heterogeneity but were shown to be fully active in a potency assay. The heterogeneity could only be revealed at column temperatures above 55 °C and was observed on different types of stationary phase chemistries (e.g. diphenyl, C8). RPLC has also been used to assess the drug-linker distribution on antibody heavy and light chains following reduction of the antibody–drug conjugate [38,39]. The hydrophobic nature of polyethylene glycol furthermore makes RPLC attractive for the analysis of PEGylated proteins. Recently, a validated method was described for the determination of PEGylated interferon-alpha2a (PEG20-IFN-α2a), used in the therapy of patients with chronic hepatitis C virus (HCV) [62]. The method allowed the separation of non-PEGylated (pre-peak) and multi-PEGylated IFN-α2a (post-peak) from the intended mono-PEGylated IFN-α2a.

Since the differences in physicochemical properties of the protein and many of its variants are typically small, selectivity cannot always be fully exploited to enhance the resolution. It has recently been described that the main drivers for increased protein resolution are the particle dimensions and temperature, i.e. the kinetic efficiency [46,47]. Most recent protein separations are indeed performed on sub 2 µm fully or superficially porous particles. Fig. 2 demonstrates the gain in resolution upon evolving to a sub 2 µm fully porous particle.
It represents the separation of a 30 kDa therapeutic protein at 60 °C using 0.1% TFA, water and acetonitrile as mobile phase constituents. A variety of subtle protein variants resulting from, among others, methionine oxidation and N-terminal cyclization are visualized. Table 2


Table 2
Elution order of common post-translational modifications relative to the main peak.

| Modification              | RPLC protein         | CEX (b)   | SEC        | HIC                  | RPLC peptide         |
|---------------------------|----------------------|-----------|------------|----------------------|----------------------|
| Aspartate isomerization   | Pre-peak             | Variable  | –          | Pre-peak             | Pre-peak             |
| Asparagine deamidation (a)| Post-peak + pre-peak | Post-peak | –          | Post-peak + pre-peak | Post-peak + pre-peak |
| Oxidation                 | Pre-peak             | Variable  | –          | Pre-peak             | Pre-peak             |
| PyroGlu from Glu (−H2O)   | Post-peak            | Post-peak | –          | Post-peak            | Post-peak            |
| PyroGlu from Gln (−NH3)   | Post-peak            | Pre-peak  | –          | Post-peak            | Post-peak            |
| Succinimide               | Post-peak            | Post-peak | –          | Post-peak            | Post-peak            |
| Sugar                     | Pre-peak             | Variable  | –          | Pre-peak             | Pre-peak             |
| C-terminal lysine         | Pre-peak             | Post-peak | –          | Pre-peak             | Pre-peak             |
| Aggregation               | –                    | –         | Pre-peak   | Post-peak            | –                    |
| Fragmentation             | Variable             | Variable  | Post-peaks | Variable             | Variable             |

(a) Under certain conditions, asparagine deamidation gives rise to aspartate and isoaspartate, with the latter appearing as a post-peak and the former as a pre-peak.
(b) Situation is typically reversed in AEX.

furthermore reveals the RPLC elution of the most common post-translational modifications relative to the main peak in a protein biopharmaceutical preparation. Very recently, a generic method development approach has been proposed for the RPLC analysis of monoclonal antibody fragments (heavy chain, light chain, Fab and Fc) on sub 2 µm fully or superficially porous particles. Outstanding separations within relatively short analysis times were demonstrated on rituximab, bevacizumab and panitumumab employing gradients between 30 and 40% acetonitrile, temperatures above 60 °C and 0.1% TFA as ion-pairing additive [57].

A major advantage of RPLC over the other techniques used for protein separations is its compatibility with electrospray ionization mass spectrometry (ESI-MS), given the volatility of the mobile phases. It has to be stressed, however, that the best chromatographic conditions are not necessarily the best mass spectrometric conditions. Indeed, TFA is an ideal ion-pairing reagent and gives rise to superb chromatography; however, it simultaneously acts as a suppressor of the electrospray ionization process, resulting in a reduced signal intensity [63]. The reverse is true for formic acid (FA). The limited hydrophobic character of FA furthermore reduces the retention on the column. Despite the loss in intensity, minor variants resolved using trifluoroacetic acid have successfully been measured by mass spectrometry, as exemplified by the separations reported above, which have all been complemented with MS for peak identification.

• Ion exchange chromatography (IEX)

In ion exchange chromatography, electrostatic interaction between the ionic groups of the stationary phase and ionic groups on the protein surface forms the basis of the separation. The sample is loaded onto the column under pH conditions where the net charge of the protein is opposite to the charge of the stationary phase. Elution is achieved using a salt gradient [11,23,64–66] or a pH gradient [67–69].
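The loading rule stated above (the protein's net charge must be opposite to that of the stationary phase) can be expressed as a simple comparison of buffer pH and isoelectric point. The sketch below is a deliberately simplified illustration: the function name `suggest_iex_mode` is hypothetical, and it ignores charge distribution on the protein surface, which in practice also affects retention. The example values are those quoted in this section for trastuzumab (pI 8.45, phosphate buffer at pH 6.9).

```python
def suggest_iex_mode(protein_pi: float, buffer_ph: float) -> str:
    """Return the ion exchange mode on which a protein is expected to be retained.

    Below its pI a protein carries a net positive charge and binds a negatively
    charged cation exchanger (CEX); above its pI it carries a net negative
    charge and binds a positively charged anion exchanger (AEX).
    """
    if buffer_ph < protein_pi:
        return "CEX"
    if buffer_ph > protein_pi:
        return "AEX"
    return "none"  # zero net charge at pH == pI: no retention expected


# Trastuzumab values quoted in this section.
print(suggest_iex_mode(protein_pi=8.45, buffer_ph=6.9))  # CEX
```

This reproduces why cation exchange is the usual choice for basic monoclonal antibodies, while more neutral antibodies may also be retained on an anion exchanger at a sufficiently high buffer pH.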
Ion exchange is widely used for profiling the charge heterogeneity of proteins resulting from either enzymatic or chemical modifications. The technology is used both for characterization and release testing. Cation exchange (CEX) is considered the gold standard for the analysis of monoclonal antibodies given their basic nature. Interestingly, Harris et al. described the validated cation exchange method for release testing of Herceptin production lots [23]. A 20 mM Na-phosphate buffer at pH 6.9 and elution with increasing NaCl concentration was shown to be a good choice for this protein with an isoelectric point of 8.45. Seven forms of the product could be resolved related to asparagine deamidation and aspartate isomerization. This could be demonstrated following fraction collection and peptide mapping. Modifications as low as 0.5% could be accurately measured. Since the molecular structure is maintained, potency measurements could be performed, leading to the conclusion that the deamidated material (asparagine deamidation light chain) had the same activity as the main peak while aspartate isomerization

(heavy chain) substantially reduced potency (see earlier). The method described was used at least up to 2001 in a routine environment but might have been refined in the meantime (no published information available). Fig. 3 shows the strong cation exchange chromatogram of trastuzumab as measured in our laboratory. Dozens of charged variants are observed. The main pre- and post-peaks, representing the deamidated and aspartate isomerized variants, have a % peak area of, respectively, 12 and 8%. These values are in accordance with the ones reported by Harris et al. [23]. Many more examples can be found in literature reporting on the separation of deamidated variants [64,66].

Cation exchange has also been shown to be particularly powerful to detect lysine truncation [11,22,64,70]. In that respect, Schiestl et al. also used CEX to demonstrate changes in the manufacturing process of Rituxan/Mabthera and Enbrel [11]. For the former, an abrupt change in the quality profile of lots with expiry dates in 2010 or later became apparent. Basic CEX variants, attributed to C-terminal lysine and N-terminal glutamine, were reduced from 30–50% to 10%. For Enbrel, a highly consistent quality profile for production batches having expiry dates until the end of 2009 was demonstrated. After that time period, batches with a second and changed quality profile appeared on the market. Compared to Rituxan/Mabthera, the opposite situation occurred. Basic variants, primarily originating from C-terminal lysine, increased from 15–30% to 40–60%. All these production batches were on the market as Rituxan/Mabthera or Enbrel, indicating that the observed changes are believed not to alter the clinical profile.

Additional charge heterogeneities where ion exchange has proven its value are truncation and N-terminal glutamine cyclization (pre-peak), succinimide formation (post-peak), sialylation (pre-peaks), glycation (pre-peak), C-terminal amidation (post-peak) and oxidation (post-peak) [64,65,70–72].
Some modifications have a direct effect on charge, such as C-terminal lysine truncation (loss of positive charge), deamidation (gain of acidic function) and succinimide formation (loss of acidic function), which easily explains their elution behavior. Other modifications give rise to conformational changes, thereby exposing different charged residues on the protein's surface. Aspartate isomerization, for example, does not change the net charge; however, it can cause structural alterations by introducing additional carbon into the peptide backbone. In the case of Herceptin, aspartate isomerization was observed as a post-peak (see Fig. 3). Table 2 highlights the CEX elution of the most common post-translational modifications relative to the main peak.

Recently, an anion exchange (AEX) chromatographic method was described for the analysis of a mAb panel for the treatment of botulism poisoning. Two of the three mAbs (the more neutral ones with pI of 6.7 and 7.6) were shown to give retention on AEX and one displayed two peaks, a main peak and an oxidized variant eluting as pre-peak (more basic). This peak could not be resolved using CEX [73]. Other cases have been shown in which


Fig. 3. Full and detailed CEX (a and b) and SEC (c and d) chromatograms obtained on trastuzumab (Herceptin). Shown are the 214 nm UV chromatograms. Unpublished results. [Axes: UV absorbance (mAU) versus time (min). Annotated peaks in the CEX panels: Lc asparagine deamidation, Hc aspartate isomerisation. Annotated peaks in the SEC panels: monomer, dimer, fragments, excipient.]

oxidized variants are resolved using cation exchange [64]. The rationale for separating oxidized variants on ion exchange is again believed to be related to conformational changes exposing different charged/uncharged residues.

While salt gradients are the most common gradients, recent years witnessed a revival of pH gradients [67–69]. In contrast to the former, which are believed to be product specific and time consuming to develop, pH gradient methods appear to be multiproduct compatible and, moreover, robust against variations in sample matrix salt concentration and pH. A full validation of a pH gradient based IEX method has also been described, showing superior precision and robustness compared to salt based IEX and imaged capillary isoelectric focusing [68]. As a loose variant on the above, a high-performance cation exchange chromatofocusing method using simple, two component buffer systems was recently described to monitor deamidation events [74].

Ion exchange is not directly compatible with MS due to the presence of non-volatile salts in the mobile phases [63]. The mass spectrometric measurement of IEX peaks requires their collection and subsequent desalting or dilution prior to MS measurement. Desalting of the collected fractions can be performed in an automated manner using either a small RP cartridge or via SEC hyphenated to mass spectrometry [75]. A fully automated two-dimensional LC–MS methodology incorporating two gradient pumps and various valves, allowing IEX or SEC separation, on-line trapping of IEX or SEC peaks on six parallel RP desalting cartridges and subsequent elution into the mass spectrometer, has been described [76]. This strategy allowed the assignment of acidic peaks in a mAb CEX chromatogram to the presence of mono-, di- and tri-sialylated glycans. The main peak was composed of the mAb occupied with uncharged glycans.
On another mAb, a minor basic CEX peak (1%) could be attributed to a heavy chain sequence variant (Ser → Arg) following on-line reduction prior

to mass spectrometric measurement. Important requirements for the RP cartridges used are low carry-over, good stability and loading capacity, and fast conditioning and re-equilibration.

• Size exclusion chromatography (SEC)

Size exclusion chromatography separates biomolecules according to their size, making it powerful to assess protein aggregation and fragmentation [2,36,77]. In analogy to IEX, SEC is used both for characterization and release testing. It represents the only chromatographic technique that does not rely on interactions with the stationary phase and therefore does not require the application of a mobile phase gradient. Instead, molecules travel through the column at a speed depending on particle pore accessibility. Smaller proteins such as erythropoietin are typically analyzed on columns with smaller pores (e.g. 150 Å) while larger monoclonal antibodies show best resolution using wider pores (e.g. 300 Å). Unfortunately, unwanted protein adsorption to the used resins occurs frequently. A recent review focused on the role various co-solvents (salts, amino acids, organic solvents) play in suppressing protein adsorption [78].

Fig. 3 shows a typical separation obtained on a Herceptin lot present on the market and reconstituted according to the specifications. A monomeric peak is observed at a purity of 99.5% next to a dimer (0.4%) and some higher aggregates (0.1%). Late eluting peaks correspond to fragments and excipients. Upon analyzing a MW protein standard, the MW of the peaks can be estimated, yet with an error margin that easily reaches 15% (comparable to SDS-PAGE), as opposed to deviations at the 0.005% level using mass spectrometry. It has been described for this particular protein that aggregation increases substantially upon deviating from the manufacturer's recommendation to redissolve in 0.9% NaCl and instead reconstituting the lyophilized protein in 5% dextrose [25]. Size exclusion is the method of choice for determining aggregation but concerns


about the accuracy of the data have resulted in the request from authorities to use orthogonal analytical tools such as analytical ultracentrifugation (AUC) and asymmetric flow field flow fractionation (AF4) to help assess SEC methods [2]. Potential errors include the inability to detect aggregates because of their removal by the SEC column (retained on frits) or their dissociation during the SEC process. SEC has also been used to monitor aggregation in antibody–drug conjugates [39]. Many of the drugs conjugated to antibodies are relatively hydrophobic and can increase the likelihood of aggregate formation during manufacturing and storage. Robust methods for aggregation measurement are required for lot-release testing and for stability monitoring. The optimal SEC method for the parent mAb is not necessarily the optimal one for the ADC. The attached hydrophobic drug can lead to secondary interactions resulting in broad, tailing peaks. Addition of organic modifiers to the mobile phase can prevent such interactions, leading to identical retention times for parent mAb and ADC [39]. SEC has also been shown to be powerful in the study of PEGylated proteins, fragments and unusual structures. Fragmentation in the hinge region of a mAb during a stability study could be demonstrated using the technique. Fragmentation was shown to be a kinetic process that is not caused by low levels of host cell proteases [77]. SEC has revealed a peak between the mAb monomer and dimer which originated from two structural variants containing one and two extra light chains [79]. SEC revealed the presence of a degradation product in a PEGylated recombinant protein. After 12 months of storage at 2–8 °C, an impurity increased from 0.01% to 0.25%. It could be traced back to an N-terminal adduction resulting from the raw material used for PEGylation [80]. The existence of a thioether linkage between a mAb light and heavy chain has been demonstrated [81].
SEC performed following reduction of the mAb separated the light and heavy chains originally linked via disulfide bridges from the light–heavy chain pair linked via the non-reducible thioether bridge [81]. Recently, a method was described for assessing tryptophan oxidation in a monoclonal antibody, next to aggregation and fragmentation, by performing size exclusion chromatography under moderately hydrophobic conditions, i.e. according to a mixed-mode principle [82]. Oxidation appeared as a pre-peak on the monomer main peak when using a moderate salt concentration (sodium acetate and sodium sulphate). In that particular case, the authors utilized secondary interactions to their advantage. This oxidized monomer showed a significantly reduced bioactivity. Using SEC, excipients typically elute at the so-called inclusion limit of the column. A recent paper combined SEC with a mixed-mode chromatographic column in an on-line fashion to further separate excipients present in biopharmaceutical preparations [83]. The method has been used to quantitate proteins and excipients such as sucrose, Na+, K+, histidine, Cl−, succinate and polysorbate 80 in different biopharmaceutical drug products including monoclonal antibodies, antibody–drug conjugates and vaccines. Proteins were directed to a UV detector while excipients were monitored by evaporative light scattering detection (ELSD). Regarding protein detection in SEC, while UV is the detector primarily used, more detail on aggregation is obtained using multi-angle light scattering (MALS) [84]. An obvious chromatographic trend that is also finding its way into SEC is the reduction in particle diameters. Sub-2 µm particles are nowadays available, resulting in increased resolution or speed. A critical evaluation of such a column was recently described [85].
Improved chromatographic performance compared to 3 and 5 µm columns was demonstrated, with the important remark that the higher pressures under which these sub-2 µm particles are operated resulted in increased aggregation (due to shear forces and frictional heating). Using the same types of columns,

an independent paper described a high-throughput method for aggregate quantification [86]. Assay times below 2 min were achieved by interlaced sample injections with parallelization of two columns. The internal diameter of SEC columns has also been reduced to offer the high sensitivity (limit of detection of 15 pg) needed in the evaluation of therapeutic candidates early on in development [87]. Since aggregation represents a critical quality attribute, it is appropriate to assess it at the earliest stages. The described method allowed proteins of poor quality to be screened out from small-scale cell culture experiments that do not yield enough material for conventional SEC analysis. The same method was subsequently used in a proof-of-concept study to monitor the aggregation of recombinant antigen binding fragments (Fabs) in vivo in human vitreous humor [88]. Intuitively, one would not consider SEC to be compatible with mass spectrometry. Indeed, in many applications, peaks are desalted prior to mass spectrometric measurement [63,76,89]. The direct coupling of SEC to MS using compromise mobile phases has nevertheless been reported [75,81,90]. In some cases, SEC was solely used for on-line desalting prior to mass spectrometry [75,91]. The native desalting conditions, as opposed to reversed-phase desalting, allowed the structural integrity of an antibody–drug conjugate to be maintained [91].

• Hydrophobic interaction chromatography (HIC)

Being a workhorse in downstream processing of protein biopharmaceuticals, HIC has also shown added value as an analytical tool. The separation mechanism is based on the adsorption of the hydrophobic regions of the protein to the weakly hydrophobic stationary phase in the presence of high salt concentrations. Protein desorption is achieved under non-denaturing conditions by decreasing the salt concentration.
It has recently been used to determine the drug-to-antibody ratio of antibody–drug conjugates [38,39] and to highlight antibody heterogeneities originating from methionine and tryptophan oxidation (pre-peaks), aspartate isomerization (pre-peak), succinimide formation (post-peak), C-terminal lysine (pre-peak) and clipped species (pre-peaks) [92–94]. A major advantage of HIC over reversed-phase LC is that it preserves the structure; hence, minor variants can be collected and subjected to complementary techniques such as potency determination. In this way, it could be demonstrated that aspartate isomerization in the light chain of a specific antibody rendered it inactive while succinimide formation in the light chain of another antibody did not influence potency [92]. This once more highlights the importance of monitoring modifications and keeping them within specifications. Table 2 highlights the HIC elution of the most common post-translational modifications relative to the main peak. The high salt concentrations used in HIC do not make the technology immediately compatible with mass spectrometry.

• Hydrophilic interaction chromatography (HILIC)

HILIC has thus far only marginally been used for the separation of proteins [46]. Compounds are resolved on a polar stationary phase by applying a water gradient, the latter differentiating HILIC from normal-phase LC (NPLC) that typically uses non-aqueous buffers. Chromatographic separation is based on differential partitioning of the solutes between the mobile phase and the water-enriched solvent layer adsorbed onto the surface of the packing material, although electrostatic interactions exist depending on the stationary phase, buffer and pH. The dissolution solvent with a high organic content (e.g. 70–90% acetonitrile) represents a major bottleneck in making HILIC and protein separations compatible. An attractive feature of HILIC, on the other hand, is the compatibility with MS.
HILIC is a common chromatographic mode to separate glycans as will be discussed below.
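The drug-to-antibody ratio (DAR) determination by HIC mentioned above is commonly reduced to a peak-area-weighted average over the resolved drug-load species. The sketch below only illustrates that arithmetic. The function name `average_dar` and the peak areas are hypothetical and are not taken from the cited studies [38,39].

```python
def average_dar(peak_areas):
    """Return the area-weighted average number of drugs per antibody.

    peak_areas maps the drug load of a species (0, 2, 4, ...) to the
    integrated HIC peak area of that species (any consistent unit).
    """
    total_area = sum(peak_areas.values())
    if total_area <= 0:
        raise ValueError("total peak area must be positive")
    weighted = sum(load * area for load, area in peak_areas.items())
    return weighted / total_area


# Hypothetical relative peak areas (illustrative values only).
example_areas = {0: 5.0, 2: 25.0, 4: 40.0, 6: 20.0, 8: 10.0}
print(f"average DAR = {average_dar(example_areas):.2f}")
```

The calculation assumes that every drug-load species is chromatographically resolved and that the detector response is independent of drug load; both assumptions would need to be verified for a real method.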


3.1.2. Mass spectrometry

3.1.2.1. Intact protein measurement. It is already 25 years since the 2002 Nobel laureate, John Fenn, first described an elegant way to present proteins to the mass spectrometer [95]. The principle, known as electrospray ionization (ESI), revolutionized the analysis of proteins. In combination with high-resolution mass spectrometers such as time-of-flight (TOF), orbitrap and Fourier transform ion cyclotron resonance (FTICR) MS, great structural detail is revealed [9,19,20,42,45]. The process of electrospray ionization generates multiply charged ions from proteins, resulting in a characteristic charge envelope that can readily be converted to the molecular mass by applying, for example, a maximum entropy algorithm. Typically, the average molecular mass is determined; however, in case the mass analyzer offers sufficient resolution, proteins can be resolved at the isotope level. The resolution offered by TOF mass analyzers allows isotope resolution of smaller proteins (up to approximately 20 kDa) [96] while that offered by FTICR mass analyzers allows isotope resolution of larger proteins (up to approximately 150 kDa) [97]. The advantage of isotopic resolution for larger proteins is arguable. For a monoclonal antibody, the monoisotopic mass is virtually non-existent and the width of the molecular envelope is very broad (25 Da at half height), providing only limited additional information over the average molecular mass [19]. Nevertheless, ultra high resolution is important in studying protein fragmentation spectra (see below). It is only recently that unit mass baseline resolution for an intact monoclonal antibody has been achieved, so the full benefits have yet to be demonstrated [97].
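The conversion of a charge envelope into a molecular mass can be illustrated with the simplest possible case: two adjacent peaks of the same protein that differ by one charge. The sketch below is not the maximum entropy algorithm mentioned above. It is a minimal two-peak calculation that assumes purely protonated ions, and the function names and the 23,440 Da test mass are hypothetical.

```python
PROTON_MASS = 1.007276  # Da


def mz_of(mass, charge):
    """m/z of a species of neutral mass `mass` carrying `charge` protons."""
    return (mass + charge * PROTON_MASS) / charge


def mass_from_adjacent_peaks(mz_high, mz_low):
    """Estimate charge and neutral mass from two adjacent charge states.

    mz_high is the peak carrying charge z, mz_low the neighbouring peak
    carrying charge z + 1 (more charges, hence lower m/z).
    """
    if mz_high <= mz_low:
        raise ValueError("mz_high must be larger than mz_low")
    # From mz_high = (M + z*p)/z and mz_low = (M + (z+1)*p)/(z+1).
    z = round((mz_low - PROTON_MASS) / (mz_high - mz_low))
    mass = z * (mz_high - PROTON_MASS)
    return z, mass


# Synthetic peaks generated for a hypothetical 23,440 Da polypeptide.
peak_20 = mz_of(23440.0, 20)
peak_21 = mz_of(23440.0, 21)
charge, neutral_mass = mass_from_adjacent_peaks(peak_20, peak_21)
print(charge, round(neutral_mass, 2))
```

Real deconvolution software uses all charge states of the envelope simultaneously, which averages out the error of individual peaks; the two-peak version only shows why the charge does not need to be known in advance.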
The phenomenon of multiple charging under ESI conditions allows even the largest proteins to fall within the mass range of common mass analyzers (charge envelopes typically range from m/z 500 to 4500), thereby making ESI attractive for the measurement of a wide range of proteins. This is demonstrated in Fig. 4 showing the deconvoluted spectra of intact trastuzumab, its heavy and light chain generated following interchain disulfide bond reduction using dithiothreitol (DTT) and the crystallizable fragment generated following papain cleavage (Fig. 1), measured on a state-of-the-art Q-TOF mass spectrometer with on-line RP desalting. Such a measurement provides an initial confirmation of the gene-derived protein sequence, reveals the structural integrity and furthermore reveals post-translational modifications. The measured average MW values obtained at all levels (intact, Hc, Lc and Fc) deviate by well below 0.005% from the theoretical MW values. The intact mAb, Hc and Fc measurements demonstrate the existence of several glycoforms with characteristic 146 Da and 162 Da spacings indicative for fucose and galactose and consequently N-glycosylation. Four main glycoforms can be revealed at the Hc level containing the N-glycans G0, G0F, G1F and G2F. The intensity of the peaks is indicative for the occurrence of the different glycoforms. A fully functional mAb is composed of two heavy chains linked via disulfide bridges; hence, the measurement of the intact mAb and the Fc allows the simultaneous interrogation of the two N-glycosylation sites, providing further structural detail. Monoclonal antibody N-glycosylation MS analysis at different levels, i.e. intact, heavy chain, reduced and non-reduced Fc, was compared from a qualitative and quantitative point of view by Sinha et al. Intact (148 kDa), Hc (51 kDa) and non-reduced Fc (53 kDa) analysis did not provide sufficient resolution for highly accurate quantitation.
Reduced Fc analysis appeared to be the most adequate method given the lower molecular weight associated with it (∼27 kDa) [98]. The power of mass spectrometry is further demonstrated by the analysis of a biosimilar and biobetter at the development stage (Fig. 4). In the former case, it is revealed that the glycosylation is qualitatively similar but deviates substantially from the originator product from a quantitative perspective, even when the originator lot-to-lot variability is taken into account. In the latter case, it is

demonstrated that the glycans lack the core fucose, an essential feature to increase the ADCC. These measurements at the early stage are crucial for further development and, for example, allow the cell growth conditions to be adapted to make the biosimilar's glycosylation profile more similar to that of the originator [99]. The biosimilar, furthermore, displayed higher levels of C-terminal lysine (∼10%). C-terminal lysine in the originator is only present at ∼1% and cannot be revealed in the spectrum. Such informative spectral information is not always obtained. In cases where multiple glycosylation sites, occupied by various glycans, exist, the charge distribution is convoluted over the size distribution, resulting in uninterpretable spectra. This is, for example, the case with etanercept, a 150 kDa protein containing various N- and O-glycosylation sites. To reduce the complexity, etanercept can be treated with papain, thereby generating the Fc and the sTNFR fragment [30]. Following Protein A enrichment, the common Fc glycoforms could be observed next to a substantial C-terminal lysine containing variant, in accordance with published data [11]. The remaining complexity associated with the sTNFR part does not allow its mass spectrometric measurement. Peptide mapping is required to obtain structural insights there. Similar issues are encountered with, for example, acid-α-glucosidase, idursulfase, erythropoietin and PEGylated proteins, the latter due to the combined complexity of the organic polymer and the biopolymer [100]. Antibody–drug conjugates, on the other hand, have successfully been subjected to mass spectrometric measurement. This allowed the drug-to-antibody ratio to be revealed following the enzymatic removal of the N-glycans to facilitate the spectral interpretation [38,39,101].
In cases where the conjugation process takes place on reduced cysteine residues, the mAb's structure is maintained through non-covalent interactions, demanding special care upon transferring the conjugates to the gas phase [91,102]. Electrospray ionization of proteins is normally achieved making use of acidic and organic solutions, resulting in denaturation and subsequent multiple charging due to the high number of exposed basic residues. In cases where inter- and intramolecular interactions need to be maintained, a volatile aqueous buffer at neutral pH is required. Under these conditions, proteins acquire a lower number of charges, requiring instruments with a higher mass range (m/z >4000) such as TOF systems. Native electrospray ionization successfully allowed the drug-to-antibody ratios of ADCs to be determined, and head-to-head comparisons between MS and HIC showed consistent data. Native mass spectrometry has additional interesting application areas [20,44,103]. It has been used for the measurement of mixtures of deglycosylated antibodies [104], to probe global conformational changes/differences [44,105–108] and for the assessment of the interaction between protein biopharmaceuticals and their targets, e.g. antibody–antigen [44,106,109,110]. It has furthermore been used to measure mAb aggregates (dimers, trimers, tetramers) in SEC collected fractions. The native conditions leave the aggregates intact and m/z values as high as 15,000 were obtained [89]. Aggregation has also been measured by combining native ionization with ion mobility mass spectrometry [2,111]. The latter separates gas-phase ions according to their size and shape in a drift cell on the millisecond time scale prior to m/z measurement. Ion mobility has furthermore been shown to be able to resolve disulfide structural isoforms of IgG2 [112] and to monitor the dynamics of IgG4 Fab arm exchange and bispecific mAb formation [110], and it has facilitated the measurement of PEGylated and glycosylated proteins [113,114], among others.
While the earlier described examples display modifications with relatively large mass shifts, e.g. C-terminal lysine (128 Da), fucose (146 Da), galactose (162 Da), conjugated drugs, many modifications give rise to more subtle or no mass differences, e.g. aspartate isomerization (0 Da), deamidation (1 Da), oxidation (16 Da), disulfide bridge reduction (2 Da). As stated, the broad

Fig. 4. Deconvoluted spectra of trastuzumab (Herceptin), its biosimilar and biobetter acquired using Q-TOF mass spectrometry. Shown are the intact mAb, Fc, Lc and Hc. Samples were introduced following on-line RP desalting. Unpublished results.

molecular envelope obtained on large proteins (e.g. 25 Da on a 150 kDa protein) does not allow the smaller mass differences to be resolved. Even on smaller polypeptides such as the heavy chain, a 16 Da mass increase, indicative for oxidation, is difficult to reveal given the co-presence of Na-adducts formed during the gas-phase transfer. In these cases, the combined power of chromatography and mass spectrometry is required. As shown in Fig. 5, separating trastuzumab's partially reduced light chain on an RPLC column prior to on-line Q-TOF-MS measurement allows subtle mass differences to be highlighted. Indeed, a major post-peak with a mass difference of only 1 Da, indicative of a deamidation event, and several minor post-peaks with mass differences of 2 and 4 Da, indicative for disulfide bond reduction, are highlighted. In the absence of any chromatographic separation, these peaks would be masked by the more intense native peak, illustrating the perfect marriage between chromatography and mass spectrometry. It is important to realize that 1 Da mass deviations can easily be revealed at the light chain level; MW measurement at the intact mAb level, however, does not allow conclusions to be drawn based on 1 Da mass deviations. Note that the deamidated post-peak is related to the major pre-peak observed in the CEX chromatogram shown in Fig. 3, corresponding

to trastuzumab with one deamidated light chain. This deamidation, in contrast to aspartate isomerization, does not influence potency [23]. The unresolved minor RPLC pre-peak on the main peak corresponds to the presence of a single hexose residue (162 Da) that can either result from a non-enzymatic glycation (lysine residue) or an enzymatic O-glycosylation (serine, threonine), e.g. O-mannosylation. Both events have been demonstrated in antibodies and can be differentiated at the peptide level [20,115,116]. At the end of the 1980s, two independent groups reported on yet another way to make proteins amenable to mass spectrometric measurement [117–119]. In contrast to electrospray ionization, the technique of matrix-assisted laser desorption ionization (MALDI) does not result in multiple charging and hence requires analyzers with a wide mass range such as TOF systems operated in the linear mode. While the technique has been used extensively for intact protein measurements, it suffers from poor mass accuracy and resolution for large molecules, rendering electrospray more appropriate [19,20,120,121]. On a typical monoclonal antibody, electrospray ionization results in the read-out of the various glycoforms at the intact, heavy chain and Fc levels. MALDI, on the other hand, results in a single broad peak with the MW value


Fig. 5. On-line RPLC–UV–MS analysis of the light chain of trastuzumab. (a) UV chromatogram (response units vs. acquisition time, min) and (b) deconvoluted spectra (counts vs. deconvoluted mass, amu) associated with the labeled peaks. Different post-peaks are observed with 1, 2 and 4 Da mass deviations indicative for deamidation and reduction. A minor pre-peak with a mass increase corresponding to a hexose residue is highlighted. Light chain was separated on a sub-2 µm fully porous C4 column operated at 60 °C using acetonitrile and 0.1% TFA as mobile phase constituents. Unpublished results.

comparable to the mean MW calculated from the ESI results [121]. HPLC and MALDI-MS cannot be coupled in an on-line fashion. Peaks can be collected on a MALDI plate at user-defined time intervals. In contrast to ESI, TFA does not suppress the ionization process in MALDI.

3.1.2.2. Protein fragmentation. In recent years, protein molecular weight measurement has been complemented with gas-phase fragmentation to obtain additional information regarding the sequence and post-translational modifications. This procedure is typically termed top-down mass spectrometry and requires high-resolution mass spectrometers such as FTICR-MS, orbitrap and Q-TOF to resolve the enormous number of multiply charged fragments generated [122]. Fragmentation mainly relies on electron capture dissociation (ECD – unique for FTICR-MS) and electron transfer dissociation (ETD) and to a lesser extent on collision induced dissociation (CID). While the latter typically cleaves the peptide bond with the generation of so-called y- and b-ions, containing, respectively, the C- and N-terminal fragments, the former two cleave the N–Cα bond with the formation of z- and c-ions. In contrast to CID, ETD and ECD typically maintain labile modifications, such as glycosylation, on the protein backbone and allow the cleavage of disulfide bonds. A recent series of papers described top-down studies on intact antibodies using FTICR with ECD fragmentation and orbitrap and Q-TOF with ETD and CID fragmentation [123–127]. ECD combined with FTICR and ETD combined with orbitrap generate a comparable number of cleavages and around 35% sequence coverage, thereby outperforming ETD and CID on Q-TOF and CID on orbitrap. The N- and C-terminal sequences can readily be assessed, thereby confirming lysine truncation and pyroglutamate formation as well as methionine oxidation sites in forced degradation samples [123].
The numerous disulfide bonds present in a mAb, drastically reduce the efficiency of top-down fragmentation within the protected sequence regions reducing overall sequence coverage and leaving glycosylation uncharacterized. In that respect, top-down characterization has recently been complemented with electrochemical reduction of the disulfide bonds thereby substantially increasing

sequence coverage of smaller proteins [128]. The impact on larger proteins has yet to be demonstrated. MALDI is gaining popularity for top-down sequencing. It typically makes use of in-source decay (ISD) which, in contrast to the techniques of CID, ECD and ETD, does not rely on the selection of specific precursors and therefore requires pure proteins for analysis. In analogy to ECD and ETD, ISD gives rise to c- and z-type ions and conserves labile modifications. Sequencing of large molecules is generally restricted to the N- and C-terminal parts [122]. Recently, the primary structure of a 13.6 kDa Nanobody was determined near to completion by MALDI-TOF/TOF top-down sequencing [96]. Top-down mass spectrometry remains a specialist tool that currently cannot compete with peptide mapping in terms of sequence coverage and determination of modifications and modification sites. It is, however, an interesting addition to the analyst's toolbox and provides complementary structural information. Advances in both hardware and software are expected to bring this technology to the next level, resulting in wider adoption.

3.2. Tools used and characteristics revealed at the peptide level

Protein measurement is extremely powerful but does not provide the complete picture. While it is indicative for identity and highlights dominant modifications, it typically does not provide the actual amino acid sequence, nor does it adequately allow the modifications to be localized. The LC–MS measurement presented in Fig. 5 reveals a deamidation on the Lc but this cannot be traced back to a specific asparagine or glutamine residue. The Lc of the measured mAb contains 6 asparagine and 15 glutamine residues which are all prone to this chemical modification. With top-down analysis still in its infancy, further structural detail is currently provided at the peptide level following proteolytic digestion with, for example, trypsin, which cleaves proteins C-terminally of an arginine or lysine residue.
This strategy is referred to as bottom-up analysis or peptide mapping [19,20,42]. Trypsin is the most common protease in use but additional selectivities are offered by Lys-C, Glu-C, chymotrypsin and Asp-N cleaving, respectively, C-terminal

to lysine, glutamic acid/aspartic acid or aromatic amino acids or N-terminal to aspartate. Peptide mapping can either be performed under non-reducing conditions in case disulfide bridges have to be confirmed or following the execution of a reduction and alkylation step. Caution has to be taken in interpreting peptide mapping data since the sample preparation can give rise to artifact modifications such as asparagine deamidation, S–S scrambling, succinimide loss, glutamine cyclization, all events taking place at the neutral or alkaline pH required for efficient trypsin digestion, or aspecific cleavages resulting from protease action that could mistakenly be considered as protein clipping sites [19,20,42,129]. Recently, artifact modifications have been reported in a study comparing biosimilar and innovator trastuzumab [24]. Thirteen asparagine deamidations at levels as high as 40% were reported on both products. These peptide mapping data were in contradiction to the CEX data for trastuzumab [23]. The authors added a note to the proof acknowledging that these high deamidation levels were due to sample preparation related modifications resulting from a suboptimal sample preparation procedure [24]. Peptide mapping data are generated with the aim to confirm identity and to identify and quantify post-translational modifications and have been used extensively in the comparison of originators and biosimilars [24,30,70,130]. Two Enbrel biosimilars, present on the Chinese market, have been compared to the originator and a deviating peptide map was observed for one of the biosimilars. The variance could be pinpointed to two amino acid substitutions (EEMTK changed to DELTK). By definition, biosimilar products must have an identical amino acid sequence as the reference product to be considered biosimilar in Europe and the US [10]. 
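A substitution such as EEMTK → DELTK shows up in an LC–MS peptide map as a mass shift of the affected peptide. The sketch below computes that shift from standard monoisotopic residue masses. Only the two peptide sequences come from the text; the function name `peptide_mass` is hypothetical and the residue table is deliberately limited to the residues needed here.

```python
# Monoisotopic residue masses in Da (only the residues used below).
RESIDUE_MASS = {
    "D": 115.02694,
    "E": 129.04259,
    "K": 128.09496,
    "L": 113.08406,
    "M": 131.04049,
    "T": 101.04768,
}
WATER = 18.01056  # mass of H2O added for the free N- and C-termini


def peptide_mass(sequence):
    """Monoisotopic mass of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[residue] for residue in sequence) + WATER


reference = peptide_mass("EEMTK")
variant = peptide_mass("DELTK")
shift = variant - reference
print(f"reference {reference:.4f} Da, variant {variant:.4f} Da, shift {shift:.4f} Da")
```

The combined Glu → Asp and Met → Leu exchange lowers the peptide mass by roughly 32 Da, a difference that is readily detected at the peptide level but would be difficult to assign on the intact protein.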
Upon comparison of a candidate trastuzumab biosimilar to the originator, an identical two amino acid variance was observed in the Fc region next to an increased methionine oxidation level [24]. When comparing TNK-tissue plasminogen activator (TNK-tPA) products from the innovator and follow-on manufacturers, identical primary sequence and disulfide bridging were observed. Substantial quantitative differences were, however, observed in glycosylation occupancy and chain cleavage, the latter required for enzyme activation. More subtle differences were observed in deamidation and oxidation levels [130]. Comparability between biosimilar and originator rituximab was also demonstrated by peptide mapping, among others [70]. The only deviation observed was related to the presence of C-terminal amidated proline at 2% in the biosimilar. The authors stressed that the level of structural similarity provides confidence that tailored pre-clinical and clinical studies will reveal a comparable safety and efficacy profile. Many of the studies described in the previous parts have been complemented with peptide mapping to localize specific modifications [23,72,80,82,93]. In the identification of trastuzumab's charge heterogeneities observed via CEX, fractions have been collected and subjected to trypsin digestion prior to peptide mapping [23]. The peptide map clearly assigned the dominant pre-peak to the presence of a deamidated asparagine residue in the light chain and the dominant post-peak to an isoaspartate residue in the heavy chain. In both cases, the peptide map showed the non-modified and modified peptide at approximately the same intensities, leading to the conclusion that only one heavy and one light chain is modified in these particular cases. Peptide mapping of a collected basic CEX peak revealed the presence of a C-terminal peptide ending in an α-amidated proline [72]. This variant was not observed in the total peptide map due to its low intensity.
A minor degradation product observed in the SEC profile of a PEGylated recombinant protein was identified following its collection and Glu-C peptide mapping. The addition of 1-propanol on the N-terminal residue and the concomitant loss of the PEG moiety were revealed [80]. Oxidations chromatographically detected at protein level have as well been pinpointed to specific sites [82,93].

Peptide mapping has furthermore been used to assign PEGylation sites [131], drug conjugation sites in ADCs [39], glycation sites [132], O-mannosylation sites [115], intron translation [120], low level sequence variants [133,134], succinimide formation [135], disulfide bridges and scrambling [136–141], free cysteines [128], lysine truncation [142], N-terminal cyclization [58,143], carbamylation [144], N-terminal truncation and elongation due to alternative cleavage of signal peptides [145], C-terminal extension due to a single base-pair mutation (TAA – stop codon to GAA – Glu) [146] and many more [19,20,42]. Peptide mapping is particularly powerful for the measurement of N- and O-glycosylation site occupancies in multiply glycosylated proteins such as etanercept, erythropoietin [147], tissue plasminogen activator (tPA) [130,148], acid-α-glucosidase [33] and idursulfase [32], among others.

3.2.1. Liquid chromatography

Peptide mapping demands the best in terms of chromatographic separation since the complexity of the sample is drastically increased following the generation of peptides [149]. Considering fully cleaved tryptic peptides, 62 peptides are theoretically expected in the peptide map of trastuzumab (Fig. 1). The true peptide mixture complexity is much higher due to the presence of modified peptides and sample preparation artifacts such as miscleaved and aspecifically cleaved peptides and trypsin autolysis fragments. In that respect, it has been described in the literature that trypsin can generate up to an order of magnitude more peptides than theoretically expected [150]. Peptide maps are predominantly generated using reversed-phase HPLC given the high resolution offered, its robustness and its compatibility with mass spectrometry [47,149]. A major advantage over protein analysis is that the resulting peptides have diverse physicochemical properties (MW, polarity), giving rise to a variety of retention factors and thereby occupying the complete RPLC separation space.
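The number of theoretically expected tryptic peptides follows from an in silico digestion of the sequence. The sketch below applies the cleavage rule stated earlier (C-terminal of lysine or arginine). The optional exclusion of sites followed by proline is a commonly used convention that is not taken from this review, and the function name and the toy sequence are hypothetical; it is not the trastuzumab sequence.

```python
def tryptic_digest(sequence, skip_before_proline=True):
    """Return the fully cleaved tryptic peptides of `sequence`.

    Cleavage occurs C-terminal of K or R. When `skip_before_proline`
    is True, K/R residues followed by P are not cleaved (a common
    convention, treated here as an assumption).
    """
    peptides = []
    start = 0
    for index, residue in enumerate(sequence):
        is_site = residue in "KR"
        followed_by_pro = index + 1 < len(sequence) and sequence[index + 1] == "P"
        if is_site and not (skip_before_proline and followed_by_pro):
            peptides.append(sequence[start:index + 1])
            start = index + 1
    if start < len(sequence):
        # C-terminal peptide that does not end in K or R.
        peptides.append(sequence[start:])
    return peptides


# Hypothetical toy sequence for illustration only.
toy_sequence = "AEEMTKNQVSLRPGKDELTK"
print(tryptic_digest(toy_sequence))
print(tryptic_digest(toy_sequence, skip_before_proline=False))
```

Such a list only gives the lower bound of the mixture complexity; miscleaved, modified and autolysis peptides discussed above come on top of it.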
Peptide mapping is commonly performed on columns packed with fully or superficially porous C18 particles utilizing mobile phases containing water, acetonitrile and 0.1% TFA or FA, using a gradient evolving from low to high % of acetonitrile (e.g. 5–70% acetonitrile). Again, optimal chromatographic separation is obtained when using TFA rather than FA, at the expense of mass spectrometric sensitivity. Fig. 6 shows a typical peptide map obtained on a 30 kDa recombinant therapeutic protein, validated for use in a routine environment for identity and purity testing. The method provides 100% sequence coverage at the UV level, meaning that all peptides indicative for the identity of the protein are baseline resolved. This is much more specific compared to intact protein measurement (Fig. 2) since the identity is spread over many more peaks. The method furthermore allows critical quality attributes such as aspartate isomerization to be assessed at levels as low as 0.1% using UV 214 nm detection. Peak identity has been confirmed by on-line coupling to mass spectrometry. Such a performance is not readily achieved on larger proteins due to the increased number of species one is confronted with. Despite the feasibility of generating peak capacities up to 1000 using unidimensional liquid chromatography [149,151], the usually random distribution of peaks requires peak capacities in excess of 10,000 to resolve 98% of a tryptic digest containing 100 peptides [152]. However, combining the resolving power of chromatography and mass spectrometry readily provides sequence coverage >95% for large monoclonal antibodies (see later). Table 2 shows the elution of some common modifications relative to the native peptide. This is exemplified in Fig. 7 for trastuzumab.
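The peak capacity requirement quoted above can be made plausible with a statistical overlap estimate. The sketch below assumes the Davis–Giddings model, in which a component of a mixture of m randomly distributed components elutes as an isolated peak with probability exp(−2m/n) for a peak capacity n. Whether ref. [152] used exactly this model is an assumption, and the function names are hypothetical.

```python
import math


def singlet_fraction(n_components, peak_capacity):
    """Expected fraction of components eluting as isolated peaks.

    Uses the statistical overlap estimate exp(-2 * m / n), with m the
    number of components and n the peak capacity (assumed model).
    """
    return math.exp(-2.0 * n_components / peak_capacity)


def required_peak_capacity(n_components, target_fraction):
    """Peak capacity needed to resolve `target_fraction` of the components."""
    if not 0.0 < target_fraction < 1.0:
        raise ValueError("target_fraction must be between 0 and 1")
    return -2.0 * n_components / math.log(target_fraction)


for capacity in (1000, 10000):
    fraction = singlet_fraction(100, capacity)
    print(f"peak capacity {capacity}: {100 * fraction:.1f}% isolated peaks")

print(f"capacity for 98%: {required_peak_capacity(100, 0.98):.0f}")
```

Under this model, a peak capacity of 1000 leaves a substantial share of the 100 peptides co-eluting, while a value close to 10,000 is needed to isolate 98% of them, in line with the figure cited in the text.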
In some cases, modifications increase the hydrophobic nature of the peptide (cyclization, lysine truncation, succinimide formation) while in other cases the hydrophilic nature is increased (aspartate isomerization, oxidation, glycosylation). Asparagine deamidation represents an interesting case to elaborate upon further. At neutral and basic pH, deamidation proceeds

K. Sandra et al. / J. Chromatogr. A 1335 (2014) 81–103

Fig. 6. Validated RPLC peptide map of a 30 kDa therapeutic protein used for identity and purity assessment. 100% sequence coverage is obtained and impurities at levels as low as 0.1% can be monitored. Shown are the full chromatogram (a) and detailed view into the elution region of a specific native and modified peptide (b). Unpublished results.

through formation of a five-membered succinimide ring intermediate that is unstable and hydrolyzes into a mixture of two isomers, namely isoaspartate and aspartate [153]. Under RPLC peptide mapping conditions, the former appears as a pre-peak while the latter appears as a post-peak relative to the native asparagine-containing peak [42]. The succinimide, if stable, appears as a late-eluting peak. The ratio between isoaspartate and aspartate is usually 3:1 [19,20,42,153]. Deamidation also occurs at acidic pH by direct hydrolysis of the side chain amide, yielding only aspartate. This has been observed in trastuzumab [23]. Glycosylation renders a peptide more hydrophilic; hence, the glycosylated variants can easily be resolved from the non-glycosylated peptide. While RPLC typically cannot fully resolve the various mAb glycoforms, HILIC has been shown to offer baseline resolution of all glycoforms, including the G1F isomers [154]. Since the glycosylated peptides are separated from the bulk of peptides, this observation could be made at UV level. The quantitative data obtained were similar to those from liberated glycan measurement. In contrast to RPLC, where glycopeptides elute in the region of the peptides, HILIC separates digests into three regions, namely the non-glycosylated peptides, the O-linked glycopeptides and the N-linked glycopeptides. Glycopeptides originating from different glycosylation sites share the same separation space, and the combined use of

RPLC and HILIC proved advantageous in the characterization of a fusion protein [154]. Multidimensional chromatography, a workhorse in proteomics [152], has not yet found wide use in biopharmaceutical analysis due to the inherent complexity of the instrumentation. This situation might change with the commercial systems introduced on the market in 2012. Comprehensive LC × LC is known to substantially increase the chromatographic resolution as long as the two dimensions are orthogonal and the separation obtained in the first dimension is maintained upon transfer to the second dimension [152,155]. Fig. 8 shows a peptide map obtained from trastuzumab upon combining strong cation exchange (SCX) and RPLC, a combination providing good orthogonality for the analysis of peptides [152]. Fractions were transferred from one dimension to the other using a dual loop interface, maintaining the resolution offered by the first dimension. The method allows both identity and purity assessment.

3.2.2. Mass spectrometry

In the early days, Edman degradation was mainly relied upon to reveal the identity of the peaks in RPLC analysis [23,49,148]. Nowadays, most peptide mapping studies are performed using ESI-MS

Fig. 7. Elution pattern of native and modified peptides as observed in an RPLC–MS peptide map of trastuzumab. Peaks result from extracting corresponding m/z values at high mass accuracy (5 ppm). (a) Glycosylation, (b) N-terminal cyclization, (c) Lc deamidation, (d) lysine truncation, (e) Hc deamidation, and (f) methionine oxidation. Deamidation observed on the Lc peptide ASQDVNTAVAWQQKPGK does not proceed via a succinimide intermediate which is in accordance to literature data [23]. The detection of both isoaspartate (pre-peak) and aspartate (post-peak) in a 3:1 ratio following deamidation of Hc peptide IYPTNGYTR is indicative for deamidation via a succinimide intermediate which potentially results from the sample preparation procedure. Asn-Gly motifs are known deamidation hot spots. Unpublished results.

due to the convenient coupling to RPLC or alternatively by MALDI-MS of total digests or of collected RPLC peaks [19,20,42]. As is the case with proteins, peptides are typically multiply charged under ESI conditions, although to a lesser extent owing to their smaller size. Using high-resolution mass spectrometers, full isotope resolution is obtained, allowing the determination of the monoisotopic mass, typically at mass accuracies below 5 ppm, which is highly specific

Fig. 8. Multidimensional chromatography (LC × LC) of a tryptic digest of trastuzumab combining SCX with RPLC. Unpublished results.

toward identification. LC–MS based peptide mapping of monoclonal antibodies typically covers over 98% of the sequence, thereby confirming identity. Peptides that are not detected are usually small, and their signal might be suppressed in the column flow-through. LC–MS data analysis is facilitated by the fact that the predicted primary sequence is known from the cDNA sequence used to transfect the host cell, and it is commonly performed in an automated manner. Next to confirming the sequence, LC–MS based peptide mapping reveals modifications and modification sites. Comparing the peak areas of the native and modified peptides furthermore allows the modifications to be quantified. Examples of modifications observed in trastuzumab, for which the elution pattern is shown in Fig. 7, are asparagine deamidation on the Lc (7%), cyclization of the N-terminal Glu on the Hc with the formation of pyroGlu (3%), Lys truncation at the C-terminus of the Hc (99%) and N-glycans at the peptide containing the consensus sequence for N-glycosylation (98%). A small amount of the non-glycosylated peptide containing the consensus sequence for glycosylation is detected. Implementing tandem MS systems provides fragmentation data on the peptides, further enhancing confidence in peptide identity, sequence, modifications and modification sites. Fragmentation typically occurs via collision induced dissociation (CID); however, complementary information can be obtained using electron capture or transfer dissociation (ECD, ETD). MS/MS

experiments are commonly executed in a data-dependent manner where the instrument software selects the precursors in a survey run based on abundance or charge state. Precursors are then filtered using, for example, a quadrupole and subsequently fragmented. To increase coverage, previously fragmented precursors are typically placed on an exclusion list for a certain amount of time, allowing the fragmentation of other precursors. Despite being highly informative, this strategy might not cover all peptides of interest in one run and often has to be complemented with subsequent targeted MS/MS runs to obtain more insight into the lower abundant species. Recently, data independent acquisition has been described, implementing alternating low and elevated collision energy scanning in the absence of precursor selection. It offers the advantage of acquiring fragmentation data on all species in a single run but evidently puts more constraints on the data analysis since fragments are decoupled from the precursors [156]. Peptides fragment in a predictable manner under CID conditions (along the peptide bond), giving rise to so-called y- and b-ions, containing the C- and N-terminus of the peptide, respectively. The mass difference between successive y- or b-ions corresponds to the residue mass of the amino acids. Stable modifications, such as deamidation and oxidation, remain on the peptide backbone upon CID, allowing exact modification sites to be revealed. This is illustrated in Fig. 9. CID typically does not retain labile modifications on the backbone, and in the spectra of N-glycosylated peptides the sugar fragments often dominate the peptide fragments because the glycosidic bonds are more labile than the peptide bonds (Fig. 9). This feature can be exploited to easily recognize glycosylated peptides in MS/MS spectra and formed the basis of early glycopeptide analysis using precursor ion scanning [157–159].
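The way a stable modification is localized from b- and y-ions can be sketched in a few lines of Python. The example below computes singly charged b- and y-ion m/z values for the Hc peptide IYPTNGYTR with and without deamidation of the asparagine. The monoisotopic residue masses and the deamidation shift of +0.984016 Da are standard values; the function name, the modification dictionary and the choice of singly charged ions are assumptions made for illustration and do not describe the software used by the authors.

```python
# Monoisotopic residue masses (Da)
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276
DEAMIDATION = 0.984016  # Asn -> Asp/isoAsp mass shift


def fragment_ions(peptide, mods=None):
    """Return lists of singly charged b- and y-ion m/z values.

    `mods` maps a 1-based residue position to a mass shift in Da.
    Index k-1 of each list holds b_k or y_k.
    """
    mods = mods or {}
    masses = [MONO[aa] + mods.get(i + 1, 0.0) for i, aa in enumerate(peptide)]
    b_ions, y_ions = [], []
    running = 0.0
    for mass in masses[:-1]:            # N-terminal fragments
        running += mass
        b_ions.append(running + PROTON)
    running = 0.0
    for mass in reversed(masses[1:]):   # C-terminal fragments
        running += mass
        y_ions.append(running + WATER + PROTON)
    return b_ions, y_ions


native_b, native_y = fragment_ions("IYPTNGYTR")
deam_b, deam_y = fragment_ions("IYPTNGYTR", {5: DEAMIDATION})
for k in range(1, 9):
    shift = deam_y[k - 1] - native_y[k - 1]
    print(f"y{k}: native {native_y[k - 1]:.4f}  shift {shift:+.4f}")
```

In this sketch the y1 to y4 ions are unchanged while y5 and larger y-ions carry the +0.984 Da shift, which places the modification on the fifth residue counted from either terminus, i.e. the asparagine of the Asn-Gly motif.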
As pointed out earlier, the alternative fragmentation modes ECD and ETD cleave the N–Cα bond with the formation of z- and c-ions. In contrast to CID, ETD and ECD typically retain labile modifications, such as glycosylation [148,160–164], on the protein backbone and allow the cleavage of disulfide bonds [137–139]. They have furthermore been shown to differentiate isoaspartic acid from aspartic acid in a peptide sequence by the presence of reporter ions (c + 57 and z − 57) unique to the former [165–167]. The combined use of CID and ETD in protein biopharmaceutical characterization is particularly interesting. With CID generating sugar-specific fragments and ETD generating peptide-specific fragments, full glycopeptide characterization can be obtained [148,160,161,163,164]. Using a data-dependent multifragmentation approach making use of CID, ETD and CID-MS3 on ETD-derived fragments, the disulfide scrambling pattern in monoclonal antibodies and 17 disulfide bridges in recombinant tissue plasminogen activator (tPA) have been characterized [138,139]. Interestingly, it has been shown that analysis of disulfide bond containing peptides in the negative electrospray ionization mode also results in specific fragmentation at the sulfur bond, generating peptides with and without disulfide bridges and thereby facilitating the recognition of the former [168]. An explosion of activity is witnessed in the area of disulfide bridge characterization. Recently, a method has been described for automatic disulfide bond assignment using a1 ion screening of dimethyl labeled peptides [140]. Dimethyl labeling favors the formation of so-called a1 ions, resulting from the cleavage of Cα–C bonds, upon CID fragmentation. For disulfide linked peptides, multiple a1 ions can be observed due to the presence of multiple N-termini.
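When searching LC–MS data for disulfide linked peptides, the expected mass follows from simple arithmetic: each disulfide bond removes two hydrogen atoms from the sum of the reduced peptide masses. The Python sketch below illustrates this under stated assumptions; the peptide masses are hypothetical round numbers and the function names are not taken from any cited method.

```python
HYDROGEN = 1.007825  # monoisotopic mass of a hydrogen atom (Da)
PROTON = 1.007276    # mass of a proton (Da)


def disulfide_linked_mass(reduced_masses, n_bonds=1):
    """Monoisotopic mass of peptides joined by `n_bonds` disulfide bridges.

    `reduced_masses` are the neutral masses of the fully reduced peptides;
    each bridge formed costs two hydrogen atoms.
    """
    return sum(reduced_masses) - 2.0 * HYDROGEN * n_bonds


def mz(neutral_mass, charge):
    """m/z of a positively charged ion carrying `charge` protons."""
    return (neutral_mass + charge * PROTON) / charge


# Hypothetical reduced peptide masses (Da), for illustration only.
linked = disulfide_linked_mass([1000.0, 2000.0])
for charge in (2, 3, 4):
    print(charge, round(mz(linked, charge), 4))
```

The same function covers an intrachain bridge within a single linear peptide by passing one mass, in which case the peptide is simply 2.016 Da lighter than its reduced form.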
To cope with disulfide linkages that might reside within a linear peptide, thereby generating only one a1 ion, different proteolytic enzymes are used within the workflow. This method is especially powerful in cases where no prior knowledge on the possible disulfide linkages is present. Another elegant strategy that does not rely on prior knowledge makes use of post-column partial reduction [141]. This method allows for simultaneous detection of

the disulfide linked peptides and their corresponding free cysteine containing peptides in the same spectrum. This method has been shown to be able to detect scrambled disulfide bonds. The resolution and mass accuracy offered by modern mass spectrometers make it possible to use isotopic labeling to reveal otherwise inaccessible structural features. In that respect, ¹⁸O labeling has been used to identify and quantify succinimides in proteins [135]. The identification of succinimide intermediates in proteins is challenging because they hydrolyze rapidly to isoaspartic and aspartic acid under the neutral to basic pH conditions used for enzymatic digestion. Performing succinimide hydrolysis in H₂¹⁸O prior to digestion in H₂¹⁶O allows the succinimide to be quantified by monitoring the amount of ¹⁸O incorporation in isoaspartic and aspartic acid containing peptides. ¹⁸O labeling has furthermore been used to discriminate aspartate from isoaspartate containing peptides and to determine deamidation artifacts introduced by sample preparation [129,169]. Of special interest is the incorporation of deuterium during H/D exchange experiments [170–178]. When a protein is placed in D₂O, backbone amide hydrogens can exchange with deuterium depending on solvent accessibility. Exposed and dynamic regions of proteins exchange rapidly while protected and rigid regions exchange more slowly. Peptide mapping following deuterium exchange makes it possible to discriminate exposed from buried regions/amino acids and as such has the potential to indirectly assess higher order structures and protein dynamics. Enzymatic digestions are commonly performed with acid proteases such as pepsin to prevent back exchange during proteolysis.
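The read-out of an H/D exchange experiment at the peptide level can be illustrated with a small calculation. The sketch below assumes the common convention that each residue contributes one exchangeable backbone amide hydrogen, except the N-terminal residue and prolines, and that relative uptake is normalized against a fully deuterated control; some workflows also exclude the second residue. The centroid masses are hypothetical values chosen for illustration, and the function names are not from the cited studies.

```python
def exchangeable_amides(peptide):
    """Count backbone amide hydrogens that can report on exchange.

    Assumption: one per residue, excluding the N-terminal residue
    (free amine) and prolines (no amide hydrogen).
    """
    return sum(1 for residue in peptide[1:] if residue != "P")


def relative_uptake(m_t, m_undeuterated, m_fully_deuterated):
    """Fractional deuterium uptake from centroid masses (Da)."""
    return (m_t - m_undeuterated) / (m_fully_deuterated - m_undeuterated)


peptide = "IYPTNGYTR"
n_sites = exchangeable_amides(peptide)
# Hypothetical centroid masses for illustration only.
fraction = relative_uptake(m_t=1087.0, m_undeuterated=1084.5,
                           m_fully_deuterated=1090.5)
print(n_sites, round(fraction, 3), round(fraction * n_sites, 2))
```

Comparing such fractional uptake values for the same peptide across two samples, for example a reference product and a modified variant, is what indicates whether a region has become more exposed or more protected.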
The technique has been used in biopharmaceutical comparability studies [70,173], to study conformational changes upon G-CSF PEGylation [172], glucocerebrosidase oxidation [44,106,107], interferon alkylation [108] and mAb deglycosylation [174], to assess the impact of differential mAb galactosylation, fucosylation and oxidation on conformation and receptor binding [175], to characterize the conformation of collected mAb charge variants [176], to study antigen-antibody interaction (epitope mapping) [177,178], etc. Next to dissecting primary and higher order structures, MS measurement at the peptide level has recently been proposed for assessing pharmacokinetic (PK) properties [179–188]. Ligand binding assays (LBA) are commonly used for the quantification of proteins in complex matrices such as serum and plasma and can, hence, be considered the reference technique in PK studies. LBAs are characterized by high throughput and sensitivity but suffer from, among others, long method development times and potential interferences from other proteins present in the matrix [189,190]. These drawbacks are addressed by MS-based approaches, which offer the advantage of fast method development, high accuracy and precision due to the possibility of internal standardization, and high specificity since chemical information on the analyte under investigation is measured. Throughput, on the other hand, is low compared to LBAs. In contrast to structural characterization, which aims at full sequence coverage and benefits from the use of high resolution MS, mass spectrometric protein quantification is typically based on the analysis of one or several surrogate peptides using multiple reaction monitoring (MRM) on triple quadrupole instruments. In the selection of the quantifying peptide, several factors should be considered [179,180]. The primary selection criterion for the surrogate peptide(s) used for quantification is uniqueness to the protein of interest.
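The uniqueness criterion can be screened in silico before any measurement. The following Python sketch digests a target protein and a set of background proteins and keeps only target peptides absent from the background. It assumes the usual trypsin rule (cleavage after Lys or Arg unless followed by Pro), exact sequence matching and a minimum peptide length of six residues; all sequences and function names are hypothetical and serve only as illustration.

```python
def tryptic_peptides(sequence):
    """Fully cleaved tryptic peptides (cleave after K/R unless followed by P)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def unique_surrogates(target, background, min_length=6):
    """Target peptides of sufficient length not found in any background protein."""
    background_peptides = set()
    for protein in background:
        background_peptides.update(tryptic_peptides(protein))
    return [
        peptide for peptide in tryptic_peptides(target)
        if len(peptide) >= min_length and peptide not in background_peptides
    ]


# Toy sequences (hypothetical, not real proteins).
target = "DTLMISRIYPTNGYTRGK"
background = ["AAKDTLMISRGGR"]
print(unique_surrogates(target, background))
```

In this toy example only IYPTNGYTR survives: DTLMISR also occurs in the background protein and GK is too short to be specific. In practice the background would be the proteome of the matrix species rather than a single sequence.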
Secondary selection criteria are based on the mass spectrometric (ionization efficiency and MS/MS fragmentation characteristics) and chromatographic properties of the peptide(s). Unstable sequences (e.g. Asp-Gly and N-terminal glutamine), amino acids prone to chemical modification (methionine) and miscleaved peptides should also be avoided. A universal surrogate peptide to enable MS-based quantification of a diversity of human

Fig. 9. Fragmentation (CID) data obtained on native and deamidated ASQDVNTAVAWQQKPGK (a), native and deamidated IYPTNGYTR (b), native and oxidized DTLMISR (c) and glycosylated peptide EEQYNSTYR + G1F (d) as observed in a trastuzumab tryptic digest (Fig. 7). Glycan symbols correspond to those reported in Fig. 1. The m/z values and y/b annotations presented in italic, differentiate modified from native fragments and allow to assign modification sites. Unpublished results.

mAbs and Fc-fusion protein drug candidates in pre-clinical animal studies has recently been described [185]. Although MRM allows highly sensitive detection, the true sensitivity of an MRM assay is constrained by the complexity of the matrix. In that respect, plasma is a challenging matrix given the enormous number of proteins encountered and the presence of the biopharmaceutical at concentrations several orders of magnitude lower than that of the main plasma proteins, such as albumin (35–50 mg/mL) and immunoglobulins [191]. The incorporation of a trypsin digestion step further increases complexity. Sample preparation is therefore a critical step to reach the required sensitivity. Chromatographic pre-fractionation, solid-phase extraction, protein depletion and immuno-affinity enrichment have successfully been applied to quantify low concentrations of proteins [179–187,192]. The ability to incorporate internal standardization, correcting for variations in the analyte response caused by variability in the analytical procedure, is a benefit of MS-based assays over LBAs. A recent review describes the different internal standard options available with their advantages and disadvantages [193]. Isotopically labeled variants of the surrogate peptides created using solid-phase peptide synthesis are particularly popular. An isotopically labeled whole antibody internal standard has recently been described which can be used to quantify a wide range of mAbs. This internal standard, which is added to the sample early on in the analytical workflow corrects for all technical variability including trypsin digestion, fractionation and LC–MS analysis [182]. A validated LC–MS based methodology for the quantification of an IgE binding next-generation biotherapeutic in cynomolgus monkey plasma has recently been described [184]. Nanobody quantification was performed making use of a surrogate tryptic peptide that was chromatographically enriched prior to LC–MS analysis. 
The validated lower limit of quantification (LLOQ) of the method was 36 ng/mL and was measured with an intra- and interassay precision and accuracy
