Improving Near-Infrared Prediction Model Robustness with Support Vector Machine Regression: A Pharmaceutical Tablet Assay Example

Benoît Igne,a James K. Drennen III,a,b Carl A. Anderson a,b,*

a Duquesne University Center for Pharmaceutical Technology, School of Pharmacy, 600 Forbes Avenue, Pittsburgh, PA 15282 USA
b Duquesne University, Graduate School of Pharmaceutical Science, 600 Forbes Avenue, Pittsburgh, PA 15282 USA

Changes in raw materials and process wear and tear can have significant effects on the prediction error of near-infrared calibration models. When the variability that is present during routine manufacturing is not included in the calibration, test, and validation sets, the long-term performance and robustness of the model will be limited. Nonlinearity is a major source of interference. In near-infrared spectroscopy, nonlinearity can arise from light path-length differences that can come from differences in particle size or density. The usefulness of support vector machine (SVM) regression to handle nonlinearity and improve the robustness of calibration models was evaluated in scenarios where the calibration set did not include all the variability present in the test set. Compared to partial least squares (PLS) regression, SVM regression was less affected by physical (particle size) and chemical (moisture) differences. The linearity of the SVM predicted values was also improved. Nevertheless, although visualization and interpretation tools have been developed to enhance the usability of SVM-based methods, work is yet to be done to provide chemometricians in the pharmaceutical industry with a regression method that can supplement PLS-based methods.

Index Headings: Near-infrared spectroscopy; Robustness; Partial least squares; Support vector machines regression; Pharmaceuticals.

Received 6 February 2014; accepted 14 May 2014.
* Author to whom correspondence should be sent. Email: [email protected]. DOI: 10.1366/14-07486
Volume 68, Number 12, 2014

INTRODUCTION

Measurements performed using near-infrared (NIR) diffuse reflection are sensitive to the chemical composition and the physical state of the samples.1 As a consequence, the development of a calibration model for near-infrared spectroscopy (NIRS) must include the necessary variance in these parameters to successfully predict the variability not only of the calibration, test, and validation samples, but also of the routine manufacturing conditions. Thus, a prediction model should be developed using samples including most, if not all, the variability that will be encountered during the material’s routine use.2,3 Although it is possible to include variability that is known or anticipated, it is impossible to foresee the state of all the samples that are to come (in the next crop year or next batches). In the pharmaceutical industry, these sources of variability are well known, but their effect on the model is not always well characterized. Variability can come from three main sources: the raw materials, the process, and the NIRS instrument.

The effect of the raw-material variability can be significant if not taken into account. Over time, different lots of raw material will exhibit different particle-size distributions, different physical form ratios (when applicable), different surface properties, and different humidity levels (often related to the seasons during which they were manufactured or stored). Igne et al.4 studied how changes in excipient particle size and active physical form affected the performance of an NIRS calibration model. The authors showed that an untrained, non-robust model is sensitive to varying particle sizes and may not be able to determine that the active ingredient has changed physical form.4 In another study, Igne et al.5 examined the effect of moisture on model performance and showed that even a limited gain or loss of moisture by the sample can cause a significant increase in the prediction error.

Process variability, often affected by the raw-material variability, will express itself in varying product-attribute distributions. For instance, a change in the particle-size distribution of an excipient may modify the density of the resulting tablets if no corrective actions are taken at the tablet press. That same change in density may affect the spectral slope, possibly degrading model performance.6 Consequently, robustness to physical differences must be built into the calibration model to mitigate the effects on predicted values.

Another major source of signal modification comes from the aging of the instrument. Examples include the decrease in intensity of the source with use and a decrease in precision of the moving parts as an instrument ages. However, NIR instruments come with performance qualification procedures or system suitability tests that are often performed before use and that will detect such problems. The United States Pharmacopeia (USP) Convention <1119>7 system suitability test can be used to assess instrument performance.
Therefore, although a model may not need to be robust to wavelength shifts because an operator should not use the instrument if a performance test has determined that it does not meet preset qualification criteria, the instrument will need to be exposed to changes in light intensity and varying structured noise levels to ensure that changing a lamp or changing the excipient supplier will not affect the accuracy of the predicted values. The analytical method needs to be made robust to as many sources of variability as foreseeably encountered during its expected lifecycle. As a result, building a useful calibration model requires careful planning. The chemometrician must

0003-7028/14/6812-1348/0 © 2014 Society for Applied Spectroscopy

APPLIED SPECTROSCOPY

design the right experimental plan and have a sense of the potential changes in the manufacturing process to build a model that will have seen enough variability to be robust while retaining good predictive ability. A risk assessment methodology is recommended to identify the potential uncertainties that an analytical model will face. Choices can then be made to develop the best model for the intended purpose.

The selection of the regression method is important. Although PLS regression8 is the method of choice in the pharmaceutical industry, and more particularly for solid oral-dosage forms of medication, other regression methods have been widely adopted in other fields of research: classical least squares (CLS),9–11 support vector machine (SVM),12–16 and artificial neural networks (ANN).17,18 Note that other regression methods exist and have been successfully applied to spectroscopic data.19 Classical least squares regression possesses significant advantages because it is based on first principles and the theoretical framework proposed by the Beer–Lambert law. However, in diffuse reflectance and transmittance, the light path length is not constant and creates deviations from the law. In response, modifications of the traditional CLS algorithm have been proposed20–22 and successfully used for pharmaceutical powders and tablets.23 With respect to SVM and ANN, very limited work has been done for use with pharmaceutical products.24,25

Nonlinear regression methods address situations where, as described by International Conference on Harmonisation publication Q2(R1),26 the ability (within a given range) to obtain test results that are directly proportional to the concentration of analyte in the sample is not observed. Due to the nature of the NIR signal, the lack of linearity is common (e.g., due to varying light path length). Numerous techniques have been employed to address these issues.
The use of preprocessing methods is widely accepted to mitigate differences in path length. Multiplicative scatter correction (MSC),27 extended MSC,28 and standard normal variate29 are typically selected to correct for scattering effects and consequently improve the linearity of the model. Removing variables responsible for the nonlinearity is also a common approach to refining models and reducing the effects of interfering components and external parameters (moisture, density differences, etc.). When linearity is restricted to a limited range in the concentration of the analyte of interest, the use of local regression methods can be successful. Although local regression methods are used in other fields of research, to the knowledge of the authors no work has been published on their implementation for pharmaceutical products. Finally, PLS regression can compensate for nonlinearity by adding additional latent variables to the model, but this increases the risk of overfitting and must be carefully evaluated. As previously mentioned, another approach to solving nonlinearity issues is to employ nonlinear regression methods. Nonlinear methods fit independent variables that are not linear with respect to the parameter of interest using a nonlinear combination of the model parameters (as opposed to linear techniques, which require a linear combination of model parameters).
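As an illustration of the scatter-correction step described above, the following minimal sketch implements MSC and standard normal variate with NumPy. It assumes that spectra are stored row-wise in an array and that the mean spectrum serves as the MSC reference; the function names `msc` and `snv` and the synthetic spectra are illustrative and are not taken from the study.

```python
import numpy as np

def msc(spectra, reference=None):
    """Multiplicative scatter correction: regress each spectrum on a
    reference spectrum, then remove the fitted offset and slope."""
    spectra = np.asarray(spectra, dtype=float)
    # Assumption: the mean spectrum is the reference unless one is supplied.
    ref = spectra.mean(axis=0) if reference is None else np.asarray(reference, dtype=float)
    corrected = np.empty_like(spectra)
    for i, spectrum in enumerate(spectra):
        slope, intercept = np.polyfit(ref, spectrum, 1)
        corrected[i] = (spectrum - intercept) / slope
    return corrected

def snv(spectra):
    """Standard normal variate: center and scale each spectrum individually."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std

# Synthetic example: one band shape distorted by sample-specific gains and offsets,
# a crude stand-in for path-length (scatter) differences between tablets.
wavelengths = np.linspace(1100.0, 2500.0, 200)
band = np.exp(-((wavelengths - 1700.0) / 80.0) ** 2)
gains = np.array([0.8, 1.0, 1.3])
offsets = np.array([0.05, 0.00, -0.02])
raw = gains[:, None] * band[None, :] + offsets[:, None]

msc_spectra = msc(raw)  # rows collapse onto the reference spectrum
snv_spectra = snv(raw)  # each row has zero mean and unit standard deviation
```

In this idealized case the distortion is purely additive and multiplicative, so both corrections remove it completely; real spectra also carry chemical differences that the corrections are meant to preserve.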

Artificial neural networks and SVM have generated tremendous interest in the past few decades. Artificial neural networks are a family of techniques originally developed for classification; they are based on a network of individual interconnected neurons located on different layers (typically input, hidden, and output layers). A weight and an activation function are associated with each neuron, and it is the adaptation of these weights by back propagation that allows ANN to fit the input data. Nevertheless, ANN presents three main drawbacks. First, a large amount of data is needed (the number of observations must be larger than the number of weights to evaluate). The second difficulty lies in the number of parameters to tune (activation function, network structure, weight adaptation functions, etc.), which requires mastery of the technique. Finally, the error plane of ANN can exhibit local minima that do not represent the best fit of the training data. A solid validation strategy must be used to limit under- and overfitting. The theory and applications of ANN to NIRS can be found in Borggaard.17

Support vector machines were also initially used for classification purposes. They were developed, using the structural risk-minimization framework,13 to perform accurately on limited data that present nonlinear relationships. This framework aims at reducing the risk of overfitting while maintaining good prediction/classification performance. Similar to SVM classification, which looks for the maximum margin between groups, SVM regression tries to minimize the prediction error relative to an error rate determined by the user (ε-insensitive loss function). A least squares optimization criterion also exists.
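The difference between the two optimization criteria mentioned above can be made concrete with a short sketch. The function names and the tolerance value are hypothetical and chosen only for illustration: the ε-insensitive loss ignores residuals that fall inside a tube of half-width ε, whereas the least squares criterion penalizes every nonzero residual.

```python
def epsilon_insensitive_loss(residual, epsilon):
    """Zero inside the +/- epsilon tube, linear outside it."""
    return max(0.0, abs(residual) - epsilon)

def squared_loss(residual):
    """Least squares criterion: every nonzero residual is penalized."""
    return residual ** 2

residuals = [-0.30, -0.05, 0.00, 0.08, 0.25]
epsilon = 0.10  # hypothetical tolerance, in the units of the response

tube_losses = [epsilon_insensitive_loss(r, epsilon) for r in residuals]
ls_losses = [squared_loss(r) for r in residuals]
```

Only the first and last residuals lie outside the tube, so only they contribute to the ε-insensitive loss.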
The main advantages of SVM are that the error plane presents only one minimum (a unique and stable solution) and that only a limited number of parameters need to be determined: the error rate (for the ε-insensitive loss function only), the regularization parameter (γ), and the parameter of the kernel function (σ²). Regularization and kernel parameters can be determined by cross-validation. The kernel function (or kernel trick) is an approach to changing the original space of the data to an inner product space in which the data are linearly related to the parameter of interest. The kernel trick can also be used for principal component analysis, principal component regression, and PLS to help improve linearity.30,31

Given the regulatory framework in which the pharmaceutical industry operates, the stochastic nature of the ANN makes it unlikely that it will be widely adopted in the near future. However, SVM regression presents many advantages that could potentially be beneficial to the Quality-by-Design initiative by ensuring that the process analytical technologies based on NIRS methods are more linear and robust while maintaining specificity and performance already achieved using the more traditional PLS regression method. In this study, the potential to use SVM for the prediction of individual tablet assays was evaluated. A discussion of the suitability of SVM for spectral modeling in pharmaceutical applications is then provided.
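To make the role of the regularization and kernel parameters concrete, the sketch below fits a least squares SVM regression with a radial basis function kernel by solving the usual LS-SVM linear system. It is a minimal NumPy illustration under stated assumptions, not the toolbox code used in the study: the function names are hypothetical, the data are synthetic, and the kernel is written as exp(-||x - z||² / σ²), which is one common convention (others divide by 2σ²).

```python
import numpy as np

def rbf_kernel(A, B, sigma2):
    """RBF kernel matrix, K[i, j] = exp(-||a_i - b_j||^2 / sigma2).
    Assumption: some packages divide by 2 * sigma2 instead."""
    sq_dist = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq_dist, 0.0) / sigma2)

def lssvm_fit(X, y, gamma, sigma2):
    """Solve [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = y.size
    system = np.zeros((n + 1, n + 1))
    system[0, 1:] = 1.0
    system[1:, 0] = 1.0
    system[1:, 1:] = rbf_kernel(X, X, sigma2) + np.eye(n) / gamma
    solution = np.linalg.solve(system, np.concatenate(([0.0], y)))
    return solution[1:], solution[0]  # alpha coefficients, offset b

def lssvm_predict(X_new, X_train, alpha, b, sigma2):
    """Prediction is a kernel-weighted sum over the training samples."""
    return rbf_kernel(X_new, X_train, sigma2) @ alpha + b

# Synthetic nonlinear relationship (illustrative only).
X_train = np.linspace(0.0, 2.0 * np.pi, 40)[:, None]
y_train = np.sin(X_train[:, 0])

gamma, sigma2 = 1000.0, 1.0  # hypothetical settings; normally tuned by cross-validation
alpha, b = lssvm_fit(X_train, y_train, gamma, sigma2)
fitted = lssvm_predict(X_train, X_train, alpha, b, sigma2)
```

Because the solution comes from a single linear system, there is one minimum for a given pair (γ, σ²). In this formulation each alpha coefficient equals γ times the residual of the corresponding training sample.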

MATERIALS AND METHODS

Sample Production and Experimental Designs. A formulation of acetaminophen (APAP; Rhodapap, Rhodia


TABLE I. Design of experiment (% w/w).

Design   Acetaminophen  Hypromellose  Intra-granular  Microcrystalline  Extra-granular  Magnesium  Excipient
number                                lactose         cellulose         lactose         stearate   ratio
1        19.11          2.73          5.46            51.77             20.43           0.50       2
2        23.21          3.32          6.63            48.65             17.70           0.50       2
3        27.30          3.90          7.80            45.53             14.97           0.50       2
4        31.40          4.49          8.97            42.41             12.24           0.50       2
5        35.49          5.07          10.14           39.29             9.51            0.50       2
6        19.11          2.73          5.46            38.83             33.37           0.50       1
7        23.21          3.32          6.63            36.49             29.86           0.50       1
8        27.30          3.90          7.80            34.15             26.35           0.50       1
9        31.40          4.49          8.97            31.81             22.84           0.50       1
10       35.49          5.07          10.14           29.47             19.33           0.50       1
11       19.11          2.73          5.46            25.89             46.31           0.50       0.5
12       23.21          3.32          6.63            24.33             42.02           0.50       0.5
13       27.30          3.90          7.80            22.77             37.73           0.50       0.5
14       31.40          4.49          8.97            21.21             33.44           0.50       0.5
15       35.49          5.07          10.14           19.65             29.15           0.50       0.5

Organique, Roussillon, France), lactose (monohydrate NF-product 316/Fast-Flo, modified spray-dried; Foremost Farms USA, Rothschild, WI), microcrystalline cellulose (MCC; Avicel PH 200, FMC Biopolymer, Mechanicsburg, PA), hypromellose (HPMC; Pharmacoat 606, Shin-Etsu Chemical Co. Ltd., Tokyo, Japan), and magnesium stearate (MgSt; Mallinckrodt, Hazelwood, MO) was used. Granules of APAP (70% w/w), lactose (20% w/w), and HPMC (10% w/w) were manufactured in 5 kg batches using a fluid bed granulator and dryer (model WSG 5; Glatt, Binzen, Germany). The end-moisture target was 0.25%, and the target median particle size was 250 μm. Granules were subsequently sieved to eliminate aggregates larger than 500 μm and were blended with extragranular lactose, MCC, and MgSt.

Two tablet manufacturing scales were used: laboratory and pilot. A five- by three-level, two-factor full-factorial design (active contents of 19.11, 23.21, 27.30, 31.40, and 35.49% w/w, corresponding to the nominal concentration and ±15% and ±30% of it, and MCC : lactose ratios of 0.5, 1, and 2) was created in the laboratory. Three replicate tablets per level were manufactured from the same blend for a total of 45 samples. At the pilot scale, only tablets at the nominal excipient ratio (MCC : lactose ratio of 1) and APAP levels at the target and ±30% of the nominal concentration were manufactured. Table I summarizes the design.

For the laboratory samples, materials for each design point were weighed and transferred into 40 mL plastic vials in accordance with the five-level, two-factor, full-factorial design described for the creation of the calibration set. The materials were mixed for 15 min in a 5.5 L bin-blender (L.B. Bohle, Ennigerloh, Germany) by placing the vials in a foam insert that was previously fit to match the dimensions of the bin-blender.
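The laboratory design can be regenerated from a few rules, as in the sketch below. The rules are inferred from the granule composition (70/20/10 APAP/lactose/HPMC), the fixed 0.50% MgSt level, and the values in Table I, where the excipient ratio appears to be MCC to total (intra- plus extra-granular) lactose; this reading and all names in the code are assumptions made for illustration.

```python
from itertools import product

NOMINAL_APAP = 27.30                          # % w/w, center of the design
APAP_FACTORS = (0.70, 0.85, 1.00, 1.15, 1.30)  # nominal, +/-15%, +/-30%
MCC_LACTOSE_RATIOS = (2.0, 1.0, 0.5)
MGST = 0.50                                   # % w/w, constant in Table I
REPLICATES = 3
COMPONENTS = ("apap", "hpmc", "lactose_intra", "mcc", "lactose_extra", "mgst")

def composition(apap, ratio):
    """Blend composition (% w/w) for one design point.
    Assumption: ratio = MCC / (intra- + extra-granular lactose)."""
    hpmc = apap * 10.0 / 70.0            # granules are 70/20/10 APAP/lactose/HPMC
    lactose_intra = apap * 20.0 / 70.0
    rest = 100.0 - apap - hpmc - lactose_intra - MGST  # MCC + extra-granular lactose
    total_lactose = (rest + lactose_intra) / (1.0 + ratio)
    return {
        "apap": apap,
        "hpmc": hpmc,
        "lactose_intra": lactose_intra,
        "mcc": ratio * total_lactose,
        "lactose_extra": total_lactose - lactose_intra,
        "mgst": MGST,
        "ratio": ratio,
    }

# Same ordering as Table I: ratio 2 first, then 1, then 0.5.
design = [composition(NOMINAL_APAP * f, r)
          for r, f in product(MCC_LACTOSE_RATIOS, APAP_FACTORS)]
n_tablets = len(design) * REPLICATES
```

Rounded to two decimals, the computed compositions agree with Table I to within rounding, which supports the assumed definition of the excipient ratio.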
Subsequently, approximately 700 mg of powder was weighed from each design point and manufactured into compacts on an automatic tablet press (Model 3887.1SD0A00; Carver, Wabash, IN) using a 13 mm die and flat-faced punches. For the pilot-scale samples, granules and extra-granular excipients were mixed in a 3.5 quart V-blender for a total blend mass of 1 kg for 45 min according to the design of experiment for the creation of the calibration and test samples. The blends were subsequently compacted on a


38-station rotary tablet press (Elizabeth-Hata International, Inc., North Huntingdon, PA). The stations were tooled using 3/8 in. (9.5 mm) diameter, round biconvex punches and the corresponding dies. Only two stations were tooled to facilitate the collection of the tablets post-ejection. Three tablets were collected every minute for 20 min. The target tablet weight was 350 mg. The turret speed was left constant at 30 rpm, and the target compression force was 8000 kp. To facilitate the model development when using samples from both manufacturing scales, the density of the compacts generated at the laboratory scale matched the density of the tablets made at the pilot scale. Also, to ensure robustness to changes in the tablet water content, a repeat of the laboratory calibration design was subjected to various relative humidity levels. Four chambers were used in which the target relative humidities (RHs) were 11% (saturated solution of lithium chloride), 32% (magnesium chloride), 52% (magnesium nitrate), and 75% (sodium chloride). Tablets were left to equilibrate in each chamber. The equilibration of the mass over storage time at each RH condition was used as an indicator of stability. Tablets were moved from low to high RH to simplify the experimental design.

Near-Infrared Spectroscopy Data Collection. Near-infrared reflection measurements for both sides of the samples were collected using a benchtop scanning monochromator instrument (XDS Rapid Content Analyzer; FOSS NIRSystems, Inc., Laurel, MD). Spectra were collected over the wavelength range of 400–2500 nm at 0.5 nm increments, averaging 32 co-adds per spectrum. Spectra corresponding to each side of a sample were averaged to give one spectrum per sample. A period of two weeks elapsed between compression and spectral collection to allow the tablets to undergo viscoelastic relaxation.
Tablets in the environmental chamber were left for two weeks at room humidity before entering the first environmental chamber (11% RH). Data acquisition from these tablets was performed as soon as their equilibration was complete. The instrument stability testing method outlined in USP <1119>7 was performed during the study to evaluate whether instrument stability was an issue. No differences

SCHEME 1. Description of the three scenarios used to compare the modeling ability of PLS and SVM regressions.

in high and low flux noise, linearity, and wavelength accuracy were observed (results not shown).

Reference Testing. Acetaminophen reference values for all compacts were determined using high-pressure liquid chromatography (HPLC; Waters Alliance 2790, Milford, MA), followed by ultraviolet detection (Waters 2487). Reference testing was performed after all tablets had been equilibrated and scanned at all four RH conditions. A method modified from USP29-NF24 (“Acetaminophen Tablets”)32 was employed. The mobile phase was water : methanol : acetic acid (80 : 17 : 3), and the stationary phase was a 15 cm by 4.6 mm C18 column with 3 μm packing. The detection wavelength was 243 nm. The error of the laboratory scale was estimated at 0.83%.

Experimental Plan. During the course of the study, a total of 6 granulations and 16 tablet batches were manufactured (2 at the laboratory scale and 14 at the pilot scale). Due to seasonal differences, a large variability existed in the particle-size distribution of the granules and in the moisture content of the excipients and granules. The environment of the laboratory in which the experiments were performed was not controlled; consequently, during the winter the air was dry (20–30% relative humidity), but during the summer months it was typical to observe an RH between 60 and 70%. To test the ability of SVM regression to handle the tablet variability that can come from batch-to-batch differences (particle-size differences and moisture differences), three modeling scenarios were created. The first modeling scenario, the risk mitigation scenario, aimed at building a calibration model that could handle all the variability that was anticipated to come in the use of the calibration model. Therefore, that scenario had variability in granulation batches and manufacturing dates included in the calibration set, as a consequence exposing the model to variability in particle size and

moisture. A second scenario, the climate risk scenario, tested the ability of SVM to handle seasonal variability. That scenario included in the calibration only granulations made during the same season and did not include the RH dataset. Finally, the third scenario, the batch risk scenario, exposed the model to variability in climate but not to differences in particle size; all samples in the calibration came from one granulation. Scheme 1 and Table II present, for each scenario, information about the samples included in the calibration and test sets: the granulations, the seasons, the manufacturing scale, the use or non-use of the moisture dataset, and the number of samples. The calibration sets from these three scenarios were then used to evaluate the ability of each regression method to handle the climate and batch variability when predicting a test set generated from a new granulation (one never used in the calibration) and manufactured under summer conditions. Note that, although 16 batches were available, they were not all used. Also, the same number of samples was available for the risk mitigation and batch risk scenarios; however, in the batch risk scenario, all the samples came from the same granulation. Regression Methods. Partial least squares regression with the nonlinear iterative partial least squares (NIPALS) algorithm8 was employed to develop a model for each scenario. The choice of latent variables was determined by comparing the evolution of the root mean square error of calibration (RMSEC) and root mean square error of cross-validation (RMSECV). Support vector machine regression models were developed using the least squares loss function. A radial basis function was used as the kernel, and the optimization was performed using cross-validation and a two-step grid search function. Test samples were not employed to set the model parameters for either regression tool. All calculations were performed using Matlab v. 2011a (The


RESULTS

TABLE II. Details of each modeling scenario.

Parameters                 Risk mitigation  Climate risk  Batch risk
Calibration sets
  Number of samples        405              284           405
  Number of granulations   4                2             1
  Humidity conditions      High/low         Low           High/low
Test set
  Number of samples        60               60            60
  Number of granulations   1                1             1
  Humidity conditions      High             High          High

Mathworks, Natick, MA) equipped with the PLS_Toolbox v. 6.2.1 (Eigenvector Research Inc., Wenatchee, WA) and the LS-SVM lab toolbox v. 1.5 for Matlab developed by Pelckmans et al.33

Model Comparison and Nonlinearity Testing. All PLS and SVM models were evaluated for goodness of fit and for precision and accuracy on the test set (which was fully independent from the calibration set) using the coefficient of determination (r²), root mean square error of prediction (RMSEP), the standard error of prediction (SEP), and the prediction bias (BiasP). Nonlinearity in test predictions was tested following a method proposed by Mark34 to determine whether improvements in linearity were achieved with SVM. Briefly, NIR predictions were fit to reference values using a quadratic function. Reference values represented the x variable, and squared reference values represented the x² variable. The model fit was assessed at the 95% confidence level. If the quadratic term was significant (p < 0.05), the predictions were deemed nonlinear.
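The test-set statistics and the quadratic-term check described above can be sketched as follows. This is a hedged illustration, not the authors' implementation: the function names are hypothetical, the bias is taken as predicted minus reference, r² is taken as the squared correlation, and the p values use a normal approximation to the t distribution (a simplification that is reasonable for test sets of about 60 samples).

```python
import math
import numpy as np

def prediction_stats(y_ref, y_pred):
    """r2, RMSEP, SEP, and bias of predictions against reference values."""
    y_ref = np.asarray(y_ref, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    resid = y_pred - y_ref  # assumption: bias = predicted - reference
    n = resid.size
    bias = float(resid.mean())
    rmsep = math.sqrt(float(np.mean(resid ** 2)))
    sep = math.sqrt(float(np.sum((resid - bias) ** 2)) / (n - 1))
    r2 = float(np.corrcoef(y_ref, y_pred)[0, 1] ** 2)
    return {"r2": r2, "RMSEP": rmsep, "SEP": sep, "bias": bias}

def quadratic_term_test(y_ref, y_pred):
    """Fit y_pred = b0 + b1*x + b2*x^2 with x = y_ref and return p values
    for the linear and quadratic terms (normal approximation)."""
    x = np.asarray(y_ref, dtype=float)
    y = np.asarray(y_pred, dtype=float)
    design = np.column_stack([np.ones_like(x), x, x ** 2])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    dof = x.size - 3
    coef_cov = (resid @ resid / dof) * np.linalg.inv(design.T @ design)
    t_values = coef / np.sqrt(np.diag(coef_cov))
    p_values = [math.erfc(abs(t) / math.sqrt(2.0)) for t in t_values]
    return {"linear": p_values[1], "quadratic": p_values[2]}

# Synthetic example: predictions with a deliberate curvature.
reference = np.linspace(19.0, 35.0, 61)
ripple = 0.1 * (-1.0) ** np.arange(61)  # small deterministic disturbance
curved = reference + 0.02 * (reference - 27.0) ** 2 + ripple

stats = prediction_stats(reference, curved)
p_curved = quadratic_term_test(reference, curved)
is_nonlinear = p_curved["quadratic"] < 0.05
```

The three error terms are linked by RMSEP² = bias² + SEP²(n − 1)/n, so a large RMSEP with a small SEP points to a bias problem rather than a precision problem.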

FIG. 1. Effect of batch-to-batch variability of the NIR spectra. (a) Particle-size variability. (b) Seasonal variability.

Spectral Investigation of the Batch-to-Batch Variability. The observed batch-to-batch variability could be due to differences in particle size and surface water related to differences in granulation and environmental conditions. The spectral effect of these changes can be observed in Fig. 1. Figure 1a displays the raw average spectrum and difference spectrum of two pilot-scale batches made at the target formulation for granulations 1 and 2, manufactured under the same environmental conditions. A baseline difference existed between the two mean spectra, indicating a difference in density.6 Because the same compression force was used for both batches, the difference came from particle-size differences. Sieve analyses were performed on the granules to confirm the results (not shown for brevity). Figure 1b shows the effect of surface moisture on the same granulation. Powders from the same manufacturing batch were stored from December 2012 to June 2013 before blending and compression. No difference in baseline was observable, confirming the effect of particle size. However, the water combination bands arising from the symmetric stretch (ν1), bending (ν2), and asymmetric stretch (ν3) modes at 1200 nm (aν1 + ν2 + bν3; a + b = 2), 1470 nm (aν1 + bν3; a + b = 2), and 1900 nm (aν1 + ν2 + bν3; a + b = 1) exhibited significant modifications compared to the rest of the spectra. A model developed without this variability in raw material (particle-size differences and/or environmental changes) is not anticipated to handle the new variability appropriately. This is the rationale behind the batch risk and climate risk scenarios.

Partial Least Squares Regression Results. The calibration set of each modeling scenario was used to develop models to predict the common test set. Table III and Fig. 2 present the calibration and test statistics. All

TABLE III. Performance of PLS and SVM regression models for the three modeling scenarios.

                                                Calibration         Test
Scenarios          Number of latent variables   r²c    RMSEC        r²p    RMSEP     SEP       Biasp
                   or γ/σ² (a)                         (% w/w)             (% w/w)   (% w/w)   (% w/w)
PLS modeling
  Risk mitigation  6                            0.96   1.38         0.99   0.79      0.75      0.23
  Climate risk     5                            0.97   1.33         0.98   2.71      0.94      2.54
  Batch risk       6                            0.96   1.37         0.99   1.47      0.78      1.24
SVM modeling
  Risk mitigation  1735.70/0.30                 0.98   0.89         0.99   1.03      1.03      0.01
  Climate risk     11367.81/0.52                0.98   0.95         0.99   1.49      0.85      1.23
  Batch risk       12897.98/0.55                0.98   1.01         0.99   0.98      0.73      0.65

(a) γ, regularization parameter; σ², parameter of the kernel function.

spectra were pretreated using a combination of MSC and mean centering. The responses were autoscaled (zero mean, unit variance). Other combinations were tested, but they did not show improvements over the MSC and mean centering combination. As mentioned in the method section, the number of latent variables was determined using the trends of the RMSEC and RMSECV. The test set was not used to determine the model complexity to avoid overfitting. The number of latent variables was six for the risk mitigation scenario, five for the climate risk scenario, and six for the batch risk scenario. Given the complexity of the system (active ingredient concentration variability, excipient ratio variability, batch-to-batch variability, moisture variation, and scale differences), it is not surprising that the number of chemical degrees of freedom was high.
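The selection of the number of latent variables from the RMSEC and RMSECV trends can be sketched as below. The code is a compact NumPy illustration of PLS1 by NIPALS with random five-fold cross-validation; it is not the toolbox implementation used in the study, the function names are hypothetical, and the data are synthetic (three underlying factors plus noise).

```python
import numpy as np

def pls1_fit(X, y, n_lv):
    """PLS1 by NIPALS; returns the regression vector and centering terms."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    E, f = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_lv):
        w = E.T @ f
        w /= np.linalg.norm(w)     # weight vector
        t = E @ w                  # scores
        tt = t @ t
        p = E.T @ t / tt           # X loadings
        c = (f @ t) / tt           # y loading
        E = E - np.outer(t, p)     # deflate X
        f = f - c * t              # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return coef, x_mean, y_mean

def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

def rmsec_rmsecv(X, y, max_lv, n_folds=5, seed=0):
    """RMSEC and random k-fold RMSECV for 1..max_lv latent variables."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    rmsec, rmsecv = [], []
    for n_lv in range(1, max_lv + 1):
        coef, xm, ym = pls1_fit(X, y, n_lv)
        rmsec.append(rmse((X - xm) @ coef + ym, y))
        cv_pred = np.empty_like(y)
        for held_out in folds:
            train = np.setdiff1d(np.arange(len(y)), held_out)
            coef, xm, ym = pls1_fit(X[train], y[train], n_lv)
            cv_pred[held_out] = (X[held_out] - xm) @ coef + ym
        rmsecv.append(rmse(cv_pred, y))
    return rmsec, rmsecv

# Synthetic data with three underlying factors (illustrative only).
rng = np.random.default_rng(1)
scores = rng.normal(size=(90, 3)) * np.array([5.0, 2.0, 1.0])
loadings = rng.normal(size=(3, 50))
X = scores @ loadings + 0.01 * rng.normal(size=(90, 50))
y = scores @ np.array([0.2, 1.0, 3.0]) + 0.05 * rng.normal(size=90)

rmsec, rmsecv = rmsec_rmsecv(X, y, max_lv=6)
```

RMSEC can only decrease as latent variables are added, so the choice rests on where RMSECV stops improving; latent variables added beyond that point mostly fit noise.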

The PLS results are in agreement with the expectations. The calibration errors are higher than the reported laboratory error for all three scenarios. This may be due to the inability of PLS to account for the significant variability present in the calibration set. The RMSEP was lower for the risk mitigation scenario (0.79% w/w) than for the climate risk and batch risk scenarios. The RMSEP was lower than the RMSEC because the variability in the test set was limited compared to that included in the calibration set. When the calibration model had no built-in robustness to moisture variations, it exhibited a significant bias, which is in agreement with results reported in the literature.5 For the batch risk scenario, although the precision term (SEP) was similar to the risk mitigation scenario, the bias was higher. This could be

FIG. 2. Calibration (blue) and test (red) performance of regressions for each modeling scenario. (a–c) The PLS regressions. (d–f) The SVM regressions.


explained by the baseline differences not being corrected by MSC and therefore being unknown to the model.

Support Vector Machine Regression Results. Using cross-validation (random, fivefold, similar to PLS) and a two-step grid-search optimization approach, the kernel and regularization parameters were identified. The grid-search function calculates the mean square errors over a given range of parameters. The set of parameters that gives the lowest error is further refined by zooming in on the error plane and determining more precisely the optimal parameters that correspond to the minimum error. Similar to PLS, all models used MSC to pretreat the spectra. Table III and Fig. 2 present the calibration and test statistics.

The calibration errors were very close to the reported laboratory error, indicating that SVM regression was able to generate models that were more accurate and precise than PLS by better handling the variability in the calibration set. The test-set RMSEP was larger than for PLS for the risk mitigation scenario, and this was due to a decrease in precision because the bias was 0.01% w/w. It is unclear why the RMSEP was high. This could be due to overfitting the model. However, for the two other scenarios, SVM regression presented significantly lower errors than those obtained using PLS. Although bias was still an issue, the overall error was much lower (comparable SEPs but with a twofold reduction in the bias). While not ideal, this shows that the SVM regression is more robust than PLS. This is particularly true for physical variability. Differences in particle size are expressed in the spectra as changing baselines that affect the linearity of the data. The kernel modification of the original space was able to mitigate the effect of varying the path length.
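The two-step grid search described above for the regularization and kernel parameters can be sketched in a few lines. The example is illustrative only: the function names are hypothetical, the search is done on a logarithmic scale (an assumption), and a simple analytic surface stands in for the cross-validated mean square error that would be computed in practice.

```python
import math

def linspace(start, stop, n):
    step = (stop - start) / (n - 1)
    return [start + i * step for i in range(n)]

def grid_minimum(objective, log_gamma_values, log_sigma2_values):
    """Evaluate the objective on a grid and return (error, log_gamma, log_sigma2)."""
    best = None
    for lg in log_gamma_values:
        for ls in log_sigma2_values:
            error = objective(10.0 ** lg, 10.0 ** ls)
            if best is None or error < best[0]:
                best = (error, lg, ls)
    return best

def two_step_grid_search(objective, log_gamma_range, log_sigma2_range, n_points=9):
    # Step 1: coarse grid over the full range.
    gammas = linspace(*log_gamma_range, n_points)
    sigmas = linspace(*log_sigma2_range, n_points)
    coarse = grid_minimum(objective, gammas, sigmas)
    # Step 2: zoom in by one coarse step on either side of the coarse optimum.
    dg, ds = gammas[1] - gammas[0], sigmas[1] - sigmas[0]
    fine = grid_minimum(
        objective,
        linspace(coarse[1] - dg, coarse[1] + dg, n_points),
        linspace(coarse[2] - ds, coarse[2] + ds, n_points),
    )
    error, lg, ls = min(coarse, fine)
    return 10.0 ** lg, 10.0 ** ls, error, coarse[0]

def toy_cv_error(gamma, sigma2):
    # Hypothetical error surface with its minimum at gamma = 100, sigma2 = 0.5.
    return (math.log10(gamma) - 2.0) ** 2 + (math.log10(sigma2) - math.log10(0.5)) ** 2

best_gamma, best_sigma2, best_error, coarse_error = two_step_grid_search(
    toy_cv_error, (-1.0, 5.0), (-3.0, 3.0)
)
```

The refinement step can only lower (or match) the coarse-grid error, at the cost of a second round of cross-validation runs.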
However, although the error was lower for the climate risk scenario, SVM regression was still sensitive to the lack of chemical information in the calibration set (no information was present in the calibration set regarding the effect of surface moisture on the spectra). These results are very encouraging. Support vector machine regression appears to be a suitable alternative to PLS regression for the prediction of individual tablet assays in the presence of significant raw-material variability, something that PLS regression cannot handle well in the absence of a comprehensive survey of the variability included in the calibration set. However, in the best-case scenario, the risk mitigation scenario (the approach that should be implemented when possible), SVM regression did not perform better than the traditional PLS modeling.

Nonlinearity Test. In Fig. 2, the test-set predictions visually appear to exhibit nonlinear trends. Although all the PLS regressions and the SVM climate risk scenario seem to have nonlinear behaviors, the SVM regressions for the risk mitigation and batch risk scenarios did not exhibit such phenomena. Using the method proposed by Mark,34 the test data were evaluated for nonlinearity. Table IV presents the p values for the linear and quadratic terms. With a limit of α = 0.05, all models developed using PLS regression were nonlinear. For SVM regression, only the climate risk scenario produced nonlinear predictions. The improvement in performance with SVM could come in part from its ability to handle the


nonlinearity between the spectra and the assay values. However, when chemical information is missing (moisture variation), SVM regression is as sensitive to nonlinearity as PLS.

DISCUSSION

The results presented for SVM regression are very encouraging. The models appear to be more robust to common sources of variability than PLS is when the information is not present in the calibration set. The gain in linearity of the predicted values is also a significant improvement over PLS. However, compared to the traditional regression methods (PLS and principal component regression), SVM poses a few challenges that remain to be addressed before the method can be adopted by the pharmaceutical industry.

Interpretability is one of the issues limiting the wider use of SVM. It is typical with PLS regressions to use the loading vectors and regression coefficients to estimate the variables on which the model is based. Even though some pretreatment methods can interfere with the interpretation (e.g., derivatives and scaling), it is still much easier to determine the variables of importance than after the data have been modified with kernels. Finally, metrics such as variable influence on projection (VIP) have been developed to identify wavelength ranges of importance to the model.35 For SVM, interpretability is limited because of the transformation of the data using the kernel and the form of the solution. Typical outputs from an SVM regression are a set of alpha coefficients solving the following equation:

$$ f(x) = \sum_{i=1}^{\#\mathrm{SV}} (\alpha_i - \alpha_i^{*}) \, \langle \phi(x_i) \cdot \phi(x) \rangle + b $$

where $\langle \phi(x_i) \cdot \phi(x) \rangle$ corresponds to the data mapped onto the feature space (after kernel transformation), and $\alpha_i$ and $\alpha_i^{*}$ are the Lagrangian multipliers. The $\alpha$ values are weights (the relative importance of the data points contributing to the model) applied to the original input space and cannot be interpreted with respect to the concentration or wavelength of importance. In the least squares support vector machine (LS-SVM) setting, each $\alpha$ value is proportional to the error at that data point.

To address this issue, the research team of Üstün et al.36 proposed in 2007 an approach to overcome the interpretability limitations of SVM. To visualize the spectral information in the kernel space, they used a map generated by calculating the correlation between each wavelength variable and each row of the kernel matrix, representing a similarity measure of a sample with the other samples. When the samples are sorted according to the concentration of the analyte, the map provides a representation of the explanatory and non-explanatory variables. In addition, to interpret the SVM outputs, the same research team proposed calculating a vector p that “yields a characteristic profile of the variables which contribute to the overall model.”36 The vector p is obtained by calculating the inner product of the input matrix with the $\alpha$ vector. Figure 3 presents, for the risk mitigation scenario, the mean spectra of the calibration set and the pure

TABLE IV. Nonlinearity test results.

Regression   Scenario           p, X (a)    p, X² (b)
PLS          Risk mitigation    0           0
PLS          Climate risk       0           0
PLS          Batch risk         0           0
SVM          Risk mitigation    0           0.172
SVM          Climate risk       0           0
SVM          Batch risk         0           0.256

(a) The p value of the linear term.
(b) The p value of the quadratic term.

spectrum of APAP (Fig. 3a), the vector p (Fig. 3b), and the correlation map (Fig. 3c). The p vector shows good similarities with the location of some APAP absorption bands (i.e., 1130, 2152, and 2462 nm), indicating that the SVM regression extracted information relevant to the parameter to be predicted. The correlation map indicates the regions of the spectra that have been emphasized or down-weighted by the kernel. The color gradient shows how one particular region of the spectra is correlated with the concentration of the active ingredient. Thus, there are regions with very significant gradients (i.e., 2400–2500 nm) that correspond to unique peaks of the active ingredient and that were identified by the p vector as being relevant. On the other hand, there are regions where the correlation is very low, indicating that the kernel transformation did not retain information from these bands (i.e., 1500–1600 nm and 2250 nm). However, as reflected in the p vector, the SVM kernel and model used data from the entire wavelength range and were not particularly sensitive to the moisture difference because regions representing interferences would be expected to be down-weighted by the model.

These visualization and interpretation metrics provide insights into SVM regression and should enhance its use in the pharmaceutical industry, where regulators often require an estimation of the origin of the signal used by the model. In addition to these tools, new algorithms for SVM are being developed to ease interpretability by making the outputs of the model a direct function of the original input space (wavelengths).37,38 However, diagnostics remain an area in which SVM regression lags behind PLS and principal component regression. Hotelling’s T² (the distance to the model center) and Q residuals (the unmodeled variance) are widely used to assess the suitability of a model to predict a given sample.39 To the knowledge of the authors, no equivalent statistics exist for SVM regression, and this is a major limitation. Without the ability to check the validity of each prediction, it is more difficult to accept an output as is, especially if it is to be used for controlling a manufacturing process or in real-time release testing situations.
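The p vector and correlation map discussed above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation used in this study or in ref. 36: it assumes an LS-SVM regression with a Gaussian (RBF) kernel, solved as a single linear system, in which one α value per sample plays the role of the (α_i − α_i*) term in the expression for f(x). The spectra are made-up synthetic data, and the function names and the parameters `width` and `reg` are hypothetical.

```python
import numpy as np


def rbf_kernel(A, B, width):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    d2 = (np.sum(A ** 2, axis=1)[:, None]
          + np.sum(B ** 2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * width ** 2))


def fit_ls_svm(K, y, reg):
    """Solve the LS-SVM linear system; returns (alpha, b)."""
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0                       # constraint: sum(alpha) = 0
    A[1:, 0] = 1.0                       # bias term b
    A[1:, 1:] = K + np.eye(n) / reg      # regularized kernel matrix
    sol = np.linalg.solve(A, np.concatenate([[0.0], y]))
    return sol[1:], sol[0]


def predict(K_new, alpha, b):
    """f(x) = sum_i alpha_i * k(x_i, x) + b; rows of K_new are new samples."""
    return K_new @ alpha + b


def p_vector(X, alpha):
    """Inner product of the input matrix with the alpha vector (one value per wavelength)."""
    return X.T @ alpha


def correlation_map(X, K, y):
    """Pearson correlation between each kernel-matrix row and each wavelength
    variable, with samples sorted by y. Assumes no constant row or column."""
    order = np.argsort(y)
    Xs = X[order]
    Ks = K[np.ix_(order, order)]
    Xc = Xs - Xs.mean(axis=0)
    Kc = Ks - Ks.mean(axis=1, keepdims=True)
    num = Kc @ Xc
    den = np.outer(np.linalg.norm(Kc, axis=1), np.linalg.norm(Xc, axis=0))
    return num / den                     # shape: (n_samples, n_wavelengths)


# Synthetic illustration only (made-up spectra, not data from the study).
rng = np.random.default_rng(1)
n_samples, n_wavelengths = 30, 40
conc = rng.uniform(0.8, 1.2, n_samples)
band = np.exp(-0.5 * ((np.arange(n_wavelengths) - 25) / 3.0) ** 2)
spectra = np.outer(conc, band) + rng.normal(0.0, 0.01, (n_samples, n_wavelengths))

K = rbf_kernel(spectra, spectra, width=1.0)
alpha, b = fit_ls_svm(K, conc, reg=10.0)
fitted = predict(K, alpha, b)
p = p_vector(spectra, alpha)
cmap = correlation_map(spectra, K, conc)
```

In this LS-SVM formulation each α value equals `reg` times the training residual at that sample, which is the proportionality to the error mentioned in the text. Plotting `p` against wavelength and `cmap` as an image would give displays analogous to Fig. 3b and Fig. 3c.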

CONCLUSION The suitability of SVM regression for the prediction of individual tablet assays was evaluated. The nonlinear regression method showed significant advantages over the more traditional PLS regression method in terms of robustness and linearity of the prediction data. The SVM regression appeared to be less affected by interference and nonlinearity induced by physical changes in the spectra than by the lack of chemical information. However, when a model has been properly built on a suitable calibration set, it is yet to be proven that SVM possesses advantages over PLS in terms of precision and accuracy. The ability to evaluate a model and diagnose the quality of the predicted values is still an area where SVM can improve. Although tools exist to determine which wavelength regions an SVM model emphasizes, the results are still difficult to interpret. Nevertheless, ongoing improvements in the algorithms should allow SVM to enter the toolbox of chemometricians working with spectroscopic process analytical technologies in the pharmaceutical industry.

FIG. 3. Visualization and interpretability of the risk mitigation modeling scenario. (a) Calibration spectra and pure spectrum of the active ingredient (APAP). (b) Inner product between the spectra of the training set and the α vector. (c) Correlation image with samples sorted by concentration (low to high, y-axis).

ACKNOWLEDGMENTS The research team at the Duquesne Center for Pharmaceutical Technology thanks Metrohm NIRSystem for lending the spectrometer. The authors are grateful to Md. Nayeem Hossain for manufacturing the tablets, collecting the spectral data, and performing reference analysis.

APPLIED SPECTROSCOPY


The authors also thank Robert W. Bondi, Jr., and Sameer Talwar for the help in manufacturing the tablets.

1. D.A. Burns, E.W. Ciurczak. Handbook of Near-Infrared Analysis. New York: Marcel Dekker, Inc., 2001. 2nd ed.
2. T. Naes, T. Isaksson, T. Fearn, T. Davies. “Selection of Samples for Calibration”. In: A User-Friendly Guide to Multivariate Calibration and Classification. Chichester, UK: NIR Publications, 2002. Pp. 191-193.
3. P. Williams. “Implementation of Near-Infrared Technology”. In: P.C. Williams, K. Norris, editors. Near-Infrared Technology in the Agricultural and Food Industries. St. Paul, MN: American Association of Cereal Chemists, 2001. 2nd ed., pp. 145-169.
4. B. Igne, S. Shi, J.K. Drennen, C.A. Anderson. “Effects of Raw Material Variability on the Stability of Near Infrared Calibration Models for Pharmaceutical Products”. J. Pharm. Sci. 2014. 103: 545-556.
5. B. Igne, M.N. Hossain, J.K. Drennen, C.A. Anderson. “Robustness Considerations and Effects of Moisture Variations on Near Infrared Method Performance for Solid Dosage Form Assay”. J. Near Infrared Spectrosc. 2014. 22(3): 179-188.
6. J.D. Kirsch, J.K. Drennen. “Nondestructive Tablet Hardness Testing by Near-Infrared Spectroscopy: A New and Robust Spectral Best-Fit Algorithm”. J. Pharm. Biomed. Anal. 1999. 19(3-4): 351-362.
7. United States Pharmacopeia (USP). Convention I. “<1119> Near-Infrared Spectrophotometry”. USP32-NF27, Second Supplement. Rockville, MD: USP, 2008. P. 622.
8. H. Wold. “Path Models with Latent Variables: The NIPALS Approach”. In: H.M. Blalock, A. Aganbegian, F.M. Borodkin, R. Boudon, V. Capecchi, editors. Quantitative Sociology: International Perspectives on Mathematical and Statistical Model Building. New York: Academic Press, 1975. Pp. 307-357.
9. M.K. Antoon, J.H. Koenig, J.L. Koenig. “Least-Squares Curve-Fitting of Fourier Transform Infrared Spectra with Applications to Polymer Systems”. Appl. Spectrosc. 1977. 31(6): 518-524.
10. D.M. Haaland, R.G. Easterling. “Improved Sensitivity of Infrared Spectroscopy by the Application of Least Squares Methods”. Appl. Spectrosc. 1980. 34(5): 539-548.
11. D.M. Haaland, R.G. Easterling. “Application of New Least-Squares Methods for the Quantitative Infrared Analysis of Multicomponent Samples”. Appl. Spectrosc. 1982. 36(6): 665-673.
12. V. Vapnik. The Nature of Statistical Learning Theory. New York: Springer-Verlag, 1995.
13. V. Vapnik. Statistical Learning Theory. New York: John Wiley and Sons, 1998.
14. J.A.K. Suykens, T. Van Gestel, J. De Brabanter, B. De Moor, J. Vandewalle. Least Squares Support Vector Machines. Singapore: World Scientific Pub, 2002.
15. N. Cristianini, J. Shawe-Taylor. An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods. Cambridge, UK: Cambridge University Press, 2000.
16. N. Hernández, R.J. Biscay, I. Talavera. “Support Vector Regression Methods for Functional Data”. Lect. Notes Comput. Sci. 2007. 4756: 564-573.
17. C. Borggaard. “Neural Networks in Near-Infrared Spectroscopy”. In: P.C. Williams, K. Norris, editors. Near-Infrared Technology in the Agricultural and Food Industries. St. Paul, MN: American Association of Cereal Chemists, 2001. 2nd ed., pp. 101-108.
18. Y. Dou, Y. Sun, Y. Ren, Y. Ren. “Artificial Neural Network for Simultaneous Determination of Two Components of Compound Paracetamol and Diphenhydramine Hydrochloride Powder on NIR Spectroscopy”. Anal. Chim. Acta. 2005. 528(1): 55-61.
19. P. Filzmoser, M. Gschwandtner, V. Todorov. “Review of Sparse Methods in Regression and Classification with Application to Chemometrics”. J. Chemom. 2012. 26(3-4): 42-51.


20. D.M. Haaland, D.K. Melgaard. “New Prediction-Augmented Classical Least-Squares (PACLS) Methods: Application to Unmodeled Interferents”. Appl. Spectrosc. 2000. 54(9): 1303-1312.
21. D.K. Melgaard, D.M. Haaland, C.M. Wehlburg. “Concentration Residual Augmented Classical Least Squares (CRACLS): A Multivariate Calibration Method with Advantages over Partial Least Squares”. Appl. Spectrosc. 2002. 56(5): 615-624.
22. D.M. Haaland, D.K. Melgaard. “New Classical Least-Squares/Partial Least-Squares Hybrid Algorithm for Spectral Analyses”. Appl. Spectrosc. 2001. 55(1): 1-8.
23. Z. Shi, B. Igne, R.W. Bondi, J.K. Drennen, C.A. Anderson. “Calibration Transfer from Pharmaceutical Powder Mixtures to Compacts Using the Prediction Augmented Classical Least Squares (PACLS) Method”. Appl. Spectrosc. 2012. 66(9): 1075-1081.
24. M. Blanco, J. Coello, H. Iturriaga, S. Maspoch, M. Porcel. “Simultaneous Enzymatic Spectrophotometric Determination of Ethanol and Methanol by Use of Artificial Neural Networks for Calibration”. Anal. Chim. Acta. 1999. 398(1): 83-92.
25. N. Hernández, I. Talavera, R.J. Biscay, D. Porro, M.M.C. Ferreira. “Support Vector Regression for Functional Data in Multivariate Calibration Problems”. Anal. Chim. Acta. 2009. 642(1-2): 110-116.
26. International Conference on Harmonisation. Validation of Analytical Procedures: Text and Methodology Q2(R1). Current Step 4 version. Geneva: ICH Secretariat, 2005.
27. P. Geladi, D. MacDougall, H. Martens. “Linearization and Scatter-Correction for Near-Infrared Reflectance Spectra of Meat”. Appl. Spectrosc. 1985. 39(3): 491-500.
28. H. Martens, E. Stark. “Extended Multiplicative Signal Correction and Spectral Interference Subtraction: New Preprocessing Methods for Near Infrared Spectroscopy”. J. Pharm. Biomed. Anal. 1991. 9(8): 625-635.
29. R.J. Barnes, M.S. Dhanoa, S.J. Lister. “Standard Normal Variate Transformation and De-Trending of Near-Infrared Diffuse Reflectance Spectra”. Appl. Spectrosc. 1989. 43(5): 772-777.
30. S. Rännar, F. Lindgren, P. Geladi, S. Wold. “A PLS Kernel Algorithm for Data Sets with Many Variables and Fewer Objects. Part 1: Theory and Algorithm”. J. Chemometr. 1994. 8(2): 111-125.
31. R. Rosipal, M. Girolami, L.J. Trejo, A. Cichocki. “Kernel PCA for Feature Extraction and De-Noising in Nonlinear Regression”. Neural Comput. Appl. 2001. 10(3): 231-243.
32. United States Pharmacopeia (USP). Convention. “Acetaminophen Tablets”. United States Pharmacopeia and National Formulary (USP29-NF24), Supplement 2. Rockville, MD: USP, 2006. P. 21.
33. K. Pelckmans, J.A.K. Suykens, T. Van Gestel, J. De Brabanter, L. Lukas, B. Hamers, B. De Moor, J. Vandewalle. LS-SVMlab1.5: Least Squares Support Vector Machines. Leuven, Belgium: Katholieke University, 2003. http://www.esat.kuleuven.be/sista/lssvmlab/ [accessed Oct 9, 2014].
34. H. Mark. “Application of an Improved Procedure for Testing the Linearity of Analytical Methods to Pharmaceutical Analysis”. J. Pharm. Biomed. Anal. 2003. 33(1): 7-20.
35. I.G. Chong, C.H. Jun. “Performance of Some Variable Selection Methods When Multicollinearity Is Present”. Chemometr. Intell. Lab. 2005. 78(1-2): 103-112.
36. B. Üstün, W.J. Melssen, L.M.C. Buydens. “Visualisation and Interpretation of Support Vector Regression Models”. Anal. Chim. Acta. 2007. 595(1-2): 299-309 (quotation from p. 303).
37. A. Muñoz, J. González. “Representing Functional Data Using Support Vector Machines”. Pattern Recogn. Lett. 2010. 31(6): 511-516.
38. B. Martin-Barragan, R. Lillo, J. Romo. “Interpretable Support Vector Machines for Functional Data”. Eur. J. Oper. Res. 2014. 232(1): 146-155.
39. J.E. Jackson, G.S. Mudholkar. “Control Procedures for Residuals Associated with Principal Components Analysis”. Technometrics. 1979. 21: 341-349.
