Analyst

PAPER

Cite this: Analyst, 2014, 139, 74

Quantification of pharmaceuticals via transmission Raman spectroscopy: data sub-selection†

Jonathan C. Burley,*a Adeyinka Aina,a Pavel Matousekb and Christopher Brignellc

Published on 25 October 2013.

We report the first systematic characterisation of data sub-selection with multivariate analysis to be applied to either TRS or the low-wavenumber Raman region. A model pharmaceutical formulation comprising two polymorphs mixed in the range of 1–99% is investigated. For data sub-selection, sparse partial least squares is for the first time applied to TRS data and compared with principal component analysis. It is found that low-wavenumber data (50–340 cm⁻¹) are demonstrably superior for quantitative modelling to data in the more conventional mid-wavenumber range (340–2000 cm⁻¹). Our results point the way to enhanced quantitative analytical capabilities for TRS, with potential application areas including pharmaceuticals, security and process-analytical technology, by combining data sub-selection with low-wavenumber-capable optics.

Received 5th July 2013
Accepted 25th October 2013

DOI: 10.1039/c3an01293j

www.rsc.org/analyst

1 Introduction

In any spectroscopy experiment a choice is made about which spectral region to employ. Almost invariably a choice must be made before an experiment about which technique or data collection strategy is likely to be most suitable. This choice may be implicit, for example selection of a mid infra-red (mid-IR) spectrometer rather than a near infra-red (NIR) or terahertz (THz) one. The choice of spectral region may also be explicit, for example selection of a particular grating setting and hence a wavenumber range in a Raman mapping experiment, or selection of a single wavenumber in a UV/vis experiment to allow a simple univariate concentration vs. absorbance graph to be plotted, instead of developing a more complex multivariate model for calibration and analysis. In other settings there may be a requirement to select only a subset of the data after the experiment has been performed (data reduction). This might be, for example, the selection of one chemically meaningful peak (e.g. carbonyl) for analysis, or it may be to produce as reliable and/or as simple a quantitative model as possible, or to reduce the mathematical and computational complexity of data analysis (e.g. Raman, NIR or mid-IR mapping with large hyper-spectral datasets). Regardless of whether the decision on which spectral range to use is taken before or after data collection, it is important

a School of Pharmacy, University of Nottingham, Boots Science Building, NG7 2RD, UK. E-mail: [email protected]; Fax: +44 (0) 115 951 5102; Tel: +44 (0) 115 84 68357

b Central Laser Facility, Research Complex at Harwell, STFC Rutherford Appleton Laboratory, Oxfordshire, OX11 0QX, UK

c School of Mathematical Sciences, University of Nottingham, NG7 2RD, UK

† Electronic supplementary information (ESI) available. See DOI: 10.1039/c3an01293j

Analyst, 2014, 139, 74–78

that the choice is informed and leads to an optimum output of information and creation of new knowledge.

A large body of work exists on the application of post-experiment variable selection, particularly in the field of NIR spectroscopy.1–6 A recent review article covers much of this work.7 The primary drivers for the use of data sub-selection are (i) chemical (e.g. selecting only wavelengths which correspond to the analyte of interest), (ii) physical (e.g. selecting a sub-region for which temperature dependence or humidity does not strongly affect the analysis), (iii) statistical (e.g. reducing the input of wavelengths with more noise than signal) and (iv) other requirements (e.g. simplifying models for translation between instruments, speed of analysis, computational requirements etc.).

In comparison with NIR spectroscopy, which tends to yield fairly broad peaks which are typically not directly linked to a particular chemical group (overtones and combination bands make up the majority of the spectrum8–10), data sub-selection in mid-IR, THz-IR and Raman spectroscopy has received far less attention.1–7,11–15 This emphasis on NIR probably arises from the frequent requirement in NIR spectroscopy for chemometric methods to understand data. These methods are less common in mid-IR, THz-IR and Raman spectroscopies as the spectra produced are more amenable to direct interpretation.

Two styles of data sub-setting after data collection are reported in the literature. The more common style is the selection of a number of input variables (wavenumber and intensity pairs) which are non-contiguous (e.g. ref. 7 and references therein). The main aim associated with this sub-setting method is typically the reduction of input noise to the model and an associated increase in the accuracy, precision and reliability of the model. The potential disadvantage of non-contiguous sub-setting is that it is likely to be specific to a

This journal is © The Royal Society of Chemistry 2014

particular problem (e.g. sample and/or spectroscopic technique). Less commonly, a continuous ‘spectral window’ can be selected (e.g. ref. 5). This method involves placing restrictions on the choice of wavenumber–intensity pairs and it is therefore likely to be less effective at noise removal but may be more transferable.

The motivation for the current paper is to examine data sub-selection in the context of pharmaceutical quality control, with an overall aim of determining whether sub-selection allows an improvement in quantitative ability. This is an area of major industrial and societal importance. We specifically focus our attention on two emerging aspects of Raman spectroscopy. First, we examine the utility of low-wavenumber data. Traditionally, many Raman spectrometers have only been able to access the wavenumber region above 300 cm⁻¹ unless more specialist equipment was used, access to the lower-wavenumber end of the spectrum being limited by the filters required to reject the very intense laser light (at 0 cm⁻¹). In recent years improvements to these filters have allowed easy access to data well below 100 cm⁻¹ in many cases. This low-wavenumber spectral region contains a great deal of information on inter-molecular (rather than intra-molecular) vibrational bands, and can allow for a rapid assessment of crystalline vs. amorphous, salt vs. free base, co-crystal vs. physical mixture, polymorph identification etc. (e.g. ref. 16 and references therein).

Second, we examine the application of data sub-selection to transmission Raman spectroscopy (TRS). Despite being initially reported17 in 1967, TRS has only become available routinely in the last few years (for a recent review see Buckley and Matousek18). In terms of applications, TRS can penetrate deeply (ca. 50 mm or more) into opaque samples, compared to backscattering Raman spectroscopy which is strongly biased to the near-surface areas (
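The two styles of sub-selection discussed in this introduction, a contiguous spectral window and a non-contiguous set of individual channels, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the synthetic spectra, the single Gaussian band at 120 cm⁻¹, the helper names `select_window` and `select_top_k`, and the use of a simple correlation ranking for the non-contiguous case. It is not the sparse partial least squares procedure used in the paper.

```python
import numpy as np


def select_window(wavenumbers, spectra, low, high):
    """Contiguous 'spectral window': keep all channels with low <= wavenumber <= high."""
    mask = (wavenumbers >= low) & (wavenumbers <= high)
    return wavenumbers[mask], spectra[:, mask]


def select_top_k(wavenumbers, spectra, y, k):
    """Non-contiguous selection: keep the k channels most correlated with y.

    A simple correlation filter used only for illustration (hypothetical
    stand-in, not the paper's sparse PLS).
    """
    xc = spectra - spectra.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((xc ** 2).sum(axis=0) * (yc ** 2).sum())
    corr = np.abs(xc.T @ yc) / np.where(denom > 0, denom, np.inf)
    idx = np.sort(np.argsort(corr)[-k:])  # channel indices, in ascending order
    return wavenumbers[idx], spectra[:, idx]


# Synthetic data (hypothetical): 15 mixtures, one band whose height tracks
# the fraction of one polymorph, plus Gaussian noise.
rng = np.random.default_rng(0)
wavenumbers = np.arange(50.0, 2001.0, 2.0)      # 50 to 2000 cm^-1 in 2 cm^-1 steps
fraction = np.linspace(0.01, 0.99, 15)          # 1 to 99 %
band = np.exp(-0.5 * ((wavenumbers - 120.0) / 8.0) ** 2)
spectra = np.outer(fraction, band) + 0.01 * rng.standard_normal((15, wavenumbers.size))

# Contiguous low-wavenumber window versus ten individually chosen channels.
low_wn, low_block = select_window(wavenumbers, spectra, 50.0, 340.0)
sel_wn, sel_block = select_top_k(wavenumbers, spectra, fraction, k=10)
```

The window keeps every channel inside fixed limits regardless of the sample, which is why it may transfer more readily between problems. The ranked selection adapts to the particular data set, which helps with noise rejection but ties the chosen channels to that sample and instrument.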
