Environmental Science: Processes & Impacts

Published on 02 July 2015.

PAPER

Cite this: Environ. Sci.: Processes Impacts, 2015, 17, 1482


Data quality through a web-based QA/QC system: implementation for atmospheric mercury data from the global mercury observation system

Francesco D'Amore,*a Mariantonia Bencardino,a Sergio Cinnirella,a Francesca Sprovieri a and Nicola Pirrone b

The overall goal of the on-going Global Mercury Observation System (GMOS) project is to develop a coordinated global monitoring network for mercury, including ground-based, high altitude and sea level stations. In order to ensure data reliability and comparability, a significant effort has been made to implement a centralized system, which is designed to quality assure and quality control atmospheric mercury datasets. This system, GMOS-Data Quality Management (G-DQM), uses a web-based approach with real-time adaptive monitoring procedures aimed at preventing the production of poor-quality data. G-DQM is plugged into a cyberinfrastructure and deployed as a service. Atmospheric mercury datasets, produced during the first three years of the GMOS project, are used as the input to demonstrate the application of the G-DQM and how it identifies a number of key issues concerning data quality. The major issues influencing data quality are presented and discussed for the GMOS stations under study. Atmospheric mercury data collected at the Longobucco (Italy) station are used as a detailed case study.

Received 30th April 2015
Accepted 1st July 2015

DOI: 10.1039/c5em00205b

rsc.li/process-impacts

Environmental impact

Mercury is a persistent pollutant that exists naturally in the environment. However, levels have risen because of human activity and pollution. The UNEP Global Mercury Partnership has the goal of protecting human health and the global environment from the release of mercury and its compounds. In this framework, the Global Mercury Observation System (GMOS) was developed. In order to assure data quality of datasets produced within the GMOS project, a common QA/QC process has been implemented as a centralized system able to ensure, control and report on the quality of mercury data from the GMOS monitoring stations. The system, called GMOS-Data Quality Management (G-DQM), is based on a QA/QC methodology which automatically processes data retrieved from Tekran instruments.

1 Introduction

Mercury is a persistent pollutant that exists naturally in the environment. However, its levels have risen because of human activities and pollution. Currently, the UNEP Global Mercury Partnership has the goal of protecting human health and the global environment from the release of mercury and its compounds by minimizing and, where feasible, ultimately eliminating global and anthropogenic mercury releases to air, water and land. In this framework, among other large atmospheric mercury monitoring networks, the Global Mercury Observation System (GMOS) was implemented. It is a European-funded network, although it has a global perspective and includes stations spread across different countries. The network was developed by integrating previously established ground-based atmospheric mercury monitoring stations, such as EMEP and AMAP sites (Wängberg et al.;1 Tørseth et al.;2 AMAP3), with new stations, which are spread across the northern and southern hemispheres and located at both high altitude and sea level, as well as in climatically diverse regions (Sprovieri et al.4). Within the network, special attention was paid to the harmonization of measurements in order to ensure full comparability between data from all the monitoring sites. To achieve this, Standard Operating Procedures (SOPs) were developed during the planning and implementation stage of the GMOS network (Munthe et al.5). This was done in accordance with best practice on measurements adopted in well-established regional monitoring networks, and based on the most recent literature (Brown et al.;6 Steffen et al.;7 Gay et al.8).

The GMOS network produces data arriving in near real-time from a large number of sources. Strict QA/QC procedures are required to avoid the production of poor-quality data and to ensure the correct implementation of the SOPs. Furthermore, a common QA/QC approach for each raw dataset reduces the time between raw data and final data production and publication (Campbell et al.9).

a CNR-Institute of Atmospheric Pollution Research, Division of Rende, Italy
b CNR-Institute of Atmospheric Pollution Research, Montelibretti, Rome, Italy

This journal is © The Royal Society of Chemistry 2015


To this end, a centralized system which is able to ensure, control and report on the quality of mercury data from the GMOS monitoring stations was designed. The system, called GMOS-Data Quality Management (G-DQM), is based on a QA/QC methodology which automatically processes data retrieved from Tekran instruments (Landis et al.;10 Steffen et al.11), from different data providers. It forms a part of a dedicated cyberinfrastructure (GMOS-CI) that oversees data acquisition and data sharing among major stakeholders, policy-makers and the public, using an interoperable approach (Cinnirella et al.12). G-DQM uses a service approach to facilitate adaptive network monitoring, which supports both routine and alert notifications to ensure proper instrument maintenance. It is deployed as a web-based application and all QA/QC processes are made available using a common web browser. In order to test this system, an initial evaluation of data quality has been performed on three years of atmospheric mercury data produced within the GMOS network. This work provides a synthetic description of the features of the G-DQM system, and the analysis of the results obtained from its use. The detailed application of the system is demonstrated using the atmospheric mercury dataset collected at the Longobucco (Italy) station.

2 Data quality

Advances in cyberinfrastructure and sensor networks now provide enormous quantities of data, even in near real-time. Dedicated Information Technology (IT) frameworks make it possible to deliver ever larger datasets to the end user. In the coming years, improvements in sensor network technologies will provide researchers with more robust frameworks for data collection and management. Sensor network technologies enter many fields of modern life as they offer the opportunity to observe a wealth of environmental variables. This type of device can be used to create an Internet of Things (IoT), in which sensors, but also actuators, blend perfectly with the environment around us (Ashton13). Therefore, the problem is no longer how much data we have, but what kind of data we have, and above all its quality. Sensor networks are still subject to inevitable faults that may cause loss of data or poor quality, and it is imperative to have a system in place to minimise data loss and alert operators to non-standard sensor performance.

A first approach to obtain good quality data from raw datasets may include a post-processing performed individually and often manually by each station manager. This approach is unsuitable when data are coming in near real-time from sensor networks. To deal with this scenario, QA/QC algorithms should run within an IT platform (i.e. a cyberinfrastructure) so that process optimization and data handling become more efficient. Individual data quality control does not ensure comparability: different stations spread around the world, within the same framework, require a homogeneous approach in order to define a common data quality standard and data lineage.

Table 1 Flagging criteria for general parameters and all readings

Flag code | Description | Flagging criteria | Data flagged
IB0 | Baseline voltage too low | Baseline voltage < 0.01 V | ALL
WB1 | Baseline voltage low or high | 0.01 V < baseline voltage < 0.05 V or baseline voltage > 0.25 V | ALL
WB2 | Baseline voltage change | |baseline voltage_i − baseline voltage_(i−1)| > 0.02 V | ALL
WB3 | Baseline deviation high | Baseline deviation > 0.10 V for 5 consecutive readings | ALL
IB5 | Baseline deviation too high | Baseline deviation > 0.20 V | ALL
IDL | Below detection limit | Hg concentration < 0.1 ng m⁻³ | ALL
WM2 | Multiple peaks detected | Status = M2 (multiple peaks) | ALL
IMX | Multiple peaks detected | Status > M2 (multiple peaks) | ALL
WOL | Overload | Status = OL (overload) | ALL
WV5 | Questionable sample volume | 5% < |(volume_meas − volume_exp)/volume_exp| ≤ 7% | ALL
IV7 | Questionable sample volume | |(volume_meas − volume_exp)/volume_exp| > 7% | ALL
WTG | Time gap | | ALL

Table 2 Flagging criteria for GEM/TGM readings. A and B refer to gold cartridges used in Tekran

Flag code | Description | Flagging criteria | Data flagged
WEH | Hg concentration high | GEM concentration > 4.0 ng m⁻³ | GEM
WEL | Hg concentration low | GEM concentration lower than a value that varies according to site-specific conditions (0.2–1 ng m⁻³) | GEM
WE5 | Same cartridge difference > 50% | |(GEM_i − GEM_(i−1))/GEM_i| > 0.5 for the same cartridge | GEM
WK1 | A/B cartridge difference within 5–10% | 5% < |(A − B)/average(A, B)| ≤ 10% | GEM
WK2 | A/B cartridge difference > 10% | |(A − B)/average(A, B)| > 10% | GEM
INP | No peak | Status = NP (no peaks) | GEM
IC0 | Non-representative GEM values after calibration | Following calibration cycles the first GEM value from each cartridge is not considered representative | GEM
ID0 | Non-representative GEM values after desorption | Following the desorption cycle the first GEM value from each cartridge is not considered representative | GEM

Table 3 Flagging criteria for the desorption cycle. Capital letters are defined in Table 5

Flag code | Description | Flagging criteria | Data flagged
WP0 | No PBM | E + F + G = 0 | DES
WG0 | No GOM | H + I + J = 0 | DES
IP1 | PBM desorption arguable | E < 0.70(E + F + G) or F > 0.20(E + F + G) or G > 0.10(E + F + G) | DES
IG1 | GOM desorption arguable | H < 0.70(H + I + J) or I > 0.20(H + I + J) or J < 0.10(H + I + J) | DES
IP2 | PBM negative value | E + F + G < 3C | DES
IG2 | GOM negative value | H + I + J < 3C | DES
IL1 | Load cycle | Load cycle < 1 or 2 or 3 h, i.e. GEM cycles < 12 or 24 or 36 before desorption | DES
IID | Incomplete desorption | Desorption cycle is incomplete (< 12 steps) | DES
WS0 | Speciation blanks (C) | 1.67 pg m⁻³ < cycle (C) ≤ 10 pg m⁻³ | DES
IS1 | Speciation blanks (C) | Cycle (C) > 10 pg m⁻³ | DES
*B* | Beginning of desorption | Beginning of each single desorption cycle | DES
*E* | End of desorption | End of each single desorption cycle | DES

2.1 Quality assurance and quality control (QA/QC)

The G-DQM system presented in this paper is related to both Quality Assurance (QA) and Quality Control (QC) on datasets produced within the GMOS network. QA and QC are often presented together even if they are two quite different concepts: QA is related to the process regarding data collection, while QC is applied to the final product of monitoring. QC is supervised by site operators, who are in charge of clarifying suspicious measurements as well as identifying anomalies and confirming data rejection within their own datasets (Campbell et al.9). In this regard, the system proposed is both process and product oriented. It enables site operators to monitor instrument performance and to promptly take corrective action when problems arise. G-DQM is able to verify if the monitoring process adheres to standard procedures in a way that minimizes losses and inaccuracies in data production.

Table 5 Scheme of the desorption cycle by the Tekran 1130/1135

Tekran event flag | Measurement type | Label
1 | Zero air | A
1 | Zero air | B
1 | Zero air | C
2 | Pyrolysis air | D
2 | PBM | E
2 | PBM | F
2 | PBM | G
3 | GOM | H
3 | GOM | I
3 | GOM | J
1 | Zero air | K
1 | Zero air | L
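The step labels of Table 5 are the inputs to the desorption-cycle criteria of Table 3. As a minimal sketch of how the PBM checks and the speciation blank check might be evaluated for one cycle, the Python function below takes the step values as a dictionary. The function name, the input layout and the order in which flags are reported are illustrative assumptions, not part of G-DQM.

```python
def flag_pbm_desorption(steps):
    """Return the Table 3 flags triggered by one desorption cycle.

    `steps` maps Table 5 labels ("C", "E", "F", "G") to values in pg m-3.
    Hypothetical helper for illustration; not the G-DQM implementation.
    """
    flags = []
    blank = steps["C"]                             # third zero-air step (speciation blank)
    total = steps["E"] + steps["F"] + steps["G"]   # sum of the three PBM steps

    # Speciation blank: invalidation limit first, then the warning range.
    if blank > 10.0:
        flags.append("IS1")
    elif blank > 1.67:
        flags.append("WS0")

    if total == 0:
        flags.append("WP0")                        # no PBM
    else:
        if total < 3 * blank:
            flags.append("IP2")                    # signal below three times the blank
        if (steps["E"] < 0.70 * total
                or steps["F"] > 0.20 * total
                or steps["G"] > 0.10 * total):
            flags.append("IP1")                    # desorption profile arguable
    return flags


clean_cycle = flag_pbm_desorption({"C": 0.5, "E": 8.0, "F": 1.5, "G": 0.5})
noisy_cycle = flag_pbm_desorption({"C": 2.0, "E": 3.0, "F": 2.0, "G": 0.0})
```

The GOM checks (WG0, IG1, IG2) would follow the same pattern using steps H, I and J.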

Even though a level of human intervention and inspection is necessary in QA/QC, the use of automated common checks represents an improvement because it ensures consistency and reduces human bias, thus avoiding misinterpretation and inappropriate data use. Through G-DQM we were able to automate the QA process, making it available on the web via a user-friendly QC step that supports expert supervision. Details and definitions of each component of the system are presented below.

Table 4 Flagging criteria for the calibration cycle. RFA and RFB refer to response factors over the two A and B gold cartridges used in Tekran

Flag code | Description | Flagging criteria | Data flagged
WF1 | Calibration interval | 25 h < time between calibrations ≤ 96 h | CAL
IF2 | Calibration interval | Time between calibrations > 96 h | CAL
WR1 | Detector sensitivity | 4 × 10⁶ units ≤ RespFact < 6 × 10⁶ units or RespFact > 12 × 10⁶ units | CAL
IR2 | Detector sensitivity | RespFact < 4 × 10⁶ units | CAL
WD1 | Calibration change | 5% < |(calibration_i − calibration_(i−1))/calibration_i| ≤ 10% | CAL
WD2 | Calibration change | |(calibration_i − calibration_(i−1))/calibration_i| > 10% | CAL
WC1 | Calibration trap bias | 5% < |(RFA − RFB)/average(RFA, RFB)| ≤ 10% | CAL
IC2 | Calibration trap bias | |(RFA − RFB)/average(RFA, RFB)| > 10% | CAL
WZ1 | Calibration blanks | Zero > 1500 peak area units | CAL
IZ2 | Calibration blanks | Zero > 1% SPAN | CAL
IIC | Incomplete calibration | Calibration cycle incomplete | CAL
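To make the thresholds of Table 4 concrete, the following Python sketch evaluates the calibration interval, detector sensitivity and trap bias criteria for a single calibration cycle. The function names and arguments are hypothetical, and applying the response factor limits to each cartridge separately is an assumption made for illustration.

```python
def _sensitivity_flag(resp_fact):
    """Detector sensitivity flag for one response factor (Table 4 limits)."""
    if resp_fact < 4e6:
        return "IR2"
    if resp_fact < 6e6 or resp_fact > 12e6:
        return "WR1"
    return None


def flag_calibration(rf_a, rf_b, hours_since_last):
    """Return Table 4 flags for one calibration cycle.

    rf_a, rf_b: response factors of cartridges A and B.
    hours_since_last: time elapsed since the previous calibration, in hours.
    Hypothetical helper for illustration; not the G-DQM implementation.
    """
    flags = []

    # Calibration interval: control limit checked before the warning limit.
    if hours_since_last > 96:
        flags.append("IF2")
    elif hours_since_last > 25:
        flags.append("WF1")

    # Detector sensitivity, checked per cartridge (assumption) and de-duplicated.
    sensitivity = {_sensitivity_flag(rf) for rf in (rf_a, rf_b)} - {None}
    flags.extend(sorted(sensitivity))

    # Trap bias: relative difference between the two response factors.
    bias = abs(rf_a - rf_b) / ((rf_a + rf_b) / 2)
    if bias > 0.10:
        flags.append("IC2")
    elif bias > 0.05:
        flags.append("WC1")
    return flags


good_cal = flag_calibration(8.0e6, 8.2e6, hours_since_last=24)
poor_cal = flag_calibration(5.0e6, 8.0e6, hours_since_last=30)
```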


2.2 Flagging datasets

As described later in Section 4.1, data to be processed for quality are stored in tables managed in a database. G-DQM controls data by checking each observation (i.e. each row of the table) and returns specific information on data quality. For this purpose, a set of validation flags is used, derived either from instrument manufacturer recommendations or from the GMOS SOPs. Given an input dataset to the G-DQM, the output is the same dataset where each row is flagged with a tag that identifies the measurement as a valid, warning (suspicious) or invalid observation. Each flag refers to specific flagging criteria. The evaluation process compares each observation against two kinds of limits: (1) warning limits, established to draw attention to data for possible corrective action; (2) control limits, which invalidate data when exceeded. Each flagging criterion triggers the corresponding flags using thresholds. G-DQM checks if rows, or a set of rows, within a dataset comply with these thresholds in order to tag the corresponding observations with flags that indicate valid/warning/invalid data.

Two existing software suites designed to ensure data quality were taken as references in defining thresholds: the Research Data Management Quality (RDMQ) and the AMNet Quality Control (AMQC) programs, individually developed by Environment Canada and by the National Atmospheric Deposition Program (NADP), respectively (Steffen et al.7). Flagging criteria used in both tools were analysed and compared before their inclusion into the G-DQM system. Each control parameter has been set to meet the specific needs of the GMOS community because it is important to take into account different site-specific conditions (e.g. polar sites and high-altitude locations). The flags and related flagging criteria being used in G-DQM are summarized in Tables 1–4. The variables used in Table 3 are described in Table 5, where the desorption cycle is reported.
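The distinction between single-row and multi-row criteria, and between warning and control limits, can be illustrated with a short Python sketch based on the baseline voltage and sample volume thresholds of Table 1. The function names, the row layout and the rule that the first letter of a flag code (W or I) determines the overall status are assumptions made for this example only.

```python
def status_of(flags):
    """Collapse a list of flag codes into valid / warning / invalid.

    Assumes codes starting with "I" invalidate and codes starting
    with "W" warn, as suggested by the flag tables.
    """
    if any(f.startswith("I") for f in flags):
        return "invalid"
    if any(f.startswith("W") for f in flags):
        return "warning"
    return "valid"


def flag_rows(rows, expected_volume):
    """Flag each row using baseline voltage (V) and sample volume checks.

    rows: list of dicts with keys "baseline_v" and "volume".
    Hypothetical helper for illustration; not the G-DQM implementation.
    """
    flagged = []
    previous_v = None
    for row in rows:
        flags = []
        v = row["baseline_v"]

        # Single-row criteria: control limit (IB0) before warning limit (WB1).
        # A reading of exactly 0.01 V is treated as a warning here (assumption).
        if v < 0.01:
            flags.append("IB0")
        elif v < 0.05 or v > 0.25:
            flags.append("WB1")

        # Multi-row criterion: change with respect to the previous reading.
        if previous_v is not None and abs(v - previous_v) > 0.02:
            flags.append("WB2")

        # Sample volume: relative deviation from the expected volume.
        deviation = abs(row["volume"] - expected_volume) / expected_volume
        if deviation > 0.07:
            flags.append("IV7")
        elif deviation > 0.05:
            flags.append("WV5")

        previous_v = v
        flagged.append({**row, "flags": flags, "status": status_of(flags)})
    return flagged


demo = flag_rows(
    [
        {"baseline_v": 0.10, "volume": 5.0},
        {"baseline_v": 0.13, "volume": 5.3},
        {"baseline_v": 0.005, "volume": 4.5},
    ],
    expected_volume=5.0,
)
```

In this example the first row is valid, the second carries only warnings, and the third is invalidated by both the baseline voltage and the sample volume.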

3 GMOS project

The worldwide scope of the GMOS project provides valuable data for a deeper understanding of atmospheric mercury on a global scale. With respect to data collection and management, this global structure poses a challenge to mercury scientists because traditional approaches to QA/QC are not easily applicable, owing to the size of the datasets coming from different monitoring stations across the globe and to the fact that data arrive in near real-time. Moreover, comparability of atmospheric mercury measurements at the global level is imperative for the GMOS infrastructure in order to ensure data that are useful for both the scientific and policy communities. To specifically meet these requirements, a centralized G-DQM system was developed and employed.

3.1 Measurement of mercury

Mercury in air is measured as three operationally defined forms:
(1) Gaseous Elemental Mercury (GEM) or Total Gaseous Mercury (TGM);
(2) Gaseous Oxidized Mercury (GOM);
(3) Particle-bound mercury less than 2.5 µm (PBM).

Gaseous Elemental Mercury (GEM) is the dominant form of atmospheric mercury (Lindberg and Stratton14). It can be oxidized in the atmosphere to form reactive and water-soluble Hg(II) compounds. This oxidised Hg is defined as all forms of mercury sampled using a KCl-coated denuder (GOM) (Landis et al.10), and/or Particle-Bound Mercury (PBM) (Lin and Pehkonen15); both are deposited to ecosystems through wet and dry processes (Amos et al.16). TGM, measured when speciation is not possible, is the sum of GEM and GOM (Lindqvist and Rodhe17).

GMOS network sites measure the concentrations of atmospheric mercury fractions using an automated and continuous mercury speciation system: the Tekran Mercury Vapour Analyser Model 2537 coupled with the speciation models 1130 for GOM and 1135 for PBM. This equipment meets the GMOS requirements and is commonly available. Tekran utilizes two gold cartridges (A and B) in parallel to allow for continuous measurements with alternating operation modes (sampling versus desorbing/analysing stage) on a predefined time base (e.g., 10 min) (Tekran18). Measurements are obtained through a multi-step procedure as described elsewhere (Lindberg et al.19) using an impactor inlet (2.5 µm cut-off aerodynamic diameter at 10 L min⁻¹), a KCl-coated quartz annular denuder in the 1130 unit, and a quartz regenerable particulate filter (RPF) in the 1135 unit. The operation and principles of the Tekran instrument are described in the study by Landis et al.10 The main operational phases are:
(1) GEM or TGM measurements (GEM/TGM);
(2) Desorption cycle (DES) (see Table 5);
(3) Calibration cycle (CAL).

During the DES phase it is possible to perform speciation measurements and determine both GOM and PBM concentrations. During the CAL cycle the Tekran 2537 CVAFS mercury analyzers are automatically calibrated using internal permeation sources that emit mercury vapour at a constant rate to ensure acceptable Response Factors (RF) over each cartridge (RFA, RFB) (Tekran18). Where it is not possible to perform speciation, only the 2537 module of Tekran is used in order to perform TGM measurements and the CAL cycle.

3.2 GMOS network

Within the GMOS network, stations are classified as master, if they provide mercury speciation measurements (GEM, GOM and PBM), and secondary, when they provide only TGM concentrations. The on-going GMOS network consists of 28 monitoring stations, whose institutions are internal GMOS partners, and 11 monitoring stations managed by external partners. Almost all internal GMOS stations provide near real-time raw data that are archived and managed by the GMOS-CI. In order to test its compliance with the adopted GMOS SOPs, the G-DQM system was tested on 16 different raw datasets: 11 from secondary stations and 5 from master sites. Names, locations, reference institutes, as well as data coverage for 2011–2013 are shown in Fig. 1 and 2 for secondary and master stations, respectively.

Fig. 1 Coverage and consistency, on a monthly basis, of TGM data collected at some of the on-going GMOS secondary stations, over the period 2011–2013.

Fig. 2 Coverage and consistency, on a monthly basis, of GEM/GOM/PBM data collected at some of the on-going GMOS master stations, over the period 2011–2013.

4 QA/QC and cyberinfrastructure

From the user's point of view, G-DQM is a web-based application developed by using a Software as a Service (SaaS) approach: the software is developed as a product, but deployed as a service through a web browser. Data to be processed appear to users as managed within a computer cloud, and operators can access them after a login phase. By using this web application, users are able to follow the whole QA/QC process. From an IT point of view, G-DQM is part of the GMOS-CyberInfrastructure (GMOS-CI) cited above, which is a research environment that supports advanced data acquisition, storage, management, integration, mining and visualization, built on an IT infrastructure (D'Amore et al.20).


The core of the GMOS-CI is a Spatial Data Infrastructure (SDI) that integrates modules providing a set of services and features using open source components widely used by the geographic information community (de la Beaujardiere21). Services and processes of the GMOS-CI were designed to provide geographic services, necessary for the integration of datasets into federated systems such as GEOSS (GEO22). The GMOS-CI also ensures that data can be shared with major stakeholders, policy makers and the public. The G-DQM plugs into this cyberinfrastructure: it runs over datasets acquired by the GMOS-CI and makes use of some of the GMOS-CI's features, such as security and user management. The integration of the QA/QC component into the GMOS-CI allows us to deal with issues related to data and process integration, as well as the analysis of large datasets. Managing large amounts of data, arriving even in near real-time, is not straightforward for individual researchers using their personal computers: the G-DQM workflow, described later in Section 4.2, could in principle work without an ICT platform, but it would then be unable to scale when necessary. Environmental data are increasing day by day, coming from mobile applications or smart sensor networks. Deploying the software as part of a cyberinfrastructure permits us to deal with this scenario in a more feasible way. Moreover, the QA/QC process can be scheduled by the cyberinfrastructure in order to automatically process new data coming from sensors, mainly for real-time acquisition. Furthermore, automatic outcomes, such as warnings or alarms, would notify operators about instrumental malfunctions, thus preventing poor data quality and data loss (Campbell et al.9).

All datasets collected by GMOS partners should respect the same SOPs and the same data lineage. It is possible to comply with both requirements by providing QA/QC processes as a common facility through the GMOS-CI.

4.1 Data acquisition

G-DQM is a service that starts working after data are stored in the GMOS databases. The data integration process is handled by a software agent plugged into the GMOS-CI. This component acquires data coming from stations managed by the GMOS partners: it reads data shared by each partner using File Transfer Protocol (FTP), although many other protocols are also supported. For stations where no automatic data connection is available, it is possible to upload information manually on the GMOS web portal. In both cases, GMOS-CI stores data in tables managed by a Data Base Management System (DBMS). The data stored represent the standard output of Tekran, along with information regarding geographical locations, names of data sources and quality fields. When data are stored for the first time, quality fields are empty. Periodically, G-DQM checks for new information available on GMOS databases. If new data are found, the quality process starts in order to tag all the new observations. At the end of this process, the quality fields contain flags about quality concerns. The flags used are those cited in Section 2.2.

4.2 G-DQM workflow: main features and components

As described in Section 4.1, G-DQM essentially takes Tekran raw data as the input while the output is a flagged dataset with information on data validation. After data acquisition, datasets are processed for quality assurance using the workflow reported in Fig. 3. In step one (1), G-DQM runs an automated process that filters the raw data stored in GMOS databases. The system compares the dataset against 43 potential flags corresponding to 43 criteria that specifically refer to the three operation phases cited above: GEM/TGM, DES, and CAL (see Tables 1–4). The flags are grouped in three sets: valid, warning and invalid. Thus, each flag refers to a specific condition, or criterion, that the system checks in order to screen the Tekran data. Each raw observation is flagged depending on the result of each corresponding criterion and returns, as a temporary output, a flagged dataset.
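The pattern of picking up observations whose quality field is still empty and tagging them can be sketched with Python's built-in sqlite3 module. The table name `observations`, its columns and the function name are hypothetical and stand in for the GMOS database schema, which is not described here; only two GEM concentration thresholds from Tables 1 and 2 are applied, and "OK" is used as the tag for rows that trigger no flag.

```python
import sqlite3


def flag_new_rows(conn):
    """Tag every observation whose quality field is still empty.

    Returns the number of rows processed. Hypothetical sketch: the
    real schema and the full set of 43 criteria are not reproduced.
    """
    cursor = conn.execute(
        "SELECT id, gem FROM observations WHERE quality IS NULL"
    )
    updates = []
    for row_id, gem in cursor.fetchall():
        if gem < 0.1:          # below detection limit (Table 1)
            flag = "IDL"
        elif gem > 4.0:        # Hg concentration high (Table 2)
            flag = "WEH"
        else:
            flag = "OK"
        updates.append((flag, row_id))
    conn.executemany("UPDATE observations SET quality = ? WHERE id = ?", updates)
    conn.commit()
    return len(updates)


# In-memory database standing in for the GMOS tables.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE observations (id INTEGER PRIMARY KEY, gem REAL, quality TEXT)"
)
conn.executemany(
    "INSERT INTO observations (gem) VALUES (?)", [(1.6,), (0.05,), (4.7,)]
)
processed = flag_new_rows(conn)
```

Because only rows with an empty quality field are selected, running the function again on the same data processes nothing, which is what makes a periodic schedule safe.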


Fig. 3 G-DQM workflow with the main five-step process on which it is based.

The second step (2) consists of instrument reports compiled by site operators during their visits to stations. Field notes, anomalies, routine controls and part changes are reported in the station e-logbook, which is provided as a separate service by means of a web application integrated into the GMOS-CI. GMOS SOPs are fully integrated into the e-logbook, which also serves as a reminder for routine maintenance. The third step (3) requires the site operator's approval of the intermediate flagged dataset. Site operators are allowed to clarify data records prior to their full approval. At the end of the above processes the system outputs are fully QAed/QCed. A further process (4) computes GOM and PBM concentrations for those sites that are performing speciation. After step (4), measurements tagged as invalid are discarded and only the valid data are considered available for dissemination purposes. Step (5) thus stores the final valid datasets that will be accessible from the GMOS web portal (http://www.gmos.eu) for dissemination purposes. As an additional service, the G-DQM system is able to provide an alerting system (0) by which it is possible to visualize the near real-time Tekran output parameters. This helps site operators to identify any questionable events and take quick corrective actions in order to prevent the production of poor-quality data.
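The interplay between the automatic flags, the operator approval of step (3) and the final selection of step (5) can be sketched in a few lines of Python. The function name and data layout are hypothetical, and withholding warning rows that the operator has not yet reviewed is an assumption of this sketch rather than a documented G-DQM rule.

```python
def finalize(flagged, operator_decisions):
    """Combine automatic statuses with operator decisions and keep valid rows.

    flagged: list of dicts with "id" and "status"
             ("valid", "warning" or "invalid") from the automatic step.
    operator_decisions: dict mapping a row id to "valid" or "invalid",
             recorded during the manual review.
    Hypothetical helper for illustration; not the G-DQM implementation.
    """
    final = []
    for row in flagged:
        # The operator decision, when present, overrides the automatic status.
        status = operator_decisions.get(row["id"], row["status"])
        if status == "valid":
            final.append({**row, "status": status})
    return final


automatic = [
    {"id": 1, "status": "valid"},
    {"id": 2, "status": "warning"},   # re-evaluated as valid by the operator
    {"id": 3, "status": "invalid"},
    {"id": 4, "status": "warning"},   # not reviewed yet, so withheld
]
released = finalize(automatic, {2: "valid"})
```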

5 Case study: Longobucco (Italy) dataset

In this section, a case study is shown using data from the Longobucco monitoring station. Longobucco is a GMOS master site whose atmospheric mercury speciation measurements covered the period Oct 2012–Oct 2013. Running the G-DQM we obtained the initial flagged datasets for the three main operation phases (GEM, DES and CAL). At this stage, the resulting flags are only outcomes of the automated QA scripts and they can be used in time-series plots to highlight quality-related issues. In Fig. 4 the time-series refers to data recorded during 2013 at Longobucco. Colour coding is used to distinguish each specific quality-related issue: points change colour according to the main flag assigned by the automatic G-DQM process. In Fig. 4(a) (GEM data), dots in red and yellow refer to a problem with the sampled volume, indicated by the flags IV7 and WV5, respectively. The flag IV7 is used when the measured volume differs by over 7% from the expected value, while the WV5 flag indicates that the volume differs from the expected value by between 5 and 7%. Dots in blue in Fig. 4(a) refer to the warning flag WK2 which, as can be seen, often occurs in the analysed dataset. The WK2 flag indicates that the concentrations measured over the A and B gold cartridges in the Tekran instrument diverge by more than 10% (see Table 2). In Fig. 4(b), PBM data from the desorption cycle are reported. In addition to the volume-related problem (IV7, dots in red) already encountered with the GEM data, for mercury speciation the automated quality screening highlights an issue associated with high zero values during the desorption cycle, tagged with the WS0 flag (dots in pink). Similarly, during the calibration cycle, shown in Fig. 4(c), there are issues notified by flags WZ0 and IZ1 (dots in brown and red, respectively). Both flags are related to a high calibration blank: WZ0 is a warning, while IZ1 invalidates the whole calibration cycle. It is important to consider that flags related to calibration cycles affect all related data.

Fully automated checks have limitations: there is a risk that a real and potentially important phenomenon could be ignored, e.g. when a real but extreme value is censored for falling outside an expected range. To ensure that this does not happen, the G-DQM system includes a mandatory final step, requiring that all data, and especially those flagged as suspicious, be carefully reviewed by the responsible scientist/site operator of each station. In the case of the Longobucco dataset, the station experts re-evaluated all GEM data tagged with the WK2 flag as valid. The reason is that the higher A/B cartridge divergence did not occur continuously over time, thus confirming that there was no issue related to cartridge passivation. Moreover, they identified a strong increase in PBM concentrations, tagged as suspicious (WS0) by the G-DQM automatic screening, as the consequence of a concurrent Saharan dust storm. Control analysis was also carried out by site operators for data related to calibration cycles. The final valid dataset for Longobucco will thus consist of data highlighted in green in Fig. 5, which is the result of the combination of the valid calibration cycles and the manual review performed by the site operators.

Fig. 4 Time-series in which data points change colour according to the main flag assigned by the first automatic G-DQM step process. They refer to: (a) GEM concentrations; (b) PBM concentrations and (c) response factor values, recorded at the Longobucco station in 2013.

Fig. 5 Time-series highlighting the final (valid or invalid) Longobucco GEM dataset after all the G-DQM step processes.
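The A/B cartridge check behind the WK1 and WK2 flags discussed in this case study follows directly from the criteria in Table 2. A minimal Python sketch is given below; the function names are illustrative, and the helper that tests whether a window of readings is flagged WK2 without interruption is only one possible way to express the "continuous over time" condition considered by the station experts.

```python
def apd(a, b):
    """Absolute percent difference between cartridge A and B readings."""
    return abs(a - b) / ((a + b) / 2) * 100


def cartridge_flag(a, b):
    """Return "WK2", "WK1" or None using the Table 2 thresholds."""
    difference = apd(a, b)
    if difference > 10:
        return "WK2"
    if difference > 5:
        return "WK1"
    return None


def wk2_is_continuous(pairs):
    """True if every (A, B) pair in the window is flagged WK2.

    Hypothetical helper: the window length (e.g. one day of readings)
    is chosen by the caller.
    """
    return len(pairs) > 0 and all(cartridge_flag(a, b) == "WK2" for a, b in pairs)


window = [(1.20, 1.60), (1.50, 1.52), (1.50, 1.60)]
window_flags = [cartridge_flag(a, b) for a, b in window]
```

In this window only the first pair exceeds the 10% limit, so the divergence is not continuous, which mirrors the situation that led the Longobucco operators to re-evaluate the WK2 data as valid.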

6 Data quality evaluation of the ongoing GMOS stations

As with the Longobucco dataset, using G-DQM it was possible to screen each GMOS monitoring dataset reported in Fig. 1 and 2. The results for data quality, in terms of flag incidence, are presented below: the three Tekran operational phases are considered separately, even if, as expected, at the end of the validation process the calibration results will affect both GEM/TGM and DES data. For the main Tekran parameters, box-and-whisker plots are later reported highlighting the compliance of the ongoing GMOS Tekran measurements with the adopted SOPs.

6.1 Issues affecting GEM/TGM measurements

For each of the 16 GMOS stations studied, a summary of the percentage of both warning and invalidating flags affecting GEM/TGM datasets is shown in Table 6. The results reveal that the larger part of the datasets is affected by various quality-related problems, labelled by different numbers of flags. For these datasets, only a lower percentage completely meets the criteria for GEM/TGM. Much of the data is tagged with the WK1 and WK2 flags, which both refer to the A/B cartridge divergence. The GMOS SOPs specifically recommend calculating the Absolute Percent Difference (APD) between the A and B cartridges and then checking whether it falls in the range 5–10% (WK1) or is higher than 10% (WK2, see Table 2). If data are continuously flagged with WK2 over one day, the underestimated values are discarded. The APD distribution recorded at each of the 16 GMOS stations examined was observed over the period 2011–2013 and is shown using a box and whisker plot (Fig. 6). From this figure, the level of compliance with the current version of the GMOS SOPs can be easily seen: there are 6 stations with more than 50% of their datasets over the higher cut-off value, and numerous stations that fall within the warning range. Only a very small percentage of each dataset is compliant with GMOS recommendations.

6.2 Issues affecting desorption cycles

Table 7 shows the warning and invalidation flags for the GMOS master sites under study. The most commonly observed flag was WS0, which refers to high values for the third step of the desorption cycle (label C in Table 5), also considered as a speciation blank measurement. In this regard, a specific warning range has been introduced: the range 1.67–10 pg m⁻³ corresponds to the warning tagged WS0. Values of the speciation blank (C step) higher than 10 pg m⁻³ lead, instead, to an invalidation flag (IS1) that determines the invalidation of the whole DES cycle. The distribution of values observed at the GMOS master stations is shown as a box-and-whisker plot in Fig. 7, where the compliance (and non-compliance) with the control limits can clearly be seen. Only one site shows more than half of its dataset in the invalid range. The other four sites are mostly compliant with this specific QA criterion.

Table 6 Incidence of flags for GEM/TGM data within each examined GMOS dataset. Secondary and master station codes are reported in the first column

Station    INP    IDL    IV7    IB5    IB0    IMX    WV5    WB3    WB2    WB1    WK1    WK2    WE5    WM2    WTG     OK
S1        0.1%   0.0%   0.0%   0.2%   0.0%   0.0%   0.0%   0.1%   0.1%  20.0%  22.2%   8.4%   0.1%   0.0%   0.1%  48.8%
S2        0.5%   0.0%   0.0%   2.8%   0.0%   0.2%   0.0%   0.5%   1.7%   0.1%  28.4%  11.8%   0.4%   0.1%   0.0%  53.4%
S3        3.3%   0.0%   3.0%   3.4%   0.0%   0.1%   0.0%   4.6%   0.0%  24.4%  21.5%   8.2%   0.0%   0.3%   0.4%  31.0%
S4        0.0%   0.0%   0.0%   0.0%   0.0%   0.0%   0.0%   0.0%   0.0%   1.2%  17.9%  48.9%   0.0%   0.0%   0.2%  31.7%
S5        0.0%   0.0%   2.7%   0.0%   0.0%   0.0%   2.4%   0.0%   0.0%  10.2%  20.2%  23.1%   0.0%   0.0%   0.0%  41.3%
S6        2.0%   0.1%   0.0%   0.7%   0.0%   0.2%   0.0%   1.9%   0.1%   2.2%  20.0%  41.3%   0.4%   1.0%   0.3%  29.9%
S7        0.0%   0.0%   0.0%   0.0%   0.0%   0.0%   0.0%   0.0%   0.0%   0.2%   7.8%   1.5%   0.0%   0.0%   0.0%  90.5%
S8        3.8%   0.1%   0.1%   0.1%   0.0%   0.0%   0.0%   0.0%   0.5%   9.0%  26.9%  11.4%   0.5%   0.0%   0.0%  47.4%
S9        0.0%   0.0%   0.0%   0.8%   0.0%   1.1%   0.0%   0.0%   9.7%  61.0%  16.4%  10.7%   0.0%   0.0%   0.0%   0.4%
S10       0.0%   0.0%   0.0%   0.5%   0.0%   1.0%   0.0%   3.5%   0.0%   0.2%  19.5%  37.9%   5.6%   1.2%   0.0%  30.5%
S11       2.7%   0.0%   0.0%   0.0%   0.0%   0.0%   0.0%   0.1%   0.0%  21.4%  15.2%   5.0%   0.0%   0.1%   0.0%  55.3%
M1        3.8%   0.0%   0.0%   0.0%   3.3%   0.0%  16.7%   0.1%  12.2%   8.6%  33.8%   1.0%   1.7%   0.1%   0.0%  18.7%
M2        0.1%   0.0%  10.0%   1.7%   0.2%   0.0%   0.2%   0.0%   0.0%  25.0%  19.1%   0.6%   0.1%   0.0%   0.1%  42.9%
M3        8.0%   0.0%  18.6%   0.1%   2.7%   0.0%  11.4%   0.0%   1.6%   9.5%   6.4%   0.2%   3.3%   0.5%   0.1%  37.4%
M4        9.4%   0.0%   8.3%  19.2%   0.1%   0.0%   0.6%   0.0%  39.6%  11.1%   8.1%   0.7%   0.0%   0.1%   0.0%   2.7%
M5        0.1%   0.0%  21.2%   1.2%   0.0%   0.1%   0.0%   7.8%   0.0%  10.2%  11.0%  24.6%   0.0%   0.3%   0.0%  23.5%

Fig. 6 Distribution of Absolute Percent Difference (APD) within each examined GMOS dataset. Each box includes the median (midlines), 25th and 75th percentile (box edges), and 5th and 95th percentile (whiskers).

Table 7 Incidence of flags for the DES cycle within each examined GMOS dataset. Master station codes are reported in the first column

Station      IS1  IP1/IG1  IP2/IG2      WS0       OK
M1          5.5%    16.5%    11.8%    49.4%    16.9%
M2          0.0%     0.2%     0.3%     9.8%    89.7%
M3          0.5%    12.5%     9.5%    33.5%    44.0%
M4          2.0%     0.5%     0.5%    13.7%    83.4%
M5          6.1%    29.0%     1.3%    22.3%    41.2%

Fig. 7 Distribution of speciation blank values (pg m⁻³) within each examined GMOS dataset. Each box includes the median (midlines), 25th and 75th percentile (box edges), and 5th and 95th percentile (whiskers).

6.3 Issues affecting calibration cycles

The percentage of flags was also calculated for the calibration cycle (Table 8). The results reveal that this particular operational phase of the Tekran instrument was often affected by an issue related to the Response Factor (RespFact). For this parameter, the G-DQM system produces a warning flag (WR1) if the RespFact falls in the range of 4 × 10⁶–6 × 10⁶ units, or if it exceeds 12 × 10⁶ units. If the RespFact is lower than 4 × 10⁶ units, the system returns an invalid flag (IR2). In Fig. 8, the distribution of the RespFact values recorded during calibration cycles is shown as a box-and-whisker plot. It can be seen that for three stations nearly half the data proved to be invalid. Another nine stations had at least half of their RespFact values within the warning range.
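The warning and invalidation ranges quoted in Sections 6.1 to 6.3 can be collected in a short sketch. This is an illustration under stated assumptions rather than the G-DQM code: the function names are hypothetical, the treatment of values that sit exactly on a limit is a guess, and the APD formula (difference relative to the mean of the two cartridge readings) is an assumption, since the text names the quantity but does not define it.

```python
def apd(a, b):
    """Absolute percent difference between cartridge A and B readings.

    Assumption: the difference is taken relative to the mean of A and B.
    """
    mean = (a + b) / 2.0
    if mean == 0:
        raise ValueError("APD is undefined when both readings are zero")
    return abs(a - b) / mean * 100.0


def flag_apd(apd_percent):
    """A/B cartridge divergence: 5-10% -> WK1, above 10% -> WK2."""
    if apd_percent > 10.0:
        return "WK2"
    if apd_percent >= 5.0:
        return "WK1"
    return "OK"


def flag_speciation_blank(c_step):
    """Speciation blank (C step), pg m^-3: 1.67-10 -> WS0, above 10 -> IS1."""
    if c_step > 10.0:
        return "IS1"  # invalidates the whole DES cycle
    if c_step >= 1.67:
        return "WS0"
    return "OK"


def flag_resp_fact(resp_fact):
    """Response factor: below 4e6 -> IR2; 4e6-6e6 or above 12e6 -> WR1."""
    if resp_fact < 4e6:
        return "IR2"
    if resp_fact <= 6e6 or resp_fact > 12e6:
        return "WR1"
    return "OK"


if __name__ == "__main__":
    print(flag_apd(apd(1.0, 1.2)))       # readings are made-up values
    print(flag_speciation_blank(5.0))
    print(flag_resp_fact(8e6))
```

Keeping each rule in its own small function makes the cut-off values easy to compare with the SOPs and to adjust if the limits are revised.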

Fig. 8 Distribution of response factor values (units) within each examined GMOS dataset. Each box includes the median (midlines), 25th and 75th percentile (box edges), and 5th and 95th percentile (whiskers).
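The box-and-whisker summaries described in the captions of Fig. 6 to 8 rest on five percentiles per dataset. The sketch below shows one way to obtain them with the Python standard library. The function names are hypothetical, and the interpolation method (`method="inclusive"`) is an assumption, because the text does not state how the percentiles were computed. The helper `share_above` illustrates statements such as "more than 50% of the dataset over the higher cut-off value".

```python
import statistics


def box_stats(values):
    """Return the 5th, 25th, 50th, 75th and 95th percentiles of `values`.

    statistics.quantiles with n=20 yields 19 cut points at 5% steps,
    so index 0 is the 5th percentile and index 18 the 95th.
    """
    cuts = statistics.quantiles(values, n=20, method="inclusive")
    return {
        "p5": cuts[0],       # lower whisker
        "p25": cuts[4],      # lower box edge
        "median": cuts[9],   # midline
        "p75": cuts[14],     # upper box edge
        "p95": cuts[18],     # upper whisker
    }


def share_above(values, cutoff):
    """Percentage of values strictly above `cutoff`."""
    values = list(values)
    return 100.0 * sum(v > cutoff for v in values) / len(values)


if __name__ == "__main__":
    demo = [2.0, 4.5, 6.0, 7.5, 9.0, 11.0, 12.5, 15.0]  # made-up APD values, %
    print(box_stats(demo))
    print(share_above(demo, 10.0))
```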

Table 8 Incidence of flags for the CAL cycle within each examined GMOS dataset. Secondary and master station codes are reported in the first column

Station    IR2    IC2    IZ2    IF2    WR1    WC1    WD1    WD2    WZ1    WF1     OK
S1       14.1%   0.0%   1.0%   1.0%  42.4%   1.0%  13.1%   8.6%   1.0%   6.6%  11.1%
S2        4.3%   5.4%   2.3%   0.5%  28.8%   2.9%   4.8%  14.2%   2.3%   0.5%  34.0%
S3       54.8%   1.2%   8.0%   0.2%   8.0%   0.3%   3.6%   4.6%  18.1%   0.8%   0.5%
S4        0.0%   0.0%   1.5%   0.0%  25.3%   3.0%  15.6%   7.4%   7.4%   5.2%  34.6%
S5       10.7%  12.1%   9.3%   0.0%  19.3%   4.3%  14.3%  13.6%  12.9%   0.0%   3.6%
S6       22.3%  18.0%   0.9%   0.9%  18.6%   8.8%   9.1%  16.5%   0.9%   0.9%   3.0%
S7        0.4%   0.4%   0.4%   1.1%  71.2%   0.0%   1.9%   1.9%   0.4%   0.0%  22.3%
S8        9.8%   2.6%   0.8%   0.9%  44.7%   0.0%   7.2%   7.4%  14.3%   2.5%   9.8%
S9        0.0%   0.0%  26.5%   0.3%  28.9%   0.0%   1.3%   1.3%  32.6%   7.4%   1.7%
S10       2.8%  10.3%   0.0%   0.8%  23.8%  12.7%  10.7%   6.7%   1.6%   2.0%  28.6%
S11      68.3%   3.8%   1.3%   1.2%   7.4%   3.9%   7.6%   3.8%   1.4%   0.9%   0.4%
M1       20.9%  20.6%   0.4%   0.8%  22.2%   5.4%   3.8%   2.3%   3.3%   0.1%  20.2%
M2       30.3%   0.0%   0.6%   2.2%  27.0%   0.0%   6.7%   9.6%  16.3%   7.3%   0.0%
M3        3.3%   5.7%   0.5%   0.6%  39.9%   4.3%   4.2%   3.6%   2.3%   0.2%  35.4%
M4       10.4%   2.5%   2.5%   1.4%  25.2%   8.3%  13.3%  10.4%   6.5%   4.3%  15.1%
M5        3.5%   5.0%   6.3%   0.4%  40.1%   5.2%   7.4%   5.8%  11.0%   6.1%   9.1%

7 Conclusions and future directions

The monitoring network established within the Global Mercury Observation System (GMOS) project provides a valuable resource for a deeper understanding of atmospheric mercury concentration and distribution trends on a global scale. In the context of the UNEP Global Mercury Partnership, the results of this on-going project are also expected to support the effective implementation of the Minamata Convention, which is aimed at reducing the harmful impacts of mercury on human and ecosystem health. Although current instruments measuring mercury levels in air may provide useful information to both the policy and scientific communities, they are susceptible to malfunctions that can result in lost or poor-quality data. Some level of instrument failure is inevitable; however, steps can be taken to minimize the risk of loss and to improve the overall quality of the data.

The G-DQM system is a web-based tool for controlling data quality that has been specifically developed to ensure data comparability among atmospheric mercury datasets collected within the GMOS network. Its application to three years of data allowed a very detailed analysis for each Tekran analyser used in the network. This centralized tool gave a fast and general overview of the analyser behaviour, and a rapid check of data quality. The flags adopted to tag values within datasets allowed us to understand issues that occur frequently and noticeably affect data quality. The analysis performed here by means of the G-DQM on the GMOS network should be considered preliminary, since the site operator approval step is necessary to finalize the validation process through a human check. However, the results presented here provide an important first assessment of the mercury data acquired at the on-going GMOS stations and give important feedback for future instrument management and maintenance guidelines that could be taken into account in the further development of mercury-oriented monitoring networks.

G-DQM has been specifically designed to give rapid feedback on the monitoring of atmospheric mercury based on the Tekran instrument, and is now being expanded to include the mercury analyser manufactured by Lumex, following ad hoc SOPs. Further progress will also include an inter-comparison with existing systems aimed at quality assuring and controlling mercury datasets. Apart from mercury, the amount of environmental data in general is expected to increase rapidly in the coming years; there is thus an increasing need for automated, platform-based methods to check and correct data to ensure that datasets provided to various end users are of the highest quality.

Acknowledgements

This work contributes to the EU-FP7 project Global Mercury Observation System (GMOS). We deeply thank the staff at the GMOS stations: for Bariloche (Argentina), M. Diguez and E. Garcia; for Calhau (Cape Verde), K. Read; for Cape Point (South Africa), M. Lynwill and E. G. Brunke; for Celestun and Sisal (Mexico), F. Sena; for Col Margherita (Italy), W. Cairns; for Ev-K2 (Nepal), I. Ammoscato; for Iskrba (Slovenia), J. Kotnik and M. Horvat; for Kodaikanal (India), R. Ramachandran; for La Seyne-sur-Mer (France), J. Knoery; for Longobucco (Italy), F. Cofone and I. Ammoscato; for Manaus (Brazil), P. Artaxo and F. Morais; for Mt. Ailao, Mt. Changbai and Mt. Walinguan (China), X. Feng, X. Fu and H. Zhang; and for Station Nord (Greenland), C. Nordstroem and H. Skov.

