Ann. Occup. Hyg., 2014, Vol. 58, No. 4, 450–468 doi:10.1093/annhyg/met081 Advance Access publication 21 January 2014
Reliability of the Advanced REACH Tool (ART) Jody Schinkel1*, Wouter Fransman1, Patricia E. McDonnell2, Rinke Klein Entink1, Erik Tielemans1 and Hans Kromhout3
A b st r a ct Objectives: The aim of this study was to assess the reliability of the Advanced REACH Tool (ART) by (i) studying interassessor agreement of the resulting exposure estimates generated by the ART mechanistic model, (ii) studying interassessor agreement per model parameters of the ART mechanistic model, (iii) investigating assessor characteristics resulting in reliable estimates, and (iv) estimating the effect of training on assessor agreement. Methods: Prior to the 1-day workshop, participants had to assess four scenarios with the ART. During two 1-day workshops, 54 participants received 3-h training in applying the mechanistic model and the technical aspects of the web tool. Afterward, the participants assessed another four scenarios. The assessments of the participants were compared with gold standard estimates compiled by the workshop instructors. Intraclass correlation coefficients (ICCs) were calculated and per model parameter and the percentage agreement and Cohen kappa statistics were estimated. Results: The ICCs showed good agreement before and almost perfect agreement after training. However, substantial variability was observed between individual assessors’ estimates for an individual scenario. After training, only 42% of the assessments lay within a factor of three of the gold standard estimate. The reliability appeared to be influenced by several factors: (i) information provided by text and video hampered the assessors gaining additional information required to make the assessments, (ii) for some parameters, the guidance documentation implemented in the tool may have been insufficient, and (iii) in some cases, the assessors were not able to implement the information explicitly provided. Conclusions: The ART is an expert tool and extensive training is recommended prior to use. Improvements of the guidance documentation, consensus procedures, and improving the training methods could improve the reliability of ART. Nevertheless, considerable variability can be expected between assessors using ART to estimate exposure levels for a given scenario. K e y w or d s : Advanced REACH Tool; expert judgement; exposure assessment; reliability
I n t ro d u ct i o n Both quantitative measures of personal exposure to contaminants through air monitoring and professional
judgment can be employed in occupational exposure assessment. However, actual measurements are often considered labor intensive and too expensive.
© The Author 2014. Published by Oxford University Press on behalf of the British Occupational Hygiene Society.
450 •
Downloaded from http://annhyg.oxfordjournals.org/ at Cornell University Library on July 22, 2015
1.TNO, PO Box 360, 3700 AJ Zeist, The Netherlands 2.Alkermes Pharma Ireland Limited, Monksland, Athlone, Ireland 3.Institute for Risk Assessment Sciences, Environmental Epidemiology Division, Utrecht University, 3508 TD Utrecht, The Netherlands *Author to whom correspondence should be addressed. Tel: +31-88-866-1532; fax: +31-88-866-8766; e-mail:
[email protected] Submitted 6 June 2013; revised 9 December 2013; revised version accepted 16 December 2013.
Reliability of the ART • 451
de Cock et al., 1996; Kunac et al., 2006; Logan et al., 2009; Vadali et al., 2009; Friesen et al., 2011), only very few studies have been conducted and published on the reliability of occupational exposure models. Two of the exceptions are firstly, a reliability study of the dermal exposure assessment method (DREAM) (van Wendel de Joode et al., 2005) and secondly, a study on EASE (Cherrie et al., 2003). Our reliability study investigated the reliability of the ART web tool (using the currently available version 1.0) when used by different users relying on workplace conditions described by text and videos. The users had to interpret the same information for the ART parameters. Two 1-day workshops were organized. Health and safety professionals from the Netherlands and Belgium assessed four (either liquid or dust) exposure scenarios with the ART before they attended the workshop. During the workshop, the health and safety professionals received 3-h training on the use of the mechanistic model of the ART and technical aspects of the web tool. During the afternoon, another four (either liquid or dust) exposure scenarios were assessed by the same participants. The aim of our study was to assess the reliability of ART by (i) studying the interassessor agreement of the resulting exposure estimates generated by the ART mechanistic model, (ii) studying interassessor agreement per parameter of the ART mechanistic model, (iii) investigating assessor characteristics explaining the deviations from the gold standard estimates, and (iv) estimating the effect of training on assessor agreement. M et h o d s
The ART framework The ART mechanistic model is based on a source– receptor model (Cherrie et al., 1996; Tielemans et al., 2008) which describes a stepwise transport of a contaminant from the source to the receptor and defines independent principal modifying factor (MFs). Each of these MFs has a set of underlying parameters and inputs of the parameters are used to calculate an exposure estimate. The mechanistic model is described extensively elsewhere (Fransman et al., 2011) and summarized below. The ART mechanistic model consists of one algorithm to estimate the contribution from near-field (NF) [equation (1)] and one for estimating the contribution from far-field (FF) sources [equation (2)].
Downloaded from http://annhyg.oxfordjournals.org/ at Cornell University Library on July 22, 2015
Furthermore, regulations require exposures from new chemicals and new operations to be assessed before workers are actually exposed, often making exposure assessment based on exposure measurements impossible. In such cases professional judgment could be used; however, one of the disadvantages of using professional judgment for estimating exposures is the underlying black box of unstructured opinions about the respective exposure situation (Cherrie et al., 1996). The professionals conducting the assessments may not be familiar with the jobs or industry and their background may influence how they assess exposure (Kromhout et al., 1987; Steinsvag et al., 2007). Exposure models may be more transparent alternatives to assist in the exposure assessment process. The Advanced REACH Tool (ART) was developed as a higher tier exposure assessment model for the purpose of the compilation of exposure scenarios under the new European legislation, Registration Evaluations Authorization and restriction of Chemicals (REACH). The ART estimates exposure levels for specific scenarios for groups of workers that share operational conditions and risk management measures across different workplaces in Europe. The ART is freely available online (www.advancedreachtool.com, accessed December 2013) (Tielemans et al., 2011). When developing exposure models both validity and reliability of the model are important issues to address (van Wendel de Joode et al., 2005). Validation compares the model estimates of exposure to measurement data (accuracy and bias) while the reliability is a measure of the consistency of assessments or of the ability of different assessors to reach the same conclusions about a specific case (Kunac et al., 2006). If the model is used within the applicability domain of the model, user variation in model estimates may occur if a user has a limited understanding of the exposure scenario (operational conditions and risk management measures), or when the exposure model and/or if the model is misused. The impact of user variation could have serious consequences for workers’ health, if an exposure scenario is incorrectly diagnosed as ‘safe’, or for the financial situation of the organization if an exposure scenario is incorrectly diagnosed as ‘unsafe’, which could lead to costly overengineering. While studies have investigated the reliability or accuracy of subjective judgments of exposure by experts and layman (Kromhout et al., 1987; Post et al., 1991;
452 • Reliability of the ART
Personal exposure from a near-field source (Cnf ) is a multiplicative function of substance emission potential (E), activity emission potential (H), (primary) localized control (LC1), secondary localized control (LC2), and dispersion (D). The algorithm for a farfield source (Cff ) also includes segregation (Seg) and personal enclosure/separation (Sep). Cnf = (Enf ⋅H nf ⋅LCnf1 ⋅LCnf2 ) ⋅Dnf (1)
Cff = (Eff ⋅H ff ⋅LCff1 ⋅LCff2 ⋅Seg ff ) ⋅Dff ⋅Sep (2)
The level of surface contamination (Su) for each activity depends on the location of the source, i.e. whether there is (i) a near-field source only [equation (3)], (ii) a far-field source only [equation (4)], or (iii) both near- and far-field sources (in which case the surface contamination in the near-field is assumed to dominate that of the far-field [see equation (3)]):
Su nf = Su factor ⋅( Enf ⋅H nf ⋅LCnf1 ⋅LCnf2 ⋅Dnf ) (3)
Su ff = Su factor ⋅(Eff ⋅H ff ⋅LCff1 ⋅LCff2 ⋅ (4) ⋅Seg ff ⋅Dnf ⋅Sep ff )
Subsequently, the overall exposure is estimated by algorithm [equation (5)]:
Ct =
1 t total
∑(texposure ⋅(Cnf +Cff + Su)) (5) tasks
The algorithm considers multiple activities [and exposure time (texposure)] within an 8-h work shift (ttotal) and also allows periods with assumingly zero exposure (tnon-exposure). Relative multipliers have been assigned to the underlying categories per determinant of each MF which are used as multipliers in the mechanistic model algorithm. The ART mechanistic exposure algorithm results in dimensionless relative exposure scores, which were calibrated using exposure measurements, enabling the mechanistic model to translate the relative scores into a geometric mean (GM) exposure level of an exposure scenario in mg·m−3 (Schinkel et al., 2011). This study focuses on the reliability of the mechanistic model of the ART for exposures to inhalable dust and exposure
Study design Health and safety professionals were approached to participate via a mailing to registered ART users and via a mailing of the membership of the Dutch Occupational Hygiene Association (NVvA). Participants were asked to (i) estimate the exposure of four different (either liquid or dust) exposure scenarios with the ART web tool version 1.0 (with similar mechanistic model as version 1.5) before the actual workshop took place, (ii) attend a 3-h training session on the scientific background of the ART mechanistic model and the functionalities of the web tool, and (iii) after the training to, assess another four (either liquid or dust) exposure scenarios with the ART). In total, 54 health and safety professionals attended the workshop. The 27 professionals that responded first were placed in the ‘dust’ workshop (assessing dust scenarios) and the remaining 27 professionals were placed in the ‘liquid’ workshop (assessing liquid scenarios). Prior to the workshop, two of the authors ( J.S. and W.F.) compiled eight exposure scenarios for both the dust and liquid workshop. Each exposure scenario was described by a video accompanied with a text document. The videos showed the product (only in case of situations describing exposure to dusts), the activity performed and the workplace, while information like product composition, vapor pressure, room volume, and ventilation rate were described by text. Two weeks before the workshop, the information was sent via e-mail to the participants. Table 1 summarizes the level of detail of information provided per MF through the videos or text. A questionnaire was used to gather information about work and career history for each participant. This information was used to categorize the assessors into four groups with comparable education levels, work experience, professions, age, and their years’ experience with other exposure assessment tools. Table 2 summarizes the characteristics of the assessors. Prior to the workshop, Group A assessed the dust scenarios 1–4 and Group B assessed scenarios 5–8. After the workshop, Group A assessed scenarios 5–8 and Group B assessed scenarios 1–4. A similar design was used for the Groups C and D for the liquids workshop. The online ART version 1.0 provides a summary report in a PDF, which lists the ART model
Downloaded from http://annhyg.oxfordjournals.org/ at Cornell University Library on July 22, 2015
to (non-)volatile liquids and does not address the Bayesian application of the tool.
Reliability of the ART • 453
Table 1. Outline of information on MFs and underlying parameters provided by the video or text Information provided
Rationale for providing information
Product characteristics
Participants had to decide on product descriptions
Dustiness
Product samples are shown in video
Dustiness test results are not normally available, so have to use subjective assessment of dustiness based on visual inspection of the product
Moisture content
Product samples are shown in video
In reality, this is quite subjective but the ART description text should guide participants
Percentage of substance in the product
Specify the percentage of substance in the product in text
This information is readily available in the workplace (e.g. material safety data sheets)
Vapor pressure
Vapor pressure for the substance of interest is provided in the text
This information is readily available in the workplace (e.g. material safety data sheets)
Viscosity
In the video the product is shown
In reality, this is quite subjective and the assessor should assess the viscosity of a product based on visual inspection of the product
Activity coefficient
Provided in the text
If all the information is available, the assessor could use other tools to derive the right activity coefficient which is outside the scope of this study. The activity coefficient used in this study was per default 1
Temperature
Not provided
The temperature was only provided in the single case that the temperature of the process was 120°C. All the other cases, room temperature was assumed
Activity (sub)class
Brief description of activity (e.g. using compressed air, tableting, degreasing, refueling) and a video of operator performing task
Participants had to decide on activity class and subclass based on the guidance provided by the tool
Activity class-specific determinants
The performed activity was shown in the video
Participants had to decide on activity-related determinants based on the guidance provided by the tool
Exposure form Exposure form Substance emission potential
Downloaded from http://annhyg.oxfordjournals.org/ at Cornell University Library on July 22, 2015
Activity emission potential
454 • Reliability of the ART
Table 1. Continued Information provided
Rationale for providing information
The workplace where the activity was performed was shown in the video
Based on the workplace shown in the video the participants had to decide whether or not localized controls were present. If so, the participant had to decide which localized control based on the guidance provided by the tool
The workplace where the activity was performed was shown in the video and the text incorporated a brief description about cleaning procedures
In reality, cleaning procedures are something that could be recovered at the workplace and surface contamination could be visually depicted at the workplace
Distance to source
The performed activity was shown in the video
Participants had to decide on distance to source based on the activities shown in the video
Location
The performed activity was shown in the video and characteristics of the location were provided in the text
Based on the video it was for indoor situations difficult to say something about the room volume or spray booth type. Information on room volume was provided in the text. Based on the guidance provided by the tool the participants had to choose the right categories
Ventilation
The performed activity was in the video and characteristics of the ventilation were provided in the text
Based on the video it was for indoor scenarios not possible to say something about ventilation rates. Information about ventilation was provided in the text. Based on the guidance provided by the tool the participants had to choose the right categories
The performed activity was shown in the video and characteristics of the activity were provided in the text
The activity was shown in the video. However, if the worker was not seen in the video because the activity was performed in another room this was mentioned in the text
The performed activity was shown in the video and characteristics of the activity were provided in the text
The activity was shown in the video. However, if the worker was not seen in the video because he was in a control room during the activity this was mentioned in the text
Localized controls Localized control
Surface contamination
Dispersion
Segregation Segregation
Personal enclosure Personal enclosure
Downloaded from http://annhyg.oxfordjournals.org/ at Cornell University Library on July 22, 2015
Question regarding enclosure of process, housekeeping, and maintenance
Reliability of the ART • 455
Table 2. Assessor characteristics regarding profession and experience per group Group A
Group B
Group C
Group D
Number of assessors
14
13
13
14
Workshop
Solids
Solids
Liquids
Liquids
Mean age
47
45
47
44
MSc or PhD
7
7
9
8
Not specific, MBO or HBO
7
6
4
6
Consultant
2
2
4
0
Occupational hygienist
8
7
7
13
Else (safety expert, toxicologist)
4
4
2
1
10 years
7
8
12
9
10 years
4
7
11
6
Never
4
4
3
6
1–10 times a year
7
7
7
6
>10 times a year
3
2
3
2
3
1
1
2
Occupational hygiene education
Experience in occupational hygiene
Experience in chemical exposure
Conducting exposure measurements
Experience with exposure models (training received) No experience with exposure models Experience with EASE
8 (4)
8 (4)
9 (5)
7 (3)
Experience with Ecetoc TRA
6 (2)
6 (1)
1
2 (1)
Experience with Stoffenmanager
8 (2)
12 (3)
Experience with welding fume assistant
11 (2)
10 (2)
6
3
7
6
2
3
0
4 (2)
Never
5
1
2
2
25 assessments
4
7
4
2
Experience with other models (COSHH, Consexpo, MEASE, DREAM, BEAT, RISKOFDERM) Exposure models used
Downloaded from http://annhyg.oxfordjournals.org/ at Cornell University Library on July 22, 2015
Profession
456 • Reliability of the ART
Table 2. Continued Group A Experience with the ART (training received)
7 (3)
Group B
Group C
Group D
8 (2)
4 (0)
5 (2)
ART used Never
8
5
9
12
25 assessments
1
4
1
0
Statistical analyses Scores were assigned for each of the ART parameters for each of the 16 exposure scenarios, by one member of the project team ( J.S.) and subsequently reviewed by the other project team member (W.F.). Both project members were involved by the development of
the ART mechanistic model and the writing of the guidance text of the model parameters being used for the workshop. After looking at the videos and process descriptions in the information provided, both project members assigned inputs for the model parameters and discussed until consensus was reached. This consensus result was used as the ‘gold standard’ assessment for each parameter used in the mechanistic model. The level of agreement between each of the 54 assessors and the gold standard was investigated by estimating the percentage of ratings per exposure parameter that was in agreement with the gold standard assignment. In order to be very strict in the level of agreement, agreement was only the case if the assessor inputted the closest level of detail for a specific determinant. For example, if the description of the exposure situation states that the concentration of the substance of interest was 35%, only the exact input for fraction (0.35) was deemed as the correct input, and the categorical input of 10–50% was counted as non-agreement. Similar requirements were followed for room volumes and ventilation rates. Although all of the exposure scenarios described in this study consisted of single activities, the ART allows assessors to assign multiple activities. If an assessor assigned multiple activities, the exposure estimate based on the multiple activities was used but the inputs of the main activity within a scenario were used in the analyses for investigating the agreement per parameter. To investigate the interassessor agreement for ART exposure estimates, intraclass correlation coefficients (ICCs) were calculated for the exposure estimate that were derived before and after training and stratified for solid and liquid situations. ICCs are a measure of interrater agreement for continuous data and may be
Downloaded from http://annhyg.oxfordjournals.org/ at Cornell University Library on July 22, 2015
inputs per parameter and resulting exposure estimate for each scenario assessment. Participants were asked to send their PDF reports of the four assessed scenarios to the first author of this manuscript the week before the workshop. The solid and liquid workshop took place in TNO in Zeist (The Netherlands) in May 2012. Both workshops were attended by 27 different assessors. Because of the study design, during the workshop, the assessors did not receive feedback, were not allowed to discuss the scenarios with each other, and could not ask questions about their assessments. The instructors ( J.S. and W.F.) provided exactly the same training for the ‘dust’ and ‘liquid’ workshops. The user inputs for the mechanistic model parameters are used to estimate a GM exposure level. The ART web tool also provides the option to assess different outputs (e.g. full-shift exposure estimates, long-term exposure estimates, different percentiles of the exposure distribution and confidence intervals (CI), with or without taking into account non-exposure periods). However, these different outputs are based upon derived variance components and not on the user inputs for the mechanistic model. For the model outputs to be comparable, all the exposure estimates derived by the participants were standardized to full-shift 50th percentile exposure estimates, without CI and without non-exposure periods.
Reliability of the ART • 457
R e s u lts The calculated ICCs were 0.77 and 0.76, respectively, for the solid and liquid scenarios before training and 0.86 and 0.82, respectively, after the training session. These results indicate a tendency toward improvement of the assessments after training, although the improvement was not statistically significant. Despite the good agreement (indicated by the large ICCs), there was substantial variation in exposure estimates
between assessors assessing the same scenario, as illustrated in Figs 1 and 2. Analyses of the relative bias [(log estimate − log gold standard estimate)/log gold standard estimate] showed the bias averaged over assessors and scenarios to be zero. Also, there was no statistically significant difference in average between the assessments before and after training. This result suggests that, although individual assessors may be incorrect, the average of the group was close to the gold standard estimate. However, there was a statistically significant difference in the within-scenario variability between the assessments before and after training. After training, the within-scenario variance was estimated to be 26% smaller than before training. This indicates that a higher degree of agreement was obtained between the assessors after training. The proportion of AR between the estimated exposures for each assessor and the gold standard estimate was used to illustrate the bias of assessors’ estimates with the gold standard. Where an assessor estimates an exposure level of 10 mg·m−3 while the gold standard estimate was 5 mg·m−3, the AR was 2 (10/5). Conversely, for this scenario if the assessors’ estimate was 1 mg·m−3, the AR was 5 [1/(1/5)]. Table 3 presents these AR for the solid assessments. Large differences were observed between scenarios. For example, after training 62% of the assessments for compressed air cleaning were within a factor of three from the gold standard estimate. In contrast, Table 3 also shows that for the sweeping scenario only 7% of the assessments were within a factor of three from the gold standard estimate. Training slightly improved the accuracy of the assessments. Table 4 illustrates similar results for the liquid assessments. Large differences between scenarios were observed. For example, 85% of the assessments for degreasing were within a factor of three, while only 7% of the galvanizing assessments were within a factor of three of the golden standard after training. Overall after training, 42% of the assessments were within a factor of three and 60% within a factor of nine. Consequently after training, the difference between the estimate and the gold standard for 40% of the assessments was more than a factor of nine. Logistic regression models were used to investigate whether the participants’ characteristics summarized in Table 2, increased the chance of the result of an assessment being within a factor of three of the gold
Downloaded from http://annhyg.oxfordjournals.org/ at Cornell University Library on July 22, 2015
interpreted similar to Cohen’s kappa statistic for categorical data (Fleiss, 1975). ICCs were calculated by dividing the variance in exposure estimates between exposure scenarios by the total variance in exposure estimates. Variance components were estimated by mixed linear models (PROC MIXED) of SAS (version 9.3; SAS Institute, Cary, NC, USA) using the maximum likelihood (REML) and a compound symmetric variance matrix (CS). Exposure scenario was included as random effect in the mixed model with the log-transformed estimated exposure as the dependent variable. The absolute ratio (AR) between the assessors’ exposure estimates and the gold standard estimates was calculated in order to illustrate the accuracy of the exposure estimates. Furthermore, the influence of training and the influence of participants’ characteristics summarized in Table 2, on the accuracy of the assessment was investigated using a logistic regression model with the chance of the result of an assessment being within a factor of three as the dependent variable. All the analyses were performed using SAS Statistical Software (version 9.3; SAS Institute). In order to estimate agreement between assessors per model parameter, Cohen kappa statistics (k) were estimated as the proportion of agreement beyond that expected by chance alone (Fleiss, 1981). The level of agreement expected by chance depends on the number of categories an assessor had to choose from. For continuous variables (e.g. vapor pressure and mole fraction), Cohen’s kappa statistic could not be calculated and only the percentage agreement was calculated. The strength of the interassessor agreement was qualified using terms defined by Landis and Koch (1977): kappa statistic ≤0 = no agreement (other than would be expected by chance), 0.01–0.20 = slight agreement, 0.21–0.40 = fair agreement, 0.41–0.60 = moderate agreement, 0.61–0.80 = substantial agreement, and 0.81–1.00 = almost perfect agreement.
458 • Reliability of the ART
standard assessment. Firstly, in order to investigate the effect of the characteristics presented in Table 2 on the factor change relative to gold standard estimate, these characteristics were included into an univariate regression model. Next, all the characteristics that had an significant effect were combined in multivariate logistic regression models while most characteristics were correlated. As a result, only training [training versus no training, odds ratio (OR) = 1.72 (95% CI: 1.12– 2.61)] and years of experiences in chemical exposure assessment (>10 years versus 80%. After training, 59% of the parameters had agreement >80%. The kappa statistics for 67% of the mechanistic model parameters showed substantial or almost perfect agreement (>0.60) with the gold standard before training. After training, this percentage increased to 70%. However even after training, some parameters showed less than moderate agreement with the gold standard. Table 6 presents the results per model parameter for the liquid scenarios. Prior to training, 38% of the model parameters showed overall agreement >80%. After training this proportion increased to 65%. The kappa statistics showed substantial or almost perfect agreement with the gold standard for 64% of the parameters prior to
Downloaded from http://annhyg.oxfordjournals.org/ at Cornell University Library on July 22, 2015
1 Exposure estimates for the scenario’s describing exposure to dust.
Reliability of the ART • 459
training and 79% after training. As is evident for the dust scenarios, even after training, the kappa’s for some parameters showed less than moderate agreement with the gold standard. Overall, the results for the solid and liquid assessments show similar patterns. For both solid and liquid scenarios, some activity emission potential parameters were found to be subjective and difficult to classify even after training was provided (e.g. contamination level, impaction level, and loading type). Also parameters such as viscosity, spray room type, and if the ‘outside situation was located in proximity of buildings’ were found to be difficult to assign based on the provided information. In contrast, exposure form and some activity emission potential parameters including drop height, spray technique, use rate, and surface area were assigned with almost perfect agreement. However, there were differences in level of agreement per parameter between the dust and liquid scenarios; e.g. for the parameters ‘level of containment’ and ‘exposure situation located in proximity of
buildings’, the level of agreement was almost perfect after training for the liquid scenarios while it was less than moderate for the dust scenarios. Di s c u s s i o n This study is focused on the reliability of the ART web tool. In total, 54 occupational health and safety professionals assessed four (either liquid or dust) exposure scenarios prior to 3-h training and four exposure scenarios after this training. The relatively high ICCs indicate almost perfect interassessor agreement for the overall exposure estimates. Although most of the variability in assessors’ exposure estimates was found to be between scenarios and not within scenarios (indicated by the large ICCs), substantial variability was observed between assessors assessing the same exposure scenario. After training, the assessors’ exposure estimate per scenario became more comparable. Also increased years’ experience in the field of chemical exposure assessment was shown to improve the ART
Downloaded from http://annhyg.oxfordjournals.org/ at Cornell University Library on July 22, 2015
2 Exposure estimates for the scenario’s describing exposure to liquids.
460 • Reliability of the ART
Table 3. Percentage of AR between estimates and gold standard estimates for the solid assessments stratified per scenario with and without training Workshop
AR ≤3
AR >3 and ≤9
AR >9 and ≤30
Solids
Prior
27
25
15
9
24
After
37
12
27
7
18
Prior
38
15
15
8
23
After
36
14
29
7
14
Prior
38
23
8
23
8
After
50
0
43
7
0
Prior
8
31
8
8
46
After
7
0
7
7
79
Prior
46
31
0
0
23
After
36
21
29
7
7
Prior
14
43
29
7
7
After
38
23
23
15
0
Prior
21
14
21
7
36
After
23
15
38
8
15
Prior
8
25
17
8
42
After
46
0
23
8
23
Prior
43
14
21
14
7
After
62
15
23
0
0
Crushing tablets Dumping Sweeping Floor construction Powder coating Tableting Stacking bags Compressed air cleaning
exposure estimates. However after training, only 42% of the assessments were within a factor of three of the gold standard estimate and in 40% of the scenarios the difference between the estimate and the gold standard were more than a factor of nine. Several studies have been published investigating the accuracy and reliability of (expert based) exposure assessments. These assessments were based on different levels of available information, exposure data, and experience of the assessors. As expected, such professional judgments are often subjective and result in exposure judgments with a wide range of accuracy, which is dependent on many factors (Logan et al., 2009). Our study did not investigate whether experts could provide accurate exposure estimates, but it investigated whether health and safety professionals could provide reliable inputs to the ART.
AR >30 and ≤90
AR >90
Although reliability is a necessary component of the validity of occupational exposure models, only a few published studies have investigated the reliability of worker exposure models. The most comparable reliability study is the study on the DREAM (van Wendel de Joode et al., 2005). As with the ART model, relative scores for model parameters are multiplied by an exposure algorithm which produced semiquantitative exposure estimates. However, the results of this study are not directly comparable to our study as in the DREAM study, occupational hygienists performed side by side observations of different workplace tasks. Unlike our study, they could actually see the activity rather than being limited to a video or text, this observational approach would likely involve less need to interpret workplace conditions. However, the ICCs found in the DREAM study (between 0.48 and 0.87),
Downloaded from http://annhyg.oxfordjournals.org/ at Cornell University Library on July 22, 2015
Scenario
Reliability of the ART • 461
Table 4. Percentage of AR between estimates and gold standard estimates for the liquid assessments stratified per scenario with and without training Workshop
AR ≤3
AR >3 and ≤9
AR >9 and ≤30
Liquids
Prior
35
26
11
7
22
After
46
25
13
4
12
Prior
7
21
7
14
50
After
23
38
15
8
15
Prior
86
14
0
0
0
After
85
15
0
0
0
Prior
36
14
14
14
21
After
31
23
38
8
0
Prior
14
50
29
0
7
After
31
31
31
8
0
Prior
31
15
0
8
46
After
7
36
7
0
50
Prior
15
31
0
15
38
After
50
21
7
0
21
Prior
33
25
34
0
8
After
79
7
7
0
7
Prior
54
38
8
0
0
After
64
29
0
7
0
Metal working fluids Degreasing Pesticide spraying Removing tape Galvanizing Cleaning Paint spraying Refueling
separated for potential and actual dermal exposure and for the different exposure routes, were comparable to ICCs of our study (0.76–0.86). The first explanation for the large deviations of the estimates exposures with the gold standard estimates is that information about circumstances was provided by text documents and video, so assessors were unable to look at or ask for more detailed information which they may have felt was required to make the assessments. The reliability might have been improved if the assessors were able to ask for additional information. For example, the video of the cleaning scenario showed a worker cleaning inside a low specification glove box. However, the information needed to specify the glove box was difficult to see on the video. As a result, prior to training 15% of the assessors chose the low specification glove box and 43% also chose this after training.
AR >30 and ≤90
AR >90
However, a glove box was assigned as localized control in 69% of the scenarios prior and 93% after training. Assigning a high specification glove box instead of on low specification glove box results in an overestimation of exposure by a factor 10 (Fransman et al., 2011). Classification of the type of glove box, e.g. would have been easier during a workplace visit. Secondly, for some parameters the guidance documentation implemented in the tool seems to be insufficient to come to reliable assessments. The kappa statistics for subjective parameters such as amount of product, contamination level, containment, exposure situation located close to buildings, viscosity, and general housekeeping were