Environ Monit Assess (2015) 187:422 DOI 10.1007/s10661-015-4570-y

Integration of electromagnetic induction sensor data in soil sampling scheme optimization using simulated annealing E. Barca & A. Castrignanò & G. Buttafuoco & D. De Benedetto & G. Passarella

Received: 7 January 2015 / Accepted: 26 April 2015 # Springer International Publishing Switzerland 2015

Abstract Soil survey is generally time-consuming, labor-intensive, and costly. Optimization of sampling scheme allows one to reduce the number of sampling points without decreasing or even increasing the accuracy of investigated attribute. Maps of bulk soil electrical conductivity (ECa) recorded with electromagnetic induction (EMI) sensors could be effectively used to direct soil sampling design for assessing spatial variability of soil moisture. A protocol, using a field-scale bulk ECa survey, has been applied in an agricultural field in Apulia region (southeastern Italy). Spatial simulated annealing was used as a method to optimize spatial soil sampling scheme taking into account sampling E. Barca (*) : G. Passarella Water Research Institute (IRSA)–National Research Council (CNR), Bari, Italy e-mail: [email protected] G. Passarella e-mail: giuseppe.passarella @ba.irsa.cnr.it A. Castrignanò : D. De Benedetto Research Unit for Cropping Systems in Dry Environments, Consiglio per la Ricerca in Agricoltura e l'Analisi dell'Economia Agraria, Bari, Italy A. Castrignanò e-mail: [email protected] D. De Benedetto e-mail: [email protected] G. Buttafuoco Institute for Agricultural and Forest Systems in the Mediterranean, National Research Council of Italy, Rende, CS, Italy e-mail: [email protected]

constraints, field boundaries, and preliminary observations. Three optimization criteria were used. the first criterion (minimization of mean of the shortest distances, MMSD) optimizes the spreading of the point observations over the entire field by minimizing the expectation of the distance between an arbitrarily chosen point and its nearest observation; the second criterion (minimization of weighted mean of the shortest distances, MWMSD) is a weighted version of the MMSD, which uses the digital gradient of the grid ECa data as weighting function; and the third criterion (mean of average ordinary kriging variance, MAOKV) minimizes mean kriging estimation variance of the target variable. The last criterion utilizes the variogram model of soil water content estimated in a previous trial. The procedures, or a combination of them, were tested and compared in a real case. Simulated annealing was implemented by the software MSANOS able to define or redesign any sampling scheme by increasing or decreasing the original sampling locations. The output consists of the computed sampling scheme, the convergence time, and the cooling law, which can be an invaluable support to the process of sampling design. The proposed approach has found the optimal solution in a reasonable computation time. The use of bulk ECa gradient as an exhaustive variable, known at any node of an interpolation grid, has allowed the optimization of the sampling scheme, distinguishing among areas with different priority levels. Keywords Sampling . EMI sensor . Bulk electrical conductivity . Spatial simulated annealing . Spatial variability . Soil water content

422

Page 2 of 12

Introduction Farmers around the world are becoming more and more aware that agricultural production varies across the landscape due to various naturally occurring, as well as maninduced, differences in key productivity factors (Adamchuk et al. 2011). Implementing efficient management practices requires identifying and interpreting these differences so to match agricultural inputs with locally defined crop needs (Pierce and Nowak 1999). The main target of precision agriculture (PA) is to improve management of crop production through spatially variable within-field practices, which require an accurate assessment of spatial variation at very fine scales of resolution. Differences in physical, chemical, and biological attributes have been traditionally detected through soil/crop sampling and laboratory analysis. However, data can be difficult or expensive to collect, and both sampling design and data quality may affect final results (Cochran 1977; Müller 1998). It is then critical to carefully design a strategy of collecting spatial data, which usually requires a proper formulation of at least two components: the objective of survey and what is already known in the area (Stein and Ettema 2003). The aim of sampling, even within PA, can be manifold: assessing the amount of yield, the spatial distribution of weeds, or the uniformity of irrigation. Kerry et al. (2010) showed that sampling requirements vary as a function of target variable. In the perspective of PA, the usual objective of sampling is making a spatial investigation to be displayed on a map. With respect to prior information, it may consist of previously sampled data, the presence and shape of boundaries, roads, water bodies, or any type of physical obstacle, and of sub-areas with different priority of sampling. Sampling of soil/crop properties at within-field scales may involve considerable costs (travel, labor time, extraction of soil cores, laboratory analyses, etc.); therefore, for making PA costeffective, these costs must be minimized. Optimizing sampling scheme means to provide information of the required precision at the minimum cost. The increasingly importance of optimizing sampling scheme is also supported by books published in recent years, as de Gruijter et al. (2006) and Webster and Lark (2013). At present, there are several techniques for optimizing sampling (Marchant and Lark 2010), and a general methodology, called survey sampling (Stein and Ettema 2003), was formulated by Müller (1998) and Van Groenigen and Stein (1998). It consists in defining

Environ Monit Assess (2015) 187:422

an objective function (optimization criterion) being minimized or maximized while respecting the constraints imposed. Whichever optimization criterion, the costs of sampling and laboratory analysis are such that it is quite difficult to collect enough samples to characterize spatial variability at the accuracy required by precision agriculture (PA). This economic constrain generally results in low sampling density, which is the major recognized factor limiting PA efficacy. Proximal soil sensing (PSS) techniques could allow overcoming the limitation due to low sampling density (Viscarra Rossel and Adamchuk 2013). Adopting sampling methods based on proximal soil sensing could allow to collect high spatial density data relevant to the soil attributes of interest and to reduce sampling costs. Differently than remote sensing, proximal soil sensing can be used to measure soil properties close to, or even in contact with, the soil being surveyed and may or may not be mounted on vehicles for on-the-go operation (Adamchuk and Viscarra Rossel 2010). In particular, measuring bulk electrical conductivity (ECa), using electromagnetic induction (EMI) instrumentation equipped with GPS, has become one of the more widely used proximal sensing technologies for soil mapping (Corwin and Lesch 2005; De Benedetto et al. 2013). Geo-spatial measurements of ECa may be used as a powerful tool in site-specific management to direct supplemental soil sampling in those areas where variability is higher (Castrignanò et al. 2008). Moreover, in a geostatistical context, optimizing spatial sampling involves estimation of a model for spatial dependence, usually expressed by a variogram, which can be used to optimize interpolation of an environmental geovariable (e.g., McBratney and Webster 1983). Van Groenigen et al. (1999) proposed an annealing-based algorithm to optimize sampling schemes on a continuous solution space for different quantitative optimization criteria, taking into account physical sampling barriers and earlier measurements. The main objective of the paper was to propose a novel method for optimizing the sampling scheme for the soil water content assessment in an agricultural field by using bulk electrical conductivity (ECa) as auxiliary information in the form of an ECa map. The proposed procedure is an extension of spatial simulated annealing (SSA) for optimization, presented by Van Groenigen et al. (1999), which allows the inclusion of objective weighting factors by using auxiliary information. Soil water content was used as environmental geovariable, whereas the simulated annealing

Environ Monit Assess (2015) 187:422

was implemented by the software MSANOS (Barca et al. 2011, 2015). Moreover, the proposed method was compared with two procedures which do not use auxiliary information: (1) minimization of mean of the shortest distances (MMSD) criterion (Van Groenigen et al. 1999), which optimizes the spreading of the point observations over the entire field by minimizing the expectation of the distance between an arbitrarily chosen point and its nearest observation, and (2) minimization of weighted mean of the shortest distances (MWMSD) (Castrignanò et al. 2008), which is a weighted version of the MMSD and uses the digital gradient of the grid ECa data as weighting function.

Materials and methods Optimization criteria Minimization of mean of the shortest distance The minimization of mean of the shortest distances (MMSD) criterion, which minimizes the expectation of the distance of an arbitrary point to its nearest observation point, has frequently been used in the past (Van Groenigen et al. 1999). This condition should be expressed mathematically, for a scheme S of sampling by the function E[d(x,S)], where d(x,S) is the Euclidean distance between the position x and the nearest neighbor xi ∈ S. When any previous information on the spatial variation over the area of interest is lacking, the most prudent design is an even sampling by applying MMSD criterion. However, this criterion cannot be the only best one for agricultural fields, which generally have uneven spatial distribution of soil properties. We therefore modified MMSD criterion to account for areas with different priority levels. Weighted mean of the shortest distance The minimization of weighted mean of the shortest distances (MWMSD) criterion is a weighted version of the previous MMSD criterion (Castrignanò et al. 2008). A location-dependent weighting function is introduced in the fitness function: Z ϕMWMSD ¼ wðxÞkx−V S ðxÞkdx ð1Þ A

Page 3 of 12 422

where x denotes a two-dimensional coordinate vector, w(x) is the weighting function, and V(x) the coordinate vector of the sampling point nearest to x. The symbol || is used as distance vector. This function is estimated by    ne  j  j X w xe xe −V S xej  b ð2Þ ϕMWMSD ðS Þ ¼ ne j¼1 where x denotes the generic node of a finely meshed grid superimposed on the area of interest and ne the number of samples being allocated. Employing a weighting function offers flexibility in the use of auxiliary high-resolution data. When EMI data maps of the area are available, w(x) can be defined as the gradient of ECa, so placing more observations in the area of expected maximum variation. In the absence of a priori information about the soil attribute of primary interest, we assumed that the spatial dependence structure of ECa was similar to the spatial dependence structure of the soil property. We then deemed that a multistage sampling design can lead to more accurate site characterization: a preliminary sampling scheme was obtained by firstly allocating 50 % of the total number of samples according to MMSD criterion; in this way, an even distribution of samples was guaranteed. In a second step, the remaining samples were allocated according to MWMSD criterion, so resulting in coverage of the area that reflected the distribution of areas with different priority of sampling on the basis of their variability. Mean of average ordinary kriging variance This criterion aims at minimization of the average kriging estimation variance (Webster and Oliver 2007) estimated at any location of the surveyed domain. It is assumed that prior information is available, as the variogram model of the interest variable. The criterion focuses on the areas of the highest uncertainty in terms of kriging estimation variance, which are the undersampled areas. A limitation of this criterion is that it depends only on the spatial configuration of data points and on the variogram model but not on the measured values (under stationary conditions). Therefore, this criterion can be mostly considered as a mere geometrical criterion. It tends to allocate the samples at the borders of the area of interest where kriging estimation variance may become very high due to the scarcity of data in these zones.

422

Environ Monit Assess (2015) 187:422

Page 4 of 12

Similarly to MWMSD, a two-stage sampling design was used by allocating the first 50 % of the total number of samples according to MMSD criterion and the remaining 50 % of the samples according to mean of average ordinary kriging variance (MAOKV) criterion. MSANOS software To realize and compare the different sampling designs according to the three previous procedures, the MSANOS software was used. The software MSANOS (Barca et al. 2011) is able to address a number of issues related to the sampling scheme definition and monitoring network design and/or redesign. In particular, starting from an existing scheme, the software can widen or reduce it in an optimal way according to some specified management objectives and constraints. These are suitably modeled in form of an objective function (OF) to be optimized, and the optimizing approach is spatial simulated annealing (SSA). Spatial simulated annealing The SSA working flowchart is reported in Fig. 1 and shows that SSA is basically structured as a preprocessing stage followed by couple of nested loops. In the pre-processing, some parameters, needed to trigger the actual optimization method, are estimated, namely, the initial configuration S0 and the initial temperature T0. The outer loop is governed by the temperature and stops when that variable approaches to zero. The inner loop is related to the problem size: i.e., if k is defined as Fig. 1 The SSA working flowchart (adapted from Lobato et al. 2012)

the number of locations to be added or removed, the inner loop consists of k iterations for a given temperature value. Within the inner loop, the candidate solutions are generated and subjected to the decision rule (see Eq. 3). Consequently, the total number of method’s iterations is k * NTemp at maximum, where NTemp is the number of the different values assumed by the temperature during the run. It is hardly needed to highlight that the total number of iterations corresponds to the total number of actually processed configurations. The SSA is technically classified as an interchange heuristics (Drosou and Pitoura 2009) since, at each step i, it considers a neighboring configuration Si of the current optimal configuration S* and probabilistically decides if moving to Si or staying in configuration S*. The probability of making the transition PT from the current configuration S* to the candidate new configuration Si is determined by an acceptance probability function PT (S* → Si), which depends on the values of the objective function ϕ(·) corresponding to the two configurations and to the temperature T: 8  *  > < PT S →S i ¼ 1 !    *  ϕ S * −ϕðS i Þ > P S →S ¼ −exp i : T T

  ϕ ðS i Þ ≤ ϕ S *   ϕ ðS i Þ > ϕ S *

ð3Þ where ϕ(·) S* Si T PT

objective function optimal transient solution candidate solution temperature transition probability from S* to Si.

Environ Monit Assess (2015) 187:422

Page 5 of 12 422

The decision rule (Eq. 3) allows the SSA to avoid getting stuck in local optima, since although it generally prefers better neighboring configuration (i.e., smaller ϕ(·), it can also accept worse configurations (Metropolis et al. 1953; Kirkpatrick et al. 1983). The neighborhood search usually results in minimal changes of the current optimal configuration. In the proposed implementation, the current optimal configuration S∗ is changed moving only one randomly chosen monitoring location S∗i within its neighborhood. The new location S i belongs to the simulation grid. Concerning the initial configuration S0, usually, it is randomly set. Apart from this classical approach, a smart initialization has been also implemented in MSANOS, consisting in applying, before starting SSA, a Greedy heuristics (Drosou and Pitoura 2009), which quickly provides a sub-optimal configuration that can be used as S0 by SSA. The loss, in terms of overall computational load, due to the Greedy heuristic application is abundantly regained thanks to the faster convergence of SSA. SSA continues running while a given stop criterion is not met. MSANOS provides three main, not mutually exclusive, customizable criteria: 1. Maximum number of iterations. 2. Persistence of the optimal transient solution, i.e., OF value of the optimal transient solution does not change significantly for a given number of iterations. 3. Temperature threshold of the cooling scheme. Actually, even if the first two criteria do not stop the simulation, the third would do it when T approaches zero (in the present implementation the aforementioned threshold is set to 10−24). As said above, the SSA optimization process is determined by the temperature T, which decreases according to a given cooling law. Actually, T influences the process duration and the method capability of avoiding local minima (Fig. 1 and Eq. 3). MSANOS uses, as a default, the geometric cooling scheme, which has been demonstrated at better balancing the convergence speed and the rate of convergence to the global optimum (Nourani and Andresen 1998): T jþ1 ¼ αcool T j

ð4Þ

where j is the current temperature index and αcool, called cooling parameter, usually belongs to the range 0.8–

0.995. Moreover, MSANOS allows the user to change the default cooling law and provides a set of different cooling laws. Finally, T0 is also a critical parameter influencing the cooling scheme and, in general, the method performances. MSANOS initializes T0 following the method introduced by Triki et al. (2005) and Van Laarhoven and Aarts (1987), based on the following equation derived by the Metropolis criterion (Eq. 5): T0 ¼ −

ΔϕðþÞ lnðχ0 Þ

ð5Þ

where ΔϕðþÞ is the average of all the rejected objective function values and χ0 is the acceptance probability of worse neighbors. In brief, before starting the real SSA process, a random walk on the hypersurface of the chosen objective function is carried out in order to estimate the parameters in the Eq. 5. As results from Eq. 5, T0 is related to the objective function and not on the shape of the original sampling scheme. Equation 5 is drawn from the Metropolis criterion that is at the core of the decision rule in the SSA algorithm and, coherently, led us to prefer it to other different algorithms, e.g., the one proposed by Cunha and Sousa (1999). Case study The experiment was conducted on a 20,500-m2 agricultural field located in the experimental farm of CRASCA, Rutigliano-Bari (40° 59′ 48.25″ N, 17° 02′ 02.06″ E), southeastern Italy. The aim of the soil survey was to predict top soil water content across the field for applying site-specific irrigation under the constraint of a fixed total number of samples equal to 100. Soil was classified as fine, mixed, superactive, thermic Typic Haploxeralfs according to the Soil Taxonomy (Soil Survey Staff 2010). Soil texture is mostly clayey with the clay content ranging from 30 to 60 % by weight and increasing in depth. The soil depth, as it was also revealed by the soil survey (De Benedetto et al. 2012), is rarely greater than 0.60–1 m, owing to the shallowness occurrence of the bedrock. The bedrock is constituted of a layered sequence of Cretaceous limestone with some dolomitic limestone level and occurs at variable depth due to its irregularly shaped boundary. The data were collected in October 2013 with an EMI sensor (EM38DD, Geonics Limited, Mississauga, Ontario, Canada.), which consists of two sets of coils

422

Page 6 of 12

positioned perpendicularly to each other: one orientated horizontally and the other one vertically. The instrumentation allows one to measure bulk electrical conductivity (ECa) simultaneously in the two (horizontal and vertical) polarization modes with a different depth response profile (McNeill, 1980). The sensor was connected with a Differential Global Positioning System (HiPer® 27 Pro, TOPCON) with planimetric and altimetric centimeter accuracies, and it was towed across the field by a tractor along 23 parallel N-S oriented transects, approximately 5 m apart (Fig. 2). The bulk electrical conductivity in both orientations was recorded every second with a spatial resolution of 1 m on average along the transect. The system facilitates the collection of fine-scale ECa data in a manageable time interval, thus greatly increasing the spatial resolution of soil information. It is worth pointing out that the absolute values of bulk electrical conductivity may not necessarily be diagnostic but only the variations (gradient) in conductivity can be used to identify soil anomalies (Benson et al. 1988). In the light of the above considerations, the gradient of the horizontal ECa estimates was used as auxiliary information to optimize the sampling design. One hundred georeferenced soil samples (Fig. 2) were collected using a hand auger up to 0.30-m depth,

Environ Monit Assess (2015) 187:422

with spacing of 15 m, and water content was measured with gravimetric method. Geostatistical analysis Ordinary kriging (Webster and Oliver 2007) was used to interpolate EMI data in horizontal mode and soil water content (SWC) data. Geostatistical analysis is most efficient when done on variables that have Gaussian distribution. Variogram modeling is sensitive to strong departures from normality, because a few exceptionally large values may contribute to very large squared differences. A procedure known as Gaussian anamorphosis (Wackernagel 2003) allowed transforming each variable { Zi(x), x∈R2} into a Gaussian-shaped variable { Yi(x), x∈R2} with zero mean and unit variance using an expansion into Hermite polynomials (Wackernagel 2003). After variogram modeling and cross-validation, ordinary kriging was used to estimate the EMI and SWC values at the nodes of a 5 m×5 m cell grid, and the estimates were then back transformed to the raw variables. The gradients of the horizontal ECa and SWC estimates were calculated according to the equation: sffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi ðw1 ðix þ 1; iyÞ−w1 ðix; iyÞÞ2 ðw1 ðix; iy þ 1Þ−w1 ðix; iyÞÞ2 w2 ðix; iyÞ ¼ þ dx dy

ð6Þ where w2 is the real image which corresponds to the partial derivative of the initial real image w1 along the xaxis and y-axis. The gradient values were then smoothed through averaging within a window of 15-m radius. All statistical and geostatistical analyses were performed with ISATIS software, release 2014 (Geovariances 2014). Contingency matrix

Fig. 2 Study area: surveyed area with EMI sensors (green polygon) and transects of measurements (black lines) and soil sampling points (red points)

To compare the maps of the gradients of the horizontal EC a and SWC, they were classified in three isofrequency classes (low, medium and high) and compared in pairs by using a two-enter table (contingency matrix). Each cell of such a matrix gave the frequency of pixels in two corresponding classes of the maps (Campbell 2002). The overall consistency, which is a measure of the spatial agreement between the two types of maps, was computed as the proportion of the total number of observations along the main diagonal of the contingency matrix. As a measure of spatial agreement,

Environ Monit Assess (2015) 187:422

Page 7 of 12 422

Table 1 Exploratory statistics for raw soil electrical conductivity (ECa) and soil water content (SWC) Variable

Count

ECaHor (mS m−1)

4,924 100

SWC (%)

Minimum

Maximum

Mean

Stand. dev.

4.13

23.88

12.13

3.52

0.50

2.81

12.39

24.91

19.90

2.63

−0.25

2.48

kappa coefficient, introduced by Cohen (1960), was calculated. This coefficient measures the interclassification agreement that, in the case of independent ratings of the same subjects, equals 0 when the agreement is due to chance only and +1 when there is perfect agreement. When kappa is negative, it is assumed to have no agreement. Besides kappa coefficient, its confidence limits (Fleiss 1981) were also computed. The approach was implemented with the FREQ procedure of the SAS/STAT software package.

Results From the descriptive statistics of the very dense EMI data (Table 1), it was evident that ECa data were characterized by large variability and the data distribution looked positively skewed with some departure from normal distribution (the hypothesis of normality was refused on the basis of χ2 test at the 5 % level of probability). Therefore, the data were transformed through Gaussian anamorphosis (Wackernagel 2003) and standardized to mean zero and variance one. The experimental variogram (Fig. 3) looked unbounded, crossing the sample variance (dashed line) at approximately 75-m distance. However, due to the small

Fig. 3 Isotropic variogram model for the Gaussian ECa in horizontal polarization. The points are the experimental semivariances, while the solid line is the fitted model

Skewness

Kurtosis

size of the surveyed field, the variable was assumed intrinsically stationary but within a distance longer than the maximum axis of the field. The variogram model of ECa in horizontal polarization (Fig. 3) was isotropic and included the following basic structures: nugget effect, spherical model (range=33.5 m), and cubic model (range=242 m). The kriged map of horizontal ECa (Fig. 4a) shows two areas: the northwestern area of the field characterized by lower values and a transversal central area by higher values. The higher values of ECa might be attributed to intrinsic properties of soil, such as textural characteristics, or due to moisture conditions of the topsoil. The ECa gradient map (Fig. 4b) splits the field approximately into two halves along the NW-SE direction: the right-hand half, with much higher values, and the left one. The comparison of the two maps shows that the most conductive areas were generally the ones characterized by the largest variability. The basic statistics (Table 1) for soil water content showed a wide variability range (range of 12.51 %), and the data distribution did not show sensible departure from normal distribution (the hypothesis of normality was accepted on the basis of χ2 test). Therefore, the raw variable was used for estimation. In order to perform MAOKV, an isotropic variogram model of gravimetric soil water content (SWC) was used, consisting of three basic structures: nugget effect, an exponential model with practical range of 40 m, and a spherical model with range of 100 m (Fig. 5). In order to show the strength of the relationship between the primary variable, SWC, and the auxiliary variable, ECa, the map of SWC was obtained using the ordinary kriging with the variogram model previously described. The SWC map (Fig. 6a) showed a wide stretched area crossing the field in 60° N direction characterized by higher SWC values, quite similar to that observed in the ECa map (Fig. 4a). Moreover, the gradient of SWC was calculated, and the map (Fig. 6b) showed the central area with lower values than the northern and southern areas: the wettest part of the field then looked more uniform.

422

Page 8 of 12

Environ Monit Assess (2015) 187:422

Fig. 4 Spatial estimate of ECa in horizontal polarization (mS m−1) (a) and ECa gradient map (b). Color scale uses iso-frequency classes

The overall spatial accordance between the gradient maps of the two variables (Figs. 4b and 6b) was 42 %, indicating a moderate degree of spatial association; k statistic was 0.11, with its 95 % lower and upper confidence limits equal to 0.07 and 0.16, respectively, which reveals that the similarities observed between the two maps were not due to chance but actually they showed also significant differences. Ten simulated annealing procedure runs for each optimization criterion were carried out with the aim of

Fig. 5 Isotropic variogram model for the soil water content. The points are the experimental semivariances, while the solid line is the fitted model

taking into account and controlling the effects of the inherently stochastic nature of SSA approach. The best result among the ten runs with respect to the value of the corresponding criterion objective function was chosen as the true optimal sampling configuration. By collecting the objective function values associated to the ten found optimal solutions, it is possible to assess the fluctuation of the objective function through the statistical range index, defined as the difference between the maximum and minimum values. If the latter is some order of magnitude smaller than the mean value of objective function, it can be assumed as an indication about the method stability. It shows that the found suboptimal solutions in terms of the value of the objective function are not substantially different to the global optimum. In this case, the fluctuations associated to each criterion are from 2 to 3 orders of magnitude smaller than the corresponding mean values, and that proves the robustness of the implementation. The initial temperature was set, after the first step, equal to 0.1287, 0.0161, and 0.0204 for MMSD, MAOKV, and MWMSD, respectively. The cooling down factor was set equal to 0.95 and the number of steps before decreasing temperature to 50. Table 2 reports the average time to reach the convergence averaged over the ten simulations to compare the efficiency of the three procedures in terms of computer

Environ Monit Assess (2015) 187:422

Page 9 of 12 422

Fig. 6 Maps of gravimetric SWC (a) and gradient of SWC (b)

time. It is evident that the introduction of a weight function has speeded up the process of convergence, but the choice of such a function is quite critical, mostly depending on the spatial relationship between target variable and auxiliary information. The current wide use of EMI sensor, or other similar proximal sensors in PA, should then be encouraged to direct efficient sampling of those attributes, which have significant relation to dielectric characteristics. To explore and illustrate the different sampling procedures, estimation variance of SWC was calculated for the three different sampling designs at the nodes of the same interpolation grid of ECa using ordinary kriging with the same variogram model described above, under the assumption that variogram model is independent of Table 2 Convergence time for the three sampling optimization procedures Objective function MMSD

Average time for convergence (s) 2,649.71

MWMSD

1,295.65

MAOKV

48,263.62

MMSD minimization of mean of the shortest distances, MWMSD minimization of weighted mean of the shortest distances, MAOKV mean of average ordinary kriging variance

sampling configuration. Figure 7 shows the three resulting sampling schemes overlapped on the kriging standard deviation estimates, and it shows the impact of the optimization criterion on the arrangement of the observation points: the MMSD scheme (Fig. 7a) is quite regular; a spatial pattern emerges, that is close to the one of ECa, in the MWMSD scheme (Fig. 7b); and a migration of the observation points towards the field edges is evident in the MAOKV scheme (Fig. 7c). It is worth pointing out that each map reflects the optimization criterion used which, in turn, underlines a particular aspect of uncertainty. It is impossible to judge the best scheme, because each one is useful in a different practical situation. Depending on the particular objective of the survey and the prior information, MMSD will be preferred when no previous knowledge of the field is available and the purpose is to have the most regular sampling scheme; MWMSD if auxiliary information, spatially related to the primary variable of interest, is available, and the objective is to uniform estimation variance (uncertainty) in conditions of evident heteroscedasticity; and MAOKV when a variogram model was previously estimated and the objective is the minimization of estimation variance. Another way to compare the three procedures is to analyze the distribution of kriging standard deviation of SWC estimates (Table 3). From the inspection of the

422

Environ Monit Assess (2015) 187:422

Page 10 of 12

Fig. 7 Arrangement of the sample points overlapped on kriging standard deviation for MMSD (a), MWMSD (b), and MAOKV (c)

overall statistics, relevant differences do not result among the procedures, even if MAOKV outperforms in terms of mean and spread of the errors, as it was expected. MAOKV is also to be preferred when the target is reducing the occurrence of large errors. MWMSD and MMSD have produced quite symmetric distributions of uncertainty, although the former is to be preferred in observed conditions of heteroscedasticity, because its error distribution is more symmetric (skewness, Table 3).

Discussion and conclusions The paper has described three approaches to optimize soil sampling design and showed how they can be adapted to different practical situations. Three optimization criteria have been defined depending on the main purpose of the survey: regular arrangement of the

sample points (MMSD), uniformity of uncertainty (MWMSD), and reduction of uncertainty (MAOKV). These criteria cover a large range of practical situations occurring in the reality, but, of course, they are not exhaustive. The spatial simulated annealing procedure has shown great flexibility in adapting to the different practical issues, reaching the optimal solution in a reasonable calculation time. The availability of the ECa gradient map has allowed sampling to be optimized, distinguishing between areas with different priority levels. The proposed MWMSD criterion can effectively use a suited spatial weight function to direct sampling in areas having high variability and reduce sampling in areas with expected low heterogeneity. One of the crucial issues in this approach is the definition of the weighting function. We have preferred to avoid qualitative priority values, based on subjective expert knowledge, and use ancillary variables in a more quantitative way.

Table 3 Basic statistics of kriging standard deviation of soil water content for the three sampling optimization procedures Variable

Count

Minimum (%)

Maximum (%)

Mean (%)

Median (%)

Stand. dev. (%)

Skewness (−)

MAOKV

835

1.71

2.95

2.49

2.54

0.31

−0.70

MWMSD

835

1.71

3.96

2.51

2.52

0.35

−0.51

MMSD

835

1.71

3.93

2.50

2.51

0.33

−1.55

MMSD minimization of mean of the shortest distances, MWMSD minimization of weighted mean of the shortest distances, MAOKV mean of average ordinary kriging variance

Environ Monit Assess (2015) 187:422

The advantage to use a method instead of another cannot be evaluated only on the basis of a global statistic. MWMSD is preferred when the main objective is to produce a map with approximately the same precision in any part, whatever variation is. This requires an intensification of sampling in those areas where the variability is greater. Therefore, the homogenization of local error cannot be realized using a uniform sampling, but some assessment of spatial variation of the target variable is needed. Since it is quite uncommon that this piece of information is available before sampling, a good surrogate may be to use a variable, which is significantly correlated with the primary variable and is relatively easily and cheaply surveyed. Wide literature has proved the capability of ECa to reproduce the main structures of spatial dependence for most soil variables. A geophysical survey, using EMI sensor, is relatively cheap compared with a direct sampling, and at present, in many parts of the world, there are private agencies that can provide this type of service at a reasonable cost. However, most advantages of MWMSD depend on the strength of the relationship between primary and auxiliary variables. We then think that more research should be dedicated to the best setting of the weighting factors, because attaching priorities to the different within-field zones may prove to be a very effective tool in decision-making. Moreover, further improvement of the software used for sampling optimization should be realized by adding more procedures in the future, aimed at minimization of cokriging variance in multi-purpose sampling or a loss function in socio-economic analyses. Loss function should also evaluate the costs associated with additional sampling to reduce uncertainty, which is a quite critical issue for the rational management of sampling in precision agriculture.

References Adamchuk, V. I., & Viscarra Rossel, R. A. (2010). Development of on-the-go proximal soil sensor systems. In R. A. Viscarra Rossel, A. B. McBratney, & B. Minasny (Eds.), Proximal soil sensing (pp. 15–28). New York: Springer. Adamchuk, V. I., Viscarra Rossel, R. A., Sudduth, K. A., & Schulze Lammers, P. (2011). Sensor fusion for precision agriculture. In C. Thomas (Ed.), Sensor fusion - foundation and applications (pp. 27–40). Croatia: In-Tech. Barca, E., Passarella, G., Vurro, M., & Morea, A. (2011). A software for optimal information based downsizing/upsizing of existing monitoring networks. In Spatial2 Conference:

Page 11 of 12 422 Spatial Data Methods for Environmental and Ecological Processes, Foggia (IT), 1-2 September 2011. Barca, E., Passarella, G., Vurro, M., & Morea, A. (2015). MSANOS: Data-driven, multi-approach software for optimal redesign of environmental monitoring networks. Water Resources Management, 29(2), 619–644. Benson, R. R., Glaccum, A., & Noel, M. R. (1988). Geophysical techniques for sensing buried wastes and waste migration. Dublin: National Water Well Association. Campbell, J. (2002). Introduction to remote sensing (3rd ed.). New York: The Guilford Press. Castrignanò A., Morari F., Fiorentino C., Pagliarin C., Brenna S. (2008) Constrained optimization of spatial sampling in skeletal soils using EMI data and continuous simulated annealing. In Proceeding of 1st Global Workshop on High Resolution Digital Soil Sensing & Mapping. 5–8 February 2008, Sydney – Australia (CD-ROM). Cochran, W. G. (1977). Sampling techniques (3rd ed.). New York: Wiley. Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20(1), 37–46. Corwin, D. L., & Lesch, S. M. (2005). Characterizing soil spatial variability with apparent soil electrical conductivity. I. Survey protocols. Computers and Electronics in Agriculture, 46, 103–133. Cunha, M. C., & Sousa, J. (1999). Water distribution network design optimization: Simulated annealing approach. Journal of Water Resources Planning and Management, 125, 215–221. De Benedetto, D., Castrignanò, A., Sollitto, D., Modugno, F., Buttafuoco, G., & Lo Papa, G. (2012). Integrating geophysical and geostatistical techniques to map the spatial variation of clay. Geoderma, 171–172, 53–63. De Benedetto, D., Castrignanò, A., Rinaldi, M., Ruggieri, S., Santoro, F., Figorito, B., Gualano, S., Diacono, M., & Tamborrino, R. (2013). An approach for delineating homogeneous zones by using multi-sensor data. Geoderma, 199, 117–127. de Gruijter, J., Brus, D., Bierkens, F. P., & Knotters, M. (2006). Sampling for natural resource monitoring. Berlin: Springer. Drosou M., Pitoura E. (2009). Comparing diversity heuristics. Technical Report 2009–05. Computer Science Department, University of Ioannina. Fleiss, J. L. (1981). Statistical methods for rates and proportions. New York: Wiley. Geovariances (2014). Isatis Technical Ref., ver. 2014, Geovariances & Ecole Des Mines De Paris. Avon Cedex, France, www.geovariances.fr. Kerry, R., Oliver, M. A., & Frogbrook, Z. L. (2010). Sampling in precision agriculture. In M. A. Oliver (Ed.), Geostatistical applications for precision agriculture (pp 35–64). Springer. Kirkpatrick, S., Gelatt, C. D., & Vecchi, M. P. (1983). Optimization by simulated annealing. Science, 220, 671–680. Lobato, F. S., Assis, E. G., Steffen, Jr V., & da Silva Neto, A. J. (2012). Design and identification problems of rotor bearing systems using the simulated annealing algorithm. In M. Sales Guerra Tsuzuki (Ed.), Simulated annealing - single and multiple objective problems (pp 197–216). Croatia: In-Tech. Marchant, B. P., & Lark, R. M. (2010) Sampling in precision agriculture, optimal designs from uncertain models. In M.

422

Page 12 of 12

A. Oliver (Ed.), Geostatistical applications for precision agriculture (pp 65–87). Springer. McBratney, A. B., & Webster, R. (1983). Optimal interpolation and isarithmic mapping of soil properties. Co-regionalization and multiple sampling strategies. Journal of Soil Science, 34, 137–162. McNeill, J. D. (1980). Electromagnetic terrain conductivity measurement at low induction numbers. Geonics Limited, Technical Note TN 6, Geonics Ltd. Mississauga, Ontario, Canada. Metropolis, N., Rosenbluth, A., Rosenbluth, M., Teller, A., & Teller, E. (1953). Equation of state calculations by fast computing machines. The Journal of Chemical Physics, 21, 1087–1092. Müller, W. G. (1998). Collecting spatial data-optimum design of experiments for random fields. Heidelberg: Physical-Verlag. Nourani, Y., & Andresen, B. (1998). A comparison of simulated annealing cooling strategies. Journal of Physics A: Mathematical and General, 31, 8373–8385. Pierce, F. J., & Nowak, P. (1999). Aspects of precision agriculture. Advances in Agronomy, 67, 1–85. Soil Survey Staff. (2010). Keys to soil taxonomy (11th ed.). Washington: USDA-Natural Resources Conservation Service. Stein, A., & Ettema, C. (2003). An overview of spatial sampling procedures and experimental design of spatial studies for

Environ Monit Assess (2015) 187:422 ecosystem comparisons. Agriculture, Ecosystems & Environment, 94, 31–47. Triki, E., Collette, Y., & Siarry, P. (2005). A theoretical study on the behavior of simulated annealing leading to a new cooling schedule. European Journal of Operational Research, 166, 77–92. Van Groenigen, J. W., & Stein, A. (1998). Spatial simulated annealing for constrained optimization of spatial sampling schemes. Journal of Environmental Quality, 27, 1078–1086. Van Groenigen, J. W., Siderius, W., & Stein, A. (1999). Constrained optimisation of soil sampling for minimisation of the kriging variance. Geoderma, 87(3), 239–259. Van Laarhoven, P. J., & Aarts, E. H. (1987). Simulated annealing. Netherlands: Springer. Viscarra Rossel, R. A., & Adamchuk, V. I. (2013). Proximal soil sensing. In M. A. Oliver, T. F. A. Bishop, & B. P. Marchant (Eds.), Precision agriculture for sustainability and environmental protection (pp. 99–118). Abingdon: Routledge. Wackernagel, H. (2003). Multivariate geostatistics: an introduction with applications. Berlin: Springer. Webster, R., & Oliver, M. A. (2007). Geostatistics for environmental scientists. John Wiley & Sons, New York. Webster, R., & Lark, R. M. (2013). Field sampling for environmental science and management. Abingdon: Routledge.

Integration of electromagnetic induction sensor data in soil sampling scheme optimization using simulated annealing.

Soil survey is generally time-consuming, labor-intensive, and costly. Optimization of sampling scheme allows one to reduce the number of sampling poin...
3MB Sizes 0 Downloads 10 Views