Assessment of cluster yield components by image analysis.

Research Article

Received: 11 April 2014

Revised: 30 June 2014

Accepted article published: 14 July 2014

Published online in Wiley Online Library: 12 August 2014

(wileyonlinelibrary.com) DOI 10.1002/jsfa.6819

Maria P Diago,a Javier Tardaguila,a Nuria Aleixos,b Borja Millan,a Jose M Prats-Montalban,c Sergio Cubero d and Jose Blasco d*

Abstract

BACKGROUND: Berry weight, berry number and cluster weight are key parameters for yield estimation in the wine and tablegrape industry. Current yield prediction methods are destructive, labour-demanding and time-consuming. In this work, a new methodology based on image analysis was developed to determine cluster yield components in a fast and inexpensive way.

RESULTS: Clusters of seven different red varieties of grapevine (Vitis vinifera L.) were photographed under laboratory conditions and their cluster yield components were manually determined after image acquisition. Two algorithms, based on the Canny and the logarithmic image processing approaches, were tested to find the contours of the berries in the images prior to berry detection by means of the Hough Transform. Results were obtained in two ways: by analysing either a single image of the cluster or four images per cluster taken from different orientations. The best results (R² between 69% and 95% in berry detection and between 65% and 97% in cluster weight estimation) were achieved using four images and the Canny algorithm. The capability of the image analysis-based model to predict berry weight was 84%.

CONCLUSION: The new and low-cost methodology presented here enabled the assessment of cluster yield components, saving time and providing inexpensive information in comparison with current manual methods.

© 2014 Society of Chemical Industry

Keywords: Vitis vinifera L; cluster weight; berry number per cluster; berry weight; LIP–Canny; Hough Transform

INTRODUCTION

Berry weight, number of berries per cluster and cluster weight define cluster morphology. These variables, called cluster yield components, are key parameters for vineyard yield estimation and have an impact not only on the final yield, but also on the cluster architecture and compactness.1,2 Moreover, cluster yield components are considered indicators of grape and wine quality.3–5

Yield forecasting has been identified in recent years as one of the most profitable research topics in the wine and tablegrape industry,6 being an important process for making decisions in vineyard management to optimise the grapevine balance between vegetative and reproductive growth, and to prepare growers and wineries for the harvest operation.7 Typically, yield predictions are conducted using knowledge of historical yields and weather patterns, along with measurements taken manually in the field. These field measurements generally consist of cluster and berry counts, mass monitoring from sentinel vines across a given vineyard early in the season8 or models developed to predict daily carbon balance in the different grapevine organs.9 All these current methods are destructive, labour-demanding and time-consuming, and often a representative and sufficient number of measurements cannot be made to obtain an accurate estimation of the final yield.

Image analysis is widely used to inspect fruit production. This technology allows the creation of systems capable of estimating or predicting some features of the inspected objects without the need for contact, in a fast, repeatable and accurate way. Recently, Herzog et al.10 have shown initial results on the use of image analysis for high-throughput phenotyping in vineyards. Computer vision was used in viticulture to assess key canopy features as well as yield.11,12 A classifier based on the Mahalanobis distance was created to identify and quantify the pixels corresponding to grape clusters in an RGB image of a grapevine canopy, which were then correlated to the actual grape yield of the plant.13 Recently, Diago et al.14 developed a new algorithm to assess the number of flowers per inflorescence using image analysis under uncontrolled outdoor conditions.

Several studies have been conducted for cluster morphological determinations using image analysis. Wycislo et al.15 used different ratios to estimate the shape of tablegrapes, such as the major/minor axis ratio, shape factor and compactness shape value. Recently, a new method for cluster peduncle detection was developed using image analysis.16

To perform the detection of the berries, two main steps can be considered: the extraction of contours, and the detection of circles in the image, since this is the supposed shape of a berry. Several methods of contour extraction have been developed, but the most widely used are those based on the Sobel and Canny operators.17 Both rely on the gradient of the intensity in the images, Canny being more advanced as it includes the Sobel operator as an intermediate step. The detection of circles in images is another key problem that researchers have attempted to solve from a number of different approaches,18,19 the Hough Transform being probably the most widespread,20 but with a very high computational cost.21 With the aim of conducting geometric measurements of grape fruits dynamically, Miao et al.22 used a snake-based model after image segmentation to discriminate each berry in the cluster and obtain some descriptors of the individual berries. Grossetête et al.23 used the reflection of a digital camera flash light for counting the number of green berries per cluster. This approach cannot be applied after veraison because the waxy bloom (pruine) affects the reflection, either preventing it or generating multiple reflection points per berry.

No available data exist on berry weight assessment in the cluster of winegrapes or tablegrapes using image analysis. The main goal of the present work was to explore the potential of image analysis methodology to accurately estimate the cluster yield components, such as berry weight, berry number and cluster weight, in a fast, inexpensive and potentially automated way. This information could later be used to provide accurate yield estimations, being an alternative to current manual methods in the wine and tablegrape industry.

∗ Correspondence to: Jose Blasco, Centro de Agroingeniería, Instituto Valenciano de Investigaciones Agrarias (IVIA), Cra. Moncada-Náquera km 5, 46113 Moncada, Valencia, Spain. E-mail: [email protected]

a Instituto de Ciencias de la Vid y del Vino, University of La Rioja, CSIC, Gobierno de La Rioja, 26006 Logroño, Spain
b Instituto Interuniversitario de Investigación en Bioingeniería y Tecnología Orientada al Ser Humano (I3BH), Universitat Politècnica de València, Camino de Vera s/n, 46022 Valencia, Spain
c Departamento de Estadística e Investigación Operativa, Universitat Politècnica de València, Camino de Vera s/n, 46022 Valencia, Spain
d Centro de Agroingeniería, Instituto Valenciano de Investigaciones Agrarias (IVIA), Cra. Moncada-Náquera km 5, 46113 Moncada, Valencia, Spain

J Sci Food Agric 2015; 95: 1274–1282
www.soci.org

MATERIAL AND METHODS

Plant material and manual cluster yield components assessment
At harvest time (4 October 2012), cluster samples of seven different red varieties of Vitis vinifera L. (Carignan, Grenache, Monastrell, Bobal, Cabernet Sauvignon, Tempranillo and Merlot) were collected from a commercial vineyard in the wine-producing area of Utiel-Requena (Valencia, Spain). For each variety, 10 clusters showing the general features of the variety were randomly selected and their weight recorded. After image acquisition, each cluster was manually destemmed and its number of berries and average berry weight were determined manually.

Image acquisition
The images were acquired under laboratory conditions (at 23 °C) using a Canon EOS 550D digital still camera and a Canon EF-S 18-55 lens (Canon Inc., Tokyo, Japan) with the focal length set at 55 mm. The settings of the camera and the captures were performed using the EOS Utility software provided by the camera manufacturer. The parameters used to capture the images were: shutter speed 200 ms, ISO 800, manual focus and white balance set to ‘Shadow’. The resolution of the images was 0.38 mm pixel⁻¹.

The camera was placed in an inspection chamber with the inside covered by a diffuser tissue. The lighting system was composed of four lamps placed on the sides of the inspection chamber, each consisting of two fluorescent tubes (Biolux L18W/965, 6500 K; Osram AG, Munich, Germany) powered by high-frequency electronic ballasts to avoid the flicker effect. The lamps were oriented 45° to the samples (Fig. 1) and separated 30 cm from the cluster to be photographed. This arrangement achieves spatially uniform light. In addition, to reduce undesired bright spots in the image, cross-polarisation was used by placing polarising filters between the samples and the camera, and also between the samples and the lamps. To facilitate image segmentation, the contrast between the berries and the background was enhanced by using a uniform orange background.

During image acquisition, clusters were hung from a clamp so as not to distort their shape. Four views of each cluster were acquired, the cluster being rotated 90° from one image to the next. A total of 280 images were thus acquired, that is, four images per cluster and ten clusters for each of the seven varieties.

Figure 1. Scheme of the experimental set-up configuration for image acquisition.

Image processing algorithms for berry detection in grapevine clusters
Image processing was aimed at detecting each berry in the cluster automatically. In a first step, the cluster had to be discriminated from the background, and then the contours of each berry had to be extracted in order to apply the berry detector, which was developed and implemented using the Hough Transform.24 Once the berries had been detected, some features were calculated for each one. The information about each of the four images of the cluster was processed individually, and also averaged in order to estimate the number and weight of all the berries in the cluster. Figure 2 shows the flowchart of the algorithm implemented in the automatic system.

Contour extraction
In a first step, the cluster had to be discriminated from the background, which was done in this case by thresholding, owing to the favourable background. The next step was to extract the contours in the image for further application of a circle detection procedure. Berry contour extraction was carried out using two methods based on the Canny operator:16 (1) the original algorithm, and (2) the LIP–Canny algorithm,25 which is based on the logarithmic image processing (LIP) paradigm.26,27

Detection of borders using the Canny algorithm
The Canny algorithm uses an image gradient anchored by two thresholds (lower threshold, LT, and higher threshold, HT) to highlight regions with high spatial derivatives and suppresses any pixel that is not within the gradient range. In terms of threshold selection, a threshold set too high can miss important information, whilst a threshold set too low will falsely identify irrelevant information (such as noise) as important. For this reason, the choice of proper threshold values for all the images in our study was a critical step. In this work, an algorithm based on the Canny edge detection function implemented in MATLAB (MathWorks Inc., Natick, MA, USA) was used, modified to speed up processing. The Canny transform was applied in the R (red) and G (green) bands in parallel and the images of the edges were combined at a later stage.
This procedure was followed as it proved to yield more robust and reliable results in preliminary trials.28

Detection of borders using the LIP–Canny method
In order to reduce the influence of the lighting system during berry detection due to the irregular shape of the clusters (which affects the magnitude of the gradient in more shaded or more brightly lit regions), intensity normalisation of the pixel values29 was carried out. Such transformations were applied to the image in a pre-processing step, performing edge detection operations on the transformed image. The LIP–Canny operator used in this work performed all transformations in a single image processing step.

In order to obtain the optimum performance of the system, HT and LT values had to be studied for each of the varieties. Therefore, for both the Canny and the LIP–Canny algorithms, each image was analysed for all combinations of HT values between 0.15 and 0.35 in steps of 0.02 and LT values from 0.01 to 0.1 in steps of 0.01.
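The threshold sweep described above, and the double-threshold idea behind it, can be sketched in a few lines of Python. This is an illustration only, not the MATLAB code used in the study: the function names and the toy gradient profile are hypothetical, and the hysteresis rule shown is the standard Canny one (assumed here, since the text does not spell it out), applied to a one-dimensional profile of normalised gradient magnitudes.

```python
def threshold_grid():
    """Enumerate the (HT, LT) pairs described in the text."""
    # HT: 0.15 to 0.35 in steps of 0.02; LT: 0.01 to 0.10 in steps of 0.01.
    # Rounding to two decimals removes floating-point drift.
    ht_values = [round(0.15 + 0.02 * k, 2) for k in range(11)]
    lt_values = [round(0.01 * k, 2) for k in range(1, 11)]
    return [(ht, lt) for ht in ht_values for lt in lt_values]


def hysteresis_1d(magnitudes, lt, ht):
    """Standard Canny double thresholding on a 1-D gradient profile.

    Values >= ht are strong edges. Values between lt and ht are kept only
    when connected to a strong edge through neighbours >= lt. Values below
    lt are suppressed.
    """
    n = len(magnitudes)
    keep = [m >= ht for m in magnitudes]
    for i in range(1, n):            # grow edges rightwards
        if not keep[i] and magnitudes[i] >= lt and keep[i - 1]:
            keep[i] = True
    for i in range(n - 2, -1, -1):   # grow edges leftwards
        if not keep[i] and magnitudes[i] >= lt and keep[i + 1]:
            keep[i] = True
    return keep


grid = threshold_grid()
profile = [0.02, 0.08, 0.30, 0.09, 0.03, 0.07, 0.02]  # toy gradient magnitudes
edges = hysteresis_1d(profile, lt=0.05, ht=0.25)
```

With these ranges the grid holds 11 HT values and 10 LT values, that is, 110 pairs per edge detector. In the toy profile, the weak response of 0.07 is discarded because it is not connected to the strong edge, whereas the weak responses next to the 0.30 peak are retained.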

Circle detection
The contour extraction process yielded a binary image, which was used as input for a MATLAB implementation of the Hough Transform, aimed at circle detection.30 A range that included all the sizes of the berries of all the varieties under study, in this case from 6 mm to 30 mm in diameter, was selected. The value of the minimum acceptable perimeter was set between 40% and 80% of the theoretical perimeter of the circle. The search for circles was performed using two nested loops, varying the radius and tolerance values. To avoid false positives, which typically occurred for smaller berries (requiring a reduced number of pixels), while still detecting as many berries as possible, the tolerance of the minimum circumference segment permitted was linearly adjusted based on the radius value, so that circles with a radius of 3 mm (minimum size of the berries under study) were taken into account only if they presented a perimeter of at least 80% of the circumference. Circles with a radius greater than 15 mm (maximum size of the berries under study) were only considered if they presented segments of 40% of the circumference. These values were obtained empirically, since the contour of objects other than berries (noise) could cause misclassification of berries with a radius smaller than approximately 5 mm.

Centre filtering and berry detection
After circle detection, a final filtering step was needed to ensure a unique detection per berry. Taking into account the size of the smaller berries in the experiments, circles with centres closer than 10 mm were considered as actually belonging to the same berry, and so the one with the largest radius was selected. The information about each of the four images of the cluster was processed individually, and also averaged in order to estimate the number and weight of all the berries in the cluster. Figure 2 shows the flowchart of the algorithm involving all these steps in the imaging system.

Statistical analysis
The robustness of the algorithms employed and their capability to automate the estimation of the yield components was tested. Therefore, to establish the optimum values for HT and LT, a three-factor complete factorial design was adopted,31 HT and LT being two of the factors, and the type of algorithm applied (LIP–Canny versus Canny) being included as the third factor.
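Each treatment in this design is scored by running the full detection chain, so the circle-acceptance rule and the centre filtering described above determine what every combination of HT, LT and algorithm is compared on. A minimal Python sketch of those two rules follows. It is an illustration rather than the MATLAB code used in the study: the function names, the clamping of radii outside the 3 to 15 mm range and the largest-first greedy merge are assumptions, whereas the 80% and 40% limits, the 10 mm distance and the 0.38 mm per pixel resolution come from the text.

```python
import math

MM_PER_PIXEL = 0.38  # image resolution reported in the text


def min_perimeter_fraction(radius_mm, r_min=3.0, r_max=15.0,
                           frac_small=0.80, frac_large=0.40):
    """Linearly interpolate the required arc fraction between the two radii."""
    r = min(max(radius_mm, r_min), r_max)  # assumption: clamp to the berry range
    t = (r - r_min) / (r_max - r_min)
    return frac_small + t * (frac_large - frac_small)


def accept_circle(radius_mm, edge_pixels_on_circle):
    """Accept a candidate if enough of its circumference is backed by edge pixels."""
    perimeter_px = 2 * math.pi * radius_mm / MM_PER_PIXEL
    return edge_pixels_on_circle >= min_perimeter_fraction(radius_mm) * perimeter_px


def filter_centres(circles, min_distance_mm=10.0):
    """Merge circles whose centres are closer than min_distance_mm.

    circles: list of (x_mm, y_mm, radius_mm). Larger circles are visited
    first, so the largest radius survives in each group of nearby centres.
    """
    kept = []
    for x, y, r in sorted(circles, key=lambda c: c[2], reverse=True):
        if all(math.hypot(x - kx, y - ky) >= min_distance_mm for kx, ky, _ in kept):
            kept.append((x, y, r))
    return kept


candidates = [(0.0, 0.0, 8.0), (4.0, 3.0, 10.0), (30.0, 0.0, 7.0)]
berries = filter_centres(candidates)
```

In this toy input the first two candidates lie 5 mm apart, so only the one with the 10 mm radius is kept, together with the isolated third circle. A 3 mm circle spans roughly 50 pixels of circumference at this resolution, which is why small radii need the stricter 80% support to avoid noise being accepted as berries.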
In order to obtain the optimum HT and LT values for each variety, each image was analysed for all combinations of HT values between 0.15 and 0.35 in steps of 0.02 and LT values from 0.01 to 0.1 in steps of 0.01. Owing to the limited number of clusters available per grapevine variety, it was necessary to perform a cross-validation procedure, in which seven clusters per variety were used for building the model and the remaining three clusters per variety were used for model validation. Thus, a data set of 120 possible combinations was obtained. The coefficient of determination (R²) was obtained by evaluating the difference between the predicted variables, i.e. the number of berries in one cluster predicted by one algorithm (with its corresponding HT and LT settings), and their real values, i.e. the real number of berries in the cluster. This R² value represents the accuracy of the algorithm in predicting the number of berries in the cluster or their weight. From the 120 R² values computed, the mean R² value was obtained for each one of the possible HT, LT and type of algorithm settings (treatments, in statistical terminology), for each of the varieties analysed (number of berries computed from the analysis of one or four images per cluster, and cluster weight). From this data set, ANOVAs (Statgraphics Centurion XVI v16.1.15 64 bits; StatPoint Technologies, Inc., Warrenton, VA, USA) were performed on each of the varieties analysed. Since R² is a percentage, it was transformed as the arc sine of the square root of R² in order to better meet the normal distribution assumptions required for ANOVA. From ANOVA, it was possible to determine the statistical significance of the simple effect of the factors involved (type of algorithm used, HT and LT), as well as of their interactions, at P < 0.05.

Figure 2. Flowchart of the algorithm implemented to estimate the number and weight of berries per cluster from RGB images.

RESULTS AND DISCUSSION

Variability of the cluster morphology
The data corresponding to cluster weight, berry number per cluster and berry weight for each variety, determined from manual measurement in the laboratory, are shown in Table 1. These values illustrate the intrinsic variability of the cluster yield components that exists not only among varieties, but also within a single cultivar, as the maximum values are often twice the value of the minimum.

Table 1. Average (Avg), minimum (Min) and maximum (Max) values corresponding to cluster weight, berry number per cluster and berry weight of each variety by manual assessment (n = 10)

Variety               Bunch weight (g)          Berry number per bunch    Berry weight (g)
                      Min     Avg     Max       Min     Avg     Max       Min    Avg    Max
Bobal                 209.5   316.9   399.2     106     131.9   165       1.85   2.28   2.46
Cabernet Sauvignon    70.1    110.3   162.5     88      127.6   159       0.67   0.82   1.02
Carignan              163.0   245.6   349.0     69      89.4    121       1.76   2.67   3.03
Grenache              142.7   222.7   345.2     123     162.5   204       1.16   1.33   1.68
Merlot                102.8   127.3   187.0     116     124.7   164       0.88   0.97   1.17
Monastrell            181.5   236.8   325.3     124     150.2   177       1.48   1.63   1.84
Tempranillo           180.2   240.4   367.9     132     154.0   181       1.36   1.54   2.01
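The statement that maximum values are often about twice the minimum can be checked directly against the bunch-weight columns of Table 1. The short Python sketch below does this arithmetic. It assumes the values are read in grams with one decimal place (for example 209.5 g rather than 2095 g), and the dictionary and function names are illustrative.

```python
# Bunch weight (g) per variety as (min, max), read from Table 1 with one
# decimal place (an assumption about how the printed values should be read).
BUNCH_WEIGHT_G = {
    "Bobal": (209.5, 399.2),
    "Cabernet Sauvignon": (70.1, 162.5),
    "Carignan": (163.0, 349.0),
    "Grenache": (142.7, 345.2),
    "Merlot": (102.8, 187.0),
    "Monastrell": (181.5, 325.3),
    "Tempranillo": (180.2, 367.9),
}


def max_min_ratios(table):
    """Return the max/min ratio for each variety, rounded to two decimals."""
    return {variety: round(hi / lo, 2) for variety, (lo, hi) in table.items()}


ratios = max_min_ratios(BUNCH_WEIGHT_G)
at_least_double = [v for v, ratio in ratios.items() if ratio >= 2.0]
```

Under that reading, every variety has a max/min bunch-weight ratio between about 1.8 and 2.4, and four of the seven varieties reach a ratio of 2 or more, which is consistent with the wording "often twice".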

This variability in cluster morphology is a positive feature for the development and testing of image analysis algorithms aimed at the estimation of yield components, as the universality and robustness of these algorithms are reinforced.

Berry number per cluster
Figure 3A shows the original image of a cluster, and the remaining panels show the process followed to detect the individual berries. Figure 3B shows the first derivative image, and the one resulting from applying the LIP–Canny algorithm is depicted in Fig. 3C. Figure 3D shows the potential centres of the berries found after applying the Hough Transform. Finally, Fig. 3E shows the result of circle detection after the filtering process and Fig. 3F depicts an image of a cluster with the berries detected by the system.
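The step from contours to potential berry centres can be illustrated with a toy Hough accumulator. The Python sketch below is not the MATLAB implementation used in the study. It votes for centres at a single fixed radius, uses a synthetic ideal circle as input, and all function names are hypothetical.

```python
import math
from collections import Counter


def hough_circle_votes(edge_pixels, radius_px, n_angles=360):
    """Accumulate votes for circle centres at one fixed radius.

    Every edge pixel votes for the integer positions lying radius_px away
    from it, so a true centre collects votes from many edge pixels.
    """
    votes = Counter()
    for x, y in edge_pixels:
        seen = set()  # at most one vote per (edge pixel, centre) pair
        for k in range(n_angles):
            theta = 2 * math.pi * k / n_angles
            centre = (round(x - radius_px * math.cos(theta)),
                      round(y - radius_px * math.sin(theta)))
            if centre not in seen:
                seen.add(centre)
                votes[centre] += 1
    return votes


def synthetic_circle(cx, cy, radius_px, n_points=120):
    """Edge pixels of an ideal circle, used here as toy input."""
    points = set()
    for k in range(n_points):
        theta = 2 * math.pi * k / n_points
        points.add((round(cx + radius_px * math.cos(theta)),
                    round(cy + radius_px * math.sin(theta))))
    return sorted(points)


edge_pixels = synthetic_circle(50, 40, 20)
votes = hough_circle_votes(edge_pixels, radius_px=20)
best_centre, best_count = votes.most_common(1)[0]
```

For this synthetic input the accumulator peak falls on the true centre and is supported by every edge pixel. With real contours only part of the circumference is present, which is where a minimum-perimeter criterion such as the one described in the methods becomes necessary, and the search has to be repeated over the whole range of radii.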


Figure 3. (A) Original image of a cluster; (B) first derivative of the image; (C) result of the contour extraction using the LIP–Canny algorithm; (D) potential centres of the berries found after applying the Hough Transform; (E) circles found in the image supposedly corresponding to the berries; and (F) result of the circle detection operation in a grapevine cluster image after applying the Hough Transform.




Table 2. Analysis of variance P-values for the grapevine varieties studied (using a single image and four images) in the estimation of total berry number per cluster

Effect    Bobal    Cabernet Sauvignon

Analysis of variance P-value (single image)
HT
