Equivalence of data fusion and simultaneous retrieval

Simone Ceccherini,* Bruno Carli and Piera Raspollini

Istituto di Fisica Applicata “Nello Carrara” del Consiglio Nazionale delle Ricerche, Via Madonna del Piano 10, 50019 Sesto Fiorentino, Italy

* [email protected]

Abstract: A new method for the data fusion of atmospheric vertical profiles, referred to as complete fusion, is presented. Using the measurements of the MIPAS instrument, the performance of the method is compared with those of weighted and arithmetic means. The complete fusion perfectly reproduces the results of the simultaneous retrieval with equal error estimates and number of degrees of freedom, while arithmetic and weighted means have relatively low vertical resolution and differ from the simultaneous retrieval by more than their errors. In addition the problem posed in this context by systematic errors is analyzed and alleviating procedures are considered.

©2015 Optical Society of America

OCIS codes: (010.1280) Atmospheric composition; (280.4991) Passive remote sensing; (100.3190) Inverse problems; (000.3860) Mathematical methods in physics; (150.1135) Algorithms; (150.4232) Multisensor methods.

References and links

1. H. Fischer, M. Birk, C. Blom, B. Carli, M. Carlotti, T. Clarmann, L. Delbouille, A. Dudhia, D. Ehhalt, M. Endemann, J. M. Flaud, R. Gessner, A. Kleinert, R. Koopman, J. Langen, M. Lopez-Puertas, P. Mosner, H. Nett, H. Oelhaf, G. Perron, J. Remedios, M. Ridolfi, G. Stiller, and R. Zander, “MIPAS: an instrument for atmospheric and climate research,” Atmos. Chem. Phys. 8(8), 2151–2188 (2008).
2. J. W. Waters, L. Froidevaux, R. S. Harwood, R. F. Jarnot, H. M. Pickett, W. G. Read, P. H. Siegel, R. E. Cofield, M. J. Filipiak, D. A. Flower, J. R. Holden, G. K. Lau, N. J. Livesey, G. L. Manney, H. C. Pumphrey, M. L. Santee, D. L. Wu, D. T. Cuddy, R. R. Lay, M. S. Loo, V. S. Perun, M. J. Schwartz, P. C. Stek, R. P. Thurstans, M. A. Boyles, K. M. Chandra, M. C. Chavez, G. S. Chen, B. V. Chudasama, R. Dodge, R. A. Fuller, M. A. Girard, J. H. Jiang, Y. Jiang, B. W. Knosp, R. C. LaBelle, J. C. Lam, K. A. Lee, D. Miller, J. E. Oswald, N. C. Patel, D. M. Pukala, O. Quintero, D. M. Scaff, W. V. Snyder, M. C. Tope, P. A. Wagner, and M. J. Walch, “The Earth Observing System Microwave Limb Sounder (EOS MLS) on the Aura satellite,” IEEE Trans. Geosci. Rem. Sens. 44(5), 1075–1092 (2006).
3. P. F. Bernath, C. T. McElroy, M. C. Abrams, C. D. Boone, M. Butler, C. Camy-Peyret, M. Carleer, C. Clerbaux, P.-F. Coheur, R. Colin, P. DeCola, M. De Mazière, J. R. Drummond, D. Dufour, W. F. J. Evans, H. Fast, D. Fussen, K. Gilbert, D. E. Jennings, E. J. Llewellyn, R. P. Lowe, E. Mahieu, J. C. McConnell, M. McHugh, S. D. McLeod, R. Michaud, C. Midwinter, R. Nassar, F. Nichitiu, C. Nowlan, C. P. Rinsland, Y. J. Rochon, N. Rowlands, K. Semeniuk, P. Simon, R. Skelton, J. J. Sloan, M.-A. Soucy, K. Strong, P. Tremblay, D. Turnbull, K. A. Walker, I. Walkty, D. A. Wardle, V. Wehrle, R. Zander, and J. Zou, “Atmospheric Chemistry Experiment (ACE): Mission overview,” Geophys. Res. Lett. 32(15), L15S01 (2005).
4. W. L. Smith, Sr., H. Revercomb, G. Bingham, A. Larar, H. Huang, D. Zhou, J. Li, X. Liu, and S. Kireev, “Technical note: evolution, current capabilities, and future advance in satellite nadir viewing ultra-spectral IR sounding of the lower atmosphere,” Atmos. Chem. Phys. 9(15), 5563–5574 (2009).
5. European Commission, “Copernicus: The European Earth Observations programme,” http://www.copernicus.eu/.
6. F. Aires, O. Aznay, C. Prigent, M. Paul, and F. Bernardo, “Synergistic multi-wavelength remote sensing versus a posteriori combination of retrieved products: Application for the retrieval of atmospheric profiles using MetOpA,” J. Geophys. Res. 117(D18), D18304 (2012).
7. S. Ceccherini, P. Raspollini, and B. Carli, “Optimal use of the information provided by indirect measurements of atmospheric vertical profiles,” Opt. Express 17(7), 4944–4958 (2009).
8. S. Ceccherini, B. Carli, U. Cortesi, S. Del Bianco, and P. Raspollini, “Retrieval of the vertical column of an atmospheric constituent from data fusion of remote sensing measurements,” J. Quant. Spectrosc. Radiat. 111(3), 507–514 (2010).
9. S. Ceccherini, U. Cortesi, S. Del Bianco, P. Raspollini, and B. Carli, “IASI-METOP and MIPAS-ENVISAT data fusion,” Atmos. Chem. Phys. 10(10), 4689–4698 (2010).

#232855 - $15.00 USD © 2015 OSA

Received 20 Jan 2015; revised 11 Mar 2015; accepted 11 Mar 2015; published 25 Mar 2015 6 Apr 2015 | Vol. 23, No. 7 | DOI:10.1364/OE.23.008476 | OPTICS EXPRESS 8476

10. J. X. Warner, R. Yang, Z. Wei, F. Carminati, A. Tangborn, Z. Sun, W. Lahoz, J.-L. Attié, L. El Amraoui, and B. Duncan, “Global carbon monoxide products from combined AIRS, TES and MLS measurements on A-train satellites,” Atmos. Chem. Phys. 14(1), 103–114 (2014).
11. C. D. Rodgers, Inverse Methods for Atmospheric Sounding: Theory and Practice, Vol. 2 of Series on Atmospheric, Oceanic and Planetary Physics (World Scientific, 2000).
12. S. Ceccherini and M. Ridolfi, “Technical Note: Variance-covariance matrix and averaging kernels for the Levenberg-Marquardt solution of the retrieval of atmospheric vertical profiles,” Atmos. Chem. Phys. 10(6), 3131–3139 (2010).
13. S. Ceccherini, “A generalization of optimal estimation for the retrieval of atmospheric vertical profiles,” J. Quant. Spectrosc. Radiat. 113(12), 1437–1440 (2012).
14. S. Ceccherini, B. Carli, and P. Raspollini, “The average of atmospheric vertical profiles,” Opt. Express 22(20), 24808–24816 (2014).
15. S. Ceccherini, B. Carli, and P. Raspollini, “Quality quantifier of indirect measurements,” Opt. Express 20(5), 5151–5167 (2012).
16. S. Ceccherini, B. Carli, and P. Raspollini, “Quality of MIPAS operational products,” J. Quant. Spectrosc. Radiat. 121, 45–55 (2013).
17. R. A. Fisher, “The logic of inductive inference,” J.R. Stat. Soc. 98(1), 39–54 (1935).
18. M. Ridolfi, B. Carli, M. Carlotti, T. von Clarmann, B. M. Dinelli, A. Dudhia, J. M. Flaud, M. Höpfner, P. E. Morris, P. Raspollini, G. Stiller, and R. J. Wells, “Optimized forward model and retrieval scheme for MIPAS near-real-time data processing,” Appl. Opt. 39(8), 1323–1340 (2000).
19. P. Raspollini, C. Belotti, A. Burgess, B. Carli, M. Carlotti, S. Ceccherini, B. M. Dinelli, A. Dudhia, J. M. Flaud, B. Funke, M. Hopfner, M. Lopez-Puertas, V. Payne, C. Piccolo, J. J. Remedios, M. Ridolfi, and R. Spang, “MIPAS level 2 operational analysis,” Atmos. Chem. Phys. 6(12), 5605–5630 (2006).
20. P. Raspollini, B. Carli, M. Carlotti, S. Ceccherini, A. Dehn, B. M. Dinelli, A. Dudhia, J.-M. Flaud, M. López-Puertas, F. Niro, J. J. Remedios, M. Ridolfi, H. Sembhi, L. Sgheri, and T. von Clarmann, “Ten years of MIPAS measurements with ESA Level 2 processor V6 – Part 1: Retrieval algorithm and diagnostics of the products,” Atmos. Meas. Tech. 6(9), 2419–2439 (2013).
21. A. Doicu, T. Trautmann, and F. Schreier, Numerical Regularization for Atmospheric Inverse Problems (Springer, 2010).
22. S. Ceccherini, “Analytical determination of the regularization parameter in the retrieval of atmospheric vertical profiles,” Opt. Lett. 30(19), 2554–2556 (2005).
23. S. Ceccherini, C. Belotti, B. Carli, P. Raspollini, and M. Ridolfi, “Technical Note: Regularization performances with the error consistency method in the case of retrieved atmospheric profiles,” Atmos. Chem. Phys. 7(5), 1435–1440 (2007).
24. M. Ridolfi and L. Sgheri, “Iterative approach to self-adapting and altitude-dependent regularization for atmospheric profile retrievals,” Opt. Express 19(27), 26696–26709 (2011).
25. M. Carlotti, “Global-fit approach to the analysis of limb-scanning atmospheric measurements,” Appl. Opt. 27(15), 3250–3254 (1988).
26. A. Dudhia, V. L. Jay, and C. D. Rodgers, “Microwindow selection for high-spectral-resolution sounders,” Appl. Opt. 41(18), 3665–3673 (2002).
27. M. Ridolfi and L. Sgheri, “On the choice of retrieval variables in the inversion of remotely sensed atmospheric measurements,” Opt. Express 21(9), 11465–11474 (2013).
28. J. J. Remedios, R. J. Leigh, A. M. Waterfall, D. P. Moore, H. Sembhi, I. Parkes, J. Greenhough, M. P. Chipperfield, and D. Hauglustaine, “MIPAS reference atmospheres and comparisons to V4.61/V4.62 MIPAS level 2 geophysical data sets,” Atmos. Chem. Phys. Discuss. 7(4), 9973–10017 (2007).

1. Introduction

Remote sensing observations of chemical and physical processes occurring in the atmosphere are presently made from space by several instruments, and more instruments will be available in the future. Among the instruments that in recent years performed spaceborne observations of the atmosphere, we recall the limb-viewing instruments MIPAS [1], MLS [2] and ACE [3], while a review of some nadir-viewing instruments is given in [4]. Examples of satellite instruments that will sound the atmosphere in the future can be found in the description of Copernicus [5], a joint European Commission/European Space Agency (ESA) programme.

When two or more instruments sound the same portion of the atmosphere and observe the same species either in different spectral regions or with different geometries, two strategies are possible for the retrieval of the best vertical profile estimate that exploits all the available information. First, we can use all the observations acquired by the different instruments as inputs of a single retrieval algorithm that produces a single profile. We refer to this approach as the simultaneous retrieval. Second, we can use the observations of the different instruments to retrieve an independent vertical profile from each one and then apply, a posteriori, an algorithm that combines the different profiles and determines a single estimate. We refer to this approach as data fusion.

The simultaneous retrieval provides the best estimate because it takes into account all the possible interactions between the various information inputs [6], exploiting the complementary information of the different measurements while rigorously combining the redundant information. However, the simultaneous retrieval is difficult to implement because it requires a forward model that can simulate all the observations of the different instruments, and the retrieval algorithm has to deal with a large amount of data. The data fusion approach overcomes these implementation problems, but it is usually not expected to perform as well as the simultaneous retrieval, because the intermediate step of the retrievals of the single measurements can cause a loss of information.

The two approaches were compared in [6] in the case of the MetOp-A observations from the IASI, AMSU-A and MHS instruments, using the weighted mean for the data fusion. This study showed that the simultaneous retrieval provides better results than the data fusion, because it can better exploit the synergy of the different measurements. A data fusion method that can provide better results than the simple weighted mean was proposed in [7] and subsequently applied [8, 9] to ozone, combining co-located measurements of the IASI and MIPAS instruments. These studies showed that the performance of this method is good, but its implementation requires the use of the measurement space solution [7], which is not a standard product and is not commonly provided to the users. Another data fusion method, which uses a formulation identical to the Kalman filter commonly used in data assimilation, was used in [10] for the determination of carbon monoxide concentration from the measurements of the AIRS, TES and MLS instruments on the A-train satellites. In practice the Kalman filter too combines measurements using the weighted mean.
We present herewith a new data fusion method for atmospheric vertical profiles retrieved from remote sensing measurements. It uses an algorithm that is more sophisticated than the simple weighted mean because, in addition to the retrieval errors of the single profiles, described by the covariance matrices (CMs), it also takes into account the sensitivity of the retrieved profiles to the true profile, described by the averaging kernel matrices (AKMs) [11, 12]. Since both the CM and the AKM are commonly provided together with the retrieved profile, the method is simple to implement and feasible with all measurements.

We also compare the results of the new data fusion method with those of the simultaneous retrieval using the measurements of the MIPAS (Michelson Interferometer for Passive Atmospheric Sounding) instrument [1] onboard the ENVISAT satellite. The observations of a MIPAS limb sounding sequence are divided into two complementary sets and two profiles are independently retrieved from the two sets of observations. The two profiles are fused with the data fusion algorithm and the result is compared with the profile retrieved using all the observations of the sequence simultaneously. The performance of the new data fusion method is also compared with those of weighted and arithmetic means.

In Section 2 we present the algorithm of the new data fusion method, in Section 3 we apply the method to real measurements and in Section 4 we draw conclusions.

2. The new data fusion method

Suppose that we have N independent simultaneous measurements of the vertical profile of an atmospheric species referred to a specific geolocation. Performing the retrieval of the N measurements, we obtain N vectors $\hat{\mathbf{x}}_i$ (i = 1, 2, …, N) that provide independent estimates of the profile on a common vertical grid. The vectors $\hat{\mathbf{x}}_i$ are characterized by the CMs $\mathbf{S}_i$ and the AKMs $\mathbf{A}_i$. Each CM $\mathbf{S}_i$ is defined as the mean value of the product $\boldsymbol{\sigma}_i \boldsymbol{\sigma}_i^T$, where the vector $\boldsymbol{\sigma}_i$ contains the errors on the vertical profile obtained by propagating the errors of the observations through the retrieval process and the superscript T indicates the transpose. Each AKM $\mathbf{A}_i$ is defined as the matrix that contains the derivatives of the components of the retrieved profile $\hat{\mathbf{x}}_i$ with respect to the components of the true profile $\mathbf{x}_{\mathrm{true}}$ (which is not known).
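In symbols, the two definitions above can be restated compactly as follows. This is only a notational sketch: the angle brackets (expectation over the measurement noise) and the component indices j and k are assumptions introduced here, not notation taken from the text.

$$\mathbf{S}_i = \left\langle \boldsymbol{\sigma}_i \boldsymbol{\sigma}_i^{T} \right\rangle , \qquad \left( \mathbf{A}_i \right)_{jk} = \frac{\partial \left( \hat{\mathbf{x}}_i \right)_j}{\partial \left( \mathbf{x}_{\mathrm{true}} \right)_k} .$$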

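If we expand to first order the relationship that exists between the retrieved profile and the true profile, exploiting the definition of the AKM, we obtain the following equation [11, 13]:

$$\hat{\mathbf{x}}_i = \mathbf{x}_{ai} + \mathbf{A}_i \left( \mathbf{x}_{\mathrm{true}} - \mathbf{x}_{ai} \right) + \boldsymbol{\sigma}_i , \tag{1}$$

where we have indicated with $\mathbf{x}_{ai}$ the a priori profile used in the i-th retrieval. Rearranging Eq. (1) we obtain:

$$\hat{\mathbf{x}}_i - \left( \mathbf{I} - \mathbf{A}_i \right) \mathbf{x}_{ai} = \mathbf{A}_i \mathbf{x}_{\mathrm{true}} + \boldsymbol{\sigma}_i , \tag{2}$$

where $\mathbf{I}$ represents the identity matrix. In Eq. (2) we notice that the vector

$$\boldsymbol{\alpha}_i = \hat{\mathbf{x}}_i - \left( \mathbf{I} - \mathbf{A}_i \right) \mathbf{x}_{ai} , \tag{3}$$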

which is obtained from known quantities, is an estimate of the vector $\mathbf{A}_i \mathbf{x}_{\mathrm{true}}$, made of the components of $\mathbf{x}_{\mathrm{true}}$ along the rows of the AKM (averaging kernels) and, as such, corresponds to a new indirect measurement of the true profile made in the vector space generated by the averaging kernels. From Eqs. (2) and (3) we see that the new measurement provided by the vector $\boldsymbol{\alpha}_i$ has the same errors $\boldsymbol{\sigma}_i$ as the retrieved profile $\hat{\mathbf{x}}_i$ (consequently its CM is also given by $\mathbf{S}_i$), but does not depend on the a priori profile $\mathbf{x}_{ai}$ (which instead contributes as a bias to $\hat{\mathbf{x}}_i$). The above procedure for the calculation of the vectors $\boldsymbol{\alpha}_i$ was already presented in [14], but, given its importance for the following considerations, is here repeated.

Since the vectors $\boldsymbol{\alpha}_i$ are indirect measurements in the form of $\mathbf{A}_i \mathbf{x}_{\mathrm{true}}$, we can perform a simultaneous fit of these measurements minimizing the following cost function:

$$c(\mathbf{x}) = \sum_{i=1}^{N} \left( \boldsymbol{\alpha}_i - \mathbf{A}_i \mathbf{x} \right)^{T} \mathbf{S}_i^{-1} \left( \boldsymbol{\alpha}_i - \mathbf{A}_i \mathbf{x} \right) + \left( \mathbf{x} - \mathbf{x}_a \right)^{T} \mathbf{S}_a^{-1} \left( \mathbf{x} - \mathbf{x}_a \right) , \tag{4}$$

where $\mathbf{x}_a$ and $\mathbf{S}_a$ are an a priori profile and its CM that we may want to use as a constraint of the solution. This constraint depends on the ill conditioning of the simultaneous fit and is in general different from the constraints used in the individual retrievals. The minimum of $c(\mathbf{x})$ is obtained for:

$$\mathbf{x}_f = \left( \sum_{i=1}^{N} \mathbf{A}_i^{T} \mathbf{S}_i^{-1} \mathbf{A}_i + \mathbf{S}_a^{-1} \right)^{-1} \left( \sum_{i=1}^{N} \mathbf{A}_i^{T} \mathbf{S}_i^{-1} \boldsymbol{\alpha}_i + \mathbf{S}_a^{-1} \mathbf{x}_a \right) . \tag{5}$$

This relationship provides a new estimate of the profile determined with the data fusion of N different profiles. This fused profile has a CM, obtained propagating the errors of $\boldsymbol{\alpha}_i$ into $\mathbf{x}_f$, equal to:

$$\mathbf{S}_f = \left( \sum_{i=1}^{N} \mathbf{A}_i^{T} \mathbf{S}_i^{-1} \mathbf{A}_i + \mathbf{S}_a^{-1} \right)^{-1} \left( \sum_{i=1}^{N} \mathbf{A}_i^{T} \mathbf{S}_i^{-1} \mathbf{A}_i \right) \left( \sum_{i=1}^{N} \mathbf{A}_i^{T} \mathbf{S}_i^{-1} \mathbf{A}_i + \mathbf{S}_a^{-1} \right)^{-1} \tag{6}$$

and an AKM, obtained performing the derivative of $\mathbf{x}_f$ with respect to the true profile, equal to:

$$\mathbf{A}_f = \left( \sum_{i=1}^{N} \mathbf{A}_i^{T} \mathbf{S}_i^{-1} \mathbf{A}_i + \mathbf{S}_a^{-1} \right)^{-1} \sum_{i=1}^{N} \mathbf{A}_i^{T} \mathbf{S}_i^{-1} \mathbf{A}_i . \tag{7}$$

In these formulas we can see the importance of the quantity $\mathbf{A}_i^{T} \mathbf{S}_i^{-1} \mathbf{A}_i$ [15, 16] that gives the Fisher information matrix [17] using the retrieval products. This quantity, which summarizes the information that observations contain about retrieved parameters, is a retrieval invariant because, independently of the used constraint, it is the same for all retrievals made with a given set of observations.
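Eqs. (3) and (5)-(7) translate almost line by line into matrix code. The following is a minimal NumPy sketch, not the authors' implementation: the function names `complete_fusion` and `linear_oe`, the synthetic linear forward models `K1` and `K2`, and all numerical values are assumptions made for illustration. It assumes that every $\mathbf{S}_i$ is invertible, and it compares the fused result with a simultaneous optimal-estimation retrieval of a purely linear toy problem.

```python
import numpy as np


def complete_fusion(x_hats, S_list, A_list, xa_list, x_a, S_a):
    """Fuse N retrieved profiles with Eqs. (3) and (5)-(7).

    x_hats, S_list, A_list, xa_list: per-profile retrieved vector, CM, AKM
    and a priori profile. x_a, S_a: a priori used to constrain the fusion.
    """
    n = x_a.size
    S_a_inv = np.linalg.inv(S_a)
    fisher = np.zeros((n, n))      # accumulates sum_i A_i^T S_i^-1 A_i
    rhs = S_a_inv @ x_a            # accumulates the right-hand factor of Eq. (5)
    for x_hat, S, A, xa_i in zip(x_hats, S_list, A_list, xa_list):
        alpha = x_hat - (np.eye(n) - A) @ xa_i   # Eq. (3)
        At_Sinv = A.T @ np.linalg.inv(S)
        fisher += At_Sinv @ A
        rhs += At_Sinv @ alpha
    M_inv = np.linalg.inv(fisher + S_a_inv)
    x_f = M_inv @ rhs              # Eq. (5)
    S_f = M_inv @ fisher @ M_inv   # Eq. (6)
    A_f = M_inv @ fisher           # Eq. (7)
    return x_f, S_f, A_f


def linear_oe(y, K, S_y, x_a, S_a):
    """Hypothetical linear optimal-estimation retrieval for y = K x + noise."""
    S_y_inv = np.linalg.inv(S_y)
    G = np.linalg.inv(K.T @ S_y_inv @ K + np.linalg.inv(S_a)) @ K.T @ S_y_inv
    x_hat = x_a + G @ (y - K @ x_a)
    return x_hat, G @ S_y @ G.T, G @ K   # profile, noise CM, AKM


# Synthetic test case (all numbers are arbitrary illustrative choices).
rng = np.random.default_rng(0)
n, m = 5, 8                        # state size, observations per data set
x_true = rng.uniform(1.0, 2.0, n)
K1, K2 = rng.normal(size=(m, n)), rng.normal(size=(m, n))
S_y = 0.01 * np.eye(m)
y1 = K1 @ x_true + rng.normal(scale=0.1, size=m)
y2 = K2 @ x_true + rng.normal(scale=0.1, size=m)

# Individual retrievals use their own, stronger constraint.
xa_ind, Sa_ind = np.full(n, 1.2), 0.25 * np.eye(n)
x1, S1, A1 = linear_oe(y1, K1, S_y, xa_ind, Sa_ind)
x2, S2, A2 = linear_oe(y2, K2, S_y, xa_ind, Sa_ind)

# Fusion and simultaneous retrieval share a different, weaker constraint.
x_a, S_a = np.full(n, 1.5), np.eye(n)
x_f, S_f, A_f = complete_fusion([x1, x2], [S1, S2], [A1, A2],
                                [xa_ind, xa_ind], x_a, S_a)
x_sim, S_sim, A_sim = linear_oe(np.concatenate([y1, y2]), np.vstack([K1, K2]),
                                0.01 * np.eye(2 * m), x_a, S_a)

print(np.allclose(x_f, x_sim), np.allclose(S_f, S_sim), np.allclose(A_f, A_sim))
```

In this linear toy case the fused profile, its CM and its AKM agree with the simultaneous retrieval to numerical precision, even though the individual retrievals used a different a priori. With real retrieval products $\mathbf{S}_i$ may be poorly conditioned, in which case the explicit inverses used here would need more care.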

If the AKMs $\mathbf{A}_i$ are equal to the identity matrix, from Eq. (3) it follows that $\boldsymbol{\alpha}_i = \hat{\mathbf{x}}_i$ and Eq. (5) coincides with the weighted mean of the N measurements (which in turn reduces to the arithmetic mean when all the measurements are considered to have the same $\mathbf{S}_i$). Therefore, Eq. (5) is a generalization of the weighted mean to the case of AKMs different from identity matrices and, for its capability of considering all the features of the measurements that are being combined, we shall refer to it as the complete fusion formula. In the Appendix we show that in a linear approximation the solution obtained with complete fusion coincides with the solution obtained with simultaneous retrieval.

3. Application to real data

We evaluate the performance of the new data fusion approach by applying it to an ozone vertical profile retrieved from MIPAS observations [1]. MIPAS is a limb-viewing Fourier transform spectrometer that sounds the emission of the Earth's atmosphere in the spectral range from 685 to 2410 cm−1. It operated successfully onboard the sun-synchronous polar-orbiting ENVISAT satellite, which was launched on 1 March 2002 and ended its operations on 8 April 2012. A modified version of the ORM (Optimized Retrieval Model) [18–20], which is the scientific prototype of the ESA operational level 2 processor for MIPAS, was used for the retrievals performed in this paper. The ORM code minimizes the chi-square function using the regularizing Levenberg-Marquardt approach [21] and then the residual non-physical oscillations are eliminated by an a posteriori Tikhonov regularization with a self-adapting strength [22–24]. This approach provides a good performance in terms of retrieval speed and minimum constraint, but makes it difficult to determine the effective a priori. Since the proposed data fusion method requires the knowledge of the a priori profile, as described in the previous section, the ORM was modified in order to perform the retrieval with the optimal estimation method [11], for which the a priori profile is well defined.

In the following analysis we present the MIPAS measurement acquired on 1 October 2007 at the geolocation of 41.38° N and 95.24° E. However, similar results and the same conclusions are obtained when the new data fusion method is applied to other MIPAS measurements. The considered measurement consists of 27 atmospheric spectra, with a spectral resolution of 0.0625 cm−1, acquired at the limb with tangent altitudes ranging from 7 to 71 km, in steps that increase with altitude from 1.5 km in the troposphere to 4.5 km in the mesosphere. The retrieval of this measurement consists of the global fit [25] of a few selected spectral intervals (microwindows [26]) of the 27 spectra, and the retrieved state vector includes the volume mixing ratio (VMR) profile of ozone, the profiles of the (frequency-independent) atmospheric transparency due to continuum absorption [27] and the (tangent-altitude-independent) radiometric offset for each microwindow. A climatological profile [28] is used as the a priori profile and its CM is built using an error equal to the value of the a priori profile for the diagonal elements and a correlation length of 10 km for the calculation of the off-diagonal elements. The a priori value of the transparency profiles is 1 with uncorrelated errors equal to 1%. The a priori value of the radiometric offsets is 0 with uncorrelated errors equal to 31.6 nW/(cm² sr cm−1).

MIPAS measurements cover a large vertical range and contain (as we shall see) a large number of degrees of freedom (NDOF). Therefore, it is possible to divide one of these measurements into two data sets, each with a NDOF sufficient for an independent analysis, that are suitable for testing the data fusion method.
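The construction of the a priori CM of the ozone profile described above (diagonal error equal to the a priori value, 10 km correlation length) could be sketched as follows. The text gives only the correlation length, so the exponential shape of the off-diagonal decay is an assumption of this sketch, as are the function name `apriori_covariance`, the altitude grid and the profile values.

```python
import numpy as np


def apriori_covariance(x_a, z_km, rel_error=1.0, corr_length_km=10.0):
    """A priori CM with 1-sigma error = rel_error * |x_a| on the diagonal.

    ASSUMPTION: off-diagonal elements decay exponentially with the altitude
    separation; the text specifies the correlation length, not the shape.
    """
    sigma = rel_error * np.abs(x_a)
    dz = np.abs(z_km[:, None] - z_km[None, :])
    return np.outer(sigma, sigma) * np.exp(-dz / corr_length_km)


# Hypothetical 5-level grid and ozone-like VMR values (ppmv), for illustration.
z_km = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
x_a = np.array([0.1, 2.0, 7.0, 6.0, 2.5])
S_a = apriori_covariance(x_a, z_km)
```

A stronger constraint is obtained in the same way by lowering `rel_error` and increasing `corr_length_km`.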
We divided the measurement in two different ways. Firstly, we obtained two data sets by taking spectra at alternate tangent altitudes. Indexing the spectra starting from the highest tangent altitude (for which the index value is equal to 1) and increasing the index progressively as the tangent altitude decreases, even spectra (corresponding to index values 2, 4, …) are included in one data set and odd spectra (corresponding to index values 1, 3, …) are included in the other one. In this way, we obtain two data sets that cover approximately the same vertical range but each have a vertical resolution worse than that of the original measurement. Secondly, we obtained two data sets by including the spectra indexed from 1 to 13 in the first one (in the following indicated as high spectra) and those indexed from 14 to 27 in the second one (in the following indicated as low spectra). In this way, we obtain data sets that cover two complementary altitude ranges: the first one approximately from 29 km to 71 km and the second one from 7 to 29 km. The vertical resolution of these data sets is equal to that of the original measurement.

Since the reduced data sets contain a smaller number of observations than the original measurement, but are used for retrievals made on the same retrieval grid, a constraint stronger than that of the original measurement is used for their analyses. After some tests we obtained good profiles without oscillations induced by ill-conditioning using for the a priori ozone profile an error equal to 30% of the a priori value and a correlation length for the off-diagonal elements of the CM equal to 30 km. The a priori CM elements of the transparency profiles and of the radiometric offsets are taken equal to those used in the retrieval of the original measurement. The results obtained in the tests, made with the data sets containing even and odd spectra and with the data sets containing high and low spectra, are discussed in the next two subsections.

3.1 Fusion of even and odd spectra

In the left panel of Fig. 1 we report the profiles retrieved using the simultaneous retrieval of all the spectra, even spectra only and odd spectra only. The a priori profile is also reported. The right panel of Fig. 1 shows the differences of the profiles obtained using even and odd spectra with respect to the profile obtained using the simultaneous retrieval. The large number of observations makes the three retrieved profiles comparable, but the differences highlight the merits of the full analysis.

Fig. 1. Left panel: a priori profile (green line) and profiles retrieved using all spectra (black line), even spectra (red line) and odd spectra (blue line). Right panel: differences between the profiles obtained with reduced data sets and simultaneous retrieval in the case of even spectra (red line) and in the case of odd spectra (blue line).

In Fig. 2 we compare the profile obtained using the simultaneous retrieval with the profiles obtained using the different data fusion techniques (complete fusion, weighted mean and arithmetic mean) applied to the two profiles retrieved from even and odd spectra. The complete fusion reproduces very well the results of the simultaneous retrieval both from the point of view of the values, which differ by quantities much smaller than the errors, and from the point of view of the error estimates: the black dashed line is exactly under the blue dashed line. The arithmetic and weighted means differ from the simultaneous retrieval by much larger quantities and are characterized by errors that are much smaller than the observed differences. This apparent contradiction is explained by the analysis of the NDOF.

Fig. 2. Left panel: profile obtained using the simultaneous retrieval of all spectra (black line) and profiles obtained with complete fusion (blue line), weighted mean (red line) and arithmetic mean (green line) of the two profiles retrieved from even and odd spectra. The errors of these profiles are shown in the right panel with the same colors in dashed lines. The right panel also shows the differences, with respect to the profile obtained using the simultaneous retrieval, of profiles obtained with complete fusion (blue line), weighted mean (red line) and arithmetic mean (green line).

In Table 1 we report the NDOF, calculated as the trace of the AKM, of the profiles reported in Figs. 1 and 2. The profiles retrieved from even and odd spectra have, as expected, a NDOF that is about half the NDOF of the profile obtained from the simultaneous retrieval. The profile obtained with the complete fusion has the same NDOF as simultaneous retrieval while weighted and arithmetic means have a NDOF of the same order as profiles retrieved from even and odd spectra.

Table 1. NDOF of the profiles reported in Figs. 1 and 2.

Profile                   NDOF
Simultaneous retrieval    23.6
Even spectra              11.6
Odd spectra               11.7
Complete fusion           23.6
Weighted mean              9.7
Arithmetic mean           11.7
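The NDOF values in Table 1 are traces of AKMs, and the low NDOF of the two means can be traced back to how their AKMs are formed. The sketch below assumes, by linearity of the mean, that the AKM of the arithmetic mean is the mean of the individual AKMs and that the AKM of the weighted mean is the correspondingly weighted combination; these expressions are not stated in the text, and the function names and the small diagonal matrices are hypothetical.

```python
import numpy as np


def ndof(A):
    """Number of degrees of freedom: trace of the averaging kernel matrix."""
    return float(np.trace(A))


def arithmetic_mean_akm(A_list):
    """AKM of the arithmetic mean (mean of the AKMs, by linearity)."""
    return sum(A_list) / len(A_list)


def weighted_mean_akm(A_list, S_list):
    """AKM of the weighted mean with weights S_i^-1 (again by linearity)."""
    W = [np.linalg.inv(S) for S in S_list]
    return np.linalg.solve(sum(W), sum(w @ A for w, A in zip(W, A_list)))


# Hypothetical 3-level AKMs and CMs, chosen only to exercise the functions.
A1 = np.diag([0.9, 0.2, 0.8])
A2 = np.diag([0.3, 0.9, 0.4])
S1 = np.diag([1.0, 4.0, 1.0])
S2 = np.diag([4.0, 1.0, 4.0])

ndof_am = ndof(arithmetic_mean_akm([A1, A2]))   # mean of the traces 1.9 and 1.6
ndof_wm = ndof(weighted_mean_akm([A1, A2], [S1, S2]))

# Same arithmetic with the Table 1 values for even and odd spectra.
table1_am = (11.6 + 11.7) / 2
```

Under this assumption the NDOF of the arithmetic mean is simply the average of the individual NDOFs. For Table 1 this gives 11.65, consistent with the reported 11.7 given that the inputs are rounded to one decimal.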

The weighted mean and the arithmetic mean are the result of an averaging process that reduces the errors, but cannot provide a better NDOF. Accordingly, the small NDOF prevents an adequate representation of the shape of the profile so that the differences with respect to the more realistic representation provided by the simultaneous retrieval are significant and, in particular, larger than the reduced errors. The simultaneous retrieval and the complete fusion best exploit the available NDOF at the cost of a slightly larger retrieval error.

These results confirm that data fusion performed using either weighted or arithmetic mean provides worse results than those obtained with the simultaneous retrieval. However, when the more rigorous method of the complete fusion is used, the data fusion reproduces very well the results of the simultaneous retrieval (from the point of view of values, of error estimates and of NDOF) and, therefore, can be considered equivalent to the simultaneous retrieval.

3.2 Fusion of high and low spectra

The tests performed in the case of even and odd spectra and shown in Figs. 1 and 2 were repeated in the case of high and low spectra and are shown in Figs. 3 and 4. The comparison of the different retrievals, made in Fig. 3, shows the errors made when a reduced data set is used, while the comparison of the different data fusion methods, made in Fig. 4, shows the equivalence that exists between the complete fusion and the simultaneous retrieval and the approximations provided by arithmetic and weighted means. The estimated errors of weighted and arithmetic means are also in this case much smaller than the observed differences because of their poor vertical resolution, as confirmed by the NDOF.

Fig. 3. Left panel: a priori profile (green line) and profiles retrieved using all spectra (black line), low spectra (red line) and high spectra (blue line). Right panel: differences between the profiles obtained with reduced data sets and the simultaneous retrieval, in the case of low spectra (red line) and in the case of high spectra (blue line).

Fig. 4. Left panel: profile obtained using the simultaneous retrieval of all spectra (black line) and profiles obtained with complete fusion (blue line), weighted mean (red line) and arithmetic mean (green line) of the two profiles retrieved from low and high spectra. The errors of these profiles are shown in the right panel with the same colors in dashed lines. The right panel also shows the differences, with respect to the profile obtained by the simultaneous retrieval, of profiles obtained with complete fusion (blue line), weighted mean (red line) and arithmetic mean (green line).

Table 2 shows the NDOF of the profiles reported in Figs. 3 and 4. In this case too, the profiles retrieved from the reduced data sets have a NDOF that is about half the NDOF of the profile retrieved from all the spectra. The profile obtained with the complete fusion has the same NDOF as the simultaneous retrieval while the weighted and arithmetic means have a NDOF of the same order as the single profiles that are being fused (fusing profiles).

Table 2. NDOF of the profiles reported in Figs. 3 and 4.

Profile                   NDOF
Simultaneous retrieval    23.6
Low spectra                9.0
High spectra              10.6
Complete fusion           23.6
Weighted mean             11.4
Arithmetic mean            9.8

Comparing the results of the two tests, we see that they are very similar independently of the type of complementarity of the measurements: interlaced observations in the first test and complementary altitude ranges in the second test. Assuming that the simultaneous retrieval provides the best estimate of the profile, we notice that, relative to this reference, the shortcomings of reduced data sets and of approximate data fusions are mainly errors in the shape of the profile rather than systematic amplitude effects. However, this result may depend on the adopted a priori constraints.

3.3 Effect of systematic errors in data fusion

In the previous tests we considered the data fusion of profiles that are only affected by random errors. Systematic errors, if present in the MIPAS measurements, are the same in the two data sets obtained from the same measurement and do not have consequences in the fusion procedure. However, when the fusing profiles are provided by different instruments, it is very likely that they have different systematic errors, and it is important to be aware of the problems that these independent systematic errors may cause.

We studied this problem in the case of data fusion of profiles retrieved from even and odd spectra by simulating a systematic error. We increased the profile retrieved from even spectra by 2% and decreased the profile retrieved from odd spectra by 2%, obtaining a total bias of 4% between the two profiles. Then we performed the data fusion of the two biased profiles using the complete fusion approach as well as weighted and arithmetic means. The comparison of these three fused profiles with the reference profile, provided by the simultaneous retrieval without systematic errors, is shown in Fig. 5.
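The bias injection described above amounts to a simple rescaling of the two retrieved profiles. A minimal sketch with hypothetical profile values follows; for the purpose of the illustration it assumes that both data sets return the same unbiased profile, which is an idealisation of the real retrievals.

```python
import numpy as np

# Hypothetical retrieved profiles (ppmv); both data sets are assumed to
# return the same unbiased profile in this idealised illustration.
x_even = np.array([0.5, 3.0, 7.5, 5.0, 2.0])
x_odd = x_even.copy()

x_even_biased = 1.02 * x_even      # +2 % systematic error
x_odd_biased = 0.98 * x_odd        # -2 % systematic error

# Total bias between the two profiles, relative to the unbiased profile.
relative_bias = (x_even_biased - x_odd_biased) / x_even

# An equal and opposite bias cancels in the arithmetic mean.
arithmetic_mean = 0.5 * (x_even_biased + x_odd_biased)
```

In this idealised case the relative bias is 4% at every level and the arithmetic mean recovers the unbiased profile exactly.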

Fig. 5. Same profiles as reported in Fig. 2 when a systematic error is present in the profiles retrieved from even and odd spectra (respectively increased and decreased by 2%).
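The construction of the two biased profiles can be sketched numerically. The following minimal Python example uses hypothetical profile values (not MIPAS data), and the names `x_even`, `x_odd` and `bias` are assumptions introduced only for illustration. It shows that opposite biases largely cancel in an arithmetic mean, leaving a residual proportional to the difference between the two profiles.

```python
import numpy as np

# Hypothetical retrieved profiles (arbitrary units) on a common altitude grid.
# These values are illustrative only and are not MIPAS data.
x_even = np.array([1.0, 2.0, 4.0, 6.0, 5.0, 3.0])
x_odd = np.array([1.1, 1.9, 4.2, 5.8, 5.1, 2.9])

bias = 0.02  # 2% systematic error, applied with opposite signs

x_even_biased = x_even * (1.0 + bias)  # profile from even spectra, increased by 2%
x_odd_biased = x_odd * (1.0 - bias)    # profile from odd spectra, decreased by 2%

# Arithmetic mean before and after introducing the bias.
mean_unbiased = 0.5 * (x_even + x_odd)
mean_biased = 0.5 * (x_even_biased + x_odd_biased)

# The opposite biases largely cancel in the mean: the residual equals
# 0.5 * bias * (x_even - x_odd), which is small when the profiles agree.
residual = mean_biased - mean_unbiased
```

This sketch covers only the arithmetic mean; it does not reproduce the oscillations of the complete fusion, which depend on the AKMs and CMs of the fusing profiles.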

From the comparison of Fig. 5 with Fig. 2 we see that the presence of a systematic error causes strong oscillations in the complete fusion, while it has little or no effect in the case of weighted and arithmetic means. The different consequences that systematic errors have in the three fusion methods are linked to the different NDOF of the fused profiles. Weighted and arithmetic means have a small NDOF and do not exploit the information that the interlaced even and odd spectra provide at contiguous altitudes. In practice these fusion techniques do not increase the NDOF and calculate the averages of the inconsistent values provided at contiguous altitudes by the fusing profiles, cancelling in this way the bias artificially introduced. On the other hand, the complete fusion profile has a large NDOF and, therefore, with its good vertical resolution can discriminate between the contiguous altitudes of even and odd spectra. In this case, the fused profile tries to fit the alternating values obtained with the combination of even and odd spectra and generates an oscillation with amplitude greater than the bias between the two measurements: a bias of 4% causes an oscillation of up to 30% peak to peak. When a bias is introduced in the case of high and low spectra, similar results (not shown here) are obtained, even if the error amplification is not as large and differences mainly occur around the boundary altitude.

Among the systematic errors, not only instrumental errors but also the space and time variability of the observed species are important, which suggests caution in the definition of the co-location criteria of the measurements that are being fused. It appears that in the presence of systematic errors simple averaging techniques provide better results than complete fusion. However, this is not a shortcoming of complete fusion, but a problem that also concerns simultaneous retrieval and that is due to the rigorous exploitation of inconsistent information. Of course, the instability can be reduced by using an $\mathbf{S}_a$ that provides a stronger constraint, so that the NDOF is reduced and complete fusion becomes more similar to the other methods (weighted and arithmetic means). However, a better way to alleviate the problem is to take into account systematic errors in the fusion process by adding to the CMs of the fusing profiles the CMs of systematic errors.
If we add to the CMs of the profiles retrieved from even and odd spectra a diagonal matrix that accounts for the systematic errors (2% of the retrieved profiles) and perform the fusion between the two biased profiles we obtain the profile reported in Fig. 6 (blue line).
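The augmentation of a CM with a diagonal systematic-error term can be sketched as follows. This is a minimal example under stated assumptions: the systematic error is taken as a fixed fraction of the retrieved profile and uncorrelated between altitudes, so that the added diagonal contains the variances (2% of the profile, squared). The function name `add_systematic_cm` and the numerical values are hypothetical.

```python
import numpy as np

def add_systematic_cm(x_retrieved, cm_random, rel_sys_error=0.02):
    """Return the CM augmented with a diagonal systematic-error CM.

    Assumption: the systematic error is a fixed fraction of the retrieved
    profile and is uncorrelated between altitudes, so its CM is diagonal
    with variances (rel_sys_error * x)**2.
    """
    sys_sigma = rel_sys_error * np.asarray(x_retrieved, dtype=float)
    return np.asarray(cm_random, dtype=float) + np.diag(sys_sigma ** 2)

# Hypothetical profile and random-error CM (illustrative values only).
x_even = np.array([1.0, 2.0, 4.0])
s_even = np.array([[0.010, 0.002, 0.000],
                   [0.002, 0.020, 0.003],
                   [0.000, 0.003, 0.040]])

s_even_total = add_systematic_cm(x_even, s_even)
```

Only the diagonal of the CM changes; the off-diagonal correlations of the random errors are left untouched. The augmented CM would then replace the original one in the fusion.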

Fig. 6. Left panel: reference profile provided by the simultaneous retrieval (black line) and profile obtained from the complete fusion (blue line) of the profiles retrieved from even and odd spectra, respectively increased and decreased by 2%, when CMs that take into account systematic errors are used. Right panel: differences between the profiles reported in the left panel compared with their errors.

We see in Fig. 6 that taking into account systematic errors in the characterization of the fusing profiles reduces the oscillations by a factor of three (see Fig. 5) and, furthermore, that these are now consistent with the estimated errors. The NDOF of the fused profile reported in Fig. 6 is 23.1, showing that the inclusion of the CM of systematic errors in the fusion process does not significantly reduce the NDOF and leaves room for a further reduction of the oscillations with a subsequent regularization.


4. Conclusion

We presented a new data fusion method that can be used for the combination of two or more atmospheric vertical profiles measured by different instruments in the same location. The algorithm used for the fusion takes into account both the CM and the AKM of the fusing profiles and can be considered to be a generalization of weighted and arithmetic means. In turn, these means can be considered approximations of the new data fusion method, which for its comprehensive approach is referred to as complete fusion. The complete fusion method uses standard retrieval products and has very simple implementation requirements.

We compared the performance of complete fusion with those of weighted and arithmetic means using the measurements of the MIPAS instrument. A MIPAS limb sounding sequence was divided into two complementary data sets and two profiles were independently retrieved from the two sets of observations. The two profiles were fused using complete fusion as well as weighted and arithmetic means. The results of these fusions were compared with the profile retrieved using simultaneously all the observations of the sequence. Experimentally, we observed that complete fusion reproduces very well the results of the simultaneous retrieval from the point of view of values, of error estimates and of NDOF. Weighted and arithmetic means differ from the simultaneous retrieval and are characterized by errors that are smaller than the observed differences. This apparent contradiction is explained, as shown by the analysis of NDOF, by the poor vertical resolution of the weighted and arithmetic means, which prevents an adequate representation of the shape of the profile, so that the differences with respect to the more realistic representation provided by the simultaneous retrieval are significant. Furthermore, in the Appendix we analytically demonstrated that in the linear approximation complete fusion provides the same solution as the simultaneous retrieval. The use of a linear approximation does not mean that we are limiting ourselves to the case of observations with a linear forward model, but that the derivatives of the forward models calculated at different profiles (obtained with either single or simultaneous retrieval) can be assumed to have the same values.

Despite the promising results and its easy implementation, the complete fusion must be used with caution. Problems are observed when systematic errors introduce a bias between the fusing profiles, and in atmospheric measurements systematic errors are often caused by either instrumental differences or space and time variability of the observed species. Tests have shown that a bias causes strong oscillations in the case of complete fusion (a bias of 4% was observed to cause an oscillation of 30%), while it has little or no effect in the case of weighted and arithmetic means. In practice, weighted and arithmetic means, with their low vertical resolution, average the different biases, which in this way do not cause oscillations in the fused profile. On the other hand, the complete fusion, like the simultaneous retrieval, exploits all the altitudes present in the different data sets to improve the vertical resolution and generates oscillations when the contiguous altitudes have different systematic errors. The instability observed in the complete fusion is not a shortcoming of the method but a problem associated with the high vertical resolution of the obtained profile and, given the equivalence of the two methods, also concerns the simultaneous retrieval. We have shown that, taking into account the systematic errors in the CMs of fusing profiles, the instability can be significantly reduced without a reduction of NDOF. For a successful application of complete fusion to operational data, it is important to be aware of the possible effects of systematic errors and to make adequate choices in the procedure settings.
Appendix

We want to show that with some assumptions the solution obtained with the data fusion performed with Eq. (5) coincides with the solution obtained with the simultaneous retrieval. We assume that a linear approximation can be applied to the forward model of each measurement in the range of variability between the solution of the single retrieval and that of the simultaneous retrieval. If we represent the observations (radiances) of a measurement with a vector $\mathbf{y}$ and the forward model with a function $\mathbf{F}(\mathbf{x})$, the relationship between the true profile $\mathbf{x}_{\mathrm{true}}$ and the observations is given by:

$$\mathbf{y} = \mathbf{F}(\mathbf{x}_{\mathrm{true}}) + \boldsymbol{\varepsilon}, \tag{8}$$

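where $\boldsymbol{\varepsilon}$ is the vector containing the experimental errors of the observations, characterized by a CM $\mathbf{S}_y$ given by the mean value of the product of $\boldsymbol{\varepsilon}$ times its transpose. Let’s consider the chi-square function of this measurement

$$\chi^2(\mathbf{x}) = \left(\mathbf{y} - \mathbf{F}(\mathbf{x})\right)^T \mathbf{S}_y^{-1} \left(\mathbf{y} - \mathbf{F}(\mathbf{x})\right) \tag{9}$$

and substitute $\mathbf{y}$ given by Eq. (8):

$$\chi^2(\mathbf{x}) = \left(\mathbf{F}(\mathbf{x}_{\mathrm{true}}) - \mathbf{F}(\mathbf{x}) + \boldsymbol{\varepsilon}\right)^T \mathbf{S}_y^{-1} \left(\mathbf{F}(\mathbf{x}_{\mathrm{true}}) - \mathbf{F}(\mathbf{x}) + \boldsymbol{\varepsilon}\right). \tag{10}$$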

In the case of the assumed linear approximation, we can write the forward model as:

$$\mathbf{F}(\mathbf{x}_{\mathrm{true}}) - \mathbf{F}(\mathbf{x}) = \mathbf{K}\left(\mathbf{x}_{\mathrm{true}} - \mathbf{x}\right), \tag{11}$$

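where $\mathbf{K}$ is the Jacobian of the forward model calculated at a linearization point that has to be as close as possible to the true profile. Since our best estimate of the true profile is the retrieved profile, we calculate the Jacobian at the retrieved profile. Substituting Eq. (11) in Eq. (10) after some calculations we obtain:

$$\chi^2(\mathbf{x}) = \left(\mathbf{x}_{\mathrm{true}} - \mathbf{x}\right)^T \mathbf{K}^T \mathbf{S}_y^{-1} \mathbf{K} \left(\mathbf{x}_{\mathrm{true}} - \mathbf{x}\right) + 2\boldsymbol{\varepsilon}^T \mathbf{S}_y^{-1} \mathbf{K} \left(\mathbf{x}_{\mathrm{true}} - \mathbf{x}\right) + \boldsymbol{\varepsilon}^T \mathbf{S}_y^{-1} \boldsymbol{\varepsilon}. \tag{12}$$

Now we want to express the chi-square function, which in Eq. (12) depends on the quantities that characterize the observations, with the quantities that characterize the retrieved profile, which are the AKM $\mathbf{A}$ and the CM $\mathbf{S}$. We recall from [11] some relationships that characterize the retrieved solution. The errors $\boldsymbol{\sigma}$ of the retrieved profile obtained propagating the errors $\boldsymbol{\varepsilon}$ of the observations through the retrieval process are equal to:

$$\boldsymbol{\sigma} = \left(\mathbf{K}^T \mathbf{S}_y^{-1} \mathbf{K} + \mathbf{S}_a^{-1}\right)^{-1} \mathbf{K}^T \mathbf{S}_y^{-1} \boldsymbol{\varepsilon}, \tag{13}$$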

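where $\mathbf{S}_a$ is the a priori CM used in the retrieval, and consequently the CM of $\boldsymbol{\sigma}$ is given by:

$$\mathbf{S} = \left(\mathbf{K}^T \mathbf{S}_y^{-1} \mathbf{K} + \mathbf{S}_a^{-1}\right)^{-1} \mathbf{K}^T \mathbf{S}_y^{-1} \mathbf{K} \left(\mathbf{K}^T \mathbf{S}_y^{-1} \mathbf{K} + \mathbf{S}_a^{-1}\right)^{-1}. \tag{14}$$

Furthermore, the AKM of the retrieved profile is equal to:

$$\mathbf{A} = \left(\mathbf{K}^T \mathbf{S}_y^{-1} \mathbf{K} + \mathbf{S}_a^{-1}\right)^{-1} \mathbf{K}^T \mathbf{S}_y^{-1} \mathbf{K}. \tag{15}$$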

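Using Eqs. (13)-(15) we now obtain:

$$\boldsymbol{\varepsilon}^T \mathbf{S}_y^{-1} \mathbf{K} = \boldsymbol{\sigma}^T \mathbf{S}^{-1} \mathbf{A} \tag{16}$$

and (as also demonstrated in [15]):

$$\mathbf{K}^T \mathbf{S}_y^{-1} \mathbf{K} = \mathbf{A}^T \mathbf{S}^{-1} \mathbf{A}. \tag{17}$$

Substituting Eqs. (16) and (17) into Eq. (12) we obtain:

$$\chi^2(\mathbf{x}) = \left(\mathbf{x}_{\mathrm{true}} - \mathbf{x}\right)^T \mathbf{A}^T \mathbf{S}^{-1} \mathbf{A} \left(\mathbf{x}_{\mathrm{true}} - \mathbf{x}\right) + 2\boldsymbol{\sigma}^T \mathbf{S}^{-1} \mathbf{A} \left(\mathbf{x}_{\mathrm{true}} - \mathbf{x}\right) + \boldsymbol{\varepsilon}^T \mathbf{S}_y^{-1} \boldsymbol{\varepsilon}. \tag{18}$$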
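The identities in Eqs. (16) and (17) can be checked numerically. The sketch below is a toy verification under stated assumptions: the Jacobian is a random matrix with more observations than levels (so that $\mathbf{K}^T \mathbf{S}_y^{-1} \mathbf{K}$, and therefore $\mathbf{S}$, is invertible), and all dimensions and covariance values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs, n_lev = 12, 5                 # hypothetical numbers of observations and levels

K = rng.normal(size=(n_obs, n_lev))  # Jacobian (full column rank assumed)
S_y = 0.01 * np.eye(n_obs)           # CM of the observations
S_a = np.eye(n_lev)                  # a priori CM
eps = 0.1 * rng.normal(size=n_obs)   # one realization of the observation errors

S_y_inv = np.linalg.inv(S_y)
fisher = K.T @ S_y_inv @ K                           # K^T S_y^-1 K
M_inv = np.linalg.inv(fisher + np.linalg.inv(S_a))   # (K^T S_y^-1 K + S_a^-1)^-1

sigma = M_inv @ K.T @ S_y_inv @ eps  # Eq. (13)
S = M_inv @ fisher @ M_inv           # Eq. (14)
A = M_inv @ fisher                   # Eq. (15)
S_inv = np.linalg.inv(S)

# Eq. (16): eps^T S_y^-1 K = sigma^T S^-1 A (1-D arrays act as row vectors here)
lhs16 = eps @ S_y_inv @ K
rhs16 = sigma @ S_inv @ A

# Eq. (17): K^T S_y^-1 K = A^T S^-1 A
lhs17 = fisher
rhs17 = A.T @ S_inv @ A

print(np.allclose(lhs16, rhs16), np.allclose(lhs17, rhs17))
```

Both comparisons rely on the invertibility of `S`; when a single measurement does not constrain all levels, this toy check does not apply as written.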


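Rearranging Eq. (18) and exploiting the relationship between the $\boldsymbol{\alpha}$ vector and the true profile, introduced in Section 2:

$$\boldsymbol{\alpha} = \mathbf{A}\mathbf{x}_{\mathrm{true}} + \boldsymbol{\sigma}, \tag{19}$$

we obtain:

$$\chi^2(\mathbf{x}) = \left(\boldsymbol{\alpha} - \mathbf{A}\mathbf{x}\right)^T \mathbf{S}^{-1} \left(\boldsymbol{\alpha} - \mathbf{A}\mathbf{x}\right) - \boldsymbol{\sigma}^T \mathbf{S}^{-1} \boldsymbol{\sigma} + \boldsymbol{\varepsilon}^T \mathbf{S}_y^{-1} \boldsymbol{\varepsilon}. \tag{20}$$

The equivalence of Eq. (9) and Eq. (20) for the calculation of the chi-square function is now used to demonstrate the equivalence between the simultaneous retrieval and the data fusion approach presented in Section 2. Given $N$ independent observations $\mathbf{y}_i$ ($i = 1, 2, \ldots, N$), characterized by the CMs $\mathbf{S}_{yi}$ and by the forward models $\mathbf{F}_i(\mathbf{x})$, the solution of the simultaneous retrieval is obtained minimizing the following cost function:

$$g(\mathbf{x}) = \sum_{i=1}^{N} \left(\mathbf{y}_i - \mathbf{F}_i(\mathbf{x})\right)^T \mathbf{S}_{yi}^{-1} \left(\mathbf{y}_i - \mathbf{F}_i(\mathbf{x})\right) + \left(\mathbf{x} - \mathbf{x}_a\right)^T \mathbf{S}_a^{-1} \left(\mathbf{x} - \mathbf{x}_a\right), \tag{21}$$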

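where $\mathbf{x}_a$ and $\mathbf{S}_a$ are an a priori profile and its CM that may be used for the conditioning of the solution. Substituting the terms in the summation using Eqs. (9) and (20), we obtain:

$$g(\mathbf{x}) = \sum_{i=1}^{N} \left(\boldsymbol{\alpha}_i - \mathbf{A}_i\mathbf{x}\right)^T \mathbf{S}_i^{-1} \left(\boldsymbol{\alpha}_i - \mathbf{A}_i\mathbf{x}\right) + \left(\mathbf{x} - \mathbf{x}_a\right)^T \mathbf{S}_a^{-1} \left(\mathbf{x} - \mathbf{x}_a\right) - \sum_{i=1}^{N} \boldsymbol{\sigma}_i^T \mathbf{S}_i^{-1} \boldsymbol{\sigma}_i + \sum_{i=1}^{N} \boldsymbol{\varepsilon}_i^T \mathbf{S}_{yi}^{-1} \boldsymbol{\varepsilon}_i. \tag{22}$$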

We can see that the first two terms of g(x) coincide with the cost function c(x) given in Eq. (4). Since the last two terms of g(x) do not depend on x, they disappear in the differentiation and the x at which g(x) is minimum, that is the simultaneous retrieval solution, coincides with the x at which c(x) is minimum, that is the data fusion solution. This demonstrates that in the linear approximation the two procedures provide the same solution.
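The equivalence can also be illustrated with a small numerical experiment. The sketch below is a toy check, not the MIPAS processing chain: it assumes strictly linear forward models with random Jacobians, two measurements that each have more observations than levels (so that every $\mathbf{S}_i$ is invertible), and the same a priori for the single retrievals and for the fusion. The fused profile is obtained by setting to zero the gradient of the first two terms of $g(\mathbf{x})$; all names and numerical values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_lev = 6                                    # hypothetical number of levels
x_true = rng.uniform(1.0, 5.0, size=n_lev)   # hypothetical true profile
x_a = np.full(n_lev, 3.0)                    # a priori profile
S_a_inv = np.linalg.inv(4.0 * np.eye(n_lev)) # inverse of the a priori CM

def single_retrieval(K, S_y_inv, y):
    """Linear retrieval returning alpha, the AKM and the inverse CM."""
    fisher = K.T @ S_y_inv @ K
    M_inv = np.linalg.inv(fisher + S_a_inv)
    x_hat = x_a + M_inv @ K.T @ S_y_inv @ (y - K @ x_a)
    A = M_inv @ fisher                         # Eq. (15)
    S = M_inv @ fisher @ M_inv                 # Eq. (14)
    alpha = x_hat - (np.eye(n_lev) - A) @ x_a  # equals A x_true + sigma, Eq. (19)
    return alpha, A, np.linalg.inv(S)

# Normal equations of the simultaneous retrieval and of the fusion.
sim_mat, sim_vec = S_a_inv.copy(), S_a_inv @ x_a
fus_mat, fus_vec = S_a_inv.copy(), S_a_inv @ x_a

for n_obs in (10, 12):                         # two independent measurements
    K = rng.normal(size=(n_obs, n_lev))        # random Jacobian (assumption)
    S_y_inv = np.eye(n_obs) / 0.05 ** 2
    y = K @ x_true + 0.05 * rng.normal(size=n_obs)

    sim_mat += K.T @ S_y_inv @ K               # terms from Eq. (21)
    sim_vec += K.T @ S_y_inv @ y

    alpha, A, S_inv = single_retrieval(K, S_y_inv, y)
    fus_mat += A.T @ S_inv @ A                 # terms from the first sum of Eq. (22)
    fus_vec += A.T @ S_inv @ alpha

x_simultaneous = np.linalg.solve(sim_mat, sim_vec)
x_fused = np.linalg.solve(fus_mat, fus_vec)
print(np.max(np.abs(x_fused - x_simultaneous)))
```

In this linear setting the two sets of normal equations coincide up to rounding, so the printed difference is expected to be at the level of floating-point error.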
