Interactive comment on “ Establishment of a regional precipitable water vapor model based on the combination of GNSS and ECMWF data

Abstract. Water vapor is the engine of the weather. Owing to its large latent energy, the phase changes of water vapor significantly affect the vertical stability, structure and energy balance of the atmosphere. Many techniques are used for measuring the water vapor in the atmosphere such as radiosondes, Global Navigation Satellite System (GNSS) and water vapor radiometer (WVR). In addition, the method that uses European Centre for Medium-range Weather Forecasts (ECMWF) data is an important method for studying the variations in precipitable water vapor (PWV). This paper used both GNSS PWV and ECMWF PWV to establish a city-level local PWV fusion model using a Gaussian Processes method. The results indicate that by integrating the precipitable water vapor obtained from GNSS and ECMWF data, the accuracy of fusion PWV is improved by 1.89 mm in active tropospheric conditions and 2.61 mm in quiescent tropospheric conditions compared with ECMWF-PWV, reaching 3.87 mm and 3.97 mm, respectively. Furthermore, the proposed fusion model is used to study the spatial and temporal distribution of PWV in Hong Kong. It is found that the accumulation of PWV corresponds to monsoon and rainfall events.



Introduction
Water vapor is a highly variable component in the atmosphere and plays a key role in many atmospheric processes.Accurate measurement of water vapor is vital for improving the predictability of regional precipitation, weather and visibility, especially for a highly moist metropolis such as Hong Kong (Chen and Liu, 2014).Many techniques are used for water vapor measurement in the atmosphere, such as radiosondes, ground-or space-based water vapor radiometers, Global Navigation Satellite System (GNSS) and other meteorological methods.
Radiosonde can accurately measure the water vapor, but its high operating cost restricts its applications in short-term weather forecasting.Its temporal and spatial resolution is quite poor (Guerova, 2003), usually with a 12-h observation interval.Since GNSS meteorology was first proposed by Bevis et al. (1992) as an approach for sounding the atmospheric water vapor by using ground-based receivers, extensive investigations based on batch processing have been conducted in the past two decades (Rocken et al., 1997;Ohtani and Naito, 2000;Hagemann et al., 2003;Braun., 2004;Gendt et al., 2004).GNSS has several significant advantages, including a low operating cost, all-weather availability, and high spatiotemporal resolution (Lu et al., 2015).Various studies have proven that GNSS can provide accurate water vapor estimates comparable to the measurements obtained from meteorological sensors in both postprocessing and near-real-time modes (Gendt et al., 2004;Haan et al., 2004;Gutman et al., 2004;Elgered et al., 2005;Nilsson and Elgered, 2008).However, the uneven distribution of ground GNSS stations has resulted in limited PWV coverage in marine regions and other remote areas.The ECMWF produces the highest level of short-term numerical weather forecast in the world and can provide global water vapor data 4 times a day (Annamalai et al., 1999;Huang et al., 2006;Renfrew et al.2002;Bromwich et al., 2004).Because of the consistency and homogenous spatial coverage of ECMWF data, they play an increasingly important role in regional weather forecasting and are being increasingly studied by scholars (Flentje et al., 2007;Ye et al., 2007;Zhang et al., 2009;Bock et al., 2010).The high-precision ECMWF reanalysis product, ERA-Interim, does not assimilate ground-based GNSS observations and extends back to 1979 (Dee et al., 2011), thereby maintaining good continuity.
Additional applications are concerned with validating PWV reanalysis products with GNSS observations (Vey et al., 2010).Although each water vapor measurement method has its advantages and disadvantages, the data are usually used alone.Only a few efforts have been devoted to investigating the modeling of multi-source precipitable water vapor data.In this paper, using both ECMWF and GNSS data, we aim to establish a local PWV fusion model.It is expected to obtain PWV field with higher accuracy and higher horizontal resolution, which is more suitable for weather analysis.The fusion is conducted with Gaussian Processes, and the results of using multisource data are validated with the radiosonde data and ground-based GNSS data.
In addition, the spatial and temporal distribution of PWV in Hong Kong is analyzed based on the proposed fusion model, which is a preliminary exploration of model application.

GNSS PWV
GNSS methods dedicated to estimating the PWV and are now well developed and commonly applied (Rocken et al., 1997;Ohtani and Naito, 2000;Hagemann et al., 2003;Braun, 2004).This technique is based on estimating the tropospheric delay by using a Global Navigation Satellite System (GNSS) with a combination of surface pressure and temperature.
The study is based on ground-based GNSS measurements of PWV from the Hong Kong Satellite Positioning Reference Station Network (SatRef).
Figure 1 shows the location map of the SatRef network continuously operating reference stations (CORS), and the radiosonde station is marked with a five-pointed star.
The Hong Kong SatRef network consists of 15 continuously operating reference stations equipped with Leica GNSS receivers and antennas (Figure 1).Each station (except T430) is equipped with an automatic meteorological device to record the temperature, pressure and relative humidity.With these data, the hydrostatic components of the tropospheric delay can be accurately estimated.The mean horizontal Once the zenith troposphere delay is obtained through the PPP, the precipitable water vapor can be calculated.A brief description of the computation procedure of the estimation of PWV is given below: The temperature, pressure and relative humidity recorded by meteorological devices at 14 tracking stations are used to calculate Zenith Hydrostatic Delay (ZHD).
Therefore, ZHD can be calculated from the surface pressure Ps (mbar), latitude φ (radians) and ellipsoidal height Hs (km) using the equation given by Saastamoinen et al (1972):

2.2768
/ (1 0.00266cos 2 0.00028 ) ZWD is obtained by subtracting the ZHD from the ZTD.Subsequently, the precipitable water vapor can be calculated from the Zenith wet delay (ZWD) and dimensionless proportional constant  .ZWD is converted into PWV using the following expression (Wang et al., 2005):    pressure, temperature and relative humidity at various altitudes using the balloon-borne platform.In this paper, the data from the Hong Kong radiosonde station at 0:00 and 12:00 UTC during two weeks in July and August of 2015 were selected.By using these meteorological parameters, the PWV at the radiosonde station is calculated according to formula (4).Since the radiosonde can measure PWV with an accuracy of a few millimeters, PWV measurements derived from radiosonde data are often used as an accuracy standard to evaluate the water vapor data from other independent sensors (Niell et al., 2001;Adeyemi et al., 2012).Because of its expensive operating cost, radiosonde data have a low temporal data rate, which limits their applications in shortterm weather forecasting.
A commonly used quantity in meteorology is the precipitable water vapor calculated by an integration method, which is defined as Here, Rw=R/Mw, and ρ1 (the liquid water density) is chosen to be 1000 kg/m3.

Local PWV model using multisource data
Fitting the PWV values obtained by different methods of observation using an appropriate model can produce a local PWV model.The method used in this paper is polynomial fitting through a Gaussian Processes (GP) model (Xia.et Where ( , ) is the mean function, which is used to describe the expected pwv value at i v .The term  accounts for the measurement error, and is assumed to follow a normal distribution with 0 mean and 2


 variance, i.e. ( , ) is the covariance of the pwv value at locations To reduce the fitting coefficients, the PWV involved in modelling have been height(m)-reduced to the earth surface using coefficients obtained from Ref (Means, 2011).The reduction equation is as follows: /2697 0 () In this work, we used a quadratic polynomial to represent the mean function of the GP model.The polynomial function model is expressed as follows.
Where n1 denotes the number of reference GNSS stations and (n2-n1) denotes ECMWF grid points respectively, and the subscript i denotes the index of the reference stations.PWVi is the surface precipitable water vapor at the ith station.(Bi, Li) are the latitude and longitude of the station.(a0,a1,……a5) are six fitting coefficients.
In addition, we used the squared exponential function to represent the covariance of the GP model: where is the Euclidean distance between locations vi and vj in the plane, 2 pwv  is the constant variance of the GP model and l is the characteristic length-scale.In practice, according to Eq. ( 8), pwv values that lie closer together on the plane (regardless of where they are located) are likely to be more similar.The squared exponential is one of the most popular choices for GP models because it yields positive definite correlation matrices, enables the proper convergence of the statistical estimation algorithms and can model smooth and infinitely differentiable functions (Rasmussen et al., 2010).
The parameters

Results
In this section, verification of PWV fusion model is conducted.The precision of the ECMWF reanalysis products is first evaluated.Two case studies concerning active and quiescent troposphere conditions are analyzed to assess the performance of the PWV fusion model, which is also used to study the spatial and temporal variation of PWV over Hong Kong.
To verify the contribution of GNSS material, the PWV data derived from the radiosonde station and some of the CORS stations are utilized to evaluate the precision of the calculated PWV values.The PWV accuracy for each site is expressed as the bias and RMS error of the difference between the calculated PWV and the reference PWV.
The optimum criterion is defined as follows: The integrated precipitable water vapor at the radiosonde station and several GNSS-derived PWV are used to assess the accuracy of the ECMWF PWV estimates and the fusion PWV values.Due to the inconsistent locations of the grid points of ECMWF products and the radiosonde station, the ECMWF-PWV in each grid are interpolated to the radiosonde station before comparison.However, because of the complex topography with large undulations in Hong Kong, the elevation differences between the radiosonde station and the ECMWF grid points are significant, and the extracted PWV values of the ECMWF products could not be suitable for any reliable comparison with the radiosonde PWV values.Therefore, to overcome the bias between the datasets due to elevation differences, the PWV from the ECMWF are reduced to the height of the radiosonde station using the exponential function illustrated in Eq. ( 6).

Case study 1: active troposphere condition
The local water vapor fusion modeling is performed at 0:00 and 12:00 UTC on 7 consecutive days (DOY201-207) in July.Firstly, with 25 ECMWF grids and 7-CORSstation network configurations as data sources, the PWV fitting coefficients are determined through Gaussian Processes for machine learning; using these coefficients, the PWV values at the radiosonde station are obtained after height reduction.As an independent external reference, the ZTD derived from radiosonde and GNSS data processing at CORS stations that are not involved in the modeling are used to assess the precipitable water vapor fusion model.
Intercomparisons have been conducted between the techniques for PWV time series measurements.The deviations of the PWV residuals between the radiosonde and calculated PWV (ECMWF and fusion model) are presented in Figure 2, and the mean bias and RMS information are shown in Table 1.These improvements indicate that adding GNSS water vapor data to ECMWF PWV reanalysis products helps improve the accuracy and reliability of PWV.

Case study 2: quiescent troposphere condition
The second case study describes a stable troposphere period during 7 consecutive days (DOY213-219) in August, when the daily sunshine duration reaches 5. shown in Table 3.During DOY213~219, except at DOY 217 UTC 12:00, the overall bias still appears to be positive.Compared with radiosonde-PWV, the bias of ECMWF-PWV fluctuates from -3.66 mm to 11.22 mm.While mean bias of the calculated PWV is 5.37 mm, and the mean RMS reaches 6.58 mm, which is quite unreliable in the fair-weather period.
In addition, the bias shows an upward trend during DOY 218~219, especially on 219/12, when the bias of ECMWF-PWV exceeds 10 mm, which may because that the precision of ECMWF data decreased these days.
With the fusion PWV model, the bias fluctuates from -3.16~8.10mm, with a mean bias of 2.73 mm and a mean RMS of 3.97 mm.Compared to the ECMWF estimations, the results for the calculated PWV are considerably more reliable, showing a 2.61 mm RMS improvement.Therefore, introducing GNSS observations into the meteorological reanalysis product has a stabilizing effect on the PWV fusion model, with a precision improvement of approximately 3 mm relative to the previous ECMWF products.
Similar to the processing and strategies in the first case study, the PWV derived from GNSS data processing on CORS stations that are not involved in the modeling are used to assess the PWV fusion model.

Discussion
In order to study the meteorological application of PWV fusion model, the spatialtemporal PWV variability as a function of topography and climatic differences will be discussed in this section.The spatial resolution of PWV distribution presented in this part is much higher than that of ECMWF data or GNSS data.For example, the PWV    The weather condition during DOY201~207 is relatively active compared with those on the preceding and following days.Several severe rainfall events occur on these days, indicating an accumulating phase of the troposphere ZWD.
Variations in precipitable water vapor correspond to the meteorological phenomena during wet and dry weather.The rainfall in Hong Kong on DOY201 is 46.2 mm, increases to 191.3 mm on DOY203 and decreases to 5.7 mm in the following days.
Accordingly, a first upward and then downward trend is identified in PWV variation during that week in July.An evident cause of the high PWV values on those days is the longer period of rainfall.
In contrast, the PWV during the following sunny days of DOY213~219, however, is approximately 30 mm less than the value of the previous rainy days.On days when the sunshine duration reaches 11 h, such as DOY202, the PWV is less than 50 mm for the entire area.
These figures show that the PWV time series are affected by the variations of the rainfall on a broad scale.In more detail, the PWV time series in HKPC CORS station on DOY 203 is analyzed, accompanied by rainfall information recorded by nearby PEN meteorological station, as shown in Figure 10.It can be seen that the PWV maintained an upward trend from 0:00 to 6:00.In the meanwhile, the continuous rainfall occurred from 4:00 on.An accumulating phase of PWV can also be identified during 17:00~19:00.However, there is no rainfall event in this period.Therefore, we note that intense rainfalls are always associated with an increase in PWV, while a PWV accumulation is not necessarily accompanied by instant significant precipitation.
According to the high complexity of the processes that are conducive to rainfall (Hally vapor content and that the accumulation of precipitable water vapor in the atmosphere does not mean that there will always be instant rainfall.

Conclusions
The ECMWF meteorological reanalysis product can provide precipitable water vapor at a global scale, but ECMWF reanalysis has significant errors in the PWV field.
To improve the accuracy and reliability of the PWV estimations, this paper proposes a local PWV fusion method that assimilates multiple sources of water vapor data through Gaussian processes.By integrating GNSS PWV with ECMWF PWV, a precipitable water vapor fusion model with high spatiotemporal resolution and higher precision is established.
The proposed method has been evaluated by the Hong Kong radiosonde station under active and quiescent troposphere conditions for DOY 2015 201~207 and 213~219.As an external reference and partial data source for modeling, 14 days of GNSS observation data from the SatRef Network are processed by the precise point positioning (PPP) module in the Bernese 5.0 software to establish the PWV fusion model.With respect to radiosonde-derived PWVs, the fusion-modeled PWVs present an accuracy of 3.87 mm in active troposphere conditions and 3.97 mm in stable troposphere conditions, which are significantly better than the conventional ECMWF models (i.e., 5.76 mm in active period or 6.58 mm in quiescent period).The accuracy and spatial resolution of the PWV model have been improved to some extent by introducing the GNSS data.
In addition, the proposed PWV fusion model is used to study the spatial-temporal variation of precipitable water vapor over Hong Kong.Affected by the monsoon, PWV tends to be higher in the north and lower in the south during the testing period.In Atmos.Meas.Tech.Discuss., https://doi.org/10.5194/amt-2018-227Manuscript under review for journal Atmos.Meas.Tech.Discussion started: 23 July 2018 c Author(s) 2018.CC BY 4.0 License.
Atmos.Meas.Tech.Discuss., https://doi.org/10.5194/amt-2018-227Manuscript under review for journal Atmos.Meas.Tech.Discussion started: 23 July 2018 c Author(s) 2018.CC BY 4.0 License.distance between stations is approximately 10 km, and the ellipsoidal heights of the 15 stations are within 350 m.GNSS observation data from the SatRef Network are processed by the precise point positioning (PPP) module in the Bernese 5.0 software (Astronomical Institute of the University of Bern, Bern, Switzerland) (Dach et al., 2007).
the weighted mean temperature of the atmosphere, ρ is the density of water, and Rv is the gas constant for the water vapor.Tm (Kelvin) is given by the GPT2w model(Böhm et al., 2014).

Figure 1 .
Figure 1.The SatRef network in Hong Kong
the geometry model described by Eqs.(5)-(8) are all unknown and must be estimated from the actual measurement data PWV (vi).The fitting of GP models was implemented in this paper based on the code developed byRasmussen et al. (2010).Once the parameter estimation is complete, the knowledge of the mean and covariance functions make it possible to estimate the value of the function pwv(v) at any new location v in the plane.To investigate the contribution of GNSS observations, this paper uses the PWV derived from 7 CORS tracking station (HKOH, HKPC, HKST, HKSS, HKSL, HKTK, and HKWS) with uniform distribution and uninterrupted observations.Adding the water vapor data to ECMWF PWV reanalysis products provide a certain help to improve their accuracy and reliability.In this paper, we consider two different situations with active and quiescent troposphere conditions on days of year (DOYs) 201~207 and 213~219 in 2015, respectively.The weather condition on DOY201~207 is relatively active compared with those on the preceding and following days.Several severe rainfall events occurred on these days, with the largest daily rainfall (~190 mm) in 2015 on 22 July, indicating an accumulating phase of the troposphere ZWD.The following days of DOY213~219, however, all happened to be sunny days.For each case, we first determine the PWV fitting coefficients using Gaussian Processes by inputting GNSS-PWV on several CORS stations and ECMWF-PWV at grid points, and we then assess the performance of the PWV fusion model.Atmos.Meas.Tech.Discuss., https://doi.org/10.5194/amt-2018-227Manuscript under review for journal Atmos.Meas.Tech.Discussion started: 23 July 2018 c Author(s) 2018.CC BY 4.0 License.
Atmos.Meas.Tech.Discuss., https://doi.org/10.5194/amt-2018-227Manuscript under review for journal Atmos.Meas.Tech.Discussion started: 23 July 2018 c Author(s) 2018.CC BY 4.0 License.between ECMWF PWV and GNSS PWV at CORS station locations at UTC12:00 on DOY 203 is presented in Figure 3 (left), with the difference of the fusion model shown in Figure 3 (right).This example in Figure 3 shows that the PWVs derived from the fusion model are more consistent with the PWV solved by GNSS.

Figure 3 .
Figure 3. Differences (mm) between PWVs by ECMWF (left) and fusion model (right) versus GNSS processing on the day 203 UTC 12:00 7 h ~ 11.4 Atmos.Meas.Tech.Discuss., https://doi.org/10.5194/amt-2018-227Manuscript under review for journal Atmos.Meas.Tech.Discussion started: 23 July 2018 c Author(s) 2018.CC BY 4.0 License.h.Similarly, the local precipitable water vapor fusion modeling is performed at 0:00 and 12:00 UTC each day.The deviations of the PWV residuals between radiosonde and PWV calculated by ECMWF and fusion model with 7-CORS-station network configurations are presented in Figure 4, and the mean bias and RMS information are

Figure 4 .
Figure 4. Bias versus radiosonde of ECMWF and Fusion model in August Figure 5 (right).The example in Fig 5 shows that the PWV data derived from the fusion model are more consistent with the PWV solved by GNSS.

Figure 5 .
Figure 5. Differences (mm) between PWVs by ECMWF (left) and fusion model (right) versus GNSS processing on the day 214 UTC 12:00 distribution of ECMWF, CORS, and fusion model at DOY 201 UTC 00:00 are displayed in the Figure7.The fusion model can reflect the detailed PWV distribution in the areas marked with red ellipse.However, when only looking at ECMWF data (or only the GNSS data), the spatial feature is relatively coarse.In order to apply PWV to meteorological analysis the PWV is calculated with the proposed PWV fusion model at a spatial resolution of 1"×1" in longitude and latitude at 0:00 and 12:00 UTC during (DOY) 201~207 and 213~219, respectively.To study the spatial distribution considering the PWV variation with the terrain, all PWV values are reduced to the earth surface.Hong Kong's topography is shown in Figure6.

Figure 6 .
Figure 6.Topography in Hong Kong

Figure 7 .Figure 8 Figure 8 .
Figure 7. PWV distribution at DOY 201 UTC 00:00Figure8shows the precipitable water vapor values for July, and Figure9shows the PWV distribution for August.

Figure 10 .
Figure 10.Precipitable water vapor time series and rainfall at HKPC Atmos.Meas.Tech.Discuss., https://doi.org/10.5194/amt-2018-227Manuscript under review for journal Atmos.Meas.Tech.Discussion started: 23 July 2018 c Author(s) 2018.CC BY 4.0 License.addition, the PWV values are significantly higher during the long period of rainfall than they are in fine weather.The accuracy and reliability of the local PWV model are improved compared with those of ECMWF products.This paper has only considered the GNSS PWV data.If more sufficient data can be obtained, further efforts will also be considered to assimilate water vapor data observed via multiple techniques (e.g., GNSS, WVR) into the ECMWF reanalysis product and thereby further enhance the ECMWF's weather forecasting performance in the future.
2.1.2.ECMWF PWV To compare PWV measurements from the latest reanalysis product with radiosondes, surface PWV data of Hong Kong from ERA-Interim (Dee et al., 2011), covering the time period July-August 2015, are considered.In order to incorporate

Table 1
External precision of PWV fusion models versus radiosonde in July . Meas.Tech.Discuss., https://doi.org/10.5194/amt-2018-227Manuscriptunderreviewforjournal Atmos.Meas.Tech.Discussion started: 23 July 2018 c Author(s) 2018.CC BY 4.0 License.Figure 2. Bias of ECMWF and Fusion model in July by 1.89 mm from the perspective of RMS relative to the previous ECMWF products.Furthermore, the fusion model maintains high external precision and stability in active tropospheric conditions.The PWV derived by the fusion model and GNSS observations at CORS stations that are not involved in modeling are also compared, and the average statistical results of the CORS network are presented in Table2.In addition, a typical PWV difference

Table 2
External precision of PWV fusion models versus CORS stations in July

Table 3
External precision of PWV fusion models versus radiosonde in August

Table 4
presents the mean precision over the inspection station network for each epoch.Similarly, a typical PWV difference between ECMWF PWV and GNSS PWV at the CORS station locations at UTC12:00 on DOY

Table 4
External precision of PWV fusion models versus CORS in August Atmos.Meas.Tech.Discuss., https://doi.org/10.5194/amt-2018-227Manuscript under review for journal Atmos.Meas.Tech.Discussion started: 23 July 2018 c Author(s) 2018.CC BY 4.0 License.derived water vapor data have higher accuracy than ECMWF water vapor data implies that the assimilation of water vapor data observed from multiple techniques (e.g., GNSS, WVR, and radiosonde) into the ECMWF can further enhance the ECMWF's weather forecasting performance.As such, these measurements will be an important component of the fusion model and will enhance the precipitable water vapor precision in meteorological research.