Atmospheric Measurement Techniques An interactive open-access journal of the European Geosciences Union

https://doi.org/10.5194/amt-2017-300
© Author(s) 2017. This work is distributed under
the Creative Commons Attribution 4.0 License.
Research article
27 Sep 2017
Review status
This discussion paper is a preprint. It is a manuscript under review for the journal Atmospheric Measurement Techniques (AMT).
Evaluation of linear regression techniques for atmospheric applications: The importance of appropriate weighting
Cheng Wu1,2 and Jian Zhen Yu3,4,5
1Institute of Mass Spectrometer and Atmospheric Environment, Jinan University, Guangzhou 510632, China
2Guangdong Provincial Engineering Research Center for on-line source apportionment system of air pollution, Guangzhou 510632, China
3Division of Environment, Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong, China
4Atmospheric Research Centre, Fok Ying Tung Graduate School, Hong Kong University of Science and Technology, Nansha, China
5Department of Chemistry, Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong, China
Abstract. Linear regression techniques are widely used in atmospheric science, but they are often improperly applied because measurement uncertainty is either ignored or handled inappropriately. In this work, numerical experiments are performed to evaluate the performance of five linear regression techniques, significantly extending previous work by Chu and Saylor. The regression techniques tested are Ordinary Least Squares (OLS), Deming Regression (DR), Orthogonal Distance Regression (ODR), Weighted ODR (WODR), and York regression (YR). We first introduce a new data generation scheme that employs the Mersenne Twister (MT) pseudorandom number generator. The numerical simulations are also improved by (a) refining the parameterization of non-linear measurement uncertainties, (b) including a linear measurement uncertainty, and (c) including WODR for comparison. Results show that DR, WODR and YR produce an accurate slope, but the intercepts from WODR and YR are overestimated, and the bias is more pronounced for XY datasets with a low R². The importance of a proper weighting parameter λ in DR is investigated by sensitivity tests, which show that an improper λ can lead to bias in both the slope and intercept estimates. Because the λ calculation depends on the actual form of the measurement error, it is essential to determine the exact form of the measurement error in the XY data during the measurement stage. With knowledge of the appropriate weighting, DR, WODR and YR are recommended for atmospheric studies when both x and y data have measurement errors.
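
The role of λ in DR can be illustrated with a minimal Python sketch. It is not the authors' Igor program. It uses the standard closed-form Deming slope and assumes the convention that λ is the ratio of the y-error variance to the x-error variance, which this page does not specify. The function names (`ols_fit`, `deming_fit`, `noisy_line`) and all parameter values are hypothetical and chosen only for illustration.

```python
import math
import random


def _moments(x, y):
    """Return the means and the centered sums of squares and cross-products."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return mx, my, sxx, syy, sxy


def ols_fit(x, y):
    """Ordinary least squares of y on x; treats x as error-free."""
    mx, my, sxx, _, sxy = _moments(x, y)
    slope = sxy / sxx
    return slope, my - slope * mx


def deming_fit(x, y, lam=1.0):
    """Deming regression.

    Assumed convention: lam = var(y error) / var(x error).
    """
    mx, my, sxx, syy, sxy = _moments(x, y)
    d = syy - lam * sxx
    slope = (d + math.sqrt(d * d + 4.0 * lam * sxy * sxy)) / (2.0 * sxy)
    return slope, my - slope * mx


def noisy_line(n, slope, intercept, sd_x, sd_y, seed=0):
    """Hypothetical test data: a true line with Gaussian errors on x and y."""
    rng = random.Random(seed)  # CPython's random module is a Mersenne Twister
    xs, ys = [], []
    for _ in range(n):
        x_true = rng.uniform(0.0, 10.0)
        xs.append(x_true + rng.gauss(0.0, sd_x))
        ys.append(slope * x_true + intercept + rng.gauss(0.0, sd_y))
    return xs, ys


if __name__ == "__main__":
    x, y = noisy_line(5000, slope=2.0, intercept=1.0, sd_x=1.0, sd_y=1.0)
    print("OLS              :", ols_fit(x, y))
    print("Deming, lam=1    :", deming_fit(x, y, lam=1.0))
    print("Deming, lam=100  :", deming_fit(x, y, lam=100.0))
```

In this setup the errors on x and y have equal variance, so λ = 1 is the appropriate weighting. OLS ignores the x errors and its slope is attenuated, while a λ far larger than the true ratio pushes the Deming slope back toward the OLS value.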

Citation: Wu, C. and Yu, J. Z.: Evaluation of linear regression techniques for atmospheric applications: The importance of appropriate weighting, Atmos. Meas. Tech. Discuss., https://doi.org/10.5194/amt-2017-300, in review, 2017.

Model code and software

Scatter Plot
C. Wu
https://doi.org/10.5281/zenodo.832417
Aethalometer data processor
C. Wu
https://doi.org/10.5281/zenodo.832403
Histbox
C. Wu
https://doi.org/10.5281/zenodo.832405

Latest update: 20 Oct 2017
Short summary
A new data generation scheme that employs the Mersenne Twister (MT) pseudorandom number generator is proposed to conduct benchmark tests on a variety of linear regression techniques. With appropriate weighting, Deming Regression (DR), Weighted ODR (WODR), and York regression (YR) are recommended for atmospheric studies when both x and y data have measurement errors. An Igor-based program is developed to facilitate the regression implementation.
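
As a rough illustration of an MT-based data generation scheme, the sketch below draws reproducible synthetic XY data with Python's standard `random` module, whose core generator is the Mersenne Twister. It is not the scheme used in the paper: the proportional ("linear") uncertainty model, the function names `generate_xy` and `r_squared`, and all parameter values are assumptions made for this example.

```python
import random


def generate_xy(n, slope, intercept, rel_unc_x, rel_unc_y, seed):
    """Generate n (x, y) pairs around a true line.

    Assumed error model: Gaussian noise whose standard deviation is a
    fixed fraction (rel_unc_x, rel_unc_y) of the true value.
    """
    rng = random.Random(seed)  # seeded Mersenne Twister, so runs are repeatable
    data = []
    for _ in range(n):
        x_true = rng.uniform(1.0, 10.0)
        y_true = slope * x_true + intercept
        x_obs = x_true + rng.gauss(0.0, rel_unc_x * x_true)
        y_obs = y_true + rng.gauss(0.0, rel_unc_y * y_true)
        data.append((x_obs, y_obs))
    return data


def r_squared(data):
    """Squared Pearson correlation of the observed x and y values."""
    n = len(data)
    mx = sum(x for x, _ in data) / n
    my = sum(y for _, y in data) / n
    sxx = sum((x - mx) ** 2 for x, _ in data)
    syy = sum((y - my) ** 2 for _, y in data)
    sxy = sum((x - mx) * (y - my) for x, y in data)
    return sxy * sxy / (sxx * syy)


if __name__ == "__main__":
    low_noise = generate_xy(500, 2.0, 0.0, 0.1, 0.1, seed=42)
    high_noise = generate_xy(500, 2.0, 0.0, 0.3, 0.3, seed=42)
    print("R2, 10% uncertainty:", r_squared(low_noise))
    print("R2, 30% uncertainty:", r_squared(high_noise))
```

Fixing the seed makes each benchmark dataset repeatable, and varying the relative uncertainties gives datasets with different R², which is the kind of control a regression benchmark needs.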