Simulations of atmospheric pollutants, and of individual trace gases in particular, are currently constrained by the insufficient resolution of key remote sensing products, which limits their reliability. This study combines spatial sampling and parameter convolution to optimize LightGBM using ground observations, remote sensing products, meteorological data, auxiliary data, and random IDs. Using these techniques to simulate atmospheric pollutant sequences, we obtained seamless daily PM2.5 products at 1 km resolution for most parts of China from 2015 to 2018. Validation by random sampling, random site sampling, specific-area validation, comparison of different models, and comparison with other studies indicates that the simulated spatial distribution of atmospheric pollutants is reliable.
| Collection period | 2018/03/19 - 2020/12/31 |
|---|---|
| Collection location | China |
| Data size | 40.4 GiB |
| Data format | gz and GeoTIFF |
| Coordinate system | WGS84 |
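Because the files are distributed as gz and GeoTIFF, a download will typically need to be decompressed before a GIS tool can open it. The following is a minimal sketch using only the Python standard library. It assumes each `.gz` file wraps a single file, and the helper name `gunzip` and any file names passed to it are hypothetical, not part of the dataset documentation.

```python
import gzip
import shutil
from pathlib import Path


def gunzip(src, dst=None):
    """Decompress a single .gz file and return the path of the output file.

    If dst is not given, the trailing ".gz" suffix is dropped from src
    (for example "a.tif.gz" becomes "a.tif").
    """
    src = Path(src)
    dst = Path(dst) if dst is not None else src.with_suffix("")
    # Stream the decompressed bytes to disk so large rasters
    # are never held in memory all at once.
    with gzip.open(src, "rb") as f_in, open(dst, "wb") as f_out:
        shutil.copyfileobj(f_in, f_out)
    return dst
```

The decompressed file can then be opened with any GeoTIFF-capable reader, using the WGS84 coordinate system listed above.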
The data used in this study include daily ground monitoring data for PM2.5 in China, together with remote sensing data, meteorological data, and auxiliary data.
A universal multi-pollutant machine learning model based on random IDs, spatial sampling, parameter convolution, and other methods can better account for multiple factors when predicting changes in atmospheric pollutant concentrations and improves the estimation of pollutant spatial distribution. We evaluate the model results using cross-validation (CV) and qualitative visual analysis, and compare LightGBM, LSTM, and RF with our model to assess their performance. Finally, we use SHAP to interpret the model output.
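The difference between validating on random samples and validating on random sites can be illustrated with a small sketch. This is not the authors' code: the function names, the fold count, and the site identifiers are assumptions made for illustration. Sample-based folds shuffle individual records, while site-based folds hold out whole monitoring sites, so the model is scored at locations it never saw during training.

```python
import random


def sample_based_folds(n_samples, k, seed=0):
    """Randomly assign each record index to one of k folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def site_based_folds(site_ids, k, seed=0):
    """Assign whole sites to folds, so that no site contributes
    records to more than one fold."""
    sites = sorted(set(site_ids))
    random.Random(seed).shuffle(sites)
    fold_of_site = {s: i % k for i, s in enumerate(sites)}
    folds = [[] for _ in range(k)]
    for row, s in enumerate(site_ids):
        folds[fold_of_site[s]].append(row)
    return folds


# Hypothetical input: 6 monitoring sites with 4 daily records each.
site_ids = [f"site_{i}" for i in range(6) for _ in range(4)]
folds = site_based_folds(site_ids, k=3)
```

In each CV round one fold serves as the validation set and the remaining folds as training data. Site-based folds usually give a more conservative estimate of how well the model predicts at unmonitored locations.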
In sample-based cross-validation, the model achieves an R² of 0.88 and a root mean square error (RMSE) of 9.91 µg/m³ for PM2.5. The SHapley Additive exPlanations (SHAP) method was used to clarify the roles of the different parameters in the simulation and confirmed the positive effect of parameter convolution.
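For reference, the two reported metrics can be computed from paired observed and predicted concentrations as sketched below. The sketch assumes that R² denotes the coefficient of determination (some studies instead report the squared Pearson correlation), and the input values are made up for illustration.

```python
import math


def rmse(observed, predicted):
    """Root mean square error, in the same units as the inputs."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Made-up PM2.5 values in µg/m³, for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
print(rmse(observed, predicted), r_squared(observed, predicted))
```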
This work is licensed under a Creative Commons Attribution 4.0 International License.
| # | title | file size |
|---|---|---|
| 1 | _ncdc_meta_.json | 4.4 KiB |
| 2 | 2018 | |
| 3 | 2019 | |
| 4 | 2020 | |
| # | category | title | author | year |
|---|---|---|---|---|
| 1 | paper | Sequential spatiotemporal distribution of PM$_{2.5}$ | Y. Chi, Y. Zhan, K. Wang, H. Ye | 2023 |
Air pollutants, machine learning, model optimization, spatial distribution products of air pollutants, SHAP
©Copyright 2005-. Northwest Institute of Eco-Environment and Resources, CAS.
Donggang West Road 320, Lanzhou, Gansu, China (730000)

