Machine-learning-based model and simulation analysis of PM2.5 concentration prediction in Beijing

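  • Summary: Temperature, humidity, and wind speed data at multiple pressure levels from 8 sites around Beijing, together with PM2.5 pollution data for Beijing, were analyzed and normalized. A back propagation (BP) neural network, a convolutional neural network (CNN), and a long short-term memory (LSTM) model were built and trained on these meteorological and pollution data. The training results show that the BP and CNN models predict the PM2.5 pollution level for the next 1 h with low accuracy, whereas the LSTM model achieves higher accuracy. The PM2.5 values predicted by the LSTM model for the next 1 h are very close to the observed values, indicating that PM2.5 pollution in Beijing is closely related to meteorological conditions in its surrounding areas. Training the LSTM model on meteorological data from different pressure levels and comparing the results shows that, when predicting pollution from meteorological data, using near-surface data alone is more accurate than using data from multiple levels.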

     

    Abstract: In recent years, the air quality in China has become a matter of serious concern. Among the available indicators for evaluating air quality, PM2.5 is one of the most important. It comprises a complex mixture of extremely small particles and liquid droplets emitted into the air, whose diameters are no more than 2.5 μm. Environments with a high PM2.5 index are extremely harmful to human health. Once inhaled, these particles can affect the heart and lungs and cause serious health problems. Air pollution is closely related to meteorological conditions such as wind speed, wind direction, atmospheric stability, temperature, and air humidity. With the development of various machine learning methods, deep learning models based on neural networks are increasingly applied in air pollution research. In this study, temperature, humidity, and wind velocity data at different pressure altitudes from 8 locations around Beijing, together with average PM2.5 data for Beijing, were analyzed and normalized. Such multi-dimensional data are well suited to machine learning methods. Three neural network models were built, namely the back propagation (BP), convolutional neural network (CNN), and long short-term memory (LSTM) models, and were trained using the meteorological and PM2.5 data. The results indicate that the accuracies of the back propagation and convolutional neural network models in predicting the PM2.5 pollution level in the next hour are much lower than that of the long short-term memory model. The PM2.5 pollution index predicted for the next hour by the long short-term memory model is very close to the actual value. This result reveals the strong relationship between the PM2.5 pollution index of Beijing and the meteorological conditions in its surrounding areas. The long short-term memory model was also trained using meteorological data from different pressure altitudes and was found to predict pollution levels more accurately when using near-surface meteorological data alone than when using data from multiple altitudes.
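The data preparation described above (normalizing the inputs and pairing past meteorological readings with the PM2.5 value one hour ahead) might be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the abstract does not say which normalization was used, so min-max scaling is assumed; the helper names `min_max_normalize` and `make_windows`, the 6-hour window, the 24-column layout (8 sites with 3 variables at a single level), and the random stand-in data are all hypothetical.

```python
import numpy as np

def min_max_normalize(x):
    """Scale each column of a 2-D array to [0, 1] (assumed normalization)."""
    x = np.asarray(x, dtype=float)
    lo = x.min(axis=0)
    span = x.max(axis=0) - lo
    span = np.where(span == 0, 1.0, span)  # constant columns map to 0
    return (x - lo) / span

def make_windows(features, target, window):
    """Pair each run of `window` hourly feature rows with the target
    value one hour after the last row of that run."""
    n = len(features) - window
    X = np.stack([features[i:i + window] for i in range(n)])
    y = target[window:window + n]
    return X, y

# Synthetic stand-in data; the real inputs would be the observed records.
rng = np.random.default_rng(0)
hours, n_features = 200, 24  # hypothetical: 8 sites x 3 variables, one level
weather = rng.normal(size=(hours, n_features))
pm25 = rng.uniform(5, 300, size=hours)

weather_n = min_max_normalize(weather)
pm25_n = min_max_normalize(pm25.reshape(-1, 1)).ravel()

# X has shape (samples, window, features), the usual layout for sequence models.
X, y = make_windows(weather_n, pm25_n, window=6)
print(X.shape, y.shape)
```

Arrays shaped this way can be passed to a sequence model such as an LSTM. Comparing near-surface inputs against multi-level inputs would then amount to changing which columns are included in `weather`.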

     
