Abstract:
To address two persistent limitations in photovoltaic (PV) power forecasting, namely insufficient predictive accuracy and excessive computational cost, this study presents a lightweight network, ETRCN, which couples ETALinear with a two-stage residual correction strategy. The design features a one-dimensional convolutional front end with frozen parameters. This component performs a trend–residual decomposition of the raw multivariate time series, isolating slowly varying diurnal patterns from high-frequency fluctuations driven by transient weather. Complementing this decomposition, an improved lightweight temporal attention mechanism computes dynamic feature weights that rescale the inputs across time, ensuring that informative lags and meteorological channels receive proportionally greater influence while noisy or less relevant inputs are attenuated. The attention-refined trend and residual streams are additively fused to produce an initial estimate. To further reduce bias without increasing model size, the framework adopts hierarchical error mitigation: a convolutional gated unit first corrects temporally structured long-horizon residuals, and a compact multilayer perceptron subsequently calibrates the remaining system-level errors. The final prediction is the sum of the initial estimate and the corrected residual terms. This modular pipeline, consisting of decomposition, attention-based feature scaling, and staged error correction, prioritizes interpretability, stable optimization, and efficiency.

Methodologically, each design choice targets a known bottleneck in PV forecasting. Freezing the parameters of the decomposition convolutional neural network (CNN) eliminates backpropagation overhead and stabilizes training in the initial stage. The temporal attention block is intentionally lightweight to limit latency on commodity hardware, and the two corrective modules are narrow and task-specific, improving accuracy with a minimal increase in parameter count.
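The decomposition–attention–fusion pipeline described above can be illustrated with a minimal NumPy sketch. This is not the paper's ETRCN implementation: the frozen one-dimensional convolution is stood in for by a fixed moving-average kernel, and the lightweight temporal attention by a simple softmax rescaling over time steps; both are assumptions made for illustration only.

```python
import numpy as np

def frozen_decompose(x, k=5):
    """Trend-residual split via a frozen (non-trainable) averaging kernel.

    x: (T,) series. The fixed moving-average kernel stands in for the
    paper's frozen 1-D convolutional front end (illustrative choice).
    """
    kernel = np.ones(k) / k                        # fixed weights, never updated
    pad = k // 2
    xp = np.pad(x, pad, mode="edge")
    trend = np.convolve(xp, kernel, mode="valid")  # slowly varying diurnal part
    resid = x - trend                              # high-frequency weather-driven part
    return trend, resid

def temporal_attention(x):
    """Lightweight attention: softmax scores over time rescale each lag."""
    scores = np.exp(x - x.max())
    w = scores / scores.sum()         # attention weights sum to 1
    return x * w * len(x)             # rescale, roughly preserving overall scale

# Additive fusion of the attention-refined streams into an initial estimate
T = 64
t = np.arange(T)
x = np.sin(2 * np.pi * t / 24) + 0.1 * np.random.default_rng(0).standard_normal(T)
trend, resid = frozen_decompose(x)
initial = temporal_attention(trend) + temporal_attention(resid)
```

Because the kernel is frozen, the decomposition contributes no trainable parameters and no backpropagation cost, which is the efficiency argument made above.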
Because PV generation is shaped by periodic solar geometry, intermittent cloud cover, and abrupt ramp events, a decomposition-first approach separates the regular from the irregular, thereby simplifying the learning problem for the subsequent modules. The attention mechanism aligns the model capacity with the most informative time intervals. Finally, the residual correctors capture, in a computationally efficient manner, the local autocorrelation and global nonlinear biases that remain after additive fusion.

Extensive experiments confirm concurrent gains in accuracy and efficiency. Using the root mean square error (RMSE) as the primary metric, the proposed model attains 82.54 kW in comparisons with the convolutional neural network–bidirectional long short-term memory–attention mechanism (CNN–BiLSTM–AM), long short-term memory (LSTM), grey wolf optimizer–gated recurrent unit (GWO–GRU), and temporal convolutional network–bidirectional gated recurrent unit (TCN–BiGRU) baselines, corresponding to error reductions of approximately 23.1%, 17.1%, 22.6%, and 42.8%, respectively. The goodness of fit is similarly strong, with an R² of 0.922; the associated increments in R² are approximately 1.1% for CNN–BiLSTM, 6.9% for LSTM, 10.3% for GWO–GRU, and 8.3% for TCN–BiGRU. Furthermore, the proposed model outperforms the other methods by a significant margin, in terms of both error reduction and model fit, under complicated weather conditions: an RMSE of 99.348 kW represents reductions of 31.8% relative to CNN–BiLSTM, 25.0% relative to TCN–BiGRU, 21.7% relative to LSTM, and 16.4% relative to GWO–GRU, while an R² of 0.876 represents improvements of 5.7%, 3.5%, 2.7%, and 2.3% over the same four baselines, respectively. These accuracy gains are achieved with a parameter budget of only 17.68 K, corresponding to reductions of approximately 98.0%, 99.6%, 99.7%, and 48.6% relative to CNN–BiLSTM–AM, LSTM, TCN–BiGRU, and GWO–GRU, respectively. Inference is also efficient: a single forward pass completes in 0.235 s, an acceleration of approximately 81.6% relative to the slowest baseline (LSTM). Taken together, these results show that the proposed architecture improves predictive fidelity and computational efficiency simultaneously, a combination that is seldom achieved.

Ablation and component-wise analyses confirm the contribution of each module. The frozen CNN decomposition yields a cleaner separation between low-frequency seasonality and high-frequency disturbances, which translates into more stable gradients and faster downstream convergence. The lightweight temporal attention layer consistently improves calibration by increasing the contribution of time steps that align with known PV dynamics while reducing the influence of erratic inputs. The first residual corrector, based on convolutional gating, efficiently models short-range temporal dependencies in the residual sequence, whereas the second corrector, a small multilayer perceptron, removes the remaining systematic bias with negligible parameter overhead. Notably, these improvements do not rely on deep stacks or large hidden dimensions; rather, they emerge from architectural parsimony and a careful division of labor among specialized blocks.

From an engineering perspective, the framework is readily deployable. The small parameter footprint reduces memory usage and energy consumption, enabling on-premise inference at PV plants or substation controllers.
The short prediction latency supports quasi-real-time tasks such as dispatch scheduling, reserve allocation, and curtailment decisions. Moreover, the modular structure offers operational flexibility: operators can disable the second-stage corrector to prioritize ultra-low latency, or replace the attention block with a static weighting scheme for hardware-constrained environments, without retraining the entire model. Overall, by combining decomposition-first modeling, lightweight temporal attention, and hierarchical error correction in a rigorously efficient implementation, the proposed method delivers a practically meaningful balance of accuracy, speed, and interpretability for modern PV power forecasting.
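The two-stage residual correction and the reported metrics (RMSE and R²) can be sketched in NumPy as follows. The gating and MLP weights here are random stand-ins for learned parameters, so the sketch illustrates the data flow of the hierarchical correction, not the trained model; all function names and sizes are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def conv_gate(err, k=3):
    """Stage 1: convolutional gating over the error sequence.

    A causally padded convolution is modulated by a sigmoid gate; weights
    are random here (in the paper they would be learned).
    """
    w_f = rng.normal(scale=0.5, size=k)                 # filter branch
    w_g = rng.normal(scale=0.5, size=k)                 # gate branch
    e = np.pad(err, (k - 1, 0))                         # causal left-padding
    f = np.convolve(e, w_f, mode="valid")
    g = 1.0 / (1.0 + np.exp(-np.convolve(e, w_g, mode="valid")))
    return f * g                                        # gated short-range correction

def mlp_calibrate(err, hidden=8):
    """Stage 2: compact per-step MLP removing remaining system-level bias."""
    w1 = rng.normal(scale=0.5, size=(1, hidden))
    w2 = rng.normal(scale=0.5, size=(hidden, 1))
    h = np.maximum(err[:, None] @ w1, 0.0)              # ReLU hidden layer
    return (h @ w2).ravel()

def rmse(y, yhat):
    return float(np.sqrt(np.mean((y - yhat) ** 2)))

def r2(y, yhat):
    ss_res = np.sum((y - yhat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Final prediction = initial estimate + staged residual corrections
T = 48
initial = rng.normal(size=T)                            # synthetic initial estimate
target = initial + 0.3 * rng.normal(size=T)             # synthetic ground truth
err1 = target - initial
c1 = conv_gate(err1)                                    # corrects structured residuals
c2 = mlp_calibrate(err1 - c1)                           # calibrates what remains
final = initial + c1 + c2
```

Keeping both correctors narrow (a k-tap gate and a single small hidden layer) mirrors the parsimony argument above: the correction stages add accuracy with only a handful of extra parameters.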