Abstract:
This study addresses two persistent limitations in photovoltaic (PV) power forecasting, namely insufficient predictive accuracy and excessive computational cost, by presenting a lightweight network, ETRCN, which couples ETALinear with a two-stage residual correction strategy. The design begins with a one-dimensional convolutional front end whose parameters are frozen. This component performs a trend–residual decomposition of the raw multivariate time series, isolating slowly varying diurnal patterns from high-frequency fluctuations driven by transient weather. On top of this decomposition, an improved, lightweight temporal attention mechanism computes dynamic feature weights that rescale inputs across time, so that informative lags and meteorological channels receive proportionally greater influence while noisy or less relevant inputs are attenuated. The attention-refined trend and residual streams are additively fused to produce an initial estimate. To further reduce bias without inflating model size, the framework adopts hierarchical error mitigation: a convolutional gated unit first corrects temporally structured long-horizon residuals, and a compact multilayer perceptron subsequently calibrates remaining system-level errors. The final prediction is the sum of the initial estimate and both corrected residual terms. This modular pipeline of decomposition, attention-based feature scaling, and staged error correction prioritizes interpretability, stable optimization, and efficiency.
Methodologically, each design choice targets a known bottleneck in PV forecasting. Freezing the parameters of the decomposition CNN eliminates backpropagation overhead and stabilizes training in the earliest stage; the temporal attention block is intentionally lightweight to limit latency on commodity hardware; and the two corrective modules are narrow and task-specific, improving accuracy with minimal growth in parameter count. Because PV generation is shaped by periodic solar geometry, intermittent cloud cover, and abrupt ramp events, a decomposition-first approach separates the regular from the irregular and simplifies the learning problem for the subsequent modules. The attention mechanism then aligns model capacity with the most informative time intervals. Finally, the residual correctors capture local autocorrelation and global nonlinear biases that remain after additive fusion, but do so in a computationally frugal manner.
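To make the described data flow concrete, the following is a minimal PyTorch sketch of the pipeline. All layer sizes, module names (TrendDecomposer, TemporalAttention), the uniform smoothing kernel, and the corrector widths are illustrative assumptions rather than the authors' exact ETRCN implementation; the sketch only mirrors the stated structure of frozen convolutional decomposition, attention-weighted additive fusion, and two staged residual corrections summed into the final prediction.

```python
# Illustrative sketch of the described pipeline; sizes and names are assumptions.
import torch
import torch.nn as nn

class TrendDecomposer(nn.Module):
    """Frozen depthwise 1D convolution acting as a moving-average trend extractor."""
    def __init__(self, channels: int, kernel_size: int = 25):
        super().__init__()
        self.conv = nn.Conv1d(channels, channels, kernel_size,
                              padding=kernel_size // 2, groups=channels, bias=False)
        nn.init.constant_(self.conv.weight, 1.0 / kernel_size)  # uniform smoothing
        for p in self.conv.parameters():
            p.requires_grad = False  # frozen: no backpropagation through this stage

    def forward(self, x):                     # x: (batch, channels, time)
        trend = self.conv(x)
        residual = x - trend                   # high-frequency fluctuations
        return trend, residual

class TemporalAttention(nn.Module):
    """Lightweight attention producing one weight per time step."""
    def __init__(self, channels: int):
        super().__init__()
        self.score = nn.Conv1d(channels, 1, kernel_size=1)

    def forward(self, x):                      # x: (batch, channels, time)
        w = torch.softmax(self.score(x), dim=-1)   # (batch, 1, time)
        return x * w                               # rescale inputs across time

class ETRCN(nn.Module):
    def __init__(self, channels: int, seq_len: int, horizon: int):
        super().__init__()
        self.decompose = TrendDecomposer(channels)
        self.attn_trend = TemporalAttention(channels)
        self.attn_resid = TemporalAttention(channels)
        self.head = nn.Linear(channels * seq_len, horizon)   # initial estimate
        # Stage 1: convolutional gated unit over the residual stream
        self.gate_conv = nn.Conv1d(channels, 2 * channels, kernel_size=3, padding=1)
        self.corr1 = nn.Linear(channels * seq_len, horizon)
        # Stage 2: compact MLP calibrating remaining system-level error
        self.corr2 = nn.Sequential(nn.Linear(horizon, 32), nn.ReLU(),
                                   nn.Linear(32, horizon))

    def forward(self, x):                      # x: (batch, channels, time)
        trend, resid = self.decompose(x)
        fused = self.attn_trend(trend) + self.attn_resid(resid)  # additive fusion
        y0 = self.head(fused.flatten(1))
        a, b = self.gate_conv(resid).chunk(2, dim=1)
        gated = torch.tanh(a) * torch.sigmoid(b)                  # gated unit
        r1 = self.corr1(gated.flatten(1))
        r2 = self.corr2(y0 + r1)
        return y0 + r1 + r2                    # initial estimate plus both corrections
```

For instance, `ETRCN(channels=6, seq_len=96, horizon=1)` would map a window of 96 multivariate readings to a one-step forecast; the actual dimensions used in the study are not specified in this section.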
Extensive experiments confirm concurrent gains in accuracy and efficiency. Using root mean square error (RMSE) as the primary metric, the proposed model attains 82.54 kW. Relative comparisons indicate substantial error reductions: approximately 23.1% against CNN-BiLSTM-AM, 17.1% against LSTM, 22.6% against GWO-GRU, and 42.8% against TCN-BiGRU. Goodness of fit is similarly strong, with an R² of 0.922; the associated increments in R² are about 1.1% over CNN-BiLSTM, 6.9% over LSTM, 10.3% over GWO-GRU, and 8.3% over TCN-BiGRU. In a further set of comparisons, the proposed model again leads by a clear margin in both error reduction and model fit: an RMSE of 99.348 kW represents reductions of 31.8% relative to CNN-BiLSTM, 25.0% relative to TCN-BiGRU, 21.7% relative to LSTM, and 16.4% relative to GWO-GRU, and the corresponding R² of 0.876 represents improvements of 5.7%, 3.5%, 2.7%, and 2.3% over CNN-BiLSTM, TCN-BiGRU, LSTM, and GWO-GRU, respectively. These accuracy gains are achieved with a parameter budget of only 17.68 K, reductions of roughly 98.0%, 99.6%, 99.7%, and 48.6% relative to CNN-BiLSTM-AM, LSTM, TCN-BiGRU, and GWO-GRU, respectively. Inference is also efficient: a single forward pass completes in 0.235 s, a reduction of about 81.6% in inference time relative to the slowest baseline (LSTM). Taken together, the results show that the proposed architecture improves both predictive fidelity and computational frugality, a combination that is seldom achieved simultaneously.
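For transparency about how the relative figures above relate to the absolute RMSE, the short snippet below assumes the standard convention that a reduction is computed as (RMSE_baseline − RMSE_proposed) / RMSE_baseline; this formula is an assumption, since the text does not state it explicitly. Under it, the implied baseline RMSEs can be recovered from the reported numbers.

```python
# Recover implied baseline RMSEs from the reported reductions, assuming
# reduction = (rmse_baseline - rmse_proposed) / rmse_baseline.
rmse_proposed = 82.54  # kW, as reported in the text
for name, reduction in [("CNN-BiLSTM-AM", 0.231), ("LSTM", 0.171),
                        ("GWO-GRU", 0.226), ("TCN-BiGRU", 0.428)]:
    implied_baseline = rmse_proposed / (1.0 - reduction)
    print(f"{name}: implied baseline RMSE = {implied_baseline:.1f} kW")
```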
Ablation and component-wise analyses reinforce the contribution of each module. The frozen CNN decomposition yields a cleaner separation between low-frequency seasonality and high-frequency disturbances, which translates into more stable gradients and faster convergence downstream. The lightweight temporal attention layer consistently sharpens calibration by elevating the contribution of time steps that align with known PV dynamics while reducing the influence of erratic inputs. The first residual corrector based on convolutional gating efficiently models short-range temporal dependencies in the residual sequence, and the second corrector, a small multilayer perceptron, removes residual systematic bias without incurring significant parameter overhead. Notably, these improvements do not rely on deep stacks or large hidden dimensions; rather, they emerge from architectural parsimony and a careful division of labor among specialized blocks. From an engineering perspective, the framework is readily deployable. The small parameter footprint reduces memory usage and energy consumption, enabling on-premise inference at PV plants or substation controllers. The short prediction latency supports quasi-real-time tasks such as dispatch scheduling, reserve allocation, and curtailment decisions. Moreover, the modular structure facilitates operational flexibility: operators can disable the second-stage corrector to prioritize ultra-low latency, or replace the attention block with a static weighting scheme for hardware-constrained environments, without retraining the entire model. Overall, by uniting decomposition-first modeling, lightweight temporal attention, and hierarchical error correction within a rigorously efficient implementation, the proposed method delivers a practically meaningful balance of accuracy, speed, and interpretability for modern PV power forecasting.
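As an illustration of the operational flexibility noted above, the following sketch reuses the illustrative ETRCN class from the earlier example; the toggling interface is hypothetical, but it shows how the second-stage corrector could be skipped at inference time without retraining the rest of the model.

```python
# Hypothetical inference helper: optionally bypass the second-stage corrector
# for ultra-low-latency operation, reusing the illustrative ETRCN modules.
import torch

@torch.no_grad()
def predict(model, x: torch.Tensor, use_stage2: bool = True) -> torch.Tensor:
    trend, resid = model.decompose(x)
    fused = model.attn_trend(trend) + model.attn_resid(resid)
    y0 = model.head(fused.flatten(1))                      # initial estimate
    a, b = model.gate_conv(resid).chunk(2, dim=1)
    r1 = model.corr1((torch.tanh(a) * torch.sigmoid(b)).flatten(1))
    if not use_stage2:
        return y0 + r1       # ultra-low-latency mode: skip the MLP calibrator
    return y0 + r1 + model.corr2(y0 + r1)
```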