1. INTRODUCTION
An integrated energy system (IES) can diversify energy supply, improve the comprehensive utilization efficiency of different kinds of energy, and play a positive role in environmental protection. Load forecasting is a prerequisite for the reliable and economic operation of an energy system. The multi-energy loads in an IES are coupled with one another, and user-level IES loads change frequently, which places higher requirements on IES multi-energy load forecasting (MELF). As the energy transition accelerates, research on MELF is steadily increasing.
At present, MELF research mainly concerns deterministic forecasting, which can be divided into two categories according to the number of forecasting models used. The first category is single-model prediction. Methods of this kind usually apply correlation analysis, such as Copula theory [1], the Pearson coefficient [2] or grey correlation analysis [3], to extract the influencing factors most strongly correlated with the multi-energy loads. These factors form the original feature set of the forecasting model. On this basis, a convolutional neural network (CNN) [2], an encoder-decoder model based on the long short-term memory (LSTM) network [4], or a gated recurrent unit (GRU) [5] is used to extract further features. Traditional machine learning algorithms widely used in power load forecasting are also applied to MELF, such as the generalized regression neural network [2], support vector regression (SVR) [2], gradient boosting decision tree (GBDT) [4], extreme learning machine (ELM) [5] and least squares support vector machine (LSSVM) [6]. In recent years, deep learning, including the deep belief network (DBN) [7] and the LSTM network [8], has been increasingly applied to MELF owing to its good learning performance and generalization ability.
The second category is the combined forecasting method.
These methods mostly use decomposition algorithms, such as wavelet packet decomposition [9] and quadratic mode decomposition [10], to decompose the multi-energy loads of an IES into different components. Different methods, such as the recurrent neural network (RNN) [9], the deep bidirectional LSTM (DBiLSTM) network and multiple linear regression (MLR) [10], are then adopted to predict each component separately.
In summary, considerable progress has been made in MELF research, and the prediction accuracy of combined forecasting models is generally higher than that of single forecasting models. The light gradient boosting machine (LightGBM) also performs well in power load forecasting [11]. This paper therefore constructs a weighted combination forecasting method from an LSTM network, a CNN model and a LightGBM model optimized by harmony search (HS) [12]. Different features composed of load data, meteorological information and calendar information are input into these models. The weight coefficients are determined by the inverse root mean square error (IRMSE), and the weighted combination of the loads predicted by the three models gives the final MELF values. The experimental results show that the presented model outperforms both the single models and the simple average combination model.
2. MELF BASED ON LSTM NETWORK
2.1. LSTM network
The LSTM network is an improved RNN [13] that can learn long-term dependencies in sequences and overcome the vanishing and exploding gradient problems of the traditional RNN. The structure of the LSTM cell is shown in Figure 1, where $t$ is the current time; $f_t$, $i_t$, $o_t$, $g_t$ and $c_t$ represent the forget gate, input gate, output gate, candidate cell state and cell state at time $t$, respectively; $x_t$ is the current input; $h_{t-1}$ is the output at time $t-1$; $\sigma$ is the sigmoid function; and $\odot$ denotes the element-wise product of two vectors.
2.2. Construction of forecasting model based on LSTM
For the LSTM forecasting model, the input and output data must be set first.
Assuming that the IES load (electric, cooling and heat load) at time $t$ on day $d$ is to be predicted, the input data consist of the 16-dimensional features at time $t$ on each of the previous 7 days ($d-7$, …, $d-1$). The features comprise IES loads, meteorological information and calendar information, as shown in Table 1, where DNI, DHI and GHI are abbreviations for direct normal irradiance, diffuse horizontal irradiance and global horizontal irradiance. The input sequence length of the LSTM network is therefore 7, and the feature dimension at each time step is 16. For the working day type in Table 1, a working day is coded as 1 and any other day as 0. For the holiday type, a holiday is coded as 1 and any other day as 0.
Table 1. Feature set of LSTM.
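The input layout described above can be sketched in plain Python. This is only an illustration under stated assumptions: the function name `build_samples` is hypothetical, the data are assumed to be hourly (24 records per day, as in the case study), and the three loads are assumed to occupy the first three positions of each 16-dimensional feature vector, which the text does not specify.

```python
from typing import List, Tuple

HOURS_PER_DAY = 24   # hourly sampling interval, as stated in the case study
LOOKBACK_DAYS = 7    # days d-7, ..., d-1
N_TARGETS = 3        # assumption: features[:3] are electric, cooling and heat load


def build_samples(
    features: List[List[float]],
) -> Tuple[List[List[List[float]]], List[List[float]]]:
    """Turn a chronological list of hourly feature vectors into LSTM samples.

    Each input is a sequence of length 7 holding the feature vectors observed
    at the same hour of day on the 7 preceding days (oldest first); each
    target holds the three loads at the hour being predicted.
    """
    inputs, targets = [], []
    first_hour = LOOKBACK_DAYS * HOURS_PER_DAY
    for h in range(first_hour, len(features)):
        # Same hour on days d-7, d-6, ..., d-1.
        window = [features[h - k * HOURS_PER_DAY]
                  for k in range(LOOKBACK_DAYS, 0, -1)]
        inputs.append(window)
        targets.append(features[h][:N_TARGETS])
    return inputs, targets
```

With eight days of hourly data, this yields 24 samples, each of shape 7 × 16, which matches the sequence length and feature dimension given in the text.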
The feature values of different dimensions vary dramatically, so min-max normalization is carried out as follows:
\[ x_i^{*} = \frac{x_i - x_i^{\min}}{x_i^{\max} - x_i^{\min}} \tag{1} \]
where $x_i$ and $x_i^{*}$ are the values before and after normalization, and $x_i^{\max}$ and $x_i^{\min}$ are the maximum and minimum values of the $i$th feature, respectively.
The LSTM network structure can be described as follows: the normalized input sequence is sent to the input layer, passed through one or more cascaded hidden layers, and finally mapped to the electric, cooling and heat loads by a fully connected layer.
3. MELF BASED ON CNN MODEL
3.1. CNN model
The CNN is a special feedforward neural network [14] whose structure is similar to that of the multi-layer perceptron (MLP). It has the advantages of equivariant representation, sparse interaction and parameter sharing. A CNN usually consists of three parts: convolution layers, pooling layers and fully connected layers. A convolution layer convolves the input data with multiple kernels and then applies a nonlinear activation function to extract features. A pooling layer reduces the feature dimension by downsampling. The features are then flattened into a one-dimensional vector and sent to the fully connected layers for prediction.
3.2. Construction of forecasting model based on CNN
The input structure of the CNN model differs slightly from that of the LSTM network. The 16-dimensional features of the LSTM input data are arranged into a 4×4 matrix, and the input sequence length (7) is taken as the number of channels. The output is the same as that of the LSTM network.
4. MELF BASED ON LIGHTGBM
4.1. LightGBM model
LightGBM is an efficient histogram-based GBDT model [11]. It adopts gradient-based one-side sampling and a leaf-wise growth strategy with a depth limit, which give it faster training, lower memory consumption and higher accuracy. LightGBM constructs one regression tree at a time by fitting the residual of the previous regression trees.
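The residual-fitting idea can be illustrated with a minimal sketch of plain squared-error gradient boosting on a one-dimensional input, using depth-1 trees (stumps). It is not LightGBM: it has no histogram binning, no one-side sampling and no leaf-wise growth. The names `fit_stump` and `fit_boosting`, the learning rate (shrinkage) and the toy interface are all assumptions made for illustration.

```python
from typing import Callable, List


def fit_stump(xs: List[float], residuals: List[float]) -> Callable[[float], float]:
    """Least-squares depth-1 regression tree on a 1-D input.

    Assumes xs contains at least two distinct values.
    """
    best = None
    for threshold in sorted(set(xs))[:-1]:  # keep the right branch non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def fit_boosting(xs: List[float], ys: List[float],
                 n_trees: int = 50,
                 learning_rate: float = 0.1) -> Callable[[float], float]:
    """Each new tree is fitted to the residual left by the trees before it."""
    base = sum(ys) / len(ys)          # constant initial prediction
    trees = []
    preds = [base] * len(xs)
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        tree = fit_stump(xs, residuals)
        trees.append(tree)
        preds = [p + learning_rate * tree(x) for x, p in zip(xs, preds)]
    # Final prediction: initial constant plus the sum of the shrunken tree outputs.
    return lambda x: base + learning_rate * sum(t(x) for t in trees)
```

The final prediction is an additive combination of the individual tree outputs; the shrinkage factor and the constant initial prediction are common practical additions rather than something stated in the text.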
LightGBM combines the weak learners into a single strong learner during the iterative process by minimizing the loss function, as shown in equation (2):
\[ f(x) = \sum_{i=1}^{N_{tree}} f_i(x) \tag{2} \]
where $x$ is the input feature vector, $f(x)$ denotes the final predicted value, $f_i(x)$ is the output of the $i$th regression tree and $N_{tree}$ is the number of regression trees.
4.2. Construction of forecasting model based on LightGBM
4.2.1. Determination of input and output data
The input and output of the LightGBM model differ from those of the LSTM network. To predict one IES load (electric, cooling or heat load) at time $t$ on day $d$, the input is a 25-dimensional feature vector. The input features comprise the electric, cooling and heat loads at times $t$ and $t-1$ on days $d-1$ and $d-7$, together with the weather forecast information and calendar information at time $t$ on day $d$, as shown in Table 1. The output is the load at time $t$ on day $d$, with dimension 1. A separate LightGBM prediction model therefore needs to be built for each of the electric, cooling and heat loads.
4.2.2. Hyperparameter optimization based on harmony search
When building the LightGBM model, its hyperparameters must be determined. The most important of these are the number of regression trees $N_{tree}$ and the maximum number of leaves per tree $N_{leaf}$. The harmony search (HS) algorithm [12] is therefore used to optimize these two hyperparameters. The HS algorithm is inspired by the improvisation process of musicians: when creating music, musicians repeatedly adjust the tones of their instruments to find the best harmony, which here corresponds to the best set of hyperparameters.
5. COMBINED FORECASTING MODEL
After the predicted values of the electric, cooling and heat loads are obtained from the LSTM network, the CNN model and the LightGBM model, the IRMSE method [15] is adopted to determine the weights of the three models. The calculation process is as follows:
\[ f_{cb} = \sum_{i=1}^{N_{mod}} W_i f_i \tag{3} \]
\[ W_i = \frac{\mathrm{RMSE}_i^{-1}}{\sum_{j=1}^{N_{mod}} \mathrm{RMSE}_j^{-1}} \tag{4} \]
\[ \mathrm{RMSE}_i = \sqrt{\frac{1}{N_v} \sum_{k=1}^{N_v} \left( y_k - \hat{y}_k \right)^2} \tag{5} \]
where $f_{cb}$ is the final predicted value of the electric, cooling or heat load, $f_i$ is the predicted value of the $i$th model, $\mathrm{RMSE}_i$ is the RMSE of the $i$th model, $W_i$ is the weight of the $i$th model and $N_{mod}$ is the number of models, which is 3 in this paper. $y_k$ and $\hat{y}_k$ are the actual and predicted electric, cooling or heat load at the $k$th sample point, and $N_v$ is the number of samples in the validation set. Equation (4) shows that a model with a smaller error receives a larger weight coefficient, which can reduce the error of the combined model and improve the prediction accuracy. The framework of the proposed method is shown in Figure 2.
6. CASE STUDY
6.1. Experimental data and platform
The user-level IES multi-energy load data are from the Tempe campus of Arizona State University [16]. Meteorological data are from the National Renewable Energy Laboratory of the United States [17]. Data from May 25, 2017 to August 31, 2019 are selected as the experimental data, with a sampling interval of 1 hour. The data from August 25 to August 31, 2019 constitute the test set, and the remaining data are randomly divided into a training set and a validation set at a ratio of 4:1. The experimental hardware is an Intel Core i5-4200 CPU with 8 GB of memory. The programs are written in Python, and the models are implemented with the PyTorch and scikit-learn libraries.
6.2. Evaluation indices
To comprehensively evaluate the performance of MELF methods, the mean absolute percentage error (MAPE) and RMSE are selected to evaluate the forecasts of the individual loads and of the integrated load. MAPE is defined as:
\[ \mathrm{MAPE}_j = \frac{1}{m} \sum_{k=1}^{m} \left| \frac{y_k - \hat{y}_k}{y_k} \right| \times 100\% \tag{6} \]
\[ \mathrm{WMAPE} = \sum_{j} \alpha_j \, \mathrm{MAPE}_j \tag{7} \]
where $m$ is the number of samples in the test set and $\mathrm{MAPE}_j$ denotes the MAPE for class $j$ load. WMAPE is the integrated MAPE of the IES loads, which represents the overall performance of MELF, and $\alpha_j$ is the importance ratio of class $j$ load (electric, cooling or heat load).
6.3. Model parameter setting
The parameter settings have a great impact on the MELF performance of each model. In this paper, grid search is used to obtain the optimal hyperparameters of the LSTM network and the CNN model.
The LSTM network has one hidden layer with 16 neurons. The CNN model has two convolution layers with 8 and 16 convolution kernels respectively, each of size 2×2. There is no pooling layer; two fully connected layers with 10 and 3 neurons respectively are used, and the activation function is ReLU. Both the LSTM network and the CNN model are optimized with the Adam algorithm, with a learning rate of 0.01, a batch size of 24 and 100 epochs.
To evaluate the performance of the combined method (denoted IRMSE-LCL) in MELF, it is compared with SVR and MLP. The hyperparameters of LightGBM, SVR and MLP are optimized by the HS algorithm. The relevant parameter settings of HS are shown in Table 2.
Table 2. Parameter setting of HS.
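For readers unfamiliar with HS, the following minimal sketch shows the basic search loop for integer hyperparameters such as the number of trees and the number of leaves. The function name `harmony_search`, the default values of `hms` (harmony memory size), `hmcr` (memory considering rate), `par` (pitch adjusting rate) and `n_iter`, and the pitch-adjustment step of ±1 are placeholder assumptions, not the settings of Table 2. In practice the objective would be the validation error of a model trained with the candidate hyperparameters.

```python
import random
from typing import Callable, List, Sequence, Tuple


def harmony_search(
    objective: Callable[[List[int]], float],
    bounds: Sequence[Tuple[int, int]],
    hms: int = 10,       # harmony memory size (placeholder value)
    hmcr: float = 0.9,   # harmony memory considering rate (placeholder value)
    par: float = 0.3,    # pitch adjusting rate (placeholder value)
    n_iter: int = 200,
    seed: int = 0,
) -> Tuple[List[int], float]:
    """Minimize `objective` over integer vectors within inclusive `bounds`."""
    rng = random.Random(seed)
    # Initialize the harmony memory with random candidate solutions.
    memory = [[rng.randint(lo, hi) for lo, hi in bounds] for _ in range(hms)]
    scores = [objective(h) for h in memory]
    for _ in range(n_iter):
        new = []
        for d, (lo, hi) in enumerate(bounds):
            if rng.random() < hmcr:
                value = rng.choice(memory)[d]          # memory consideration
                if rng.random() < par:
                    value += rng.choice((-1, 1))       # pitch adjustment
                    value = min(hi, max(lo, value))    # stay within bounds
            else:
                value = rng.randint(lo, hi)            # random selection
            new.append(value)
        new_score = objective(new)
        # Replace the worst harmony if the new one is better.
        worst = max(range(hms), key=scores.__getitem__)
        if new_score < scores[worst]:
            memory[worst], scores[worst] = new, new_score
    best = min(range(hms), key=scores.__getitem__)
    return memory[best], scores[best]
```

A call such as `harmony_search(validation_error, [(50, 500), (8, 64)])`, where `validation_error` is a user-supplied function and the bounds are hypothetical, would return the best pair found together with its score.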
The inputs of SVR and MLP are the same as those of the LightGBM model. As with the LightGBM model, the output of the SVR model is one-dimensional, so a separate SVR model must be trained for each of the electric, cooling and heat loads. The output of the MLP model is 3-dimensional, so a single MLP model is trained to predict the electric, cooling and heat loads simultaneously. The SVR model adopts the Gaussian kernel function, and the hyperparameters optimized by HS are the insensitive loss parameter, the penalty coefficient and the kernel function parameter. For the MLP model, the hyperparameters optimized by HS are the number of hidden layer neurons and the L2 norm penalty parameter. The activation function is also ReLU, and the optimization algorithm is Adam with an adaptive learning rate and 500 iterations.
6.4. Result analysis
The IRMSE-LCL model is compared with the single LSTM network, CNN, LightGBM, SVR and MLP models, and with the simple average combination of LSTM, CNN and LightGBM (denoted Avg-LCL). The MAPE, RMSE and integrated MAPE of day-ahead MELF on the test set are shown in Tables 3 and 4. When calculating the integrated MAPE, the importance ratios of the electric, cooling and heat loads are set to 0.4, 0.4 and 0.2 respectively [3].
Table 3. MAPE comparison of different models.
Table 4. RMSE comparison of different models.
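As a rough illustration of how the quantities compared in Tables 3 and 4 are obtained, the sketch below computes RMSE, MAPE, the integrated MAPE and an inverse-RMSE weighted combination of several models' predictions. All function names are hypothetical, the inputs are plain Python lists, and the code is a simplified reading of the definitions in Sections 5 and 6.2 rather than the authors' implementation.

```python
import math
from typing import Dict, List, Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error over paired samples."""
    return math.sqrt(
        sum((y - p) ** 2 for y, p in zip(actual, predicted)) / len(actual))


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error in percent (actual values must be nonzero)."""
    return 100.0 * sum(
        abs((y - p) / y) for y, p in zip(actual, predicted)) / len(actual)


def irmse_weights(validation_rmse: Sequence[float]) -> List[float]:
    """Weights proportional to 1/RMSE, normalized to sum to 1."""
    inverse = [1.0 / e for e in validation_rmse]
    total = sum(inverse)
    return [v / total for v in inverse]


def combine(predictions: Sequence[Sequence[float]],
            weights: Sequence[float]) -> List[float]:
    """Weighted sum of per-model predictions; predictions[i] belongs to model i."""
    return [sum(w * p for w, p in zip(weights, point))
            for point in zip(*predictions)]


def wmape(mape_by_load: Dict[str, float],
          importance: Dict[str, float]) -> float:
    """Integrated MAPE as an importance-weighted sum of per-load MAPEs."""
    return sum(importance[name] * mape_by_load[name] for name in importance)
```

The weights would be computed once per load type from the validation-set RMSE of each model and then applied to the test-set predictions; a simple average combination corresponds to equal weights.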
From Tables 3 and 4, we can conclude that:
The proposed method combines the LSTM network, CNN and LightGBM, so that the respective advantages of the models complement one another. It not only enhances the perception of temporal information but also fully mines the feature information of discontinuous data. Giving more weight to the model with higher prediction accuracy effectively improves the overall prediction performance.
7. CONCLUSION
In this paper, a combined MELF method based on the LSTM network, CNN and LightGBM model is proposed. The inverse root mean square error method is used to weight and combine the prediction results of the three models to obtain the final predicted values. Compared with the single models and the simple average combination model, the proposed model has better prediction accuracy. The next step is to study load decomposition algorithms to further improve the prediction performance for IES loads.
REFERENCES
[1] Ma, J., Gong, W. and Zhang, Z., Adv. Technol. Electr. Eng. Energy, 39, 24–31 (2020).
[2] Luo, F., Zhang, X., Yang, X., Yao, L., Zhu, L. and Qian, M., High Volt. Eng., 47, 23–32 (2021).
[3] Tian, H., Zhang, Z. and Yu, D., Proc. CSU EPSA, 130–7 (2021).
[4] Wang, S., Wang, S., Chen, H. and Gu, Q., Energy, 195, 116964 (2020). https://doi.org/10.1016/j.energy.2020.116964
[5] Sun, X., Li, J., Zeng, B., Gong, D. and Lian, Z., Control Theory A, 38, 63–72 (2021).
[6] Tan, Z., De, G., Li, M., Lin, H., Yang, S., Huang, L. and Tan, Q., J. Clean. Prod., 248, 119252 (2020). https://doi.org/10.1016/j.jclepro.2019.119252
[7] Shi, J., Tan, T., Guo, J., Liu, Y. and Zhang, J., Power Syst. Technol., 42, 698–707 (2018).
[8] Sun, Q., Wang, X., Zhang, Y., Zhang, F., Zhang, P. and Gao, W., Autom. Electr. Power Syst., 45, 63–70 (2021).
[9] Zhu, L., Wang, X., Ma, J., Chen, Q. and Qi, X., Electr. Power Constr., 41, 131–8 (2020).
[10] Chen, J., Hu, Z., Chen, W., Gao, M., Du, Y. and Lin, M., Autom. Electr. Power Syst., 45, 85–94 (2021).
[11] Chen, W., Hu, Z., Yue, J., Du, Y. and Qi, Q., Autom. Electr. Power Syst., 45, 91–7 (2021).
[12] Geem, Z. W., Kim, J. H. and Loganathan, G. V., Simulation, 76(2), 60–8 (2001). https://doi.org/10.1177/003754970107600201
[13] Hochreiter, S. and Schmidhuber, J., Neural Comput., 9, 1735–80 (1997). https://doi.org/10.1162/neco.1997.9.8.1735
[14] Khan, A., Sohail, A., Zahoora, U. and Qureshi, A. S., Artif. Intell. Rev., 53, 5455–516 (2020). https://doi.org/10.1007/s10462-020-09825-6
[15] Moreira, J., Soares, C., Jorge, A. M. and Sousa, J., ACM Comput. Surv., 45, 1001–40 (2012).
[16] Campus metabolism [Online], http://cm.asu.edu/ (2021).
[17] NSRDB data viewer [Online], https://maps.nrel.gov/nsrdb-viewer/ (2021).