
Estimating high-resolution profiles of wind speeds from a global reanalysis dataset using TabNet

Published online by Cambridge University Press:  03 January 2025

Harish Baki*
Affiliation:
Faculty of Civil Engineering and Geosciences, TU Delft, Delft, the Netherlands
Sukanta Basu
Affiliation:
Atmospheric Sciences Research Center, University at Albany, Albany, NY, USA Department of Environmental and Sustainable Engineering, University at Albany, Albany, NY, USA
*
Corresponding author: Harish Baki; Email: h.baki@tudelft.nl

Abstract

The growing demand for global wind power production, driven by the critical need for sustainable energy sources, requires reliable estimation of wind speed vertical profiles for accurate wind power prediction and comprehensive wind turbine performance assessment. Traditional methods relying on empirical equations or similarity theory face challenges due to their restricted applicability beyond the surface layer. Although recent studies have utilized various machine learning techniques to vertically extrapolate wind speeds, they often focus on single levels and lack a holistic approach to predicting entire wind profiles. As an alternative, this study introduces a proof-of-concept methodology utilizing TabNet, an attention-based sequential deep learning model, to estimate wind speed vertical profiles from coarse-resolution meteorological features extracted from a reanalysis dataset. To ensure that the methodology is applicable across diverse datasets, Chebyshev polynomial approximation is employed to model the wind profiles. Trained on the meteorological features as inputs and the Chebyshev coefficients as targets, the TabNet more-or-less accurately predicts unseen wind profiles for different wind conditions, such as high shear, low shear/well-mixed, low-level jet, and high wind. Additionally, this methodology quantifies the correlation of wind profiles with prevailing atmospheric conditions through a systematic feature importance assessment.

Type
Application Paper
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2024. Published by Cambridge University Press

Impact Statement

We applied deep learning in conjunction with Chebyshev polynomials to predict unseen wind profiles from coarse-resolution meteorological features and correlate them with the prevailing atmospheric conditions. The methodology can be extended to different geographical locations and diverse wind profile datasets.

1. Introduction

The demand for global wind power production has experienced a significant increase, driven by the growing recognition of renewable energy sources as an essential solution to combat climate change and the pressing need for a sustainable, low-carbon future (Nagababu et al., 2023). Although offshore wind power technology is still in its initial stages, it is predicted to grow rapidly, which is primarily attributed to offshore wind speeds being higher and more uniform as the distance from the coast increases (Guo et al., 2022). Moreover, technological advancements have facilitated the deployment of the largest wind turbines to date, such as the MySE 16–260 with a 16 MW capacity (New Atlas, 2023), which has a rotor diameter of 260 m and a hub height of 152 m, making it the largest wind turbine, reaching a towering height of 280 m.

For such massive wind turbines, the traditional use of hub height wind speed in estimating power output (IEC, 2005) is insufficient due to varying wind speeds across the rotor plane. To address this, the rotor equivalent wind speed approach considers a wind profile within the rotor swept area, enhancing the reliability and accuracy of power output estimates for such large wind turbines (Wagner et al., 2011; Van Sark et al., 2019). Furthermore, specific meteorological conditions lead to the formation of distinct wind profiles, such as low-level jets (LLJs) during strong stratification, well-mixed profiles during very unstable conditions, and Ekman profiles during neutral conditions (Durán et al., 2020). Atmospheric variables such as wind shear and turbulence intensity associated with these diverse wind profiles significantly influence power production (Elliott and Cadogan, 1990; Wharton and Lundquist, 2012). Beyond power output, wind profiles under different stability conditions also significantly impact turbine loads (Dimitrov et al., 2015; Gutierrez et al., 2017; Park et al., 2015). These findings underscore the importance of analyzing wind profiles in both wind resource assessment and turbine design analysis.

However, characterizing wind profiles across the rotor-swept area has been hindered by the scarcity of observations at high altitudes, as the deployment of wind measurement instruments such as wind masts and lidars is generally cost-prohibitive. Numerous similarity theory-based and empirical equations exist, such as the logarithmic law of the wall, Monin–Obukhov similarity theory, and power law, among others. These have been utilized in wind resource assessment applications to extrapolate near-surface wind speeds from ground meteorological stations (Bañuelos-Ruedas et al., 2010) or satellites (Optis et al., 2021) to various vertical levels. However, these equations often require additional information that is not typically measured by ground stations. Furthermore, some of these equations are only valid within the surface layer (Basu, 2023); yet, the swept areas of contemporary turbines extend well beyond this layer, complicating their application. At present, the most reliable approach for estimating wind profiles is to use mesoscale models. Several such efforts are ongoing, including the Copernicus Regional Reanalysis for Europe (CERRA) (Schimanke et al., 2021), the New European Wind Atlas (NEWA) (Hahmann et al., 2020; Dörenkämper et al., 2020), the Dutch Offshore Wind Atlas (DOWA) (Wijnant et al., 2019), Winds of the North Sea in 2050 (WINS50) (Dirksen et al., 2022), and Solar and Wind at Gray-Zone resolution (Baki et al., 2024a), to name a few. However, mesoscale model simulations require significant computational resources. In addition, the simulated wind speed profiles are sensitive to grid size and physical parameterizations (Baki et al., 2024b).

In recent years, numerous machine learning (ML) studies have explored extrapolating near-surface wind speeds to rotor-swept heights. Mohandes and Rehman (2018) employed deep neural networks (DNNs) to extrapolate wind speeds from lidar measurements at lower heights to 120 m height, showing superior performance over the empirical local wind shear exponent method. Optis et al. (2021) investigated methods to extrapolate near-surface wind speeds from satellite-based wind atlases to hub heights, with ML models, particularly Random Forest (RF), outperforming traditional empirical methods. They highlighted that ML models trained on a limited number of lidars could accurately extrapolate winds at various surrounding locations. Building on this, Liu et al. (2023) used three ML (RF) models to estimate wind speeds at 120 m, 160 m, and 200 m levels, incorporating large-scale weather features from ERA5 reanalysis and wind speed/direction from a remote sensing device. They concluded that including meteorological features significantly improved ML model accuracy compared to the empirical power law method. Yu and Vautard (2022) extended this approach, constructing RF and extreme gradient boosting (XGBoost) models to estimate a gridded dataset of 100 m wind speed using meteorological variables from the ERA5 (fifth-generation European Centre for Medium-Range Weather Forecasts (ECMWF)) reanalysis (Hersbach et al., 2020). However, these ML studies focused on specific extrapolation levels and lacked generalization to entire vertical profiles.

2. Problem statement

In this article, our objective is to explore a deep learning (DL) approach for wind profile estimation, leveraging coarse-resolution meteorological features from public-domain reanalysis data. Traditionally, wind resource assessments involve collecting observations for 1 year using met-masts, sodars, and lidars and extrapolating winds for other years through the measure–correlate–predict (MCP) approach. To emulate this process, our focus is on training a DL model for 1 year and predicting for a different year. As a proof-of-concept, we utilized simulated high-resolution wind profiles from the CERRA reanalysis and coarse-resolution meteorological features from the ERA5 at a specific location. The challenge lies in generalizing the methodology for easy adoption with any datasets, which could be observational data from various instruments.

To achieve this, we initially approximate the CERRA wind profiles using Chebyshev polynomials, representing them with five coefficients. Using these coefficients as targets and the ERA5 meteorological features as inputs, a DL model is trained. While the aforementioned ML models like RF and XGBoost excel in regression problems, they are designed to predict single targets. Given our objective of predicting all coefficients simultaneously to obtain collective wind profile information, we opt for the state-of-the-art TabNet (Arik and Pfister, 2021), an attention-based sequential DL model. Additional details about the data used are provided in Section 3.1, the Chebyshev coefficient estimation is explained in Section 3.2, and the training procedure is outlined in Section 3.3. The results are presented in Section 4, with concluding remarks in Section 5.

3. Data and Methodology

3.1. Data

3.1.1. Wind speed at different height levels

CERRA is a state-of-the-art reanalysis developed through the collaborative efforts of the Swedish Meteorological and Hydrological Institute (SMHI), Norwegian Meteorological Institute (MET Norway), and Meteo-France. CERRA provides wind speeds at 12 vertical levels: 10, 15, 30, 50, 75, 100, 150, 200, 250, 300, 400, and 500 m above sea or ground level. The dataset is available as analysis every third hour and as forecast at lead hours of 1, 2, and 3. In this study, we utilized the three-hourly analyses and corresponding forecasts at lead hours of 1 and 2, creating an hourly dataset. We analyzed data spanning 2 years, from 0000 UTC on January 1, 2000, to 2300 UTC on December 31, 2001. We extracted a time-series of wind profiles at the FINO1 site (54.0143° N, 6.58385° E), a location noted for extensive wind power meteorology research (Durán et al., 2020).

3.1.2. Meteorological variables

This study utilizes 34 meteorological variables from the publicly available and globally acclaimed ERA5 reanalysis data as drivers for wind profiles, which are presented in Table 1. Of these, 25 variables are selected based on the study of Kartal et al. (2023), adhering to the same naming convention and descriptions as in Table 1 of their work.

Table 1. Description of the meteorological variables adopted from the ERA5 reanalysis

3.2. Estimating Chebyshev coefficients

Chebyshev polynomials allow one to approximate a function with the smallest error as follows (Mason and Handscomb, 2002):

(1) $$ U(z)=\sum \limits_{n=0}^{\infty }{C}_n{T}_n(z) $$

Here, we want to approximate the wind speed profile $ U(z) $ with the combination of Chebyshev polynomials $ {T}_n(z) $ multiplied by the corresponding coefficients $ {C}_n $ . The polynomials of the first kind can be estimated through recurrence relations as follows:

(2) $$ {T}_0(z)=1 $$
(3) $$ {T}_1(z)=z $$
(4) $$ {T}_{n+1}(z)=2z{T}_n(z)-{T}_{n-1}(z) $$
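The recurrence in Eqs. (2) to (4) can be evaluated directly. The following is a minimal sketch, not the authors' code; the function name `chebyshev_T` is a hypothetical helper introduced here for illustration, and the check uses the standard identity $ {T}_n\left(\cos \theta \right)=\cos \left( n\theta \right) $.

```python
import math

def chebyshev_T(order, z):
    """Return [T_0(z), ..., T_order(z)] using the three-term recurrence."""
    values = [1.0, z]  # T_0 and T_1, Eqs. (2) and (3)
    for n in range(1, order):
        # Eq. (4): T_{n+1}(z) = 2 z T_n(z) - T_{n-1}(z)
        values.append(2.0 * z * values[n] - values[n - 1])
    return values[: order + 1]

# Sanity check against the identity T_n(cos(theta)) = cos(n * theta).
theta = 0.7
T = chebyshev_T(4, math.cos(theta))
max_err = max(abs(T[n] - math.cos(n * theta)) for n in range(5))
```

Here `max_err` should be at the level of floating-point round-off, since the recurrence and the trigonometric identity describe the same polynomials.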

In this study, we employed fourth-order Chebyshev polynomials (Figure 1, 1st column). Once computed, these polynomials transform the problem into a system of linear equations, facilitating the estimation of coefficients through methods such as solving linear equations or matrix inversion. The variable $ z $ is normalized between −1 and 1 in real data before estimating coefficients. For a wind profile with 12 vertical levels and Chebyshev polynomials of order 4 ( $ {T}_0 $ to $ {T}_4 $ ), five Chebyshev coefficients ( $ {C}_0 $ to $ {C}_4 $ ) are estimated, reducing the wind profile’s complexity to five coefficients. We would like to point out that the approximation strategy allows us to effectively handle wind profile data from various sources. Whether the data is observed or simulated up to 500 m, and even if it does not align with the specific vertical levels of the CERRA data, the Chebyshev coefficient estimation process remains unaffected. This adaptability ensures that the proposed methodology is versatile across different wind datasets.
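The coefficient estimation described above can be sketched with NumPy as follows. This is an illustration under stated assumptions rather than the authors' implementation: the height is assumed to be rescaled linearly so that 10 m maps to −1 and 500 m maps to +1, the 12-by-5 system is solved in the least-squares sense (it has more equations than unknowns), and the power-law wind profile is made up purely for demonstration.

```python
import numpy as np
from numpy.polynomial import chebyshev as cheb

# CERRA height levels (m) as listed in Section 3.1.1.
heights = np.array([10, 15, 30, 50, 75, 100, 150, 200, 250, 300, 400, 500], dtype=float)

# Assumption: linear rescaling of height so that 10 m -> -1 and 500 m -> +1.
z = 2.0 * (heights - heights.min()) / (heights.max() - heights.min()) - 1.0

# Made-up power-law profile (m/s), for illustration only.
wind = 8.0 * (heights / 100.0) ** 0.14

# Design matrix with columns T_0(z) ... T_4(z); shape (12, 5).
A = cheb.chebvander(z, 4)

# Overdetermined system (12 equations, 5 unknowns): least-squares solution.
coeffs, *_ = np.linalg.lstsq(A, wind, rcond=None)

# Reconstruct the profile from the five coefficients.
wind_fit = A @ coeffs
rmse = float(np.sqrt(np.mean((wind_fit - wind) ** 2)))
```

Because only the normalized heights enter the design matrix, the same steps apply to profiles sampled at other vertical levels; only `heights` and `wind` would change.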

Figure 1. Column 1: an illustration of fourth order Chebyshev polynomials plotted against the normalized height $ z=\left[-1,1\right] $ . The remaining figures display the vertical profiles of wind speed from CERRA alongside those approximated by Chebyshev polynomials, for four well-known categories of wind regimes: high shear, low shear/well-mixed, low-level jets (LLJ), and high wind.

$ {T}_0 $ represents a constant line, with $ {C}_0 $ approximating mean wind speed. Similarly, $ {T}_1 $ corresponds to a diagonal line, and $ {C}_1 $ approximates wind shear. The parabolic profile of $ {T}_2 $ is captured by $ {C}_2 $ , representing curvature in the wind profile. These three coefficients are expected to capture a significant portion of the wind profile, while higher-order coefficients account for small-scale variations.
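As a small worked illustration (not taken from the article), applying the recurrence in Eqs. (2) to (4) gives the explicit polynomials up to fourth order:

$$ {T}_2(z)=2{z}^2-1,\kern1em {T}_3(z)=4{z}^3-3z,\kern1em {T}_4(z)=8{z}^4-8{z}^2+1 $$

For an idealized, exactly linear profile $ U(z)=a+bz $ on the normalized height, the expansion terminates after two terms:

$$ U(z)=a{T}_0(z)+b{T}_1(z),\kern1em {C}_0=a,\kern0.5em {C}_1=b,\kern0.5em {C}_2={C}_3={C}_4=0 $$

In this idealized case, $ a $ is the wind speed at mid-height ( $ z=0 $ ) and $ b $ is half the difference between the wind speeds at the top and bottom levels. Real profiles are not linear, which is why $ {C}_0 $ and $ {C}_1 $ only approximate the mean wind speed and the shear.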

To illustrate the capability of Chebyshev coefficients, we compared four wind profiles from the CERRA dataset with their Chebyshev approximations (Figure 1). These types of profiles, namely high shear, low shear/well-mixed, LLJ, and high wind, selected from Durán et al. (2020), are crucial for wind energy applications. The figures demonstrate the effective approximation of CERRA wind profiles by Chebyshev coefficients.

3.3. Experimental setup

Figure 2(a) illustrates a complete flowchart of the training procedure adopted in this study. First, the ERA5-based predictor meteorological variables (inputs) and the estimated Chebyshev coefficients (targets) are stacked side-by-side as a tabular dataset. Next, the entire data of year 2001 is set aside for testing. From the data of year 2000, six randomly selected consecutive days of each month are used for validation, and the remaining data are used for training the models. This ensures that both the training and validation data cover the seasonal cycle despite the 1-year sample size. The data-splitting strategy is illustrated in Figure 2(b). After splitting, there are 7056 samples in training, 1728 samples in validation, and 8760 samples in testing. Finally, a min-max normalization function is fitted on the targets of the training data and then used to normalize the targets of both the training and validation data.
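The sample counts quoted above can be reproduced with a short sketch of the splitting rule. This is an assumed reading of the procedure, not the authors' script: the helper names `split_hours`, `fit_min_max`, and `apply_min_max` are hypothetical, and the random start day of each validation block is drawn so that the six days stay within the month.

```python
import calendar
import random

def split_hours(year, n_val_days=6, seed=0):
    """Return (train, val) lists of (month, day, hour) tuples for one year."""
    rng = random.Random(seed)
    train, val = [], []
    for month in range(1, 13):
        n_days = calendar.monthrange(year, month)[1]
        # Random start so that the validation block fits inside the month.
        start = rng.randint(1, n_days - n_val_days + 1)
        val_days = set(range(start, start + n_val_days))
        for day in range(1, n_days + 1):
            target = val if day in val_days else train
            target.extend((month, day, hour) for hour in range(24))
    return train, val

train, val = split_hours(2000)  # 2000 is a leap year: 366 * 24 = 8784 hours
n_test = (366 if calendar.isleap(2001) else 365) * 24

def fit_min_max(rows):
    """Column-wise minima and maxima, learned from the training targets only."""
    cols = list(zip(*rows))
    return [min(c) for c in cols], [max(c) for c in cols]

def apply_min_max(rows, lo, hi):
    """Scale each column to [0, 1] using previously fitted minima and maxima."""
    return [[(v - a) / (b - a) for v, a, b in zip(row, lo, hi)] for row in rows]
```

With 12 validation blocks of 6 days each, the validation set holds 12 × 6 × 24 = 1728 hourly samples, leaving 8784 − 1728 = 7056 for training, and the non-leap year 2001 gives 8760 test samples.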

Figure 2. (a) Flowchart of the experimental setup used in this study to train the TabNet. (b) Our strategy of splitting the entire dataset into train, validation, and test. (c) Loss curves of one of the trained models, in which the train and validation RMSE values are plotted against the training epochs.

TabNet has several hyperparameters, of which we chose to tune $ {n}_d $ (width of the decision prediction layer), $ {n}_{steps} $ (number of steps in the architecture), $ {n}_{independent} $ (number of independent Gated Linear Units), $ {n}_{shared} $ (number of shared Gated Linear Units), and $ gamma $ (coefficient for feature reusage). Readers are referred to Arik and Pfister (2021) for more information about the architecture and the hyperparameters. A random search is employed to tune the hyperparameters over the parameter spaces $ {n}_d $ : [4, 8, 16], $ {n}_{steps} $ : [3, 4, 5], $ {n}_{independent} $ : [1, 2, 3, 4, 5], $ {n}_{shared} $ : [1, 2, 3, 4, 5], and $ gamma $ : [1.1, 1.2, 1.3, 1.4]. For each sampled configuration, the TabNet model is trained with mean squared error (MSE) as the loss function and evaluated on the validation dataset. After training, the validation loss is calculated, and the trained model is saved as an external file. Following hyperparameter tuning, the training process is replicated across 10 ensembles, each initiated with a random train-validation split. Each ensemble saves its best-performing model. The inner loop of hyperparameter tuning enhances the model’s robustness by optimizing performance across different hyperparameters on the same dataset. The outer loop, on the other hand, is crucial for generating reliable predictions through ensemble modeling. A sample learning curve from one of the saved models is depicted in Figure 2(c), demonstrating that the models are training effectively.
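The random search over the listed parameter spaces can be sketched as below. The search space is taken from the text; everything else is an assumption made for illustration. In particular, `validation_loss` is a hypothetical stand-in for "train a TabNet model with this configuration and return its validation MSE", and the number of trials is arbitrary.

```python
import random

# Search space exactly as listed in the text.
SPACE = {
    "n_d": [4, 8, 16],
    "n_steps": [3, 4, 5],
    "n_independent": [1, 2, 3, 4, 5],
    "n_shared": [1, 2, 3, 4, 5],
    "gamma": [1.1, 1.2, 1.3, 1.4],
}

def sample_config(rng):
    """Draw one configuration uniformly at random from SPACE."""
    return {name: rng.choice(values) for name, values in SPACE.items()}

def validation_loss(config):
    """Hypothetical placeholder for training a model and scoring it."""
    return abs(config["n_d"] - 8) + abs(config["gamma"] - 1.3)

def random_search(n_trials, seed=0):
    """Evaluate n_trials random configurations and keep the best one."""
    rng = random.Random(seed)
    trials = [sample_config(rng) for _ in range(n_trials)]
    return min(trials, key=validation_loss)

# Size of the full grid that the random search samples from.
n_combinations = 1
for values in SPACE.values():
    n_combinations *= len(values)

best = random_search(n_trials=20)
```

The full grid contains 3 × 3 × 5 × 5 × 4 = 900 combinations, which gives a sense of why a random search is a cheaper alternative to an exhaustive one.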

4. Results

Model predictions are generated for the test data, and performance is evaluated using key metrics such as mean absolute error (MAE), coefficient of determination ( $ {R}^2 $ ), and root mean square error (RMSE). Figure 3 (first row) illustrates a comparison of predicted coefficients from one of the ensemble models with respect to the test data using bivariate histograms. Among the coefficients, $ {C}_0 $ and $ {C}_1 $ predominantly align along the diagonal line, displaying narrower spreads. Notably, the model exhibits its highest predictability for $ {C}_0 $ with an $ {R}^2 $ of 0.93 and moderate predictability for $ {C}_1 $ with an $ {R}^2 $ of 0.65. Conversely, the remaining coefficients frequently register values close to zero, with less probable values demonstrating a wider spread, indicating lower predictability for these coefficients.
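For reference, the three scores can be written out in a few lines. This sketch uses the standard textbook definitions; the article does not state which implementation was used, and the four-point example data are made up.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Made-up example: three exact predictions and one that is off by 2.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 6.0]
scores = (mae(y_true, y_pred), rmse(y_true, y_pred), r2(y_true, y_pred))
# MAE = 0.5, RMSE = 1.0, R^2 = 0.2 (up to floating-point rounding)
```

The example shows how a single large error affects RMSE more strongly than MAE, and how $ {R}^2 $ relates the squared error to the variance of the reference values.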

Figure 3. First row: a comparison of Chebyshev coefficients ( $ {C}_0,{C}_1,{C}_2,{C}_3 $ and $ {C}_4 $ ) between the test data and the model predictions using bivariate histograms. The probability of occurrence is represented on a log scale with the color increasing from dark (low probability) to light (high probability). The evaluation scores, namely MAE, $ {R}^2 $ , and RMSE for each coefficient are provided in the text boxes. Second row: the combined feature importance of input variables based on the test data.

Additionally, the impact of the input meteorological variables on the predicted Chebyshev coefficients is estimated by measuring (permutation) feature importance. This is done by quantifying the reduction in the score when a specific feature is absent. From the feature importance, as shown in Figure 3 (second row), it is evident that the meteorological variables directly related to wind speed ( $ {\mathbf{W}}_{10} $ to $ {\mathbf{W}}_{p10}^i $ ) show significant influence on the coefficients, which is expected. A major finding from the feature importance is that atmospheric stability-related variables, such as instantaneous surface sensible heat flux ( $ {H}_S $ ), the difference between air and skin temperatures ( $ \Delta {T}_1 $ ), and the difference between 975 hPa and 2 m air temperatures ( $ \Delta {T}_4 $ ), also exert significant influence on wind profiles. The boundary layer height ( $ \mathbf{H} $ ) is also an important input feature for wind profile estimation. Identifying these influential variables helps to characterize various wind profiles, which in turn aids the estimation of wind power and turbine loads.
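One common way to compute a permutation-based importance is to shuffle one input column at a time and record how much the score drops. The sketch below follows that reading, which is an assumption about the procedure rather than a description of the authors' code; the toy data and the `predict` function standing in for a trained model are made up.

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def permutation_importance(predict, X, y, n_repeats=5, seed=0):
    """Mean drop in R^2 after shuffling each column of X in turn."""
    rng = np.random.default_rng(seed)
    baseline = r2_score(y, predict(X))
    drops = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        for _ in range(n_repeats):
            X_perm = X.copy()
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            drops[j] += baseline - r2_score(y, predict(X_perm))
    return drops / n_repeats

# Toy data: the target depends on column 0 only; column 1 is pure noise.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 2))
y = 3.0 * X[:, 0]

def predict(features):
    """Stand-in for a trained model that uses column 0 only."""
    return 3.0 * features[:, 0]

importance = permutation_importance(predict, X, y)
```

In this toy case, shuffling the informative column destroys the prediction, whereas shuffling the noise column leaves the score unchanged, so its importance is zero.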

Since the main objective of this study is to predict the wind profiles given coarse-resolution meteorological features from the ERA5 dataset, we reconstruct the wind profiles using the predicted Chebyshev coefficients from the test data. A sample of four profiles from the 10 ensemble model predictions is presented in Figure 4. From this figure, it is evident that the predicted profiles are in agreement with the CERRA-generated wind profiles, and the uncertainty bounds are quite narrow. The evaluation metrics RMSE and mean absolute percentage error (MAPE), which are computed between the CERRA profile and the median profile, further corroborate the strength of the TabNet model. However, not every prediction turned out to be accurate, as shown in Figure 5. For these selected profiles, the error metrics (RMSE and MAPE) are quite high. Careful analysis of Figure 5 reveals a substantial discrepancy between the ERA5 and CERRA wind speeds, which may account for the poor predictions observed in these instances. The TabNet-generated profiles lie in between the ERA5 and CERRA wind speeds. This systematic error could be reduced by increasing the training sample size and including samples from diverse field sites.
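The reconstruction and ensemble summary described above can be sketched as follows. All numbers are made up for illustration: `c_ref` plays the role of the reference coefficients, `c_ens` mimics 10 ensemble predictions by adding random noise, and the linear height scaling to [−1, 1] is an assumption.

```python
import numpy as np
from numpy.polynomial import chebyshev as cheb

heights = np.array([10, 15, 30, 50, 75, 100, 150, 200, 250, 300, 400, 500], dtype=float)
# Assumed linear scaling of height to the interval [-1, 1].
z = 2.0 * (heights - heights.min()) / (heights.max() - heights.min()) - 1.0

# Made-up reference coefficients and 10 perturbed ensemble predictions.
rng = np.random.default_rng(0)
c_ref = np.array([9.0, 1.5, -0.4, 0.1, 0.0])
c_ens = c_ref + rng.normal(scale=0.1, size=(10, 5))

# Design matrix: one column per polynomial T_0..T_4 at the 12 levels.
A = cheb.chebvander(z, 4)   # shape (12, 5)
u_ref = A @ c_ref           # reference profile, shape (12,)
u_ens = c_ens @ A.T         # ensemble profiles, shape (10, 12)

# Percentile bands across ensemble members, level by level.
p10, p25, p50, p75, p90 = np.percentile(u_ens, [10, 25, 50, 75, 90], axis=0)

# Scores between the reference profile and the ensemble median.
rmse = float(np.sqrt(np.mean((p50 - u_ref) ** 2)))
mape = float(100.0 * np.mean(np.abs((p50 - u_ref) / u_ref)))
```

The arrays `p10` to `p90` correspond to the kind of median line and shaded bands used to display ensemble spread, and `rmse` and `mape` are computed against the median profile.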

Figure 4. A comparison of vertical profiles of wind speed from CERRA and the 10 ML model predictions, on four instances of test data for the selected wind regimes. The blue line represents the 50th percentile of the ensemble, the darker shade represents the ensemble between the 25th and 75th percentiles, and the lighter shade represents the ensemble between the 10th and 90th percentiles. The wind speeds from ERA5 at 10 m ( $ {\mathbf{W}}_{10} $ ) and 100 m ( $ {\mathbf{W}}_{100} $ ) are illustrated using green diamonds. The evaluation scores, RMSE and MAPE, computed between the CERRA and the median profile for each wind regime, are provided in the text boxes.

Figure 5. Same as Figure 4, but a different set of time instances of the test data for the selected wind regimes.

5. Conclusion

In this work, we introduced a proof-of-concept methodology for estimating wind speed profiles from large-scale meteorological features of the ERA5 reanalysis using TabNet, an attention-based DL model. Chebyshev polynomials were utilized to approximate wind profiles with only five coefficients. Instead of using wind speeds at multiple heights as targets, we used these Chebyshev coefficients as targets. This approximation strategy ensures that our TabNet-based methodology can be used with different types of wind datasets, which may contain wind speeds at varying heights. Results indicated that TabNet effectively captured nonlinear dependencies between meteorological features and wind profiles across different wind regimes. Feature importance analysis highlighted the significant influence of wind speed, atmospheric stability-related variables, and boundary layer height on the Chebyshev coefficients.

Nonetheless, there is significant room for improving the accuracy of the proposed approach. Specifically, the performance of the TabNet in predicting higher-order Chebyshev coefficients is less than satisfactory. In our future work, we will incorporate additional meteorological variables from reanalysis datasets and will also explore alternative DL models. Additionally, we plan to apply this methodology to diverse wind profile datasets, collected at various geographical locations, and predict over several years in a round-robin manner.

Open peer review

To view the open peer review materials for this article, please visit http://doi.org/10.1017/eds.2024.41.

Author contribution

Conceptualization: H.B., S.B.; Methodology: H.B., S.B.; Data curation: H.B.; Data visualization: H.B.; Writing original draft: H.B.; Writing review and editing: H.B., S.B. All authors approved the final submitted draft.

Competing interest

None.

Data availability statement

The ERA5 and CERRA reanalyses were downloaded from the Copernicus Climate Data Store (CDS), available at https://cds.climate.copernicus.eu/cdsapp#!/search?type=dataset. The entire workflow and the scripts used in training can be found at https://github.com/HarishBaki/CI2024.git.

Funding statement

This study was partially supported by the EU-SCORES and the Winds of the North Sea in 2050 (WINS50) projects. The EU-SCORES project is a part of the European Union’s Horizon 2020 research and innovation programme under grant agreement No 101036457. The WINS50 project was sponsored by the Rijksdienst voor Ondernemend Nederland via the Top Sector Energy programme.

Provenance statement

This article is part of the Climate Informatics 2024 proceedings and was accepted in Environmental Data Science on the basis of the Climate Informatics peer review process.

Ethical standard

The research meets all ethical guidelines, including adherence to the legal requirements of the study country.

Footnotes

This research article was awarded Open Data and Open Materials badges for transparent practices. See the Data Availability Statement for details.

References

Arik, SÖ and Pfister, T (2021) TabNet: Attentive interpretable tabular learning. Proceedings of the AAAI Conference on Artificial Intelligence 35, 6679–6687.
Baki, H, Basu, S and Lavidas, G (2024a) Estimating the offshore wind power potential of Portugal by utilizing gray-zone atmospheric modeling. Journal of Renewable and Sustainable Energy 16(6). https://doi.org/10.1063/5.0222974
Baki, H, Basu, S and Lavidas, G (2024b) Modelling Frontal Low-Level Jets and Associated Extreme Wind Power Ramps over the North Sea. Wind Energy Science Discussion [preprint]. https://doi.org/10.5194/wes-2024-99
Bañuelos-Ruedas, F, Angeles-Camacho, C and Rios-Marcuello, S (2010) Analysis and validation of the methodology used in the extrapolation of wind speed data at different heights. Renewable and Sustainable Energy Reviews 14, 2383–2391.
Basu, S (2023) Vertical wind speed profiles in atmospheric boundary layer flows. Wind Energy Engineering: A Handbook for Onshore and Offshore Wind Turbines (2 ed., pp. 75–85). Elsevier. https://doi.org/10.1016/B978-0-323-99353-1.00031-1
Dimitrov, N, Natarajan, A and Kelly, M (2015) Model of wind shear conditional on turbulence and its impact on wind turbine loads. Wind Energy 18, 1917–1931.
Dirksen, M, Wijnant, I, Siebesma, A, Baas, P and Theeuwes, N (2022) Validation of Wind Farm Parameterisation in Weather Forecast Model HARMONIE-AROME: Analysis of 2019. Netherlands: Delft University of Technology.
Dörenkämper, M, Olsen, BT, Witha, B, Hahmann, AN, Davis, NN, Barcons, J, Ezber, Y, García-Bustamante, E, González-Rouco, JF, Navarro, J, Sastre-Marugán, M, Sīle, T, Trei, W, Žagar, M, Badger, J, Gottschall, J, Sanz Rodrigo, J and Mann, J (2020) The making of the new European wind atlas – Part 2: Production and evaluation. Geoscientific Model Development 13, 5079–5102.
Durán, P, Basu, S, Meißner, C and Adaramola, MS (2020) Automated classification of simulated wind field patterns from multiphysics ensemble forecasts. Wind Energy 23, 898–914.
Elliott, DL and Cadogan, JB (1990) Effects of Wind Shear and Turbulence on Wind Turbine Power Curves, Technical Report. Richland, WA (USA): Pacific Northwest Lab.
Guo, Y, Wang, H and Lian, J (2022) Review of integrated installation technologies for offshore wind turbines: Current progress and future development trends. Energy Conversion and Management 255, 115319.
Gutierrez, W, Ruiz-Columbie, A, Tutkun, M and Castillo, L (2017) Impacts of the low-level jet’s negative wind shear on the wind turbine. Wind Energy Science 2, 533–545.
Hahmann, AN, Sīle, T, Witha, B, Davis, NN, Dörenkämper, M, Ezber, Y, García-Bustamante, E, González-Rouco, JF, Navarro, J, Olsen, BT and Söderberg, S (2020) The making of the new European wind atlas – Part 1: Model sensitivity. Geoscientific Model Development 13, 5053–5078.
Hersbach, H, Bell, B, Berrisford, P, Hirahara, S, Horányi, A, Muñoz-Sabater, J, Nicolas, J, Peubey, C, Radu, R, Schepers, D, et al. (2020) The ERA5 global reanalysis. Quarterly Journal of the Royal Meteorological Society 146, 1999–2049.
IEC (2005) 12–1: Power Performance Measurements of Electricity Producing Wind Turbines, British Standard, IEC 61400-12.
Kartal, S, Basu, S and Watson, SJ (2023) A decision tree-based measure-correlate-predict approach for peak wind gust estimation from a global reanalysis dataset. Wind Energy Science 8, 1533–1551. https://doi.org/10.5194/wes-8-1533-2023
Liu, B, Ma, X, Guo, J, Li, H, Jin, S, Ma, Y and Gong, W (2023) Estimating hub-height wind speed based on a machine learning algorithm: Implications for wind energy assessment. Atmospheric Chemistry and Physics 23, 3181–3193.
Mason, JC and Handscomb, DC (2002) Chebyshev Polynomials. CRC Press.
Mohandes, MA and Rehman, S (2018) Wind speed extrapolation using machine learning methods and LiDAR measurements. IEEE Access 6, 77634–77642.
Nagababu, G, Srinivas, BA, Kachhwaha, SS, Puppala, H and Kumar, SVA (2023) Can offshore wind energy help to attain carbon neutrality amid climate change? A GIS-MCDM based analysis to unravel the facts using CORDEX-SA. Renewable Energy 219, 119400.
New Atlas (July 2023) World’s largest wind turbine to feature 16 MW capacity. New Atlas. https://newatlas.com/energy/worlds-largest-wind-turbine-myse-16-260/
Optis, M, Bodini, N, Debnath, M and Doubrawa, P (2021) New methods to improve the vertical extrapolation of near-surface offshore wind speeds. Wind Energy Science 6, 935–948.
Park, J, Manuel, L and Basu, S (2015) Toward isolation of salient features in stable boundary layer wind fields that influence loads on wind turbines. Energies 8, 2977–3012.
Schimanke, S, Ridal, M, Le Moigne, P, Berggren, L, Undén, P, Randriamampianina, R, Andrea, U, Bazile, E, Bertelsen, T, Brousseau, P, Dahlgren, P, Edvinsson, L, El Said, A, Glinton, M, Hopsch, S, Isaksson, L, Mladek, R, Olsson, E, Verrelle, A and Wang, Z (2021) CERRA sub-daily regional reanalysis data for Europe on pressure levels from 1984 to present, Technical Report, Copernicus Climate Change Service (C3S) Climate Data Store (CDS). https://doi.org/10.24381/cds.a39ff99f (accessed 10 10 2023).
Van Sark, WG, Van der Velde, HC, Coelingh, JP and Bierbooms, WA (2019) Do we really need rotor equivalent wind speed? Wind Energy 22, 745–763.
Wagner, R, Courtney, M, Gottschall, J and Lindelöw-Marsden, P (2011) Accounting for the speed shear in wind turbine power performance measurement. Wind Energy 14, 993–1004.
Wharton, S and Lundquist, JK (2012) Assessing atmospheric stability and its impacts on rotor-disk wind characteristics at an onshore wind farm. Wind Energy 15, 525–546.
Wijnant, I, van Ulft, B, van Stratum, B, Barkmeijer, J, Onvlee, J, de Valk, C, Knoop, S, Kok, S, Marseille, G, Baltink, HK, et al. (2019) The Dutch Offshore Wind Atlas (DOWA): Description of the Dataset. De Bilt: Royal Netherlands Meteorological Institute, Ministry of Infrastructure and Water Management.
Yu, S and Vautard, R (2022) A transfer method to estimate hub-height wind speed from 10 meters wind speed based on machine learning. Renewable and Sustainable Energy Reviews 169, 112897.
Table 1. Description of the meteorological variables adopted from the ERA5 reanalysis

Figure 1. Column 1: an illustration of fourth-order Chebyshev polynomials plotted against the normalized height $ z=\left[-1,1\right] $. The remaining columns display the vertical profiles of wind speed from CERRA alongside those approximated by Chebyshev polynomials, for four well-known categories of wind regimes: high shear, low shear/well-mixed, low-level jets (LLJ), and high wind.
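
The approximation illustrated in Figure 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the height levels, the synthetic power-law profile, and the helper name `fit_profile` are hypothetical, and only the fourth-order fit on a normalized height in $ \left[-1,1\right] $ is taken from the caption.

```python
import numpy as np
from numpy.polynomial import chebyshev as cheb


def fit_profile(heights, speeds, degree=4):
    """Least-squares Chebyshev fit of a wind profile on normalized height.

    Hypothetical helper: returns the normalized heights z in [-1, 1]
    and the coefficients C_0 ... C_degree.
    """
    heights = np.asarray(heights, dtype=float)
    # Map physical heights linearly onto z in [-1, 1].
    z = 2.0 * (heights - heights.min()) / (heights.max() - heights.min()) - 1.0
    coeffs = cheb.chebfit(z, speeds, degree)
    return z, coeffs


# Assumed example data: height levels (m) and a synthetic power-law profile (m/s).
heights = np.array([10, 50, 100, 150, 200, 250, 300, 400, 500], dtype=float)
speeds = 8.0 * (heights / 100.0) ** 0.14

z, coeffs = fit_profile(heights, speeds)

# Reconstruct the profile from the five coefficients and check the fit error.
reconstructed = cheb.chebval(z, coeffs)
max_abs_error = float(np.max(np.abs(reconstructed - speeds)))
```

In a setup like the one described in the abstract, the five coefficients would serve as regression targets, and `cheb.chebval` would turn predicted coefficients back into a profile on any set of normalized heights.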

Figure 2. (a) Flowchart of the experimental setup used in this study to train the TabNet. (b) Our strategy for splitting the entire dataset into train, validation, and test sets. (c) Loss curves of one of the trained models, in which the train and validation RMSE values are plotted against the training epochs.

Figure 3. First row: a comparison of Chebyshev coefficients ($ {C}_0,{C}_1,{C}_2,{C}_3 $ and $ {C}_4 $) between the test data and the model predictions using bivariate histograms. The probability of occurrence is represented on a log scale with the color increasing from dark (low probability) to light (high probability). The evaluation scores (MAE, $ {R}^2 $, and RMSE) for each coefficient are provided in the text boxes. Second row: the combined feature importance of input variables based on the test data.

Figure 4. A comparison of vertical profiles of wind speed from CERRA and the 10 ML model predictions, on four instances of test data for the selected wind regimes. The blue line represents the 50th percentile of the ensemble, the darker shade represents the ensemble between the 25th and 75th percentiles, and the lighter shade represents the ensemble between the 10th and 90th percentiles. The wind speeds from ERA5 at 10 m ($ {\mathbf{W}}_{10} $) and 100 m ($ {\mathbf{W}}_{100} $) are illustrated using green diamonds. The evaluation scores, RMSE and MAPE, computed between the CERRA profile and the median profile for each wind regime, are provided in the text boxes.
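
The ensemble summary and scores described in the Figure 4 caption can be sketched as follows. This is an illustrative sketch only: the reference profile, the random ensemble members, and the function names `ensemble_bands`, `rmse`, and `mape` are assumptions, and the MAPE definition shown (mean absolute error relative to the reference, in percent) is the conventional one rather than a formula confirmed by the source.

```python
import numpy as np


def ensemble_bands(profiles):
    """Percentile bands across ensemble members (axis 0: member, axis 1: height)."""
    p10, p25, p50, p75, p90 = np.percentile(profiles, [10, 25, 50, 75, 90], axis=0)
    return {"p10": p10, "p25": p25, "median": p50, "p75": p75, "p90": p90}


def rmse(reference, estimate):
    """Root-mean-square error between two profiles."""
    return float(np.sqrt(np.mean((estimate - reference) ** 2)))


def mape(reference, estimate):
    """Mean absolute percentage error; assumes the reference is never zero."""
    return float(100.0 * np.mean(np.abs((estimate - reference) / reference)))


# Assumed example data: one reference profile on 6 levels and 10 ensemble members.
rng = np.random.default_rng(0)
reference = np.array([5.0, 6.5, 7.5, 8.2, 8.8, 9.3])       # stand-in for a CERRA profile
members = reference + rng.normal(0.0, 0.3, size=(10, 6))   # stand-in for 10 model predictions

bands = ensemble_bands(members)
score_rmse = rmse(reference, bands["median"])
score_mape = mape(reference, bands["median"])
```

The `p25`/`p75` and `p10`/`p90` arrays correspond to the darker and lighter shaded bands, and the scores are computed against the median profile, as in the caption.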

Figure 5. Same as Figure 4, but for a different set of time instances of the test data for the selected wind regimes.

Author comment: Estimating high-resolution profiles of wind speeds from a global reanalysis dataset using TabNet — R0/PR1

Comments

To:

The Editor-in-Chief,

Environmental Data Science,

Cambridge University Press.

RE: Submission of a research manuscript

Dear Prof. Monteleoni,

We would like to submit a research manuscript entitled “Estimating high-resolution profiles of wind speeds from a global reanalysis dataset using TabNet” to be considered for publication in “Environmental Data Science”, as part of the Climate Informatics 2024 Proceedings. We confirm that this work is original and has not been published elsewhere, nor is it currently under consideration for publication anywhere else.

We believe that the findings of our manuscript would have substantial standing in the field of sustainable wind resource modeling, specifically for wind power prediction and comprehensive wind turbine performance assessment. In this study, we introduce a proof-of-concept methodology utilizing TabNet, an attention-based sequential deep learning model, to estimate vertical wind profiles from coarse-resolution meteorological features extracted from a reanalysis dataset. The methodology has been designed to be applicable across diverse datasets through the use of Chebyshev polynomial approximation. To mimic the measure-correlate-predict (MCP) approach, the TabNet model is trained on one year of data and predictions are obtained for a different year. The model more-or-less accurately predicts unseen wind profiles for different wind conditions, such as high shear, low shear/well-mixed, low-level jet, and high wind. Our overall methodology will also be helpful for studies focusing on quantifying the correlation of wind profiles with prevailing atmospheric conditions through a systematic feature importance assessment. We are sure that this communication will be of interest to the broad readership of “Environmental Data Science.”

Best Regards,

Harish Baki.

On behalf of the manuscript authors.

Review: Estimating high-resolution profiles of wind speeds from a global reanalysis dataset using TabNet — R0/PR2

Conflict of interest statement

Reviewer declares none.

Comments

>Summary: In this section please explain in your own words what problem the paper addresses and what it contributes to solving it.

The paper presents a proof-of-concept method for estimating wind profiles using a deep learning model, TabNet, with potential applications in wind energy for climate mitigation. It shows the utility of deep learning for wind profile estimation with relatively good generalizability, which is important for applications.

>Relevance and Impact: Is this paper a significant contribution to interdisciplinary climate informatics?

The paper demonstrates the value of deep learning for climate mitigation applications and addresses the gap in existing methods for estimating wind speed for the renewable energy industry.

>Detailed Comments

The paper is overall well written, with a clear description of the methods and well-designed experiments. It is among the top application papers/submissions that I have reviewed in this year’s conference. The results are carefully analyzed and presented to demonstrate the strengths and limitations of the proof-of-concept model.

To help improve the paper, I have a few minor comments:

1. I suggest that the authors explain the variables used as model inputs, if space allows.

2. In equation 3, please define the variable “x”.

3. The target of the model is to estimate the coefficients used to approximate the profile, so Figure 3 compares the estimated and reference coefficients. However, it would also be valuable to compare the estimated and reference wind profiles using quantitative metrics, in a more comprehensive way than the selected examples in Figures 4 and 5.

4. Currently, the authors use ~30 variables from ERA5 to estimate the coefficients. Are all of those variables necessary? Judging from the variable importance, the answer seems to be no. The authors could consider variable selection before feeding the variables in as input features, which may improve model explainability and reduce the sensitivity of the model to errors in the input features (e.g., Figure 5).

Recommendation: Estimating high-resolution profiles of wind speeds from a global reanalysis dataset using TabNet — R0/PR3

Comments

This article was accepted into Climate Informatics 2024 Conference after the authors addressed the comments in the reviews provided. It has been accepted for publication in Environmental Data Science on the strength of the Climate Informatics Review Process.

Decision: Estimating high-resolution profiles of wind speeds from a global reanalysis dataset using TabNet — R0/PR4

Comments

No accompanying comment.