Summary and Conclusions
Overall, this has been an interesting project for learning about machine learning algorithms such as artificial neural networks. I think there are limitations, though, in how accurately the power values can be predicted from the wind speed values, given the relatively small size of the dataset and the single input feature. In addition, there is a lack of detail about the source of the dataset and what exactly the measurements represent. When I first looked at the data, I wondered whether the values represented measurements over a particular period of time, and whether the lower power values might be related to the time taken for the turbines to get up and running before they could generate any power. However, in the raw csv file the rows are ordered by ascending values of the speed column, with the corresponding power values in the other column, so the file gives no indication of time order.
Even with these limitations, it is still very impressive to see how, in a few lines of code, a machine learning model can learn to perform such a task by analysing some training examples. All the models developed managed to capture the general shape of the curve of the data, with the artificial neural networks performing better than the linear regression models.
Creating a better artificial neural network does seem to involve quite a bit of trial and error in selecting the number of neurons, the number of layers and the activation functions, and in setting the weights and bias parameters. I did research some of these parameters for this project, which will be useful as a starting point for any future projects in this area, but there is certainly a lot more to learn.
The various machine learning models developed in this project do seem to underpredict the power values at the lower wind speeds. This is as I expected earlier, given that when wind speeds are too low to generate power, the stored power is often used later. The models also seem to extrapolate a value for wind speeds beyond the cut-off speed. Some observations with zero power values are there for valid reasons, such as the turbines being powered off for safety. I assumed that some of the zero values were related to maintenance and excluded these observations, as outlined earlier. All other observations with zero power values were included, as they provide valid information. I think it makes sense not to try to predict values for a wind turbine that is turned off or not working. Clever and all as the neural networks are, they do not know this, so it is important for the user to understand the dataset to avoid making nonsense predictions!
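One way to avoid nonsense predictions beyond the cut-off is to wrap the trained model in a small guard. The following is only a minimal sketch: the `CUT_OUT_SPEED` value, the `guarded_predict` helper and the `toy_model` stand-in are all hypothetical and are not taken from the dataset or from the models built in this notebook.

```python
from typing import Callable

# Hypothetical cut-off wind speed, in the same units as the speed column.
# This is a placeholder value, not one read from the dataset.
CUT_OUT_SPEED = 24.5


def guarded_predict(model: Callable[[float], float], speed: float,
                    cut_out: float = CUT_OUT_SPEED) -> float:
    """Return the model's prediction only where a prediction makes sense."""
    if speed < 0:
        raise ValueError("wind speed cannot be negative")
    if speed > cut_out:
        # Turbine assumed to be shut down: report zero instead of extrapolating.
        return 0.0
    # Clamp at zero so a fitted curve can never report negative power.
    return max(0.0, model(speed))


def toy_model(speed: float) -> float:
    """Stand-in for a trained model: a straight line, for illustration only."""
    return 5.0 * speed - 10.0


print(guarded_predict(toy_model, 10.0))  # inside the operating range
print(guarded_predict(toy_model, 30.0))  # beyond the assumed cut-off
```

In practice `model` would be a thin wrapper around the trained network's predict call, and the cut-off would need to come from knowledge of the turbine rather than from the placeholder used here.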
I did experiment with leaving all the observations in; however, for each model the power values at the higher wind speed end of the curve were then underestimated. The 4th degree polynomial did perform better than the 3rd degree polynomial at the lower wind speed end, but otherwise the two performed almost the same. The neural network models performed better overall than the polynomial regression models in each case, particularly at the lower and upper ends of the wind speed values. The second neural network had an extra layer and more neurons in each layer. This increased the processing time, though, and the improvement in accuracy gained was relatively small. I also experimented with reducing the number of neurons and layers, and this model had a slightly higher cost than the other two variations. I guess you could spend a long time trying out all the options and this notebook could be even longer! Having too many neurons and too many layers might result in longer processing times. However, once a model is perfected it can be saved for later use. I did find that running predictions for individual values in this notebook generated some TensorFlow warning messages.
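The kind of comparison described above, looking at the error separately at the lower and upper ends of the wind speed range, can be sketched as follows. This is an illustration under stated assumptions: the data is a synthetic S-shaped curve generated in the code, not the project's dataset, and the 8 and 16 thresholds used to split the range are arbitrary.

```python
import numpy as np

# Synthetic S-shaped power curve with noise. This is NOT the project's dataset.
rng = np.random.default_rng(0)
speed = np.linspace(0.0, 24.0, 200)
power = 100.0 / (1.0 + np.exp(-0.6 * (speed - 12.0)))
power = power + rng.normal(0.0, 3.0, speed.size)


def rmse(y_true, y_pred):
    """Root mean squared error between two arrays."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def fit_and_score(degree):
    """Fit a polynomial of the given degree and score it by speed range."""
    coeffs = np.polyfit(speed, power, degree)
    pred = np.polyval(coeffs, speed)
    low = speed < 8.0    # arbitrary threshold for the "lower end"
    high = speed > 16.0  # arbitrary threshold for the "upper end"
    return {
        "overall": rmse(power, pred),
        "low": rmse(power[low], pred[low]),
        "high": rmse(power[high], pred[high]),
    }


scores = {degree: fit_and_score(degree) for degree in (3, 4)}
for degree, score in scores.items():
    print(degree, {name: round(value, 2) for name, value in score.items()})
```

The same `rmse` helper could be applied to the neural network predictions so that every model is scored on the same ranges. Note that the errors here are computed on the training data, so the higher degree fit can never look worse overall; a held-out test set would give a fairer comparison.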
The learning curves for the neural networks showed that the loss fell quite dramatically over the first 100 or so iterations, but then levelled off and did not come down below a particular value despite playing around with the parameters. Overall, two of the neural network models ended up practically the same and cannot really be distinguished when plotted together.
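The point at which a learning curve levels off can also be found programmatically rather than by eye. The helper below is a minimal sketch: `plateau_epoch`, its `patience` and `min_delta` defaults, and the made-up loss values are all assumptions for illustration, not output from the models in this notebook.

```python
from typing import Optional, Sequence


def plateau_epoch(losses: Sequence[float], patience: int = 10,
                  min_delta: float = 1e-3) -> Optional[int]:
    """Return the last epoch that improved the loss before it levelled off.

    An epoch counts as an improvement if it lowers the best loss seen so far
    by more than min_delta. If `patience` epochs pass without an improvement,
    the epoch of the best loss is returned; otherwise None.
    """
    best = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(losses):
        if best - loss > min_delta:
            best = loss
            best_epoch = epoch
        elif epoch - best_epoch >= patience:
            return best_epoch
    return None


# Made-up loss history: falls steadily, then stays flat at 0.2.
losses = [max(0.2, 1.0 - 0.01 * epoch) for epoch in range(200)]
print(plateau_epoch(losses))
```

Keras offers the same idea as a training callback, `EarlyStopping`, which also takes `patience` and `min_delta` arguments and stops training once the monitored loss stops improving.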
In the end, the dataset provided here is relatively small, so there is a limit to the accuracy that can be achieved. The research shown earlier did suggest that there is a lot of uncertainty in predicting the power generated by a wind turbine from local wind speed data. The variance in the power output is greatest near the rated wind speed, which can be seen at the upper ends of the plots. More data points and additional features would help to increase the accuracy.