TOWARDS PREDICTING RICE LOSS DUE TO TYPHOONS IN THE PHILIPPINES
Reliable predictions of the impact of natural hazards that turn into disasters are important for better targeting humanitarian response and for triggering early action. Open data and machine learning can be used to predict loss and damage to the houses and livelihoods of affected people. This research focuses on agricultural loss, specifically rice loss in the Philippines due to typhoons. Regression and binary classification algorithms are trained, with feature selection methods used to find the most important explanatory features. Both geographical data for every province and typhoon-specific features of 11 historical typhoons are used as input. The output is the percentage of rice area lost, which has an average value of 7.1%. For the regression task, the support vector regressor performed best, with a Mean Absolute Error of 6.83 percentage points. For the classification task, thresholds of 20%, 30% and 40% are tested in order to find the best-performing model. These thresholds represent different levels of rice loss at which anticipatory action for farmers would be triggered. The binary classifiers are trained to increase their ability to correctly predict the positive samples. In all three cases, the support vector classifier performed best, with recall scores of 88%, 75% and 81.82%, respectively. However, the precision of each of these models was low: 17.05%, 14.46% and 10.84%, respectively. For both the support vector regressor and the support vector classifier, only wind speed was selected as an explanatory feature out of the 14 available input features. For the other algorithms trained in this study, however, other sets of features were selected, depending also on the hyperparameter settings. This variation in the selected feature sets, as well as the imprecise predictions, was a consequence of the small dataset used for this study. 
It is therefore important that data on more typhoons and on other explanatory variables are gathered in order to make the predictions more robust and accurate. In addition, if loss data become available at the municipality level rather than the province level, the models will become more accurate and more valuable for operationalization.
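
As an illustration of the evaluation metrics reported above, the following minimal sketch computes the Mean Absolute Error of a set of predicted loss percentages, and the precision and recall obtained when the loss is binarized at the 20%, 30% and 40% thresholds. The data values, the function names and the choice of treating a loss equal to the threshold as positive are all assumptions made for illustration; they are not taken from the study. The sketch also thresholds predicted loss percentages as a simple stand-in, whereas the study trains dedicated binary classifiers.

```python
from typing import List, Tuple


def binarize(loss_pct: List[float], threshold: float) -> List[int]:
    """Label a sample positive (1) when the lost rice area reaches the threshold.

    Using >= rather than > is an assumption made for this sketch.
    """
    return [1 if value >= threshold else 0 for value in loss_pct]


def precision_recall(y_true: List[int], y_pred: List[int]) -> Tuple[float, float]:
    """Return (precision, recall) for binary labels; 0.0 when undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def mean_absolute_error(y_true: List[float], y_pred: List[float]) -> float:
    """Average absolute difference, here in percentage points of rice area."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical observed and predicted loss percentages (not data from the study).
observed = [0.0, 5.0, 12.0, 25.0, 45.0, 3.0, 33.0, 8.0]
predicted = [10.0, 22.0, 30.0, 28.0, 50.0, 21.0, 18.0, 4.0]

print(f"MAE: {mean_absolute_error(observed, predicted):.2f} percentage points")

for threshold in (20.0, 30.0, 40.0):
    y_true = binarize(observed, threshold)
    y_pred = binarize(predicted, threshold)
    precision, recall = precision_recall(y_true, y_pred)
    print(f"threshold {threshold:.0f}%: precision={precision:.2f}, recall={recall:.2f}")
```

A model tuned towards recall, as described above, accepts more false positives in exchange for fewer missed high-loss cases, which is why high recall can coincide with low precision.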