Extended Abstract
Introduction
Gully erosion is one of the most important factors affecting sediment production and land degradation, and predicting where it will occur is a practical step toward preventing it. Since the occurrence of gully erosion is directly related to environmental factors and human activities, areas prone to gully erosion can be identified using models based on artificial intelligence and data mining. Modeling saves the time and cost of measuring gullies. In addition, because artificial intelligence and data mining models are well suited to analyzing environmental information and can identify nonlinear and complex relationships between variables, they have been widely adopted by researchers in many fields worldwide. Accordingly, this study aimed to predict gully erosion susceptibility using the random forest and boosted regression tree models in the Talwar watershed, located in the southeast of Kurdistan province.
Materials and Methods
Initially, 99 gullies were identified during field visits, the location of each gully head-cut was recorded, and a map of the spatial distribution of the gullies was prepared. The recorded gullies were randomly divided into training and validation groups in a ratio of 70:30, so that 70% of the gullies were in the training group and the rest in the validation group. In addition, maps of the factors affecting gully erosion were prepared in a geographic information system: elevation, slope gradient, slope aspect, lithology, distance from the stream, topographic wetness index, land use, plan curvature, profile curvature, average annual rainfall, relative slope position, stream power index, distance from the road, soil order, and soil texture. In the modeling process, the environmental factors were treated as independent variables and the occurrence of gullies as the dependent variable. Two machine learning models, Random Forest (RF) and Boosted Regression Trees (BRT), were used to predict gully erosion, and the training group gullies were used to calibrate them. The raster layers of the environmental factors were supplied to the models as independent variables, and the layer of gully head-cut points previously assigned to the training group was supplied as the dependent variable. The models were run in the R software environment. The prediction accuracy of the models was evaluated using the area under the receiver operating characteristic curve (AUC).
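The random 70:30 split and the AUC criterion described above can be illustrated with a minimal Python sketch. It is not the code used in the study, which was run in R. The function names, the fixed seed, the rounding rule for the group sizes, and the use of non-gully reference points in the AUC calculation are all assumptions made for illustration.

```python
import random


def split_gullies(gully_ids, train_fraction=0.7, seed=42):
    """Randomly split gully identifiers into training and validation groups.

    The seed and the rounding of the training size are assumptions;
    the abstract only states a 70:30 ratio.
    """
    rng = random.Random(seed)
    shuffled = list(gully_ids)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


def auc(labels, scores):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive (gully) point scores higher than a randomly chosen
    negative (non-gully) point. Ties count as half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# 99 gullies, numbered 1..99 purely for illustration
train, valid = split_gullies(range(1, 100))
print(len(train), len(valid))

# Toy validation scores (invented): 1 = gully, 0 = non-gully
print(auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

With 99 gullies and the rounding rule assumed here, the split gives 69 training and 30 validation gullies. The pairwise AUC formulation is easy to read but quadratic in the number of points; a rank-based version would be preferable for full raster validation sets.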
Results and Discussion
The spatial pattern of gully erosion predicted by the two models showed that, in this basin, the middle, eastern, and northern parts, which are adjacent to waterways and rivers, are generally more susceptible to gully erosion. Since the values predicted by artificial intelligence models range between zero and one, they can be interpreted as the probability of gully erosion. The lowest and highest probabilities of gully erosion predicted by the random forest model were 0.006 and 0.996, respectively, and the median predicted value was 0.322. Therefore, fifty percent of the pixels in this basin have a predicted susceptibility to gully erosion greater than 0.322, and the other half have a susceptibility below 0.322. The predictions of the boosted regression tree model ranged from 0.011 to 0.799. This model also identified the areas adjacent to the drainage network in the middle, eastern, and northern parts as the lands most favorable to the formation of gullies. The median value predicted by this model was 0.387. The prediction accuracy, measured as the area under the receiver operating characteristic curve, was 0.952 for the random forest model and 0.891 for the boosted regression tree model.
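The interpretation of the median as the value that divides the pixels into two equal halves can be checked with a short Python sketch. The pixel values below are invented for illustration; only the minimum (0.006), the maximum (0.996), and the median (0.322) are taken from the random forest results reported above. The helper name `share_above` is hypothetical.

```python
from statistics import median


def share_above(values, threshold):
    """Fraction of values strictly greater than the threshold."""
    return sum(v > threshold for v in values) / len(values)


# Invented susceptibility values for six pixels; not data from the study
toy_pixels = [0.006, 0.12, 0.30, 0.344, 0.77, 0.996]

m = median(toy_pixels)
print(round(m, 3))
print(share_above(toy_pixels, m))
```

For a full susceptibility raster the same calculation would be applied to all valid pixel values, and ties at the median would make the two shares slightly unequal.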
Conclusion
The findings showed that the random forest model is more accurate in the spatial prediction of gully erosion in the Talwar watershed. Based on the AUC criterion, the random forest model fell in the excellent group (AUC>0.9) and the boosted regression tree model in the very good group (0.8<AUC<0.9). According to the findings of this study, executive agencies can use artificial intelligence and data mining models, such as the random forest model, to prepare gully erosion maps and to plan and prioritize areas for soil conservation measures. Concentrating soil conservation measures and management programs in areas prone to gully erosion in the country's watersheds should improve the performance of natural resources and watershed management departments and make better use of their financial resources.