This dataset contains 193 samples from several dike sections along the Yangtze River, including the Huangguang Dike and the Anqing Yangtze River main dike. Each sample records eight key influencing factors: water level difference, cover layer thickness, permeability coefficient, effective cohesion, effective angle of internal friction, dry density, void ratio, and compression coefficient.
The strength of this dataset lies in the diversity of the influencing factors it covers, which supports more detailed and reliable analysis. It can be used for dike safety assessment, flood control model construction, and the training and validation of machine learning algorithms. It thereby supports scientific, fine-grained management of flood control and disaster prevention, and furthers research and application in water conservancy engineering.
| Field | Value |
|---|---|
| Collection place | Lower reaches of the Yangtze River |
| Data size | 20.5 KiB |
| Data format | *.xlsx |
| Coordinate system | |
This dataset is derived from the master's thesis “Predictive Analysis of Binary Dike Pipe Surge Risks in Yangtze River Based on Gray Correlation and GA-DBN”. That study collected and analyzed data on dike piping ("pipe surge") hazards for several dike sections, including the Huangguang Dike of the Yangtze River and the Anqing Yangtze River main dike. It combined gray correlation analysis with a deep belief network optimized by a genetic algorithm (GA-DBN).
(1) Gray correlation analysis was used to screen five key factors from the eight factors influencing dike leakage. By measuring the degree of association between each factor and dike leakage, the analysis retains the factors that contribute most to the prediction model.
(2) A deep belief network (DBN) model and a genetic-algorithm-optimized DBN (GA-DBN) model were used to predict dike leakage. The DBN model captures complex features of the data through its multilayer nonlinear structure, while the GA-DBN model additionally uses a genetic algorithm to optimize the network structure and parameters, improving prediction accuracy and generalization.
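The factor screening in step (1) can be illustrated with a minimal sketch of Deng's grey relational grade, a common formulation of gray correlation analysis. The normalization and resolution coefficient used in the thesis are not given here, so the min-max scaling, the value `rho = 0.5`, the function names, and the toy numbers below are all assumptions for illustration rather than the author's implementation.

```python
from typing import Dict, List


def min_max_scale(values: List[float]) -> List[float]:
    """Rescale a sequence to [0, 1]; a constant sequence maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def grey_relational_grades(
    reference: List[float],
    factors: Dict[str, List[float]],
    rho: float = 0.5,  # resolution coefficient; 0.5 is a conventional choice (assumed)
) -> Dict[str, float]:
    """Return the grey relational grade of each factor against the reference."""
    if not factors:
        raise ValueError("at least one factor sequence is required")
    if any(len(seq) != len(reference) for seq in factors.values()):
        raise ValueError("all sequences must have the same length as the reference")

    ref = min_max_scale(reference)
    # Absolute differences between the scaled reference and each scaled factor.
    deltas = {
        name: [abs(r - x) for r, x in zip(ref, min_max_scale(seq))]
        for name, seq in factors.items()
    }
    # Global minimum and maximum difference over all factors and samples.
    all_deltas = [d for seq in deltas.values() for d in seq]
    d_min, d_max = min(all_deltas), max(all_deltas)
    if d_max == 0.0:
        # Every factor coincides with the reference after scaling.
        return {name: 1.0 for name in deltas}

    grades = {}
    for name, seq in deltas.items():
        # Grey relational coefficient per sample, then averaged into a grade.
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in seq]
        grades[name] = sum(coeffs) / len(coeffs)
    return grades


def top_factors(grades: Dict[str, float], k: int) -> List[str]:
    """Names of the k factors with the highest grades, best first."""
    return sorted(grades, key=lambda name: grades[name], reverse=True)[:k]


# Toy, made-up numbers purely for illustration (not taken from the dataset).
outcome = [0.0, 1.0, 0.0, 1.0, 1.0]
candidates = {
    "water_level_difference": [1.0, 3.0, 1.5, 3.5, 4.0],
    "cover_thickness": [6.0, 2.0, 5.0, 2.5, 1.5],
    "dry_density": [1.6, 1.5, 1.4, 1.6, 1.5],
}
print(top_factors(grey_relational_grades(outcome, candidates), k=2))
```

Applied to the real spreadsheet, `factors` would hold the eight influencing-factor columns and `reference` the leakage outcome, and `top_factors(grades, 5)` would mirror the five-of-eight selection described in step (1).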
Data Quality Evaluation
The training results of the GA-DBN model indicate that the dataset is of high quality and reliability. On the training set, the model achieved 98.04% accuracy, 98.10% recall, 97.88% precision, and a 97.98% F1 score, reflecting consistent and informative data across the influencing factors.
On the validation set, the model achieved 100% accuracy, recall, precision, and F1 score, which further supports the quality of the dataset and the predictive power of the model. These results indicate that completeness and accuracy were maintained throughout the selection of influencing factors, data collection, and processing, and that the dataset can support the prediction and analysis of dike seepage hazards.
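For reference, the four reported metrics can be computed from true and predicted labels as sketched below. The source does not state whether the task is binary or multi-class, or how the scores were averaged, so this sketch assumes binary labels (1 = hazard, 0 = no hazard). The function name and the example labels are hypothetical.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Confusion-matrix counts.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    total = tp + tn + fp + fn

    # Each ratio falls back to 0.0 when its denominator is zero.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels for illustration only (not taken from the dataset).
example_true = [1, 1, 1, 0, 0, 0, 1, 0]
example_pred = [1, 1, 0, 0, 0, 1, 1, 0]
print(binary_metrics(example_true, example_pred))
```

A perfect score on all four metrics, as reported for the validation set, corresponds to zero false positives and zero false negatives in this formulation.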
| # | Number | Name | Type |
|---|---|---|---|
| 1 | 2021YFC3000100 | Lower Yangtze River Flood Disaster Integration and Control and Emergency De-risking Technology and Equipment | National key R & D plan |
This work is licensed under a Creative Commons Attribution 4.0 International License.
| # | title | file size |
|---|---|---|
| 1 | _ncdc_meta_.json | 5.3 KiB |
| 2 | 长江黄广大堤、安庆长江干堤等某几个堤段8个影响因子数据集.xlsx | 20.5 KiB |
©Copyright 2005-. Northwest Institute of Eco-Environment and Resources, CAS.
Donggang West Road 320, Lanzhou, Gansu, China (730000)

