Alberto Pistocchi 1, Lucia Luzi 2, Paola Napolitano 3
Translated and proofread by Zhu Rulie 4
(1 Environment and Land Engineering Studio, Viale G. Carducci, Cesena 47023, Italy; 2 IRRS-CNR, Milan, Italy; 3 ACTA Studio Associato, Naples, Italy; 4 Institute of Hydrogeological and Engineering Geological Techniques, China Geological Survey, Baoding, Hebei 071051)
This case study stems from the application of different probabilistic prediction models (Bayesian probability; the fuzzy AND, OR, SUM, PRODUCT and gamma (nonlinear) operators; and certainty factors) to the compilation of a landslide hazard map in a hilly area of the northern Apennines, Italy. Seven data layers are used to identify the most susceptible areas: lithology, distance from structural lineaments, annual rainfall, land cover type, topographic slope, aspect, and distance from the hydrographic network. By comparing the prediction-rate indices of the different results, the paper carefully discusses the possibility of using such an easy-to-use, applicable and effective database in land planning.
Keywords: support functions; integrated modeling; landslide hazard; spatial database
1 Introduction and general arguments
In recent years, great progress has been made in establishing spatial databases by planning departments in various parts of Europe. However, many databases still seem ineffective for decision support, and their actual use is often purely descriptive. End users and decision makers, in particular, know almost nothing about the simulation capabilities of geographic information systems (GIS), and few local government agencies use forecasting models as effective support for their daily decision-making.
GIS provides powerful capabilities for detailed simulation of spatial features, and many local governments now possess GIS technology, which makes its use convenient. Could this important asset become a more powerful tool than the customary qualitative observation of natural phenomena?
Because of the need for participatory planning and shared goals, geologists have recognized how important the evaluation of identified resources is for planning and decision support. Some authors emphasize the role of geoscientific maps in policy making and land use planning. In their view, the main function of a hazard map is to provide decision makers with a sound basis for defining land development laws and regulations.
Forecasting models based on causal relations between natural phenomena have been widely used by hydrologists, geoscientists, environmental analysts and engineers in natural risk assessment, natural resource management, pollution prevention and soil improvement, environmental impact assessment and other fields. However, for natural disasters such as landslides, it seems quite difficult to establish a reliable and applicable model at a regional scale. Some authors have explored the reasons for this difficulty and concluded that it is mainly limited by models and data. Unlike in other fields of risk management, few managers explore the application of quantitative models.
The traditional landslide hazard mapping method relies on the empirical observation and judgment of geologists and geomorphologists (through direct observation of field characteristics and remote sensing reports) to interpret the evidence of landslides. Although it can identify past events, it can hardly support any prediction without the subjective, qualitative judgment of experts.
In recent years, physically based geotechnical models for zonation have been proposed. However, calculation methods based on geotechnical models, or in practice on index overlay, are limited by lack of data or poor data quality; although their physical foundation is quite solid, they are often unreliable.
On the other hand, analysts are interested in "objective", replicable prediction models that bound the arbitrariness of choices. This is especially true in the following cases:
when the planning issues at stake involve social conflicts;
when the phenomenon is not easy to detect;
when a detailed map of the phenomenon covering the whole area of interest is too expensive, and a simulated "filter" is needed to screen those areas requiring further investigation.
Generally speaking, the simulation process, like decision-making, is a shared and fundamental activity. One reason a hazard map is credible is that it can be reproduced, by expert judgment or by simulation methodology, which helps build social consensus, that is, shared and sound decision criteria among managers, the public and scientists.
This has motivated investigations into the possibility of probabilistic prediction. In these explorations, prior knowledge about landslide events is fully exploited, parameters are determined reasonably, and fuzzy or probabilistic map overlay methods are used to make probabilistic predictions.
Many such explorations have been made in recent years, typically through sensitivity analyses or comparative performance studies of different methods on the same case. At present, the main difficulty of these applications is the comparison of the different maps produced.
To improve on mapping practice, some authors have proposed a solution in which probabilistic and fuzzy categories are expressed through support functions, estimated over the areas where the phenomenon of interest (such as landslides or mineral deposits) is known to occur. These techniques work through calibration and validation; other methods, such as neural networks and Bayesian networks, share similar characteristics with general mathematical simulation. This approach readily yields a single criterion, called the prediction rate, which is mainly used to compare different prediction maps and can be regarded as an effective measure of model performance. The rationale is as follows: the purpose of a support function is to produce a map that at least captures the judgment scientists would form from established rules, that is, a map obtainable from field experience that predicts as correctly as possible. Of course, as experts' knowledge and understanding of the phenomenon deepen, trying several simulations during the evaluation is inevitable. At the same time, ill-calibrated assumptions about the probabilities of the variables, and missing or unreliable data, may lead to wrong results. However, calibration and validation of the model support the transparency and rationality of the prediction by quantitative means. Support function modeling has recently been applied to some case studies specially arranged for this purpose.
The purpose of this paper is to explore the applicability of support function modeling to the compilation of hazard maps for landslide events from an existing standard database, and to check how the method exploits the information in the existing database in comparison with other techniques (for example, mapping the landslide frequency of each lithologic unit, or simple landslide inventory mapping).
Support function modeling can also be used to guide the conceptual design of the database: data collection strictly depends on an accurate understanding of the best available information within the theoretical framework.
2 Theoretical background
Many authors point out that numerical techniques relate noteworthy events to the local values of associated attributes. An attribute is regarded as an evidence factor for an event, and the probability, possibility or degree of certainty of finding the event is, in some sense, consistent with the presence of each associated attribute. Assume that A is the domain under analysis and F is the event phenomenon under examination. If r data layers are available, the k-th having $m_k$ attribute classes, $k = 1, \dots, r$, then an assignment function can be defined for each data layer:
$$T_k : A \to \{1, 2, \dots, m_k\}, \qquad k = 1, \dots, r$$
which assigns each pixel of A to one of the $m_k$ classes of the k-th layer. Another function can then be defined for each layer:
$$v_k : \{1, 2, \dots, m_k\} \to [a, b]$$
In this case, each class of each layer is mapped onto a value in the interval [a, b], where a and b depend on further assumptions made by the analyst (pointed out below). This value represents favorability, that is, the support for encountering the phenomenon given a particular attribute.
Having defined the functions $v_k$ and $T_k$ for each data layer, the support function can be expressed as their composition:
$$f_k = v_k \circ T_k : A \to [a, b]$$
The interval extremes a and b must be chosen by the analyst according to the chosen interpretation of "support". If support is interpreted as a probability, then a = 0 and b = 1; if support is expressed as a certainty factor, a = −1 and b = 1. Other approaches may require different values.
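As a minimal sketch (not the authors' implementation), the composition $f_k = v_k \circ T_k$ amounts to a lookup of per-class support values on a classified raster; the class codes and support values below are invented for the example.

```python
import numpy as np

# Hypothetical classified raster T_k: each pixel holds a class index 1..m_k
# (here m_k = 3, e.g. three lithologic units; layout is invented).
T_k = np.array([[1, 2, 2],
                [3, 1, 2],
                [3, 3, 1]])

# Hypothetical support values v_k for each class, chosen on [a, b] = [0, 1]
# (a probability-like reading; a = -1, b = 1 would suit certainty factors).
v_k = {1: 0.10, 2: 0.45, 3: 0.80}

# The support map f_k = v_k o T_k assigns each pixel its class's support value.
f_k = np.vectorize(v_k.get)(T_k)
print(f_k)
```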
The different uses of the support functions in this project are explained below.
If the support hypothesis concerns a specific phenomenon F, let $E_1, \dots, E_n$ be the attribute classes consistent with the occurrence of the event. By Bayes' theorem, under the assumption of conditional independence, one can write:
$$P(F \mid E_1 \cap \cdots \cap E_n) = P(F)\prod_{i=1}^{n}\frac{P(E_i \mid F)}{P(E_i)} = ps_F \prod_{i=1}^{n}\frac{ppa_i}{ps_F}$$
Here $pps_i$, i = 1, …, n, are the prior probabilities of the attribute classes, which can be estimated as the percentage of the total area occupied by each class. $ppa_i$, i = 1, …, n, is the conditional probability of F given the attribute class $E_i$; it can be calculated from the formula $ppa_i = 1 - (1 - (Area_i)^{-1})^{Nb(i)}$, where $Area_i$ is the area of class i and Nb(i) is the area of class i that also satisfies condition F. $ps_F$ is the prior probability of F over the whole area, calculated as the percentage of the total area qualified by F.
According to this rule, the map can be compiled by calculating the posterior probability for every occurring combination of attribute classes. In a raster GIS, this can be done through a conventional map-crossing (overlay) operation.
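A minimal sketch (not the authors' implementation) of the Bayesian overlay under the conditional-independence assumption follows; the two layers, class areas, landslide counts and prior are all invented for illustration.

```python
import numpy as np

def ppa(area_i, nb_i):
    # Conditional probability of F given class i, from the formula in the text:
    # ppa_i = 1 - (1 - 1/Area_i)^Nb(i), areas expressed in pixels.
    return 1.0 - (1.0 - 1.0 / area_i) ** nb_i

# Two hypothetical classified layers (e.g. lithology and land cover).
litho = np.array([[1, 1, 2], [2, 1, 2], [2, 2, 1]])
cover = np.array([[1, 2, 2], [1, 1, 2], [2, 1, 1]])

# Hypothetical class areas (pixels) and landslide-affected pixels per class.
ppa_litho = {1: ppa(5000, 150), 2: ppa(3000, 400)}
ppa_cover = {1: ppa(4000, 100), 2: ppa(4000, 450)}

ps_F = 0.06  # prior probability of F over the whole area (invented)

# Posterior per pixel: P(F | E1, E2) = ps_F * prod_i (ppa_i / ps_F).
post = ps_F * (np.vectorize(ppa_litho.get)(litho) / ps_F) \
            * (np.vectorize(ppa_cover.get)(cover) / ps_F)
print(post.round(3))
```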
If certainty factors (CF) are used, the algorithm changes as follows:
(1) The certainty factor of an attribute class can be defined, in the standard form consistent with the quantities introduced above, as:
$$CF_i = \begin{cases} \dfrac{ppa_i - ps_F}{ppa_i\,(1 - ps_F)}, & ppa_i \ge ps_F \\[1.5ex] \dfrac{ppa_i - ps_F}{ps_F\,(1 - ppa_i)}, & ppa_i < ps_F \end{cases}$$
where i = 1, …, n, and n is the number of thematic data classes considered as causal factors.
(2) For two data layers, the combined certainty factor is calculated according to the following rules:
if CF_1 and CF_2 are both positive, then CF_{1+2} = CF_1 + CF_2 − CF_1 × CF_2;
if CF_1 and CF_2 have opposite signs, then CF_{1+2} = (CF_1 + CF_2) / (1 − min(|CF_1|, |CF_2|));
if CF_1 and CF_2 are both negative, then CF_{1+2} = CF_1 + CF_2 + CF_1 × CF_2.
(3) The procedure first computes CF_{1+2} = CF_{12}, then CF_{123} = CF_{12+3}, and so on; any number of maps can be combined in this way.
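A minimal code sketch of these pairwise combination rules (the CF values are invented):

```python
from functools import reduce

def combine_cf(cf1, cf2):
    """Combine two certainty factors in [-1, 1] using the pairwise rules above."""
    if cf1 >= 0 and cf2 >= 0:
        return cf1 + cf2 - cf1 * cf2
    if cf1 < 0 and cf2 < 0:
        return cf1 + cf2 + cf1 * cf2
    # Opposite signs.
    return (cf1 + cf2) / (1 - min(abs(cf1), abs(cf2)))

# Sequential combination CF_12, then CF_12+3, and so on (invented values).
layer_cfs = [0.35, -0.20, 0.60]
print(round(reduce(combine_cf, layer_cfs), 3))  # combined certainty factor
```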
As the last approach, fuzzy set theory is used, computing the fuzzy AND, fuzzy OR, fuzzy product, fuzzy sum and fuzzy gamma (nonlinear) functions. All of these operate on the assumption that the membership value for the presence of F in a given class $E_i$ is its conditional probability $ppa_i$. They are:
fuzzy AND = min($ppa_i$), i = 1, …, n;
fuzzy OR = max($ppa_i$), i = 1, …, n;
fuzzy product = $\prod ppa_i$, i = 1, …, n;
fuzzy sum = $1 - \prod (1 - ppa_i)$, i = 1, …, n;
fuzzy gamma (nonlinear) operation = (fuzzy sum)$^{\gamma}$ × (fuzzy product)$^{1-\gamma}$, where γ is a parameter in the range [0, 1].
According to this approach, rules for compiling the combined map can be determined, so that the analyst can evaluate how the different evidence layers covering the whole study area contribute, and identify further likely sites of the phenomenon. The computed values indicate the degree to which the combined evidence is favorable to the phenomenon. It must be pointed out that, besides the techniques described, others can be used depending on the evidence components of the data, such as belief functions, regression-based occurrence probabilities and many other approaches.
It must be pointed out that the prior probability $ps_F$ enters the estimation of both the probabilistic and the certainty factor measures, but its absolute value is not meaningful as a boundary condition, because predicting the probability of future landslide events in absolute terms is almost impossible. The prediction can only be read as a relative ranking obtained by summing up all conditions into support values, not as a numerical computation of hazard.
3 Application
The area used in this case study is the Savio River basin in northern Italy (Figure 1). Geologically, the region is basically a sedimentary basin composed of marls and sandstones. In more detail, three main geological units can be distinguished:
Figure 1 Location map of the study area
(1) A Tortonian (N1) unit of gray sandy and muddy turbiditic sedimentary rocks, the main geological formation, cropping out on both sides of the main stream.
(2) A unit composed of microcrystalline gypsum, argillaceous clays and sands, with a sulfur-bearing limestone basement.
(3) A unit composed of argillaceous rocks, sandstones and conglomerates, all containing limestone.
In addition, argillaceous marls, late Eocene sandstones, Pliocene clays and chaotic clay deposits crop out at the surface.
This area is affected by a large number of landslides, mostly slides or earth flows occurring in the different geological units. Rock falls and block translational movements also occur in some areas, but they were not analyzed. The data used in the study were provided by the Geological Survey of the Emilia-Romagna Region.
The database used in this case study consists of several thematic layers, including:
linear structures (faults, synclines, anticlines), scale 1:50,000;
lithologic units, scale 1:50,000;
land cover derived from Landsat TM images according to the CORINE European project protocol, scale 1:50,000;
a digital terrain model (DTM), interpolated from the contour lines of the Emilia-Romagna regional authority's map database, with 50 m contour spacing;
rainfall measurements from 7 rain gauge stations across the region;
a digital hydrographic network at the scale of 1:10,000.
It must be emphasized that the resolution of the database is rather poor and the data scales are uneven. The terrain information, in particular, is clearly inaccurate compared with the average landslide size, making it an unrealistic representation of sliding kinematics. The purpose of this study is to evaluate the predictive power of the database as it actually exists (as noted above). Producing a definitive, reliable hazard map is not essential here: although the best available information is used, no further survey or data acquisition was possible. As emphasized below, the evaluation results provide input for improving the database at least as much as they provide land planning forecasts.
Slope and aspect maps were derived from the DTM, with slope classified at fixed intervals.
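A minimal sketch of slope and aspect derivation from a gridded DTM (the elevations and cell size are invented, and the aspect convention is only one of several in use):

```python
import numpy as np

dtm = np.random.default_rng(0).uniform(200.0, 800.0, (50, 50))  # invented elevations (m)
cell = 50.0                                                     # invented cell size (m)

# Finite-difference elevation gradients (rows ~ y, columns ~ x).
dz_dy, dz_dx = np.gradient(dtm, cell)

slope_deg = np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))
# Aspect as a downslope compass direction; conventions vary, this is one common choice.
aspect_deg = (np.degrees(np.arctan2(dz_dy, -dz_dx)) + 360.0) % 360.0

# Classify slope at fixed intervals (here 5-degree bins).
slope_class = np.digitize(slope_deg, bins=np.arange(5, 90, 5))
```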
The distance from linear structures was computed in order to evaluate the possible influence of structural disturbance on slope stability; the result is a rasterized distance map.
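Distance-to-feature rasters of this kind can be sketched with a Euclidean distance transform; the structure mask below is invented:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

# Boolean raster marking cells crossed by linear structures (invented mask).
structures = np.zeros((50, 50), dtype=bool)
structures[25, :] = True  # e.g. one fault trace

cell = 50.0  # invented cell size (m)
# Distance (m) of every cell from the nearest structure cell.
dist_m = distance_transform_edt(~structures) * cell
```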
The rainfall data were analyzed to find the relationship between elevation and annual rainfall. A regression of the two variables gives y = 0.7086x + 708.19 (R² = 0.66), where x is the elevation (m) and y is the long-term (over 30 years) average annual rainfall (mm/a). This equation was then used to produce a continuous rainfall map. As a result, the DTM clearly acts both as an indicator of rainfall characteristics and as a proxy for the potential energy available for release.
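A sketch of the elevation-rainfall regression and its application to the DTM; the gauge data below are invented, while the coefficients quoted in the comment are the ones actually reported in the text:

```python
import numpy as np

# Invented gauge elevations (m) and long-term mean annual rainfall (mm/a).
elev = np.array([150, 220, 310, 420, 560, 700, 850], dtype=float)
rain = np.array([820, 860, 930, 1010, 1100, 1200, 1310], dtype=float)

slope, intercept = np.polyfit(elev, rain, 1)
r2 = np.corrcoef(elev, rain)[0, 1] ** 2
print(slope, intercept, r2)  # the paper reports y = 0.7086 x + 708.19, R^2 = 0.66

# Continuous rainfall map from the DTM (dtm as in the slope example):
# rain_map = slope * dtm + intercept
```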
It should be noted that the correlation between elevation and rainfall is rather weak, and further analysis would be needed to describe the actual rainfall distribution in the area better. With the available data, however, one can only claim that the general trend of the rainfall distribution has been properly identified.
Although a conceptual distinction can be drawn between predisposing factors and the "triggering" factors that actually set off landslides when the other required characteristics are present, here all these data layers are treated as predisposing (a priori) factors.
Data on landslide occurrence could only be obtained from the land instability reports of the local authorities. It should be emphasized that the database is suitable for building long time-series GIS analyses, and that the density and distribution of its data are, statistically speaking, representative of the actual landslide distribution. It can be argued that when the training data set for a Bayesian procedure is not large enough (and not random enough) with respect to the variables treated as random, quantitative evaluation of probabilistic integrated modeling is meaningless; in that case the prior probability $ps_F$ and the conditional probabilities $ppa_i$ of the favorable conditions (particular classes) must be set by expert judgment. In addition, it is important to choose the type and age of the landslides used for training, so that similar landslides are treated together. Some authors have analyzed "earth flow" and "slide-earth flow" landslides, considering that they generally occur in localized areas; in this project this distinction was used effectively only for compiling some of the maps.
The regional land instability reports also record rock falls, block slides and potentially unstable areas, but these were not included in the analysis. Figure 2 shows the data layers used in the analysis.
In principle, all the thematic data considered are potentially correlated. Because redundant information may invalidate the results, some tentative calculations were made. Seven thematic layers (rainfall map, lithology, land cover, slope, aspect, distance from the hydrographic network and distance from linear structures) were tested jointly; each was classified with its own legend and used as a criterion for identifying the conditions that favor landslide activity on the map.
For each pair of maps, four association indices were calculated jointly:
the chi-squared (χ²) index;
Cramér's index;
the contingency index;
the joint information uncertainty score.
The first of these indices is defined as:
Fig. 2 Thematic maps of the causal factors used for prediction
$$\chi^2 = \sum_{i=1}^{n}\sum_{j=1}^{m}\frac{(T_{ij} - T_{ij}^{*})^{2}}{T_{ij}^{*}}$$

with the expected counts

$$T_{ij}^{*} = \frac{T_i\,T_j}{T}$$
where $T_{ij}$ is the number of pixels belonging to class i of map 1 and class j of map 2, T is the total number of pixels, $T_i$ the number of class-i pixels in map 1, and $T_j$ the number of class-j pixels in map 2; n and m are the number of classes in map 1 and map 2, respectively.
Cramér's index (V) and the contingency index (C) are defined as:
$$V = \sqrt{\frac{\chi^2}{T\,M}}, \qquad C = \sqrt{\frac{\chi^2}{\chi^2 + T}}$$
where the symbols have the same meaning as before and M = min(m − 1, n − 1), n and m being the number of classes in each of the two maps.
The joint information uncertainty score of a map pair A and B is given by:
$$U(A,B) = \frac{2\,[H(A) + H(B) - H(A,B)]}{H(A) + H(B)}$$

where

$$H(A) = -\sum_{j} p_j \ln p_j, \qquad H(B) = -\sum_{i} p_i \ln p_i, \qquad H(A,B) = -\sum_{i}\sum_{j} p_{ij} \ln p_{ij}$$
Here m and n are the number of classes in map A and map B, respectively; $p_{ij}$ is the proportion of pixels belonging to class j of map A and class i of map B in the crossed map; $p_j$ is the marginal proportion of class-j pixels in map A, and $p_i$ the marginal proportion of class-i pixels in map B.
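A sketch computing the four association indices from two classified maps (the maps are invented):

```python
import numpy as np

def association_indices(map_a, map_b):
    """Chi-squared, Cramer's V, contingency C and joint information uncertainty U."""
    # Contingency table of class combinations.
    a_classes, a_idx = np.unique(map_a, return_inverse=True)
    b_classes, b_idx = np.unique(map_b, return_inverse=True)
    table = np.zeros((a_classes.size, b_classes.size))
    np.add.at(table, (a_idx, b_idx), 1)

    T = table.sum()
    expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / T
    chi2 = ((table - expected) ** 2 / expected).sum()

    M = min(table.shape[0] - 1, table.shape[1] - 1)
    V = np.sqrt(chi2 / (T * M))
    C = np.sqrt(chi2 / (chi2 + T))

    p = table / T
    def H(q):
        q = q[q > 0]
        return -(q * np.log(q)).sum()
    U = 2 * (H(p.sum(axis=1)) + H(p.sum(axis=0)) - H(p.ravel())) \
          / (H(p.sum(axis=1)) + H(p.sum(axis=0)))
    return chi2, V, C, U

rng = np.random.default_rng(1)
map_a = rng.integers(1, 4, (50, 50)).ravel()  # invented 3-class map
map_b = rng.integers(1, 5, (50, 50)).ravel()  # invented 4-class map
print(association_indices(map_a, map_b))
```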
The above indices measure the degree of association within a map pair. The χ² index gives an absolute (unbounded) measure of association, of little use by itself; V and C rescale it to the range [0, 1], where values closer to 1 indicate a stronger association between the two maps. Together, the three indices provide a comprehensive measure of association and allow map pairs within a set of maps to be compared from different angles. In general, it can be noticed that the three indices behave very similarly, as expected. The joint information uncertainty score can also be used to confirm the pattern of association established by the previous indices, since it too varies between 0 (completely independent maps) and 1 (completely associated maps). Table 1 shows the indices calculated for the maps described above.
Table 1 Association indices between data layers
Although, strictly speaking, the calculated indices do not establish the conditional independence required by the Bayesian approach (a stronger property than mere lack of association), they suggest that the data layers may be treated as independent.
As the analysis points out, landslides show a certain association with lithology (only the joint information uncertainty score suggests no association with this theme) and a tendency toward spatial association with elevation/rainfall and land cover.
It should also be pointed out that, leaving causal interpretation aside, lithology is associated with elevation/rainfall and land cover, weakly associated with slope, and shows little or no association with the other themes. The inadequate DTM supplied for the project seems to be the main reason for this. Apart from the weak link between slope and rainfall/elevation, the other relationships can be neglected.
The local geological survey appears to have reached the same conclusion in its own analysis: the lithologic factor alone is used to compile its landslide hazard map, taking the landslide frequency of each lithologic unit as the hazard index.
In each run, only half of the known landslides (selected by random sampling) were used to generate the prediction map, while the remaining half was retained as an equally valid validation data set. In a first attempt to predict landslide hazard, all the potential causal factors were used; in a second experiment, only the three most relevant factors were used, as explained in the following sections.
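The calibration/validation split can be sketched as follows (the landslide inventory is invented):

```python
import numpy as np

rng = np.random.default_rng(42)
landslide_ids = np.arange(200)          # invented inventory of 200 landslides
rng.shuffle(landslide_ids)
half = landslide_ids.size // 2
calibration = landslide_ids[:half]      # used to estimate ppa_i and ps_F
validation = landslide_ids[half:]       # held out to compute prediction rates
```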
4 Results and discussion
The support functions were computed under the different modeling assumptions described below. The predictive power of each resulting map was tested through its prediction-rate curve. This curve is built by ranking and classifying the study area in order of decreasing support value, plotting the cumulative percentage of area (abscissa) against the cumulative percentage of landslide area contained in it (ordinate). The percentage of landslides predicted within the 20% of the area with the highest support values is said to give a good assessment of the model's predictive power. More broadly, the more the curve bends toward the vertical axis, the better the prediction; conversely, the closer the curve lies to the 45° line, the closer the combined factors come to a random distribution of support values, and the less useful the prediction. Among the causal factors, the hydrographic network was recognized as playing a lesser role, because it is mapped in much finer detail than the other factors: at this resolution, river incisions are "everywhere", so it is not meaningful to associate the landslide distribution with distance from the hydrographic network. The river system was therefore excluded from the causal factors.
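A sketch of the prediction-rate curve computation (the support map and landslide mask are invented):

```python
import numpy as np

def prediction_rate(support, landslide_mask):
    """Cumulative % of landslide area vs cumulative % of study area,
    scanning the map in order of decreasing support value."""
    order = np.argsort(support.ravel())[::-1]
    slides = landslide_mask.ravel()[order]
    frac_area = np.arange(1, slides.size + 1) / slides.size
    frac_slides = np.cumsum(slides) / slides.sum()
    return frac_area, frac_slides

rng = np.random.default_rng(7)
support = rng.random((100, 100))            # invented support map
landslides = rng.random((100, 100)) < 0.05  # invented landslide mask

x, y = prediction_rate(support, landslides)
k = np.searchsorted(x, 0.20)
print(f"{y[k]:.0%} of landslides fall in the top 20% of the area")
```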
Figure 3 shows the prediction-rate curves of each of the six causal factors considered one by one. Here the conditional frequencies $ppa_i$, i = 1, …, n (the conditional probability of a landslide event given class i), estimated as described above, are used for each class of each theme.
Fig. 3 Prediction-rate curves of the single causal factors, using whole landslide polygons and conditional frequencies
As a first step, the evidence data were taken from all the polygons mapping landslide activity. The landslides were split into two randomly sampled groups, one for calibration and the other for validation. The three most relevant themes (lithology, land cover and elevation/rainfall), selected with the indices described above, were used in the calculation. The prediction-rate curves are shown in Figure 4.
In addition, the calculation was repeated using all six factors; the prediction rates are shown in Figure 5.
We noticed that using whole landslide polygons may introduce a bias: with respect to the sampling of the causal factors, the trigger point of a landslide is not the same as the landslide front. Therefore, only the highest point of each landslide polygon was used for prediction; given the kinematics of mass movement, the trigger point should lie at the highest position. Under this assumption, the prediction rates calculated with the six causal factors are shown in Figure 6.
Fig. 4 Prediction-rate curves of the 7 predictors, using 3 causal factors (lithology, rainfall and land cover) and whole landslide polygons
Fig. 5 Prediction-rate curves of the 7 predictors, using all 6 causal factors and whole landslide polygons
Figures 7 and 8 show the prediction-rate curves of the 7 predictors using 3 and 6 causal factors, respectively, with trigger points as evidence.
As far as the relevance of the input data is concerned, more representative maps of slope, aspect and rainfall distribution (that is, a more accurate DTM, and rainfall data from more rain gauges in the region) would improve the results. Once new data are obtained, the analyst can re-evaluate their potential impact on the prediction.
From the comparison of the prediction rates, it can be concluded that:
there seems to be no obvious improvement when six causal factors are used instead of the three (lithology, land cover and rainfall) most associated with landslides. The predictions in the two cases are very similar, which argues, in the spirit of parsimony, against applying more factors than necessary;
a further simplification, using only the trigger points instead of whole landslide polygons, can also be justified: it does not worsen the overall predictive power of the maps. At the same time, it must be borne in mind that excessive simplification may reduce or even destroy the reliability of the maps.
Fig. 6 Prediction-rate curves using all 6 causal factors and landslide trigger points only
Fig. 7 Prediction-rate curves of the 7 predictors, using only 3 causal factors (lithology, land cover and rainfall) and landslide trigger points
Fig. 8 Prediction-rate curves of the 7 predictors, using all 6 causal factors and landslide trigger points
Lithology, in any case, clearly shows the highest predictive power in the prediction-rate plots of the single causal factors (which explains why the local geological survey chose this particular layer for its own hazard mapping), followed by land cover and rainfall. The other themes contribute little to the prediction.
In this case study, the predictions of the seven predictors are very similar, except for the Bayesian probability, which is very sensitive to the actual distribution of the data: when whole landslide polygons are taken as evidence, its prediction is almost random, as in some of the fuzzy OR and fuzzy sum cases. Overall, the certainty factor seems the most useful predictor for this specific case, although in each configuration several predictors yield practically the same prediction-rate curves and prediction maps.
Figure 9 shows the seven predictions obtained when the three most relevant thematic factors are used as evidence together with the trigger points. Of the configurations discussed in this case study, this one has a good prediction rate and arguably provides the soundest basis for zoning landslide hazard, representing the current state of knowledge.
Fig. 9 Prediction maps from the 7 predictors
5 Conclusions
The method discussed in this paper uses numerical models (with less subjective expert judgment) to classify land according to landslide hazard. The results seem to indicate that, when objective predictions can be extracted from a spatial database, its themes carry a "systematic" added value: using all the data together is better than using only some of the themes.
It must be emphasized that the method starts from the exploitation of the existing database and keeps the understanding of each theme open to improvement. Among the various candidate predictors (certainty factors, Bayesian probability, fuzzy operators and other possible techniques), a choice can only be made according to the predictive power of each technique, carefully assessed through the prediction-rate curves.
These analyses make clear that the existing database is imperfect; in particular, generating prediction models from the present terrain data alone is inappropriate. It is hoped that future investment in further surveying and data capture will secure a better digital terrain model. Whenever an improved causal factor map is produced, or a new causal factor is confirmed to be relevant to the phenomenon, the computation can be repeated and a new prediction map generated. The prediction rate can then be checked against the actual improvement, and can also point the way for further efforts in data collection and geotechnical monitoring. In this case study, for example, lithology, land cover and rainfall (described through elevation, as noted above) are clearly the factors most related to landslides, so the analysis so far has concentrated on the survey and mapping of these factors. In addition, a DTM with adequate resolution should be prepared and used in order to examine the influence of topographic data in more detail. The analysis also highlights other major conditions, such as the level of water bodies, which may become quite important in hazard mapping.