(1. Hunan Wanyuan Appraisal Consulting Co., Ltd., Changsha, 410011; 2. School of Resources and Environmental Sciences, Wuhan University, Wuhan, 430079)
This paper introduces and analyzes two non-classical mathematical methods: cloud theory and rough set theory. By comparing and combining these two methods, a land suitability evaluation model based on the combination of cloud theory and rough set theory is established, and a case study and application are carried out on this basis.
Keywords: rough set theory; cloud theory; data mining; land suitability evaluation
Land suitability evaluation assesses whether, and to what degree, land is suitable for a specific land use type. It is an important part of rational land use. By studying all land resources in a region, it provides a scientific basis for balancing population and land, arranging land layout, adjusting land use structure, and developing and utilizing land within the overall land use plan. Correctly evaluating land suitability and reasonably dividing suitability grades is therefore one of the primary tasks of planning and decision-making, and the choice of evaluation method is particularly important because it determines whether correct results can be obtained.
Traditional evaluation methods, such as the limiting condition method, regression analysis, the empirical index method and the analytic hierarchy process, are to some extent too simple to reflect the actual situation objectively and comprehensively. With the continuous development of intelligent technology, evaluation methods have moved from traditional simple numerical methods to intelligent ones. Because land suitability is inherently uncertain, data mining techniques are better suited to handling large amounts of uncertain data.
1 Characteristics of Rough Set Theory and Cloud Theory
Rough set theory is a mathematical tool for describing incompleteness and uncertainty. It can effectively analyze and handle imprecise, inconsistent and incomplete information, discover hidden knowledge and reveal underlying regularities. It can discover association rules from existing data and supports multi-step knowledge acquisition, including data preprocessing, data reduction, rule generation and the acquisition of data dependencies. Cloud theory is a model for converting between qualitative concepts and quantitative values, built on traditional fuzzy set theory and probability statistics. A qualitative concept is represented by three numerical characteristics: the expected value Ex, the entropy En and the hyper-entropy He. As a newer theory for handling uncertainty, it supports data discretization and rule reasoning, brings the method closer to human thinking, and lays a foundation for the further development of artificial intelligence.
Both cloud theory and rough set theory generalize classical set theory to handle uncertainty and imprecision, and both can describe imprecise and incomplete knowledge, but their starting points and emphases differ. Cloud theory combines fuzziness and randomness, whereas rough set theory describes uncertainty through upper and lower approximation sets. Rough set theory needs no additional information about the data and has unique advantages in deriving association rules. Cloud theory requires some additional information or prior knowledge about the data, but it provides a method for converting between qualitative and quantitative representations. Despite their different characteristics, the two theories are closely related and highly complementary in the study of uncertain data. Introducing cloud theory into the rough set method and improving the structured model of rough sets can improve both the efficiency of the discovery algorithm and the robustness of the system model. Land suitability is a qualitative concept. Building a land suitability evaluation model on both rough set theory and cloud theory lets each method compensate for the other's shortcomings, which makes a more objective land suitability evaluation possible.
2 Establishment of an Evaluation Model Based on Cloud Theory and Rough Set Theory
The combination uses the quantitative-qualitative conversion of cloud theory as the preprocessing step for the rough set method: quantitative data are converted into qualitative data, or qualitative data are converted into new qualitative data at a different conceptual level. The rough set method is then applied to discover classification decision knowledge. Finally, this knowledge is applied through the uncertainty reasoning of cloud theory, that is, quantitative or qualitative results are inferred from new quantitative or qualitative condition data, so that the uncertainty of knowledge and reasoning is expressed and propagated. In concrete terms, an initial decision table is first built from the original data, and each condition attribute is checked to see whether it is discrete. If it is not, it is discretized, until the whole decision table consists of discrete data and the final decision table is obtained. On the basis of this table, the rough set method is used to find association rules and to calculate attribute importance. Finally, the qualitative reasoning result is obtained with the reasoning method based on cloud theory. The whole model is shown in Figure 1.
Figure 1 Evaluation model diagram
The detailed process of rule reasoning based on cloud theory is shown in Figure 2.
2.1 Establishment of decision table
Collect data on the factors that affect land suitability, such as slope, soil texture, organic matter content and soil thickness. Sample and organize the original data, and build an information decision table according to the purpose of the land suitability evaluation (for example, suitability for forestry or grazing).
2.2 Data preprocessing
In many cases the information table to be processed is incomplete: some attribute values in the table are missing. Such a table can be handled by assigning each missing attribute value a special value that distinguishes it from all other attribute values.
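The special-value treatment of missing entries can be sketched as follows. This is a minimal illustration, not the authors' implementation: the marker `"*"`, the function name `fill_missing` and the sample values are assumptions, and only the attribute names Yjz, Hd and S_c come from the paper.

```python
# Hypothetical marker: any value that cannot occur in an attribute's domain works.
MISSING = "*"

def fill_missing(table, marker=MISSING):
    """Return a copy of the table in which every None is replaced by the marker."""
    return [
        {attr: (marker if value is None else value) for attr, value in row.items()}
        for row in table
    ]

# Illustrative rows only; the values are invented for the example.
raw_rows = [
    {"Yjz": 2.1, "Hd": 80, "S_c": 1},
    {"Yjz": None, "Hd": 60, "S_c": 2},
]
filled_rows = fill_missing(raw_rows)
```

Because the marker is treated as an ordinary attribute value, rows with missing entries remain comparable with all other rows in the later reduction steps.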
Figure 2 Cloud theoretical reasoning
2.3 Data discretization
A cloud model is used to simulate human thinking and partition the attribute space. Each attribute is regarded as a linguistic variable (or a combination of several linguistic variables). For each linguistic variable, several linguistic values are defined, and adjacent linguistic values are allowed to overlap. The clouds representing linguistic values can be specified interactively by the user. Let the clouds $A_1(Ex_1, En_1, He_1)$, $A_2(Ex_2, En_2, He_2)$, ..., $A_n(Ex_n, En_n, He_n)$ be given for a numerical attribute, and regard any attribute value $x$ as a linguistic item. Compute the membership degrees $\mu_1, \mu_2, \ldots, \mu_n$ of $x$ in $A_1, A_2, \ldots, A_n$, find the maximum membership $\mu_i$, and assign $x$ to $A_i$. If two membership degrees $\mu_i$ and $\mu_j$ both equal the maximum, $x$ is assigned at random to $A_i$ or $A_j$.
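The maximum-membership assignment can be sketched in a few lines. The sketch makes two assumptions that the paper does not state: membership is evaluated on the expected curve of a normal cloud, $\mu = \exp(-(x - Ex)^2 / (2 En^2))$, so He is ignored, and the three thickness clouds are invented for illustration.

```python
import math
import random

def membership(x, ex, en):
    """Expected membership of x in a normal cloud (He ignored, so the result is deterministic)."""
    return math.exp(-((x - ex) ** 2) / (2.0 * en ** 2))

def discretize(x, clouds, rng=random):
    """Assign x to the linguistic value with maximum membership; ties are broken at random."""
    degrees = {name: membership(x, ex, en) for name, (ex, en, he) in clouds.items()}
    best = max(degrees.values())
    candidates = [name for name, mu in degrees.items() if mu == best]
    return rng.choice(candidates)

# Hypothetical clouds for soil thickness (cm): name -> (Ex, En, He)
thickness_clouds = {
    "thin": (30.0, 10.0, 0.5),
    "medium": (60.0, 10.0, 0.5),
    "thick": (100.0, 15.0, 0.5),
}
label = discretize(55.0, thickness_clouds)
```

A value halfway between two clouds with equal entropy, such as 45.0 here, has the same membership in both and is assigned to either one at random, as the text prescribes.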
2.4 Decision table attribute reduction
Knowledge acquisition based on rough set theory uses an attribute reduction algorithm built on the discernibility matrix and discernibility function of the decision table to reduce the original decision table. The reduction includes both attribute reduction and attribute value reduction.
Let $S = \langle U, R, V, f \rangle$ be a decision table system, where $R = P \cup D$ is the attribute set, the subsets $P = \{a_i \mid i = 1, \ldots, m\}$ and $D = \{d\}$ are the condition attribute set and the decision attribute set respectively, and $U = \{x_1, x_2, \ldots, x_n\}$ is the universe. Let $c_D(i, j)$ denote the element in row $i$ and column $j$ of the discernibility matrix. The discernibility matrix $C_D$ is then defined as:

$$c_D(i, j) = \begin{cases} \{a_k \mid a_k \in P \wedge a_k(x_i) \neq a_k(x_j)\}, & d(x_i) \neq d(x_j) \\ 0, & d(x_i) = d(x_j) \end{cases}$$

where $i, j = 1, \ldots, n$.
Innovation of Land Information Technology and Development of Land Science and Technology: Proceedings of the 2006 Annual Conference of china land science Institution.
According to this definition, three cases arise. When the decision attributes of two samples (examples) take the same value, the corresponding element of the discernibility matrix is 0. When the decision attributes of two samples differ, the samples can be distinguished by the condition attributes on which their values differ, and the corresponding element is the set of those condition attributes, that is, the set of condition attributes that can tell the two samples apart. When two samples conflict, that is, all condition attributes have the same values but the decision attributes differ, the corresponding element of the discernibility matrix is the empty set.
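The definition and its three cases can be made concrete with a short sketch. The function name and the four-row decision table are illustrative assumptions; only the attribute names Hd, Zd and S_c are taken from the paper.

```python
def discernibility_matrix(rows, cond_attrs, dec_attr):
    """Build c_D(i, j): 0 for equal decisions, otherwise the set of differing condition attributes."""
    n = len(rows)
    matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if rows[i][dec_attr] != rows[j][dec_attr]:
                # An empty frozenset here marks a conflicting pair of samples.
                matrix[i][j] = frozenset(
                    a for a in cond_attrs if rows[i][a] != rows[j][a]
                )
    return matrix

# Illustrative decision table; the values are invented for the example.
sample_rows = [
    {"Hd": "thick", "Zd": "loam", "S_c": 1},
    {"Hd": "thin", "Zd": "loam", "S_c": 2},
    {"Hd": "thick", "Zd": "loam", "S_c": 2},
    {"Hd": "thin", "Zd": "clay", "S_c": 1},
]
matrix = discernibility_matrix(sample_rows, ["Hd", "Zd"], "S_c")
```

In this table, samples 0 and 1 are distinguished by Hd alone, samples 0 and 3 share a decision value and yield 0, and samples 0 and 2 conflict and yield the empty set.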
2.5 Calculating Attribute Weights
For the classification derived from attribute set $C$, the importance of an attribute subset $B' \subseteq B$ can be measured by the difference between two dependency degrees, namely:

$$r_B(C) - r_{B-B'}(C)$$

This shows how the positive region of the classification $U/C$ is affected when the attribute subset $B'$ is removed from the set $B$. Here

$$r_P(Q) = \frac{\mathrm{card}(POS_P(Q))}{\mathrm{card}(U)}$$

is a measure of knowledge dependency, where card denotes the cardinality of a set, and

$$POS_P(Q) = \bigcup_{X \in U/Q} \underline{P}(X)$$

is called the $P$-positive region of $Q$, with $\underline{P}(X)$ the lower approximation of $X$ with respect to $P$. It is the set of all objects of $U$ that can be classified with certainty into the classes of $U/Q$ using the knowledge expressed by the classification $U/P$.
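The dependency measure and the importance difference can be computed directly from a discrete decision table. The following is a minimal sketch under stated assumptions: the function names and the sample table are hypothetical, and the positive region is computed as the union of the equivalence classes of $P$ that fall entirely inside one class of $Q$.

```python
from collections import defaultdict

def partition(rows, attrs):
    """Group row indices into equivalence classes of the indiscernibility relation on attrs."""
    classes = defaultdict(set)
    for idx, row in enumerate(rows):
        classes[tuple(row[a] for a in attrs)].add(idx)
    return list(classes.values())

def positive_region(rows, p_attrs, q_attrs):
    """Union of the P-classes that are contained in a single Q-class."""
    q_classes = partition(rows, q_attrs)
    pos = set()
    for block in partition(rows, p_attrs):
        if any(block <= q for q in q_classes):
            pos |= block
    return pos

def dependency(rows, p_attrs, q_attrs):
    """r_P(Q) = card(POS_P(Q)) / card(U)."""
    return len(positive_region(rows, p_attrs, q_attrs)) / len(rows)

def significance(rows, b_attrs, removed, c_attrs):
    """r_B(C) - r_{B-B'}(C), where `removed` is the subset B'."""
    rest = [a for a in b_attrs if a not in removed]
    return dependency(rows, b_attrs, c_attrs) - dependency(rows, rest, c_attrs)

# Illustrative decision table; the values are invented for the example.
weight_rows = [
    {"Hd": "thick", "Zd": "loam", "S_c": 1},
    {"Hd": "thick", "Zd": "clay", "S_c": 2},
    {"Hd": "thin", "Zd": "loam", "S_c": 2},
    {"Hd": "thin", "Zd": "clay", "S_c": 2},
]
full = dependency(weight_rows, ["Hd", "Zd"], ["S_c"])
hd_weight = significance(weight_rows, ["Hd", "Zd"], ["Hd"], ["S_c"])
```

In this table both attributes together classify every sample with certainty, while removing Hd leaves only the two clay samples in the positive region, so the dependency drops by one half.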
2.6 Extraction of Minimal Decision Rules Based on Value Reduction
Decision rule extraction is based on value reduction of the decision table. Assume that the decision table has three condition attributes A, B and C and one decision attribute D. By reducing attribute values using the equivalence classes $[x]_A$, $[x]_B$, $[x]_C$ and $[x]_D$, the minimal decision rules are computed under the principle of rule minimization.
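One simple way to approximate value reduction is a greedy pass that drops a condition value whenever the shortened rule still agrees with every matching row of the table. The sketch below is a heuristic illustration rather than the algorithm used in the paper: the function name, the attribute order and the sample table are assumptions, and a greedy pass yields one reduced rule per row, not necessarily every minimal rule.

```python
def minimal_rule(rows, index, cond_attrs, dec_attr):
    """Greedily drop condition values from row `index` while the rule stays consistent."""
    target = rows[index]
    kept = list(cond_attrs)
    for attr in cond_attrs:
        trial = [a for a in kept if a != attr]
        # The shortened rule is consistent if every row it matches has the same decision.
        consistent = all(
            row[dec_attr] == target[dec_attr]
            for row in rows
            if all(row[a] == target[a] for a in trial)
        )
        if consistent:
            kept = trial
    return {a: target[a] for a in kept}, target[dec_attr]

# Illustrative table with condition attributes A, B, C and decision attribute D.
rule_rows = [
    {"A": 1, "B": 0, "C": 1, "D": "yes"},
    {"A": 1, "B": 1, "C": 0, "D": "yes"},
    {"A": 0, "B": 0, "C": 1, "D": "no"},
    {"A": 0, "B": 1, "C": 0, "D": "no"},
]
conditions, decision = minimal_rule(rule_rows, 0, ["A", "B", "C"], "D")
```

For the first row the values of B and C can both be dropped, leaving the rule "if A = 1 then D = yes".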
2.7 Rule Reasoning Based on Cloud Theory
Uncertainty reasoning based on cloud theory can be divided into single-rule and multi-rule reasoning according to the number of rules, and each rule is either a single-condition or a multi-condition rule according to the number of antecedents. Land suitability evaluation needs only a qualitative reasoning result, so the model is solved with the help of the calculated attribute importance. First, the rules matching an instance are activated, and the membership degrees of the cloud drops are obtained for each rule; the expected value of the virtual cloud gives the result. Finally, the qualitative result is selected according to the maximum membership degree.
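The activation and maximum-membership selection can be sketched as follows. Several choices here are assumptions rather than the paper's method: conditions within a rule are combined by product as a soft AND, the activation of a rule is averaged over a fixed number of cloud drops instead of being derived from a virtual cloud, attribute weights are left out, and the two rules and their cloud parameters are invented.

```python
import math
import random

def cloud_drop_membership(x, ex, en, he, rng):
    """One cloud drop: perturb the entropy with He, then evaluate the normal membership."""
    en_prime = rng.gauss(en, he)
    if en_prime == 0.0:
        return 1.0 if x == ex else 0.0
    return math.exp(-((x - ex) ** 2) / (2.0 * en_prime ** 2))

def activation(instance, rule, rng, drops=200):
    """Mean activation of a multi-condition rule over several cloud drops."""
    total = 0.0
    for _ in range(drops):
        degree = 1.0
        for attr, (ex, en, he) in rule["if"].items():
            degree *= cloud_drop_membership(instance[attr], ex, en, he, rng)
        total += degree
    return total / drops

def infer(instance, rules, seed=0):
    """Return the consequent of the rule with the highest mean activation."""
    rng = random.Random(seed)
    scored = [(activation(instance, rule, rng), rule["then"]) for rule in rules]
    return max(scored, key=lambda pair: pair[0])[1]

# Hypothetical rules: attribute -> (Ex, En, He), with a qualitative grade as consequent.
grade_rules = [
    {"if": {"Hd": (100.0, 15.0, 0.5), "Yjz": (3.0, 0.5, 0.05)}, "then": "highly suitable"},
    {"if": {"Hd": (30.0, 10.0, 0.5), "Yjz": (1.0, 0.5, 0.05)}, "then": "unsuitable"},
]
grade = infer({"Hd": 90.0, "Yjz": 2.8}, grade_rules)
```

Fixing the seed keeps the sketch reproducible; with a nonzero He the individual drops scatter, which is how the cloud model carries the randomness of a concept into the reasoning step.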
The land suitability evaluation system designed according to the above theory is shown in Figure 3. The menu provides the basic methods of the commonly used theories, the series of steps on the right implements the stages of building the mathematical model, and the coordinate panel in the middle displays the graphical results.
Figure 3 Evaluation System Interface
3 Application Example
Qionghai City is located in the east of Hainan Province. It borders the South China Sea to the east, Wenchang to the north, Tunchang to the west and Wanning County to the south. Qionghai City has superior natural conditions for agriculture and abundant tourism resources, but it also faces restrictive factors such as a weak industrial base, poor mineral resources, energy shortages, a low level of science and technology, and insufficient construction funds. The main task of the land suitability evaluation is to evaluate the suitability of all land within the evaluation scope on the basis of collected data such as soil, topography, water conservancy and climate, to identify land that is not suitable for its current use, and to give the grade of land suitable for each designated use.
3.1 Collecting and collating data
Collect all data relevant to land suitability evaluation in Qionghai City, comprising 5 condition attributes and 1 decision attribute, and divide them into 9311 examples according to the original units. Table 1 shows part of the example decision table.
Table 1 Example decision table
In the table, Yjz denotes soil organic matter content, Hd soil thickness, Zd the soil texture condition attribute, Sl the water conservancy condition attribute, and S_c the decision attribute for land suitable for aquaculture.
3.2 Data preprocessing
Because the initial data in this example contain no missing values, the initial decision table needs no preprocessing and this step can be omitted. The final decision table is therefore the same as Table 1.
3.3 Data discretization
For each attribute in the decision table, perform the following steps in turn to obtain discrete results.
3.3.1 Calculate the data distribution function of the attribute
The data distribution function $g_i(x)$ of attribute $i$ is calculated over every possible value in the domain of attribute $i$. Figure 4 shows the data distribution function of the attribute soil thickness (Hd).
Figure 4 Attribute data distribution
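The data distribution function of Section 3.3.1 can be sketched as a frequency count over the observed attribute values. Normalizing the counts to relative frequencies is an assumption, since the paper does not say whether $g_i(x)$ is a count or a frequency, and the thickness readings are invented for the example.

```python
from collections import Counter

def distribution(values):
    """Map each distinct attribute value to its relative frequency, in ascending order."""
    counts = Counter(values)
    total = len(values)
    return {v: counts[v] / total for v in sorted(counts)}

# Hypothetical soil thickness readings (cm)
hd_values = [30, 60, 60, 80, 60, 30, 100, 80]
g = distribution(hd_values)

# The value with the highest frequency is the first peak of the distribution.
peak = max(g, key=g.get)
```

The position of the peak is what the next step takes as the center of the first cloud to be fitted.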
3.3.2 Calculate the data distribution function of a single cloud model
Find the peak position of the data distribution function $g_i(x)$, take the attribute value at that position as the center of gravity of the cloud, and then calculate the cloud model that fits $g_i(x)$. The calculated cloud model function $f_i(x)$ is shown in Figure 5.
Figure 5 Cloud model distribution
The figure shows the cloud-based data distribution function (solid red line) fitted once the position of the second peak has been found. The parameters of the cloud model are:
3.4 Discretization
Once the concept clouds have been induced in the previous step, the membership degree of each attribute value to be discretized is calculated for every concept cloud in turn, and the concept with the maximum membership is taken as the discretization result. Table 2 shows part of the discretization results.
Table 2 Attribute Discrete Results
3.5 Attribute reduction
The Boolean discernibility function is derived, and the reduction result is calculated with a Boolean function minimization algorithm: the Boolean function is converted into a binary discernibility matrix, which is then simplified to obtain the reduction of the decision table, as shown in Table 3.
Table 3 Attribute Reduction Results
3.6 Calculating Attribute Weights
According to the influence of the condition attributes on the classification by the decision attribute, the importance and weight coefficient of each condition attribute for the decision result are calculated, as shown in Table 4. This measure is based on the examples in the universe and does not depend on human prior knowledge.
Table 4 Attribute Weight Results
3.7 Decision reasoning
Using the multi-condition, multi-rule reasoning method of cloud theory, reasoning is performed on the original data with the minimal rules, which gives the final grading result shown in Figure 6.
Figure 6 Grading results
4 Conclusion
When using the above model, one should first collect as many factors affecting land suitability as possible. After continuous data have been discretized with cloud theory, the evaluation factors can be screened by attribute importance, and the evaluation rules are then obtained with the rough set method. It should also be noted that in land suitability evaluation the grade for each land use must be determined separately, which differs from general rough set information processing, where several decision attributes are combined into one decision attribute set for a comprehensive decision.
The application results show that the cloud model absorbs the advantages of natural language, overcomes limitations of existing methods, and combines fuzziness and randomness to form a mapping between the qualitative and the quantitative in spatial data mining, and that the discovered knowledge is reliable. Rough set theory is good at dealing with fuzzy and incomplete knowledge, but it is weak at handling raw fuzzy data directly. The qualitative-quantitative conversion based on the cloud model is therefore well suited as preprocessing for the rough set method. Combining the two methods in land suitability evaluation brings together the advantages of both theories and is more conducive to solving practical problems of qualitative evaluation.