Geological Research of South China Sea (2014)
(1. Guangzhou Marine Geological Survey, Guangzhou 510760; 2. Key Laboratory of Seabed Mineral Resources, Ministry of Land and Resources, Guangzhou 510760)
About the first author: Liang Guang (1972-), male, engineer, mainly engaged in network management and data management. E-mail: one@hydz.cn.
Abstract: In recent years, resource exploration has covered most land areas, and more and more countries have turned their attention to the ocean. As a vast store of energy and resources, the ocean is increasingly important to the national economy and to military strategy. Countries are competing to formulate strategic plans for marine science and technology and are giving priority to the development of new marine technologies [1]. How to extract useful information effectively from massive marine geological survey data is an important topic in this research. To meet the application requirements of research on marine geological survey data, this paper introduces the regression analysis model into the marine geological survey database and describes in detail the method of regression analysis and its advantages when applied to that database, providing technical support for marine scientific research.
Key words: marine geology; regression analysis; database
1 Preface
With the consumption of land resources and the growing demand for energy, the ocean, as a store of energy and resources awaiting large-scale development, has attracted more and more attention from all countries. China is the largest developing country in the world, and its demand for energy is also rising sharply. In recent years, China's oil imports have increased dramatically, and it is estimated that by 2020 China's dependence on imported oil will reach 60%. Party and state leaders have repeatedly pointed out that "resources and energy, especially oil and gas resources, have become an important factor in China's economic and social development, and solving the energy reserve problem is a major matter for ensuring national economic security." With the progress of land resource surveys and marine geological surveys in China, a large amount of marine geological data has been collected and accumulated, and several information systems and data sources have been established to meet the needs of individual business units [2]. How to extract useful information effectively from massive marine geological survey data is an important topic in research on new marine technology. To meet the demand for methods of analyzing marine geological survey data, this paper introduces regression analysis into the marine geological survey database and describes in detail the method and its advantages in research on that database, so as to provide technical support for marine scientific research.
2 Overview of regression analysis
2.1 Overview
Regression analysis is a statistical method for determining the quantitative relationship between two or more variables. By the number of independent variables involved, it is divided into univariate and multivariate regression analysis; by the type of relationship between the independent and dependent variables, it is divided into linear and nonlinear regression analysis. If the analysis involves only one independent variable and one dependent variable, and their relationship can be approximated by a straight line, it is called univariate linear regression analysis. If it involves two or more independent variables, and the dependent variable is linearly related to them, it is called multivariate linear regression analysis [3]. The regression prediction method analyzes the trends of the phenomena related to the predicted object, establishes a regression model that relates the predicted object ($Y$) to multiple factors $X_1, X_2, \ldots, X_k$, and uses the model to estimate the future state and quantitative behavior of the predicted object. Whether the resulting model is reasonable, whether it conforms to the objective relationship between the variables, whether the factors introduced are effective, whether the variables are linearly correlated, and whether the model can be put into use must all be determined by testing. This paper uses two kinds of test. The first is the practical significance test, that is, whether the theoretical expectation is consistent with the actual result. The second is the statistical test: the goodness-of-fit test ($R^2$ test), the equation significance test (F test) and the variable significance test (t test) [4]. This paper mainly introduces the application of linear regression analysis to the marine geological survey database.
2.2 Linear regression analysis model
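Linear regression analysis describes the regression relationship between two variables. The model is:
$$y_i = a + b x_i + \varepsilon_i, \quad i = 1, 2, \ldots, n$$
where $a$ and $b$ are parameters and $\varepsilon_i$ is the error. We define $Q(a, b)$ as the total squared error. Then:
$$Q(a, b) = \sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} (y_i - a - b x_i)^2$$
Taking the partial derivatives of $Q$ with respect to $a$ and $b$ and setting them to zero gives:
$$\frac{\partial Q}{\partial a} = -2 \sum_{i=1}^{n} (y_i - a - b x_i) = 0, \qquad \frac{\partial Q}{\partial b} = -2 \sum_{i=1}^{n} x_i (y_i - a - b x_i) = 0$$
$$b = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad a = \bar{y} - b \bar{x}$$
where $\bar{x}$ is the mean of $x$ and $\bar{y}$ is the mean of $y$.
The coefficient of determination $R^2$ (the squared correlation coefficient) is evaluated as [5]:
$$R^2 = \frac{\left[ \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) \right]^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}$$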
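As an illustration of the univariate least-squares fit, the following minimal Python sketch estimates the slope as the ratio of the centered cross-products to the centered sum of squares of x, takes the intercept from the two means, and reports R² as one minus the residual share of the total variation. The function name `fit_line` and the sample values are assumptions made for illustration only; they are not data from the survey database.

```python
from statistics import fmean


def fit_line(x, y):
    """Least-squares estimates (a, b) for y = a + b*x, plus R^2."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar, y_bar = fmean(x), fmean(y)
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("x and y must each contain at least two distinct values")
    b = s_xy / s_xx            # slope
    a = y_bar - b * x_bar      # intercept
    s_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    r2 = 1.0 - s_res / s_yy    # share of the variation explained by the line
    return a, b, r2


# Made-up illustration values (hypothetical, not survey measurements).
depth = [100.0, 200.0, 300.0, 400.0, 500.0]
temperature = [17.5, 17.2, 17.3, 16.9, 16.8]
a, b, r2 = fit_line(depth, temperature)
print(f"a = {a:.4f}, b = {b:.5f}, R^2 = {r2:.4f}")
```

The sketch uses only the standard library so that the arithmetic stays visible; a statistics package would normally be used on real survey data.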
2.3 Multivariate linear regression analysis model
The research object $y$ is influenced by many factors $x_1, x_2, \ldots, x_k$. Assuming that the relationship between each influencing factor and $y$ is linear, a multiple linear regression model can be established:
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon$$
where $x_1, x_2, \ldots, x_k$ are the influencing factors, $\beta_0, \beta_1, \ldots, \beta_k$ are the regression coefficients, $\varepsilon$ is a random error, and $y$ is the research object, that is, the prediction target [3].
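One common way to estimate the coefficients of a multiple linear regression model is to solve the least-squares normal equations (X^T X) beta = X^T y. The sketch below does this with a small Gaussian elimination routine written in plain Python. The function names `solve_linear_system` and `fit_multiple`, the pivot tolerance, and the noise-free sample data are assumptions for illustration; the source does not state which solver was used.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:  # crude tolerance, assumed for this sketch
            raise ValueError("singular system: explanatory variables are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for j in range(col, n + 1):
                m[row][j] -= factor * m[col][j]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_multiple(rows, y):
    """Estimate [beta_0, beta_1, ..., beta_k] from the normal equations."""
    design = [[1.0] + list(r) for r in rows]  # leading 1 carries the intercept
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Made-up data generated exactly from y = 1 + 2*x1 - 3*x2 (no noise).
rows = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
y = [1.0 + 2.0 * x1 - 3.0 * x2 for x1, x2 in rows]
beta = fit_multiple(rows, y)
print([round(v, 6) for v in beta])
```

Because the sample data contain no noise, the estimated coefficients should reproduce the generating values up to rounding. Solving the normal equations directly is adequate for a small, well-conditioned example; for strongly correlated explanatory variables a more stable decomposition would be preferable.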
2.4 Statistical test
A statistical test uses mathematical statistics to check the reliability of the regression equation and of the estimated model parameters. It mainly comprises the goodness-of-fit test, the equation significance test and the variable significance test, for which the $R^2$ test, the F test and the t test are commonly used.
2.4.1 Goodness-of-fit test ($R^2$ test):
The goodness-of-fit test checks how well the regression equation fits the observed sample values. It is also known as the multiple correlation coefficient test, and it is built on a decomposition of the total variation (total deviation):
$$S_{\text{total}} = S_{\text{reg}} + S_{\text{res}}$$
where
$$S_{\text{total}} = \sum_{i=1}^{n} (y_i - \bar{y})^2, \qquad S_{\text{reg}} = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2, \qquad S_{\text{res}} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$
and $\hat{y}_i$ is the value fitted by the regression equation. The total sum of squares $S_{\text{total}}$ is the sum of the squared differences between the observed values and the sample mean, and it reflects the variation among all the data. The residual sum of squares $S_{\text{res}}$ is the part of the total sum of squares that is not explained by the regression equation; it is caused by all the factors that are not included among the explanatory variables $x_1, x_2, \ldots, x_k$ but still influence the explained variable $y$. The regression sum of squares $S_{\text{reg}}$ is the part of the total sum of squares that is explained by the regression equation. A good regression model should fit the observed sample values well, so the smaller the share of $S_{\text{res}}$ in $S_{\text{total}}$, the better. The goodness of fit can therefore be measured by:
$$R^2 = \frac{S_{\text{reg}}}{S_{\text{total}}} = 1 - \frac{S_{\text{res}}}{S_{\text{total}}}$$
so that a value closer to 1 indicates a better fit [4].
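The decomposition of the total variation into a regression part and a residual part can be checked numerically. The following minimal Python sketch fits a least-squares line, computes the three sums of squares, and confirms that the regression and residual parts add up to the total. The function name `sums_of_squares` and the sample values are hypothetical and serve only as an illustration.

```python
from statistics import fmean


def sums_of_squares(x, y):
    """Return (S_total, S_reg, S_res) for the least-squares line through (x, y)."""
    x_bar, y_bar = fmean(x), fmean(y)
    b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
    a = y_bar - b * x_bar
    fitted = [a + b * xi for xi in x]
    s_total = sum((yi - y_bar) ** 2 for yi in y)               # all variation
    s_reg = sum((fi - y_bar) ** 2 for fi in fitted)            # explained part
    s_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # unexplained part
    return s_total, s_reg, s_res


# Made-up illustration values (hypothetical, not survey measurements).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
s_total, s_reg, s_res = sums_of_squares(x, y)
r2 = s_reg / s_total
print(f"S_total = {s_total:.4f}, S_reg + S_res = {s_reg + s_res:.4f}, R^2 = {r2:.4f}")
```

The identity holds only when the fitted values come from a least-squares fit that includes an intercept, which is the case in this sketch.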
2.4.2 Equation significance test (F test):
For the multivariate linear regression equation, the significance test of the equation infers whether the linear relationship in the population is significant, that is, whether the linear relationship between the explained variable $y$ and all the explanatory variables $x_1, x_2, \ldots, x_k$ taken together is significant. The null hypothesis is $H_0: \beta_1 = \beta_2 = \cdots = \beta_k = 0$, and the test statistic is:
$$F = \frac{S_{\text{reg}} / k}{S_{\text{res}} / (n - k - 1)} \sim F(k, n - k - 1)$$
where $S_{\text{reg}}$ is the regression sum of squares, $S_{\text{res}}$ is the residual sum of squares and $n$ is the number of samples. That is, under $H_0$ the F statistic follows the F distribution with $(k, n - k - 1)$ degrees of freedom. The statistic $F$ is first calculated from the observed and fitted sample values. If $F > F_{\alpha}(k, n - k - 1)$ at the given significance level $\alpha$, $H_0$ is rejected and the linear relationship between the explained variable $y$ and the explanatory variables $x_1, x_2, \ldots, x_k$ is judged to be significant; otherwise it is not significant [4].
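A small Python sketch of the F test follows. It computes the F statistic from R², the sample size n and the number of explanatory variables k, using the fact that the ratio of the regression and residual sums of squares equals R² / (1 - R²), and compares the result with a critical value taken from a standard F table. The function names, the input values (R² = 0.5, n = 12, k = 1) and the use of a table value instead of a computed quantile are assumptions for illustration.

```python
def f_statistic(r2, n, k):
    """F = (R^2 / k) / ((1 - R^2) / (n - k - 1))."""
    if not 0.0 <= r2 < 1.0:
        raise ValueError("R^2 must lie in [0, 1)")
    if k < 1 or n - k - 1 <= 0:
        raise ValueError("need k >= 1 and n > k + 1")
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


def rejects_h0(f_value, f_critical):
    """H0 (all slope coefficients are zero) is rejected when F exceeds the critical value."""
    return f_value > f_critical


# Hypothetical inputs: one explanatory variable, 12 samples, R^2 = 0.5.
f_value = f_statistic(0.5, n=12, k=1)
# Critical value F_0.05(1, 10) = 4.96, taken from a standard F table.
print(f_value, rejects_h0(f_value, 4.96))
```

With a statistics package available, the critical value or the p-value would normally be computed from the F distribution rather than read from a table.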
2.4.3 Variable significance test (t test):
For the multiple regression model, the significance of the equation as a whole does not mean that every explanatory variable has an important influence on the explained variable $y$. If an explanatory variable is not important, it should be removed from the equation so that a simpler equation can be established. It is therefore necessary to test the significance of each explanatory variable, with the null hypothesis $H_0: \beta_i = 0$ and the statistic $t_i = \hat{\beta}_i / s_{\hat{\beta}_i}$, the estimated coefficient divided by its standard error.
At a given significance level $\alpha$, if $|t_i| > t_{\alpha/2}(n - k - 1)$, $H_0$ is rejected, which shows that the explanatory variable $x_i$ has a significant influence on the explained variable $y$, that is, $x_i$ is a main factor affecting $y$. On the other hand, if $H_0$ is not rejected, the explanatory variable $x_i$ has no significant influence on the explained variable $y$, and this factor should be removed [4].
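For the case of a single explanatory variable (k = 1), the t statistic of the slope can be sketched in a few lines of Python: the slope estimate is divided by its standard error, which is derived from the residual sum of squares with n - 2 degrees of freedom. The function name `slope_t_statistic`, the sample values and the table value used as the critical threshold are assumptions for illustration.

```python
from math import sqrt
from statistics import fmean


def slope_t_statistic(x, y):
    """t statistic for H0: slope = 0 in the model y = a + b*x."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least 3 paired observations")
    x_bar, y_bar = fmean(x), fmean(y)
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    a = y_bar - b * x_bar
    s_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    if s_res == 0:
        raise ValueError("perfect fit: the standard error is zero")
    std_err = sqrt(s_res / (n - 2) / s_xx)  # standard error of the slope
    return b / std_err


# Made-up illustration values (hypothetical, not survey measurements).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
t_value = slope_t_statistic(x, y)
# Two-sided critical value t_0.025(3) = 3.182, taken from a standard t table.
print(t_value, abs(t_value) > 3.182)
```

With only one explanatory variable, the square of this t statistic equals the F statistic of the equation, so the two tests lead to the same decision.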
3 Application example
In this paper, the linear regression analysis model is used to analyze the relationship between water depth and the temperature of marine sediments in the South China Sea. The scatter plot is shown in Figure 1, and the regression analysis results are shown in Table 1.
Figure 1 Scatter plot of water depth and sediment temperature
Table 1 Results of regression analysis of water depth and sediment temperature
The regression results read as follows:
Intercept: a = 17.56; slope: b = -0.0014; correlation coefficient: r = 0.276; coefficient of determination: $R^2$ = 0.076; F value: F = 89.54.
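Establish the regression model and test the results.
The model is:
$$\hat{y} = 17.56 - 0.0014x$$
where $x$ is the water depth and $\hat{y}$ is the fitted sediment temperature.
The formula and result of the F value are as follows:
$$F = \frac{S_{\text{reg}} / k}{S_{\text{res}} / (n - k - 1)} = 89.54, \qquad k = 1$$
with $S_{\text{reg}}$ the regression sum of squares and $S_{\text{res}}$ the residual sum of squares.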
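The corresponding significance is p < 0.0001. The regression results show that sediment temperature is significantly related to water depth, and the negative slope and the scatter plot indicate that sediment temperature tends to decrease as water depth increases. However, the low coefficient of determination ($R^2$ = 0.076) shows that water depth explains only a small part of the variation: sediment temperature is also affected by other factors such as submarine heat flow and ocean circulation.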
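To show how the reported intercept and slope would be used for prediction, the following minimal Python sketch evaluates the fitted line at a few depths and checks that the squared correlation coefficient agrees with the reported coefficient of determination. The function name `predict_temperature` and the depths are hypothetical; units are assumed to be those of the original data table, which is not reproduced here.

```python
INTERCEPT = 17.56   # a, as reported in the regression results
SLOPE = -0.0014     # b, as reported in the regression results


def predict_temperature(depth):
    """Fitted sediment temperature at a given water depth."""
    return INTERCEPT + SLOPE * depth


# Hypothetical depths chosen only to illustrate the calculation.
for depth in (500.0, 1000.0, 2000.0):
    print(depth, round(predict_temperature(depth), 2))

# Consistency check on the reported statistics: r squared should match R^2.
r = 0.276
print(round(r ** 2, 3))
```

Because the coefficient of determination is low, such predictions describe only the average trend with depth, and individual stations can deviate considerably from the fitted line.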
4 Conclusion
This paper introduces the application of regression analysis to marine geological survey research and presents its technical principles and implementation. In the application to the relationship between sediment temperature and water depth in the South China Sea, the regression results show a statistically significant but non-deterministic relationship between the two. The results have been applied effectively in practice.
References
[1] Shan, Mao Yongqiang. 2005. Definition and transformation of coordinate system in GIS [J]. Heilongjiang Land and Resources, 11, 38-39.
[2] Su, et al. Key problems and solutions in marine geological data integration [J]. Frontier of Marine Geology, 11 (27): 51.
[3] Baidu Encyclopedia. Regression analysis.215/regression.pdf
Marine Geological Survey Based on Regression Analysis
Liang Guang 1,2, Shao Changgao 1,2
(1. Guangzhou Marine Geological Survey, Guangzhou, 510760; 2. State Key Laboratory of Marine Mineral Resources, Guangzhou, 510760)
Abstract: A new round of resource surveys has covered most land areas. People are therefore paying more and more attention to marine resources, because the ocean is a huge pool of resources and energy with far-reaching significance for the national economy and military strategy. Competition for energy has prompted many countries to launch new technology projects and to take new marine technology as a main research field. How to extract useful information from marine geological survey data is one of the most important research topics. This paper addresses the shortage of analysis techniques for marine databases and introduces the regression analysis model and its application advantages, with the aim of providing technical support for marine research.
Key words: marine geology; regression analysis model; database