When principal component analysis is used to extract factors, at most as many factors as there are measurement items can be obtained. If all of them are retained, the purpose of dimensionality reduction is not achieved. However, we know the size of each factor, so the small ones can be discarded. How small must a factor be before it is dropped? In behavioral research, two rules are commonly used: the eigenvalue-greater-than-1 rule (an eigenvalue is also called a characteristic root) and the scree plot method (sometimes translated as the gravel slope method).
Because the information carried by a factor is represented by its eigenvalue (characteristic root), we have the eigenvalue-greater-than-1 rule: keep a factor if its eigenvalue is greater than 1, and discard it otherwise. The rule is simple and easy to use, but it is only an empirical rule with no clear statistical test behind it. Unfortunately, in practice the statistical tests are no more effective than this rule of thumb (Gorsuch, 1983), so it remains the most commonly used rule. Experience shows that it is not always correct: it can overestimate or underestimate the actual number of factors. It works best with 20-40 measurement items, 3-5 items per theoretical factor, and a large sample (≥ 100).
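The eigenvalue-greater-than-1 rule can be sketched in a few lines of Python with NumPy. This is only an illustration under stated assumptions: the data are simulated (six hypothetical items driven by two latent factors), the eigenvalues are taken from the correlation matrix, and the function name kaiser_retained is made up for this example.

```python
import numpy as np

def kaiser_retained(data):
    """Return the eigenvalues of the correlation matrix (descending)
    and the number of them that are greater than 1."""
    corr = np.corrcoef(data, rowvar=False)      # items are columns
    eigvals = np.linalg.eigvalsh(corr)[::-1]    # eigvalsh returns ascending order
    n_keep = int(np.sum(eigvals > 1.0))
    return eigvals, n_keep

# Simulated data (an assumption for illustration, not survey data):
# three items follow latent factor f1, three follow latent factor f2.
rng = np.random.default_rng(0)
n_obs = 500
f1 = rng.normal(size=(n_obs, 1))
f2 = rng.normal(size=(n_obs, 1))
items = np.hstack([
    f1 + 0.5 * rng.normal(size=(n_obs, 3)),
    f2 + 0.5 * rng.normal(size=(n_obs, 3)),
])

eigvals, n_keep = kaiser_retained(items)
print("eigenvalues:", np.round(eigvals, 2))
print("factors retained:", n_keep)
```

Because the eigenvalues of a correlation matrix sum to the number of items, an eigenvalue of 1 corresponds to the information in one average item, which is the intuition behind the cutoff.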
The scree plot (gravel slope) method is graphical. With the factor order on the X axis and the eigenvalue (characteristic root) on the Y axis, plot how the eigenvalue changes from factor to factor; the eigenvalues show a downward trend. The head of the curve drops rapidly, while the tail flattens out. Draw a regression line through the tail. Points far above the regression line represent the main factors, and points lying close to the line on either side represent secondary factors. However, the scree plot method often overestimates the number of factors. It is less reliable than the first rule, so it is generally not used in practical research.
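The regression-line reading of the scree plot can be approximated numerically. The sketch below is one possible interpretation, not a standard procedure: the number of tail points used for the fit (tail) and the distance that counts as "far above" the line (tol) are both assumptions, and the eigenvalues are hypothetical numbers chosen for illustration.

```python
import numpy as np

def scree_count(eigvals, tail=3, tol=0.5):
    """Count the leading eigenvalues that lie clearly above a straight
    line fitted to the last `tail` eigenvalues.
    `tail` and `tol` are illustrative assumptions."""
    eigvals = np.sort(np.asarray(eigvals, dtype=float))[::-1]
    order = np.arange(1, len(eigvals) + 1)           # factor order on the X axis
    slope, intercept = np.polyfit(order[-tail:], eigvals[-tail:], 1)
    gap = eigvals - (slope * order + intercept)      # height above the tail line
    n_main = 0
    for is_above in gap > tol:
        if not is_above:
            break                                    # stop at the first point near the line
        n_main += 1
    return n_main

# Hypothetical eigenvalues for six items (they sum to 6).
example_eigvals = [3.1, 1.9, 0.5, 0.3, 0.15, 0.05]
print(scree_count(example_eigvals))
```

The result depends on tail and tol, which mirrors the subjectivity of reading the plot by eye.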
After discarding small factors and retaining large ones, the dimension is reduced. When analyzing social survey data, researchers need not only to combine related questions into factors and retain the large factors, but often also to examine the relationship between factors and measurement items, to ensure that each principal factor (principal component) corresponds to a meaningful group of items. To show this relationship more clearly, the factors are rotated; the most common method is VARIMAX rotation. After rotation, a measurement item is acceptable if it is highly correlated with its corresponding factor (> 0.5). If its correlation with a non-corresponding factor is too high (> 0.4), it is unacceptable, and the item may need to be modified or eliminated.
The process of obtaining factors through principal component analysis and measuring the relationship between items and factors through factor rotation analysis is usually called exploratory factor analysis.
Once the exploratory factor analysis is accepted, researchers can go on to test the relationships among these factors, for example with structural equation analysis for hypothesis testing.
1. Statement of the problem
Principal component analysis (PCA) is a dimensionality reduction method that makes problems easier to analyze and has been widely used in many fields. However, some textbooks and papers apply it with errors and shortcomings and therefore fail to solve the practical problem. For example, in some teaching materials on multivariate statistical analysis, principal component analysis of the covariance matrix has the following errors and shortcomings: ① It is not made clear whether the condition for dimensionality reduction holds for the data. ② The sum of squares of the principal component coefficients is not 1. ③ It is not clearly judged whether the data are suitable for principal component analysis on its own. ④ The selected principal components do not represent the original variables. The following addresses these problems in turn, based on the relevant theory and results, and gives corresponding suggestions.
In behavioral and psychological research, it is often necessary to analyze the behavior characteristics of people with a certain identity, such as the daily behavior characteristics of primary and secondary school students in this case, so as to guide primary school students toward more positive behavior and attitudes according to these characteristics. See Table 1 for the data from reference [1] used here; the data come from the survey results of a research group, which investigated the daily behavior of 480 students in grades 5-6 of a primary school in northern China.
Eleven indicators were investigated, as follows: S1 ~ response to teachers' questions, S2 ~ concern for class affairs, S3 ~ performance in self-study class, S4 ~ attitude towards homework, S5 ~ concern for classmates, S6 ~ attitude towards labor, S7 ~ right.
Similarities and differences between principal component analysis and analytic hierarchy process
1. Index screening principle based on correlation analysis
The correlation coefficient between two indicators reflects how strongly they are related [1]. The larger the coefficient, the more the information carried by the two indicators overlaps [1]. To keep the evaluation index system concise and effective, duplication of information among indicators should be avoided [1]. By calculating the correlation coefficients between evaluation indicators within the same criterion layer and deleting indicators with large correlation coefficients, the duplication of information is avoided [2]. Correlation analysis thus makes the index system concise and effective [2].
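A minimal sketch of correlation-based screening within one criterion layer follows. The cutoff of 0.9, the rule of keeping the earlier indicator of a highly correlated pair, the indicator names X1 to X3, and the simulated data are all assumptions for illustration; the cited works may choose which indicator to delete on other grounds.

```python
import numpy as np

def screen_by_correlation(data, names, threshold=0.9):
    """Keep an indicator only if its absolute correlation with every
    indicator already kept is at most `threshold` (an assumed cutoff)."""
    corr = np.corrcoef(data, rowvar=False)
    kept = []
    for j in range(len(names)):
        if all(abs(corr[j, i]) <= threshold for i in kept):
            kept.append(j)
    return [names[i] for i in kept]

# Simulated indicators: X2 nearly duplicates X1, X3 is unrelated.
rng = np.random.default_rng(2)
x1 = rng.normal(size=300)
x2 = x1 + 0.1 * rng.normal(size=300)
x3 = rng.normal(size=300)
layer = np.column_stack([x1, x2, x3])

print(screen_by_correlation(layer, ["X1", "X2", "X3"]))
```

The function is meant to be applied to the indicators of one criterion layer at a time, so that indicators from different layers are never compared with each other.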
2. The principle of index screening based on principal component analysis
(1) Factor loading principle
Through principal component analysis of the remaining indicators, the factor loading of each indicator is obtained. The absolute value of a factor loading is at most 1, and the closer it is to 1, the more important the indicator is to the evaluation result [3].
(2) The principle of index selection based on principal component analysis.
Factor loading reflects the degree to which an indicator influences the evaluation result. The larger the absolute value of the loading, the more important the indicator and the more it should be retained; the smaller it is, the more it should be deleted. Principal component analysis is applied to the indicators that passed the correlation screening to obtain each indicator's factor loading, and indicators with small loadings are deleted, which ensures that important indicators are selected [2].
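Loading-based screening can be sketched as follows. The text does not say which component's loadings are used or what counts as a small loading, so this example assumes the first principal component of the correlation matrix and a hypothetical cutoff of 0.5; the indicator names and data are simulated.

```python
import numpy as np

def first_component_loadings(data):
    """Absolute loadings of each indicator on the first principal component
    of the correlation matrix (using the first component is an assumption)."""
    corr = np.corrcoef(data, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)          # ascending order
    return np.abs(eigvecs[:, -1] * np.sqrt(eigvals[-1]))

def screen_by_loading(data, names, min_loading=0.5):
    """Keep indicators whose absolute loading is at least `min_loading` (assumed cutoff)."""
    loadings = first_component_loadings(data)
    kept = [name for name, value in zip(names, loadings) if value >= min_loading]
    return kept, loadings

# Simulated indicators: X1-X3 share a common driver, X4 is pure noise.
rng = np.random.default_rng(3)
common = rng.normal(size=(300, 1))
related = common + 0.5 * rng.normal(size=(300, 3))
noise = rng.normal(size=(300, 1))
indicators = np.hstack([related, noise])

kept, loadings = screen_by_loading(indicators, ["X1", "X2", "X3", "X4"])
print(np.round(loadings, 2))
print(kept)
```

Because the loadings are computed from the correlation matrix, each one is a correlation between an indicator and the component, so its absolute value cannot exceed 1.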
3. Similarity between correlation analysis and principal component analysis
First, index screening based on correlation analysis and index screening based on principal component analysis are both carried out within a criterion layer, not across criterion layers. The reason is that dividing the indicators into different criterion layers lets the evaluation reflect different aspects of the object being evaluated, and avoids mistakenly deleting important indicators that carry different information [2].
Second, both approaches aim to select a small number of representative indicators [2].
4. The difference between correlation analysis and principal component analysis
First, the purposes of the two kinds of screening differ: screening based on correlation analysis deletes evaluation indicators that carry redundant information, while screening based on principal component analysis deletes evaluation indicators that have little influence on the evaluation result [2].
Second, the functions of the two kinds of screening differ: screening based on correlation analysis ensures that the selected evaluation index system is concise and clear, while screening based on principal component analysis ensures that the important indicators are selected [2].
[1] Cao Tingting, Zhang Kun, Chi Guotai. Construction of evaluation index system of people's all-round development based on related principal component analysis [J]. Systems Engineering Theory and Practice, 2013, 32(1): 112-119.
[2] Li Hongxi. Research on Port Logistics Evaluation Based on Correlation Principal Component Analysis [D]. Dalian, Liaoning: Dalian University of Technology, 2013.
[3] Sun Hui, Li Yuanyuan, Zhang Nana. Empirical study on the competitiveness of coal industry based on principal component analysis [J]. Resources and Industries, 2012, 14(1): 145-149.