Pearson linear correlation analysis is commonly used to quantify the direction and strength of the linear relationship between two quantitative variables. It applies only to two quantitative variables, both of which should be normally distributed and both of which must be random variables rather than artificially controlled ones (for example, if different mice are given different doses, the dose is set by the experimenter and is not a random variable). See the end of this article for other precautions. Taking the test scores of a group of students as an example, we analyze whether there is a linear relationship between students' History scores and Comprehensive scores, and how strong it is.
Tools/materials:
SPSS (statistical software)
Method/steps
01
Before running the Pearson linear correlation analysis, plot the History score against the Comprehensive score in a scatter plot to check whether the data are suitable for it. Click Graphs > Chart Builder, and then click OK in the dialog box that pops up. (If the dialog box shown in the figure does not pop up, skip it and go directly to the next step.)
02
In the Chart Builder, select Scatter/Dot and then the simple scatter plot; then drag "History" and "Comprehensive" from the variable list on the left onto the X axis and Y axis (either variable can go on either axis), and click OK.
03
The result is shown in the following figure. The cloud of points is roughly elliptical and follows a linear trend, which suggests that a linear correlation analysis is appropriate. This is only a simple preliminary judgment.
04
Return to the Data View and click Analyze > Correlate > Bivariate.
05
In the dialog box that pops up, move "History" and "Comprehensive" into the Variables box on the right, make sure "Pearson" is ticked as the correlation coefficient below, and click "OK" to output the result.
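For readers who want to see what this step computes, here is a minimal sketch of the Pearson coefficient written out from its definition in plain Python. The helper name `pearson_r` and the two score lists are made up for illustration; they are not the data set used in this article, and this is not a description of how SPSS implements the calculation internally.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations, and sums of squared deviations.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scores for illustration only (NOT the article's data).
history = [62, 70, 75, 81, 88, 93]
comprehensive = [60, 72, 71, 85, 84, 95]

r = pearson_r(history, comprehensive)
print(f"r = {r:.3f}")
```

The sign of `r` gives the direction of the relationship and its absolute value gives the strength. The sketch does not compute the significance level that SPSS reports next to the coefficient.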
06
In the results, the correlation coefficient between "History" and "Comprehensive" is 0.841, that is, |r| = 0.841. The coefficient is marked with two asterisks, and the "**" footnote at the lower left of the table states that the correlation is significant at the 0.01 level, so the correlation between "History" and "Comprehensive" is statistically significant. As a rule of thumb, |r| between 0.8 and 1.0 is a very strong correlation; 0.6 to 0.8 is a strong correlation; 0.4 to 0.6 is a moderate correlation; 0.2 to 0.4 is a weak correlation; and 0.0 to 0.2 is a very weak correlation or no correlation. How to report the result in a paper is shown in the figure.
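The rule-of-thumb bands above can be written as a small lookup, sketched here in Python. The function name `correlation_strength` and the label wording are hypothetical. The text does not say which band a value exactly on a cut-off (for example 0.8) belongs to, so assigning it to the stronger band is an assumption of this sketch.

```python
def correlation_strength(r):
    """Map a correlation coefficient to the rule-of-thumb bands in the text."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie in [-1, 1]")
    size = abs(r)  # strength depends on |r|; the sign only gives direction
    # Assumption: a value exactly on a cut-off goes to the stronger band.
    if size >= 0.8:
        return "very strong"
    if size >= 0.6:
        return "strong"
    if size >= 0.4:
        return "moderate"
    if size >= 0.2:
        return "weak"
    return "very weak or none"


print(correlation_strength(0.841))  # very strong
```

These labels describe only the size of the coefficient. Whether the correlation is statistically significant is a separate question, answered by the significance level in the output table.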
07
Note 1: Drawing a scatter plot gives only a rough judgment. If the scatter plot is not elliptical, the final result will probably show a low correlation coefficient or P > 0.05, meaning that the linear correlation between the two variables is very weak or absent.
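A small sketch of why the shape of the scatter matters: in the made-up data below, y is completely determined by x, but the relationship is curved rather than linear, so the Pearson coefficient comes out as zero. The helper `pearson_r` and the data are hypothetical and only illustrate the point.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient (plain-Python helper for this sketch)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Made-up data: a U-shaped (quadratic) relationship, symmetric around x = 0.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [v ** 2 for v in x]

r = pearson_r(x, y)
print(f"r = {r:.3f}")  # no linear correlation despite perfect dependence
```

A low r therefore means "no linear relationship", not "no relationship", which is why the scatter plot is worth inspecting first.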
08
Note 2: Stratified data must not be pooled casually. For example, in figure (a) below, pooling two samples that are each correlated creates the illusion of no correlation; in figure (b), pooling two uncorrelated samples creates the illusion of a positive correlation.
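Both pooling effects described above can be reproduced with a few made-up points. The sketch below is not the data behind the figures; the helper `pearson_r`, the group names, and all values are hypothetical and chosen so that the arithmetic is easy to check by hand.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient (plain-Python helper for this sketch)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Case (a): each stratum is perfectly correlated, the pooled data are not.
a1_x, a1_y = [1, 2, 3, 4], [2, 3, 4, 5]
a2_x, a2_y = [6, 7, 8, 9], [1, 2, 3, 4]
r_a1 = pearson_r(a1_x, a1_y)
r_a2 = pearson_r(a2_x, a2_y)
r_a_pooled = pearson_r(a1_x + a2_x, a1_y + a2_y)

# Case (b): each stratum is uncorrelated, the pooled data look correlated.
b1_x, b1_y = [1, 1, 3, 3], [1, 3, 1, 3]
b2_x, b2_y = [11, 11, 13, 13], [11, 13, 11, 13]
r_b1 = pearson_r(b1_x, b1_y)
r_b2 = pearson_r(b2_x, b2_y)
r_b_pooled = pearson_r(b1_x + b2_x, b1_y + b2_y)

print(f"(a) within: {r_a1:.3f}, {r_a2:.3f}  pooled: {r_a_pooled:.3f}")
print(f"(b) within: {r_b1:.3f}, {r_b2:.3f}  pooled: {r_b_pooled:.3f}")
```

In case (b) the pooled coefficient reflects only the distance between the two groups, not any relationship inside them. Computing r separately per stratum, as done here, is one simple way to spot the problem.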
09
Note 3: When outliers are present, use correlation analysis with caution; see the obvious outlier in figure (c). Whether it is included in the calculation can strongly affect the conclusion and can even reverse it. For such obvious outliers, carefully check the data collection and entry process, or repeat the experiment.
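How much a single point can matter is easy to check numerically. The sketch below uses made-up values (not the data in the figure) and a hypothetical helper `pearson_r`: five points with a clear positive trend, and then the same points plus one extreme observation.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient (plain-Python helper for this sketch)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Made-up data with a positive trend.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
r_clean = pearson_r(x, y)

# The same data plus one hypothetical outlier at (12, -6).
r_with_outlier = pearson_r(x + [12], y + [-6])

print(f"without outlier: r = {r_clean:.3f}")
print(f"with outlier:    r = {r_with_outlier:.3f}")
```

Here one extra point is enough to turn a positive coefficient into a negative one. Running the analysis both with and without a suspect point shows how sensitive the conclusion is, but it does not by itself justify deleting the point.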
Special tips
Correlation does not necessarily imply causation; an observed correlation may also arise by chance.