Determine the type of missing data. There are two cases: missing data that can be ignored given the research design, and missing data that cannot be ignored.
Distinguish the two kinds of missing data that cannot be ignored. Known processes: the missingness is caused by procedural factors, such as limited data disclosure, a questionnaire that was not completed, or a selected subject who could not take part (illness, etc.). Unknown processes: the missingness is caused directly by the subject, for example when subjects refuse to answer certain questions.
Check the extent of missing data by tabulating three proportions: for each case, the proportion of variables with missing values; for each variable, the proportion of cases with missing values; and the proportion of cases with no missing values on any variable.
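The three proportions can be tabulated in a few lines. The following is a minimal sketch in plain Python, not a prescribed procedure: the function name missing_summary, the use of None to mark a missing value, and the toy data are all assumptions made for illustration.

```python
def missing_summary(rows, variables):
    """Tabulate missing-data proportions.

    rows: non-empty list of dicts, one per case.
    variables: non-empty list of variable names.
    Assumption of this sketch: None (or an absent key) marks a missing value.
    """
    n_cases = len(rows)
    n_vars = len(variables)
    # For each case: proportion of variables that are missing.
    per_case = [
        sum(row.get(v) is None for v in variables) / n_vars for row in rows
    ]
    # For each variable: proportion of cases that are missing it.
    per_variable = {
        v: sum(row.get(v) is None for row in rows) / n_cases for v in variables
    }
    # Proportion of cases with no missing value on any variable.
    complete_case_rate = sum(p == 0 for p in per_case) / n_cases
    return per_case, per_variable, complete_case_rate


# Toy data, made up purely for illustration.
toy_rows = [
    {"age": 34, "income": None, "score": 7},
    {"age": None, "income": None, "score": 6},
    {"age": 51, "income": 4200, "score": 8},
    {"age": 29, "income": 3100, "score": 9},
]
per_case, per_variable, complete_case_rate = missing_summary(
    toy_rows, ["age", "income", "score"]
)
```

On the toy data, income is missing in half of the cases, age in a quarter, and score in none; two of the four cases are complete.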
Criteria for judging the extent of missing data. When the missing rate is below 10%, the choice of missing-data method makes little difference, provided the data are not missing non-randomly. The simplest treatment is to delete the affected cases or variables. Variables with a missing rate above 15% are candidates for deletion, although variables with higher missing rates (20%~30%) can often still be remedied.
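These rules of thumb can be turned into a simple screening step. The sketch below is one possible reading of the thresholds, not a definitive rule: the name triage_variable, the nonrandom flag (which the analyst has to set from their own judgment), the label wording, and the example rates are hypothetical. The criteria say nothing about rates between 10% and 15%, so the sketch only flags that range for judgment.

```python
LOW_RATE = 0.10       # below this, the choice of method matters little
DELETION_RATE = 0.15  # above this, deleting the variable may be considered


def triage_variable(missing_rate, nonrandom=False):
    """Label a variable by its missing rate (a proportion between 0 and 1).

    nonrandom: set by the analyst when the missingness looks non-random;
    the low-rate rule of thumb does not apply in that situation.
    """
    if missing_rate < LOW_RATE:
        if nonrandom:
            return "low but non-random: the 10% rule does not apply"
        return "low: any missing-data method should behave similarly"
    if missing_rate > DELETION_RATE:
        # Deletion is only a candidate; rates of 20%~30% can often be remedied.
        return "high: weigh deletion against a remedy"
    return "intermediate: not covered by the thresholds, use judgment"


# Example rates, made up purely for illustration.
rates = {"age": 0.05, "income": 0.25, "score": 0.12}
labels = {name: triage_variable(rate) for name, rate in rates.items()}
```

The function returns a label instead of deleting anything, since the criteria describe options to consider and not automatic actions.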