What is the phenomenon to be explained?
What is the hypothesis or theory to be tested?
What is the trend to be predicted?
What is the policy to be evaluated?
2. Construct an empirical econometric model;
In addition to studying the relevant economic theory, compare the empirical econometric models used in three to five papers that contain empirical analysis:
Confirm the causal relationship between the explanatory variables and the dependent variable in each econometric model;
Clarify the similarities and differences, advantages and disadvantages of each model, and think about the possibility of improving the existing models in the literature;
Finally, settle on a prototype of the empirical econometric model.
Make a preliminary check of whether the relevant data are available. If they are not, the empirical model is useless no matter how well it is designed.
3. Collect the relevant data;
The accuracy of data must be strictly checked, and errors and false data should be carefully corrected;
Use spreadsheet software to tabulate the data, check that the values are logically consistent, and deal with implausible values;
Whether using cross-sectional or time-series data, the more data the better, especially panel data.
Organize the data and report the basic descriptive statistics (sample mean, variance, sample correlation coefficients between variables, etc.), list the pairwise relationships between variables, and do some preliminary graphical analysis.
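The descriptive step above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the variable names and the numbers in `data` are hypothetical and exist only to make the example run. In practice the same table would be loaded from the collected data set.

```python
import math
import statistics
from itertools import combinations

# Hypothetical toy data set: variable name -> observations (illustrative values only).
data = {
    "income": [30.0, 42.0, 55.0, 61.0, 78.0],
    "consumption": [25.0, 33.0, 41.0, 50.0, 60.0],
    "interest_rate": [5.0, 4.5, 4.0, 3.0, 2.5],
}

def pearson(x, y):
    """Sample correlation coefficient between two equally long series."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def summarize(table):
    """Return per-variable summary statistics and all pairwise correlations."""
    stats = {
        name: {
            "mean": statistics.mean(values),
            "variance": statistics.variance(values),  # sample variance (n - 1)
            "min": min(values),  # min and max help spot implausible values
            "max": max(values),
        }
        for name, values in table.items()
    }
    corr = {
        (a, b): pearson(table[a], table[b])
        for a, b in combinations(table, 2)
    }
    return stats, corr

stats, corr = summarize(data)
for name, s in stats.items():
    print(name, s)
for pair, r in corr.items():
    print(pair, round(r, 3))
```

The minimum and maximum are included because a value far outside the plausible range is often the first visible sign of a data-entry error.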
Applying econometric methods:
1. The econometric method should not be too simple (for example, only the simplest OLS), but it should not be too complicated either. Choose the method that fits the problem. If a more complicated method is adopted, explain why simpler methods are not suitable. The quality of an econometric method lies not in its complexity, but in whether it helps us obtain correct estimates and understand the real information contained in the data.
2. In addition to reporting the coefficient estimates and the corresponding t-tests, run F-tests for joint hypotheses involving several coefficients.
3. The specification of the regression model, especially the choice of explanatory variables, can be revised repeatedly during estimation. Different transformations of the dependent variable and the explanatory variables, such as logarithms, exponentials and power functions, can be tried. The choice among these transformations should rest first and foremost on economic theory; do not apply unjustified transformations blindly just to improve the goodness of fit.
4. When choosing explanatory variables, the following factors should be considered:
The direction of causality between the explanatory variables and the dependent variable must be correct: the explanatory variables are the cause and come first, and the dependent variable is the result and comes later. Note in particular that some variables are likely to be determined at the same time as the dependent variable, or their causal relationship with it is unclear (that is, these variables are endogenous), so be very cautious about using them as explanatory variables. Endogeneity of explanatory variables is often the main reason why research is criticized;
Pay attention to collinearity among the explanatory variables. Do not put a large number of highly correlated variables (including different transformations of the same variable, or various cross-product terms between several variables) into the regression equation indiscriminately, since this causes serious multicollinearity problems;
The variables that appear in economic theory are often unobservable, so empirical research must use proxy variables, and researchers should explain in detail why the chosen proxies are reasonable. Because some data are always missing, people often resort to implausible proxy variables out of desperation;
Dummy variables should be defined clearly and sensibly, and used with care.
It is necessary to discuss the econometric problems that data limitations may cause, such as missing explanatory variables and poor-quality observations.
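Two of the checks above, the joint F-test on several coefficients and the warning about highly correlated explanatory variables, can be illustrated together. The following is a minimal sketch in pure Python, not a substitute for a statistics package. The data, the variable names `x1`, `x2`, `x3`, and the helper functions `solve`, `ols`, `joint_f` and `vif` are all assumptions made for the example. The F statistic compares the residual sum of squares of the restricted and unrestricted models, and the variance inflation factor (VIF) of a regressor is 1 / (1 - R^2) from regressing it on the other regressors.

```python
def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def ols(columns, y):
    """OLS with an intercept; returns (coefficients, residual sum of squares)."""
    rows = [[1.0] + [c[i] for c in columns] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)  # normal equations (X'X) beta = X'y
    rss = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
              for r, yi in zip(rows, y))
    return beta, rss

def joint_f(columns, y, drop):
    """F statistic for H0: coefficients of the regressors indexed by `drop` are all zero."""
    n, k, q = len(y), len(columns) + 1, len(drop)
    _, rss_u = ols(columns, y)  # unrestricted model
    _, rss_r = ols([c for i, c in enumerate(columns) if i not in drop], y)
    return ((rss_r - rss_u) / q) / (rss_u / (n - k))

def vif(columns, j):
    """Variance inflation factor of regressor j, computed as TSS / RSS = 1 / (1 - R^2)."""
    target = columns[j]
    others = [c for i, c in enumerate(columns) if i != j]
    _, rss = ols(others, target)
    mean = sum(target) / len(target)
    tss = sum((v - mean) ** 2 for v in target)
    return tss / rss

# Hypothetical data: x2 is almost exactly 2 * x1, x3 is roughly unrelated to both.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x2 = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]
x3 = [5.0, 3.0, 6.0, 2.0, 7.0, 1.0, 8.0, 4.0]
noise = [0.3, -0.2, 0.1, -0.4, 0.2, -0.1, 0.4, -0.3]
y = [1.0 + 2.0 * a + 0.5 * c + e for a, c, e in zip(x1, x3, noise)]
X = [x1, x2, x3]

f_stat = joint_f(X, y, drop={0, 1})  # H0: coefficients on x1 and x2 are both zero
vifs = [vif(X, j) for j in range(len(X))]
print("F statistic:", round(f_stat, 1))
print("VIFs:", [round(v, 1) for v in vifs])
```

The F statistic is compared with the critical value of the F distribution with q and n - k degrees of freedom, taken from a table or a statistics library. A common rule of thumb treats a VIF above about 10 as a sign of serious multicollinearity; in this toy data the near-duplicate pair `x1` and `x2` is flagged while `x3` is not.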