Paper Writing
This post briefly shares my preliminary understanding of three topics in writing empirical economics papers: ruling out interfering factors, mechanism testing, and heterogeneity analysis.

The empirical part of a paper generally answers two questions: first, whether the core explanatory variable X affects the explained variable Y; second, what the specific mechanism is, that is, how X affects Y.

To answer the first question, a paper generally follows the benchmark regression with a series of identification tests and robustness tests. These include testing the specific assumptions of the model (such as the parallel trend test for DID); discussing, testing and alleviating possible threats to causal inference (endogeneity); ruling out other interfering factors that may affect the conclusion; and discussing the anticipation effect and lagged effect of X on Y. In some cases, the identification and robustness tests of the benchmark results take up more than half of the empirical part.

On the one hand, establishing whether X affects Y is the basis for any further analysis of the mechanism, so ensuring that the effect of X on Y is stable and credible is a basic requirement of empirical design. On the other hand, we should anticipate every possible problem in the paper before readers or reviewers raise their "soul-searching" questions, putting ourselves in the position of the reader and the reviewer. No matter how hard we rack our brains to find and plug these loopholes, reviewers can always ask some "strange" questions, but doing enough work up front minimizes the chance that such questions are asked.

Answering the second question makes the paper more scientific, gives it a stronger story, and makes it more complete. Given that X has an effect on Y and that this effect is stable, we also want to know through which channels the effect operates, that is, to explore the "process" behind the "existence". Discussing the underlying mechanism is essentially summarizing and distilling the laws by which the real economy operates, and it embodies the "scientific nature" of social science research.

If a variable M is suspected to be the mechanism through which X acts on Y, the theoretical analysis part of the paper should clearly explain the basic logic of this mechanism, and the empirical part should then test it. There is no fixed paradigm for mechanism testing; it generally has to be designed around the research content, theory, model and even the data used in the paper. However, mechanism tests in economic research should try to avoid the mediation effect model, because (quoting my answer on Zhihu):

Compared with management, psychology and other disciplines, economics places more emphasis on inferring causality between variables. Because the mediation effect model does not address the possible endogeneity of the mediating variable, it may fit the research paradigm of management, but it does not fit the research paradigm of economics.

The possible endogeneity of the mediation effect model lies in:

To borrow from Teacher Lian's answer, the mediation effect model has a practical problem: most of us already struggle with one endogenous variable, yet the mediation effect requires us to overcome not only the endogeneity of X but also the endogeneity of the mediating variable M, which is really demanding.
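For reference, the stepwise mediation effect model criticized above is usually written as the three equations below. This is only the textbook form, given as a sketch; the coefficient names a, b, c, c' and the error terms are notation introduced here for illustration, and control variables are omitted.

```latex
\begin{align}
Y &= c\,X + \varepsilon_1 \\
M &= a\,X + \varepsilon_2 \\
Y &= c'\,X + b\,M + \varepsilon_3
\end{align}
```

Reading b causally in the third equation requires M to be uncorrelated with \varepsilon_3, on top of the usual exogeneity requirement on X. That second requirement is the extra endogeneity burden described above.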

Ruling out interfering factors is one of the basic steps in the robustness tests of an empirical paper.

For example, Tran Dang Khoa (2020) suspected that other policies implemented in the same period after China's accession to the WTO might interfere with the results, especially the policies of restructuring state-owned enterprises and encouraging foreign investment, as well as environmental regulation policies (such as the Two Control Zones policy and the pollution control policy of the 11th Five-Year Plan). To rule out these factors, the author adds the shares of the state-owned economy and the foreign trade economy to the regression equation to control for the first two policies. As for environmental regulation, the author argues that most of China's environmental regulation policies are implemented by administrative division, so region-year fixed effects are added to the regression model to control for their potential impact on the results.
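The role of the region-year fixed effects can be illustrated with a small simulation. The sketch below is not the cited paper's code: the data are simulated, the variable names (`shock`, `cell`) and all coefficient values are hypothetical, and the fixed effects are absorbed by demeaning within each region-year cell.

```python
import numpy as np

def within_transform(v, groups):
    """Subtract each group's mean, which absorbs group fixed effects."""
    v = np.asarray(v, dtype=float)
    _, idx = np.unique(groups, return_inverse=True)
    means = np.bincount(idx, weights=v) / np.bincount(idx)
    return v - means[idx]

def slope(y, x):
    """OLS slope of y on x (with an intercept)."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))

rng = np.random.default_rng(0)
n_regions, n_years, n_firms = 20, 10, 30
region = np.repeat(np.arange(n_regions), n_years * n_firms)
year = np.tile(np.repeat(np.arange(n_years), n_firms), n_regions)
cell = region * n_years + year          # one id per region-year cell

# Hypothetical region-year policy shock that moves both x and y.
shock = rng.normal(size=n_regions * n_years)[cell]
x = 0.8 * shock + rng.normal(size=cell.size)
y = -0.5 * x + 2.0 * shock + rng.normal(size=cell.size)   # true effect: -0.5

beta_pooled = slope(y, x)               # contaminated by the shock
beta_fe = slope(within_transform(y, cell), within_transform(x, cell))
print(f"pooled: {beta_pooled:.3f}, region-year FE: {beta_fe:.3f}")
```

Because the simulated shock is constant within a region-year cell, demeaning by cell removes it entirely. Any policy that varies only at that level is absorbed in the same way, which is the logic of the robustness check described above.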

For example, when Jiang et al. (2018) studied the effect of growing up in a rural area on household stock market participation, they noted that the factors shaping people's behavior are very complex, and suspected that the effect of rural upbringing on stock market participation might be confounded by social interaction, trust level, financial knowledge, family socio-economic status and risk attitude. To rule out these factors, the authors ran a series of robustness tests, including (taking social interaction as an example):

As another example, Lu Jing et al. (2021) suspected that the 2008 international financial crisis and the environmental regulation policies of the same period (such as cleaner production standards, the pollution control policy of the 11th Five-Year Plan, regional approval restrictions, etc.) might interfere with the research conclusions. For the former, the authors add to the benchmark model two proxy variables representing firms' investment and financing needs. For the latter, the authors introduce some dummy variables and exclude the relevant samples (see the original text for details).

Drawing on the above three papers, let us briefly sort out the logic and the empirical approaches for ruling out interfering factors.

Many factors influence the explained variable Y, and the control variables in the benchmark model are only those suggested by common sense and theory (that is, by the practice of the existing literature). Suppose we suspect that a less conventional factor Z also affects Y, that Z is too important to ignore, and that the effect of X on Y may change once Z is taken into account. To test the robustness of the conclusion, Z can then be added to the benchmark model as a control variable. If the coefficient on the core explanatory variable X remains basically consistent with the benchmark regression result, the interference of Z with the research conclusion is ruled out.
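The control-variable check described above can be sketched as follows. This is a minimal illustration on simulated data, not taken from any of the cited papers; the helper `ols` and all numbers are assumptions made for the example.

```python
import numpy as np

def ols(y, *regressors):
    """Return OLS coefficients [intercept, b1, b2, ...]."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(1)
n = 5000
x = rng.normal(size=n)
z = rng.normal(size=n)                     # suspected factor, here unrelated to x
y = 1.0 + 0.6 * x + 0.4 * z + rng.normal(size=n)

b_benchmark = ols(y, x)[1]                 # coefficient on x in the benchmark model
b_with_z = ols(y, x, z)[1]                 # coefficient on x after adding z
change = abs(b_with_z - b_benchmark)
print(f"benchmark: {b_benchmark:.3f}, with z: {b_with_z:.3f}, change: {change:.3f}")
```

In this simulation Z affects Y but is unrelated to X, so the coefficient on X barely moves when Z is added. If Z were correlated with X, the two estimates would diverge, and that divergence is what the check is designed to detect.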

Alternatively, one can argue by contradiction. Suppose Z does interfere with the research conclusion; then the effect of X on Y should be heterogeneous in Z, and the test aims to show that this implication does not hold. There are two empirical approaches. The first is to split the sample by the value of Z and run the regression group by group: if the coefficient on X is basically unchanged across groups and basically consistent with the benchmark regression, the implication does not hold. The second is to use a moderating effect model, in which the moderating term is the interaction of X and Z, and the two stand-alone terms X and Z must also be included. If the interaction term is not significant, the implication does not hold. It does not matter whether the coefficient on X alone is consistent with the benchmark regression, or even whether it is significant, because the coefficient on X has a different meaning in the moderating model.
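Both approaches can be sketched on simulated data. The example below is illustrative only: the median split, the helper `ols_with_t`, and the data-generating process (in which Z does not moderate X) are assumptions, and the t-statistics use conventional homoskedastic standard errors rather than the clustered or robust errors a real paper would report.

```python
import numpy as np

def ols_with_t(y, X):
    """OLS coefficients and conventional t-statistics; X must include a constant."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return coef, coef / se

rng = np.random.default_rng(2)
n = 4000
x = rng.normal(size=n)
z = rng.normal(size=n)
y = 0.5 * x + 0.3 * z + rng.normal(size=n)   # no x*z term: z does not moderate x
const = np.ones(n)

# Approach 1: split the sample at the median of z and compare the coefficient on x.
high = z > np.median(z)
b_low = ols_with_t(y[~high], np.column_stack([const[~high], x[~high]]))[0][1]
b_high = ols_with_t(y[high], np.column_stack([const[high], x[high]]))[0][1]

# Approach 2: moderating model with x, z and their interaction.
coef, t = ols_with_t(y, np.column_stack([const, x, z, x * z]))
b_inter, t_inter = coef[3], t[3]
print(f"low-z: {b_low:.3f}, high-z: {b_high:.3f}, interaction: {b_inter:.3f} (t = {t_inter:.2f})")
```

Note that the moderating model keeps both `x` and `z` alongside `x * z`, as the text requires. Dropping either stand-alone term would force the interaction to absorb its main effect.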

There is no unified paradigm for mechanism testing; it generally depends on the research content and serves to support the story told in the paper. Commonly used empirical designs include (quoting my answer on Zhihu):

When Dai et al. (2021) studied the mechanism through which the "Shanghai Stock Connect" affects firms' total factor productivity, they argued that improving the information content and information transmission efficiency of stock prices, correcting stock mispricing and improving the quality of information disclosure were the main channels through which Shanghai Stock Connect raises firms' total factor productivity. In the mechanism test, they first empirically test the effect of the opening of Shanghai Stock Connect on these mechanism variables, and then draw on the existing authoritative literature to argue theoretically for the effect of the mechanism variables on firms' total factor productivity.
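The empirical half of this design, regressing the mechanism variable M on X, can be sketched as below. The data are simulated and the names (`x` as a policy dummy, `m` as a mechanism variable) and effect sizes are hypothetical; the second half of the argument, from M to Y, rests on the literature and has no regression counterpart here.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 3000
x = rng.integers(0, 2, size=n).astype(float)   # hypothetical policy dummy
m = 0.2 + 0.5 * x + rng.normal(size=n)         # hypothetical mechanism variable

# Regress M on X and compute a conventional t-statistic for the slope.
X = np.column_stack([np.ones(n), x])
coef, *_ = np.linalg.lstsq(X, m, rcond=None)
resid = m - X @ coef
sigma2 = resid @ resid / (n - X.shape[1])
se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
a_hat, t_a = coef[1], coef[1] / se[1]
print(f"effect of X on M: {a_hat:.3f} (t = {t_a:.2f})")
```

Only X appears on the right-hand side, so this step does not require M to be exogenous in an equation for Y, which is one reason the design avoids the difficulty of the mediation effect model.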

Tran Dang Khoa's (2020) approach to mechanism testing is similar, but it is tied closely to the logic of his own research content. Trade liberalization significantly reduces firms' emission intensity, which equals emissions divided by total industrial output value. To determine whether the fall in emission intensity comes from lower emissions or from higher total industrial output value, the author regresses firms' emissions and total industrial output value on trade liberalization, respectively. The results show that trade liberalization lowers emission intensity mainly by reducing firms' emissions rather than by increasing output. This raises a new question: is the fall in emissions due to less pollution being generated in production, or to more pollution being removed in end-of-pipe treatment? To answer it, the author regresses the quantity generated and the quantity removed on trade liberalization, respectively. The results show that trade liberalization reduces emissions by reducing the quantity generated rather than by increasing the quantity removed.
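One way to see why the separate regressions answer the question: if the variables enter in logs (an assumption made here for illustration; the functional form used in the cited paper is not stated in this post), log intensity equals log emissions minus log output, so the OLS coefficient for intensity is exactly the coefficient for emissions minus the coefficient for output. The simulated example below checks this identity; all variable names and numbers are hypothetical.

```python
import numpy as np

def slope(y, x):
    """OLS slope of y on x (with an intercept)."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))

rng = np.random.default_rng(4)
n = 2000
lib = rng.normal(size=n)                          # hypothetical trade liberalization measure
ln_emission = -0.3 * lib + rng.normal(size=n)     # emissions respond strongly
ln_output = 0.02 * lib + rng.normal(size=n)       # output barely responds
ln_intensity = ln_emission - ln_output            # intensity = emission / output

b_intensity = slope(ln_intensity, lib)
b_emission = slope(ln_emission, lib)
b_output = slope(ln_output, lib)
print(f"intensity: {b_intensity:.3f} = emission {b_emission:.3f} - output {b_output:.3f}")
```

The same additive logic would apply to the second step if emissions were written as the quantity generated minus the quantity removed, in levels.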

Because each step of this mechanism test follows logically from the previous one, the story of the paper is very strong. In addition, the paper separately examines two specific channels: coal utilization and technological progress.

Heterogeneity analysis can generally be divided into two types:

The main differences between these two methods are:

In fact, heterogeneity analysis can also serve as a supplement to mechanism testing, further strengthening the story of the paper.

For example, Wan Panbing et al. (2021), when studying how cleaner production industry standards affect the green transformation of enterprises, first verified the specific mechanism of technological transformation and then argued that technological transformation is constrained by firms' demand for technological transformation and by their financing capacity. The basic logic is as follows:

Following this logic, the authors interact technological transformation demand and financing capacity with the difference-in-differences term (that is, they construct a triple-difference, or DDD, model) to capture possible heterogeneous effects.
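A generic triple-difference specification can be sketched as follows. This is not the cited paper's model: the data are simulated, `h` is a hypothetical binary heterogeneity indicator (for example, high financing capacity), and the regression includes all lower-order interactions, as a fully specified DDD requires.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 16000
treat = rng.integers(0, 2, size=n).astype(float)   # treated-group dummy
post = rng.integers(0, 2, size=n).astype(float)    # post-policy dummy
h = rng.integers(0, 2, size=n).astype(float)       # hypothetical heterogeneity dummy

did = treat * post
# Simulated outcome: DID effect of 1.0 when h = 0, and 1.0 + 0.8 when h = 1.
y = (0.3 * treat + 0.2 * post + 0.1 * h
     + 1.0 * did + 0.8 * did * h
     + rng.normal(size=n))

# DDD regression: constant, main effects, all pairwise interactions, triple interaction.
X = np.column_stack([np.ones(n), treat, post, h,
                     treat * post, treat * h, post * h,
                     treat * post * h])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
b_did, b_ddd = coef[4], coef[7]
print(f"DID effect when h = 0: {b_did:.3f}, additional effect when h = 1: {b_ddd:.3f}")
```

The coefficient on the triple interaction measures how much the policy effect differs between the two values of `h`, which is the heterogeneous effect the design aims to capture.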