1. Pre-Bellman era
In the 1940s, scholars in economics and statistics found that backward induction could solve certain multi-stage decision problems under risk without resorting to qualitative analysis. In their game theory research, von Neumann and Morgenstern used backward induction to find what is now called the subgame perfect equilibrium of extensive-form games.
Abraham Wald, the founder of statistical decision theory, also applied his theory to multi-stage decision problems. Arrow, Blackwell and Girshick then studied the general form of the statistical decision problem, and formulated and solved it with what is now recognized as the method of dynamic programming.
In addition, Arrow, Harris and Marschak used backward induction to study optimal inventory policy.
2. Bellman era
At almost the same time, from the late 1940s onward, Richard Bellman gradually uncovered the structure underlying multi-stage decision problems and showed how backward induction solves a large class of them. In 1949 Bellman began his research on dynamic programming at the RAND Corporation. The term itself was not yet in use; Bellman coined it later.
Bellman's core contribution was the principle of optimality of dynamic programming. The key documents date from 1953 and 1954 and include several important papers published in Bull. Amer. Math. Soc. and Operations Research, among them "The Theory of Dynamic Programming" and "Some Applications of the Theory of Dynamic Programming: A Review". The milestone is Dynamic Programming, published by Princeton University Press in 1957, which marks the formal establishment of dynamic programming theory.
Bellman remained active in research into the 1960s, working to extend the range of applications of dynamic programming and publishing many further important papers, which are not listed here.
3. Post-Bellman era
Once the basic theoretical framework was in place, dynamic programming entered a period of broad development, with research moving in many directions. In summary, one direction generalizes the theoretical framework; the other modifies some conditions of the original theory, including the principle of optimality, to handle particular new problems.
Current research is too varied to list in full here, so only a few branches are mentioned. The first is the study of the so-called curse of dimensionality, the second is the application to large-scale separable nonlinear integer programming, the third is fuzzy dynamic programming, and the fourth is joint work with other areas of mathematics such as partial differential equations.
This division into eras is purely historical and may not be justified in terms of subject matter. But I think it helps junior researchers grasp the basic outline.