Recently, DeepMind, a subsidiary of Google, published a paper in Nature announcing that its artificial intelligence program AlphaFold 2 had predicted the structures of 98.5% of human proteins. It also decided to release the AlphaFold 2 source code and open the relevant datasets to researchers around the world.
So, does AlphaFold count as basic research?
In this regard, Li Guojie, an academician of the Chinese Academy of Engineering, classified AlphaFold as engineering science and technology: "Engineering science and technology is not only a tool, but also an important component that can play a huge role in basic research."
Prompted by Li Guojie's statement, I offer the following views on basic research, in the hope of drawing more attention to the topic.
Scientific research has its own laws. If you don't follow them, you will get half the result with twice the effort.
So, what are the rules of basic research? Different definitions of basic research reflect different perspectives, and each implies a different way of carrying it out.
Broadly speaking, over the past few decades there have been two main definitions of basic research.
First, Vannevar Bush defined basic research and applied research under the linear model, regarding basic research as the accumulation of knowledge and the source of technological progress.
Under this definition, the role of basic research is to generate knowledge, regardless of its relationship to any specific technology. At the implementation level, therefore, "casting a wide net" may be the most effective way to generate diverse knowledge.
Second, Donald E. Stokes classified research into four quadrants. Stokes divided basic research into pure basic research (Bohr's quadrant) and "application-driven" basic research (Pasteur's quadrant).
At the implementation level, basic research in Bohr's quadrant is essentially the same as basic research under the linear model.
In Pasteur's quadrant, cutting-edge basic science is brought to bear on urgent, strong, and large-scale practical needs; in practice, solving those practical problems forces researchers to clarify the basic principles behind them.
I prefer Stokes' four-quadrant model.
In my opinion, "understanding the fundamental principle of the problem" is basic research.
In concrete research practice, Bohr's quadrant and Pasteur's quadrant amount to the same activity, namely "clarifying the underlying principle of the problem"; only the source of the problem differs.
Problems in Bohr's quadrant come mainly from the discipline itself, such as why quantum entanglement exists; problems in Pasteur's quadrant come mainly from practical application, such as how to keep milk fresh.
From the perspective of "understanding the underlying principles of problems", wherever unresolved problems can be raised, there is potential to do good basic research.
We have probably all had this experience: the "first time" in tackling a scientific or technological problem is often especially difficult, as with the first airplane, the first atomic bomb, the first artificial satellite, the first CPU, the first Mars landing, and so on. Even when other countries have already done it, it is still very difficult for another country to do it for the first time.
Why? Mainly because a "first time" produces not only a prototype system but also a set of technical processes for developing that prototype, together with the corresponding platforms, materials, reagents, equipment, and instruments; in other words, scientific research infrastructure.
The function of this research infrastructure is to "clarify the basic principles of the problem". For example, wind tunnels are built in order to develop aircraft, and high-precision simulators and emulators are needed in order to develop CPUs.
Even basic research in physics, chemistry, astronomy, and similar fields is now inseparable from cutting-edge equipment and instruments, such as the EAST tokamak for studying nuclear fusion and the FAST radio telescope for astronomy.
Many people regard CPU chip design as pure engineering and think that it involves no basic research.
But in my opinion, clarifying the underlying principles of problems in the CPU design space is basic research.
For example, Apple's recently launched M1 processor even outperformed Intel's desktop processors. This is because the M1 uses a reorder buffer (ROB) of about 600 entries, which completely overturns the assumptions of traditional CPU architects: in the past, a CPU's ROB generally did not exceed 200 entries.
With a reverse-engineering mindset, we could perhaps quickly produce a CPU architecture design with a 600-entry ROB.
But who knows why Apple dared to design it this way? Why 600, and not 400 or 800? Reverse engineering is only engineering; thoroughly understanding the principles behind these questions is basic research in the field of CPU architecture design.
Understanding the underlying principles is not easy. It requires the support of a full CPU architecture design infrastructure, ranging from program feature analysis and design space exploration to high-precision simulators, system emulation, and verification. It also requires analyzing many program features, collecting large amounts of raw data, carrying out detailed quantitative analysis, and running many simulations, all in order to clarify the basic principles.
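To make "design space exploration" concrete, here is a minimal toy sketch in Python. Everything in it is an assumption for illustration only: the functions `toy_ipc` and `toy_cost`, their constants, and the candidate sizes are hypothetical and do not describe the M1 or any real processor. A real study would replace the toy performance model with measurements from a cycle-accurate simulator running real workloads.

```python
import math

def toy_ipc(rob_entries, ipc_max=8.0, scale=250.0):
    """Hypothetical diminishing-returns model: IPC saturates as the ROB grows."""
    return ipc_max * (1.0 - math.exp(-rob_entries / scale))

def toy_cost(rob_entries, cost_per_entry=0.004):
    """Hypothetical linear area/power penalty, in IPC-equivalent units."""
    return cost_per_entry * rob_entries

def explore(candidates):
    """Sweep candidate ROB sizes; return the best size and all scores."""
    scores = {n: toy_ipc(n) - toy_cost(n) for n in candidates}
    best = max(scores, key=scores.get)
    return best, scores

# Sweep ROB sizes from 100 to 1000 entries in steps of 100.
best, scores = explore(range(100, 1001, 100))
for n in sorted(scores):
    marker = "  <- best under this toy model" if n == best else ""
    print(f"ROB={n:4d}  score={scores[n]:.3f}{marker}")
```

The sweep itself is trivial; the hard part, and the part that counts as basic research in the sense argued here, is building models and simulators accurate enough that the answer to "why this size?" can be trusted.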
To some extent, research infrastructure such as platforms, materials, reagents, equipment, and instruments is more important than the prototype system itself.
With this infrastructure in place, we can keep exploring the underlying principles of various phenomena and support subsequent iterative optimization, and the infrastructure itself becomes a base for training talent.
Basic research and engineering technology are not simple binary opposites.
On the contrary, basic research and engineering development are integrated in many fields.
This convergence occurs because many research infrastructures, such as new platforms, new equipment and new processes, need engineering investment.
Even basic research such as detecting gravitational waves and the Higgs boson requires engineering investment to develop instruments and facilities such as LIGO and the LHC.
Once this research infrastructure exists, it becomes much easier for others to do research on top of it.
One reason basic research in the United States is strong is that many scholars in universities and corporate research institutes have built such research infrastructure.
For example, in CPU chip design there is a whole series of infrastructure, such as the gem5 simulator, the CACTI model, and the FireSim simulation platform, which makes it much more convenient for scholars at other universities to carry out research.
So when some scholars believe that basic research does not need engineering, it is mainly because someone else has already built the underlying research infrastructure for them, making it easier for them to optimize designs and publish papers.
Many technology companies in the United States also build research infrastructure (both open-source and developed in-house) that is largely open to academia.
By feeding business requirements and internal data into this research infrastructure, companies can readily absorb new ideas generated by academia and integrate them into their products.
Open infrastructure and the flow of talent are therefore important reasons why American academia and industry have formed a closed loop of innovative ideas, applications, feedback collection, new innovative ideas, and new applications.
In China, however, such an efficient closed loop between academia and industry has not yet formed, and most enterprises have not built research infrastructure jointly with academia.
For Chinese academia, therefore, it is all the more necessary to take part in building research infrastructure, and especially to work with enterprises to make up for this missing groundwork.
A lot of basic research is purely theoretical exploration and can be carried out by a small team of a few people, or even by one person.
However, much basic research also needs large teams, large-scale management, and large organizations, such as detecting the Higgs boson or developing LIGO to observe gravitational waves.
The Defense Advanced Research Projects Agency (DARPA) of the US Department of Defense has funded many disruptive innovation projects.
Observing how DARPA projects are set up and carried out reveals four characteristic steps: first, envision the future and set radical goals; second, decompose the radical goals scientifically into a series of subtasks; third, draw up an implementation plan for each subtask, including objectives and milestones; fourth, integrate the subtasks into a final prototype system.
The program manager is responsible for all four tasks, has absolute decision-making power over the project, and is also held accountable for it, which amounts to having command of the whole picture. Extensive practice has shown that this model of organizing and managing research is efficient.
This model is also effective for basic research.
Take Tsinghua University's Center for Brain-Inspired Computing Research as an example. The center was established in 2014, and its members come from different departments of Tsinghua University.
Its research model is similar to that of a DARPA project. The whole team carried out full-stack research around the "Tianjic" brain-inspired chip and integrated it into a self-driving bicycle system, which made for a compelling demonstration of the research results. The team published a number of papers in Nature and Science, was selected for China's top ten scientific and technological advances, and established brain-inspired computing as a discipline at Tsinghua.
Back to the question at the beginning of this article: Does AlphaFold count as basic research?
From the discussion in this article, we can draw the following conclusions. First, there are many unknown problems in the R&D process of AlphaFold, and basic research is needed to clarify the underlying principles of these problems. Second, AlphaFold is research infrastructure for the field of protein structure prediction, and is therefore part of the basic research in that field.
(The author is a deputy director and researcher at the Institute of Computing Technology, Chinese Academy of Sciences.)