(Qingdao Institute of Marine Geology)
Cloud computing inherits and integrates a number of key technologies, including virtualization, mass data storage, distributed parallel computing frameworks, and intelligent and automated management, to form a new computing model characterized by high performance, scalability, low cost and service orientation. Research and discussion on cloud computing in academia and industry are growing rapidly. A large number of papers have been published in computer science and library and information science journals, focusing on the basic theory of cloud computing, its key technologies, the application fields of cloud services, and cloud computing in information resource management. Based on the cloud computing literature published in domestic core journals from 2000 to early 2012, this paper analyzes the research hotspots and evolution of cloud computing and, in light of the development of clustered and industrialized geological data services in China, discusses strategies for applying cloud computing.
Keywords: cloud computing model; geological data; information sharing and service
1 Introduction
The term "cloud computing" appeared in 2006, when it was first formally put forward by Eric Schmidt, then CEO of Google, at the Search Engine Strategies conference (SES San Jose 2006). It not only lifted the veil on the key technology behind Google search, but within a few years also overtook "grid computing" to become a new trend (Figure 1).
Figure 1 Search-volume trends for grid computing and cloud computing
Since 2006, driven by Google, Amazon, IBM and other enterprises, cloud computing has been widely adopted as a new computing model. As a mode of delivering and using infrastructure and services, it is profoundly affecting the development of the Internet. In recent years there has been an upsurge of cloud computing research at home and abroad, producing a large body of research literature and application cases, and cloud computing has become a hot topic in academia and industry. This paper first introduces the basic concepts and key technologies of cloud computing. Then, through a comprehensive analysis of the existing research literature and in light of the development of clustered and industrialized geological data services in China, it raises some issues that need attention when applying cloud computing.
2 Cloud computing and its key technologies
2.1 Basic concepts of cloud computing
There is still no single agreed definition of cloud computing. It is generally understood as an Internet-based mode of computing in which shared software and hardware resources and information are provided to computers and other devices on demand [1]. The National Institute of Standards and Technology (NIST) of the United States defines cloud computing as a model for convenient, on-demand network access to computing resources that significantly improves their availability; these resources come from a shared, configurable resource pool and can be acquired and released automatically [2].
The Cloud Computing Expert Committee of the Chinese Institute of Electronics holds that cloud computing is a computing model based on the Internet and public participation, whose computing resources (computing power, storage capacity and interaction capability) are dynamic, scalable, virtualized and provided as services. This new way of organizing, distributing and using computing resources helps to allocate them rationally and raise their utilization, thereby promoting energy conservation and emission reduction and realizing green computing [3].
Although there are different definitions of cloud computing, there have been many in-depth discussions on its characteristics. The following five basic characteristics can be used to judge whether a computing service is cloud computing.
(1) On-demand self-service. Cloud computing provides information technology as a service. Because the service is defined from the user's point of view, on-demand self-service is one of its most important features. Users can acquire computing capabilities, such as servers and network storage, by themselves, and the whole process is usually automatic.
(2) Convenient network access. Cloud computing supports broad and convenient network access, and users can obtain cloud services from a variety of devices such as mobile phones, laptops or workstations.
(3) Shared resource pool. One benefit of cloud computing is higher resource utilization. Concentrating resources in a common shared pool makes it possible to serve large user groups. Because the pool can dynamically allocate all physical and virtual resources, sharing raises their utilization.
(4) High scalability and elastic service. Cloud computing can provide services quickly and flexibly. As demand changes, the services provided can expand or contract automatically and rapidly.
(5) Measured service. By automatically monitoring resource usage, the cloud system can provide quantitative operation reports, thus ensuring that cloud services are kept at an appropriate level.
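Characteristics (4) and (5) can be illustrated with a minimal sketch. The function names, the capacity figure and the instance limits below are hypothetical and chosen only for illustration; they do not describe any particular cloud platform.

```python
import math
from collections import defaultdict


def instances_needed(load, capacity_per_instance, min_instances=1, max_instances=20):
    """Elastic scaling: return how many service instances the current load calls for.

    The result is clamped to [min_instances, max_instances] (assumed limits).
    """
    needed = math.ceil(load / capacity_per_instance)
    return max(min_instances, min(max_instances, needed))


def usage_report(records):
    """Measured service: sum metered usage (e.g. instance-hours) per user.

    records is an iterable of (user, amount) pairs.
    """
    totals = defaultdict(float)
    for user, amount in records:
        totals[user] += amount
    return dict(totals)
```

For example, with an assumed capacity of 100 requests per second per instance, a load of 950 requests per second calls for 10 instances, and the instance count falls back to the minimum when the load disappears.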
2.2 Cloud computing architecture
Computing has evolved from the traditional mainframe mode to the personal and pervasive computing mode and the distributed network computing mode [4]. As a new computing mode, cloud computing is both the result of the rapid development of distributed computing, parallel computing and grid computing and an inevitable response to the information needs of the information society. Socialized, intensive and specialized information services are delivered through various forms of cloud computing, including Internet applications, software or computing resource services provided to users over the network, and the software and hardware platforms that keep these services running reliably and efficiently.
A NIST technical report gives a complete reference model of cloud computing architecture (Figure 2), whose top-level model defines the roles, activities and functions in cloud computing [5]. The core roles are cloud consumers, cloud providers, cloud auditors, cloud brokers and cloud carriers (Table 1). In this model, cloud consumers can obtain business applications such as ERP, CRM and HR, as well as services such as information, communication, collaboration, storage, backup, and software and hardware hosting. Cloud providers deliver Software as a Service (SaaS), Platform as a Service (PaaS) and Infrastructure as a Service (IaaS) by building, operating and managing cloud computing centers, and cloud carriers ensure the delivery of cloud services by providing network access and communication systems.
Figure 2 Reference model of cloud computing architecture (cited from NIST)
Table 1 Main Roles and Definitions in Cloud Computing Mode
2.3 Key technologies of cloud computing
Cloud computing is a product of the development of computer technology. Virtualization, mass data storage, distributed parallel computing frameworks, and intelligent and automated management are regarded as the key technologies for realizing it [6].
2.3.1 Virtualization technology
Virtualization is the key to fully integrating and efficiently using computing and storage resources. It involves two aspects: physical resource pooling and resource pool management. Physical resource pooling divides a large physical device into multiple minimal resource units with configurable performance. Resource pool management manages these virtualized units across the cluster, allocating and scheduling them flexibly according to actual usage so that resources are provided on demand. Virtualization is applied mainly to servers, storage and networks.
2.3.2 Mass data storage
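The two aspects described above, dividing physical devices into minimal resource units and allocating those units on demand, can be sketched as follows. This is a simplified illustration, not the mechanism of any real hypervisor; the class name `ResourcePool`, the host names and the unit size are all assumptions.

```python
class ResourcePool:
    """Toy resource pool: physical hosts are cut into equal-sized units."""

    def __init__(self, hosts, unit_size):
        # hosts maps a host name to its capacity (e.g. CPU cores).
        # Each host contributes capacity // unit_size minimal resource units.
        self.free = [
            (host, index)
            for host, capacity in hosts.items()
            for index in range(capacity // unit_size)
        ]
        self.allocated = {}  # tenant -> list of units currently held

    def allocate(self, tenant, n_units):
        """Hand n_units free units to a tenant, or fail if the pool is short."""
        if n_units > len(self.free):
            raise RuntimeError("not enough free resource units")
        units = [self.free.pop() for _ in range(n_units)]
        self.allocated.setdefault(tenant, []).extend(units)
        return units

    def release(self, tenant):
        """Return all of a tenant's units to the pool; report how many."""
        units = self.allocated.pop(tenant, [])
        self.free.extend(units)
        return len(units)

    def utilization(self):
        """Fraction of all units that are currently allocated."""
        used = sum(len(units) for units in self.allocated.values())
        total = used + len(self.free)
        return used / total if total else 0.0
```

A pool built from an 8-core host and a 4-core host with a unit size of 2 cores holds six units; allocating three of them to one tenant gives a utilization of 0.5, and releasing them returns the pool to zero.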
Mass data storage is the main task of cloud computing. In order to ensure availability, reliability and economy, cloud computing uses distributed storage to store data. Because of the distributed redundant storage, the data reliability is high, and it can provide services for large-scale users in parallel. The data storage technologies of cloud computing mainly include Google's GFS (Google File System) and Hadoop's HDFS (Hadoop Distributed File System).
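The idea of distributed redundant storage can be sketched in a few lines: a file is split into blocks and each block is copied to several different nodes, so that the data stay readable when some nodes fail. The round-robin placement below is an assumption made for illustration; it is not the actual placement policy of GFS or HDFS, and the function names are hypothetical.

```python
def split_into_blocks(data, block_size):
    """Split a byte string into fixed-size blocks (the last may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_replicas(num_blocks, nodes, replication=3):
    """Assign each block to `replication` distinct nodes, round-robin."""
    if replication > len(nodes):
        raise ValueError("replication factor exceeds number of nodes")
    return {
        block: [nodes[(block + r) % len(nodes)] for r in range(replication)]
        for block in range(num_blocks)
    }


def readable_after_failure(placement, failed_nodes):
    """True if every block still has at least one replica on a live node."""
    return all(
        any(node not in failed_nodes for node in replicas)
        for replicas in placement.values()
    )
```

With four nodes and three replicas per block, any two nodes can fail without making a block unreadable, which is the reliability property the paragraph above refers to.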
2.3.3 Distributed parallel computing framework
Parallel computing is the core of cloud computing. Cloud computing adopts the MapReduce programming model to realize distributed parallel computing. MapReduce simplifies parallel computing into two phases, "Map" and "Reduce": an application only needs to supply a map function and a reduce function to process large-scale distributed data on a cluster. MapReduce is not only a programming model but also an efficient task scheduling model, and it makes highly parallel, distributed computing tasks practical.
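The division of labor between the map function and the reduce function can be illustrated with the classic word-count example. The sketch below runs in a single process and only mimics the programming model; a real framework would distribute the map and reduce tasks across cluster nodes and handle scheduling and fault tolerance. All function names are illustrative.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values of one key into a single result."""
    return key, sum(values)


def map_reduce(documents):
    """Run map, shuffle and reduce over a collection of documents."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
```

Calling `map_reduce(["cloud computing", "cloud storage"])` counts "cloud" twice and the other two words once each. The application code is limited to `map_phase` and `reduce_phase`; everything else belongs to the framework.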
2.3.4 Intelligent and automatic management technology
Cloud computing has a high degree of autonomy, and intelligent and automated management is an important technical support for the cloud computing model. Through comprehensive monitoring, automatic feedback and intelligent deployment of all nodes in the cluster system, the dynamic management and automatic migration of equipment, virtual resources, communication and services are realized. Cloud computing based on the fourth generation large-scale data center can not only be flexibly expanded and deployed, but also meet the requirements of service computing and multi-granularity computing.
3 Analysis of cloud computing research hotspots in China
3.1 Comparison of cloud computing search-volume trends at home and abroad
Search volume usually reflects the level of attention a topic receives, and long-term trends and changes can be analyzed with the Google Trends tool. Here the English term "cloud computing" and its Chinese equivalent "云计算" are used as index keywords for the world and for China respectively. The results show the following characteristics (Figure 3): ① The world began to pay attention to cloud computing in 2007, while China did so in 2008, so China is still in the position of a follower. ② Since 2007, the global search volume for "cloud computing" has grown rapidly and has now overtaken "grid computing" to become a new information technology hotspot, whereas attention in China has been relatively mild and lagging. ③ If the attention represented by search volume is regarded as the visible tip of an iceberg, the gap between China and other countries is even larger in the parts below the waterline, namely basic theory, key technologies and application practice.
Figure 3 Comparison of cloud computing search-volume trends at home and abroad
3.2 Quantitative analysis of domestic cloud computing research literature
Using the CNKI (China National Knowledge Infrastructure) academic journal database, this paper retrieved 852 core journal papers on cloud computing published between 2000 and March 2012 (Table 2). Research on cloud computing in China began in 2007, with few related studies before then. From 2008 to 2011 the topic attracted widespread attention and the number of papers rose sharply. Over the same period the number of journals publishing cloud computing papers also grew rapidly, which shows how widespread the research has become. Because 2012 is only partially covered, the number of 2012 papers retrieved appears small, but this does not alter the trend of rapid growth.
Table 2 Distribution Table of Publication Time of Cloud Computing Papers
The keywords of the 852 retrieved papers were analyzed quantitatively, yielding 1,376 distinct keywords with a cumulative frequency of 3,020. In descending order of frequency, the leading keywords include cloud computing (645), virtualization (115), library and information science (115), cloud service (94), security (65) and storage (42). The keyword analysis shows that cloud computing research covers basic theory, key technologies, application fields, information resource management and many other aspects, with considerable discussion of key technologies such as virtualization, storage and MapReduce. On the whole, however, most papers are surveys or forward-looking discussions. In terms of application fields, library and information science shows a clear trend of studying and borrowing from cloud computing [7], whereas attention to and applied research on cloud computing in the field of geological data remain scarce.
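The kind of keyword statistics reported above (number of distinct keywords, cumulative frequency and a frequency ranking) can be reproduced with a short script. The sketch assumes each paper is represented simply as a list of keyword strings; it is not the tool actually used for this study, and the function name is hypothetical.

```python
from collections import Counter


def keyword_statistics(papers):
    """Return (distinct keywords, cumulative frequency, ranking) for a corpus.

    papers is a list of keyword lists, one list per paper. Keywords are
    compared case-insensitively after stripping surrounding whitespace.
    """
    counts = Counter(
        keyword.strip().lower() for keywords in papers for keyword in keywords
    )
    return len(counts), sum(counts.values()), counts.most_common()
```

Applied to the 852 retrieved papers, the first two return values would correspond to the 1,376 distinct keywords and the cumulative frequency of 3,020 quoted above, and the ranking would give the ordered keyword list.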
4 Cloud computing and geological data services
4.1 Geological data and service status
Geological data are an important category of national basic data. Since the founding of New China, a large volume of geological data has been accumulated through the unified geological data collection system. National basic geology and strategic mineral geology data resources span 12 categories and more than 50 kinds, with a data volume exceeding 10 TB, covering regional geology, mineral geology, hydrogeology, engineering geology and environmental geology, agricultural geology, marine geology, basic geology, geochemistry, geophysics, geoscience research, geological data and remote sensing [8].
At present, China manages geological data under a framework of two-level supervision and three-level preservation. Because of fragmentation and other factors, the sharing and service of geological data still fall well short of what is needed: the degree of digitization is low, information islands are a serious problem, and geological data cannot meet the needs of national construction and society in a timely and effective way.
In 2002, the State Council promulgated the Regulations on Geological Data Management, and in 2003, the Ministry of Land and Resources issued the Measures for the Implementation of the Regulations on Geological Data Management. The management and service of geological data have received unprecedented attention. The Ministry of Land and Resources has successively promoted the collection and delivery of geological data, entrusted custody of geological data, clustering of geological data and industrialized services. The management and service of geological data began to show a new situation. Because the transformation of management and service mode is a long-term process, the importance of geological data work has not been fully revealed, and the social attention to geology and minerals is still far behind "land", "ocean" and "meteorology", only slightly higher than "surveying and mapping" (Figure 4).
4.2 Cloud computing is an opportunity to change the service mode of geological data
In terms of its emergence and development, cloud computing is a high-performance, scalable, low-cost, service-oriented computing model that inherits and integrates virtualization, mass data storage, distributed parallel computing frameworks, and intelligent and automated management. Cloud computing is pushing the information industry toward major changes in socialization, intensification and specialization.
Socialization: Internet computing is becoming a social infrastructure, and the current trend is to establish centralized cloud computing centers of various kinds to deliver services to society at large scale.
Intensification: scattered and extensive software development and applications are integrated and software is componentized, which raises platform utilization, allows computing resources to be organized and configured virtually and scaled up or down flexibly, and optimizes service processes through software reuse and flexible recombination.
Specialization: for multiple tenants, services become more refined and standardized, transparent, and rented on demand [9].
Figure 4 Comparison of search-volume trends for geology and related terms
Geological data services and information sharing are typical data-intensive computing services, which match the basic characteristics of the cloud computing model. The introduction of cloud computing is therefore a natural opportunity to promote the clustering and industrialization of geological data information services. From a technical point of view, the construction of the national geological data center is critical. One option is to plan a dedicated geological data cloud that provides the complete set of SPI services (Software as a Service, SaaS; Platform as a Service, PaaS; Infrastructure as a Service, IaaS), covering two-level supervision, three-level preservation and public services. This centralized deployment not only reduces technical difficulty but also improves the efficiency of investment and use. Alternatively, the national geological data center can be planned as a three-tier data center system that is "logically unified and physically distributed". This community-cloud deployment fits the current state of China's geological data sector and is relatively simple to organize and implement. Whichever approach is chosen, a unified architecture, mature technology, consistent standards and security are all important issues to consider.
5 Conclusion
Unlike grid computing, cloud computing has developed from practice to theory: by the time researchers turned their attention to it, many working examples of cloud computing already existed. China's basic research in cloud computing still lags behind, but the library and information sector has followed and applied cloud computing prominently, and some knowledge-based services have reached the level of specialization and industrialization. The introduction of the cloud computing model can be expected to greatly promote the transformation of geological data services toward clustering and industrialization, and thus to better realize the sharing of geological data and results across society.
References
[1] Wikipedia. Cloud computing. http://zh.wikipedia.org/wiki/ (Chinese-language entry on cloud computing), 2012.
[2] Peter Mell, Timothy Grance. The NIST Definition of Cloud Computing. NIST Special Publication 800-145, 2011.
[3] Li Deyi, Lin Runhua, Zheng Weimin, et al. Cloud computing technology development report [M]. Beijing: Science Press, 2011.
[4] Yang Chunxia, Wang Shengjie, Wang Chunmin. On the evolution of computing models and its influence on marine geological data processing [J]. Marine Geodynamics, 2004, 20(2): 32-36.
[5] Fang Liu, Jin Tong, Jian Mao, et al. NIST Cloud Computing Reference Architecture. NIST Special Publication 500-292, 2011.
[6] Michael Armbrust, Armando Fox, Rean Griffith, et al. Above the Clouds: A Berkeley View of Cloud Computing. http://www.eecs.berkeley.edu/pubs/techrpts/2009/eecs-2009-28.pdf, 2009.
[7] Zhang. Summary of cloud computing research in library and information science in China [J]. Journal of National Library, 2010, (3): 73-76.
[8] Department of Mineral Resources and Reserves, Ministry of Land and Resources. Promote the cluster industrialization of geological information services [M]. Beijing: Geological Publishing House, 2011.
[9] Li Deyi. Cloud computing supports the socialization, intensification and specialization of information services [J]. Journal of Chongqing University of Posts and Telecommunications, 2010, 22(6): 698-702.