What programming models does cloud computing usually adopt?
1) MapReduce

MapReduce is a programming model proposed by Jeffrey Dean and Sanjay Ghemawat of Google for processing and generating large-scale data sets. Conceptually, MapReduce takes a set of input key-value pairs and produces a set of output key-value pairs. The user specifies a Map function that transforms each input pair into a set of intermediate key-value pairs, and a Reduce function that merges all intermediate values associated with the same intermediate key. Programmers only need to design the Map and Reduce functions according to their business logic; the distribution and high-concurrency machinery is provided by the MapReduce runtime system.
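
As a concrete illustration, the sketch below simulates the MapReduce model in a single Python process using the classic word-count example. The function names and the in-memory shuffle are illustrative assumptions; a real MapReduce system distributes the map and reduce tasks across a cluster.

```python
# A minimal single-process sketch of the MapReduce model (word count).
# Real systems distribute these phases across many machines.
from collections import defaultdict


def map_fn(doc_id, text):
    # Map: emit an intermediate (word, 1) pair for every word.
    for word in text.split():
        yield word, 1


def reduce_fn(word, counts):
    # Reduce: merge all values that share the same intermediate key.
    yield word, sum(counts)


def run_mapreduce(records, mapper, reducer):
    # Shuffle phase: group intermediate values by key.
    groups = defaultdict(list)
    for key, value in records:
        for k, v in mapper(key, value):
            groups[k].append(v)
    # Reduce phase: one reducer call per distinct intermediate key.
    output = []
    for k in sorted(groups):
        output.extend(reducer(k, groups[k]))
    return output


docs = [("d1", "the quick brown fox"), ("d2", "the lazy dog jumps over the fox")]
print(run_mapreduce(docs, map_fn, reduce_fn))
# [('brown', 1), ('dog', 1), ('fox', 2), ('jumps', 1), ('lazy', 1),
#  ('over', 1), ('quick', 1), ('the', 3)]
```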

Most readers will already be familiar with the internal mechanics of MapReduce, so they are not repeated here.

MapReduce has been widely used inside Google, for tasks including inverted index construction, distributed sorting, Web access log analysis, machine learning, statistical machine translation, and document clustering.

Hadoop, an open-source implementation of MapReduce, is supported and used by a large number of companies, including Yahoo!, Facebook, and IBM.

2) Dryad

Dryad is a data-parallel programming system designed and implemented by Microsoft that lets programmers use the computing resources of a cluster or data center. Conceptually, a Dryad application is represented as a directed acyclic graph (DAG): vertices represent computations, for which application developers write ordinary serial programs, and the edges between vertices represent channels over which data flows. A channel can be implemented as a file, a TCP pipe, or a shared-memory FIFO.

Dryad is analogous to the Unix pipeline. If a Unix pipeline is regarded as one-dimensional, in that data flows in a single direction, each stage has one input and one output, and the whole data flow forms a linear chain, then Dryad can be regarded as a two-dimensional, distributed pipeline: a computing vertex can consume multiple input streams and produce multiple output streams after processing, and a Dryad job as a whole is a DAG.
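
The sketch below illustrates this DAG idea in plain Python: each vertex is a serial function, each edge is an in-memory channel, and a vertex runs once all of its inputs are ready. The vertex names and list-based channels are assumptions made for illustration; Dryad itself schedules vertices across a cluster and also supports files and TCP pipes as channels.

```python
# A minimal sketch of a Dryad-style job: a DAG whose vertices run serial
# code and whose edges carry data. Channels here are in-memory lists.
from graphlib import TopologicalSorter


def source(inputs):
    # No input channels: generate the initial data stream.
    return list(range(10))

def evens(inputs):
    # One input channel, one output stream.
    return [x for x in inputs[0] if x % 2 == 0]

def odds(inputs):
    return [x for x in inputs[0] if x % 2 == 1]

def merge(inputs):
    # Two input channels, one output stream.
    return sorted(inputs[0] + inputs[1])


vertices = {"source": source, "evens": evens, "odds": odds, "merge": merge}
# Edges: vertex -> its predecessor vertices (the sources of its inputs).
edges = {"evens": ["source"], "odds": ["source"], "merge": ["evens", "odds"]}

# Run each vertex after all of its predecessors have produced output.
channels = {}
for v in TopologicalSorter(edges).static_order():
    channels[v] = vertices[v]([channels[p] for p in edges.get(v, [])])

print(channels["merge"])  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```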

3) Pregel

Pregel is a general programming model proposed by Google for large-scale graph computation. Many practical applications involve algorithms over very large graphs, such as web link graphs, social networks, geographic maps, and the citation relationships among scientific papers; some of these graphs reach billions of vertices and trillions of edges. The Pregel programming model is designed for efficient computation over graphs of this scale.
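
Pregel organizes computation as a sequence of supersteps in the bulk-synchronous parallel style: in each superstep, every active vertex processes the messages it received in the previous superstep, may update its own value, and may send messages along its outgoing edges for delivery in the next superstep. The sketch below imitates this vertex-centric loop in a single Python process, using single-source shortest paths as the example algorithm; the graph, names, and single-threaded superstep loop are illustrative assumptions, since a real Pregel system partitions vertices across many workers.

```python
# A minimal single-process sketch of the Pregel "think like a vertex"
# model: single-source shortest paths over a small directed graph.
import math

# Directed graph: vertex -> list of (neighbor, edge_weight).
graph = {
    "a": [("b", 1), ("c", 4)],
    "b": [("c", 2), ("d", 6)],
    "c": [("d", 3)],
    "d": [],
}

state = {v: math.inf for v in graph}  # per-vertex value: best known distance
messages = {"a": [0]}                 # superstep 0: seed the source vertex

while messages:                       # halt when no messages are in flight
    next_messages = {}
    for v, incoming in messages.items():
        best = min(incoming)
        if best < state[v]:           # vertex "wakes up" and updates itself
            state[v] = best
            for neighbor, weight in graph[v]:
                next_messages.setdefault(neighbor, []).append(best + weight)
    messages = next_messages          # barrier: advance to the next superstep

print(state)  # {'a': 0, 'b': 1, 'c': 3, 'd': 6}
```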