The relationship between Hadoop and Google's MapReduce, GFS, and other technologies
Simply put, Hadoop is a framework built on the ideas behind Google's MapReduce and GFS, and it was later handed over to Apache as an open-source project.

MapReduce originated at Google. MapReduce, GFS, and Bigtable are often called Google's "troika," and Hadoop is an open-source implementation of the ideas behind them.

In 2003, Google published the paper "The Google File System" (GFS). GFS is a purpose-built distributed file system that Google designed to store its massive volumes of search data.

In 2004, Doug Cutting, the founder of Nutch, implemented a distributed file system named NDFS (Nutch Distributed File System) based on Google's GFS paper.

In 2004, Google published another paper, this one on MapReduce. MapReduce is a programming model for processing large data sets (larger than 1 TB) in parallel.
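
To make the programming model concrete, here is a minimal single-process sketch of the map, shuffle, and reduce steps applied to word counting. It is an illustration only, not Google's or Hadoop's implementation: the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical, and a real system would run these steps across many machines with a distributed file system underneath.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over a list of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["a b a", "b c"]))
```

Because each call to map_phase depends on only one document and each call to reduce_phase depends on only one key, both steps can in principle be spread across many workers, which is what makes the model suitable for very large data sets.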

In 2005, Doug Cutting implemented MapReduce in the Nutch search engine, based on Google's MapReduce paper.

In 2006, Yahoo hired Doug Cutting, who gave the upgraded NDFS and MapReduce the name Hadoop, and Yahoo set up a dedicated team for Doug Cutting to research and develop Hadoop.