The birth history of Hadoop
Founder: Doug Cutting, known as the father of Hadoop, a former chairman of the Apache Software Foundation and the initiator of Lucene, Nutch, Hadoop and other projects.

At first, Hadoop was only a part of Nutch, which was itself a sub-project of Apache Lucene.

Lucene is an open source full-text search engine toolkit, one that anyone who has built a search feature for a Java web application has probably come across.

It provides a complete query engine along with several text analysis engines.

Nutch builds on Lucene and adds web crawling and parsing, which is enough to develop a working search engine. To be put into production, however, such an engine must respond to queries almost immediately and must analyze and process hundreds of millions of web pages in a short time, which means dealing with distributed task processing, fault recovery and load balancing.

Later, Doug Cutting drew on two Google papers, "The Google File System" and "MapReduce: Simplified Data Processing on Large Clusters", implemented the techniques they describe, and named the result Hadoop.