What does Hadoop for big data do?
There are several ways to join multiple data sets in Hadoop. MapReduce performs joins by shuffling data between the map side and the reduce side; these reduce-side joins are flexible but can be very expensive operations. Pig and Hive can also join multiple data sets: Pig provides the replicated join, merge join, and skewed join, while Hive provides the map-side join and the full outer join for analyzing data.
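The reduce-side join described above can be sketched in plain Python. This is a minimal illustration, not Hadoop API code: the function names, the "L"/"R" tags, and the sample user/page-view data are all hypothetical, and an in-memory dictionary stands in for the shuffle that Hadoop performs between the map and reduce phases.

```python
from collections import defaultdict

def map_phase(records, key_index, tag):
    """Map step: emit (join_key, (tag, record)) pairs, tagging each
    record with its source data set so the reducer can pair them up."""
    for rec in records:
        yield rec[key_index], (tag, rec)

def reduce_join(mapped_pairs):
    """Reduce step: group tagged records by join key (modeling the
    shuffle), then emit the cross product of the two sides per key."""
    groups = defaultdict(lambda: ([], []))
    for key, (tag, rec) in mapped_pairs:
        groups[key][0 if tag == "L" else 1].append(rec)
    for key, (left, right) in groups.items():
        for l in left:
            for r in right:
                yield key, l, r

# Hypothetical sample data: users joined to their page views on user id.
users = [("u1", "Alice"), ("u2", "Bob")]
views = [("u1", "/home"), ("u1", "/about"), ("u2", "/home")]

pairs = list(map_phase(users, 0, "L")) + list(map_phase(views, 0, "R"))
joined = sorted(reduce_join(pairs))
# Each user record is paired with every matching page view: 3 rows here.
```

Because every record with the same key must travel across the network to one reducer, this pattern is exactly why reduce-side joins are costly, and why Pig's replicated join or Hive's map-side join (which ship a small table to every mapper instead) are preferred when one side fits in memory.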

The important point is that, with tools such as MapReduce, Pig, and Hive available, users can choose among them based on their built-in capabilities and actual needs. On analyzing large amounts of data in Hadoop, Anoop pointed out that, generally speaking, in the big data/Hadoop world some problems are not complicated and their solutions are simple; the real challenge lies in the volume of data, which calls for different solutions to the same problem.

Typical analysis tasks include counting distinct IDs in log files, transforming stored data within a specific date range, and ranking web users. All of these tasks can be solved with the various tools and technologies in Hadoop, such as MapReduce, Hive, Pig, Giraph, and Mahout, and these tools can be flexibly extended with custom routines.
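The first task listed, counting distinct IDs in log files, can be sketched as follows. This is a simplified single-machine model, assuming a hypothetical log format where the ID is the first whitespace-separated field; in a real Hadoop job the mapper would emit each ID as a key and the framework's shuffle would collapse the duplicates, which a Python set models here.

```python
def count_distinct_ids(log_lines):
    """Count unique IDs across log lines. A set stands in for the
    MapReduce shuffle that would deduplicate keys across reducers."""
    ids = set()
    for line in log_lines:
        fields = line.split()
        if fields:                 # skip blank lines
            ids.add(fields[0])     # assumption: ID is the first field
    return len(ids)

# Hypothetical log lines: "u1" appears twice but is counted once.
logs = [
    "u1 GET /home",
    "u2 GET /about",
    "u1 GET /search",
]
distinct = count_distinct_ids(logs)  # 2 distinct IDs: u1 and u2
```

The same pattern scales in Hadoop precisely because deduplication by key is what the shuffle does naturally; the equivalent one-liner in Hive would be a `COUNT(DISTINCT ...)` query over the log table.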