What is the difference between data mining and data analysis?
1. Data mining

Data mining is the process of extracting previously unknown, valuable information and knowledge from large volumes of data using statistics, artificial intelligence, machine learning and similar methods. It mainly addresses four kinds of problems: classification, clustering, association and prediction, covering both quantitative and qualitative questions. Its focus is on discovering unknown patterns and regularities. Its output is a model or a set of rules, from which model scores or labels are derived. Examples of model scores are churn probability, total score, similarity and predicted value. Examples of labels are high-, medium- and low-value users, churn or no churn, and good or bad credit. The main techniques are decision trees, neural networks, association rules, cluster analysis and other methods from statistics, artificial intelligence and machine learning.

Taken together, data analysis (in the narrow sense) and data mining share the same essence: both discover business knowledge (valuable information) from data, which helps enterprises operate, improve products and make better decisions. Data analysis in the narrow sense and data mining therefore together make up data analysis in the broad sense. In focus, output and methods, however, data mining differs from data analysis in the narrow sense.
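To make the idea of model scores and labels more concrete, here is a minimal Python sketch. It assumes a hypothetical loss (churn) model: the feature names, weights and the 0.5 threshold are invented for illustration, whereas a real model would learn them from data. The score is a probability, and the label is obtained by thresholding it.

```python
import math

# Hypothetical weights and bias, invented for illustration only.
# A real data mining model would learn these from historical data.
WEIGHTS = {"days_since_last_login": 0.08, "support_tickets": 0.5}
BIAS = -3.0

def churn_score(user):
    """Model score: estimated probability (0..1) that the user is lost."""
    z = BIAS + sum(WEIGHTS[name] * user[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))  # logistic function

def churn_label(score, threshold=0.5):
    """Label: a category derived from the score."""
    return "churn" if score >= threshold else "no churn"

# Two made-up users: one recently active, one inactive with complaints.
users = [
    {"days_since_last_login": 2, "support_tickets": 0},
    {"days_since_last_login": 40, "support_tickets": 3},
]
scores = [churn_score(u) for u in users]
labels = [churn_label(s) for s in scores]

for user, score, label in zip(users, scores, labels):
    print(user, round(score, 3), label)
```

The same pattern applies to the other labels mentioned above: a credit model would output a score and map it to good or bad credit, and a value model would map a score to high, medium or low.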

2. Data analysis

Data analysis can be described as a set of operations or algorithms applied to data. Its goal is to sort, filter and process the data according to prior constraints in order to obtain information. Data mining then evaluates the value of the information that data analysis produces. The two can even be recursive: the result of data analysis is information, which is in turn mined as data, and data mining itself relies on data analysis, so the cycle repeats. The difference between the two therefore lies in their roles: data analysis produces information, and data mining evaluates it.

The specific differences between the two are:

(Strictly speaking, data analysis is a broad term that includes data mining. In the comparison below, "data analysis" mainly means statistical analysis.)

Data volume: Data analysis may work with a small amount of data, while data mining typically works with a very large amount.

Constraint: Data analysis starts from a hypothesis and builds an equation or model to test it, while data mining needs no prior hypothesis and can derive the equation or model automatically.

Object: Data analysis usually targets numerical data, while data mining can use different types of data, such as voice and text.

Results: Data analysis explains its results and presents useful information, while the results of data mining are often harder to explain. Data mining evaluates the value of information, focuses on predicting the future and offers decision suggestions.
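The contrast in the "Constraint" item can be illustrated with a small Python sketch. The spending figures are made up, and the tiny one-dimensional 2-means loop is only a stand-in for a real clustering algorithm. The first part starts from a hypothesis and checks it. The second part is given no group labels and finds the groups on its own, although fixing the number of clusters at two is itself an assumption.

```python
from statistics import mean

# Made-up spending figures for two customer groups (illustration only).
group_a = [20, 22, 19, 21]
group_b = [48, 52, 50, 47]

# Analysis: start from a hypothesis ("group B spends more than group A")
# and check it against the data.
hypothesis_holds = mean(group_b) > mean(group_a)

# Mining: pool the values, ignore the group labels, and let a simple
# 2-means loop discover the grouping.
def two_means(values, iterations=10):
    lo, hi = min(values), max(values)  # initial cluster centres
    low, high = [], []
    for _ in range(iterations):
        # Assign each value to the nearer centre.
        low = [v for v in values if abs(v - lo) <= abs(v - hi)]
        high = [v for v in values if abs(v - lo) > abs(v - hi)]
        # Move each centre to the mean of its cluster.
        lo, hi = mean(low), mean(high)
    return sorted(low), sorted(high)

clusters = two_means(group_a + group_b)

print("Hypothesis holds:", hypothesis_holds)
print("Discovered clusters:", clusters)
```

On this data the discovered clusters coincide with the original groups, but nothing in the clustering step was told that two such groups exist.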

Data analysis is a tool for turning data into information, and data mining is a tool for turning information into knowledge. To extract rules (that is, knowledge) from data, the two usually need to be combined.

For example, suppose you take 50 yuan to the vegetable market to buy food. You want a mix of meat and vegetables, chosen from chicken, duck, fish, pork and all kinds of vegetables. You ask prices one by one, keep running the numbers in your head, and end up with a set of information. This is data analysis. To make a choice, you then need to evaluate the value of that information against your own preferences, nutritional value, balanced pairing, meal plan, the most cost-effective combination and so on, and finally settle on a purchase plan. This is data mining.
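The market example can be sketched in Python as follows. All prices and preference scores are hypothetical, and the rule "one meat plus two vegetables" is an assumption added to keep the search small. The first step summarizes the prices collected. The second step weighs that information against preferences and the 50 yuan budget to pick a plan.

```python
from itertools import combinations

BUDGET = 50  # yuan, as in the example

# Hypothetical data: name -> (price in yuan, personal preference score).
meats = {"chicken": (25, 7), "duck": (30, 6), "fish": (28, 9), "pork": (22, 8)}
vegetables = {"cabbage": (4, 5), "tomato": (6, 8), "spinach": (5, 6), "mushroom": (12, 9)}

# Step 1 (data analysis in the example): summarize the prices you asked for.
cheapest_meat = min(meats, key=lambda name: meats[name][0])
average_vegetable_price = sum(price for price, _ in vegetables.values()) / len(vegetables)

# Step 2 (data mining in the example): evaluate the information and choose
# the plan with the highest preference score that fits the budget.
def best_plan():
    best, best_score = None, -1
    for meat, (meat_price, meat_score) in meats.items():
        for veg_pair in combinations(vegetables, 2):
            cost = meat_price + sum(vegetables[v][0] for v in veg_pair)
            score = meat_score + sum(vegetables[v][1] for v in veg_pair)
            if cost <= BUDGET and score > best_score:
                best, best_score = (meat, *veg_pair, cost), score
    return best, best_score

plan, plan_score = best_plan()
print("Cheapest meat:", cheapest_meat)
print("Average vegetable price:", average_vegetable_price)
print("Chosen plan (items, total cost):", plan, "score:", plan_score)
```

Changing the preference scores changes the chosen plan even when the prices stay the same, which is the point of the example: the same information can lead to different decisions depending on how its value is judged.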

Only by combining data analysis and data mining can data be put to practical use and deliver its full value.