What mathematical basis does machine learning need?
Machine learning relies on many tools, the most important of which are mathematical, so a solid mathematical foundation can be regarded as the key that opens the door to machine learning. The mathematics involved in machine learning covers three areas: linear algebra, probability and statistics, and optimization theory. This article introduces the basic mathematical knowledge each area contributes, so that readers can make better use of mathematical tools in everyday machine learning work.

First, linear algebra. One of its most important functions is to turn concrete things into abstract mathematical models: however complicated our world is, we can represent a piece of it as a vector or a matrix. When we use linear algebra to solve this representation problem, two parts are involved. The first is the theory of linear spaces, namely vectors, matrices, and linear transformations. The second is matrix analysis: given a matrix, we can perform singular value decomposition (SVD) or other factorizations and analyses. Together, these two parts make up the linear algebra we need in machine learning.
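As a concrete illustration of matrix analysis, the following minimal sketch uses NumPy to compute the SVD of a small matrix and then rebuilds the matrix from its factors (the matrix entries here are made-up toy values):

```python
import numpy as np

# A toy data matrix: rows could be features, columns could be samples.
A = np.array([[3.0, 1.0, 1.0],
              [-1.0, 3.0, 1.0]])

# Singular value decomposition: A = U @ diag(S) @ Vt
U, S, Vt = np.linalg.svd(A, full_matrices=False)

# Reconstruct A from its factors to confirm the decomposition holds.
A_rec = U @ np.diag(S) @ Vt
print(np.allclose(A, A_rec))
```

The singular values in `S` come back in descending order; keeping only the largest ones yields a low-rank approximation of the original matrix, which is the basis of SVD's use in dimensionality reduction.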

Next, probability and statistics, which we need in the evaluation process. The subject includes two aspects: mathematical statistics and probability theory. Mathematical statistics is generally easy to understand, and many models used in machine learning are derived from it; the simplest examples, linear regression and logistic regression, both come from statistics. Once an objective function has been specified, probability theory comes into play when we actually evaluate it: given a distribution, we want to know the expected value of the objective function, that is, how well the objective can be achieved on average. The evaluation process therefore mainly applies knowledge of probability and statistics.
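To make "the expected value of an objective function under a given distribution" concrete, here is a minimal sketch with made-up toy numbers: the label y is assumed to follow a Bernoulli(0.3) distribution, the objective is the squared error of a fixed prediction, and a Monte Carlo average over samples is compared against the exact expectation:

```python
import random

random.seed(0)

# Toy setup: label y ~ Bernoulli(p); objective is squared error of a fixed prediction.
p = 0.3
y_hat = 0.3

def objective(y):
    return (y - y_hat) ** 2

# Exact expectation: E[(y - y_hat)^2] = p*(1 - y_hat)^2 + (1 - p)*y_hat^2
exact = p * (1 - y_hat) ** 2 + (1 - p) * y_hat ** 2

# Monte Carlo estimate: average the objective over samples drawn from the distribution.
n = 100_000
samples = (1 if random.random() < p else 0 for _ in range(n))
mc = sum(objective(y) for y in samples) / n

print(exact, mc)
```

With enough samples the Monte Carlo average converges to the exact expectation, which is exactly the "on average, to what extent can this objective be achieved" question that probability theory answers.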

Finally, optimization theory, which machine learning obviously cannot do without. Within optimization, the main research direction is convex optimization. Convex optimization certainly has limitations, but its benefits are obvious, such as simplifying the solution of a problem. In optimization we seek a maximum or a minimum, but in practice we may encounter local maxima, local minima, and saddle points. Convex optimization avoids this problem: in a convex problem, any local minimum is the global minimum (and likewise any local maximum of a concave objective is global). In practice, however, especially since the introduction of neural networks and deep learning, objectives are usually non-convex, so the scope of convex optimization has narrowed and it no longer applies in many cases; here we mainly rely on unconstrained optimization instead. In neural networks, the most widely used optimization approach is gradient-based, with gradients computed by back propagation.
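The benefit of convexity described above can be sketched with plain gradient descent on a simple convex function, f(x) = (x - 3)^2 (the function, starting point, and learning rate are made-up toy choices): because the objective is convex, its only stationary point is the global minimum, so the iteration cannot get trapped at a spurious local minimum or saddle point:

```python
# Gradient descent on the convex function f(x) = (x - 3)^2.
# For a convex objective, a local minimum is the global minimum,
# so plain gradient descent converges to it from any starting point.

def grad(x):
    # Derivative of (x - 3)^2 with respect to x.
    return 2 * (x - 3)

x = 0.0    # starting point
lr = 0.1   # learning rate
for _ in range(200):
    x -= lr * grad(x)

print(x)
```

On a non-convex objective, such as a neural network loss, the same update rule still works step by step, but it no longer carries this global guarantee, which is why convexity matters.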