You want to learn the full stack of data analysis and image processing? And the whole Internet on top of that? Are you kidding?
For machine learning:
Haven't you already studied probability theory, linear algebra, and matrix multiplication? Nobody asks you to actually grind through the calculations; you only need to know the formulas and the derivations.
Variance, expectation — you learned those long ago. The normal distribution, the Poisson distribution, the Bernoulli distribution — these are very simple.
Isn't linear regression just y = wx + bias? Elementary-school stuff, except that here w and x are written as vectors, and vectors are elementary- or junior-high material. Is that difficult?
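Don't believe it's that simple? Here is a minimal sketch (the weights and features are made-up numbers): the "vector" version of y = wx + bias is just a dot product.

```python
# Linear regression prediction: y = w . x + b, where w and x are vectors.
def predict(w, x, b):
    """Dot product of weight vector w and feature vector x, plus the bias."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

w = [2.0, -1.0]   # made-up weight vector
x = [3.0, 4.0]    # one sample with two features
b = 0.5
y = predict(w, x, b)   # 2*3 + (-1)*4 + 0.5 = 2.5
```

That's the whole "vectorized" mystery: multiply element by element, add them up, add the bias.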
Gradient descent is just chain-rule differentiation, which is basic calculus. Is that difficult?
Logistic regression just pushes the linear output y through a sigmoid function so it lands between 0 and 1. Do you find that difficult?
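The squashing step is one line. A minimal sketch of the sigmoid, nothing more:

```python
import math

def sigmoid(z):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Feed the linear output w*x + b through it and you have logistic regression.
p = sigmoid(0.0)   # exactly 0.5 at z = 0
```

Large positive z gives a value near 1, large negative z a value near 0 — that's the whole trick.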
For classification you use maximum likelihood: multiply together the probabilities of all the points (take the log first to turn the product into a sum), then set the derivative to zero. Is that really difficult?
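If the words sound scary, here is a toy sketch with coin flips (the data is made up, and I use a grid search instead of solving the derivative by hand): the p that maximizes the likelihood is just the sample mean.

```python
import math

def log_likelihood(p, flips):
    """Log of the product of per-point probabilities: a sum of logs."""
    return sum(math.log(p if f == 1 else 1 - p) for f in flips)

flips = [1, 1, 0, 1, 0]   # made-up data: 3 heads, 2 tails
# Setting the derivative to zero gives p = mean(flips) = 0.6;
# confirm by brute force over a grid of candidate p values.
best_p = max((p / 100 for p in range(1, 100)),
             key=lambda p: log_likelihood(p, flips))
```

Multiply, take logs, maximize — there is nothing else hiding in "maximum likelihood".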
Bayes' theorem is just one formula, built on conditional probability. Look at the formula and a few examples — is that really difficult?
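One formula means one line of code. A sketch with made-up numbers (the disease-test figures below are illustrative, not real statistics):

```python
def bayes(prior, likelihood, evidence):
    """Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)."""
    return likelihood * prior / evidence

# Illustrative numbers: P(disease) = 0.01, P(positive|disease) = 0.9,
# P(positive) = 0.05. Then P(disease|positive) = 0.9 * 0.01 / 0.05 = 0.18.
posterior = bayes(0.01, 0.9, 0.05)
```

Plug in three probabilities, get the fourth. That is the entire theorem.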
Optimization just means putting one or more constraints on a function. You really can't understand that?
Lagrange multipliers just fold the constraints into the objective function with a few extra variables — remember them from calculus?
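A worked example of my own choosing (not from the text): maximize f(x, y) = x*y subject to x + y = 1. The Lagrangian L = x*y - lam*(x + y - 1) gives y = lam, x = lam, x + y = 1, so x = y = 0.5. A brute-force check along the constraint agrees:

```python
# Maximize x * y subject to x + y = 1, i.e. maximize x * (1 - x) on [0, 1].
# The Lagrange conditions y - lam = 0, x - lam = 0, x + y = 1
# give x = y = lam = 0.5; verify numerically on a grid.
best = max((x / 1000 for x in range(1001)), key=lambda x: x * (1 - x))
```

The multiplier method just finds the same stationary point analytically instead of by search.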
Discriminative models, like linear regression and logistic regression, put strong demands on your hypothesis. A generative example is the naive Bayes classifier, which is just Bayes' theorem again. Is that really difficult?
PCA just finds eigenvalues and eigenvectors: it's a method that looks for the projection direction with the maximum variance, because the greater the variance, the more representative the direction. Nobody asks you to do the math yourself — just look at the picture and think about it, okay?
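If you insist on seeing the math anyway, here is a dependency-free sketch for a 2x2 covariance matrix of my own invention: the top eigenvector is the direction of maximum projection variance.

```python
import math

def top_eigen(a, b, c):
    """Largest eigenvalue and its unit eigenvector for the symmetric
    2x2 matrix [[a, b], [b, c]], via the quadratic formula."""
    lam = (a + c + math.sqrt((a - c) ** 2 + 4 * b * b)) / 2
    # The eigenvector satisfies (a - lam) * vx + b * vy = 0 (assumes b != 0).
    vx, vy = b, lam - a
    norm = math.hypot(vx, vy)
    return lam, (vx / norm, vy / norm)

# Made-up covariance [[2, 1], [1, 2]]: top eigenvalue 3, direction (1, 1)/sqrt(2).
lam, v = top_eigen(2, 1, 2)
```

Project the data onto v and you have the first principal component — the picture you were told to look at.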
Isn't k-means just clustering? Drop a few random points, assign the surrounding points to the nearest one by distance, then update the centers each round. Nobody makes you write the algorithm by hand.
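But if you did write it by hand, it would fit in a dozen lines. A 1-D sketch with made-up data (the points, seed, and round count are mine):

```python
import random

def kmeans(points, k, rounds=10, seed=0):
    """Pick k random centers, assign each point to the nearest center,
    move each center to the mean of its cluster, and repeat."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(rounds):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: (p - centroids[i]) ** 2)
            clusters[nearest].append(p)
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return sorted(centroids)

# Two obvious 1-D clusters, around 0 and around 10.
data = [0.0, 0.1, 0.2, 9.9, 10.0, 10.1]
centers = kmeans(data, 2)   # converges to roughly [0.1, 10.0]
```

Assign, average, repeat — that's the whole loop you were worried about.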
SVM is a classification method: the support vectors pin down the line or curve that separates the points with as wide a margin as possible. Is that difficult? And isn't the kernel function just a replacement for X.transpose*X?
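The "replacement" is easy to check yourself. A sketch with vectors I made up: the polynomial kernel (x^T y)^2 computed directly equals an ordinary inner product after an explicit feature map, which is exactly the kernel trick.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def poly_feature_map(x):
    """Explicit feature map for the 2-D polynomial kernel (x^T y)^2."""
    a, b = x
    return [a * a, math.sqrt(2) * a * b, b * b]

x, y = [1.0, 2.0], [3.0, 1.0]          # made-up sample vectors
k_direct = dot(x, y) ** 2               # kernel computed in the original space
k_mapped = dot(poly_feature_map(x), poly_feature_map(y))   # same number
```

The kernel lets the SVM use the mapped space without ever building it — swap one inner product for another, nothing more.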
A neural network is just logistic regression differentiated over and over. Forward and backward propagation — look at the derivation given in the book: it's the chain rule, only run the other way around. Is that difficult?
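One neuron makes the point. A sketch of my own (toy data, one logistic unit, log loss): the forward pass computes the output, the backward pass is the chain rule pushing the error back into the weights.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(samples, lr=0.5, epochs=2000):
    """One logistic neuron trained with forward and backward passes."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, t in samples:
            y = sigmoid(w * x + b)   # forward pass
            dz = y - t               # chain rule: dL/dz for log loss + sigmoid
            w -= lr * dz * x         # dL/dw = dL/dz * dz/dw
            b -= lr * dz             # dL/db = dL/dz * dz/db
    return w, b

# Made-up data: label 1 when x > 0, label 0 otherwise.
samples = [(-2.0, 0), (-1.0, 0), (1.0, 1), (2.0, 1)]
w, b = train(samples)
```

Stack layers of these and backpropagation is still nothing but that chain rule, applied layer by layer in reverse.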