Common similarity measurement methods
Drawing on Zou Bo's slides (PPT), this article summarizes five common similarity measures.

1. Minkowski distance:
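The formula for this item did not survive on the page, so the following is a minimal sketch based on the standard definition, d(x, y) = (sum_i |x_i - y_i|^p)^(1/p), rather than on the original slides. The function name `minkowski_distance` and its parameters are illustrative choices. With p = 1 it gives the Manhattan distance, with p = 2 the Euclidean distance, and as p goes to infinity the Chebyshev distance.

```python
def minkowski_distance(x, y, p=2):
    """Minkowski distance of order p between two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    if p < 1:
        raise ValueError("p must be >= 1 for this to be a distance")
    if p == float("inf"):
        # Limit case: Chebyshev distance (largest coordinate difference).
        return max(abs(a - b) for a, b in zip(x, y))
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


# Same pair of points under three different orders p.
a, b = [0, 0], [3, 4]
print(minkowski_distance(a, b, p=1))             # Manhattan
print(minkowski_distance(a, b, p=2))             # Euclidean
print(minkowski_distance(a, b, p=float("inf")))  # Chebyshev
```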

2. Jaccard distance:
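As an illustration only (the original formula is missing here), the sketch below uses the usual set-based definition: the Jaccard similarity is |A ∩ B| / |A ∪ B|, and the Jaccard distance is one minus that value. The function names are hypothetical, and treating two empty sets as identical is a convention assumed for this example.

```python
def jaccard_similarity(a, b):
    """Size of the intersection divided by size of the union."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Assumption: two empty sets are treated as identical.
        return 1.0
    return len(a & b) / len(union)


def jaccard_distance(a, b):
    """Jaccard distance = 1 - Jaccard similarity."""
    return 1.0 - jaccard_similarity(a, b)


# Two sets sharing two of four distinct elements.
print(jaccard_distance({"a", "b", "c"}, {"b", "c", "d"}))
```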

3. Cosine similarity:
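A minimal sketch of the standard definition, cos(x, y) = (x · y) / (|x| |y|), is given here because the formula is missing from the page. The function name is an illustrative choice, and raising an error for a zero vector is an assumption of this example (the cosine is undefined in that case).

```python
import math


def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_x * norm_y)


# Vectors pointing the same way score 1 regardless of their length.
print(cosine_similarity([1, 2, 3], [2, 4, 6]))
# Orthogonal vectors score 0.
print(cosine_similarity([1, 0], [0, 1]))
```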

4. Pearson correlation coefficient:
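The sketch below follows the textbook definition, r = cov(x, y) / (std(x) std(y)), since the original formula is not available on the page. The function name is hypothetical, and rejecting constant vectors is an assumption of this example. Note that the computation is the cosine of the two vectors after each has been centered on its own mean.

```python
import math


def pearson_correlation(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Center each vector on its own mean.
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    denom = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denom == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return sum(a * b for a, b in zip(dx, dy)) / denom


# Adding a constant to y does not change the correlation.
print(pearson_correlation([1, 2, 3], [1, 2, 4]))
print(pearson_correlation([1, 2, 3], [11, 12, 14]))
```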

For the relationship between Euclidean distance, cosine similarity, and the Pearson correlation coefficient, see the related discussion on Zhihu. In summary:

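A. After the data are standardized (each vector centered on its mean and scaled to unit length), the Pearson correlation coefficient, cosine similarity, and the square of the Euclidean distance can be regarded as equivalent: the first two coincide, and the squared Euclidean distance equals 2(1 - cosine similarity), so all three rank pairs of vectors identically.

B. The Pearson correlation coefficient is an improvement on cosine similarity for the case of missing dimension values, since it centers each vector on its mean before the cosine is taken.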
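The equivalence in point A can be checked numerically. The sketch below is only an illustration: the helper names and the sample vectors are made up for this example, and "standardization" is assumed to mean centering on the mean and scaling to unit L2 norm.

```python
import math


def standardize(v):
    """Center v on its mean, then scale it to unit L2 norm."""
    mean = sum(v) / len(v)
    centered = [a - mean for a in v]
    norm = math.sqrt(sum(a * a for a in centered))
    if norm == 0:
        raise ValueError("a constant vector cannot be standardized")
    return [a / norm for a in centered]


def cosine_similarity(x, y):
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y)))


def pearson_correlation(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def squared_euclidean(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))


# Arbitrary sample data, chosen only for illustration.
x = [1.0, 2.0, 4.0, 7.0]
y = [2.0, 1.0, 5.0, 6.0]
xs, ys = standardize(x), standardize(y)

r = pearson_correlation(x, y)        # Pearson on the raw data
cos_std = cosine_similarity(xs, ys)  # cosine on the standardized data
d2_std = squared_euclidean(xs, ys)   # squared Euclidean on the standardized data

# The three printed values should agree up to floating-point error.
print(r, cos_std, 1 - d2_std / 2)
```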

5. Kullback-Leibler divergence (relative entropy, KL divergence)

A. KL divergence is asymmetric: in general, the divergence from P to Q is not equal to the divergence from Q to P.

B. KL divergence does not satisfy the triangle inequality (the sum of any two sides is greater than the third side, and the difference of any two sides is less than the third side), so it is not a true distance metric.
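Both properties can be seen on small discrete distributions. The sketch below assumes the standard discrete definition D(P || Q) = sum_i p_i log(p_i / q_i), using the natural logarithm. The function name and the three example distributions are illustrative choices, not taken from the original slides.

```python
import math


def kl_divergence(p, q):
    """D(P || Q) in nats for two discrete distributions on the same support."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue  # the term 0 * log(0 / q) is taken to be 0
        if qi == 0:
            return math.inf  # P puts mass where Q has none
        total += pi * math.log(pi / qi)
    return total


# Example distributions over two outcomes (arbitrary, for illustration).
P = [0.5, 0.5]
Q = [0.25, 0.75]
R = [0.1, 0.9]

# Asymmetry: swapping the arguments changes the value.
print(kl_divergence(P, Q), kl_divergence(Q, P))

# Triangle inequality fails here: the direct divergence from P to R
# exceeds the sum of the divergences going through Q.
print(kl_divergence(P, R), kl_divergence(P, Q) + kl_divergence(Q, R))
```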

References:

1. /Question/19734616

2. /Question/41252833