Paper
I haven't written a blog post for a month, which is a little embarrassing.

I went home to rest for a few days over the Dragon Boat Festival, so I need to pick up the pace in June ~

Back to the topic: HOG (Histograms of Oriented Gradients) is a classic image feature extraction method, used especially in pedestrian detection. The paper was published at CVPR in 2005, and a paper that is still widely read more than ten years later is well worth studying.

Key idea:

The appearance and shape of a local object can be described well by the distribution of local intensity gradients or edge directions.
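To make the idea concrete, here is a minimal sketch of an orientation histogram for a single cell, written in plain Python. The function name `cell_histogram`, the clamped-border differences, and the hard assignment of each pixel to one bin are my own simplifications for illustration; the paper interpolates votes between neighbouring bins, so treat this as a rough picture rather than the authors' implementation.

```python
import math

def cell_histogram(cell, bins=9):
    """Unsigned (0-180 degree) gradient orientation histogram of one cell.

    `cell` is a list of rows of grayscale values. Each pixel votes with
    its gradient magnitude into a single bin (hard assignment, an
    assumption made here for brevity).
    """
    h, w = len(cell), len(cell[0])
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(h):
        for x in range(w):
            # Centred differences; indices are clamped at the borders.
            gx = cell[y][min(x + 1, w - 1)] - cell[y][max(x - 1, 0)]
            gy = cell[min(y + 1, h - 1)][x] - cell[max(y - 1, 0)][x]
            magnitude = math.hypot(gx, gy)
            # Fold the direction into [0, 180) so opposite signs share a bin.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // bin_width) % bins] += magnitude
    return hist

# An 8x8 cell with a vertical edge: dark left half, bright right half.
vertical_edge = [[0, 0, 0, 0, 1, 1, 1, 1] for _ in range(8)]
hist = cell_histogram(vertical_edge)
```

For this toy cell every non-zero gradient points horizontally, so all of the weight lands in the first bin. That is exactly the kind of local edge information the descriptor is meant to capture.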

Main steps:

The figure above is taken from the paper. Personally, I find the figures in the blogs listed under Resources easier to follow.

Specific details:

Each step is already explained clearly in this blog, so I won't repeat the details here.

The images in the paper's data set are 64×128 pixels. The block size is 16×16, the block stride is 8×8, the cell size is 8×8, and bins = 9 (the number of orientation bins in each histogram).
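With these settings the length of the final feature vector can be worked out by hand. The small helper below just does that arithmetic; the name `hog_feature_length` and its parameter names are mine, with "cell" meaning the 8×8 unit and "stride" meaning the block span.

```python
def hog_feature_length(win_w=64, win_h=128, block=16, stride=8, cell=8, bins=9):
    """Number of values in the HOG descriptor of one detection window."""
    # Block positions when a block slides across the window by `stride`.
    blocks_x = (win_w - block) // stride + 1   # (64 - 16) / 8 + 1 = 7
    blocks_y = (win_h - block) // stride + 1   # (128 - 16) / 8 + 1 = 15
    # Each 16x16 block holds 2 x 2 cells, each with a `bins`-bin histogram.
    cells_per_block = (block // cell) ** 2     # 4
    return blocks_x * blocks_y * cells_per_block * bins

length = hog_feature_length()
print(length)  # 3780
```

So each 64×128 image becomes a 7 × 15 × 4 × 9 = 3780-dimensional vector.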

Once the feature vector of each image has been computed, a linear SVM is trained as the classifier.
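As a rough illustration of the classifier stage, the sketch below trains a linear SVM with plain batch subgradient descent on the regularised hinge loss. This is a stand-in I wrote to keep the example self-contained, not the solver used in the paper. The names `train_linear_svm` and `predict`, the hyperparameters, and the tiny 2-D data (standing in for real HOG vectors) are all assumptions.

```python
def train_linear_svm(X, y, lam=0.01, epochs=200, lr=0.1):
    """Batch subgradient descent on the L2-regularised hinge loss.

    X is a list of feature vectors, y a list of labels in {-1, +1}.
    Returns the weight vector w and the bias b.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    b = 0.0
    for _ in range(epochs):
        grad_w = [lam * wi for wi in w]   # gradient of the regulariser
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:                # sample violates the margin
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def predict(w, b, x):
    """Return +1 (e.g. pedestrian) or -1 (background)."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

# Toy, linearly separable data in place of 3780-dimensional HOG features.
X = [[2.0, 0.0], [3.0, 1.0], [-2.0, 0.0], [-3.0, -1.0]]
y = [1, 1, -1, -1]
w, b = train_linear_svm(X, y)
```

In practice I would use an existing SVM library rather than a hand-written loop; the point here is only that the classifier is a single weight vector applied to the HOG feature.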

The following figure is an example given by the author:

Both of these blogs are very good, and I recommend reading them.