Summary of saliency detection (complete compilation)
I. The paper by Cheng Mingming et al., "Salient Object Detection: A Survey" (a brief summary of the parts of the article that I consider most important)

The purpose of this paper is to comprehensively review the latest progress in salient object detection and to relate it to other closely related fields, such as generic scene segmentation, object proposal generation, and saliency for fixation prediction. The main contents include: 1) roots, key concepts, and tasks; 2) core techniques and main modeling trends; 3) datasets and evaluation metrics in salient object detection. Open problems, such as future research directions, are also discussed.

1.

1.1 What is a salient object?

The paper notes that a good saliency detection model is generally expected to meet at least the following three criteria: 1) Good detection: the probability of missing truly salient regions and of wrongly marking the background as salient should be low; 2) High resolution: the saliency map should have high or full resolution, so that salient objects are located accurately and the original image information is retained; 3) Computational efficiency: as the front end of other, more complex processes, these models should detect salient regions quickly.

1.3 History of salient object detection

(1) The earliest and most classic saliency model is the one proposed by Itti et al.

In the evaluation measures below, S denotes the predicted saliency map and G is the ground-truth binary mask of salient objects.

(1) Precision-recall (PR). First, the saliency map S is converted into a binary mask M; precision and recall are then computed by comparing M with the ground truth G:

$$\mathrm{Precision} = \frac{|M \cap G|}{|M|}, \qquad \mathrm{Recall} = \frac{|M \cap G|}{|G|}$$

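(2) F-measure: In general, neither precision nor recall alone can fully evaluate the quality of a saliency map. The F-measure is therefore proposed as the weighted harmonic mean of precision and recall, with a non-negative weight $\beta^2$:

$$F_\beta = \frac{(1 + \beta^2)\,\mathrm{Precision} \times \mathrm{Recall}}{\beta^2\,\mathrm{Precision} + \mathrm{Recall}}$$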

(3) ROC curve: a curve with the false positive rate (FPR) and the true positive rate (TPR) as its axes, obtained by varying the binarization threshold.

(4) Area under the ROC curve (AUC): the larger the AUC, the better the performance.

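(5) Mean absolute error (MAE): used for a more comprehensive comparison; it is the average per-pixel absolute difference between the saliency map S and the ground truth G.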
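The five measures above can be sketched in a few lines of NumPy. This is only a minimal illustration, not the survey's reference code: the function names are made up for this example, S is assumed to be scaled to [0, 1], G is assumed to contain both salient and background pixels, and the default weight `beta2=0.3` is an assumption chosen to favor precision.

```python
import numpy as np

def binarize(S, threshold):
    """Convert a saliency map S (values in [0, 1]) into a binary mask M."""
    return np.asarray(S, dtype=float) >= threshold

def precision_recall(M, G):
    """Precision = |M ∩ G| / |M| and Recall = |M ∩ G| / |G| for binary masks."""
    M = np.asarray(M, dtype=bool)
    G = np.asarray(G, dtype=bool)
    overlap = np.logical_and(M, G).sum()
    precision = overlap / M.sum() if M.sum() > 0 else 0.0
    recall = overlap / G.sum() if G.sum() > 0 else 0.0
    return float(precision), float(recall)

def f_measure(precision, recall, beta2=0.3):
    """Weighted harmonic mean of precision and recall (beta2 is beta squared)."""
    denom = beta2 * precision + recall
    return (1.0 + beta2) * precision * recall / denom if denom > 0 else 0.0

def mae(S, G):
    """Mean absolute per-pixel difference between S and G."""
    S = np.asarray(S, dtype=float)
    G = np.asarray(G, dtype=float)
    return float(np.mean(np.abs(S - G)))

def roc_auc(S, G, num_thresholds=256):
    """Sweep thresholds to collect (FPR, TPR) points, then integrate by trapezoids.

    Assumes G has at least one salient and one background pixel.
    """
    S = np.asarray(S, dtype=float)
    G = np.asarray(G, dtype=bool)
    pos, neg = G.sum(), (~G).sum()
    fpr, tpr = [0.0], [0.0]  # start of the curve: nothing marked salient
    for t in np.linspace(1.0, 0.0, num_thresholds):
        M = S >= t
        tpr.append(np.logical_and(M, G).sum() / pos)
        fpr.append(np.logical_and(M, ~G).sum() / neg)
    fpr, tpr = np.array(fpr), np.array(tpr)
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

if __name__ == "__main__":
    S = np.array([[0.9, 0.8], [0.2, 0.1]])
    G = np.array([[1, 1], [0, 0]])
    p, r = precision_recall(binarize(S, 0.5), G)
    print(p, r, f_measure(p, r), mae(S, G), roc_auc(S, G))
```

In practice the PR curve is obtained by calling `binarize` with every threshold in a sweep, in the same way `roc_auc` does for the ROC curve.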

Figure 12: popular salient object detection datasets.

II. Supplement on traditional saliency detection (the classification used in the paper does not quite match my own habits, so I collected the material again and reorganized it)

Common saliency detection methods:

1. Cognitive models

Almost all models are directly or indirectly inspired by cognitive models, whose hallmark is the combination of psychology and neuroscience. The Itti model (which uses three feature channels: color, intensity, and orientation) is the representative of this class and the basis of many later derivative models.

2. Information-theoretic models

The essence of these models is to maximize the information gathered from the visual environment; the most influential one is the AIM model.
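The information-theoretic idea can be illustrated with a toy example: a pixel is salient when its value is rare, that is, when its self-information -log2 p is high. The sketch below is a deliberate simplification and not the AIM model itself: it estimates p from a global histogram of raw intensities (assumed to lie in [0, 1]), and the function name and the `bins` parameter are made up for this example.

```python
import numpy as np

def self_information_saliency(image, bins=16):
    """Score each pixel by -log2 p(v), with p taken from the image-wide histogram."""
    image = np.asarray(image, dtype=float)
    # Quantize intensities in [0, 1] into `bins` histogram cells.
    idx = np.clip((image * bins).astype(int), 0, bins - 1)
    counts = np.bincount(idx.ravel(), minlength=bins)
    p = counts / counts.sum()
    # Every index that occurs has a non-zero count, so the log is finite.
    return -np.log2(p[idx])

if __name__ == "__main__":
    img = np.zeros((4, 4))
    img[2, 1] = 1.0  # one rare bright pixel on a uniform background
    print(self_information_saliency(img))
```

A single bright pixel on a uniform background receives the highest score because its intensity occurs only once in the image.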

3. Graph-theoretic models

Saliency models based on graph theory treat eye-movement data as a time series and use hidden Markov models, dynamic Bayesian networks, and conditional random fields. Graph models can capture complex attention mechanisms and can therefore achieve better predictive ability. Their drawback is high model complexity, especially with respect to training and readability. A typical example is GBVS.

4. Frequency-domain models

Saliency models based on spectral analysis are simple and easy to explain and implement, and they have been very successful in fixation prediction and salient region detection, but their biological plausibility is not very clear. The classic model is the spectral residual saliency detection model (a purely mathematical computation).
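As a rough illustration of how little code the spectral residual idea needs, here is a minimal NumPy sketch. It follows the usual recipe (log amplitude spectrum minus its local average, recombined with the original phase and transformed back), but several details are assumptions made for brevity: a box filter stands in for the final Gaussian smoothing, no downsampling is done, and the helper names are made up for this example.

```python
import numpy as np

def box_blur(a, k=3):
    """k x k mean filter with edge padding (k must be odd)."""
    pad = k // 2
    padded = np.pad(a, pad, mode="edge")
    out = np.zeros(a.shape, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + a.shape[0], dx:dx + a.shape[1]]
    return out / (k * k)

def spectral_residual_saliency(image, k=3):
    """Return a saliency map in [0, 1] for a 2-D grayscale image."""
    image = np.asarray(image, dtype=float)
    spectrum = np.fft.fft2(image)
    log_amplitude = np.log(np.abs(spectrum) + 1e-8)  # epsilon avoids log(0)
    phase = np.angle(spectrum)
    # Spectral residual: what remains after removing the smooth trend.
    residual = log_amplitude - box_blur(log_amplitude, k)
    # Back to the spatial domain, keeping the original phase.
    saliency = np.abs(np.fft.ifft2(np.exp(residual + 1j * phase))) ** 2
    saliency = box_blur(saliency, k)  # stand-in for Gaussian smoothing
    span = saliency.max() - saliency.min()
    if span == 0:
        return np.zeros_like(saliency)
    return (saliency - saliency.min()) / span

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.random((32, 32))
    print(spectral_residual_saliency(img).shape)
```

The whole model is two Fourier transforms and a local average, which is why this family is described as easy to implement.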

III. Supplement on saliency detection based on deep learning (the paper was written in 2014 and its coverage of deep learning is incomplete, so the topic is supplemented here)

In its early stage, research on deep-learning-based salient object detection, from the first detection neural networks onward, struggled to achieve satisfactory results. R-CNN, introduced in 2014, became the first scheme truly usable in industrial applications, and it raised the mAP on the VOC2007 test set to 66%. However, the R-CNN framework still has many problems:

1) Training is split into several stages and the steps are cumbersome: fine-tuning the network + training SVMs + training the bounding-box regressor.

2) Training is time-consuming and takes up a lot of disk space: 5,000 images produce hundreds of gigabytes of feature files.

3) Slow speed: processing one image takes 47 s using a GPU and the VGG-16 model.

So far, research on deep-learning-based salient object detection falls into two categories: object detection based on region proposals and object detection based on regression.

The deep learning object detection methods based on region proposals include R-CNN, SPP-net, Fast R-CNN, Faster R-CNN, R-FCN, etc.

1) R-CNN (Regions with CNN features) is expensive in both time and space;

2) SPP-net (spatial pyramid pooling) makes better use of the CNN and accepts input images of different sizes. It further emphasizes the idea of computing CNN features early and deferring region processing to a later stage, which greatly reduces the amount of computation. However, it is not an end-to-end model, and the parameters of the CNN feature extractor are not tuned jointly;

3) Fast R-CNN solves the repeated computation of the previous two methods and shares convolutional computation between region proposals and object detection. RoI pooling, proposed here for the first time, takes full advantage of deferred region processing and speeds up training. The CNN model is VGG-16, and results improve because the parameters are tuned jointly, but the model is still not end-to-end and relies heavily on Selective Search (SS) region proposals.

4) Faster R-CNN abandons selective search and proposes the RPN (region proposal network) to compute candidate boxes. Using an end-to-end network for object detection greatly improves both speed and accuracy, but the speed still does not meet real-time requirements, the computation needed to classify each proposal is still large, and the functionality does not yet extend to instance segmentation.
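The RoI pooling step mentioned in item 3) can be sketched as follows: each region, whatever its size, is divided into a fixed grid of bins and max-pooled per bin, so that regions of any shape yield a fixed-size output. This is a simplified single-channel illustration; the function name is made up, the RoI is assumed to be given already in feature-map coordinates, and the bin-rounding rule is only one of several conventions used in practice.

```python
import math
import numpy as np

def roi_max_pool(feature_map, roi, output_size=(2, 2)):
    """Max-pool one RoI to a fixed output size.

    roi is (x1, y1, x2, y2) in feature-map coordinates, inclusive on both ends.
    """
    x1, y1, x2, y2 = roi
    region = feature_map[y1:y2 + 1, x1:x2 + 1]
    h, w = region.shape
    out_h, out_w = output_size
    pooled = np.empty((out_h, out_w), dtype=feature_map.dtype)
    for i in range(out_h):
        # Floor for the bin start and ceil for the bin end, so the bins
        # cover the whole region; every bin holds at least one row.
        ys = (i * h) // out_h
        ye = max(math.ceil((i + 1) * h / out_h), ys + 1)
        for j in range(out_w):
            xs = (j * w) // out_w
            xe = max(math.ceil((j + 1) * w / out_w), xs + 1)
            pooled[i, j] = region[ys:ye, xs:xe].max()
    return pooled

if __name__ == "__main__":
    fmap = np.arange(16).reshape(4, 4)
    print(roi_max_pool(fmap, (0, 0, 3, 3)))  # 4x4 region -> 2x2
    print(roi_max_pool(fmap, (1, 1, 3, 3)))  # 3x3 region -> 2x2
```

Because the convolutional feature map is computed once per image and only this cheap pooling runs per region, the repeated per-region CNN computation of R-CNN is avoided.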

The deep learning object detection methods based on regression include YOLO, SSD, and G-CNN, together with the NMS post-processing step.

1) YOLO (You Only Look Once) turns the object detection task into a regression problem, which greatly simplifies the detection pipeline and speeds up detection. However, it uses global information when predicting the object window, which brings high redundancy; it has no region proposal mechanism, and its detection accuracy is low.

2) SSD (Single Shot MultiBox Detector) uses the features around a position when making a prediction for it. By combining YOLO's regression idea with the candidate-region mechanism of Faster R-CNN, it keeps YOLO's speed while ensuring localization accuracy.

3) G-CNN focuses on reducing the number of initial proposals, turning tens of thousands of proposals into a small number of initial grid boxes, which improves detection speed;

4) NMS (non-maximum suppression) iteratively removes duplicate candidate boxes and keeps the box with the highest confidence. Strictly speaking, it is a post-processing step used by detectors rather than a detector in its own right.
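The NMS procedure in item 4) is short enough to write out in full. The sketch below is the common greedy variant in plain Python; boxes are assumed to be (x1, y1, x2, y2) tuples, and the default IoU threshold of 0.5 is an assumption for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy non-maximum suppression; returns kept indices, best score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)  # highest-scoring box still in play
        keep.append(best)
        # Drop every remaining box that overlaps the kept box too much.
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep

if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    print(nms(boxes, scores))
```

With these inputs the second box overlaps the first with an IoU of 81/119 (about 0.68) and is suppressed, so the kept indices are 0 and 2.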

At present, deep learning object detection based on region proposals is widely used in practical applications.

Research status of saliency detection methods based on deep learning:

The R-CNN family and YOLO provide two basic frameworks for object detection based on deep learning. Building on these frameworks, researchers have proposed a series of methods that improve detection performance from other angles, such as hard example mining, multi-layer feature fusion, the use of context information, and learning features with deeper networks.

Original link: /QQ _ 32493539/ article/details /79530 1 18.