Paper Reading: "Structural Deep Clustering Network"
Clustering is a fundamental task in data analysis. In recent years, deep clustering, which draws its inspiration from deep learning approaches, has achieved state-of-the-art performance and attracted considerable attention. Current deep clustering methods usually boost the clustering results by means of the powerful representation ability of deep learning, e.g., the autoencoder, suggesting that learning an effective representation for clustering is a crucial requirement. The strength of deep clustering methods is to extract useful representations from the data itself, rather than from the structure of the data, which receives scarce attention in representation learning. Motivated by the great success of Graph Convolutional Networks (GCN) in encoding graph structure, we propose a Structural Deep Clustering Network (SDCN) to integrate structural information into deep clustering. Specifically, we design a delivery operator to transfer the representations learned by the autoencoder to the corresponding GCN layer, and a dual self-supervised mechanism to unify these two different deep neural architectures and guide the update of the whole model. In this way, the multiple structures of the data, from low-order to high-order, are naturally combined with the multiple representations learned by the autoencoder. Furthermore, we theoretically analyze the delivery operator: through it, the GCN improves the autoencoder-specific representation with a high-order graph regularization constraint, while the autoencoder helps alleviate the over-smoothing problem in the GCN. Through comprehensive experiments, we demonstrate that the proposed model consistently performs better than state-of-the-art techniques.

The key point of this paper: it extends the single-view deep clustering model of DEC with the capture of structural information, using a GCN to encode that structure. Unlike GAE, the GCN part is not supervised by reconstructing the adjacency matrix; instead, the clustering target distribution is used to build a second, structure-aware distribution, so the structural information is supervised by the same clustering objective.

Legend: $X$ and $\hat{X}$ are the input data and the reconstructed data, respectively. $H^{(\ell)}$ and $Z^{(\ell)}$ are the outputs of the $\ell$-th layer of the DNN and GCN modules, respectively. Different colors represent the different representations $H^{(\ell)}$ learned by the DNN. The blue solid line indicates that the target distribution $P$ is calculated from the distribution $Q$, and the two red dashed lines indicate the dual self-supervised mechanism. The target distribution $P$ guides the updates of the DNN module and the GCN module at the same time.

Summary: first, construct a KNN graph from the raw data. Then feed the raw data and the KNN graph into the AE and the GCN, respectively. The authors connect each layer of the AE with the corresponding GCN layer, so that the AE-specific representations can be integrated into the structure-aware representations through the delivery operator. At the same time, a dual self-supervised mechanism is proposed to supervise the training of both the AE and the GCN. A sketch of the KNN preprocessing step follows.
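As a minimal sketch of that preprocessing step (my own helper, assuming scikit-learn is available; the paper discusses heat-kernel or dot-product similarity for choosing neighbors, while plain Euclidean nearest neighbors are used here for simplicity):

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph

def build_knn_graph(X: np.ndarray, k: int = 10) -> np.ndarray:
    """Build a symmetric 0/1 KNN adjacency matrix A from raw features X (n x d)."""
    # A[i, j] = 1 iff j is among the k nearest neighbors of i.
    A = kneighbors_graph(X, n_neighbors=k, mode="connectivity",
                         include_self=False).toarray()
    # Symmetrize so the graph is undirected, as the GCN expects.
    return np.maximum(A, A.T)
```

The resulting adjacency matrix, together with the raw features, is what the GCN branch consumes.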

Note: generally speaking, the number of layers quoted when describing an AE architecture refers to the layers from the first hidden layer up to the coding layer, excluding the input layer and the reconstruction layers.

The DNN module adopts a standard multi-layer AE architecture, which will not be described in detail here.
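For concreteness, here is a minimal PyTorch sketch of such an AE. The 500-500-2000-10 layer sizes follow the configuration commonly used with SDCN, but treat them and the class name as illustrative rather than the authors' exact code:

```python
import torch
import torch.nn as nn

class AE(nn.Module):
    """Fully connected autoencoder. Every encoder output (h1, h2, h3, z) is
    returned, because SDCN delivers these layer-wise representations to the
    matching GCN layers."""
    def __init__(self, d_in, d1=500, d2=500, d3=2000, d_z=10):
        super().__init__()
        self.enc1 = nn.Linear(d_in, d1)
        self.enc2 = nn.Linear(d1, d2)
        self.enc3 = nn.Linear(d2, d3)
        self.z_layer = nn.Linear(d3, d_z)   # coding layer
        self.dec1 = nn.Linear(d_z, d3)
        self.dec2 = nn.Linear(d3, d2)
        self.dec3 = nn.Linear(d2, d1)
        self.x_bar = nn.Linear(d1, d_in)    # reconstruction layer

    def forward(self, x):
        h1 = torch.relu(self.enc1(x))
        h2 = torch.relu(self.enc2(h1))
        h3 = torch.relu(self.enc3(h2))
        z = self.z_layer(h3)
        d = torch.relu(self.dec1(z))
        d = torch.relu(self.dec2(d))
        d = torch.relu(self.dec3(d))
        return self.x_bar(d), h1, h2, h3, z
```

The reconstruction loss $L_{res}$ is then simply the mean squared error between the input $X$ and the reconstruction $\hat{X}$.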

Step 1: obtain the convolution output $Z^{(\ell)}$ of the $\ell$-th GCN layer:

$$Z^{(\ell)} = \phi\left(\tilde{D}^{-1/2}\tilde{A}\tilde{D}^{-1/2}\,\tilde{Z}^{(\ell-1)}\,W^{(\ell-1)}\right)$$

where $\tilde{A} = A + I$ and $\tilde{D}$ is its degree matrix.

The convolution itself is the standard graph-convolutional propagation rule. The difference is the input: before each layer, the authors mix in the representation matrix of the corresponding DNN layer (which is then propagated through the normalized adjacency matrix), choosing a balance factor $\epsilon$ to combine the information from the DNN and the GCN:

$$\tilde{Z}^{(\ell-1)} = (1-\epsilon)\,Z^{(\ell-1)} + \epsilon\,H^{(\ell-1)}$$

The paper fixes $\epsilon = 0.5$; this mixing is the delivery operator.

Step 2: for the first GCN layer, however, there is no DNN representation to mix in yet, so the input is just the raw data $X$:

$$Z^{(1)} = \phi\left(\tilde{D}^{-1/2}\tilde{A}\tilde{D}^{-1/2}\,X\,W^{(1)}\right)$$

Step 3: to construct the structure-aware information distribution, a multi-class softmax layer is adopted as the last GCN layer:

$$Z = \mathrm{softmax}\left(\tilde{D}^{-1/2}\tilde{A}\tilde{D}^{-1/2}\,Z^{(L)}\,W^{(L)}\right)$$

Each entry $z_{ij}$ of the result indicates the probability that sample $i$ belongs to cluster center $j$, so $Z$ can be regarded as a probability distribution over the clusters.
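Putting Steps 1 and 2 together, a minimal PyTorch sketch of one GCN layer with the delivery operator (helper names are mine, not the paper's code; the normalized adjacency $\hat{A} = \tilde{D}^{-1/2}\tilde{A}\tilde{D}^{-1/2}$ is precomputed once):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def normalize_adj(A: torch.Tensor) -> torch.Tensor:
    """Compute D~^{-1/2} (A + I) D~^{-1/2} once, up front."""
    A_tilde = A + torch.eye(A.size(0))
    d_inv_sqrt = torch.diag(A_tilde.sum(dim=1).pow(-0.5))
    return d_inv_sqrt @ A_tilde @ d_inv_sqrt

class DeliveryGCNLayer(nn.Module):
    def __init__(self, d_in, d_out, eps: float = 0.5):
        super().__init__()
        self.W = nn.Linear(d_in, d_out, bias=False)
        self.eps = eps

    def forward(self, A_hat, Z_prev, H_prev=None, activation=F.relu):
        # Delivery operator: mix the previous GCN representation with the
        # corresponding AE representation; skipped for the first layer,
        # whose input is the raw data X.
        if H_prev is None:
            Z_tilde = Z_prev
        else:
            Z_tilde = (1 - self.eps) * Z_prev + self.eps * H_prev
        return activation(A_hat @ self.W(Z_tilde))
```

For the last layer, calling it with `activation=lambda t: F.softmax(t, dim=1)` yields the distribution $Z$ from Step 3.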

The overall objective combines the AE reconstruction loss with two KL-divergence terms, $L = L_{res} + \alpha\,KL(P\,\|\,Q) + \beta\,KL(P\,\|\,Z)$, where $Q$ is the soft cluster-assignment distribution computed from the AE's coding layer, $P$ is the sharpened target distribution derived from $Q$, and $Z$ is the GCN's distribution from Step 3. Advantages of this objective function:

(1) Compared with a traditional multi-class classification loss, the KL divergence updates the whole model in a "gentler" way (soft labels), preventing the learned data representations from being severely disturbed;

(2) The GCN and DNN modules are unified under the same optimization target, so their results tend to become consistent during training.

Because the goal of both the DNN module and the GCN module is to approximate the target distribution $P$, and there is a strong connection between the two modules, the mechanism is called dual self-supervision.
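A sketch of this dual self-supervised objective in PyTorch, assuming $Q$ is DEC's Student's t soft assignment over learnable cluster centers and $P$ is the usual sharpened target distribution (the weights `alpha` and `beta` are placeholders, not the paper's tuned values):

```python
import torch
import torch.nn.functional as F

def soft_assignment(h, centers, v: float = 1.0):
    """DEC-style Student's t kernel: q_ij = soft assignment of sample i to center j."""
    dist2 = torch.cdist(h, centers).pow(2)
    q = (1.0 + dist2 / v).pow(-(v + 1) / 2)
    return q / q.sum(dim=1, keepdim=True)

def target_distribution(q):
    """Sharpen q: p_ij = (q_ij^2 / f_j) / sum_j'(q_ij'^2 / f_j'), with f_j = sum_i q_ij."""
    w = q.pow(2) / q.sum(dim=0)
    return w / w.sum(dim=1, keepdim=True)

def sdcn_loss(x, x_bar, q, z_dist, p, alpha=0.1, beta=0.01):
    l_res = F.mse_loss(x_bar, x)                               # AE reconstruction
    l_clu = F.kl_div(q.log(), p, reduction="batchmean")        # KL(P || Q), DNN side
    l_gcn = F.kl_div(z_dist.log(), p, reduction="batchmean")   # KL(P || Z), GCN side
    return l_res + alpha * l_clu + beta * l_gcn
```

In training, `p = target_distribution(q).detach()` is recomputed from $Q$ and treated as fixed when computing the loss, so both modules are pulled toward the same target $P$.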

In this paper, structural information is injected by coupling the GCN and AE representations at every layer. The experimental results verify the effectiveness of the model, theoretical support is provided, and the experiments are thorough. For modeling the structure of the relationships between samples, the work offers a new idea and further explores the overall structure of the sample set.

I have not yet worked through the theoretical-support part, and I am not going to cover it further here.