Causal Inference Recommendation System Toolbox - CFF (1)
CIKM-2021 | Beijing Key Laboratory of Big Data Management and Analysis Methods | Counterfactual Review-based Recommendation

To address the sparse and unbalanced reviews that plague existing review-based recommender systems, this paper proposes using counterfactual samples in a feature-aware recommendation scenario to improve model performance. The authors generate counterfactual samples by intervening on a user's preferences (as reflected in some of the user's reviews), and train the recommendation model jointly on the observed and counterfactual samples. Rather than generating counterfactuals at random, a learning-based method is used to produce the samples that improve the model the most. The authors also provide a theoretical analysis and discuss the relationship between the number of generated samples and the noise they introduce into the model.

Existing review-based methods can be divided into two categories, as shown in sub-figure A below.

However, none of the above methods touches the essential problem of review-based recommendation: sparse and unbalanced data. Review information can greatly improve the performance of a recommendation system, but its sparsity and imbalance make accurate and efficient recommendation very challenging, and considerable effort is needed for a model to reach satisfactory performance. Statistics on the Amazon dataset show that users who review frequently are rare, and each review mentions only a few items and aspects.

The authors therefore borrow the idea of counterfactuals: by adjusting a user's preferences as little as possible so that the ranking result changes, they generate counterfactual samples.

The authors use pairwise learning with the BPR loss [19]; the loss function is shown below. Here the training samples are triples (u, i, j) in which user u prefers item i over item j, σ is the sigmoid function, f is the recommendation model (more precisely, the ranking model), f(u, i) denotes user u's predicted preference score for item i, and the second term as a whole is the regularizer.
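The original figure with the loss is not reproduced here; the standard pairwise BPR objective it most likely corresponds to, written with the symbols above, is

$$
\mathcal{L}_{\text{BPR}} = -\sum_{(u,i,j)\in\mathcal{D}} \ln \sigma\big(f(u,i)-f(u,j)\big) + \lambda\,\lVert\Theta\rVert^2,
$$

where $\mathcal{D}$ is the set of training triples in which user $u$ interacted with item $i$ but not with item $j$, and $\Theta$ are the model parameters.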

As mentioned earlier, user reviews are sparse. At the same time, a user's attention to different features (aspects) of an item shapes their preference. In the example below, a user who cares about brand would choose the iPhone, while one who cares more about price would choose the Xiaomi phone. Counterfactual samples can therefore be obtained by intervening on the user's feature attention, and their labels can be obtained by having an (existing, possibly pre-trained) recommendation model predict on the perturbed samples.
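As a concrete illustration of this labeling idea (a minimal sketch, not the paper's code), the snippet below intervenes on one feature weight and asks a hypothetical pretrained scorer `pretrained_score` which item it now prefers; all names and numbers are illustrative.

```python
import numpy as np

def pretrained_score(user_vec, item_vec):
    # Placeholder for a pre-trained review-based recommendation model.
    return float(np.dot(user_vec, item_vec))

def relabel_after_intervention(user_vec, item_a, item_b, feature_idx, new_weight):
    """Intervene on how much the user cares about one feature, then ask the
    pretrained model which item it now prefers; that preference becomes the
    label of the counterfactual sample."""
    cf_user = user_vec.copy()
    cf_user[feature_idx] = new_weight          # e.g. stop caring about brand
    prefers_a = pretrained_score(cf_user, item_a) > pretrained_score(cf_user, item_b)
    return cf_user, prefers_a

# Toy example: features = [brand, price]; shifting attention away from brand
# flips the preference from the branded phone to the cheaper one.
user = np.array([0.9, 0.1])
iphone, xiaomi = np.array([0.9, 0.2]), np.array([0.3, 0.9])
cf_user, prefers_iphone = relabel_after_intervention(user, iphone, xiaomi, 0, 0.0)
```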

A naive approach is to perturb the features the user attends to at random, but this is clearly suboptimal because samples and features differ in importance [12]. Instead, the authors use a learning-based method to generate counterfactual samples (by now a fairly standard way of constructing them). Following [1, 12], they learn how to change the model's decision by changing the user's attention to features (the features represent the user's preferences), and generate counterfactual samples accordingly. In effect, the model's decision boundary is used to reflect the latent structure or patterns of the data. A schematic is shown in sub-figure B of the figure above.

Specifically, the authors introduce a perturbation vector Δ whose elements act on the item's features (or, equivalently, on the features' latent-vector representations), with one entry for each feature in the set of all features A. The best perturbation is then found using the objective shown in the figure below.

Here, the score f(u, i) is computed as shown in the figure below from the user and item feature matrices, which respectively record how much each user attends to each feature and how well each item performs on each feature.
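The scoring figure is not reproduced here; a plausible form consistent with the description, stated only as an assumption, is a feature-level match between the two matrices:

$$
f(u,i) = X_u \, Y_i^{\top},
$$

where $X_u$ is user $u$'s attention over the features and $Y_i$ is item $i$'s quality on those features.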

It is worth noting that while the perturbation is being optimized, the recommendation model's parameters are kept fixed: the first term of the loss seeks the smallest possible perturbation, while the second term seeks to reverse the model's preference order over the two items.
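Putting the two terms together, a hedged reconstruction of the perturbation objective (the exact form in the paper may differ, e.g. in the choice of the ranking-flip term) is

$$
\Delta^{*} = \arg\min_{\Delta}\; \lVert\Delta\rVert_2^2 + \alpha\,\max\big(0,\; \epsilon + f(u,i \mid X_u+\Delta) - f(u,j \mid X_u+\Delta)\big),
$$

where $i$ is the item originally preferred over $j$, so the hinge term vanishes only once the ranking has been flipped by margin $\epsilon$, while the first term keeps the intervention small. A minimal PyTorch sketch of this step under the same assumptions, with the model frozen and only Δ optimized, might look as follows (function and variable names are illustrative):

```python
import torch

def score(user_vec, item_vec):
    # Hypothetical feature-level matching score: dot product between the user's
    # attention over features and the item's quality on those features.
    return (user_vec * item_vec).sum()

def learn_perturbation(user_vec, item_pos, item_neg,
                       alpha=1.0, margin=0.1, steps=200, lr=0.05):
    """Optimize a small perturbation `delta` on the user's feature-attention
    vector so that the frozen model's preference flips from item_pos to item_neg."""
    delta = torch.zeros_like(user_vec, requires_grad=True)
    optimizer = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        perturbed = user_vec + delta
        s_pos = score(perturbed, item_pos)   # originally preferred item
        s_neg = score(perturbed, item_neg)   # originally less preferred item
        # First term keeps the intervention minimal; the hinge term pushes the
        # originally worse item above the originally better one.
        loss = delta.pow(2).sum() + alpha * torch.clamp(margin + s_pos - s_neg, min=0.0)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return delta.detach()

# Toy usage with 5 features; in the paper these vectors would come from the
# review-derived user/item feature matrices.
u = torch.tensor([0.9, 0.1, 0.3, 0.0, 0.5])
i = torch.tensor([0.8, 0.2, 0.4, 0.1, 0.6])
j = torch.tensor([0.2, 0.9, 0.1, 0.7, 0.3])
delta = learn_perturbation(u, i, j)
```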

This part has covered the research background, the base model, and the idea of generating counterfactual samples; the next part continues with the details of controlling counterfactual generation and the theoretical analysis.

The authors of this paper are also well-known researchers at Rutgers University, so the recipe is very similar to CCF (1) and DCCF (1): first use counterfactual samples to augment the model, with generation done mainly by learning and aimed at producing so-called "hard samples" that maximize model performance, and finally analyze the relationship between the model's error rate, the number of samples, and the noise.

At the same time, counterfactual generation here relies on a pre-trained, weaker recommendation model to assign labels to the counterfactual samples, and the model then trained on them achieves higher performance (somewhat reminiscent of bootstrapping).

[1] Ehsan Abbasnejad, Damien Teney, Amin Parvaneh, Javen Shi, and Anton van den Hengel. 2020. Counterfactual Vision and Language Learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 10044–10054.

[4] Rose Catherine and William Cohen. 2017. TransNets: Learning to Transform for Recommendation. arXiv preprint arXiv:1704.02298 (2017).

[7] Hongzhi Yin, Guanhua Ye, Meng Wang, et al. 2020. Try This Instead: Personalized and Interpretable Substitute Recommendation. (2020).

[12] Yash Goyal, Ziyan Wu, Jan Ernst, Dhruv Batra, Devi Parikh, and Stefan Lee. 2019. Counterfactual Visual Explanations. arXiv preprint arXiv:1904.07451 (2019).

[19] Steffen Rendle, Christoph Freudenthaler, Zeno Gantner, and Lars Schmidt-Thieme. 2009. BPR: Bayesian Personalized Ranking from Implicit Feedback. In Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence. AUAI Press, 452–461.