Over the past three years, Alibaba DAMO Academy has kept pace with both academic research on and practical applications of artificial intelligence. Whether at international academic summits and competitions or in commercializing academic results, it has delivered an impressive record, attracting top researchers in artificial intelligence to gather there.
No doubt many readers are curious about the research these top researchers are carrying out!
From 19:30 to 21:00 on the evening of July 9th, AI Science and Technology Review will join Alibaba Group, an equally solid presence in academic research, to present a live broadcast in the "ACL 2020 Paper Interpretation Series: Alibaba Special"!
Six senior algorithm experts, algorithm engineers, and research interns from the Machine Intelligence Technology team of Alibaba DAMO Academy and the Alibaba Security Intelligence team will cover NLP subfields such as multi-task learning, few-shot text classification, task-oriented dialogue, neural machine translation, knowledge distillation, and cross-domain word segmentation and tagging, bringing you a feast of paper interpretations!
Who are the guests this time? Here they are, one by one.

Sharing theme: SpanMlt: A Span-based Multi-Task Learning Framework for Pair-wise Aspect and Opinion Terms Extraction
Sharing guest: Huang
Shared content:
Aspect term extraction and opinion term extraction are two key sub-problems of fine-grained aspect-based sentiment analysis (ABSA). Aspect-opinion pairs give consumers and opinion-mining systems a complete picture of the product or service under discussion. However, without the aspect terms and opinion terms given in advance, traditional methods cannot directly output aspect-opinion pairs. Although researchers have recently proposed co-extraction methods that jointly extract aspect terms and opinion terms, these methods cannot extract the two as pairs. This paper therefore proposes an end-to-end method for the pair-wise extraction of aspect and opinion terms. Moreover, it approaches the problem as joint extraction of terms and relations, rather than the sequence-labeling formulation used in most previous work. We propose a span-based multi-task learning framework with shared span representations: terms are extracted under the supervision of span boundaries, while the span representations are simultaneously used to identify the pairing relation. Extensive experiments show that our model consistently outperforms SOTA methods.
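The span-pairing idea can be illustrated with a deliberately simplified, non-neural sketch. In the paper the aspect/opinion classifiers and the pairing scorer are learned from span representations; here a toy lexicon stands in for them, and every detected aspect span is paired with every detected opinion span.

```python
def enumerate_spans(tokens, max_len=3):
    """All (start, end) spans up to max_len tokens, end exclusive."""
    spans = []
    for i in range(len(tokens)):
        for j in range(i + 1, min(i + max_len, len(tokens)) + 1):
            spans.append((i, j))
    return spans

# Toy lexicons standing in for the learned span classifiers.
ASPECTS = {"battery life"}
OPINIONS = {"too short"}

def span_text(tokens, span):
    return " ".join(tokens[span[0]:span[1]])

def extract_pairs(tokens, max_len=3):
    spans = enumerate_spans(tokens, max_len)
    aspects = [s for s in spans if span_text(tokens, s) in ASPECTS]
    opinions = [s for s in spans if span_text(tokens, s) in OPINIONS]
    # Pairing step: in SpanMlt this is a learned relation score over
    # span representations; here every aspect pairs with every opinion.
    return [(span_text(tokens, a), span_text(tokens, o))
            for a in aspects for o in opinions]

print(extract_pairs("the battery life is too short".split()))
# → [('battery life', 'too short')]
```

The point of the span formulation is visible even in this sketch: because candidates are spans rather than per-token labels, multi-word aspect and opinion terms come out whole and can be paired directly.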
Shared content:
Existing work often adopts meta-learning, acquiring few-shot learning ability by switching among a series of meta-tasks; however, switching between tasks causes forgetting, so we consider using a memory mechanism to assist meta-learning training. In this work, we treat the classification parameters obtained by supervised learning as the global memory of meta-learning and propose a dynamic memory routing algorithm, which integrates the global memory into the training and prediction stages of the meta-tasks via dynamic routing. In addition, the algorithm can use query information to enhance the inductive class representations, giving better generalization over the diverse language expressions found in spoken-language scenarios. SOTA results are obtained on Chinese and English few-shot classification datasets.
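As a rough illustration of the routing component (a minimal NumPy sketch, not the paper's model), dynamic routing aggregates the support-set embeddings of a class together with global-memory slots into a single class vector by iterative agreement; the `support` and `memory` vectors below are toy values.

```python
import numpy as np

def squash(v):
    """Capsule-style squashing nonlinearity; keeps the norm below 1."""
    n = np.linalg.norm(v)
    return (n ** 2 / (1 + n ** 2)) * v / (n + 1e-9)

def dynamic_route(inputs, n_iter=3):
    """Route input vectors (support samples plus global-memory slots)
    to a single class vector by iterative agreement."""
    logits = np.zeros(len(inputs))
    for _ in range(n_iter):
        c = np.exp(logits) / np.exp(logits).sum()   # routing weights
        s = (c[:, None] * inputs).sum(axis=0)       # weighted sum
        v = squash(s)
        logits = logits + inputs @ v                # agreement update
    return v

# Two support embeddings for one class plus one global-memory slot.
support = np.array([[1.0, 0.1], [0.9, 0.0]])
memory = np.array([[1.1, 0.05]])
class_vec = dynamic_route(np.vstack([support, memory]))
```

Inputs that agree with the emerging class vector receive larger routing weights on each iteration, which is how the memory slots can pull the class representation toward knowledge retained from supervised training.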
Sharing theme: Multi-Domain Dialogue Acts and Response Co-Generation
Sharing guest: Tian Junfeng
Shared content:
In task-oriented dialogue, giving fluent and informative responses is crucial. Existing pipeline methods usually predict multiple dialogue acts first and then use a global representation of them to assist response generation. This approach has two defects: first, it ignores the internal multi-domain structure when predicting dialogue acts; second, it does not consider the semantic connection between the dialogue acts and the response when generating the response. To address these problems, we propose a neural joint generation model that generates dialogue acts and the response at the same time. Unlike previous methods, our dialogue-act generation module preserves the hierarchical structure of multi-domain dialogue acts, and our response generation module dynamically attends to the relevant dialogue acts. During training, we use an uncertainty loss to adaptively adjust the weights of the two tasks. Experimental results on the large-scale MultiWOZ dataset show that the model outperforms SOTA models in both automatic and human evaluation.

Sharing theme: Multiscale Collaborative Deep Models for Neural Machine Translation
Sharing guest: Wei
In recent years, neural machine translation (NMT) has replaced statistical machine translation in a large number of application scenarios thanks to its superior translation performance. The performance of current NMT models is limited mainly by the model's feature-representation ability and by data scale. We therefore propose a deep NMT model based on a multiscale collaboration (MSC) mechanism to improve the model's ability to represent both low-level (concrete) and high-level (abstract) features.
Experiments show that (1) the multiscale collaboration mechanism helps build deep NMT models and improves their performance, and (2) the MSC-based deep NMT model translates natural-language sentences with complex semantic structures better.
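One hypothetical way to read "collaboration between scales" is as a gated, per-dimension fusion of a low-level and a high-level encoder representation. The sketch below shows only that generic gating idea with random toy weights (`w_g`, `low`, `high` are all illustrative); it is not the paper's architecture.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def msc_fuse(low, high, w_g):
    """Gated fusion of a low-level (concrete) and a high-level
    (abstract) representation: one learned gate per dimension decides
    how much of each scale to keep."""
    gate = sigmoid(np.concatenate([low, high]) @ w_g)
    return gate * low + (1 - gate) * high

rng = np.random.default_rng(0)
d = 4
low, high = rng.normal(size=d), rng.normal(size=d)
w_g = rng.normal(size=(2 * d, d))   # toy gate parameters
fused = msc_fuse(low, high, w_g)
```

Because the gate lies in (0, 1), each fused dimension is a convex combination of the two scales, so the decoder can draw on concrete and abstract features simultaneously rather than only on the top encoder layer.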
Sharing theme: Structure-Level Knowledge Distillation for Multilingual Sequence Labeling
Sharing guest: Wang Xinyu
Multilingual sequence labeling is the task of predicting label sequences for multiple languages with a single unified model. Compared with relying on multiple monolingual models, a multilingual model has the advantages of a smaller model size, easier online serving, and better generality for low-resource languages. However, due to limited model capacity, current multilingual models still lag well behind their monolingual counterparts. This paper proposes distilling the structured knowledge of several monolingual teachers into a single multilingual student, narrowing the gap between the monolingual models and the unified multilingual model. We propose two knowledge distillation methods based on structure-level information.
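A minimal sketch of the distillation objective under a simplifying assumption (token-level label distributions averaged across teachers, ignoring the structure-level machinery the paper actually distills): the student minimizes cross-entropy against the averaged teacher distributions at every sequence position. All logits below are random toy values.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def distill_loss(teacher_logits_list, student_logits):
    """Cross-entropy between the averaged teacher distributions and
    the student distribution, summed over sequence positions."""
    teacher_probs = np.mean([softmax(t) for t in teacher_logits_list], axis=0)
    student_probs = softmax(student_logits)
    return -(teacher_probs * np.log(student_probs + 1e-12)).sum()

rng = np.random.default_rng(0)
# Two teachers, a sequence of 2 positions, 3 possible labels.
teachers = [rng.normal(size=(2, 3)) for _ in range(2)]
avg = np.mean([softmax(t) for t in teachers], axis=0)
good_student = np.log(avg)            # matches the averaged teachers exactly
bad_student = rng.normal(size=(2, 3)) # an arbitrary student
```

The loss is minimized exactly when the student reproduces the averaged teacher distribution, which is the sense in which several monolingual teachers are compressed into one multilingual student.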
Sharing theme: Coupling Distant Annotation and Adversarial Training for Cross-Domain Chinese Word Segmentation
Sharing guest: Ding Ning
Fully supervised neural methods have made great progress in Chinese word segmentation. However, when domain shift arises from distribution differences between domains and from out-of-vocabulary (OOV) words, the performance of supervised models degrades sharply. To alleviate this problem, this paper intuitively couples distant annotation for cross-domain Chinese word segmentation with adversarial training.
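The adversarial side of such coupling can be sketched with a gradient-reversal-style update on a toy linear model (the weights and the `adversarial_step` helper are illustrative, not the paper's): the domain discriminator descends its loss, while the shared feature extractor receives the reversed gradient and so is pushed toward domain-indistinguishable features.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def domain_loss(W, u, x, domain):
    """Log loss of the logistic domain discriminator on one example."""
    p = sigmoid(u @ (W @ x))
    return -(domain * np.log(p) + (1 - domain) * np.log(1 - p))

def adversarial_step(W, u, x, domain, lr=0.1, lam=1.0):
    """One coupled update: discriminator u does gradient descent on the
    domain loss, while feature extractor W gets the *reversed* gradient."""
    h = W @ x
    g = sigmoid(u @ h) - domain                 # d(loss)/d(logit)
    u_new = u - lr * g * h                      # discriminator: descend
    W_new = W + lr * lam * np.outer(g * u, x)   # extractor: ascend (reversed)
    return W_new, u_new

x = np.array([1.0, 0.5])                # toy sentence representation
W = np.array([[0.5, 0.0], [0.0, 0.5]])  # shared feature extractor
u = np.array([0.3, -0.2])               # domain discriminator
W_new, u_new = adversarial_step(W, u, x, domain=1)
```

After one step, the updated discriminator classifies the example's domain better, while the updated extractor makes the same discriminator's job harder, which is the adversarial dynamic that encourages features shared across source and target domains.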
On July 9th, the six sharing guests from Alibaba look forward to seeing you there!
ACL 2020 was originally scheduled to be held in Seattle, Washington, USA from July 5 to 10, 2020, but was changed to an online meeting due to the COVID-19 pandemic. To promote academic exchange and help domestic teachers and students learn about cutting-edge natural language processing (NLP) research in advance, AI Science and Technology Review is launching the "ACL Lab Series Paper Interpretation" content, and more laboratories are welcome to participate and share, so stay tuned!