A discussion of knowledge graph technology and knowledge graph completion
Foreword and background: When a knowledge graph is constructed, much of the knowledge comes from documents and web pages, and extracting knowledge from documents often introduces errors. These errors have two sources:

(1) Documents contain a lot of noise, that is, useless information. The noise may be introduced by the knowledge extraction algorithm itself or be related to the quality of the language in the document itself;

(2) The information in a document is limited and does not cover all knowledge, especially common-sense knowledge.

Both problems make the knowledge graph incomplete, so knowledge graph completion is becoming an increasingly important part of knowledge graph construction.

Completion uses knowledge that has already been acquired to predict relationships between entities, thereby filling in missing relationships or missing entity type information. It can rely on knowledge inside the knowledge base itself or bring in knowledge from a third-party knowledge base.

Knowledge graph completion can be divided into two levels: concept-level completion and instance-level completion.

Descriptions of knowledge graph construction often mention only the extraction of entities and relationships, from which RDF triples can then be generated.

However, obtaining triples is not enough. Besides their attributes and relationships, the entities in a triple can also be mapped to types in the concept hierarchy of the knowledge base, and one entity can have several types.

For example, the entity Obama has different types in different relationships.

In a description of his birth, his type is person; in a description of writing a memoir, he can also be a writer; in a description of his office, he can also be a politician.

Conceptual hierarchy model of entity types

Here the concepts person, writer and politician stand in a hierarchy. This is the concept hierarchy model.

1. Knowledge completion at the concept level: mainly solves the problem of missing entity type information.

As in the previous example, once an entity has been identified as a person, it is still necessary to search the concepts below person in the hierarchy to find more specific type information.

(1) Rule reasoning mechanism based on description logic.

Ontology and pattern: every entity can belong to an ontology, and the ontology has a set of patterns that characterize it uniquely. These patterns can be expressed as rules, so an ontology can be described by a set of rules.

For example, Obama is an entity whose ontology is person, and the patterns of a person include using language, using tools to transform other things, and so on. Because these patterns can be described by rules, rule reasoning based on description logic emerged.

Description logic is a common knowledge representation formalism built on concepts and relationships.

For example, you can collect entity instances (which can be text) about people, extract patterns from them, and record the patterns as rules. When a new entity instance appears, it only needs to be compared against the recorded rules. If it satisfies the rules, the instance can be classified under the concept person; otherwise it is judged not to belong to that concept.
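
The rule-matching step described above can be sketched as follows. This is only a minimal illustration and not a description-logic reasoner: the rule format (a set of required property names per concept), the `RULES` table, the property names and the function `infer_types` are all hypothetical.

```python
# Hypothetical rules: each concept lists the properties an instance must have.
# Real description-logic axioms are far richer; this only mimics the step of
# comparing a new instance against previously recorded rules.
RULES = {
    "Person": {"birth_date", "birth_place"},
    "Writer": {"birth_date", "birth_place", "author_of"},
    "Politician": {"birth_date", "birth_place", "holds_office"},
}


def infer_types(properties):
    """Return every concept whose required properties are all present."""
    present = set(properties)
    return sorted(c for c, required in RULES.items() if required <= present)


obama = {
    "birth_date": "1961-08-04",
    "birth_place": "Honolulu",
    "author_of": "Dreams from My Father",
    "holds_office": "President",
}
print(infer_types(obama))                # ['Person', 'Politician', 'Writer']
print(infer_types({"founded": "1998"}))  # []
```

An instance that satisfies the rules of several concepts receives several types, which matches the earlier observation that one entity can have more than one type.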

(2) Type reasoning mechanism based on machine learning.

After the stage of rule reasoning based on description logic, machine learning research became mainstream. Here type prediction is learned not only from internal clues, such as rules generated from instances, but also from external features and clues.

For an entity e1 of unknown type, if a similar entity e2 of known type can be found, it can be inferred that the type of e1 is the same as, or at least similar to, that of e2.

These methods fall into three directions: content-based type reasoning, link-based type reasoning, and type reasoning based on statistical relational learning (such as Markov logic networks).
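
The idea of copying the type of a similar known entity can be sketched with a nearest-neighbour lookup. This is an assumption-laden illustration: the feature sets, the use of Jaccard similarity, the `KNOWN` table, the threshold and the function names are all hypothetical and are not taken from any particular system.

```python
def jaccard(a, b):
    """Jaccard similarity of two feature sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical entities of known type, described by illustrative feature sets.
KNOWN = {
    "e2": ({"born_in", "wrote", "awarded"}, "Writer"),
    "e3": ({"founded_in", "headquartered_in", "ceo"}, "Company"),
}


def predict_type(features, known=KNOWN, threshold=0.5):
    """Copy the type of the most similar known entity, or return None
    when no known entity reaches the similarity threshold."""
    best_name, best_sim = None, 0.0
    for name, (feats, _type) in known.items():
        sim = jaccard(features, feats)
        if sim > best_sim:
            best_name, best_sim = name, sim
    if best_name is None or best_sim < threshold:
        return None
    return known[best_name][1]


e1 = {"born_in", "wrote", "married_to"}
print(predict_type(e1))  # Writer
```

The threshold keeps the sketch from assigning a type when nothing similar is known; a learned classifier would replace both the similarity measure and the threshold.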

(3) Inference mechanism based on representation learning.

Embedding learning and deep learning are introduced into type reasoning. Most machine-learning-based type reasoning methods assume that the data is noise-free, and their features still have to be selected and designed by hand. Introducing deep learning avoids this feature engineering. Type reasoning should draw on text content and also needs other features such as link structure; this is where embedding methods show their advantages.

2. Knowledge completion at the instance level

This can be understood as follows: for an instance triple (SPO, subject-predicate-object), the possible gaps are (?, P, O), (S, ?, O) or (S, P, ?). Such a triple does not exist in the knowledge base, so the missing entity or relationship has to be predicted.
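
The three incomplete forms can be written as patterns over a set of triples. The sketch below is only an illustration: the tiny knowledge base `KB`, the relation names and the function `match` are hypothetical, and `None` stands for the unknown slot.

```python
# A tiny hypothetical knowledge base of (subject, predicate, object) triples.
KB = {
    ("Obama", "born_in", "Honolulu"),
    ("Obama", "author_of", "Dreams from My Father"),
    ("Honolulu", "located_in", "Hawaii"),
}


def match(pattern, kb=KB):
    """Return the triples that fit a pattern; None marks the missing slot."""
    return sorted(
        t for t in kb
        if all(p is None or p == v for p, v in zip(pattern, t))
    )


print(match((None, "born_in", "Honolulu")))      # form (?, P, O)
print(match(("Obama", None, "Honolulu")))        # form (S, ?, O)
print(match(("Obama", "born_in", None)))         # form (S, P, ?)
print(match(("Honolulu", "capital_of", None)))   # empty: a gap to be predicted
```

A query that returns nothing, like the last one, is exactly the case where a completion method has to propose the missing entity or relationship.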

In fact, a lot of missing knowledge can be inferred from the acquired knowledge, and sometimes this process is also called link prediction.

Note: Sometimes knowledge is not missing but new, that is, a new triple appears that was unknown to the original knowledge base. It then needs to be added to the knowledge base as new knowledge, but this is not completion in the traditional sense.

(1) Probability completion method based on random walk
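
As a rough illustration of the random-walk idea, the sketch below computes the probability of reaching each entity when a walker starts at one entity and follows a fixed sequence of relations, choosing uniformly among matching edges at each step. The triples, the relation names and the functions `build_index` and `path_probability` are hypothetical; how such probabilities are combined into a completion score is left open here.

```python
from collections import defaultdict

# Hypothetical triples used only for this illustration.
TRIPLES = [
    ("Obama", "born_in", "Honolulu"),
    ("Honolulu", "located_in", "Hawaii"),
    ("Honolulu", "located_in", "Oahu"),
]


def build_index(triples):
    """Map (subject, predicate) to the list of objects reachable by that edge."""
    index = defaultdict(list)
    for s, p, o in triples:
        index[(s, p)].append(o)
    return index


def path_probability(index, start, relation_path):
    """Distribution over end entities of a walk that follows relation_path
    from start, picking uniformly among matching edges at each step."""
    dist = {start: 1.0}
    for rel in relation_path:
        nxt = defaultdict(float)
        for node, prob in dist.items():
            targets = index.get((node, rel), [])
            for t in targets:
                nxt[t] += prob / len(targets)
        dist = dict(nxt)
    return dist


index = build_index(TRIPLES)
print(path_probability(index, "Obama", ["born_in", "located_in"]))
# {'Hawaii': 0.5, 'Oahu': 0.5}
```

Probabilities of this kind can serve as evidence for a missing direct relationship between the start entity and the entities the walk reaches.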

(2) Completion method based on representation learning.

Knowledge graph embedding approaches:

① Structured embedding representation

② Tensor neural network method

③ Matrix decomposition method

④ Translation methods
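
To make the translation idea concrete, the sketch below uses the scoring function of TransE, one well-known translation model (the original text does not name a specific model). A relation is treated as a vector that translates the head entity towards the tail entity, so a plausible triple has a small distance ||h + r - t||. The two-dimensional vectors are hand-picked for illustration and are not learned embeddings; `transe_score` and `rank_tails` are hypothetical helper names.

```python
import math


def transe_score(h, r, t):
    """Euclidean distance ||h + r - t||; lower means more plausible."""
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


# Hand-picked 2-d vectors, purely illustrative (real embeddings are learned).
entities = {
    "Obama": [0.0, 0.0],
    "Honolulu": [1.0, 0.0],
    "Chicago": [0.0, 1.0],
}
relations = {"born_in": [1.0, 0.0]}


def rank_tails(head, relation):
    """Rank candidate tail entities for the incomplete triple (head, relation, ?)."""
    h, r = entities[head], relations[relation]
    return sorted(entities, key=lambda name: transe_score(h, r, entities[name]))


print(rank_tails("Obama", "born_in"))  # ['Honolulu', 'Obama', 'Chicago']
```

Ranking all candidate entities by this distance is how an embedding model fills the (S, P, ?) form of a missing triple; training the vectors is outside the scope of this sketch.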

(3) Other completion methods

These include cross-knowledge-base completion, knowledge base completion based on information retrieval, and completion of common-sense knowledge in the knowledge base.

Challenges and main development directions:

(1) Solving the sparsity of long-tail entities and relationships.

There are many relationship instances involving celebrities and stars but few involving ordinary people, even though ordinary people are far more numerous. As a result, the relationship instances for these entities are sparse, and the sparsity becomes more pronounced as the number of entities grows.

(2) One-to-many, many-to-one and many-to-many problems of entities.

For large-scale data, the numbers involved are not in the single digits or tens but in the hundreds. Traditional solutions are ineffective and may not be able to solve relationship learning at this scale at all.

(3) Triples are added and changed dynamically, which makes the KG change more rapidly.

New knowledge is constantly produced, and earlier knowledge may later be proved wrong or need revision. All of this means the completion process itself has to be revised and updated. Making knowledge graph completion adapt to the dynamic changes of a KG is increasingly important, but it has not yet received enough attention.

(4) The path length for relationship prediction in a KG will keep increasing.

The path length that relationship prediction can reason over is limited, but as a large-scale knowledge graph grows, the relationship path sequences between entities become longer and longer, which calls for more efficient models to describe more complex relationship prediction.