What are the overlap rate and the working principle of duplicate checking? How many consecutive characters count as an overlap?
The principle is to compare the submitted text against the documents in the database; a run of thirteen consecutive matching characters counts as an overlap.

After the whole paper is uploaded, the system automatically identifies the paper's chapters from the table of contents generated in the document and checks each chapter separately, so that a copy ratio is obtained for every chapter. In this case the table of contents is grayed out and does not take part in text detection. Otherwise, the paper is automatically split into segments of roughly 10,000 characters each, and each segment is checked. In that case the table of contents may be treated as body text and, if it duplicates other material, marked in red.
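
The fallback segmentation described above can be sketched roughly as follows. This is only an illustration: the function name split_into_segments is hypothetical, and cutting at exactly 10,000 characters is an assumption, since the text only says "10000 or so" and the real system may cut at different points.

```python
def split_into_segments(text: str, segment_size: int = 10_000) -> list[str]:
    """Split text into consecutive chunks of at most segment_size characters.

    Assumption: a fixed chunk size; the real system's boundaries are unknown.
    """
    if segment_size <= 0:
        raise ValueError("segment_size must be positive")
    return [text[i:i + segment_size] for i in range(0, len(text), segment_size)]


# A 25,000-character paper without a usable table of contents
segments = split_into_segments("x" * 25_000)
print([len(s) for s in segments])
```

Each resulting segment would then be checked on its own, in the same way a chapter is checked when a table of contents is available.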

CNKI (China National Knowledge Infrastructure) has set a sensitivity threshold of 5% for this duplicate checking system. Measured per paragraph, plagiarism or quotation that amounts to less than 5% of the paragraph cannot be detected; this is common for single clauses or small concepts inside large paragraphs. For example, if a detected paragraph has 10,000 characters, material taken from a single source document that totals fewer than 500 characters will not be detected.
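
The arithmetic behind the 5% threshold can be written out as a small check. This is a sketch under assumptions: the function name is hypothetical, and treating exactly 5% as detectable is a reading of "below 5% cannot be detected", not a confirmed rule.

```python
def is_above_threshold(overlap_chars: int, segment_chars: int,
                       threshold_percent: int = 5) -> bool:
    """Return True if the overlap reaches the sensitivity threshold.

    Integer arithmetic is used so that the boundary case is exact.
    Assumption: an overlap of exactly 5% counts as detectable.
    """
    return overlap_chars * 100 >= segment_chars * threshold_percent


# The example from the text: a 10,000-character paragraph
print(is_above_threshold(499, 10_000))  # False: under 500 characters
print(is_above_threshold(500, 10_000))  # True: reaches 5%
```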

In online paper detection, a run of 13 consecutive characters that is similar to or copied from a source is marked in red, but only if the 5% threshold described above is also met: the total number of characters you quoted or copied from a single document must reach 5% of the characters in the detected paragraph.
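
The two conditions can be combined in a short sketch. It is not the actual CNKI algorithm, which is not public: the function names are hypothetical, matching here is exact only (the real system also catches similar text), and the comparison is against one source document at a time.

```python
def matched_mask(paragraph: str, source: str, run_length: int = 13) -> list[bool]:
    """Mark every character of paragraph that lies inside a run of at least
    run_length consecutive characters also found in source (exact match)."""
    source_grams = {source[i:i + run_length]
                    for i in range(len(source) - run_length + 1)}
    mask = [False] * len(paragraph)
    for i in range(len(paragraph) - run_length + 1):
        if paragraph[i:i + run_length] in source_grams:
            for j in range(i, i + run_length):
                mask[j] = True
    return mask


def is_flagged(paragraph: str, source: str, run_length: int = 13,
               threshold_percent: int = 5) -> bool:
    """Flag the paragraph only if both conditions hold: at least one run of
    run_length matching characters, and matched characters reaching the
    threshold share of the paragraph (assumed inclusive at exactly 5%)."""
    matched = sum(matched_mask(paragraph, source, run_length))
    return matched > 0 and matched * 100 >= len(paragraph) * threshold_percent


source = "abcdefghijklmnopqrstuvwxyz"
short_paragraph = "0123456789" * 10 + "abcdefghijklm"   # 13 of 113 characters match
long_paragraph = "0123456789" * 100 + "abcdefghijklm"   # 13 of 1013 characters match
print(is_flagged(short_paragraph, source))  # True: run of 13 and above 5%
print(is_flagged(long_paragraph, source))   # False: run of 13 but below 5%
```

The second case shows why a short quotation inside a long paragraph can go unmarked even though it contains a 13-character run.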

The detection system automatically identifies references, and identified references do not take part in text detection. Excluded references are shown in gray in the CNKI detection report, indicating that they were not checked. This automatic exclusion happens only if the reference format is fully correct and standardized; otherwise the references are treated as body text, and all of them may be marked in red.