What is the mechanism of paper duplicate checking?
Paper duplicate checking is a method for detecting academic misconduct, used mainly to determine whether a paper contains plagiarized or copied content. Its mechanism consists of the following main steps:

1. Text pre-processing: This is the first step of duplicate checking. The original text is cleaned by removing stop words, punctuation, numbers, and similar noise, which makes the subsequent comparison and analysis easier.

2. Feature extraction: After pre-processing, features are extracted from the text, usually by converting it into a vector with the bag-of-words model or the TF-IDF model.

3. Similarity calculation: After feature extraction, the similarity between the two articles is calculated. Commonly used measures are cosine similarity and Jaccard similarity.

4. Threshold judgment: The similarity is compared against a preset threshold to decide whether the two articles are similar. If the similarity exceeds the threshold, the paper is considered to possibly contain plagiarized or copied content.

5. Result feedback: Finally, the duplicate checking results are returned to the author. If plagiarized or copied content is found, the author needs to correct it.
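
The first four steps can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not how any particular duplicate checking service works: the stop-word list, the smoothed IDF formula, the 0.5 threshold, and all function names (preprocess, tf_idf_vectors, check_pair, and so on) are hypothetical choices made for the example, and the tokenizer handles only English letters.

```python
import math
import re
from collections import Counter

# Hypothetical, tiny stop-word list for illustration only.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "and", "to", "in", "on"}


def preprocess(text):
    """Step 1: lowercase, drop punctuation and digits, remove stop words."""
    # Keeping only runs of a-z discards punctuation, digits and whitespace.
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def tf_idf_vectors(token_lists):
    """Step 2: turn each token list into a sparse TF-IDF vector (a dict)."""
    n_docs = len(token_lists)
    doc_freq = Counter()
    for tokens in token_lists:
        doc_freq.update(set(tokens))  # count each term once per document
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty text
        vectors.append({
            # Assumed smoothed IDF: log((1 + N) / (1 + df)) + 1
            term: (count / total)
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(u, v):
    """Step 3a: cosine of the angle between two sparse vectors."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def jaccard_similarity(tokens_a, tokens_b):
    """Step 3b: size of the shared vocabulary over the combined vocabulary."""
    a, b = set(tokens_a), set(tokens_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def exceeds_threshold(similarity, threshold=0.5):
    """Step 4: flag the pair only if similarity is strictly above the threshold."""
    return similarity > threshold


def check_pair(text_a, text_b, threshold=0.5):
    """Run steps 1 to 4 on two texts and return the scores and the verdict."""
    tokens_a, tokens_b = preprocess(text_a), preprocess(text_b)
    vec_a, vec_b = tf_idf_vectors([tokens_a, tokens_b])
    cosine = cosine_similarity(vec_a, vec_b)
    return {
        "cosine": cosine,
        "jaccard": jaccard_similarity(tokens_a, tokens_b),
        "flagged": exceeds_threshold(cosine, threshold),
    }
```

Calling check_pair(submitted_text, reference_text) returns both similarity scores and a flag that could feed the result feedback in step 5. A real system would presumably compare a paper against a large reference collection rather than a single document, and would need a proper word segmenter for languages such as Chinese.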

It should be noted that although the duplicate checking mechanism can effectively detect most plagiarism, it cannot completely replace manual review. Some plagiarism is not obvious, or relies on sophisticated rewriting techniques, so in-depth manual analysis and judgment are still necessary. In addition, the mechanism cannot guarantee a completely fair result, because it may misjudge legitimate quotations and references as plagiarism. Duplicate checking is therefore only one means of preventing academic misconduct, and it cannot be relied on entirely to ensure academic fairness and integrity.