2. Limits of data set quality and coverage: The quality and diversity of articles generated by an AI algorithm are bounded by its input data set. If the data set is of low quality or covers a narrow range of material, the generated articles tend to lack originality, which results in a high duplicate checking rate.
3. The duplicate checking rate, also known as the repetition rate or similarity rate, is the proportion of a text that a duplicate checking system flags as repeated from, or similar to, existing content. In academic writing, publishing, content creation, and other fields, it is an important indicator because it helps identify and avoid plagiarism.
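To make the idea of a "proportion of repeated content" concrete, here is a minimal sketch in Python. It assumes a deliberately simple method: split the text into overlapping three-word sequences and count how many of them also appear in a set of reference texts. The function names `shingles` and `duplicate_rate`, the three-word window, and the sample sentences are all illustrative assumptions; real duplicate checking systems use their own databases and matching rules, which are not described here.

```python
import re


def shingles(text, n=3):
    """Return the overlapping n-word sequences of a text, lowercased."""
    words = re.findall(r"\w+", text.lower())
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def duplicate_rate(text, sources, n=3):
    """Share of the text's n-word sequences that also occur in any source.

    Hypothetical helper for illustration only, not the algorithm of any
    particular duplicate checking product.
    """
    candidate = shingles(text, n)
    if not candidate:
        # Text shorter than n words: nothing to compare.
        return 0.0
    known = set()
    for source in sources:
        known.update(shingles(source, n))
    matched = sum(1 for s in candidate if s in known)
    return matched / len(candidate)


if __name__ == "__main__":
    article = "the quick brown fox jumps over the lazy dog"
    references = ["a quick brown fox jumps high"]
    # 2 of the article's 7 three-word sequences appear in the reference.
    print(f"{duplicate_rate(article, references):.1%}")  # prints 28.6%
```

Under this simplified measure, a text copied verbatim from a reference scores 100%, while a text that shares no three-word sequence with any reference scores 0%.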