After a paper is submitted, HowNet scans it. The file formats currently supported are doc, docx, txt and pdf. HowNet first transcodes the submission so that it can distinguish sentences, paragraphs, chapters, quotations, references and so on. The check is then run on the full text as uploaded.
HowNet checks papers for duplication chapter by chapter. A run of 8 consecutive matching words is judged a "repeated sentence", and a run of 13 consecutive matching words is judged a "repeated paragraph". The text immediately before and after a flagged sentence or paragraph is checked as well.
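The run-length rule above can be illustrated with a minimal sketch. It is not HowNet's actual implementation: it assumes whitespace-separated words as the matching unit and a single reference text, and the names `shared_runs`, `classify`, `SENTENCE_RUN` and `PARAGRAPH_RUN` are hypothetical.

```python
SENTENCE_RUN = 8    # run length the article gives for a "repeated sentence"
PARAGRAPH_RUN = 13  # run length the article gives for a "repeated paragraph"


def shared_runs(submitted, reference, min_len=SENTENCE_RUN):
    """Return (start, length) spans of `submitted` covered by windows of
    `min_len` consecutive tokens that also occur in `reference`."""
    # Index every window of min_len consecutive tokens in the reference.
    ref_windows = {
        tuple(reference[i:i + min_len])
        for i in range(len(reference) - min_len + 1)
    }
    spans = []
    for i in range(len(submitted) - min_len + 1):
        if tuple(submitted[i:i + min_len]) not in ref_windows:
            continue
        if spans and i < spans[-1][0] + spans[-1][1]:
            # This window overlaps the previous span, so extend that span.
            start = spans[-1][0]
            spans[-1] = (start, i + min_len - start)
        else:
            spans.append((i, min_len))
    return spans


def classify(length):
    """Label a run by its length, following the thresholds in the text."""
    if length >= PARAGRAPH_RUN:
        return "repeated paragraph"
    if length >= SENTENCE_RUN:
        return "repeated sentence"
    return None


submitted = "a b c d e f g h i j x y".split()
reference = "q a b c d e f g h i j z".split()
for start, length in shared_runs(submitted, reference):
    print(start, length, classify(length))
```

In this toy input the two texts share a run of ten consecutive words, which is long enough for a "repeated sentence" but not for a "repeated paragraph".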
For a "repeated sentence" or "repeated paragraph" to be reported, one further condition must be met: HowNet sets a 5% threshold on the repetition rate, applied per paragraph, so plagiarism or citation that amounts to less than 5% of a paragraph is not detected.
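As a rough illustration of the 5% threshold, the sketch below computes a repetition rate for one paragraph from its flagged spans. The function names are hypothetical, the spans are assumed not to overlap, and treating a rate of exactly 5% as reported is an assumption, since the text only says that anything below 5% is not detected.

```python
THRESHOLD = 0.05  # the 5% repetition-rate threshold mentioned in the text


def repetition_rate(spans, total_tokens):
    """Share of a paragraph's tokens covered by flagged (start, length) spans.

    Assumes the spans do not overlap each other.
    """
    if total_tokens == 0:
        return 0.0
    covered = sum(length for _, length in spans)
    return covered / total_tokens


def is_reported(rate, threshold=THRESHOLD):
    """Assumption: a paragraph is reported once its rate reaches the threshold."""
    return rate >= threshold


# A 400-word paragraph with one 10-word run: 2.5%, below the threshold.
print(is_reported(repetition_rate([(0, 10)], 400)))
# The same paragraph with an extra 13-word run: 5.75%, above the threshold.
print(is_reported(repetition_rate([(0, 10), (50, 13)], 400)))
```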
Generally speaking, HowNet's duplicate checking relies mainly on text similarity detection. It maintains a full-text database and compares the submission against many documents in it to judge whether the paper contains similar passages. HowNet also uses intelligent detection that automatically identifies citations, notes and other such parts of the paper to avoid misjudging them.
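The idea of comparing one submission against a collection of documents can be sketched with a generic similarity measure. The example below uses Jaccard similarity over word 3-grams, a standard textbook technique chosen here for illustration; the article does not say which measure HowNet uses, and the corpus, document names and function names are made up.

```python
def shingles(text, k=3):
    """Set of all runs of k consecutive lowercase words in `text`."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def most_similar(submission, corpus, k=3):
    """Return (name, score) of the corpus document closest to the submission.

    `corpus` maps a document name to its text and must not be empty.
    """
    sub = shingles(submission, k)
    scores = {name: jaccard(sub, shingles(text, k)) for name, text in corpus.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


corpus = {
    "doc_a": "the quick brown fox jumps over the lazy dog",
    "doc_b": "an entirely unrelated sentence about database indexing",
}
name, score = most_similar("the quick brown fox jumps over a sleeping cat", corpus)
print(name, round(score, 2))
```

A real checker would also need to exclude recognized citations and notes before scoring, as the paragraph above describes; that step is left out of this sketch.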
The above information is for reference only. If in doubt, please consult the official website.