Which is more accurate, Word format or PDF format?
The first thing to know is that both Word and PDF files are accepted. The China HowNet (CNKI) duplicate checking system supports uploads in many formats, including DOC, DOCX, WPS, CAJ, TXT, PDF, KDH, NH, and RTF. For any format on this list, the system reads the content with the corresponding genuine software and then analyzes it.
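As a small illustration, the format list above can be turned into a pre-upload check. This is only a sketch: the function name `is_supported_format` and the constant `SUPPORTED_EXTENSIONS` are hypothetical helpers, not part of any HowNet tool, and the check looks only at the file extension, not at the file's actual contents.

```python
from pathlib import Path

# Extensions copied from the format list above.
# The names below are hypothetical and used only for illustration.
SUPPORTED_EXTENSIONS = {
    ".doc", ".docx", ".wps", ".caj", ".txt", ".pdf", ".kdh", ".nh", ".rtf",
}


def is_supported_format(filename: str) -> bool:
    """Return True if the file extension appears in the supported list."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


if __name__ == "__main__":
    for name in ("thesis.docx", "thesis.PDF", "thesis.odt"):
        print(name, is_supported_format(name))
```

The comparison is case-insensitive, so `thesis.PDF` and `thesis.pdf` are treated the same way.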

Whether the results differ, however, depends on whether the content the system reads is the same. In practice, the text read from the same paper can differ depending on the tool used to produce the file, so the results for PDF and Word submissions can still differ.

First, a PDF is hard to convert to another format or to edit, because the format is designed to protect its content. After a PDF is submitted, the HowNet duplicate checking system therefore has to scan and parse the file to get at its contents. This parsing can fail and produce garbled text. If that happens, the duplicate checking result will be completely different from the normal result.

Second, if the paper contains many footnotes and endnotes, or a lot of content in its headers and footers, the duplicate checking system can tell these apart from the body in a normal Word document, so they are not checked together with the body text. In a PDF, these parts are likely to be recognized as body text and checked along with it, so the duplicate checking results will differ.
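The effect of header text leaking into the body can be shown with a toy overlap measure. The sketch below is not HowNet's algorithm, which is not described here. It assumes a simple word trigram comparison, and all sample strings and function names are invented for illustration.

```python
def shingles(text: str, n: int = 3) -> set:
    """Split text into the set of overlapping word n-grams."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def overlap_ratio(submitted: str, reference: str, n: int = 3) -> float:
    """Fraction of the submitted text's n-grams that also occur in the reference."""
    sub = shingles(submitted, n)
    if not sub:
        return 0.0
    return len(sub & shingles(reference, n)) / len(sub)


# Invented sample data: a reference sentence, a body sentence, and a page header.
reference = "deep learning models require large labelled datasets to generalise well"
body = "our experiments confirm that deep learning models require large labelled datasets"
header = "university of example graduation thesis page 12"

word_like = body                 # header kept separate from the body
pdf_like = header + " " + body   # header merged into the body text

if __name__ == "__main__":
    print("body only:    ", overlap_ratio(word_like, reference))
    print("header merged:", overlap_ratio(pdf_like, reference))
```

In this toy case the merged header dilutes the score. In a real check the extra text could just as well match other documents and raise it. Either way, the two submissions no longer produce the same number.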

Third, when China HowNet checks a Word file, pictures and formulas are not detected at all. In a PDF, however, pictures and formulas are treated as text and checked. The text recognized from them can differ greatly from the original pictures and formulas and may be flagged as plagiarism, which makes the PDF result unreasonable in this respect.

Fourth, the same paper may show subtle differences in chapter structure between Word and PDF. These differences, along with differences in how the table of contents is recognized, can lead HowNet to divide the paper into paragraphs differently, which changes the content that is compared and therefore the duplicate checking result.
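A minimal sketch of how different line breaks can change paragraph division is given below. It assumes paragraphs are separated by blank lines, which is a simplification and not a description of how HowNet actually segments a paper. The sample strings are invented.

```python
from typing import List


def split_paragraphs(text: str) -> List[str]:
    """Split on blank lines and collapse whitespace inside each paragraph."""
    blocks = text.split("\n\n")
    return [" ".join(block.split()) for block in blocks if block.strip()]


# Invented samples: the same words, with line breaks placed differently.
word_like = (
    "1 Introduction\n\n"
    "This chapter introduces the topic. It also states the goals."
)
pdf_like = (
    "1 Introduction\n"
    "This chapter introduces the topic.\n\n"
    "It also states the goals."
)

if __name__ == "__main__":
    print(split_paragraphs(word_like))
    print(split_paragraphs(pdf_like))
```

Both inputs contain the same words, but the heading ends up fused with the first sentence in one case and stands alone in the other, so any paragraph-level comparison would be working on different units.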
