1. Document format requirements for detection
The Baidu Academic Paper Assistant duplicate checking system accepts the whole paper as a single upload, but the document format can affect the detection results. Submitting the final version in the format the specification requires minimizes this effect. The system's algorithm is complex, so after each revision a small amount of plagiarism may turn up that was not detected the first time (such passages generally do not exceed 200 words, and a second revision will usually reduce the plagiarism rate considerably).
2. Comparison databases
The comparison databases published so far are: the general database of academic journals, the full-text database of degree theses (doctoral, master's and bachelor's), the full-text database of conference papers, the full-text database of newspapers, the full-text database of books, the full-text database of blogs, the online literature database, the internet literature database, the English database, the yearbook database, the standards database and the user database. Some personal comparison databases and some books are not covered by the Baidu Academic Paper Assistant. The Baidu Academic Paper Assistant duplicate checking system is currently a very mature detection system on the market, with fast detection and accurate results.
3. Results by paragraph or by chapter
After the paper is uploaded, the system automatically detects its chapter information. If the school's table-of-contents format meets the chapter-division criteria built into the Baidu Academic Paper Assistant system, the system reports the results by chapter; otherwise it reports them by automatically generated segments. The threshold is applied per segment or per chapter. Whether the paper is divided into chapters or into sections, it is advisable to stay consistent with the school's requirements.
4. How is quoted content detected?
A common question is why some quoted paragraphs or sentences go undetected, while quotations that do indicate their source are still counted as plagiarism. First of all, whether a quotation is counted as plagiarism has nothing to do with whether its source is marked, and whether a quotation can be detected has nothing to do with the accuracy of the system. Both depend on the system's threshold. The duplicate checking system sets a threshold that governs its detection sensitivity. Take the usual threshold of 3% as an example. Measured against the word count of a paragraph (or chapter), plagiarism or quotation of a single document that stays below 3% cannot be detected; this typically happens with short clauses or small concepts inside large paragraphs. For example, suppose detection paragraph 1 (Chapter 1) has 10,000 words. Quoting up to 300 words from a single document A (10,000 multiplied by 3% = 300) will not be detected. If more than 300 words are quoted from document B, every passage from document B in the first chapter will be marked in red, wherever it appears in the chapter; even if it is broken up into separate sentences, any fragment longer than 20 words will be marked.
So in general, quotations are hard to detect when they are drawn from as many documents as possible, with only a few words taken from each.
As for why some quotations are counted as plagiarism, the reason is again the threshold: anything above the set threshold is treated uniformly as plagiarism, and once the limit is exceeded, marking the quotation does not help. For example, if the first chapter of a paper has 5,000 words, you can quote fewer than 150 words from any single document in that chapter; otherwise the system treats it as plagiarism. If the second chapter has 4,000 words, you can quote fewer than 120 words from any single document. For a third chapter of 8,000 words and a fourth of 7,000 words, the limits are 240 and 210 words respectively, and so on. To sum up, excess citation is calculated chapter by chapter, in the same way as plagiarism.
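The chapter-by-chapter arithmetic can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 3% threshold is the example value used in this section, the function names are hypothetical, and the handling of a quotation that lands exactly on the limit is a guess rather than documented system behavior.

```python
# Illustrative sketch only, not the system's actual implementation.
# Assumption: the threshold is 3%, the example value used in the text.

def citation_allowance(chapter_words: int, threshold_percent: int = 3) -> int:
    """Words that may be quoted from one document before a chapter is flagged."""
    # Integer arithmetic avoids floating-point rounding on values like 7000 * 0.03.
    return chapter_words * threshold_percent // 100


def sources_over_allowance(chapter_words, quoted_words_by_source, threshold_percent=3):
    """Return the sources whose quoted word count exceeds the chapter's allowance.

    Assumption: a source is flagged only when it is strictly above the limit.
    """
    limit = citation_allowance(chapter_words, threshold_percent)
    return {src: n for src, n in quoted_words_by_source.items() if n > limit}


if __name__ == "__main__":
    chapters = {"Chapter 1": 5000, "Chapter 2": 4000, "Chapter 3": 8000, "Chapter 4": 7000}
    for name, words in chapters.items():
        print(name, citation_allowance(words))
    # A 10,000-word chapter quoting 300 words from A and 320 words from B.
    print(sources_over_allowance(10000, {"A": 300, "B": 320}))
```

With the figures from the text, the allowances come out as 150, 120, 240 and 210 words, and only document B is reported for the 10,000-word chapter.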
5. How does the system calculate plagiarism?
The system marks text in red once a run of similar or copied text reaches a minimum length; the figures given are 13 consecutive character units, or more than 20.
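To make the idea of a minimum matching run concrete, here is a minimal Python sketch. It is not the system's algorithm, which is not published here; it uses a simple sliding-window comparison chosen for illustration, and the function name and the default of 13 characters are assumptions taken from the figure quoted above.

```python
# Illustrative sketch only: sliding-window matching, not the system's real algorithm.

def flag_copied_runs(text: str, source: str, min_run: int = 13):
    """Return (start, end) spans of `text` made up of windows of `min_run`
    consecutive characters that also occur somewhere in `source`."""
    if min_run <= 0:
        raise ValueError("min_run must be positive")
    # Every substring of length min_run that occurs in the source document.
    grams = {source[i:i + min_run] for i in range(len(source) - min_run + 1)}
    spans = []
    for i in range(len(text) - min_run + 1):
        if text[i:i + min_run] in grams:
            end = i + min_run
            if spans and i <= spans[-1][1]:
                # Overlapping or touching windows extend the previous span.
                spans[-1] = (spans[-1][0], max(spans[-1][1], end))
            else:
                spans.append((i, end))
    return spans


if __name__ == "__main__":
    source = "the quick brown fox jumps over the lazy dog"
    text = "I saw the quick brown fox yesterday"
    for start, end in flag_copied_runs(text, source):
        print(start, end, repr(text[start:end]))
```

In this example the shared run is 20 characters long, so it is flagged at the default of 13 but would not be flagged if `min_run` were raised to 21.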
6. Revising plagiarized passages
Besides the approach mentioned in point 4, there are other ways to revise the text marked in red: change the wording, sentence structure and description (for example, turn the original sentence into an inverted, passive or active sentence), rearrange the order of paragraphs, or delete keywords and key sentences. Combining these methods can effectively reduce the plagiarism rate and help the paper pass. In general, the revised sentence should differ from the original while still reading smoothly.
The steps for using the system are as follows:
(1) First, search Baidu for "Baidu Academic Paper Assistant" to find the checking site;
Baidu Academic Paper Assistant: /u/ Bi Ye? Label = Home Page
(2) Then click "Repeat Detection"
(3) Fill in the title and the author of the paper (use the author's real name if you have already published this article or are quoting your own earlier articles), then either paste the paper's text into the text box or click "Select File" to upload the paper to be checked, and click "Submit Order" to pay for the detection report;
(4) Download the detection report. Detection generally takes between 1 minute and 2 hours, and typically 1 to 3 minutes. During peak periods it may take longer, so please wait patiently.
(5) Once the duplicate check is complete, you can download the detection report of the Baidu Academic Paper Assistant duplicate checking system. You can view the report online or download it in HTML format.
(6) When you open the comparison report, you will find that its top section is the same as that of the full-text citation report. Further down there are two columns: one shows the original content and the other shows the sources of similar content. The main purpose of the comparison report is to show where the flagged content comes from and what it resembles, so that the paper can be revised in a targeted way.