Paper plagiarism detection algorithm:
1. Paragraphs and file format
Papers are usually uploaded for detection as a whole file. After the upload, the detection software first splits the text into segments, so the format of the final manuscript has a large influence on the reported plagiarism rate. Depending on how the paragraphs are divided, short paragraphs of a few dozen words may not be flagged. The reported rate can therefore be lowered by splitting the text into more short paragraphs.
2. Comparison database
Detection mostly matches the paper against published graduation theses, journal papers and conference papers; some databases also contain articles from the Internet. Note that many books are not in the comparison database. A friend of mine once copied a large amount of text from a research monograph and it was not detected, so this gap is still real.
3. Reordering chapters
Many students change the order of chapters, or take passages from several different articles, but this has little effect on the detection result. Plagiarism detection experts therefore advise against assuming that a paper stitched together from a few, or even dozens, of articles will pass.
4. Marking references
How does detection software distinguish quoting someone's article from copying it? It is actually quite simple. We add citation markers in our papers, but the detection software treats all matched text in the same way. The software's threshold is generally set to 1%. For example, for a 5000-word article, 1% is 50 words. If more than 50 words match a source, the passage counts as plagiarism even if a reference is added.
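The arithmetic of the 1% rule can be sketched as follows. This is only an illustration of the example above, not the logic of any real detection product; the function names and the use of a whole-number percentage are assumptions made here.

```python
def allowed_matching_words(total_words, threshold_percent=1):
    """Largest number of matching words that stays within the threshold.

    threshold_percent is an assumed parameter (1 means 1%).
    Integer arithmetic avoids floating-point rounding surprises.
    """
    return total_words * threshold_percent // 100


def exceeds_threshold(matched_words, total_words, threshold_percent=1):
    """True when the matched words are MORE than the threshold share."""
    return matched_words * 100 > total_words * threshold_percent


if __name__ == "__main__":
    # The example from the text: a 5000-word article with a 1% threshold.
    print(allowed_matching_words(5000))   # 50
    print(exceeds_threshold(50, 5000))    # False: exactly 50 is not "more than 50"
    print(exceeds_threshold(51, 5000))    # True
```

Note that in this sketch a citation marker plays no role at all, which mirrors the claim that matched text is counted whether or not a reference is attached.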
5. Matching word count
The detection system is strict: as soon as more than 20 consecutive units of text match, the passage is considered plagiarism, subject to the condition in point 4 about marked references.
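A rule of the form "more than 20 matching units" can be illustrated by finding the longest contiguous run shared by two texts. The sketch below is an assumption-laden illustration: it treats a "unit" as a whitespace-separated word (for Chinese text a unit would more likely be a character), and the names `longest_common_run` and `is_flagged` are hypothetical.

```python
def longest_common_run(a, b):
    """Length of the longest contiguous run of units shared by sequences a and b.

    Classic dynamic programming: cur[j] holds the length of the common run
    that ends at the current unit of a and at b[j - 1].
    """
    best = 0
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            if x == y:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def is_flagged(submitted, source, max_run=20):
    """Flag the text when MORE than max_run consecutive units match.

    max_run=20 is taken from the text; splitting on whitespace is an
    assumption of this sketch.
    """
    return longest_common_run(submitted.split(), source.split()) > max_run


if __name__ == "__main__":
    source = " ".join(f"w{i}" for i in range(30))
    print(is_flagged(" ".join(f"w{i}" for i in range(20)), source))  # False: run of 20
    print(is_flagged(" ".join(f"w{i}" for i in range(21)), source))  # True: run of 21
```

The table-based approach costs time proportional to the product of the two lengths, which is acceptable for a demonstration; a real system comparing against a large database would need an index instead.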
A complete guide to revising papers for duplicate checking:
Method 1: Translation of foreign documents
Consult foreign literature in your research field, especially articles in high-level journals such as Science, Nature and Water Resources, translate the theoretical explanations into Chinese, and put them in your own paper.
Advantages: 1. Everyone's language habits are different, so translations into Chinese are bound to differ. Even if the same paragraph is translated by different people, the results will not be flagged as duplicates. 2. Reading foreign literature can improve your English and broaden your professional horizons.
Disadvantages: This is harder to carry out for students with poor English, especially poor professional English.
Method 2: Rewording method
Rewrite the wording of other people's papers: change the sentence structure, switch between active and passive voice, replace keywords, or add and remove words. Of course, a classic sentence should be quoted as a quotation.
Advantages: 1. After the text is modified, according to the HowNet program and algorithm, text is not marked red as long as it contains no run of 13 consecutive repeated words or keywords. 2. You will know every word and sentence of the paper like the back of your hand, and you will be like a duck to water at the thesis defense.
Disadvantages: Word-for-word revision is time-consuming and laborious.
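The 13-consecutive-word rule mentioned under Method 2 resembles a sliding-window comparison. The sketch below shows one way such a check could work; it is not the HowNet algorithm, whose details the text does not give. The window size `n=13` comes from the text, while the function names and the word-level tokenization are assumptions.

```python
def shingles(words, n=13):
    """Set of every run of n consecutive words (empty if the text is shorter than n)."""
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def marked_positions(submitted, source, n=13):
    """Indices of words in `submitted` covered by an n-word window that also occurs in `source`.

    These are the positions a checker of this kind would "mark red".
    """
    known = shingles(source, n)
    marked = set()
    for i in range(len(submitted) - n + 1):
        if tuple(submitted[i:i + n]) in known:
            marked.update(range(i, i + n))
    return sorted(marked)


if __name__ == "__main__":
    source = [f"s{i}" for i in range(20)]
    copied = ["intro"] + source[:13] + ["outro"]
    print(marked_positions(copied, source))            # positions 1 through 13
    print(marked_positions(["intro"] + source[:12], source))  # []: only 12 words in a row
```

Storing the source windows in a set makes each lookup constant time, which is why window-based schemes scale better than pairwise comparison of whole texts.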
Method 3: Trim the beginning and end and change the word order in the middle
Take a passage from someone else's paper, replace its beginning and end, keep a section in the middle, and turn the remaining sentences into passive voice. The sentence pattern and structure then change, and once you have fixed any awkward wording yourself, the text avoids duplication.
Advantages: Convenient and quick, and large sections can be modified at once.
Disadvantages: If your command of the language is weak, this is hard work and can take half a day.
Method 4: Picture conversion method
Capture the text from other people's papers as pictures and insert the pictures into your own paper. At present, the HowNet duplicate checking system can only check text, not pictures or tables, so the duplicate check is avoided.
Advantages: It is more convenient and faster than changing sentence order.
Disadvantages: If overused, pages full of pictures are easy to spot, and the pictures reduce the word count of the whole paper.
Method 5: Insert document method
Insert some of the quoted text into the paper as embedded Word documents.
Advantages: This method is even better than Method 4, because the inserted document can still be edited later, whereas text converted to a picture is not convenient to modify.
Disadvantages: not found yet.
Method 6: Insert space method
Insert a space between all the words in the article, then reduce the width of the spaces to the minimum. Because duplicate checking is based on words, the spaces break up the words and the text naturally slips past the duplicate checking system.
Advantages: Based on the working principle of the duplicate checking system, so it is highly reliable.
Disadvantages: The workload is huge. The task can be automated with macros, but you need to learn how to write macros.
Method 7: Self-original method
Write the paper yourself, or at least do not copy and paste original text when writing, and cite quotations correctly.
Advantages: You will basically never have to worry about failing the duplicate check, even if the threshold of the duplicate checking system is lowered.
Disadvantages: If there is a disadvantage, it is that writing the graduation thesis may kill a few more brain cells.
Paper duplicate checking and revision method:
Duplicate checking is a matching process based on sentences. If a sentence is repeated verbatim, it is easily judged to be a duplicate, so:
1) If it is indeed a classic sentence, mark it with a superscript endnote that points to the references.
2) If it is an ordinary quotation, restore all the omitted subjects, predicates and other elements of the original sentence, however wordy the result. Every extra word is a victory.
3) You can also use the "horizontal knife" method: remove some sentence elements and replace them with pronouns.
4) Or use the "foreign devil" method: if foreign names in the original text are given in Chinese, use the English form directly; if they are given in English, use the Chinese form instead. Find all such names and change them.
5) Deliberately add parenthetical notes next to some English abbreviations. In short, every sentence can be changed; even adding or removing a single word is a victory.
6) For a quotation, do not end the quoted sentence with a period. If you write a period, it is counted as plagiarism (although I consider it a quotation), so try to use a semicolon before the end of the quotation. Some people put the citation marker after the period, which is wrong; it should go before the period.
7) Text can be converted into tables, which the check basically cannot match. Text and tables can also be converted into graphics, which are clear at a glance and are never detected as duplicates.
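The sentence-based matching described at the start of this section can be sketched as follows. This is a deliberately naive illustration under stated assumptions: sentences are split on common Western and Chinese end punctuation, compared after lowercasing and whitespace normalization, and matched only when identical. The function names are hypothetical, and real systems very likely use fuzzier comparison than exact equality.

```python
import re

# Assumed sentence delimiters: ., !, ?, ; and their full-width Chinese forms.
_SENTENCE_END = re.compile(r"[.!?;。！？；]+")


def split_sentences(text):
    """Split text into normalized sentences (lowercased, single-spaced)."""
    parts = _SENTENCE_END.split(text)
    return [" ".join(p.split()).lower() for p in parts if p.strip()]


def repeated_sentences(submitted, source):
    """Sentences of `submitted` that also appear verbatim in `source`."""
    known = set(split_sentences(source))
    return [s for s in split_sentences(submitted) if s in known]


if __name__ == "__main__":
    source = "Water flows downhill. Rivers carry sediment."
    submitted = "Water flows  downhill. In general, rivers carry a lot of sediment!"
    print(repeated_sentences(submitted, source))  # ['water flows downhill']
```

Under exact matching, a sentence with even one added or removed word no longer matches, which is the behavior the list above relies on; a checker that tolerated small differences would not behave this way.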