In China, the three main detection systems are HowNet, VIP and Wanfang, and the resources inside them are constantly updated. Every year, apart from papers subject to confidentiality requirements, graduates' papers are basically all collected by these three systems as a comparison library, so you can't be careless. HowNet is not open to individuals, but Wanfang is. Wanfang does not check against Internet and English-language sources, but HowNet and VIP do.

At present, all schools require master's and doctoral dissertations to pass the check in order to qualify. For undergraduates, most schools, mainly Project 211 key universities, use spot checks to test the repetition rate of graduation theses. If the plagiarism or citation rate is too high, the consequences are quite serious once it is found to be over 30%: with similarity under 50%, graduation is postponed, and over 50%, the degree is cancelled. After spending tens of thousands on tuition and several years in college, ending up without a job or a degree is very sad.

However, all detection systems are machines with fixed detection principles. As long as you know the internal detection principles, algorithms and rules of the system, you can still pass the check and graduate by repeatedly revising against the detection report.
Issues requiring special attention:
Here are several common questions:
First, some books are very old, and their contents are not in detection systems such as HowNet. Is it safe to copy large sections from them? Some students also think that most of the material in the database comes from previous student papers and periodicals, and that books and government work reports have not yet been added, so copying directly from books will generally not get them caught.
A: These behaviors are risky. First, although HowNet doesn't include books, a classmate or teacher, call them A, may have plagiarized the same content and already published the resulting paper. HowNet can include the full text of A's article in the database, so if you copy the same content again, the check is likely to point to A's article, and it will be considered plagiarism.
Some say: "If a book was plagiarized by someone a few years ago, it will still be detected, so everyone should choose new books published in the last two years to plagiarize." But new books may have been copied by others too. In addition, the experts who review the manuscript have rich experience and a high theoretical level, and your long copied passages may be discovered by these veteran experts, and the result will be very sad!
Second, there are now a lot of related materials on some web pages. Can you copy that content when writing a paper? For example, from Baidu Library and Douding?
A: That is also dangerous. Web content largely comes from periodical websites, and many online articles are extracted from periodical websites or assembled by pasting together N papers. In addition, some databases have made web pages one of their components.
If 13 consecutive words are the same, the passage can be detected. You can instead express the original content in new words with a similar meaning. It is best to use the associative method: read the passage, then say it again in your own language, but stay professional, that is, replace words with synonyms or technical terms as far as possible and use different words to convey the same meaning. For example, change active sentences into passive ones, change the sentence pattern, and substitute equivalent words or technical terms. Also pay attention to the framework of the paper.
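The 13-in-a-row rule above can be pictured as a sliding-window comparison. The following is only a minimal sketch of that idea, not the actual algorithm of any detection system: the function names are hypothetical, and treating the unit as a single character (rather than a word) is an assumption.

```python
from typing import List


def shared_runs(text: str, source: str, n: int = 13) -> List[str]:
    """Return every n-character window of `text` that also occurs in `source`.

    Assumption: the matching unit is one character and n defaults to 13.
    """
    source_windows = {source[i:i + n] for i in range(len(source) - n + 1)}
    return [
        text[i:i + n]
        for i in range(len(text) - n + 1)
        if text[i:i + n] in source_windows
    ]


def is_flagged(text: str, source: str, n: int = 13) -> bool:
    """A passage is flagged if it shares at least one n-character run."""
    return bool(shared_runs(text, source, n))


if __name__ == "__main__":
    source = "the overheating fault of a transformer damages its insulation"
    copied = "we note that the overheating fault of a transformer is serious"
    print(is_flagged(copied, source))
```

Any verbatim run of at least `n` characters produces one or more shared windows, which is why long copied stretches are the first thing such a check reports.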
Ways to reduce the plagiarism rate:
1. Dividing the text into many small paragraphs reduces the plagiarism rate.
2. Many books, such as monographs, are not in the detection database and can be excerpted.
3. Rearranging chapters does not reduce the copy rate.
4. Citing a reference does not exempt copied text in plagiarism detection software. For example, if an article has 5,000 words, 1% of it is 50 words; if more than 50 words are copied, it counts as plagiarism even when references are added.
As long as more than 20 units of words match, it is considered plagiarism.
Modification method:
The first is to change the wording: the professional vocabulary in the article can be kept, and other words replaced with synonyms as far as possible. The second is to change how things are described in the text, for example by using inverted sentences or switching between passive and active sentences. The third is to disrupt the order of paragraphs: divide the paragraphs when copying the original, and then reorganize them.
HowNet duplicate checking is based on sentences. That is, the paper is divided into sentences, and each sentence is compared with the articles in the HowNet database. If the main content is the same (that is, the notional words, such as nouns, verbs and professional vocabulary), the sentence is marked in red. If a large number of red sentences appear in a paragraph, the paragraph is counted toward the repetition rate of the paper. In my own experience, the best way to avoid repetition is to rewrite the relevant paragraphs of other people's papers in my own language, for example by changing the order of sentences and, more importantly, changing the subject-predicate-object structure of each sentence. With this method, the repetition rate of my paper was about 3%, no problem. I hope this helps!

Here's the thing: checking is basically sentence-based. Judging from the current situation, however, it actually works on the content of each paragraph: all the sentences in the paragraph are separated and then compared one by one. For example, suppose a paragraph in your paper contains four sentences, A, B, C and D, and a paragraph of an article in the database contains four sentences, E, F, G and H. Then each of A, B, C and D is compared against each of E, F, G and H; put crudely, that is 16 comparisons. In this case, simply changing the sentence order is no use, and the sentence structure itself must be changed.
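The A, B, C, D versus E, F, G, H comparison can be sketched as follows. This is an illustration only: the real similarity algorithm is not described in the text, so Python's standard `difflib.SequenceMatcher` is swapped in as a stand-in, and the function name and the 0.8 threshold are assumptions.

```python
from difflib import SequenceMatcher
from itertools import product
from typing import List, Tuple


def compare_paragraphs(
    mine: List[str], theirs: List[str], threshold: float = 0.8
) -> Tuple[int, List[Tuple[str, str]]]:
    """Compare every sentence in `mine` with every sentence in `theirs`.

    Returns the number of comparisons made and the sentence pairs whose
    similarity ratio reaches `threshold` (a hypothetical cut-off).
    """
    comparisons = 0
    flagged = []
    for a, b in product(mine, theirs):
        comparisons += 1
        if SequenceMatcher(None, a, b).ratio() >= threshold:
            flagged.append((a, b))
    return comparisons, flagged


if __name__ == "__main__":
    mine = ["the cat sat on the mat", "zzzz", "qqqq", "1234"]
    theirs = ["wwww", "the cat sat on the mat", "8888", "xxxx"]
    count, hits = compare_paragraphs(mine, theirs)
    print(count, hits)
    # Reordering the sentences changes the order of comparison, not the matches.
    count_r, hits_r = compare_paragraphs(list(reversed(mine)), theirs)
    print(count_r, hits_r)
```

Because every sentence is tried against every other, four sentences on each side always give 4 x 4 = 16 comparisons, and a matching pair is found wherever the two sentences sit in their paragraphs.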
1. Comparison and Selection of Paper Detection Systems in Different Databases
As we all know, the database world has a troika: China Knowledge Network (CNKI, that is, HowNet), Wanfang and VIP. In general, the detection system that colleges and universities use for master's and doctoral dissertations is HowNet (I'm not sure about undergraduate theses, but probably 80% of them go through the same duplicate-checking system), because HowNet is the most comprehensive and powerful database of dissertations and periodical papers in the country; VIP, which comes after it, is hardly worth mentioning.

The general collection process is like this. Each database contacts each university about the school's graduation thesis resources, and the arrangement is basically exclusive: a school that gives its resources to HowNet does not give them to the others. Because HowNet is powerful and offers many benefits, most colleges and universities submit their resources to HowNet.

Why am I saying this? Many students don't know whether to choose HowNet, Wanfang or VIP when checking for plagiarism. HowNet has absolute authority and a monopoly, and its results are consistent with the school's own check, so it can afford to be so arrogant and ask such a high price. However, I have also heard that the price is high because HowNet can only check 5,000 words at a time, so a master's thesis of 20,000-30,000 words has to be submitted several times before it is fully checked. I have not confirmed whether this is the case.
2. Working Principle and Countermeasures of the HowNet Detection System
First, HowNet checks the paper as a whole upload. After the paper is uploaded, the system automatically detects its chapter information. If there is an automatically generated table of contents, the system checks the paper chapter by chapter; otherwise it automatically splits the paper and checks it in segments of about 10,000 words.
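The segmentation rule just described can be sketched in a few lines. This is a hedged illustration, not the system's real code: the function name is hypothetical, the chapters are assumed to have been extracted from the table of contents already, and counting a "word" as one character is an assumption.

```python
from typing import List, Optional


def detection_segments(
    body: str, chapters: Optional[List[str]] = None, segment_size: int = 10_000
) -> List[str]:
    """Return the units that would each be checked separately.

    If chapter texts are available (assumed to come from an automatically
    generated table of contents), each chapter is one unit. Otherwise the
    body is cut into consecutive pieces of `segment_size` characters.
    """
    if chapters:
        return list(chapters)
    return [body[i:i + segment_size] for i in range(0, len(body), segment_size)]


if __name__ == "__main__":
    paper = "x" * 25_000
    print([len(s) for s in detection_segments(paper)])
```

The choice of unit matters because the thresholds discussed in this section are applied per detection paragraph, not to the paper as a whole.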
Secondly, some students report that they explicitly quoted or copied paragraphs or sentences from other documents in their own paragraphs, yet these were not detected. This is normal. HowNet has set a sensitivity threshold for this detection system, which is about 3%, counted per paragraph: plagiarism or quotation below 3% cannot be detected, which is common for clauses or small concepts inside large paragraphs. For example, if detection paragraph 1 has 10,000 words, a quotation of 100 words or fewer from a single document will not be detected. This also points students to a method of revision: never quote or copy a passage from just one article; choose as many documents as possible and take only a few words from each, so that it will not be found out.
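The arithmetic behind the threshold example is simple enough to write out. The sketch below only restates the numbers given in the text (a 10,000-word detection paragraph and a threshold of about 3%); the function names are hypothetical and the exact rule used by the real system is not known.

```python
def source_share(paragraph_words: int, words_from_source: int) -> float:
    """Fraction of one detection paragraph that comes from a single document."""
    return words_from_source / paragraph_words


def is_reported(
    paragraph_words: int, words_from_source: int, threshold: float = 0.03
) -> bool:
    """True if the share from one document reaches the assumed 3% threshold."""
    return source_share(paragraph_words, words_from_source) >= threshold


if __name__ == "__main__":
    # The example from the text: 100 words out of 10,000 is a 1% share.
    print(source_share(10_000, 100), is_reported(10_000, 100))
    print(source_share(10_000, 500), is_reported(10_000, 500))
```

With these figures, 100 words from one document is a 1% share, below the stated threshold, while 500 words is a 5% share, above it.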
Thirdly, besides the second point, there are other ways to modify the words marked in red, such as changing words, changing sentences, changing the description (turning the original sentence into an inverted, passive or active sentence, and so on), disrupting the order of paragraphs, and replacing keywords and key sentences. Practice has shown that combining the above methods can effectively reduce the copy rate and ensure a smooth pass.
For example, the following sentence:
Overheating in overheating fault is different from heating in normal operation of transformer. During normal operation, its heating source comes from winding and iron core, that is, copper loss and iron loss, while the overheating fault of transformer is due to the accelerated deterioration of insulation caused by effective thermal stress, which has a medium level of energy density.
This sentence was almost entirely marked in red, indicating that it overlaps with similar documents and that the similarity is high. After combining the above methods, the sentence can be changed to:
Overheating in overheating fault is easily confused with heating in normal operation of transformer. Overheating is caused by copper loss and iron loss in its winding and iron core, which is heating in normal operation, while overheating fault of transformer is accelerated deterioration of insulation caused by effective thermal stress, with moderate capacity density.
Fourth, a new use for Google.
If all the students' "anti-plagiarism" tricks above are still within the bounds of comprehension, this one is jaw-dropping and makes you think you have met Martians. It is called the "Google method": find a ready-made paper, translate every paragraph into English with Google's online translator, and then translate all the English back into Chinese with the same tool. At first glance it looks like the original, but on closer inspection every sentence is different! Fix the few passages that read badly yourself, and you are done.
The principle of HowNet paper detection is that 13 consecutive words that are similar or plagiarized will be marked in red, but the precondition described above must also be met: the total you quoted or plagiarized from document A within each detection paragraph must reach 5%. If half of a run of 13 words is similar and the other half is suspected to be similar, you must change the sentence pattern and switch to professional terms, carefully and thoroughly. Remember this.
HowNet detection range:
China Academic Journals Online Publishing Database
Chinese Doctoral Dissertation Full-text Database
China Excellent Master's Theses Full-text Database
China Important Conference Papers Full-text Database
China Important Newspapers Full-text Database
China Patent Full-text Database
Internet resources
English database (covering English-language material from periodicals, doctoral and master's theses, and conferences, as well as the Springer and Taylor & Francis periodical databases, etc.)
Priority-publication literature database
Hong Kong, Macao and Taiwan academic literature database
Internet document resources
Detailed description of the HowNet system's calculation standard:
1. After reading the introduction to this system, I have a question. The system is good at recognizing copied text, but what about other content, such as data and charts? Isn't it useless for detecting those?
Among all kinds of academic misconduct, plagiarism of text is the most common and serious, and detection of it has already reached a high level. Detection of plagiarism and tampering in charts, formulas and data is currently under development and has made great progress. You are welcome to keep following the progress of this detection system and to put forward more critical and constructive opinions and suggestions.
2. In this system, anything under 39% is displayed in yellow. Does that mean it is within the tolerable limit? I recently read in the news that a teacher at Shanghai University had a National Social Science Fund project cancelled because two papers he published were plagiarized, at 25% and 30% respectively. Please specify where the warning line is.
The percentage only describes the proportion of overlapping words in the checked document; it is not a measure of plagiarism. It can only be said that the greater the percentage, the more overlapping words there are and the greater the possibility of plagiarism. Whether a document is plagiarized, and how seriously, must be decided by experts after review.
3. How do you prevent the academic misconduct detection system for dissertations from becoming a platform for personal revenge?
This is something we are seriously considering. At present, the detection system is used only by institutional users, and we have established a strict management process. Technically, we have also taken various measures to prevent malicious use as far as possible, including a series of strict identity authentication and login controls.
4. The minimum detection unit is one sentence, so does that mean a sentence with one or two words changed goes undetected?
We handle sentences accordingly and have a sentence similarity algorithm. Sentences do not have to be identical to be judged the same. Sentences have a sentence-level similarity algorithm, and paragraphs have a paragraph-level similarity algorithm. Whether a document or paragraph is similar to other documents is calculated on this basis.
5. Suppose I took a passage from a book, but the same passage has already been copied into a document in the database; that is, an earlier article took the same passage from the same book, while the passage in my paper is marked as coming from the book. Is this academic plagiarism?
The detection system cannot conclude whether something is plagiarism; in the end there is a manual review. So in the situation you describe, experts will make the corresponding judgment. Our system only provides clues and evidence, so that people can quickly grasp the information about the checked document.
6. How authoritative is the HowNet detection system?
The academic misconduct detection system does not draw conclusions. That is, it does not pass judgment on the checked document; it only shows the similarities between the checked document and other published documents and lists objective facts. Whether the checked document constitutes academic misconduct requires final examination and confirmation by experts.
HowNet related spot check regulations:
If your school has such regulations, you get a chance to revise first and can defend after the revision passes. If the second check also fails, that is the end of it, and the paper or design must be handed in again within the next 4 months. This applies to plagiarism from 30%. If you plagiarize more than 50%, you will have to hand in your paper or design within the next 4 months.
1. An undergraduate graduation design (thesis) in which the total number of words repeating other people's existing papers and works is between 30% and 50% (including 50%) is identified as plagiarism and must be revised by the author. The author may take part in the college defense only after the revised work passes the check again. Those who still fail the second check are treated as having completed their studies without graduating; they must submit a rewritten graduation design (thesis) after 3 months, and may take part in the defense after it passes.
2. An undergraduate graduation design (thesis) in which the total number of words repeating other people's existing papers and works exceeds 50% is identified as plagiarism and is directly treated as completion of studies without graduating. A rewritten graduation design (thesis) must be submitted after 4 months, and the author may take part in the defense after it passes.
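The two numbered rules can be summarized as a small lookup on the repetition rate. This is a sketch under stated assumptions: the function name and return strings are hypothetical, the text does not say whether exactly 30% falls inside the first band (it is treated as inside here), and rates below 30% are not covered by the rules, so the function returns `None` for them.

```python
from typing import Optional


def spot_check_outcome(repetition_percent: float) -> Optional[str]:
    """Map a repetition rate (in percent) to the outcome stated in the rules."""
    if repetition_percent > 50:
        # Rule 2: more than 50%.
        return "rewrite and resubmit after 4 months"
    if repetition_percent >= 30:
        # Rule 1: 30% to 50%, with 50% included (30% included by assumption).
        return "revise and pass a second check before the defense"
    # Below 30% the quoted rules state no consequence.
    return None


if __name__ == "__main__":
    for rate in (12.0, 30.0, 50.0, 50.1):
        print(rate, spot_check_outcome(rate))
```

Individual schools may set different bands, so the cut-offs should be read as the ones quoted here rather than as universal values.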