In academia, similarity between papers can be used to gauge the direction of research in a field and to assess the originality of published work. If two papers are highly similar, it may indicate that they address the same problem in similar ways, or it may suggest that one of them has plagiarized the other.
A similarity score generally takes into account a paper's vocabulary, structure, and phrasing. The most common approach is to compute the score from word-frequency statistics. Several measures can be used, including cosine similarity, k-gram comparison, and the Jaccard coefficient.
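As a rough illustration of these measures, the following minimal Python sketch computes a cosine similarity over word-frequency vectors and a Jaccard coefficient over word k-grams. The function names, the simple regex tokenizer, and the default `k=2` are assumptions made for this example; real plagiarism checkers typically add steps such as stop-word removal, stemming, and weighting.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (a simplifying assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the two word-frequency vectors."""
    freq_a = Counter(tokenize(text_a))
    freq_b = Counter(tokenize(text_b))
    # Only words present in both texts contribute to the dot product.
    dot = sum(freq_a[w] * freq_b[w] for w in freq_a.keys() & freq_b.keys())
    norm_a = math.sqrt(sum(c * c for c in freq_a.values()))
    norm_b = math.sqrt(sum(c * c for c in freq_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction to compare
    return dot / (norm_a * norm_b)


def k_grams(tokens, k):
    """Set of all runs of k consecutive tokens."""
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def jaccard_similarity(text_a, text_b, k=2):
    """Jaccard coefficient |A & B| / |A | B| over the two sets of word k-grams."""
    grams_a = k_grams(tokenize(text_a), k)
    grams_b = k_grams(tokenize(text_b), k)
    union = grams_a | grams_b
    if not union:
        return 0.0  # both texts are shorter than k words
    return len(grams_a & grams_b) / len(union)


if __name__ == "__main__":
    a = "the cat sat on the mat"
    b = "the cat lay on the mat"
    print(cosine_similarity(a, b))
    print(jaccard_similarity(a, b, k=2))
```

For these two sentences the cosine similarity is about 0.875, while the Jaccard coefficient over word pairs is 3/7, or about 0.43. The gap shows why the choice of measure matters: word-frequency cosine ignores word order, whereas k-gram overlap drops sharply when even one word in a sequence changes.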
Similarity is often used as an important indicator in academic evaluation and plagiarism detection. A high similarity score may lead reviewers to question a paper's originality, which can affect how the paper is assessed.
Of course, other factors should be considered alongside the similarity score, such as the paper's title, its authors, and the journal in which it was published. Only by weighing these factors together can we form an accurate picture of an academic paper.