通过段落句子的重复率进行比对的论文检测模型探究
- 2014-05-01
- 编辑整理:早检测网
- 标签:
A new model for plagiarism-identification of scientific papers based on sentence similarity is presented.Large-scale texts are quickly detected with Local Word-Frequency Fingerprint(LWFF) to find suspected plagiarism ones.Sentence similarity is computed according to the Longest Sorted Common Subsequence(LSCS) between source texts and destination texts.The algorithm can mark plagiarism details,and show evidence. 【提出一种基于句子相似度的论文抄袭检测模型。利用局部词频指纹算法对大规模文档进行快速检测,找出疑似抄袭文档。根据最长有序公共子序列算法计算句子间的相似度,并标注抄袭细节,给出抄袭依据。在标准中文数据集SOGOU-T上进行的实验表明,该模型具有较强的局部信息挖掘能力,在一定程度上克服了现有的论文抄袭检测算法精度不高的缺点。】