MATEC Web Conf.
Volume 128, 20172017 International Conference on Electronic Information Technology and Computer Engineering (EITCE 2017)
|Number of page(s)||5|
|Section||Simulation Model and Algorithm|
|Published online||25 October 2017|
A Plagiarism Detection Algorithm based on Extended Winnowing
College of Information Engineering, Sichuan Agricultural University, Ya’an 625014, China
a Corresponding author: email@example.com
Plagiarism is a common problem faced by academia and education. Mature commercial plagiarism detection system has the advantages of comprehensive and high accuracy, but the expensive detection costs make it unsuitable for real-time, lightweight application environment such as the student assignments plagiarism detection. This paper introduces the method of extending classic Winnowing plagiarism detection algorithm, expands the algorithm in functionality. The extended algorithm can retain the text location and length information in original document while extracting the fingerprints of a document, so that the locating and marking for plagiarism text fragment are much easier to achieve. The experimental results and several years of running practice show that the expansion of the algorithm has little effect on its performance, normal hardware configuration of PC will be able to meet small and medium-sized applications requirements. Based on the characteristics of lightweight, high efficiency, reliability and flexibility of Winnowing, the extended algorithm further enhances the adaptability and extends the application areas.
© The authors, published by EDP Sciences, 2017
This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. (http://creativecommons.org/licenses/by/4.0/).
Current usage metrics show cumulative count of Article Views (full-text article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.
Data correspond to usage on the plateform after 2015. The current usage metrics is available 48-96 hours after online publication and is updated daily on week days.
Initial download of the metrics may take a while.