Automatic Detection and Visualisation of Overlap for Tracking of Information Flow
Key: Leh10-2
Author: Lasse Lehmann, Arno Mittelbach, Christoph Rensing, Ralf Steinmetz
Date: September 2010
Kind: In proceedings
Publisher: Verlag der Technischen Universität Graz, Austria
Book title: Proceedings of I-KNOW 2010, 10th International Conference on Knowledge Management and Knowledge Technologies
Keywords: Overlap Detection, Holinshed, ShingleCloud, String Matching, Information Flow
Abstract: The detection of redundant or reused passages in texts is an important basis for various tasks, including tracking of information flow, plagiarism detection, origin detection, web search and information retrieval. Being able to track the evolution of a piece of information through different revisions or instances of documents can generally help to gain an impression of the document's background. In this paper we propose an efficient algorithm for detection of textual overlap between documents as well as a tool for its visualisation, created in the course of the Holinshed Project at the University of Oxford. The evaluation on an annotated corpus shows that the proposed algorithm performs better than state-of-the-art approaches.

The documents distributed by this server have been provided by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a non-commercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.