Syntactic clustering of the Web

Computer Networks and ISDN Systems, 29 (8-13): 1157-1166 (September 1997)

Abstract

We have developed an efficient way to determine the syntactic similarity of files and have applied it to every document on the World Wide Web. Using this mechanism, we built a clustering of all the documents that are syntactically similar. Possible applications include a "Lost and Found" service, filtering the results of Web searches, updating widely distributed web-pages, and identifying violations of intellectual property rights.
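The abstract does not say how syntactic similarity is measured. As a minimal sketch only, assuming a measure based on overlapping word sequences ("shingles") compared by Jaccard overlap, the idea could look like the following. The function names `shingles` and `resemblance` and the window size `w = 4` are illustrative assumptions, not details taken from this record.

```python
def shingles(text: str, w: int = 4) -> set[tuple[str, ...]]:
    """Return the set of contiguous w-word sequences in text (assumed tokenization: whitespace)."""
    words = text.lower().split()
    if len(words) < w:
        # Short documents yield a single shingle holding all their words.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + w]) for i in range(len(words) - w + 1)}


def resemblance(a: str, b: str, w: int = 4) -> float:
    """Jaccard overlap of the two shingle sets, a value in [0, 1]."""
    sa, sb = shingles(a, w), shingles(b, w)
    if not sa and not sb:
        return 1.0  # two empty documents are treated as identical
    return len(sa & sb) / len(sa | sb)
```

Under this assumption, documents whose score exceeds some chosen threshold would be treated as near-duplicates and grouped into the same cluster. Comparing every pair of documents this way would not scale to the whole Web, so a real system would need a compact per-document summary instead of the full shingle sets.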

Description

ScienceDirect - Computer Networks and ISDN Systems : Syntactic clustering of the Web
