Detecting spam web pages through content analysis

A. Ntoulas, M. Najork, M. Manasse, and D. Fetterly. WWW '06: Proceedings of the 15th International Conference on World Wide Web, pages 83--92. New York, NY, USA, ACM, (2006)
DOI: http://doi.acm.org/10.1145/1135777.1135794

Abstract

In this paper, we continue our investigations of "web spam": the injection of artificially-created pages into the web in order to influence the results from search engines, to drive traffic to certain pages for fun or profit. This paper considers some previously-undescribed techniques for automatically detecting spam pages, examines the effectiveness of these techniques in isolation and when aggregated using classification algorithms. When combined, our heuristics correctly identify 2,037 (86.2%) of the 2,364 spam pages (13.8%) in our judged collection of 17,168 pages, while misidentifying 526 spam and non-spam pages (3.1%).
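
The percentages quoted in the abstract can be checked against its raw counts. The short Python sketch below is only an illustration: the helper `pct` and the constant names are hypothetical and do not come from the paper, and it assumes that the 13.8% and 3.1% figures are both taken relative to the full judged collection of 17,168 pages.

```python
def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)

# Counts as stated in the abstract (names are illustrative only).
TOTAL_PAGES = 17168      # judged collection
SPAM_PAGES = 2364        # pages judged to be spam
SPAM_DETECTED = 2037     # spam pages correctly identified
MISIDENTIFIED = 526      # spam and non-spam pages misidentified

detected_share = pct(SPAM_DETECTED, SPAM_PAGES)      # share of spam pages found
spam_share = pct(SPAM_PAGES, TOTAL_PAGES)            # spam as share of collection
misidentified_share = pct(MISIDENTIFIED, TOTAL_PAGES)  # errors as share of collection

print(detected_share, spam_share, misidentified_share)
```

Under that assumption the three computed values agree with the rounded figures given in the abstract.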

Links and resources

BibTeX key: ntoulas2006spam
entry type: inproceedings
address: New York, NY, USA
booktitle: WWW '06: Proceedings of the 15th international conference on World Wide Web
year: 2006
pages: 83--92
publisher: ACM
location: Edinburgh, Scotland
isbn: 1-59593-323-9
DOI: http://doi.acm.org/10.1145/1135777.1135794
url: http://portal.acm.org/citation.cfm?id=1135794

Meta data

Last update 17 years ago
Created 18 years ago
