Uncovering Social Spammers: Social Honeypots + Machine Learning.

Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 435–442. New York, NY, USA, ACM, (2010)
DOI: 10.1145/1835449.1835522

Abstract

Web-based social systems enable new community-based opportunities for participants to engage, share, and interact. This community value and related services like search and advertising are threatened by spammers, content polluters, and malware disseminators. In an effort to preserve community value and ensure long-term success, we propose and evaluate a honeypot-based approach for uncovering social spammers in online social systems. Two of the key components of the proposed approach are: (1) the deployment of social honeypots for harvesting deceptive spam profiles from social networking communities; and (2) statistical analysis of the properties of these spam profiles for creating spam classifiers to actively filter out existing and new spammers. We describe the conceptual framework and design considerations of the proposed approach, and we present concrete observations from the deployment of social honeypots in MySpace and Twitter. We find that the deployed social honeypots identify social spammers with low false positive rates and that the harvested spam data contains signals that are strongly correlated with observable profile features (e.g., content, friend information, and posting patterns). Based on these profile features, we develop machine-learning-based classifiers for identifying previously unknown spammers with high precision and a low rate of false positives.
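
The abstract reports classifier quality in terms of precision and false positive rate. As a minimal sketch of how those two measures are computed, the Python snippet below uses made-up labels (1 = spammer, 0 = legitimate user); the function name `precision_and_fpr` and the example data are illustrative assumptions, not taken from the paper.

```python
def precision_and_fpr(y_true, y_pred):
    """Return (precision, false positive rate) for binary labels.

    y_true and y_pred are equal-length sequences of 0/1 values,
    where 1 marks a spammer (hypothetical labelling convention).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # spammers flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # legitimate users flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # legitimate users passed

    # Precision: share of flagged profiles that really are spammers.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # False positive rate: share of legitimate users wrongly flagged.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return precision, fpr


# Illustrative data only: 3 spammers and 5 legitimate users.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
precision, fpr = precision_and_fpr(y_true, y_pred)
```

With these example labels, two of the three flagged profiles are spammers and one of the five legitimate users is flagged, so precision is 2/3 and the false positive rate is 1/5.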

Tags

community