The success and popularity of social network systems, such
as del.icio.us, Facebook, MySpace, and YouTube, have generated
many interesting and challenging problems for the research
community. Among these, discovering social interests
shared by groups of users is very important because it
helps to connect people with common interests and encourages
people to contribute and share more content. The
main challenge in solving this problem lies in the difficulty
of detecting and representing the interests of users.
Existing approaches are all based on the online connections
of users and are therefore unable to identify the common interests
of users who have no online connections.
In this paper, we propose a novel social interest discovery
approach based on user-generated tags. Our approach
is motivated by the key observation that in a social network,
human users tend to use descriptive tags to annotate
the content that they are interested in. Our analysis of
a large set of real-world traces reveals that, in general,
user-generated tags are consistent with the web content they
are attached to, while being more concise and closer to how
human users understand and judge the content.
Thus, patterns of frequent co-occurrences of user tags can
be used to characterize and capture topics of user interests.
We have developed an Internet Social Interest Discovery system,
ISID, to discover the common user interests and cluster
users and their saved URLs by different interest topics. Our
evaluation shows that ISID can effectively cluster similar
documents by interest topics and discover user communities
with common interests regardless of whether they have any online
connections.
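
To make the core idea concrete, the following is a minimal illustrative sketch, not the ISID implementation. It assumes each post is a (user, URL, tag set) triple, treats every tag pair that co-occurs in at least `min_support` posts as an interest topic, and then groups users and URLs under each topic. The toy data, the function names, and the restriction to tag pairs are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy posts: (user, url, tags). Not data from the paper.
POSTS = [
    ("alice", "http://example.com/a", {"python", "programming", "tutorial"}),
    ("bob",   "http://example.com/b", {"python", "programming"}),
    ("carol", "http://example.com/c", {"recipes", "cooking"}),
    ("dave",  "http://example.com/d", {"cooking", "recipes", "vegan"}),
    ("erin",  "http://example.com/e", {"python", "travel"}),
]

def frequent_tag_pairs(posts, min_support=2):
    """Return tag pairs that co-occur in at least min_support posts."""
    counts = Counter()
    for _user, _url, tags in posts:
        # Sort tags so each pair has one canonical ordering.
        for pair in combinations(sorted(tags), 2):
            counts[pair] += 1
    return {pair for pair, n in counts.items() if n >= min_support}

def cluster_by_topic(posts, topics):
    """Group users and URLs under every topic whose tags all appear in a post."""
    clusters = {topic: {"users": set(), "urls": set()} for topic in topics}
    for user, url, tags in posts:
        for topic in topics:
            if set(topic) <= tags:
                clusters[topic]["users"].add(user)
                clusters[topic]["urls"].add(url)
    return clusters

topics = frequent_tag_pairs(POSTS)
clusters = cluster_by_topic(POSTS, topics)
```

In this sketch, users end up in the same cluster purely because their tags overlap, so no online connection between them is required.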