Abstract

PhD thesis.

In an age when people readily report real-world events through their social
media accounts, many researchers see value in mining this unstructured
and informal data. Compared with traditional news media, online
social media services such as Twitter can provide more comprehensive and timely
information about real-world events. Existing Twitter event monitoring systems analyse
only partial event data and are unable to report the underlying stories or sub-events in real time.
To fill this gap, this research focuses on the automatic identification of content for
events and sub-events through the analysis of Twitter streams in real time.

To fulfil the need for real-time content identification of events and sub-events, this research
first proposes a novel adaptive crawling model that retrieves extra event content
from the Twitter Streaming API. The proposed model analyses the characteristics of
hashtags and tweets collected from live Twitter streams to automate the expansion of
subsequent queries. By investigating the characteristics of Twitter hashtags, this research
then proposes three Keyword Adaptation Algorithms (KwAAs), which are based
on the term frequency (TF-KwAA), the traffic pattern (TP-KwAA), and the text content
of associated tweets (CS-KwAA) of the emerging hashtags. Based on a comparison
between traditional keyword crawling and adaptive crawling with different KwAAs, this
thesis demonstrates that the KwAAs retrieve extra event content about sub-events in
real time for both planned and unplanned events.
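
As a rough illustration of the term-frequency idea behind keyword adaptation, the sketch below counts the hashtags that co-occur with the current keywords in a window of collected tweets and adds the most frequent ones to the next query. It is not the thesis's TF-KwAA: the function name `adapt_keywords`, the parameters `top_k` and `min_count`, and the tie-breaking rule are assumptions made only for this example, and keywords are taken to be hashtag texts without the leading `#`.

```python
import re
from collections import Counter

# Matches the text of a hashtag, without the leading '#'.
HASHTAG_RE = re.compile(r"#(\w+)")


def adapt_keywords(tweets, keywords, top_k=2, min_count=2):
    """Expand a keyword set with frequently co-occurring hashtags.

    Hypothetical helper: `top_k` caps how many new hashtags are added,
    and `min_count` is the minimum number of tweets a hashtag must
    appear in. Both thresholds are illustrative assumptions.
    """
    current = {k.lower() for k in keywords}
    counts = Counter()
    for text in tweets:
        tags = {t.lower() for t in HASHTAG_RE.findall(text)}
        # Only count hashtags that co-occur with a current keyword.
        if tags & current:
            counts.update(tags - current)
    # Rank by count (descending), then alphabetically so ties are stable.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    new_tags = [tag for tag, n in ranked if n >= min_count][:top_k]
    return current | set(new_tags)


window = [
    "Goal! #worldcup #brazil",
    "#WorldCup what a match #brazil",
    "#worldcup #penalty",
    "#worldcup #penalty drama",
    "lunch #food",
]
next_query = adapt_keywords(window, ["worldcup"])
```

In a live crawler, a function of this kind would run on each time window of the stream, and its result would become the tracked keyword set for the next window.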

To examine the usefulness of extra event content for event monitoring, a
Twitter event monitoring solution is proposed. This "Detection of Sub-events by Twitter
Real-time Monitoring (DSTReaM)" framework concurrently runs multiple instances
of a statistical event detection algorithm over different stream components. By
evaluating detection performance using detection accuracy and event entropy, this
research demonstrates that better event detection can be achieved with a broader coverage
of event content.

School of Electronic Engineering and Computer Science (EECS), Queen Mary University of London (QMUL)
China Scholarship Council (CSC)

Description

Real-time Content Identification for Events and Sub-Events from Microblogs.
