Abstract
Social networks are widely used for information consumption and
dissemination, especially during time-critical events such as natural
disasters. Despite its large volume, social media content is often too
noisy for direct use in downstream applications. Therefore, it is important
to filter, categorize, and concisely summarize the available content to
facilitate effective consumption and decision-making. To address these
issues, automatic classification systems have been developed using
supervised modeling approaches, enabled by earlier efforts to create
labeled datasets.
However, existing datasets are limited in several respects (e.g., small
size, presence of duplicates) and are less suited to supporting more
advanced, data-hungry deep learning models. In this paper, we present a new large-scale dataset with
~77K human-labeled tweets, sampled from a pool of ~24 million tweets across 19
disaster events that occurred between 2016 and 2019. Moreover, we propose a
data collection and sampling pipeline for selecting social media data for
human annotation. We report multiclass classification results using
classical and deep learning (fastText and transformer-based) models to
establish baselines for future studies. The dataset and associated
resources are publicly available at https://crisisnlp.qcri.org/humaid_dataset.html.
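To make the classification setup concrete, below is a minimal sketch of a
fastText multiclass baseline, assuming the labeled tweets have been exported
to fastText's "__label__<class> <text>" file format; the file paths,
hyperparameters, and example label here are illustrative assumptions, not
the paper's exact configuration.

    import fasttext

    # Each line of train.txt pairs a class label with a tweet, e.g.:
    #   __label__donation_effort Volunteers needed to sort supplies ...
    model = fasttext.train_supervised(
        input="train.txt",  # hypothetical path to the labeled training split
        lr=0.5,             # learning rate
        epoch=25,           # passes over the training data
        wordNgrams=2,       # include bigram features alongside unigrams
    )

    # Evaluate on a held-out split: returns (num_examples, precision@1, recall@1)
    print(model.test("test.txt"))

    # Predict the most likely class for a single (hypothetical) tweet
    labels, probs = model.predict("Donate blood for the earthquake victims")
    print(labels[0], probs[0])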