Inproceedings

WISHFUL - Website Extraction of Institutional Sources with Heterogeneous Factors and User-Driven Linkage

Information Integration and Web Intelligence: 25th International Conference, IiWAS 2023, Denpasar, Bali, Indonesia, December 4–6, 2023, Proceedings, pages 20–26. Berlin, Heidelberg, Springer-Verlag, (2023)
DOI: 10.1007/978-3-031-48316-5_3

Abstract

Extracting information from diverse websites is increasingly important, especially for analyzing vast data sets to detect trends and gain insights. By studying job ads, researchers can monitor shifts in employer demand, helping policymakers support affected workers and industries. However, extraction faces challenges such as varied website formats, dynamic content, and duplicate data. This study introduces a method for extracting data from diverse private university websites that involves keyword identification, website categorization, and extraction pipelines.
