Article in conference proceedings

Hierarchical Web-Page Clustering via In-Page and Cross-Page Link Structures

C. Lin, Y. Yu, J. Han, and B. Liu.
Advances in Knowledge Discovery and Data Mining, pages 222--229. Berlin, Heidelberg, Springer Berlin Heidelberg, (2010)

Abstract

Despite the wide diversity of web-pages, web-pages residing in a particular organization are, in most cases, organized with semantically hierarchic structures. For example, the website of a computer science department contains pages about its people, courses and research, among which pages of people are categorized into faculty, staff and students, and pages of research diversify into different areas. Uncovering such hierarchic structures could give users a convenient means of comprehensive navigation and accelerate other web mining tasks. In this study, we extract a similarity matrix among pages via in-page and cross-page link structures. Based on this matrix, we develop a density-based clustering algorithm that hierarchically groups densely linked web-pages into semantic clusters. Our experiments show that this method is efficient and effective, and sheds light on mining and exploring web structures.
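
The abstract does not give the similarity measure or the clustering procedure, so the following is only a rough sketch of the general idea, not the authors' method. It assumes a Jaccard similarity over cross-page link neighborhoods (in-page structure is ignored) and stands in for density-based clustering with connected components at increasing similarity thresholds. The toy site and all function names (build_neighbors, link_similarity, clusters_at, hierarchy) are hypothetical.

```python
from itertools import combinations


def build_neighbors(links):
    """Undirected link neighborhood of each page, including the page itself."""
    neighbors = {page: {page} for page in links}
    for page, targets in links.items():
        for target in targets:
            neighbors.setdefault(target, {target})
            neighbors[page].add(target)
            neighbors[target].add(page)
    return neighbors


def link_similarity(neighbors):
    """Assumed measure: Jaccard overlap of two pages' link neighborhoods."""
    sim = {}
    for a, b in combinations(sorted(neighbors), 2):
        shared = neighbors[a] & neighbors[b]
        union = neighbors[a] | neighbors[b]
        sim[(a, b)] = len(shared) / len(union)
    return sim


def clusters_at(pages, sim, threshold):
    """Group pages connected by pairs whose similarity is >= threshold."""
    parent = {p: p for p in pages}

    def find(p):
        # Union-find lookup with path halving.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for (a, b), s in sim.items():
        if s >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for p in pages:
        groups.setdefault(find(p), set()).add(p)
    return sorted(sorted(g) for g in groups.values())


def hierarchy(links, thresholds):
    """Clusters per threshold; higher thresholds give finer clusters."""
    neighbors = build_neighbors(links)
    sim = link_similarity(neighbors)
    pages = sorted(neighbors)
    return {t: clusters_at(pages, sim, t) for t in thresholds}


# Hypothetical toy site modeled on the department example in the abstract.
links = {
    "home": ["people", "research"],
    "people": ["faculty", "students"],
    "faculty": ["people", "students"],
    "students": ["people", "faculty"],
    "research": ["ai", "db"],
    "ai": ["research", "db"],
    "db": ["research", "ai"],
}

levels = hierarchy(links, [0.3, 0.7, 0.9])
for threshold, clusters in levels.items():
    print(threshold, clusters)
```

Raising the threshold splits the single site-wide cluster into a people group and a research group, and then into their most tightly linked sub-pages, which mirrors the nested structure the abstract describes.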

BibTeX key: 10.1007/978-3-642-13672-6_22
Entry type: inproceedings
Address: Berlin, Heidelberg
Book title: Advances in Knowledge Discovery and Data Mining
Year: 2010
Pages: 222--229
Publisher: Springer Berlin Heidelberg
ISBN: 978-3-642-13672-6
