group :: l3s | BibSonomy

Bookmarks (7)

What’s New on the Web? The Evolution of the Web from a Search Engine Perspective
http://cs.brown.edu/courses/cs253/papers/www04-ntoulas.pdf
5 years ago, @parismic
Tags: discover, comparability, web, crawl
WebIsaDB LOD
http://webisa.webdatacommons.org/
6 years ago, @hotho
Tags: rdf, dataset, common, hypernym, crawl
Web Data Commons
http://webdatacommons.org/
8 years ago, @hotho
Tags: semantic, rdf, web, dataset, common, data, relations, crawl
Web Data Commons
http://webdatacommons.org/
10 years ago, @jaeschke
Tags: lod, semantic, rdf, web, dataset, commoncrawl, data, microformat, open, crawl, linked
blekko donates search data to Common Crawl | blekko
Blekko Blog | get the Latest Updates On SEO, Search Engines, SEO Tools, SEO Tutorials, SEO techniques, SEO APIs and much more
12 years ago, @jaeschke
Tags: web, dataset, search, crawl, blekko
sitemaps.org - Home
Sitemaps are an easy way for webmasters to inform search engines about pages on their sites that are available for crawling. In its simplest form, a Sitemap is an XML file that lists URLs for a site along with additional metadata about each URL (when it was last updated, how often it usually changes, and how important it is, relative to other URLs in the site) so that search engines can more intelligently crawl the site. Web crawlers usually discover pages from links within the site and from other sites. Sitemaps supplement this data to allow crawlers that support Sitemaps to pick up all URLs in the Sitemap and learn about those URLs using the associated metadata. Using the Sitemap protocol does not guarantee that web pages are included in search engines, but provides hints for web crawlers to do a better job of crawling your site. Sitemap 0.90 is offered under the terms of the Attribution-ShareAlike Creative Commons License and has wide adoption, including support from Google, Yahoo!, and Microsoft.
15 years ago, @jaeschke
Tags: web, metadata, engine, google, search, crawl, sitemap
Stanford Computer Science
http://cs.stanford.edu/research/project.php?id=121
18 years ago, @hotho
Tags: web, dataset, crawl


Publications (21)

CopyCat: Near-Duplicates Within and Between the ClueWeb and the Common Crawl
M. Fröbe, J. Bevendorff, L. Gienapp, M. Völske, B. Stein, M. Potthast, and M. Hagen. Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, ACM, (July 2021)
2 years ago, @jaeschke
Tags: web, detection, common, copycat, duplicate, crawl
Analyzing the Web: Are Top Websites Lists a Good Choice for Research?
T. Alby and R. Jäschke. Proceedings of the International Conference on Theory and Practice of Digital Libraries, pp. 11-25. Cham, Springer, (2022)
2 years ago, @jaeschke
Tags: science, myown, web, tpdl, commoncrawl, archive, 2022, alexa, crawl, research
Where are the Datasets? A case study on the German Academic Web Archive
Y. Younes, S. Tiesler, R. Jäschke, and B. Mathiak. Proceedings of the Web Archiving and Digital Libraries Workshop at JCDL 2022, (2022)
2 years ago, @jaeschke
Tags: myown, german, unknowndata, web, dataset, academic, 2022, gaw, crawl
How to Assess the Exhaustiveness of Longitudinal Web Archives: A Case Study of the German Academic Web
M. Paris and R. Jäschke. Proceedings of the 31st ACM Conference on Hypertext and Social Media, New York, NY, USA, ACM, (2020)
4 years ago, @jaeschke
Tags: myown, german, web, exhaustiveness, academic, archive, 2020, regio, gaw, crawl, longitudinal
A Comparison over Focused Web Crawling Strategies
I. Avraam and I. Anagnostopoulos. 2011 15th Panhellenic Conference on Informatics, pp. 245-249. (September 2011)
5 years ago, @parismic
Tags: focused, strategy, crawl
Data quality in web archiving
M. Spaniol, D. Denev, A. Mazeika, G. Weikum, and P. Senellart. Proceedings of the 3rd workshop on Information credibility on the web - WICOW '09, ACM Press, (2009)
5 years ago, @parismic
Tags: web, coherence, crawl, quality, longitudinal
Focused Web Crawling: A Generic Framework for Specifying the User Interest and for Adaptive Crawling Strategies
M. Ester and H. Kriegel. (2001)
5 years ago, @parismic
Tags: frontier, framework, focused, crawl
Web-crawling reliability
V. Cothey. J. Assoc. Inf. Sci. Technol., (2004)
5 years ago, @parismic
Tags: web, reliability, crawl
Efficient focused crawling based on best first search
S. Rawat and D. Patil. 2013 3rd IEEE International Advance Computing Conference (IACC), pp. 908-911. (February 2013)
5 years ago, @parismic
Tags: focused, exhaustive, crawl
Intelligent crawling on the World Wide Web with arbitrary predicates
C. Aggarwal, F. Al-Garawi, and P. Yu. Proceedings of the tenth international conference on World Wide Web - WWW '01, ACM Press, (2001)
5 years ago, @parismic
Tags: web, focused, crawl
Crawling the Web
G. Pant, P. Srinivasan, and F. Menczer. pp. 153-177. Springer Berlin Heidelberg, Berlin, Heidelberg, (2004)
5 years ago, @parismic
Tags: web, book, scale, crawl
Crawling the Infinite Web: Five Levels Are Enough
R. Baeza-Yates and C. Castillo. WAW, (2004)
5 years ago, @parismic
Tags: level, web, scale, crawl
Determining the Characteristic Vocabulary for a Specialized Dictionary using Word2vec and a Directed Crawler
G. Grefenstette and L. Muchemi. (2016). arXiv:1605.09564.
5 years ago, @parismic
Tags: vocabulary, web, embedding, crawl
Focused crawling: a new approach to topic-specific Web resource discovery
S. Chakrabarti, M. van den Berg, and B. Dom. Computer Networks, 31 (11): 1623-1640 (1999)
5 years ago, @parismic
Tags: web, topic, focused, crawl
Longitudinal trends in academic web links
N. Payne and M. Thelwall. Journal of Information Science, 34 (1): 3-14 (May 2007)
5 years ago, @parismic
Tags: comparability, web, link, crawl, longitudinal
The discoverability of the web
A. Dasgupta, A. Ghosh, R. Kumar, C. Olston, S. Pandey, and A. Tomkins. Proceedings of the 16th international conference on World Wide Web - WWW '07, ACM Press, (2007)
5 years ago, @parismic
Tags: discover, comparability, web, crawl
A Comparison over Focused Web Crawling Strategies
I. Anagnostopoulos and I. Avraam. (2011)
5 years ago, @parismic
Tags: focused, crawl
iCrawl: Improving the Freshness of Web Collections by Integrating Social Web and Focused Web Crawling
G. Gossen, E. Demidova, and T. Risse. Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital Libraries, pp. 75-84. New York, NY, USA, ACM, (2015)
6 years ago, @parismic
Tags: collection, graph, crawl
Focused Crawl of Web Archives to Build Event Collections
M. Klein, L. Balakireva, and H. Van de Sompel. Proceedings of the 10th ACM Conference on Web Science, pp. 333-342. New York, NY, USA, ACM, (2018)
6 years ago, @parismic
Tags: isomorphism, collection, focused, graph, crawl, event
What's Really New on the Web?: Identifying New Pages from a Series of Unstable Web Snapshots
M. Toyoda and M. Kitsuregawa. Proceedings of the 15th International Conference on World Wide Web, pp. 233-241. New York, NY, USA, ACM, (2006)
6 years ago, @parismic
Tags: web, japan, unstable, crawl
Crawl Me Maybe: Iterative Linked Dataset Preservation
B. Fetahu, U. Gadiraju, and S. Dietze. Proceedings of the ISWC 2014 Posters & Demonstrations Track, a track within the 13th International Semantic Web Conference, ISWC 2014, Riva del Garda, Italy, October 21, 2014, pp. 433-436. (2014)
10 years ago, @bfetahu
Tags: myown, dataset, iterative, crawl, linked


L3S Research Center
