tag :: web archive | BibSonomy

bookmarks (137)

Internet archiving: What remains of the web? – iRights.info
2018. Which parts of the web should be archived for future generations? The German National Library (Deutsche Nationalbibliothek) is currently exploring this question and surveying internet users. In this interview, deputy director Ute Schwens discusses the current state of web archiving and the effects of the new copyright law.
a year ago by @astrupp
archive
crawler
web
Common Crawl - Get Started
Dive into Common Crawl: your guide to accessing vast web data. Start here to harness the web's potential effortlessly.
a year ago by @astrupp
archive
commoncrawl
crawl
web
Home · internetarchive/heritrix3 Wiki · GitHub
This is the public wiki for the Heritrix archival crawler project. Heritrix is the Internet Archive’s open-source, extensible, web-scale, archival-quality web crawler project. Heritrix (sometimes spelled heretrix, or misspelled or mis-said as heratrix/heritix/heretix/heratix) is an archaic word for heiress (woman who inherits).
a year ago by @astrupp
archive
crawl
crawler
web
WACZ Format - ReplayWeb.Page
Serverless Web Archive Replay directly in the browser
2 years ago by @jaeschke
archive
file
format
unknowndata
wacz
warc
web
GitHub - gildas-lormeau/SingleFile: Web Extension for Firefox/Chrome/MS Edge and CLI tool to save a faithful copy of an entire web page in a single HTML file
https://github.com/gildas-lormeau/SingleFile
3 years ago by @bshanks
archive
web
chatnoir-eu/chatnoir-resiliparse: A robust web archive analytics toolkit
A robust web archive analytics toolkit.
3 years ago by @jaeschke
analysis
analytics
archive
code
programming
python
toolkit
warc
web
Wayback Machine APIs | Internet Archive
https://archive.org/help/wayback_api.php
4 years ago by @schmidt2
api
archive
internet_archive
reference
wayback_machine
web
Time Travel
http://timetravel.mementoweb.org/
4 years ago by @schmidt2
archive
memento
reference
search
timetravel
web
archive.today - webpage capture
http://archive.today/
7 years ago by @thtbln
archive
history
web
webtools
www
New Zealand National Library : New Zealand Web Archive
To use the archive, search the National Library Catalogue with Material Type restricted to "Web sites", then click the "view it" tab on the resulting record.
8 years ago by @michiel.verkade
archive
national_library
web
How OpenWayback handles revisit records in WARC files
https://github.com/iipc/openwayback/wiki/How-OpenWayback-handles-revisit-records-in-WARC-files
8 years ago by @jaeschke
archive
duplicate
openwayback
revisit
warc
wayback
web
Archive.is - webpage capture
https://archive.is/
8 years ago by @bshanks
web
archive
tool
capture
util
Internet Archive: Wayback Machine
https://archive.org/web/
9 years ago by @michiel.verkade
USA
allofweb
archive
web
WebCite
An on-demand archiving system for web references (cited webpages and websites, or other kinds of Internet-accessible digital objects), which can be used to ensure that cited web material will remain available to readers in the future.
9 years ago by @dbslibrary
Citation
Reference
archive
research
web
The On-Line Encyclopedia of Integer Sequences (OEIS)
Look up or submit information about a particular integer sequence, and find its name and formula.
9 years ago by @panic
=Reference
Archive
Information
Math
WEB
webarchive-commons/ResourceRecordReader.java at master · internetarchive/webarchive-commons · GitHub
This line needs to be replaced with code that copies the HTTP payload into a byte array and returns it to Pig.
10 years ago by @jaeschke
archive
body
content
payload
record
warc
web
WikiReverse - reverse links to Wikipedia articles
https://wikireverse.org/
10 years ago by @jaeschke
analysis
archive
commoncrawl
link
web
wikipedia
hawarp
http://hawarp.openpreservation.org/
10 years ago by @jaeschke
archive
hawarp
internet
scape
warc
web
What the Web Said Yesterday - The New Yorker
http://www.newyorker.com/magazine/2015/01/26/cobweb
10 years ago by @jaeschke
archive
internet
web
perma.cc
Perma.cc helps scholars, journals and courts create permanent links to the online sources cited in their work.
10 years ago by @jaeschke
archive
bookmark
citation
internet
link
perma
permanent
web

publications (65)

Web Archiving: Issues and Methods
J. Masanès. Springer-Verlag, Berlin, (2006)
a year ago by @astrupp
archive
history
web
Analyzing the Web: Are Top Websites Lists a Good Choice for Research?
T. Alby, and R. Jäschke. Proceedings of the International Conference on Theory and Practice of Digital Libraries, page 11--25. Cham, Springer, (2022)
2 years ago by @jaeschke
2022
alexa
archive
commoncrawl
crawl
myown
research
science
tpdl
web
ArchiveSpark
H. Holzmann, V. Goel, and A. Anand. Proceedings of the 16th ACM/IEEE-CS on Joint Conference on Digital Libraries, ACM, (June 2016)
3 years ago by @jaeschke
archive
archivespark
spark
warc
web
Archiving information from geotagged tweets to promote reproducibility and comparability in social media research
K. Kinder-Kurlanda, K. Weller, W. Zenk-Möltgen, J. Pfeffer, and F. Morstatter. Big Data & Society, 4 (2): 205395171773633 (November 2017)
3 years ago by @jaeschke
archive
tweets
twitter
web
Evaluating dataset creation heuristics for concept detection in web pages using BERT
M. Paris, and R. Jäschke. Proceedings of the 14th International Conference on Knowledge Science, Engineering and Management, volume 12816 of Lecture Notes in Artificial Intelligence, page 1--14. Springer, (2021)
3 years ago by @jaeschke
2021
archive
bert
classification
data
deeplearning
embedding
gaw
learning
machine
ml
myown
network
neural
regio
web
Web archives as a data resource for digital scholars
E. Vlassenroot, S. Chambers, E. Di Pretoro, F. Geeraert, G. Haesendonck, A. Michel, and P. Mechant. International Journal of Digital Humanities, 1 (1): 85--111 (Apr 1, 2019)
4 years ago by @jaeschke
archive
research
scholarly
web
webscience
How to Assess the Exhaustiveness of Longitudinal Web Archives
M. Paris, and R. Jäschke. Proceedings of the 31st ACM Conference on Hypertext and Social Media, ACM, (July 2020)
4 years ago by @parismic
archive
exhaustiveness
myown
web
How to Assess the Exhaustiveness of Longitudinal Web Archives: A Case Study of the German Academic Web
M. Paris, and R. Jäschke. Proceedings of the 31st ACM Conference on Hypertext and Social Media, New York, NY, USA, ACM, (2020)
4 years ago by @jaeschke
2020
academic
archive
crawl
exhaustiveness
gaw
german
longitudinal
myown
regio
web
Tracking entities in web archives
M. Spaniol, and G. Weikum. Proceedings of the 21st international conference companion on World Wide Web - WWW '12 Companion, ACM Press, (2012)
5 years ago by @parismic
archive
entity
tracking
web
2014 not found: a cross-platform approach to retrospective web archiving
A. Ben-David. Internet Histories, 3 (3-4): 316--342 (August 2019)
5 years ago by @parismic
archive
dating
gaza
retrospect
web
Web Archiving: Issues and Methods
J. Masanès. page 1--53. Springer Berlin Heidelberg, Berlin, Heidelberg, (2006)
5 years ago by @parismic
archive
use
web
How much of the web is archived?
S. Ainsworth, A. Alsum, H. SalahEldeen, M. Weigle, and M. Nelson. Proceeding of the 11th annual international ACM/IEEE joint conference on Digital libraries - JCDL '11, ACM Press, (2011)
5 years ago by @parismic
archive
quantity
web
Welcome to the web: The online community of GeoCities during the early years of the World Wide Web
I. Milligan. The Web as History, UCL Press, (2017)
5 years ago by @jaeschke
archive
geocities
history
web
www
Extracting and Aggregating Temporal Events from Text
L. Döhling, and U. Leser. Proceedings of the 23rd International Conference on World Wide Web, page 839--844. New York, NY, USA, ACM, (2014)
5 years ago by @jaeschke
archive
estimate
event
extraction
information
temporal
text
time
web
Predicting Document Creation Times in News Citation Networks
A. Spitz, J. Strötgen, and M. Gertz. Companion Proceedings of the The Web Conference 2018, page 1731--1736. Republic and Canton of Geneva, Switzerland, International World Wide Web Conferences Steering Committee, (2018)
5 years ago by @jaeschke
archive
dating
estimate
news
time
web
Publication Date Prediction Through Reverse Engineering of the Web
L. Ostroumova Prokhorenkova, P. Prokhorenkov, E. Samosvat, and P. Serdyukov. Proceedings of the Ninth ACM International Conference on Web Search and Data Mining, page 123--132. New York, NY, USA, ACM, (2016)
5 years ago by @jaeschke
archive
dating
estimate
time
web
A Collaborative Approach to Research Data Management in a Web Archive Context
H. Huurdeman, and J. Kamps. Research Data Management - A European Perspective, chapter 4, De Gruyter, Berlin, Boston, (2017)
6 years ago by @jaeschke
archive
data
management
research
web
A Large Time-aware Web Graph
P. Boldi, M. Santini, and S. Vigna. SIGIR Forum, 42 (2): 33--38 (November 2008)
7 years ago by @jaeschke
archive
gaw
graph
pulse
web
Temporal Evolution of the UK Web
I. Bordino, P. Boldi, D. Donato, M. Santini, and S. Vigna. 2008 IEEE International Conference on Data Mining Workshops, page 909--918. (December 2008)
7 years ago by @jaeschke
archive
evolution
gaw
graph
pulse
temporal
uk
web
Web Archiving
M. Pennock. 13-01. Digital Preservation Coalition, (March 2013)
8 years ago by @jaeschke
archive
overview
web