2018. Which parts of the web should be archived for future generations? The German National Library (Deutsche Nationalbibliothek) is currently exploring this question and surveying internet users. In this interview, deputy director Ute Schwens discusses the current state of web archiving and the effects of the new copyright law.
This is the public wiki for the Heritrix archival crawler project. Heritrix is the Internet Archive’s open-source, extensible, web-scale, archival-quality web crawler project. Heritrix (sometimes spelled heretrix, or misspelled or mis-said as heratrix/heritix/heretix/heratix) is an archaic word for heiress (woman who inherits).
To use this, search the National Library Catalogue. Restrict the search to Material Type: Web sites, then click the "view it" tab on the resulting record.
An on-demand archiving system for webreferences (cited webpages and websites, or other kinds of Internet-accessible digital objects), which can be used to ensure that cited webmaterial remains available to readers in the future.
M. Paris, and R. Jäschke. Proceedings of the 14th International Conference on Knowledge Science, Engineering and Management, volume 12816 of Lecture Notes in Artificial Intelligence, page 1--14. Springer, (2021)
M. Spaniol, and G. Weikum. Proceedings of the 21st international conference companion on World Wide Web - WWW '12 Companion, ACM Press, (2012)
A. Spitz, J. Strötgen, and M. Gertz. Companion Proceedings of the The Web Conference 2018, page 1731--1736. Republic and Canton of Geneva, Switzerland, International World Wide Web Conferences Steering Committee, (2018)