More and more websites have started to embed structured data describing products, people, organizations, places, and events into their HTML pages using markup standards such as Microdata, JSON-LD, RDFa, and Microformats. The Web Data Commons project extracts this data from several billion web pages. So far the project provides 11 different data set releases extracted from the Common Crawls 2010 to 2022. The project provides the extracted data for download and publishes statistics about the deployment of the different formats.
P. Ziegler, K. Dittrich, and E. Hunt. Proceedings of the 2008 IEEE 24th International Conference on Data Engineering Workshop, page 250-253. Washington, DC, USA, IEEE Computer Society, (2008)
P. Heim, S. Lohmann, D. Tsendragchaa, and T. Ertl. Proceedings of the 7th International Conference on Semantic Systems (I-SEMANTICS 2011), page 175-178. New York, NY, USA, ACM, (2011)
F. Abel, N. Henze, and D. Krause. Web Information Systems and Technologies, 4th International Conference, WEBIST 2008, Funchal, Madeira, Portugal, May 4-7, 2008, Revised Selected Papers, volume 18 of Lecture Notes in Business Information Processing, page 199-213. Springer, (2008)
B. Berendt, A. Hotho, and G. Stumme. Web Semantics: Science, Services and Agents on the World Wide Web, 8 (2-3): 95-96 (2010). Bridging the Gap--Data Mining and Social Network Analysis for Integrating Semantic Web and Web 2.0; The Future of Knowledge Dissemination: The Elsevier Grand Challenge for the Life Sciences.
S. Tramp, P. Frischmuth, T. Ermilov, and S. Auer. Proceedings of the EKAW 2010 - Knowledge Engineering and Knowledge Management by the Masses; 11th October-15th October 2010 - Lisbon, Portugal, volume 6317 of Lecture Notes in Artificial Intelligence, page 135-149. Berlin / Heidelberg, Springer, (October 2010)
M. Atzmueller, S. Beer, and F. Puppe. Proc. 22nd International Florida Artificial Intelligence Research Society Conference (FLAIRS), page 372-377. AAAI Press, (2009)
M. Atzmueller, F. Puppe, and H. Buscher. Proc. 19th International Joint Conference on Artificial Intelligence (IJCAI-05), page 647-652. Edinburgh, Scotland, (2005)
M. Atzmueller, F. Puppe, and H. Buscher. Proc. 10th International Workshop on Intelligent Data Analysis in Medicine and Pharmacology (IDAMAP-2005), page 46-51. Aberdeen, Scotland, (2005)
M. Atzmueller, J. Baumeister, and F. Puppe. Proc. 15th Intl. Conference on Applications of Declarative Programming and Knowledge Management (INAP 2004), page 203-213. Potsdam, Germany, (2004)
M. Atzmueller, J. Baumeister, and F. Puppe. Artificial Intelligence in Medicine. Special Issue on Intelligent Data Analysis in Medicine, 37 (1): 19-30 (2006)
M. Atzmueller, J. Baumeister, and F. Puppe. Medical Data Analysis, Proc. 4th Intl. Symposium on Medical Data Analysis (ISMDA 2003), LNCS 2868, page 23-30. (2003)
J. Baumeister, M. Atzmueller, and F. Puppe. Advances in Case-Based Reasoning, volume 2416 of LNAI, page 28-42. (2002). Proc. 6th European Conference on Case-Based Reasoning (ECCBR 2002).
M. Atzmueller and F. Puppe. Proc. 15th International Conference on Knowledge Engineering and Knowledge Management (EKAW 2006), 4248, page 318-325. (2006)
T. Groza, S. Handschuh, K. Möller, and S. Decker. Proceedings of the 5th European Semantic Web Conference. Berlin, Heidelberg, Springer Verlag, (June 2008)
P. Lyngbaek and V. Vianu. Proceedings of the 12th Annual ACM Conference on the Management of Data, page 132-142. San Francisco, California, (May 1987)