CSV2RDF: User-Driven CSV to RDF Mass Conversion Framework
I. Ermilov, S. Auer, and C. Stadler. Proceedings of the ISEM '13, September 04 - 06 2013, Graz, Austria, (2013)
Abstract
Governments and public administrations have recently started to publish
large amounts of structured data on the Web, mostly in the form of
tabular data such as CSV files or Excel sheets. Various tools and
projects have been launched aiming at facilitating the lifting of
tabular data to reach semantically structured and linked data. However,
none of these tools supported a truly incremental, pay-as-you-go
data publication and mapping strategy, which enables effort sharing
between data owners, community experts and consumers. In this article,
we present an approach for enabling the user-driven semantic mapping
of large amounts of tabular data. We devise a simple mapping language
for tabular data, which is easy to understand even for casual users,
but expressive enough to cover the vast majority of potential tabular
mapping use cases. We outline a formal approach for mapping tabular
data to RDF. Default mappings are automatically created and can be
revised by the community using a semantic wiki. The mappings are
executed using a sophisticated streaming RDB2RDF conversion. We report
on the deployment of our approach at the Pan-European data portal
PublicData.eu, where we transformed and enriched almost 10,000 datasets
accounting for 7.3 billion triples.
%0 Conference Paper
%1 ermilov-ivan-2013-isem
%A Ermilov, Ivan
%A Auer, Sören
%A Stadler, Claus
%B Proceedings of the ISEM '13, September 04 - 06 2013, Graz, Austria
%D 2013
%K 2013 auer group_aksw iermilov stadler
%T CSV2RDF: User-Driven CSV to RDF Mass Conversion Framework
%U http://svn.aksw.org/papers/2013/ISemantics_CSV2RDF/public.pdf
%X Governments and public administrations have recently started to publish
large amounts of structured data on the Web, mostly in the form of
tabular data such as CSV files or Excel sheets. Various tools and
projects have been launched aiming at facilitating the lifting of
tabular data to reach semantically structured and linked data. However,
none of these tools supported a truly incremental, pay-as-you-go
data publication and mapping strategy, which enables effort sharing
between data owners, community experts and consumers. In this article,
we present an approach for enabling the user-driven semantic mapping
of large amounts of tabular data. We devise a simple mapping language
for tabular data, which is easy to understand even for casual users,
but expressive enough to cover the vast majority of potential tabular
mapping use cases. We outline a formal approach for mapping tabular
data to RDF. Default mappings are automatically created and can be
revised by the community using a semantic wiki. The mappings are
executed using a sophisticated streaming RDB2RDF conversion. We report
on the deployment of our approach at the Pan-European data portal
PublicData.eu, where we transformed and enriched almost 10,000 datasets
accounting for 7.3 billion triples.
@inproceedings{ermilov-ivan-2013-isem,
abstract = {Governments and public administrations have recently started to publish
large amounts of structured data on the Web, mostly in the form of
tabular data such as CSV files or Excel sheets. Various tools and
projects have been launched aiming at facilitating the lifting of
tabular data to reach semantically structured and linked data. However,
none of these tools supported a truly incremental, pay-as-you-go
data publication and mapping strategy, which enables effort sharing
between data owners, community experts and consumers. In this article,
we present an approach for enabling the user-driven semantic mapping
of large amounts of tabular data. We devise a simple mapping language
for tabular data, which is easy to understand even for casual users,
but expressive enough to cover the vast majority of potential tabular
mapping use cases. We outline a formal approach for mapping tabular
data to RDF. Default mappings are automatically created and can be
revised by the community using a semantic wiki. The mappings are
executed using a sophisticated streaming RDB2RDF conversion. We report
on the deployment of our approach at the Pan-European data portal
PublicData.eu, where we transformed and enriched almost 10,000 datasets
accounting for 7.3 billion triples.},
added-at = {2017-01-27T23:28:47.000+0100},
author = {Ermilov, Ivan and Auer, S{\"o}ren and Stadler, Claus},
bdsk-url-1 = {http://svn.aksw.org/papers/2013/ISemantics_CSV2RDF/public.pdf},
biburl = {https://www.bibsonomy.org/bibtex/23bc97cec6ee1214c47991bf4f70f479c/soeren},
booktitle = {Proceedings of the ISEM '13, September 04 - 06 2013, Graz, Austria},
interhash = {af9c6f008700fe18c30c77dbd1770bc6},
intrahash = {3bc97cec6ee1214c47991bf4f70f479c},
keywords = {2013 auer group_aksw iermilov stadler},
timestamp = {2017-01-27T23:30:12.000+0100},
title = {CSV2RDF: User-Driven CSV to RDF Mass Conversion Framework},
url = {http://svn.aksw.org/papers/2013/ISemantics_CSV2RDF/public.pdf},
year = 2013
}