Article

Semantic Quran: A Multilingual Resource for Natural-Language Processing

M. Sherif and A. Ngonga Ngomo.
Semantic Web Journal, (2014)

Abstract

In this paper, we describe the Semantic Quran dataset, a multilingual RDF representation of translations of the Quran. The dataset was created by integrating data from two semi-structured sources and aligning it to an ontology designed to represent multilingual data from hierarchically structured sources. The resulting RDF data covers 43 languages, which are among the most under-represented in the Linked Data Cloud, including Arabic, Amharic and Amazigh. We designed the dataset to be easily usable in natural-language processing applications, with the goal of facilitating the development of knowledge extraction tools for these languages. In particular, the Semantic Quran is compatible with the Natural-Language Interchange Format and contains explicit morpho-syntactic information on the terms used. We present the ontology devised for structuring the data and the transformation rules implemented in our extraction framework. Finally, we detail the link creation process as well as possible usage scenarios for the Semantic Quran dataset.

BibTeX key: SHNG14
entry type: article
year: 2014
journal: Semantic Web Journal
pages: 1-5
volume: XXX
owner: ngongaan
bdsk-url-1: http://www.semantic-web-journal.net/system/files/swj503.pdf
Document: http://www.semantic-web-journal.net/system/files/swj503.pdf
