Abstract
Retrieving topically relevant passages from a very large document collection is of central importance to many information retrieval tasks, particularly Question Answering (QA). Passage Retrieval (PR) is a longstanding problem in QA that has been widely studied over the last decades, yet it still requires further effort to give users a better chance of finding a relevant answer to a question posed in natural language. This paper describes a successful attempt to improve PR and ranking for open-domain QA by identifying the passage most relevant to a given question. Our approach uses a support vector machine (SVM) model whose features are a set of text similarity measures. These include our newly proposed n-gram-based metric, which relies on the dependency degree of the question's n-gram words in the passage, as well as other lexical and semantic features that have already proven successful in a recent Semantic Textual Similarity (STS) task. We implemented a system named PRSYS to validate our approach in different languages. Our experimental evaluations show performance comparable to that of other similar, strong-performing systems.