Boolean interpretation of conjunctions for document retrieval.
P. Das-Gupta. Journal of the American Society for Information Science, 38 (4): 245-254 (1987)
Abstract
It is generally recognized that the conjunction “and” plays an ambiguous role in natural language. When considered within the domain of Boolean document retrieval, this ambiguity makes the automatic Boolean interpretation of statements representing information needs a difficult task. The human analyst is able to resolve this ambiguity with relative ease. However, the processes employed appear complex and are not well understood. This article examines a semantic property of the conjunction, i.e., the semantic similarity between the conjuncts with a view to automatically resolving this ambiguity. Specifically, the idea examined is that if the two conjuncts are semantically similar then the conjunction is best interpreted as a Boolean OR, otherwise as an AND. The study resulted in an algorithm which utilizes semantic information and some syntactic information (both of which are derivable from a standard dictionary) to obtain the appropriate Boolean interpretation. The algorithm was successful when evaluated against human decisions. In addition to contributing the algorithm, this article draws attention to the effects of this ambiguity on the derivation of appropriate Boolean search specifications from natural-language statements representing information needs.
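The decision rule stated in the abstract (similar conjuncts suggest OR, dissimilar conjuncts suggest AND) can be illustrated with a minimal sketch. This is not the article's algorithm: the category table `SEMANTIC_CATEGORIES`, the Jaccard-overlap `similarity` measure, the `threshold` value, and the function names are all hypothetical stand-ins for the dictionary-derived semantic information the abstract mentions.

```python
# Toy stand-in for dictionary-derived semantic categories.
# Hypothetical data for illustration only; not taken from the article.
SEMANTIC_CATEGORIES = {
    "cats": {"animal", "pet"},
    "dogs": {"animal", "pet"},
    "computers": {"machine", "technology"},
    "education": {"activity", "learning"},
}


def similarity(a, b):
    """Jaccard overlap of the two terms' category sets (an assumed measure)."""
    ca = SEMANTIC_CATEGORIES.get(a.lower(), set())
    cb = SEMANTIC_CATEGORIES.get(b.lower(), set())
    if not ca or not cb:
        # Unknown terms are treated as dissimilar.
        return 0.0
    return len(ca & cb) / len(ca | cb)


def interpret_and(left, right, threshold=0.5):
    """Map the natural-language phrase 'left and right' to a Boolean query."""
    # Similar conjuncts -> OR; dissimilar conjuncts -> AND.
    op = "OR" if similarity(left, right) >= threshold else "AND"
    return f"({left} {op} {right})"


print(interpret_and("cats", "dogs"))            # (cats OR dogs)
print(interpret_and("computers", "education"))  # (computers AND education)
```

Under these assumptions, a request about "cats and dogs" is read as a union of two similar topics, while "computers and education" is read as an intersection of two distinct ones.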
%0 Journal Article
%1 dasgupta1987boolean
%A Das-Gupta, Padmini
%D 1987
%J Journal of the American Society for Information Science
%K 1987 boolean das-gupta document irhhu retrieval
%N 4
%P 245-254
%T Boolean interpretation of conjunctions for document retrieval
%V 38
%X It is generally recognized that the conjunction “and” plays an ambiguous role in natural language. When considered within the domain of Boolean document retrieval, this ambiguity makes the automatic Boolean interpretation of statements representing information needs a difficult task. The human analyst is able to resolve this ambiguity with relative ease. However, the processes employed appear complex and are not well understood. This article examines a semantic property of the conjunction, i.e., the semantic similarity between the conjuncts with a view to automatically resolving this ambiguity. Specifically, the idea examined is that if the two conjuncts are semantically similar then the conjunction is best interpreted as a Boolean OR, otherwise as an AND. The study resulted in an algorithm which utilizes semantic information and some syntactic information (both of which are derivable from a standard dictionary) to obtain the appropriate Boolean interpretation. The algorithm was successful when evaluated against human decisions. In addition to contributing the algorithm, this article draws attention to the effects of this ambiguity on the derivation of appropriate Boolean search specifications from natural-language statements representing information needs.
@article{dasgupta1987boolean,
abstract = {It is generally recognized that the conjunction “and” plays an ambiguous role in natural language. When considered within the domain of Boolean document retrieval, this ambiguity makes the automatic Boolean interpretation of statements representing information needs a difficult task. The human analyst is able to resolve this ambiguity with relative ease. However, the processes employed appear complex and are not well understood. This article examines a semantic property of the conjunction, i.e., the semantic similarity between the conjuncts with a view to automatically resolving this ambiguity. Specifically, the idea examined is that if the two conjuncts are semantically similar then the conjunction is best interpreted as a Boolean OR, otherwise as an AND. The study resulted in an algorithm which utilizes semantic information and some syntactic information (both of which are derivable from a standard dictionary) to obtain the appropriate Boolean interpretation. The algorithm was successful when evaluated against human decisions. In addition to contributing the algorithm, this article draws attention to the effects of this ambiguity on the derivation of appropriate Boolean search specifications from natural-language statements representing information needs.},
added-at = {2011-11-17T11:25:55.000+0100},
author = {Das-Gupta, Padmini},
biburl = {https://www.bibsonomy.org/bibtex/2af5367f951a07777532e4577a84d95e3/junor101},
interhash = {4aba17d35136792b60d2848c82614508},
intrahash = {af5367f951a07777532e4577a84d95e3},
journal = {Journal of the American Society for Information Science},
keywords = {1987 boolean das-gupta document irhhu retrieval},
number = 4,
pages = {245--254},
timestamp = {2011-11-17T11:41:57.000+0100},
title = {Boolean interpretation of conjunctions for document retrieval},
volume = 38,
year = 1987
}