Information Extraction methods can be used to automatically
"fill-in" database forms from unstructured
data such as Web documents or email. State-of-the-art
methods have achieved low error rates but invariably
make a number of errors. The goal of an interactive
information extraction system is to assist the user in filling
in database fields while giving the user confidence
in the integrity of the data. The user is presented with
an interactive interface that allows both the rapid verification
of automatic field assignments and the correction
of errors. In cases where there are multiple errors, our
system takes into account user corrections, and immediately
propagates these constraints such that other fields
are often corrected automatically.
Linear-chain conditional random fields (CRFs) have
been shown to perform well for information extraction
and other language modelling tasks due to their ability
to capture arbitrary, overlapping features of the input in
a Markov model. We apply this framework with two extensions:
a constrained Viterbi decoding which finds the
optimal field assignments consistent with the fields explicitly
specified or corrected by the user; and a mechanism
for estimating the confidence of each extracted
field, so that low-confidence extractions can be highlighted.
Both of these mechanisms are incorporated in a
novel user interface for form filling that is intuitive and
speeds the entry of data, providing a 23% reduction in
error due to automated corrections.
%0 Conference Paper
%1 kristjansson2004interactive
%A Kristjansson, Trausti T.
%A Culotta, Aron
%A Viola, Paul A.
%A McCallum, Andrew
%B AAAI
%D 2004
%E McGuinness, Deborah L.
%E Ferguson, George
%I AAAI Press/The MIT Press
%K crf extraction ie information
%P 412--418
%T Interactive Information Extraction with Constrained Conditional Random Fields
%U http://dblp.uni-trier.de/db/conf/aaai/aaai2004.html#KristjanssonCVM04
%X Information Extraction methods can be used to automatically
"fill-in" database forms from unstructured
data such as Web documents or email. State-of-the-art
methods have achieved low error rates but invariably
make a number of errors. The goal of an interactive
information extraction system is to assist the user in filling
in database fields while giving the user confidence
in the integrity of the data. The user is presented with
an interactive interface that allows both the rapid verification
of automatic field assignments and the correction
of errors. In cases where there are multiple errors, our
system takes into account user corrections, and immediately
propagates these constraints such that other fields
are often corrected automatically.
Linear-chain conditional random fields (CRFs) have
been shown to perform well for information extraction
and other language modelling tasks due to their ability
to capture arbitrary, overlapping features of the input in
a Markov model. We apply this framework with two extensions:
a constrained Viterbi decoding which finds the
optimal field assignments consistent with the fields explicitly
specified or corrected by the user; and a mechanism
for estimating the confidence of each extracted
field, so that low-confidence extractions can be highlighted.
Both of these mechanisms are incorporated in a
novel user interface for form filling that is intuitive and
speeds the entry of data, providing a 23% reduction in
error due to automated corrections.
%@ 0-262-51183-5
@inproceedings{kristjansson2004interactive,
abstract = {Information Extraction methods can be used to automatically
"fill-in" database forms from unstructured
data such as Web documents or email. State-of-the-art
methods have achieved low error rates but invariably
make a number of errors. The goal of an interactive
information extraction system is to assist the user in filling
in database fields while giving the user confidence
in the integrity of the data. The user is presented with
an interactive interface that allows both the rapid verification
of automatic field assignments and the correction
of errors. In cases where there are multiple errors, our
system takes into account user corrections, and immediately
propagates these constraints such that other fields
are often corrected automatically.
Linear-chain conditional random fields (CRFs) have
been shown to perform well for information extraction
and other language modelling tasks due to their ability
to capture arbitrary, overlapping features of the input in
a Markov model. We apply this framework with two extensions:
a constrained Viterbi decoding which finds the
optimal field assignments consistent with the fields explicitly
specified or corrected by the user; and a mechanism
for estimating the confidence of each extracted
field, so that low-confidence extractions can be highlighted.
Both of these mechanisms are incorporated in a
novel user interface for form filling that is intuitive and
speeds the entry of data, providing a 23\% reduction in
error due to automated corrections.},
added-at = {2012-07-05T15:31:08.000+0200},
author = {Kristjansson, Trausti T. and Culotta, Aron and Viola, Paul A. and McCallum, Andrew},
biburl = {https://www.bibsonomy.org/bibtex/2fe6cb1dbef3216852a63a625a30799d6/jaeschke},
booktitle = {AAAI},
editor = {McGuinness, Deborah L. and Ferguson, George},
interhash = {89fe7fe6ef4c088b10d3b0b0aabeaf46},
intrahash = {fe6cb1dbef3216852a63a625a30799d6},
isbn = {0-262-51183-5},
keywords = {crf extraction ie information},
pages = {412--418},
publisher = {AAAI Press/The MIT Press},
timestamp = {2014-07-28T15:57:31.000+0200},
title = {Interactive Information Extraction with Constrained Conditional Random Fields},
url = {http://dblp.uni-trier.de/db/conf/aaai/aaai2004.html#KristjanssonCVM04},
year = 2004
}