Sequential Classification With Empirically Observed Statistics

M. Haghifam, V. Tan, and A. Khisti.
IEEE Transactions on Information Theory, 67 (5): 3095-3113 (May 2021)
DOI: 10.1109/TIT.2021.3059272

Abstract

Motivated by real-world machine learning applications, we consider a statistical classification task in a sequential setting where test samples arrive sequentially. In addition, the generating distributions are unknown and only a set of empirically sampled sequences are available to a decision maker. The decision maker is tasked to classify a test sequence which is known to be generated according to either one of the distributions. In particular, for the binary case, the decision maker wishes to perform the classification task with minimum number of the test samples, so, at each step, she declares that either hypothesis 1 is true, hypothesis 2 is true, or she requests for an additional test sample. We propose a classifier and analyze the type-I and type-II error probabilities. We demonstrate the significant advantage of our sequential scheme compared to an existing non-sequential classifier proposed by Gutman. Finally, we extend our setup and results to the multi-class classification scenario and again demonstrate that the variable-length nature of the problem affords significant advantages as one can achieve the same set of exponents as Gutman's fixed-length setting but without having the rejection option.

BibTeX key: haghifam2021sequential
entry type: article
year: 2021
month: may
journal: IEEE Transactions on Information Theory
number: 5
pages: 3095-3113
volume: 67
issn: 1557-9654
DOI: 10.1109/TIT.2021.3059272
url: https://ieeexplore.ieee.org/document/9354190/

Cite this publication

@article{haghifam2021sequential,
  abstract = {Motivated by real-world machine learning applications, we consider a statistical classification task in a sequential setting where test samples arrive sequentially. In addition, the generating distributions are unknown and only a set of empirically sampled sequences are available to a decision maker. The decision maker is tasked to classify a test sequence which is known to be generated according to either one of the distributions. In particular, for the binary case, the decision maker wishes to perform the classification task with minimum number of the test samples, so, at each step, she declares that either hypothesis 1 is true, hypothesis 2 is true, or she requests for an additional test sample. We propose a classifier and analyze the type-I and type-II error probabilities. We demonstrate the significant advantage of our sequential scheme compared to an existing non-sequential classifier proposed by Gutman. Finally, we extend our setup and results to the multi-class classification scenario and again demonstrate that the variable-length nature of the problem affords significant advantages as one can achieve the same set of exponents as Gutman's fixed-length setting but without having the rejection option.},
  added-at = {2023-04-26T08:15:38.000+0200},
  author = {Haghifam, Mahdi and Tan, Vincent Y. F. and Khisti, Ashish},
  biburl = {https://www.bibsonomy.org/bibtex/257eb67be930aae28cc6237ef6b0de8a1/gdmcbain},
  doi = {10.1109/TIT.2021.3059272},
  interhash = {4769047e05c09203626991c4d9c1a32f},
  intrahash = {57eb67be930aae28cc6237ef6b0de8a1},
  issn = {1557-9654},
  journal = {IEEE Transactions on Information Theory},
  keywords = {62h30-classification-discrimination-cluster-analysis 62l10-sequential-analysis},
  month = may,
  number = 5,
  pages = {3095-3113},
  timestamp = {2023-04-26T08:15:38.000+0200},
  title = {Sequential Classification With Empirically Observed Statistics},
  url = {https://ieeexplore.ieee.org/document/9354190/},
  volume = 67,
  year = 2021
}
