A critique of software defect prediction models

N. Fenton and M. Neil.
Software Engineering, IEEE Transactions on, 25 (5): 675--689 (06.09.1999)
DOI: 10.1109/32.815326

Abstract

Many organizations want to predict the number of defects (faults) in software systems, before they are deployed, to gauge the likely delivered quality and maintenance effort. To help in this numerous software metrics and statistical models have been developed, with a correspondingly large literature. We provide a critical review of this literature and the state-of-the-art. Most of the wide range of prediction models use size and complexity metrics to predict defects. Others are based on testing data, the “quality” of the development process, or take a multivariate approach. The authors of the models have often made heroic contributions to a subject otherwise bereft of empirical studies. However, there are a number of serious theoretical and practical problems in many studies. The models are weak because of their inability to cope with the, as yet, unknown relationship between defects and failures. There are fundamental statistical and data quality problems that undermine model validity. More significantly many prediction models tend to model only part of the underlying problem and seriously misspecify it. To illustrate these points the Goldilock's Conjecture, that there is an optimum module size, is used to show the considerable problems inherent in current defect prediction approaches. Careful and considered analysis of past and new results shows that the conjecture lacks support and that some models are misleading. We recommend holistic models for software defect prediction, using Bayesian belief networks, as alternative approaches to the single-issue models used at present. We also argue for research into a theory of “software decomposition” in order to test hypotheses about defect introduction and help construct a better science of software engineering.

BibTeX key: Fenton1999Critique
entry type: article
address: Piscataway, NJ, USA
year: 1999
month: sep
day: 06
institution: Centre for Software Reliability, London, UK
journal: Software Engineering, IEEE Transactions on
number: 5
pages: 675--689
publisher: IEEE
volume: 25
citeulike-article-id: 356161
citeulike-linkout-2: http://dx.doi.org/10.1109/32.815326
citeulike-linkout-1: http://doi.ieeecomputersociety.org/10.1109/32.815326
citeulike-linkout-3: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=815326
priority: 2
posted-at: 2009-05-05 12:33:40
issn: 0098-5589
citeulike-linkout-0: http://portal.acm.org/citation.cfm?id=325401
DOI: 10.1109/32.815326
url: http://dx.doi.org/10.1109/32.815326


Cite this publication

%0 Journal Article
%1 Fenton1999Critique
%A Fenton, N. E.
%A Neil, M.
%C Piscataway, NJ, USA
%D 1999
%I IEEE
%J Software Engineering, IEEE Transactions on
%K defect_prediction, software
%N 5
%P 675--689
%R 10.1109/32.815326
%T A critique of software defect prediction models
%U http://dx.doi.org/10.1109/32.815326
%V 25
%X Many organizations want to predict the number of defects (faults) in software systems, before they are deployed, to gauge the likely delivered quality and maintenance effort. To help in this numerous software metrics and statistical models have been developed, with a correspondingly large literature. We provide a critical review of this literature and the state-of-the-art. Most of the wide range of prediction models use size and complexity metrics to predict defects. Others are based on testing data, the “quality” of the development process, or take a multivariate approach. The authors of the models have often made heroic contributions to a subject otherwise bereft of empirical studies. However, there are a number of serious theoretical and practical problems in many studies. The models are weak because of their inability to cope with the, as yet, unknown relationship between defects and failures. There are fundamental statistical and data quality problems that undermine model validity. More significantly many prediction models tend to model only part of the underlying problem and seriously misspecify it. To illustrate these points the Goldilock's Conjecture, that there is an optimum module size, is used to show the considerable problems inherent in current defect prediction approaches. Careful and considered analysis of past and new results shows that the conjecture lacks support and that some models are misleading. We recommend holistic models for software defect prediction, using Bayesian belief networks, as alternative approaches to the single-issue models used at present. We also argue for research into a theory of “software decomposition” in order to test hypotheses about defect introduction and help construct a better science of software engineering.
