Recommending Software Artifacts from Repository Transactions
J. David. New Frontiers in Applied Artificial Intelligence (2008)
Abstract
The central problem addressed by this interdisciplinary paper is to predict related software artifacts that are usually changed together by a developer. The working focus of programmers is revealed by means of their interactions with a software repository that receives a set of cohesive artifact changes within one commit transaction. This implicit knowledge of interdependent changes can be exploited in order to recommend likely further changes, given a set of already changed artifacts. We suggest a hybrid approach based on Latent Semantic Indexing (LSI) and machine learning methods to recommend software development artifacts, that is predicting a sequence of configuration items that were committed together. As opposed to related approaches to repository mining that are mostly based on symbolic methods like Association Rule Mining (ARM), our connectionist method is able to generalize onto unseen artifacts. Text analysis methods are employed to consider their textual attributes. We applied our technique to three publicly available datasets from the PROMISE Repository of Software Engineering Databases. The evaluation showed that the connectionist LSI-approach achieves a significantly higher recommendation accuracy than existing methods based on ARM. Even when generalizing onto unseen artifacts, our approach still provides an accuracy of up to 72.7% on the given datasets.
%0 Journal Article
%1 paper:david:2008
%A David, Joern
%D 2008
%J New Frontiers in Applied Artificial Intelligence
%K machine-learning neural-networks recommendation requirements
%P 189--198
%T Recommending Software Artifacts from Repository Transactions
%U http://dx.doi.org/10.1007/978-3-540-69052-8_20
%X The central problem addressed by this interdisciplinary paper is to predict related software artifacts that are usually changed together by a developer. The working focus of programmers is revealed by means of their interactions with a software repository that receives a set of cohesive artifact changes within one commit transaction. This implicit knowledge of interdependent changes can be exploited in order to recommend likely further changes, given a set of already changed artifacts. We suggest a hybrid approach based on Latent Semantic Indexing (LSI) and machine learning methods to recommend software development artifacts, that is predicting a sequence of configuration items that were committed together. As opposed to related approaches to repository mining that are mostly based on symbolic methods like Association Rule Mining (ARM), our connectionist method is able to generalize onto unseen artifacts. Text analysis methods are employed to consider their textual attributes. We applied our technique to three publicly available datasets from the PROMISE Repository of Software Engineering Databases. The evaluation showed that the connectionist LSI-approach achieves a significantly higher recommendation accuracy than existing methods based on ARM. Even when generalizing onto unseen artifacts, our approach still provides an accuracy of up to 72.7% on the given datasets.
@article{paper:david:2008,
abstract = {The central problem addressed by this interdisciplinary paper is to predict related software artifacts that are usually changed together by a developer. The working focus of programmers is revealed by means of their interactions with a software repository that receives a set of cohesive artifact changes within one commit transaction. This implicit knowledge of interdependent changes can be exploited in order to recommend likely further changes, given a set of already changed artifacts. We suggest a hybrid approach based on Latent Semantic Indexing (LSI) and machine learning methods to recommend software development artifacts, that is predicting a sequence of configuration items that were committed together. As opposed to related approaches to repository mining that are mostly based on symbolic methods like Association Rule Mining (ARM), our connectionist method is able to generalize onto unseen artifacts. Text analysis methods are employed to consider their textual attributes. We applied our technique to three publicly available datasets from the PROMISE Repository of Software Engineering Databases. The evaluation showed that the connectionist LSI-approach achieves a significantly higher recommendation accuracy than existing methods based on ARM. Even when generalizing onto unseen artifacts, our approach still provides an accuracy of up to 72.7% on the given datasets.},
added-at = {2009-08-12T11:27:22.000+0200},
author = {David, Joern},
biburl = {https://www.bibsonomy.org/bibtex/2bc1e8719c1ea0c762e5fd1a04e89042e/mschuber},
description = {SpringerLink - Book Chapter},
interhash = {8ab92821ba781388c5a44dc1de655780},
intrahash = {bc1e8719c1ea0c762e5fd1a04e89042e},
journal = {New Frontiers in Applied Artificial Intelligence},
keywords = {machine-learning neural-networks recommendation requirements},
pages = {189--198},
timestamp = {2009-08-12T11:27:22.000+0200},
title = {Recommending Software Artifacts from Repository Transactions},
url = {http://dx.doi.org/10.1007/978-3-540-69052-8_20},
year = 2008
}