An effective approach to improve the performance of eCPDP (early cross-project defect prediction) via data-transformation and parameter optimization

  • Sunjae Kwon
  • Duksan Ryu
  • Jongmoon Baik*

  *Corresponding author for this work

    Research output: Contribution to journal; Journal article; peer-review

    Abstract

    Cross-project defect prediction (CPDP) uses data from other, finished projects (i.e., source projects) to predict defects in the project currently under development (i.e., the target project). Transfer learning (TL) has mainly been applied to CPDP to improve prediction performance by alleviating the discrepancy in data distribution between projects. However, existing TL-based CPDP techniques are not applicable at the unit testing phase because they require the target project’s entire historical data. As a result, they miss the chance to increase the product’s reliability early by acting on the prediction results. The objective of the present study is to increase the product’s reliability in the early phase by proposing a novel TL-based CPDP technique that is applicable at the unit testing phase (i.e., eCPDP). We use singular value decomposition (SVD), which requires only source project data for TL. eCPDP performs similarly to or better than 8 state-of-the-art TL-based CPDP techniques on 9 performance metrics over 24 projects. In conclusion, we show that (1) eCPDP is a CPDP model applicable at the unit testing phase, and (2) it can help practitioners find and fix defects in an earlier phase than other TL-based CPDP techniques.
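
The central idea in the abstract, a transformation learned by SVD from source project data alone, can be illustrated with a minimal sketch. This is not the authors' eCPDP implementation: the helper names `fit_svd_projection` and `transform`, the choice to mean-center, the number of retained components, and the synthetic random data are all assumptions made for illustration.

```python
import numpy as np

def fit_svd_projection(source_X, n_components):
    """Hypothetical helper: learn a projection from source-project data only."""
    mean = source_X.mean(axis=0)
    # Thin SVD of the centered source matrix; rows of vt are right singular vectors.
    _, _, vt = np.linalg.svd(source_X - mean, full_matrices=False)
    return mean, vt[:n_components]

def transform(X, mean, components):
    """Project any feature matrix with the source-derived mean and components."""
    return (X - mean) @ components.T

# Synthetic stand-ins (assumption): 200 source modules and 5 newly written
# target modules, each described by 6 software metrics.
rng = np.random.default_rng(0)
source_X = rng.normal(size=(200, 6))
target_X = rng.normal(size=(5, 6))

# Fitting touches only source_X, so no historical target data is needed.
mean, components = fit_svd_projection(source_X, n_components=3)
source_Z = transform(source_X, mean, components)
target_Z = transform(target_X, mean, components)
```

Because the fitting step never reads `target_X`, the same projection can be applied to target modules as soon as they exist, which is the property that would make such a transformation usable at the unit testing phase. A classifier trained on `source_Z` could then score `target_Z`.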

    Original language: English
    Pages (from-to): 1009-1044
    Number of pages: 36
    Journal: Software Quality Journal
    Volume: 31
    Issue number: 4
    State: Published - 2023.12

    Keywords

    • CPDP
    • SVD
    • Transfer learning
    • Unit testing phase

    Quacquarelli Symonds (QS) Subject Topics

    • Computer Science & Information Systems
    • Engineering - Petroleum
