Two-pass technique for clone detection and type classification using tree-based convolution neural network

  • Young Bin Jo
  • Jihyun Lee*
  • Cheol Jung Yoo

  *Corresponding author for this work

    Research output: Contribution to journal › Journal article › peer-review

    Abstract

    Appropriate reliance on code clones significantly reduces development costs and hastens the development process. Reckless cloning, in contrast, reduces code quality and ultimately adds costs and time. To avoid this scenario, many researchers have proposed methods for clone detection and refactoring. The developed techniques, however, are only reliably capable of detecting clones that are either entirely identical or that only use modified identifiers, and do not provide clone-type information. This paper proposes a two-pass clone classification technique that uses a tree-based convolution neural network (TBCNN) to detect multiple clone types, including clones that are not wholly identical or to which only small changes have been made, and automatically classify them by type. Our method was validated with BigCloneBench, a well-known and widely used dataset of cloned code. The experimental results show that our technique detected clones with an average recall and precision of 96%, and classified clones by type with an average recall and precision of 78%.
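    As a rough illustration of the two-pass idea and of the recall and precision figures quoted in the abstract, the following minimal Python sketch shows one plausible reading of the control flow: a first pass decides whether a pair of fragments is a clone, and a second pass assigns a type only to detected clones. It is not the authors' implementation. The names `two_pass`, `toy_is_clone`, `toy_clone_type`, and `precision_recall` are hypothetical, and the toy token-overlap rules merely stand in for the TBCNN models described in the paper.

```python
from typing import Callable, Optional

# Hypothetical stand-ins: the paper uses TBCNN models for both passes.
# Plain callables are used here so that the control flow can be run.
Detector = Callable[[str, str], bool]
TypeClassifier = Callable[[str, str], str]


def two_pass(a: str, b: str, is_clone: Detector,
             clone_type: TypeClassifier) -> Optional[str]:
    """Pass 1 decides clone / not clone; pass 2 runs only on detected clones."""
    if not is_clone(a, b):
        return None
    return clone_type(a, b)


def toy_is_clone(a: str, b: str) -> bool:
    """Assumed toy detector: token-set Jaccard similarity of at least 0.5."""
    ta, tb = set(a.split()), set(b.split())
    union = ta | tb
    return bool(union) and len(ta & tb) / len(union) >= 0.5


def toy_clone_type(a: str, b: str) -> str:
    """Assumed toy classifier: identical after whitespace normalization or not."""
    same = " ".join(a.split()) == " ".join(b.split())
    return "identical" if same else "non-identical"


def precision_recall(predicted: list[bool],
                     actual: list[bool]) -> tuple[float, float]:
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(a and not p for p, a in zip(predicted, actual))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

    Under these assumptions, `two_pass("a = 1", "a  =  1", toy_is_clone, toy_clone_type)` yields a type label, while a pair with no shared tokens is rejected in the first pass and never reaches the classifier.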

    Original language: English
    Article number: 6613
    Journal: Applied Sciences (Switzerland)
    Volume: 11
    Issue number: 14
    DOIs
    State: Published - 2021.07.2

    Keywords

    • Clone detection
    • Clone type classification
    • Clone-and-own approach
    • CNN
    • TBCNN

    Quacquarelli Symonds (QS) Subject Topics

    • Materials Science
    • Computer Science & Information Systems
    • Engineering - Petroleum
    • Data Science
    • Engineering - Chemical
    • Physics & Astronomy
