GMiner++: Boosting GPU-based frequent itemset mining by reducing redundant computations

  • Kang Wook Chon
  • Chanki Kim*

  *Corresponding author for this work

    Research output: Contribution to journal, Journal article, peer-review

    Abstract

    Frequent itemset mining (FIM) is an increasingly important fundamental data mining technique. However, the applicability of existing FIM methods is limited, mainly by their performance. Although numerous efficient single-threaded FIM methods have been proposed, the performance improvement they can deliver is limited because they exploit only a single thread. To overcome this shortcoming, numerous parallel FIM methods have been devised using graphics processing units (GPUs) or multicore central processing units (CPUs). However, when extracting patterns from large amounts of data, multi-threaded FIM methods exhibit a performance tendency similar to that of single-threaded FIM methods because of their large memory footprints and computational cost. Hence, we propose GMiner++, a memory-efficient GPU-based FIM method that can use several GPUs. We propose sub-databases of the same size, called bit array blocks, which contain pre-calculated bit arrays of F1∪P(IK). These bit arrays are repeatedly exploited during mining tasks using an elegant probabilistic model. GMiner++ can obtain frequent patterns and use several GPUs using only the bit array blocks and the occurrence update scheme. The proposed method reduces redundant computations by reusing the pre-calculated bit arrays in the bit array blocks. In addition, GMiner++ does not create intermediate data during mining tasks, which increases robustness and reduces memory footprints. Simulation results demonstrate that GMiner++ outperforms existing FIM methods in performance and scalability, with increased robustness.
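
The role of pre-calculated bit arrays can be illustrated with a minimal CPU-side sketch in Python. This is not the GMiner++ implementation: it leaves out bit array blocks, the occurrence update scheme, the probabilistic model, and all GPU code. The function names (`build_bit_arrays`, `support`, `frequent_itemsets`) and the brute-force candidate enumeration are assumptions made for illustration only. The sketch shows just the general bitmap technique, in which each item's bit array is computed once and then reused, so that the support of any candidate itemset is a bitwise AND followed by a population count.

```python
from itertools import combinations


def build_bit_arrays(transactions):
    """Map each item to an int whose bit t is set when transaction t contains the item."""
    bit_arrays = {}
    for tid, transaction in enumerate(transactions):
        for item in transaction:
            bit_arrays[item] = bit_arrays.get(item, 0) | (1 << tid)
    return bit_arrays


def support(itemset, bit_arrays, n_transactions):
    """Count the transactions that contain every item of `itemset`."""
    acc = (1 << n_transactions) - 1  # start with all transactions selected
    for item in itemset:
        acc &= bit_arrays.get(item, 0)  # keep only transactions that have `item`
    return bin(acc).count("1")  # population count of the intersection


def frequent_itemsets(transactions, min_support, max_size=3):
    """Brute-force enumeration (illustrative only) over the frequent single items."""
    bit_arrays = build_bit_arrays(transactions)  # computed once, reused below
    n = len(transactions)
    frequent_items = sorted(
        item for item in bit_arrays
        if support((item,), bit_arrays, n) >= min_support
    )
    result = {}
    for size in range(1, max_size + 1):
        for candidate in combinations(frequent_items, size):
            count = support(candidate, bit_arrays, n)
            if count >= min_support:
                result[candidate] = count
    return result


if __name__ == "__main__":
    demo = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    for itemset, count in frequent_itemsets(demo, min_support=3).items():
        print(itemset, count)
```

On a GPU the same AND-and-count step would typically run over fixed-width words in parallel; how GMiner++ partitions and schedules that work is described in the article itself, not in this sketch.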

    Original language: English
    Article number: 123928
    Journal: Expert Systems with Applications
    Volume: 250
    State: Published - 2024.09.15

    Keywords

    • Big data
    • Data mining
    • Graphic processing units
    • Memory-efficient pattern mining method
    • Scalable algorithm

    Quacquarelli Symonds (QS) Subject Topics

    • Computer Science & Information Systems
    • Data Science
