Scalable clustering with adaptive instance sampling

Research output: Contribution to conference › Conference paper › peer-review

Abstract

The computation time of most clustering algorithms grows with the number of attributes and instances. Scalability is therefore a critical issue for the data mining community, which has worked to make clustering more efficient. One way to handle this issue is to cluster a subset of the instances rather than all of them. This paper proposes an algorithm that performs clustering efficiently. It uses the nested partitions method to address the noisy performance problem that arises when only a subset of instances is used, and it adjusts the sampling rate appropriately at each iteration. Compared with NPCLUSTER, the resulting Adaptive NPCLUSTER algorithm achieved better similarity on small datasets and worse similarity on large datasets, but it required less computation time.
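The general idea of clustering on a sampled subset while adapting the sampling rate can be illustrated with a minimal sketch. This is not the Adaptive NPCLUSTER algorithm and does not implement the nested partitions method; it swaps in a plain k-means style update. The function name `sampled_kmeans`, its parameters, and the rule "grow the sample when the centers stop settling" are all assumptions made for illustration only.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centers):
    """Index of the nearest center for each point."""
    return [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points]


def update_centers(points, labels, centers):
    """Mean of each cluster's members; an empty cluster keeps its old center."""
    new_centers = []
    for j, old in enumerate(centers):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if members:
            new_centers.append(tuple(sum(m[d] for m in members) / len(members)
                                     for d in range(len(old))))
        else:
            new_centers.append(old)
    return new_centers


def sampled_kmeans(points, init_centers, rate=0.1, growth=1.5,
                   tol=1e-3, max_iter=50, seed=0):
    """k-means style updates on a random sample whose rate may grow.

    Hypothetical adaptation rule (an assumption, not the paper's rule):
    if the centers moved at least as much as in the previous iteration,
    the sample is treated as too noisy and the sampling rate is increased.
    Returns the final centers and the rate used at each iteration.
    """
    rng = random.Random(seed)
    centers = list(init_centers)
    prev_shift = float("inf")
    rates = []
    for _ in range(max_iter):
        n = min(len(points), max(len(centers), int(rate * len(points))))
        sample = rng.sample(points, n)
        labels = assign(sample, centers)
        new_centers = update_centers(sample, labels, centers)
        # Largest distance any center moved in this iteration.
        shift = max(sq_dist(a, b) for a, b in zip(centers, new_centers)) ** 0.5
        centers = new_centers
        rates.append(rate)
        if shift < tol:
            break
        if shift >= prev_shift:
            rate = min(1.0, rate * growth)
        prev_shift = shift
    return centers, rates


if __name__ == "__main__":
    gen = random.Random(1)
    data = ([(gen.random(), gen.random()) for _ in range(100)] +
            [(10 + gen.random(), 10 + gen.random()) for _ in range(100)])
    found, used_rates = sampled_kmeans(data, [(0.0, 0.0), (10.0, 10.0)])
    print(found, used_rates)
```

Starting from a small rate keeps early iterations cheap, and the rate only rises when the sample-based updates look unstable, which is one simple way to trade accuracy against computation time.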

Original language: English
Title of host publication: IEEE International Conference on Industrial Engineering and Engineering Management
Publisher: IEEE Computer Society
Pages: 1309-1313
Number of pages: 5
ISBN (Electronic): 9781479909865
State: Published - 2014.11.18
Event: 2013 IEEE International Conference on Industrial Engineering and Engineering Management, IEEM 2013 - Bangkok, Thailand
Duration: 2013.12.10 - 2013.12.13

Publication series

Name: IEEE International Conference on Industrial Engineering and Engineering Management
ISSN (Print): 2157-3611
ISSN (Electronic): 2157-362X

Conference

Conference: 2013 IEEE International Conference on Industrial Engineering and Engineering Management, IEEM 2013
Country/Territory: Thailand
City: Bangkok
Period: 13.12.10 - 13.12.13

Keywords

  • Adaptive Sampling
  • Clustering
  • Data Mining
  • Metaheuristics
  • Nested Partition

Quacquarelli Symonds (QS) Subject Topics

  • Business & Management Studies
  • Engineering - Mechanical
  • Engineering - Petroleum
