MATEC Web Conf.
Volume 232, 2018
2018 2nd International Conference on Electronic Information Technology and Computer Engineering (EITCE 2018)
Number of page(s): 5
Section: Algorithm Study and Mathematical Application
Published online: 19 November 2018
A K-means Algorithm Based On Feature Weighting
College of Computer and Information Engineering, Inner Mongolia Agricultural University, Hohhot, Inner Mongolia 010020, China
* Corresponding author: email@example.com
Cluster analysis is a statistical technique that divides research objects into relatively homogeneous groups; its core task is to find useful clusters of objects. The K-means clustering algorithm has received much attention from scholars because of its speed and good scalability. However, the traditional K-means algorithm does not consider the influence of each attribute on the final clustering result, which reduces clustering accuracy. To address this problem, this paper proposes an improved K-means algorithm based on feature weighting. The improved algorithm uses information gain and the ReliefF feature selection algorithm to weight the features and to correct the distance function between clustering objects, so that clustering becomes more accurate and efficient. Simulation results show that, compared with the traditional K-means algorithm, the improved algorithm produces stable clustering results and significantly higher clustering accuracy.
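The idea of correcting the distance function with feature weights can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the formula that combines information gain and ReliefF, so the weights here are assumed to be supplied by the caller, and the function names `weighted_distance` and `weighted_kmeans` are hypothetical. The sketch uses a weighted Euclidean distance, which is one common way to let feature weights influence the assignment step.

```python
import math
import random


def weighted_distance(x, y, weights):
    """Weighted Euclidean distance: sqrt(sum_j w_j * (x_j - y_j)^2)."""
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y)))


def weighted_kmeans(points, k, weights, max_iter=100, seed=0):
    """Plain K-means in which only the distance function is feature-weighted.

    `weights` is assumed to come from some feature-scoring step (for example
    information gain or ReliefF); how the scores are computed and combined
    is outside this sketch.
    """
    rng = random.Random(seed)
    # Initialise centroids with k distinct data points.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest centroid under the weighted distance.
        new_labels = [
            min(range(k), key=lambda c: weighted_distance(p, centroids[c], weights))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Feature 0 separates the two groups; feature 1 is large-scale noise.
points = [(0.0, 10.0), (0.1, -10.0), (0.2, 9.0),
          (5.0, -9.0), (5.1, 10.0), (5.2, -10.0)]
labels, centroids = weighted_kmeans(points, k=2, weights=[1.0, 0.0])
```

In this toy data set, a weight of zero on the noisy second feature means the clusters are determined by the first feature alone. With equal weights, the noisy feature would dominate the distance instead.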
© The Authors, published by EDP Sciences, 2018
This is an open access article distributed under the terms of the Creative Commons Attribution License 4.0 (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.