Big Data Reduction and Visualization Using the K-Means Algorithm
Date
2022
Authors
Akyol, Hakan
Kızılduman, Hale Sema
Dökeroğlu, Tansel
Abstract
An enormous amount of data is produced every day. Beyond high-performance processing approaches,
efficiently visualizing data at this scale (up to terabytes) remains a major difficulty. In this study,
we use the well-known K-means clustering method as a data reduction strategy that keeps the visual quality of the
original data as high as possible. The cluster centroids are used to display the distribution properties
of the data in a straightforward manner. Our data comes from a recent Kaggle big data set (Click Through Rate);
box plots of the reduced datasets are compared with the plots of the original data. We find that K-means
is an effective strategy for reducing large data so that it can be viewed without sacrificing
the quality of its distribution information.
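The idea in the abstract, replacing a large dataset with its K-means centroids and comparing box-plot statistics, can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation: the synthetic Gaussian feature stands in for the Kaggle data, and the helper names `kmeans_1d` and `five_number_summary`, the cluster count `k=40`, and the iteration count are all assumptions chosen for illustration.

```python
import random
import statistics


def kmeans_1d(values, k, iterations=15, seed=0):
    """Plain Lloyd's K-means on a list of floats (hypothetical helper).

    Returns the k centroids and the cluster sizes from the final assignment pass.
    """
    rng = random.Random(seed)
    centroids = rng.sample(values, k)  # initialise from k random data points
    counts = [0] * k
    for _ in range(iterations):
        sums = [0.0] * k
        counts = [0] * k
        for v in values:
            # Assign each value to its nearest centroid.
            j = min(range(k), key=lambda i: abs(v - centroids[i]))
            sums[j] += v
            counts[j] += 1
        # Move each centroid to its cluster mean; keep it unchanged if the cluster is empty.
        centroids = [
            sums[i] / counts[i] if counts[i] else centroids[i] for i in range(k)
        ]
    return centroids, counts


def five_number_summary(values):
    """Min, Q1, median, Q3, max: the statistics a box plot draws."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


# Synthetic stand-in for one numeric feature; NOT the Kaggle Click Through Rate data.
data_rng = random.Random(42)
data = [data_rng.gauss(50.0, 10.0) for _ in range(2000)]

# Reduce 2000 values to 40 centroids (an assumed reduction ratio).
centroids, counts = kmeans_1d(data, k=40)

print("original:", five_number_summary(data))
print("reduced: ", five_number_summary(centroids))
```

Comparing the two printed summaries gives a rough sense of how much of the box-plot shape survives the reduction. The sketch treats every centroid equally; weighting each centroid by its entry in `counts` is one possible refinement, which the abstract does not specify.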
Keywords
Big Data, Data Reduction, Visualization, K-Means
Citation
Akyol, H.; Kızılduman, H.S.; Dökeroğlu, T. (2022). "Big Data Reduction and Visualization Using the K-Means Algorithm", Ankara Science University, Researcher, Vol. 2, No. 1, pp. 40-45.
Source
Ankara Science University, Researcher
Volume
2
Issue
1
Start Page
40
End Page
45