Sınıflandırmada Küçük ve Dengesiz Veri Kümesi Problemi

Par, Öznur Esra; Akçapınar Sezer, Ebru; Sever, Hayri

dc.contributor.author	Par, Öznur Esra
dc.contributor.author	Akçapınar Sezer, Ebru
dc.contributor.author	Sever, Hayri
dc.date.accessioned	2020-12-14T07:31:59Z
dc.date.available	2020-12-14T07:31:59Z
dc.date.issued	2019
dc.description.abstract	Classifying data becomes harder when the data set is small and imbalanced, and this directly affects classification performance. A small data set and/or imbalance between classes has become a major problem in data mining. Classification algorithms were developed on the assumption that data sets are sufficiently large and balanced. Most of these algorithms focus on the majority class while ignoring or misclassifying the examples in the minority class. In medical data mining, the small and imbalanced data set problem is frequently encountered because of certain constraints. In this study, the openly accessible hepatitis data set was divided into small data sets, and each resulting data set was oversampled using distance-based methods. The oversampled data sets were classified using four different machine learning algorithms (Artificial Neural Networks, Support Vector Machines, Naive Bayes, and Decision Tree), and the classification results obtained were compared.	en_US
dc.description.abstract	Classifying data is difficult when the data set is small and imbalanced, and this problem directly affects classification performance. Small and/or imbalanced data sets have become a major problem in data mining. Classification algorithms are developed on the assumption that data sets are balanced and sufficiently large. Most of these algorithms focus on the majority class and ignore or misclassify examples of the minority class. The small and imbalanced data set problem is frequently encountered in medical data mining because of certain limitations. In this study, the publicly accessible hepatitis data set was divided into small and imbalanced data subsets, and each subset was oversampled using distance-based data generation methods. The oversampled data sets were classified using four different machine learning algorithms (Artificial Neural Networks, Support Vector Machines, Naive Bayes, and Decision Tree), and the classification scores were compared.	en_US
dc.identifier.citation	Par, Öznur Esra; Akçapınar Sezer, Ebru; Sever, Hayri. "Sınıflandırmada Küçük ve Dengesiz Veri Kümesi Problemi/Small and Unbalanced Data Set Problem in Classification", IEEE 27th Signal Processing and Communications Applications Conference (SIU), 2019.	en_US
dc.identifier.isbn	1728119057
dc.identifier.uri	https://hdl.handle.net/20.500.12416/4331
dc.language.iso	tr	en_US
dc.relation.ispartof	IEEE 27th Signal Processing and Communications Applications Conference (SIU)	en_US
dc.rights	info:eu-repo/semantics/closedAccess	en_US
dc.subject	Makine Öğrenmesi	en_US
dc.subject	Küçük Veri Seti	en_US
dc.subject	Dengesiz Veri Seti	en_US
dc.subject	Örneklem Çoğaltma Yöntemleri	en_US
dc.subject	Machine Learning	en_US
dc.subject	Small Data Set	en_US
dc.subject	Imbalanced Data Set	en_US
dc.subject	Oversampling Methods	en_US
dc.title	Sınıflandırmada Küçük ve Dengesiz Veri Kümesi Problemi	tr_TR
dc.title	Sınıflandırmada Küçük ve Dengesiz Veri Kümesi Problemi	en_US
dc.title.alternative	Small and Unbalanced Data Set Problem in Classification	en_US
dc.type	Conference Object	en_US
dspace.entity.type	Publication
gdc.author.yokid	11916
gdc.coar.access	metadata only access
gdc.coar.type	text::conference output
gdc.description.department	Çankaya Üniversitesi, Mühendislik Fakültesi, Yazılım Mühendisliği Bölümü	en_US
gdc.virtual.author	Sever, Hayri
relation.isAuthorOfPublication	a26d16c1-fa24-4ceb-b2c8-8517c96e2534
relation.isAuthorOfPublication.latestForDiscovery	a26d16c1-fa24-4ceb-b2c8-8517c96e2534
relation.isOrgUnitOfPublication	12489df3-847d-4936-8339-f3d38607992f
relation.isOrgUnitOfPublication	43797d4e-4177-4b74-bd9b-38623b8aeefa
relation.isOrgUnitOfPublication	0b9123e4-4136-493b-9ffd-be856af2cdb1
relation.isOrgUnitOfPublication.latestForDiscovery	12489df3-847d-4936-8339-f3d38607992f

Collections

Yazılım Mühendisliği Bölümü Yayın Koleksiyonu