Browsing by Author "Sezer, Ebru Akcapinar"

Authorship Modelling Approach for Authorship Verification on the Turkish Texts
(IEEE, 2018) Sever, Hayri; Canbay, Pelin; Sezer, Ebru Akcapinar
Authorship attribution, which aims to extract information about an author by analyzing the author's text, is a challenging field that has been studied for years. The task becomes even more difficult when only limited data are available. The need for this kind of study, carried out under the name of Authorship Verification, is growing with the increase of anonymous authors in electronic environments. In this study, a model-based approach is presented for the authorship verification problem. With the presented approach, the success interval that should be considered in the authorship verification problem was determined.
Deep Combination of Stylometry Features in Forensic Authorship Analysis
(2020) Sezer, Ebru Akcapinar; Canbay, Pelin; Sever, Hayri
Authorship Analysis (AA) in forensics is a process that aims to extract information about an author from his/her writings. Forensic AA is needed to detect the characteristics of anonymous authors in order to improve the security of digital media users who are exposed to disturbing entries such as threats or harassment emails. To analyze whether two anonymous short texts were written by the same author, we propose a combination of stylometry features from different categories in different processes. In the majority of previous AA studies, the stylometric features from different categories are concatenated in a preprocessing step. In these studies, no category-specific operations are performed during the learning process; all categories used are evaluated equally. The proposed approach, in contrast, has a separate learning process for each feature category because of their qualitative and quantitative characteristics, and combines these processes at the decision phase by using a Combination of Deep Neural Networks (C-DNN). To evaluate the Authorship Verification (AV) performance of the proposed approach, we designed and implemented a problem-specific Deep Neural Network (DNN) for each stylometry category we used. Experiments were conducted on two public English datasets. The results show that the proposed approach significantly improves the generalization ability and robustness of the solutions, and also has better accuracy than the single DNNs.
Detection of Stylometric Writeprint From the Turkish Texts
(IEEE, 2020) Canbay, Pelin; Sever, Hayri; Sezer, Ebru Akcapinar; Bilgisayar Mühendisliği
Authorship attribution studies aim to extract information about the author by analyzing data in text form. With the increase of anonymous authors in digital environments, the need for these studies is growing day by day. Although many studies focus on stylometric writeprint detection in different languages using different attributes, there is no standard feature set or detection algorithm against which these studies can be evaluated. Giving priority to Turkish texts, this study shows experimentally which features are more distinctive for determining the stylistic writeprint of a text, and which methods help increase the success achieved.
Citation - WoS: 8
Citation - Scopus: 14
Small and Unbalanced Data Set Problem in Classification
(IEEE, 2019) Sezer, Ebru Akcapinar; Sever, Hayri; Par, Oznur Esra
Classification is difficult when the data set is small and unbalanced, and this problem directly affects classification performance. Small and/or imbalanced data sets have become a major problem in data mining. Classification algorithms are developed on the assumption that data sets are balanced and large enough. Most algorithms focus on the majority class and ignore or misclassify examples of the minority class. The small and unbalanced data set problem is frequently encountered in medical data mining due to some limitations. Within the scope of the study, the publicly accessible hepatitis data set was divided into small and imbalanced data subsets, and each subset was oversampled by distance-based data generation methods. The oversampled data sets were classified using four different machine learning algorithms (Artificial Neural Networks, Support Vector Machines, Naive Bayes and Decision Tree), and the classification scores were compared.
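The abstract above does not say which distance-based data generation methods were used. As a rough illustration of the general idea only, the sketch below assumes a SMOTE-style scheme: each synthetic minority sample is placed on the line segment between a minority sample and one of its nearest minority neighbours. The function name `oversample_minority`, its parameters, and the toy data are hypothetical and are not taken from the paper.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def oversample_minority(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic samples by interpolating between a minority
    sample and one of its k nearest minority neighbours (SMOTE-style).
    This is an assumed scheme, not the method reported in the paper."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by distance to the base sample.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: euclidean(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        t = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + t * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority class with two features per sample (made-up values).
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_points = oversample_minority(minority, n_new=6)
balanced_minority = minority + new_points
```

Because every synthetic sample is a convex combination of two real minority samples, it stays inside the region the minority class already occupies; the enlarged minority set can then be passed, together with the majority class, to any of the classifiers named in the abstract.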