GCRIS

Enhancing Content-Based Retrieval Through an End-to-End Approach Utilizing Deep Learning and Multidimensional Indexing
(Springer London Ltd, 2025) Uzel, Omer; Arslan, Serdar
Recent advances in technology, together with falling hardware and software costs, have made visual search applications both popular and indispensable. Consequently, fast and accurate retrieval of images from very large databases through image queries has become a critical task. We introduce a novel end-to-end retrieval architecture that significantly improves retrieval performance compared with a baseline system that searches the database at the video frame level. Using a pre-trained convolutional neural network model, we apply unsupervised image retrieval processes to extract and store low-level features for efficient indexing. For fast and effective access, we index the extracted low-level features with a tree-based structure known as VP-Tree. To make these features compatible with our system, we apply dimension reduction techniques to represent them in a lower-dimensional space. Our experiments, conducted on three benchmark datasets, demonstrate that VP-Tree consistently outperforms k-nearest neighbor (KNN) search in terms of retrieval accuracy and efficiency. Specifically, for the image dataset, VP-Tree achieves a precision of 56.3903, an F1-score of 68.703, and an area under the curve (AUC) of 93.518719, all slightly surpassing KNN. Similarly, for the news video dataset, VP-Tree attains a precision of 38.704011, an F1-score of 55.029674, and an AUC of 64.6412, again outperforming KNN. For the documentary dataset, VP-Tree achieves a notable improvement with a precision of 73.511723, an F1-score of 84.734013, and an AUC of 80.981328, demonstrating superior performance over KNN. In addition to accuracy, we evaluated retrieval time across different dataset sizes. While KNN is slightly faster on smaller datasets, VP-Tree scales significantly better as dataset size increases. For 100,000 images, VP-Tree reduces retrieval time from 79.77 to 54.34 ms, and for 200,000 images, from 108.75 to 44.63 ms, confirming its efficiency in large-scale retrieval scenarios. These results highlight VP-Tree as a robust and scalable alternative to traditional KNN-based methods, offering both accuracy and efficiency in large-scale image retrieval tasks.
Citation - WoS: 18
Citation - Scopus: 27
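The abstract above names a VP-Tree (vantage-point tree) as its index over reduced feature vectors but does not describe an implementation. The following is a minimal sketch of the general technique, not the authors' code: it assumes Euclidean distance, a randomly chosen vantage point, and a median-distance split, and the names `VPNode`, `build_vp_tree`, and `nearest` are hypothetical.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class VPNode:
    """One node of the tree: a vantage point plus a split radius."""

    def __init__(self, point, radius, inside, outside):
        self.point = point      # vantage point stored at this node
        self.radius = radius    # median distance from the vantage point
        self.inside = inside    # subtree of points with distance < radius
        self.outside = outside  # subtree of points with distance >= radius


def build_vp_tree(points, rng=None):
    """Recursively build a VP-tree (assumed split rule: median distance)."""
    rng = rng or random.Random(0)
    points = list(points)
    if not points:
        return None
    vantage = points.pop(rng.randrange(len(points)))
    if not points:
        return VPNode(vantage, 0.0, None, None)
    dists = [euclidean(vantage, p) for p in points]
    radius = sorted(dists)[len(dists) // 2]
    inside = [p for p, d in zip(points, dists) if d < radius]
    outside = [p for p, d in zip(points, dists) if d >= radius]
    return VPNode(vantage, radius,
                  build_vp_tree(inside, rng), build_vp_tree(outside, rng))


def nearest(node, query, best=None):
    """Return (distance, point) for the nearest neighbor of `query`."""
    if node is None:
        return best
    d = euclidean(query, node.point)
    if best is None or d < best[0]:
        best = (d, node.point)
    # Search the side of the boundary that contains the query first.
    if d < node.radius:
        first, second = node.inside, node.outside
    else:
        first, second = node.outside, node.inside
    best = nearest(first, query, best)
    # By the triangle inequality, the other side can only hold a closer
    # point if the current best ball reaches across the split radius.
    if abs(d - node.radius) <= best[0]:
        best = nearest(second, query, best)
    return best


if __name__ == "__main__":
    rng = random.Random(1)
    # Stand-in for reduced CNN features: 1,000 random 8-dimensional vectors.
    features = [tuple(rng.random() for _ in range(8)) for _ in range(1000)]
    tree = build_vp_tree(features)
    query = tuple(rng.random() for _ in range(8))
    distance, match = nearest(tree, query)
    print(f"nearest neighbor at distance {distance:.4f}")
```

The pruning test in `nearest` is what lets a tree search skip whole subtrees, which is one plausible reason a VP-Tree would scale better than an exhaustive scan as the collection grows.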
Application of BiLSTM-CRF Model With Different Embeddings for Product Name Extraction in Unstructured Turkish Text
(Springer London Ltd, 2024) Arslan, Serdar
Named entity recognition (NER) plays a pivotal role in Natural Language Processing by identifying and classifying entities within textual data. While NER methodologies have advanced significantly, driven by pretrained word embeddings and deep neural networks, most of these studies have focused on text with well-defined grammar and structure. A significant research gap exists concerning NER in informal or unstructured text, where traditional grammar rules and sentence structure are absent. This research addresses that gap by focusing on the detection of product names within unstructured Turkish text. To accomplish this, we propose a deep learning-based NER model that combines a Bidirectional Long Short-Term Memory (BiLSTM) architecture with a Conditional Random Field (CRF) layer, further enhanced by FastText embeddings. To evaluate and compare our model's performance, we explore different embedding approaches, including Word2Vec and GloVe, in conjunction with the BiLSTM-CRF model. Furthermore, we conduct comparisons against BERT to assess the efficacy of our approach. Our experiments use a Turkish e-commerce dataset gathered from the internet, where traditional grammatical and structural rules may not apply. The BiLSTM-CRF model with FastText embeddings achieved an F1 score of 57.40%, a precision of 55.78%, and a recall of 59.12%. These results are promising and exceed those of the other baseline techniques. This research contributes to the field of NER by addressing the unique challenges posed by unstructured Turkish text and opens avenues for improved entity recognition in informal language settings, with potential applications across various domains.
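As a quick consistency check on the figures reported in the abstract above, the F1 score can be recomputed as the harmonic mean of precision and recall. This is a small sketch using the standard definition; the helper name `f1_score` is hypothetical and is not taken from the paper.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract, in percent.
f1 = f1_score(55.78, 59.12)
print(f"{f1:.2f}")  # prints 57.40
```

The recomputed value agrees with the reported F1 score of 57.40%.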

Scopus Indexed Publications Collection
