Browsing by Author "Mohammed, Hamza Haruna"

Now showing 1 - 2 of 2

Multi-label classification of text document using deep learning
(2019) Mohammed, Hamza Haruna
Recently, studies in the field of Natural Language Processing and some of its related important problem and Applications in the machine learning field continue to mount up. Machine Learning is prove to be predominantly data-driven in the sense that generic model buildings are used and then tailored to a specific application data. Needless to say, this has proven to be a very effective approach to modeling the complicated data dependencies we frequently experience in practice, making very few assumptions and allowing the information to talk for themselves. Examples can be found in chemical process engineering, climate science, systems, healthcare, and linguistic processing of natural language, to name a few. Moreover, text classification is one of the important aspect of Natural Language Processing. Text classification is the act of categorizing text or text documents into a given set of labels. While on the other hand, multi-label text classification deals with classifying text or documents into one more labels at the same time. Over the years, some methods for classifying text and documents have been proposed, including popularly known Bag of Words (BoW) method, Supervised Machine Learning, tree induction and label-vector embedding, to mention a few. These kind of tools can be used in many digital applications, such as document filtering, search engines, document management systems, etc. Lately, Deep Learning based methods is getting more attention, especially in an Extreme Multi-Label text classification. Deep learning is one of the major solutions to many machine learning applications that involve high-dimensional and unstructured data, such as pictures and text documents. However, it is of paramount importance in many of these applications to be able to reason accurately about the uncertainties associated with the predictions of these models. Therefore in this studies, we explore multi-label classification of text documents using deep learning methods such as CNN, RNN, LSTM, and even GRU. We investigate two scenarios in the studies. Firstly, multi-label classification models with plane embedding layer, and secondly with a Glove, Word2vec, and FastText as pre-trained embedding corpus for our models. We evaluate and compare these different neural network models performances in terms of multi-label evaluation metrics with respect to the two approaches.
Citation - WoS: 10
Citation - Scopus: 21
Multi-Label Classification of Text Documents Using Deep Learning
(Ieee, 2020) Mohammed, Hamza Haruna; Dogdu, Erdogan; Gorur, Abdul Kadir; Choupani, Roya
Recently, studies in the field of Natural Language Processing and its related applications continue to mount up. Machine learning is proven to be predominantly data-driven in the sense that generic model building methods are used and then tailored to specific application domains. Needless to say, this has proven to be a very effective approach in modeling the complicated data dependencies we frequently experience in practice, making very few assumptions, and allowing the information to talk for themselves. Examples of these applications can be found in chemical process engineering, climate science, healthcare, and linguistic processing systems for natural languages, to name a few. Text classification is one of the important machine learning tasks that is used in many digital applications today; such as in document filtering, search engines, document management systems, and many more. Text classification is the process of categorizing of text documents into a given set of labels. Furthermore, multi-label text classification is the task of categorization of text documents into one or more labels simultaneously. Over the years, many methods for classifying text documents have been proposed, including the popularly known bag of words (BoW) method, support vector machine (SVM), tree induction, and label-vector embedding, to mention a few. These kinds of tools can be used in many digital applications, such as document filtering, search engines, document management systems, etc. Lately, deep learning-based approaches are getting more attention, especially in extreme multi-label text classification case. Deep learning has proven to be one of the major solutions to many machine learning applications, especially those involving high-dimensional and unstructured data. However, it is of paramount importance in many applications to be able to reason accurately about the uncertainties associated with the predictions of the models. In this paper, we explore and compare the recent deep learning-based methods for multi-label text classification. We investigate two scenarios. First, multi-label classification model with ordinary embedding layer, and second with Glove, word2vec, and FastText as pre-trained embedding corpus for the given models. We evaluated these different neural network model performances in terms of multi-label evaluation metrics for the two approaches, and compare the results with the previous studies.