Improving Classification Performance under Imbalanced Data Conditions using Generative Adversarial Networks

  • Deep learning has achieved significant improvements across many computer vision tasks, largely thanks to open image datasets that contain large amounts of data. In real-world applications, however, acquiring a large dataset is a challenge, especially in areas that are new to deep learning. Furthermore, the class distribution in such datasets is often imbalanced, and this data imbalance is frequently a bottleneck for the classification performance of neural networks. Recently, the potential of generative adversarial networks (GANs) as a data augmentation method for minority-class data has been studied. This dissertation investigates the use of GANs and transfer learning to improve classification performance under imbalanced data conditions. We first propose a classification enhancement generative adversarial network (CEGAN) to enhance the quality of the generated synthetic minority data and, more importantly, to improve prediction accuracy under data-imbalanced conditions. Our experiments show that approximating the real data distribution using CEGAN improves classification performance significantly under data-imbalanced conditions compared with various standard data augmentation methods. To further improve classification performance, we propose a novel supervised discriminative feature generation (DFG) method for minority-class data. DFG is based on a modified GAN structure consisting of four independent networks: a generator, a discriminator, a feature extractor, and a classifier. To augment selected discriminative features of the minority-class data by means of an attention mechanism, the generator for the class-imbalanced target task is trained while the feature extractor and classifier are regularized with ones pre-trained on large source data. The experimental results show that the DFG generator enhances the augmentation of label-preserved and diverse features, and that classification results on the target task are significantly improved. In this thesis, these proposals are applied to bearing fault detection and diagnosis for induction motors and to shipping label recognition and validation for logistics. The experimental results for bearing fault detection and diagnosis indicate that the proposed GAN-based framework performs well on imbalanced fault diagnosis of rotating machinery. The experimental results for shipping label recognition and validation also show that the proposed method achieves better performance than many classical and state-of-the-art algorithms.
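
As a rough illustration of the imbalance problem the abstract describes, the sketch below counts the samples per class and computes how many synthetic samples a generator would need to contribute for every class to match the largest one. This is not the thesis's CEGAN or DFG method; the function name `synthetic_sample_budget` and the class labels and counts are hypothetical and chosen only for illustration.

```python
from collections import Counter


def synthetic_sample_budget(labels):
    """Return, per class, how many extra samples are needed to match the largest class.

    Hypothetical helper for illustration; it only counts labels and does not
    generate any data itself.
    """
    counts = Counter(labels)
    if not counts:
        return {}
    target = max(counts.values())
    return {cls: target - n for cls, n in counts.items()}


# Made-up, imbalanced label set (the class names are assumptions).
labels = ["normal"] * 900 + ["inner_race_fault"] * 60 + ["outer_race_fault"] * 40
budget = synthetic_sample_budget(labels)
print(budget)
```

In a GAN-based augmentation pipeline, a budget like this would only say how many minority samples to request from the generator; the quality and label consistency of those samples is the part the thesis addresses.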

Author:Sungho Suh
URN (permanent link):urn:nbn:de:hbz:386-kluedo-66263
Advisor:Paul Lukowicz
Document Type:Doctoral Thesis
Language of publication:English
Publication Date:2021/10/27
Date of first Publication:2021/10/27
Publishing Institute:Technische Universität Kaiserslautern
Granting Institute:Technische Universität Kaiserslautern
Acceptance Date of the Thesis:2021/10/21
Date of the Publication (Server):2021/10/28
Number of pages:XVII, 163
Faculties / Organisational entities:Fachbereich Informatik
CCS-Classification (computer science):A. General Literature
DDC-Classification:0 Allgemeines, Informatik, Informationswissenschaft / 004 Informatik
MSC-Classification (mathematics):68-XX COMPUTER SCIENCE (For papers involving machine computations and programs in a specific mathematical area, see Section -04 in that area) / 68-00 General reference works (handbooks, dictionaries, bibliographies, etc.)
Licence (German):Creative Commons 4.0 - Namensnennung, nicht kommerziell (CC BY-NC 4.0)