Systems Engineering and Electronics ›› 2025, Vol. 47 ›› Issue (11): 3739-3753. doi: 10.12305/j.issn.1001-506X.2025.11.22

• Systems Engineering •

Imbalanced data oversampling method based on DBSCAN and CGAN

Xi TANG, Wenhai LI, Zhenhao TANG, Ruifeng LI, Gen LI

  1. Academy of Aeronautical Operations Service, Naval Aviation University, Yantai 264001, China
  • Received: 2025-04-24 Online: 2025-11-25 Published: 2025-12-08
  • Contact: Ruifeng LI E-mail: 910073134@qq.com

Abstract:

To improve classifier accuracy on imbalanced data, an oversampling method based on density-based spatial clustering of applications with noise (DBSCAN) and a conditional generative adversarial network (CGAN) is proposed. First, DBSCAN is applied to the positive and negative samples separately. The dataset is then reconstructed using the cluster labels, and noise samples are identified and removed according to a safety level criterion to improve data quality. The refined dataset is then used to train a CGAN model. To address training instability and mode collapse in the CGAN, the Wasserstein distance with gradient penalty is adopted as the loss function, and an adaptive modification of the Wasserstein distance is introduced to better suit the classification problem, enabling the generation of high-quality minority-class samples. Finally, experiments are conducted on nine general imbalanced datasets and one analog circuit measurement dataset, comparing the proposed method with five classical oversampling methods across three typical classifiers. The results show that the proposed method outperforms the other oversampling algorithms on most datasets, and that its advantage grows as the class imbalance becomes more severe. The proposed method provides a novel approach for handling imbalanced data.
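
The clustering and noise-removal stage described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `dbscan` and `filter_noise_per_class`, the parameters `eps` and `min_samples`, and the toy data are assumptions made for illustration. The abstract does not define the safety level criterion, so the sketch substitutes the simplest possible rule and drops only the points that DBSCAN itself labels as noise. Keeping the (class, cluster) pair for each retained sample as a possible CGAN condition is likewise an assumption.

```python
import math

NOISE = -1


def dbscan(points, eps, min_samples):
    """Minimal DBSCAN: returns one label per point, -1 meaning noise."""
    n = len(points)
    # eps-neighborhood of every point (a point counts as its own neighbor).
    neighbors = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n  # None means "not visited yet"
    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_samples:
            labels[i] = NOISE  # may still be claimed later as a border point
            continue
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_samples:
                queue.extend(neighbors[j])  # core point: keep expanding
        cluster += 1
    return labels


def filter_noise_per_class(X, y, eps, min_samples):
    """Cluster each class separately and drop points DBSCAN marks as noise.

    Assumption: the paper's safety level criterion is replaced here by the
    plain DBSCAN noise label.
    """
    keep_X, keep_y, keep_cond = [], [], []
    for cls in sorted(set(y)):
        idx = [i for i, label in enumerate(y) if label == cls]
        labels = dbscan([X[i] for i in idx], eps, min_samples)
        for i, lab in zip(idx, labels):
            if lab != NOISE:
                keep_X.append(X[i])
                keep_y.append(cls)
                keep_cond.append((cls, lab))  # hypothetical CGAN condition
    return keep_X, keep_y, keep_cond


# Toy data: class 0 is the majority, class 1 the minority; each class
# contains one isolated point.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (8, 8),
     (5, 5), (5, 6), (6, 5), (0.5, 9)]
y = [0, 0, 0, 0, 0, 1, 1, 1, 1]

clean_X, clean_y, clean_cond = filter_noise_per_class(
    X, y, eps=1.5, min_samples=3
)
print(len(clean_X), clean_y.count(0), clean_y.count(1))
```

In this toy run the isolated points `(8, 8)` and `(0.5, 9)` are the only ones removed. In practice `eps` and `min_samples` would need tuning per dataset, and the cleaned samples would then be passed to the CGAN training stage.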

Key words: imbalanced data, conditional generative adversarial networks (CGAN), density-based spatial clustering of applications with noise (DBSCAN), oversampling

CLC Number: 
