MFDU-Net: kidney ultrasound image segmentation based on improved U-shaped network
Abstract
Ultrasound image segmentation is critical for accurate diagnosis and effective treatment planning. Although numerous deep learning methods have been developed for organ and lesion segmentation in ultrasound images, their performance remains limited by indistinct anatomical boundaries and heavy speckle noise. To address these problems, this study improves the U-Net model and proposes MFDU-Net, a network based on multiscale feature extraction and frequency attention denoising. First, a multi-resolution input (MI) module is introduced at the network input. The input image is passed through one convolution, and the resulting feature maps are downsampled with pooling windows of different sizes. Each downsampled feature map is then concatenated along the channel dimension with the encoding layer of matching size, so that every encoding layer receives input information at its own resolution for the subsequent encoding operations. Second, the convolutional encoding layer of the original U-Net is replaced with a multiscale separation convolution (MSC) module to capture image features at different scales. Unlike existing multiscale methods that use convolution kernels of different sizes in parallel, the proposed module employs multiple feature extraction branches that all use the same 3×3 convolution kernel. Varying the number of convolutions in each branch has the same effect as varying the kernel size, which enables multiscale feature extraction, allows the network to adapt to targets of different sizes and positions, and thus enhances its expressive power and robustness. Finally, because the U-Net skip connection passes noise and irrelevant information directly from the encoder to the decoder when fusing their features, a frequency-denoising attention module is introduced in the skip connection.
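The claim that branches built from the same 3×3 kernel can emulate larger kernels follows from how receptive fields grow when convolutions are stacked. The sketch below is only an illustration of that arithmetic, not the MSC module itself: it assumes stride-1, undilated convolutions, and the function name `stacked_receptive_field` and the choice of one, two, and three convolutions per branch are hypothetical.

```python
def stacked_receptive_field(num_convs, kernel_size=3):
    """Receptive field (in pixels per side) of `num_convs` stacked convolutions.

    Assumes stride 1 and no dilation, so each extra layer widens the
    receptive field by (kernel_size - 1).
    """
    if num_convs < 1:
        raise ValueError("num_convs must be at least 1")
    return 1 + num_convs * (kernel_size - 1)


# Hypothetical branch depths; the abstract does not state how many
# convolutions each branch actually uses.
branch_fields = {n: stacked_receptive_field(n) for n in (1, 2, 3)}

for depth, field in branch_fields.items():
    print(f"{depth} stacked 3x3 conv(s) cover a {field}x{field} window")
```

Under these assumptions, one, two, and three stacked 3×3 convolutions cover the same window as a single 3×3, 5×5, and 7×7 kernel, respectively.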
Drawing on the ideas of the CBAM module, frequency–space attention and frequency–channel attention modules are designed to remove noise and weight feature information, respectively. This design overcomes the restriction to spatial-domain feature learning common to most medical image segmentation networks and reduces the impact of irrelevant information and noise on segmentation performance. U-Net, DeepLabv3+, Att U-Net, ACC-Unet, and FCRNet were used as comparison methods to evaluate the proposed MFDU-Net on two self-built kidney ultrasound datasets and two publicly available datasets, DDTI and ISIC2018. Because few images were collected and segmentation models require a sufficiently large training set, data augmentation was performed on all datasets used in this study. Six quantitative evaluation metrics were selected: Jaccard coefficient, recall rate, accuracy rate, Dice coefficient, HD95, and ASSD. Compared with the baseline U-Net, MFDU-Net improves the Dice coefficient by 5.47, 9.66, 4.83, and 3.59 percentage points on the four datasets, respectively, and reduces the HD95 distance by 36.59, 5.76, 48.25, and 25.75, respectively. To verify the effectiveness of the modules designed in this study, ablation experiments were conducted on a kidney ultrasound dataset. The experimental results show that, compared with existing advanced medical image segmentation models, MFDU-Net demonstrates superior performance in segmenting organs and lesions.
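For readers unfamiliar with the overlap metrics named above, the following minimal sketch computes the Dice and Jaccard coefficients for a pair of binary masks using their standard definitions. It is not the evaluation code used in the study: the function name `dice_and_jaccard`, the nested-list mask format, and the convention of returning 1.0 for two empty masks are assumptions made for illustration.

```python
def dice_and_jaccard(pred, target):
    """Return (Dice, Jaccard) for two binary masks given as nested lists of 0/1."""
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1  # pixel segmented in both masks
            elif p:
                fp += 1  # predicted but absent from the ground truth
            elif t:
                fn += 1  # present in the ground truth but missed
    union = tp + fp + fn
    if union == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0, 1.0
    dice = 2 * tp / (2 * tp + fp + fn)
    jaccard = tp / union
    return dice, jaccard


# Toy masks: 2 overlapping pixels, 1 false positive, 1 false negative.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
dice, jaccard = dice_and_jaccard(pred, target)
print(f"Dice = {dice:.4f}, Jaccard = {jaccard:.4f}")
```

The two metrics are monotonically related by Dice = 2J / (1 + J), so they always rank a pair of segmentations of the same image in the same order.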