MFDU-Net: Kidney ultrasound image segmentation based on an improved U-shaped network

刘育铭; 代煜; 周璐; 张建勋

doi:10.13374/j.issn2095-9389.2025.11.24.002

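Abstract: Ultrasound image segmentation plays a vital role in ensuring accurate diagnosis and formulating effective treatment plans, and many deep learning methods have been developed to segment organs and lesions in ultrasound images. However, blurred tissue boundaries and heavy speckle noise limit the segmentation accuracy of existing models on ultrasound images. To address these problems, this paper improves the U-Net model and proposes MFDU-Net, a network based on multiscale feature extraction and frequency attention denoising. First, a multi-resolution input module is adopted at the image input so that the model can provide input information to different encoding layers. Second, the convolutional encoding layers of the original U-Net are replaced with multiscale separation convolution modules to capture image features at different scales. Finally, a frequency denoising attention module is introduced at the skip connections to overcome the limitation of spatial-domain feature learning common to most medical image segmentation networks and to reduce the influence of irrelevant information and noise on segmentation performance. MFDU-Net was evaluated comparatively on two self-built kidney ultrasound datasets and on two public datasets, DDTI and ISIC2018. Compared with the baseline U-Net, MFDU-Net improves the Dice score by 5.47, 9.66, 4.83, and 3.59 percentage points and reduces the HD95 distance by 36.59, 5.76, 48.25, and 25.75, respectively, achieving good segmentation results. The experimental results show that MFDU-Net outperforms existing advanced medical image segmentation models in segmenting organs and lesions.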
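As a reading aid for the Dice figures quoted above, the following is a minimal sketch of how Dice and Jaccard overlap scores are computed on binary masks and how a gain in percentage points is obtained. The toy masks, the function names, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration; they are not taken from the paper, and the paper's own evaluation code may differ.

```python
def overlap_counts(pred, truth):
    """Count intersection and per-mask foreground pixels for two binary masks.

    Masks are nested lists of 0/1 with identical shape (an illustrative format).
    """
    inter = pred_area = truth_area = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            inter += p * t
            pred_area += p
            truth_area += t
    return inter, pred_area, truth_area


def dice(pred, truth):
    """Dice coefficient: 2|A ∩ B| / (|A| + |B|)."""
    inter, pred_area, truth_area = overlap_counts(pred, truth)
    total = pred_area + truth_area
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * inter / total


def jaccard(pred, truth):
    """Jaccard coefficient: |A ∩ B| / |A ∪ B|."""
    inter, pred_area, truth_area = overlap_counts(pred, truth)
    union = pred_area + truth_area - inter
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else inter / union


# Hypothetical toy masks: one ground truth and two candidate predictions.
truth = [[0, 1, 1, 0],
         [0, 1, 1, 0],
         [0, 0, 0, 0]]
pred_a = [[0, 1, 1, 0],
          [0, 1, 0, 0],
          [0, 0, 0, 0]]
pred_b = [[0, 1, 1, 0],
          [0, 1, 1, 1],
          [0, 0, 0, 0]]

# A gain in percentage points is the difference of two scores expressed in percent.
gain_pp = 100.0 * (dice(pred_b, truth) - dice(pred_a, truth))
print(f"Dice A = {dice(pred_a, truth):.4f}, Dice B = {dice(pred_b, truth):.4f}")
print(f"Dice gain of B over A = {gain_pp:.2f} percentage points")
```

The two scores are monotonically related (Dice = 2J / (1 + J) for a Jaccard value J), so they rank predictions identically even though their numerical gaps differ.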

Abstract: Ultrasound image segmentation is critical for accurate diagnosis and effective treatment planning. Although numerous deep learning methods have been developed for organ and lesion segmentation in ultrasound images, their performance remains limited by indistinct anatomical boundaries and heavy speckle noise. To address these problems, this study improves the U-Net model and proposes MFDU-Net, a network based on multiscale feature extraction and frequency attention denoising. First, a multi-resolution input (MI) module is introduced at the network input. The feature maps obtained by applying one convolution to the input image are reduced in size with pooling windows of different sizes, and each reduced map is stacked along the channel dimension with the encoding layer of matching size for feature fusion, so that the different encoding layers receive input information for the subsequent encoding operations. Second, the convolutional encoding layers of the original U-Net are replaced with a multiscale separation convolution (MSC) module to capture image features at different scales. Unlike existing multiscale methods that use convolution kernels of different sizes in parallel, the proposed module employs multiple feature extraction branches that all use the same 3×3 kernel. Varying the number of convolutions in a branch achieves the same effect as varying the kernel size, which enables multiscale feature extraction, lets the network adapt to targets of different sizes and positions, and thus enhances its expressive power and robustness. Finally, because the U-Net skip connection passes noise and irrelevant information directly to the decoder when fusing encoder and decoder features, a frequency-denoising attention module is introduced in the skip connection. Drawing on the ideas of CBAM, frequency–space attention and frequency–channel attention modules are designed to remove noise and to weight feature information, respectively. This overcomes the limitation of spatial-domain feature learning common to most medical image segmentation networks and reduces the impact of irrelevant information and noise on segmentation performance. U-Net, DeepLabv3+, Att U-Net, ACC-UNet, and FCRNet were used as comparison methods to evaluate MFDU-Net on two self-built kidney ultrasound datasets and two publicly available datasets, DDTI and ISIC2018. Because relatively few images were collected and segmentation models require a sufficiently large training set, data augmentation was applied to all datasets used in this study. Six quantitative evaluation metrics were selected: Jaccard coefficient, recall, accuracy, Dice coefficient, HD95, and ASSD. Compared with the baseline U-Net, MFDU-Net improves the Dice score by 5.47, 9.66, 4.83, and 3.59 percentage points and reduces the HD95 distance by 36.59, 5.76, 48.25, and 25.75 on the four datasets, respectively, achieving good segmentation results. To verify the effectiveness of each proposed module, ablation experiments were conducted on a kidney ultrasound dataset. The experimental results show that MFDU-Net outperforms existing advanced medical image segmentation models in segmenting organs and lesions.
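The claim that changing the number of 3×3 convolutions matches changing the kernel size can be illustrated with the standard receptive-field arithmetic. The sketch below assumes the convolutions in a branch are applied in sequence with stride 1 and no dilation, and uses branch depths of 1, 2, and 3 purely as hypothetical examples; the abstract does not state the actual number of branches or convolutions per branch, and the function names are illustrative.

```python
def receptive_field(num_convs: int, kernel_size: int = 3) -> int:
    """Side length of the receptive field of `num_convs` sequential convolutions.

    Assumes stride 1 and no dilation: each extra k x k convolution widens
    the receptive field by (k - 1) pixels.
    """
    if num_convs < 1 or kernel_size < 1 or kernel_size % 2 == 0:
        raise ValueError("need num_convs >= 1 and an odd, positive kernel_size")
    return 1 + num_convs * (kernel_size - 1)


def weights_per_channel_pair(num_convs: int, kernel_size: int = 3) -> int:
    """Weights used per input/output channel pair, ignoring biases."""
    return num_convs * kernel_size * kernel_size


# Hypothetical branch depths; the paper's actual configuration may differ.
for depth in (1, 2, 3):
    rf = receptive_field(depth)
    stacked = weights_per_channel_pair(depth)
    single = rf * rf  # weights of one rf x rf kernel covering the same area
    print(f"{depth} sequential 3x3 convs cover {rf}x{rf} pixels "
          f"using {stacked} weights (a single {rf}x{rf} kernel needs {single})")
```

Under these assumptions, branches of different depth see 3×3, 5×5, and 7×7 neighborhoods while using fewer weights than single large kernels. The equivalence concerns spatial coverage only: with nonlinearities between the convolutions, a stacked branch does not compute the same function as one large kernel.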
