Image classification algorithm based on split channel attention network

  • Abstract: The channel attention mechanism can make effective use of different feature channels. By weighting and adjusting the channels of feature maps, it allows a convolutional neural network to focus more on important feature channels, thereby improving its classification ability. However, when global average pooling is used to obtain the global features of each channel, different channels in the feature map are highly likely to yield identical mean values, so the pooled features lack diversity, which in turn degrades the classification performance of the network. To address this problem, a split channel attention mechanism is proposed and used to build a module. The module extends the output dimension of global average pooling, mitigating the information loss caused by pooling and enhancing the feature diversity of the global average pooling layer in channel attention; multiple one-dimensional convolutions are then used to compute the attention weights of each region along the channel dimension. The split channel attention mechanism is combined with several image classification networks, and image classification experiments are conducted on the CIFAR-100 and ImageNet datasets. The results show that the split channel attention mechanism effectively improves model accuracy while remaining lightweight and compares favorably with other attention mechanisms.


    Abstract: The channel attention mechanism can effectively make use of different feature channels. By weighting and adjusting the channels of feature maps, convolutional neural networks can pay more attention to important feature channels, thus improving their classification ability. The first step in this mechanism compresses the feature map of each channel to obtain its global features, and global average pooling is typically preferred for this step because of its simplicity and efficiency. However, a challenge arises when global average pooling is used to obtain the global features of the channels: different channels in the feature map are highly likely to yield identical mean values. Moreover, using a single scalar to measure the importance of an entire channel cannot accurately reflect the complexity and diversity of its features, so the pooled features lack diversity, which further degrades the classification performance of the network. To address this problem, a split channel attention mechanism is proposed and used to build a module. This module extends the output dimension of global average pooling, reduces the information loss caused by pooling, enhances the diversity of the features output by the global average pooling layer in channel attention, and uses multiple one-dimensional convolutions to calculate the attention weight of each region along the channel dimension. By splitting the output of the global average pooling layer into multiple regions, the variation among different regions of the feature map is preserved while the global information of each channel is still compressed. The importance of the different regional features is then considered jointly, providing a more comprehensive and fine-grained way of evaluating and exploiting feature-map information than plain global average pooling and effectively improving the capability and performance of the model. Image classification experiments are performed on the CIFAR-100 and ImageNet datasets by combining the split channel attention mechanism with multiple image classification networks. Experimental results show that the split channel attention mechanism can effectively improve the accuracy of the model while remaining lightweight and that it compares favorably with other attention mechanisms. Furthermore, Grad-CAM is used to visually analyze the model's predictions. The analysis shows that a network integrated with the split channel attention mechanism fits the features of the target object region more closely and has stronger feature extraction and classification capabilities, underscoring the potential of the split channel attention mechanism to improve the performance of network models.
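
    The abstract does not give implementation details, so the following is only a minimal PyTorch-style sketch of how a split channel attention block along these lines could be wired. The pooled grid size (split_size), the 1-D kernel size, the per-region Conv1d layers, and the averaging used to merge the regional weights are illustrative assumptions, not the authors' exact design.

    import torch
    import torch.nn as nn

    class SplitChannelAttention(nn.Module):
        """Illustrative split channel attention block (hypothetical hyperparameters).

        Each channel is pooled to an s x s grid instead of a single scalar, the
        s*s pooled values are treated as regions, a separate 1-D convolution
        computes channel-wise attention for each region, and the regional
        weights are averaged into one weight per channel.
        """

        def __init__(self, split_size: int = 2, kernel_size: int = 3):
            super().__init__()
            self.num_regions = split_size * split_size
            self.pool = nn.AdaptiveAvgPool2d(split_size)   # (B, C, s, s) instead of (B, C, 1, 1)
            # one 1-D convolution over the channel dimension per pooled region
            self.convs = nn.ModuleList([
                nn.Conv1d(1, 1, kernel_size, padding=kernel_size // 2, bias=False)
                for _ in range(self.num_regions)
            ])

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            b, c, _, _ = x.shape
            pooled = self.pool(x).view(b, c, self.num_regions)      # (B, C, s*s)
            regional = [
                conv(pooled[:, :, i].unsqueeze(1))                   # (B, 1, C) per region
                for i, conv in enumerate(self.convs)
            ]
            w = torch.stack(regional, dim=0).mean(dim=0)             # merge regional weights
            w = torch.sigmoid(w).view(b, c, 1, 1)                    # one weight per channel
            return x * w                                             # reweight the input channels

    In a backbone network, such a block would sit where an SE- or ECA-style module usually goes, for example after the last convolution of each residual unit in a ResNet; a tensor of shape (8, 64, 32, 32) passes through unchanged in shape, with only the channel weighting applied.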
