A Facial Expression Recognition Method Based on a VGG19 Model Improved with the U-Net Architecture

  • Abstract: To address the many problems of traditional facial expression recognition, such as insufficient attention of network models to key channel features, excessive parameter counts, and limited recognition accuracy, this paper proposes a new scheme based on an improved VGG19 model. The scheme incorporates design ideas from the U-Net architecture and introduces an improved SEAttention module to accelerate model convergence and strengthen attention to facial details. While retaining VGG19's deep feature extraction capability, specially designed convolutional layers and skip connections achieve efficient feature fusion and optimization; a sketch of this fusion follows below. The improved VGG19 model not only extracts facial features more effectively but also reduces the parameter count and improves computational efficiency without sacrificing accuracy. To validate the improvements, the proposed model was tested on the FER2013 and CK+ datasets. The results show that the improved VGG19 network achieves accuracy gains of 1.58% and 4.04%, respectively, in expression recognition. These results demonstrate the advantages of the proposed method in addressing the problems of traditional facial recognition and offer a new direction for the further development of facial expression recognition technology.
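The fusion idea above can be illustrated with a minimal PyTorch sketch: the VGG19 feature extractor is split at its pooling layers so that an intermediate activation can be carried forward and concatenated with upsampled deep features through a U-Net-style skip connection. The split points, the single decoder step, and the fusion widths shown here are assumptions for illustration, not the paper's published configuration.

```python
import torch
import torch.nn as nn
import torchvision.models as models

class VGG19UNetFusion(nn.Module):
    """VGG19 encoder with a U-Net-style skip connection (illustrative sketch).

    The split points, the single decoder step, and the fusion widths are
    assumptions for illustration, not the paper's published configuration.
    """

    def __init__(self, num_classes: int = 7):
        super().__init__()
        vgg = models.vgg19(weights=None).features
        # Split the VGG19 feature extractor at its max-pooling layers so
        # intermediate activations can be reused as skip connections.
        self.block1 = vgg[:5]     # -> 64 channels,  1/2 resolution
        self.block2 = vgg[5:10]   # -> 128 channels, 1/4 resolution
        self.block3 = vgg[10:19]  # -> 256 channels, 1/8 resolution
        self.block4 = vgg[19:28]  # -> 512 channels, 1/16 resolution
        self.block5 = vgg[28:]    # -> 512 channels, 1/32 resolution
        # One U-Net-style decoder step: upsample the deepest features and
        # "stitch" (concatenate) them with the matching encoder features.
        self.up = nn.ConvTranspose2d(512, 512, kernel_size=2, stride=2)
        self.fuse = nn.Sequential(
            nn.Conv2d(512 + 512, 512, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(512, num_classes),  # 7 expression classes
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        f1 = self.block1(x)
        f2 = self.block2(f1)
        f3 = self.block3(f2)
        f4 = self.block4(f3)   # skip source
        f5 = self.block5(f4)   # deepest features
        fused = self.fuse(torch.cat([self.up(f5), f4], dim=1))  # skip connection
        return self.head(fused)
```

For a 224x224 input, `VGG19UNetFusion()(torch.randn(1, 3, 224, 224))` returns a (1, 7) logit tensor, one score per expression class.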


    Abstract: To address common problems in traditional facial expression recognition, such as insufficient attention to key channel features, excessive parameter counts, and low recognition accuracy, this paper proposes an improved VGG19 model that incorporates ideas from the U-Net architecture. While maintaining the deep feature extraction capability of VGG19, the model employs specially designed convolutional layers and skip connections, using feature cropping and stitching to efficiently integrate multi-scale features. This design ensures the coherent integration of features from different layers, which is crucial for accurate facial expression recognition. In addition, the paper introduces an improved SEAttention module for facial expression recognition tasks. The module replaces the original activation function with the Mish activation function and dynamically adjusts the weights of different channels, ensuring that important features are emphasized while redundant features are suppressed. This selective focus speeds up network convergence and improves the model's ability to detect subtle changes in facial expressions. Moreover, the classifier was restructured: the first and second fully connected layers were replaced with convolutional layers while the last fully connected layer was retained, and the node counts of the original fully connected layers were reduced from 4096, 4096, and 1000 to a single 7-way output. This addresses the large parameter size of the VGG19 network and enhances its resistance to overfitting. Extensive experiments were performed on the FER2013 and CK+ datasets. The improved VGG19 model significantly improves recognition accuracy compared with the original VGG19 model, with gains of 1.58% on FER2013 and 4.04% on CK+. The parameter efficiency of the model was also evaluated, showing a reduction in parameter count without sacrificing performance. This balance between model complexity and accuracy highlights the practical applicability of the proposed method in real-world facial recognition scenarios. In conclusion, integrating the U-Net architecture and the SEAttention module into the VGG19 network leads to significant advances in facial expression recognition. The improved model not only strengthens feature extraction and fusion but also addresses parameter size and computational efficiency. These innovations contribute to state-of-the-art performance in facial expression recognition, making the proposed method a meaningful contribution to computer vision and deep learning. Its robustness and efficiency make it a promising solution for applications requiring accurate real-time facial expression analysis, such as human-computer interaction, security systems, and emotion-driven computing. Future work will explore the model's adaptability to other datasets and further optimization techniques to enhance its performance and applicability.
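The modified attention module described above lends itself to a short sketch. Below is a minimal PyTorch squeeze-and-excitation block in which the ReLU of the excitation MLP is replaced by Mish, as the abstract describes; the reduction ratio of 16 is the conventional SE default and is assumed here, since the abstract does not give the authors' exact settings.

```python
import torch
import torch.nn as nn

class SEAttentionMish(nn.Module):
    """Squeeze-and-excitation channel attention with Mish replacing ReLU.

    The reduction ratio of 16 is the conventional SE default, assumed here;
    the abstract does not state the authors' exact value.
    """

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.squeeze = nn.AdaptiveAvgPool2d(1)  # global context per channel
        self.excite = nn.Sequential(
            nn.Linear(channels, channels // reduction, bias=False),
            nn.Mish(),                          # replaces the original ReLU
            nn.Linear(channels // reduction, channels, bias=False),
            nn.Sigmoid(),                       # per-channel weights in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.excite(self.squeeze(x).view(b, c)).view(b, c, 1, 1)
        return x * w  # emphasize informative channels, suppress redundant ones
```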
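The restructured classifier can be sketched the same way. Assuming the usual 512 x 7 x 7 VGG19 feature map for 224 x 224 inputs, the first two fully connected layers become convolutions and the retained final layer maps to the 7 expression classes; the 512-channel intermediate width is a hypothetical choice, since the abstract specifies only the change from 4096, 4096, 1000 nodes to a 7-way output.

```python
import torch
import torch.nn as nn

# Sketch of the restructured classifier head. The 512-channel width of the
# intermediate layers is an assumption; only the 7-way output is specified.
head = nn.Sequential(
    nn.Conv2d(512, 512, kernel_size=7),  # replaces FC1: 7x7 map -> 1x1
    nn.ReLU(inplace=True),
    nn.Conv2d(512, 512, kernel_size=1),  # replaces FC2
    nn.ReLU(inplace=True),
    nn.Flatten(),
    nn.Linear(512, 7),                   # retained final FC: 7-way output
)

logits = head(torch.randn(1, 512, 7, 7))  # -> shape (1, 7)
```

Under these assumptions the first convolution holds about 512 × 512 × 7 × 7 ≈ 12.8M weights, versus roughly 25088 × 4096 ≈ 102.8M in VGG19's original first fully connected layer, which illustrates the kind of parameter reduction the abstract reports.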

