ViTAU: Facial Paralysis Recognition and Analysis Based on Vision Transformer and Facial Action Units
Abstract
Facial nerve paralysis (FNP), commonly called facial paralysis and most often seen as Bell's palsy, significantly impairs patients' daily lives and mental health. Timely identification and diagnosis are crucial for early treatment and recovery. Rapid advances in deep learning and computer vision have made automatic recognition of facial paralysis feasible, providing a more accurate and objective basis for diagnosis. However, current research focuses primarily on global facial changes and neglects facial detail: different facial regions influence the recognition result to different degrees, yet existing studies do not differentiate and analyze each region individually. This paper introduces a method that combines the Vision Transformer (ViT) model with an Action Unit (AU) region detection network for automatic recognition and regional analysis of facial paralysis. The ViT model identifies facial paralysis through its self-attention mechanism, while the AU-based strategy takes features extracted from the StyleGAN2 model and analyzes the affected areas with a pyramid convolutional neural network. In experiments on the YouTube Facial Palsy (YFP) and Extended Cohn-Kanade (CK+) datasets, the combined approach achieved 99.4% accuracy in facial paralysis recognition and 81.36% accuracy in facial paralysis region recognition. These results demonstrate the effectiveness of the proposed method compared with recent techniques.