Abstract:
Traditional anthropometric measurement in clinical smile aesthetics is fundamentally constrained by subjective error and operator-dependent variability. To overcome these limitations, this study develops an intelligent prediction system for maxillary central incisor (MCI) width determination by integrating three-dimensional (3D) facial analysis with machine learning. Our methodology establishes a comprehensive technical framework incorporating automated 3D facial landmark detection, binocular stereovision reconstruction, Wasserstein generative adversarial network with gradient penalty (WGAN-GP) data augmentation, and multivariate regression modeling. The research cohort comprised 200 Chinese adults (aged 18–30 years) who underwent standardized 3D facial scanning with the 3dMDface system under controlled conditions. The study was conducted under an approved ethical protocol (#2024055) and reported in compliance with STROBE guidelines, while exclusion criteria maintained anatomical homogeneity of the sample. Through binocular image acquisition and dlib's 68-point landmark model, we quantified five critical anthropometric parameters: inter-canthal width, inter-alar width, medial canthal width, lateral canthal width, and inter-pupillary width. The 3D reconstruction applied a pinhole camera projection model and distortion correction to compute Euclidean distances between landmarks in anatomical space.
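As an illustration of this measurement step, the sketch below pairs dlib's 68-point landmarks detected in the left and right views with pinhole-model triangulation and 3D Euclidean distances. It is a minimal sketch, not the study's exact implementation: the landmark indices, the image and calibration file names (left_view.png, P_left.npy, etc.), and the assumption of pre-corrected distortion are illustrative.

```python
# Hypothetical sketch: dlib 68-point landmarks in both stereo views, triangulated
# with calibrated pinhole projection matrices, then measured as 3D Euclidean distances.
# Landmark indices, file names, and calibration inputs are illustrative assumptions.
import cv2
import dlib
import numpy as np

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def landmarks_68(image_bgr):
    """Return (68, 2) pixel coordinates of the first detected face."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    faces = detector(gray, 1)
    if not faces:
        raise ValueError("No face detected")
    shape = predictor(gray, faces[0])
    return np.array([[p.x, p.y] for p in shape.parts()], dtype=np.float64)

def triangulate(pts_left, pts_right, P_left, P_right):
    """Pinhole-model triangulation of matched 2D landmarks into (N, 3) 3D points."""
    homog = cv2.triangulatePoints(P_left, P_right, pts_left.T, pts_right.T)
    return (homog[:3] / homog[3]).T

def distance(points_3d, i, j):
    """Euclidean distance between two reconstructed landmarks (calibration units)."""
    return float(np.linalg.norm(points_3d[i] - points_3d[j]))

# P_left / P_right: 3x4 projection matrices from stereo calibration;
# lens distortion is assumed to be corrected before landmark detection.
left, right = cv2.imread("left_view.png"), cv2.imread("right_view.png")
P_left, P_right = np.load("P_left.npy"), np.load("P_right.npy")
pts3d = triangulate(landmarks_68(left), landmarks_68(right), P_left, P_right)

widths = {
    "inter_canthal":   distance(pts3d, 39, 42),  # inner eye corners (assumed indices)
    "inter_alar":      distance(pts3d, 31, 35),  # alar bases (assumed indices)
    "lateral_canthal": distance(pts3d, 36, 45),  # outer eye corners (assumed indices)
}
```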
To address the critical challenge of small-sample generalization, we implemented WGAN-GP data augmentation, employing a four-layer generator network (100→128→256→512→6 neurons) and a four-layer discriminator (6→512→256→1 neurons) with gradient penalty enforcement (λ = 10). The augmentation protocol enhanced feature correlations by 37.45%–85.00% and reduced prediction error by 25.55%–73.44% across the regression models.
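A minimal PyTorch sketch of this augmentation component is shown below. The layer widths and λ = 10 follow the description above; the activation functions and the critic-loss formulation are standard WGAN-GP choices assumed for illustration rather than details taken from the study.

```python
# Hypothetical WGAN-GP sketch. Generator 100->128->256->512->6 and
# discriminator (critic) 6->512->256->1 follow the abstract; other choices are assumed.
import torch
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self, noise_dim=100, feat_dim=6):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(noise_dim, 128), nn.ReLU(),
            nn.Linear(128, 256), nn.ReLU(),
            nn.Linear(256, 512), nn.ReLU(),
            nn.Linear(512, feat_dim),   # five facial widths + MCI width
        )
    def forward(self, z):
        return self.net(z)

class Critic(nn.Module):
    def __init__(self, feat_dim=6):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, 512), nn.LeakyReLU(0.2),
            nn.Linear(512, 256), nn.LeakyReLU(0.2),
            nn.Linear(256, 1),          # unbounded Wasserstein score
        )
    def forward(self, x):
        return self.net(x)

def gradient_penalty(critic, real, fake, lam=10.0):
    """Enforce the 1-Lipschitz constraint on random interpolates (the GP term)."""
    eps = torch.rand(real.size(0), 1, device=real.device)
    interp = (eps * real + (1 - eps) * fake).requires_grad_(True)
    grads = torch.autograd.grad(critic(interp).sum(), interp, create_graph=True)[0]
    return lam * ((grads.norm(2, dim=1) - 1) ** 2).mean()

def critic_loss(critic, generator, real_batch):
    """One critic objective: E[D(fake)] - E[D(real)] + gradient penalty."""
    z = torch.randn(real_batch.size(0), 100, device=real_batch.device)
    fake = generator(z).detach()
    return (critic(fake).mean() - critic(real_batch).mean()
            + gradient_penalty(critic, real_batch, fake))
```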
Five machine learning algorithms were systematically compared: gradient boosting regression (GBR) with 200 trees and a maximum depth of 5, a multilayer perceptron (MLP) with two ReLU-activated hidden layers, random forest (RF), decision tree (DT), and multiple linear regression (MLR). GBR achieved a coefficient of determination (R²) of 0.9446 on the test data, corresponding to a root mean square error (RMSE) of 0.1238 mm and representing a 73.44% reduction in prediction error compared with conventional methods.
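The model comparison could be reproduced along the lines of the scikit-learn sketch below. Only the GBR tree count and maximum depth and the MLP's two ReLU-activated hidden layers come from the description above; the MLP layer sizes, feature scaling, train/test split, and the placeholder data are assumptions.

```python
# Hypothetical scikit-learn sketch of the five-model comparison.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeRegressor

# Placeholder data standing in for the measured facial widths (X) and MCI widths (y).
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 5))
y = 8.5 + X @ rng.normal(size=5) * 0.1 + rng.normal(scale=0.1, size=400)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

models = {
    "GBR": GradientBoostingRegressor(n_estimators=200, max_depth=5, random_state=0),
    "MLP": make_pipeline(StandardScaler(),
                         MLPRegressor(hidden_layer_sizes=(64, 32), activation="relu",
                                      max_iter=2000, random_state=0)),
    "RF":  RandomForestRegressor(random_state=0),
    "DT":  DecisionTreeRegressor(random_state=0),
    "MLR": LinearRegression(),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    rmse = mean_squared_error(y_test, pred) ** 0.5
    print(f"{name}: R2 = {r2_score(y_test, pred):.4f}, RMSE = {rmse:.4f} mm")
```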
MLP demonstrated superior generalization stability, maintaining near-identical performance between the training (R² = 0.9692) and testing (R² = 0.9691) datasets (ΔR² = 0.0001) and ultimately achieving the highest precision, with an RMSE of 0.0924 mm on the test set. All models attained submillimeter accuracy (0.0924–0.2358 mm) on the test set after augmentation, with RF, MLR, and DT yielding RMSE values of 0.1705 mm, 0.1991 mm, and 0.2358 mm, respectively. These performance differentials highlight MLP's robustness in handling nonlinear feature interactions, while GBR's sensitivity to feature engineering delivered the best initial accuracy. We recommend that clinicians prioritize the MLP or GBR architectures, which provide an interpretable and clinically adaptable decision-making model for digital smile design. This research makes three primary contributions to digital smile design. First, it establishes an integrated pipeline combining 3D facial reconstruction with deep learning-based augmentation for dental parameter prediction. Second, it addresses the persistent challenge of small-sample generalization in anthropometric modeling through the WGAN-GP implementation. Third, it delivers a clinically applicable decision system with submillimeter precision. The quantified relationship matrix between facial landmarks and MCI dimensions provides prosthodontists with objective guidelines for crown fabrication, substantially reducing subjectivity in aesthetic rehabilitation. Future research directions include expanding multi-ethnic validation cohorts, integrating cone-beam computed tomography data for occlusal plane correlation, and developing real-time chairside prediction interfaces. The framework's adaptability shows promise for extension to other dental parameters, including canine guidance optimization and incisal edge position determination.