Pose-Based Static Sign Language Recognition with Deep Learning for Turkish, Arabic, and American Sign Languages


Yayla R., Üçgün H., Abbas M.

SENSORS, vol.26, no.2, pp.1-27, 2026 (SCI-Expanded, Scopus)

  • Publication Type: Article / Full Article
  • Volume: 26 Issue: 2
  • Publication Date: 2026
  • DOI: 10.3390/s26020524
  • Journal Name: SENSORS
  • Journal Indexes: Scopus, Science Citation Index Expanded (SCI-EXPANDED), Compendex, INSPEC, MEDLINE, Directory of Open Access Journals
  • Page Numbers: pp.1-27
  • Affiliated with Bilecik Şeyh Edebali University: Yes

Abstract

Advancements in artificial intelligence have significantly enhanced communication for individuals with hearing impairments. This study presents a robust cross-lingual Sign Language Recognition (SLR) framework for Turkish, American, and Arabic sign languages. The system uses the lightweight MediaPipe library for efficient hand landmark extraction, ensuring stable and consistent feature representation across diverse linguistic contexts. Datasets were meticulously constructed from nine public-domain sources (four Arabic, three American, and two Turkish). The final training data comprise curated image datasets, with frames for each language carefully selected from varying angles and distances to ensure high diversity. A comprehensive comparative evaluation was conducted across three state-of-the-art deep learning architectures, ConvNeXt (CNN-based), Swin Transformer (ViT-based), and Vision Mamba (SSM-based), all applied to identical feature sets. The evaluation demonstrates the superior performance of contemporary Vision Transformers and state space models in capturing subtle spatial cues across diverse sign languages. Our approach provides a comparative analysis of model generalization capabilities across three distinct sign languages, offering valuable insights for model selection in pose-based SLR systems.
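
For illustration, below is a minimal sketch of the kind of landmark-extraction step the abstract describes, using MediaPipe's Hands solution in Python. The file name, the single-hand setting, the confidence threshold, and the flattening of the 21 (x, y, z) landmarks into a 63-dimensional vector are assumptions for this sketch, not the paper's exact preprocessing.

```python
# Sketch (assumed setup): extracting static hand landmarks with MediaPipe
# as a pose-based feature vector for a downstream sign classifier.
import cv2
import mediapipe as mp


def extract_hand_landmarks(image_path: str) -> list[float] | None:
    """Return a flat [x0, y0, z0, ..., x20, y20, z20] vector, or None if no hand is found."""
    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)

    # Static-image mode suits single-frame (static) sign recognition.
    with mp.solutions.hands.Hands(static_image_mode=True,
                                  max_num_hands=1,
                                  min_detection_confidence=0.5) as hands:
        results = hands.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))

    if not results.multi_hand_landmarks:
        return None

    # MediaPipe returns 21 normalized (x, y, z) landmarks per detected hand.
    hand = results.multi_hand_landmarks[0]
    return [coord for lm in hand.landmark for coord in (lm.x, lm.y, lm.z)]


if __name__ == "__main__":
    features = extract_hand_landmarks("sign.jpg")  # hypothetical input image
    if features is not None:
        print(f"Extracted {len(features)}-dimensional landmark vector")
```

Such a fixed-length landmark vector (or the cropped hand region it localizes) could then be fed to any of the compared backbones (ConvNeXt, Swin Transformer, Vision Mamba) on identical inputs; the exact feature pipeline used in the study is described in the full paper.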