extremecoder-rgb/SubASL-Model

🤟 SubAsL AI: Advanced Real-Time Sign Language Recognition

Python · TensorFlow · PyTorch · MediaPipe

SubAsL AI is a dual-engine artificial intelligence system designed to bridge the communication gap between the Deaf community and the hearing world. It works on skeletal landmarks extracted via computer vision rather than raw pixels, and performs real-time American Sign Language (ASL) recognition entirely on the local machine, which keeps it privacy-preserving, production-ready, and scalable.

📸 Live Demonstration

The screenshots below show the system running real-time inference, drawing MediaPipe skeletons and predicting dynamic ASL vocabulary with high confidence:

[Screenshots: Inference 1, Inference 2, Inference 3, Inference 4]


🧠 The Dual-Engine Architecture

SubAsL AI is powered by two distinct neural network architectures, specialized for different aspects of communication: Fingerspelling (Alphabets) and Dynamic Signing (Words).

1. The Word Detection Engine (Elite ResNet)

Designed to recognize complex, multi-frame dynamic gestures across a vocabulary of 250 distinct ASL words (e.g., "Hello," "Water," "Hungry").

  • Architecture: A deep 1D-CNN Residual Network (ResNet) with 4 residual blocks, Global Average Pooling, and a 512-unit dense classification head.
  • Feature Engineering: Extracts 75 skeletal landmarks (Hands + Pose) via MediaPipe. We utilize Nose-Centric Spatial Normalization (making predictions position-invariant) and compute the Temporal Velocity of joints to capture the true speed and direction of the sign.
  • Performance:
    • Training Accuracy: 85%
    • Validation Accuracy: 71% (across 250 classes; random guessing would score 0.4%)
  • Production Deployment: Model weights are quantized and converted to TensorFlow Lite (.tflite), allowing inference at 60 FPS on a standard CPU without requiring cloud GPUs.
  • Dataset Used: Google - Isolated Sign Language Recognition (94K+ videos)
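
The feature engineering described above can be sketched in NumPy. This is a minimal illustration, not the repository's actual code: the function name `build_features`, the `(T, 75, 3)` input layout, and the assumption that the 33 pose landmarks come first (so the nose, index 0 in MediaPipe Pose, sits at index 0) are all hypothetical choices made for the example.

```python
import numpy as np

NUM_LANDMARKS = 75  # 33 pose + 21 left hand + 21 right hand
NOSE_INDEX = 0      # assumption: pose landmarks come first; MediaPipe Pose index 0 is the nose


def build_features(frames: np.ndarray) -> np.ndarray:
    """Turn raw landmarks of shape (T, 75, 3) into per-frame features of shape (T, 450)."""
    frames = np.asarray(frames, dtype=np.float32)

    # Nose-centric normalization: express every landmark relative to the nose
    # of the same frame, so shifting the whole body in the image changes nothing.
    nose = frames[:, NOSE_INDEX:NOSE_INDEX + 1, :]   # (T, 1, 3)
    centered = frames - nose                         # (T, 75, 3)

    # Temporal velocity: frame-to-frame displacement of each landmark.
    # The first frame has no predecessor, so its velocity is left at zero.
    velocity = np.zeros_like(centered)
    velocity[1:] = centered[1:] - centered[:-1]

    # Concatenate position and velocity per landmark, then flatten each frame.
    features = np.concatenate([centered, velocity], axis=-1)   # (T, 75, 6)
    return features.reshape(len(frames), -1)                   # (T, 450)
```

Each frame ends up with 75 × 6 = 450 values, which is the kind of `(time, channels)` sequence a 1D-CNN consumes. Subtracting the nose removes translation only; handling differences in scale would need an additional step that this sketch does not attempt.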

2. The Alphabet Detection Engine

Designed for ultra-fast, static frame-by-frame fingerspelling detection (A-Z). It is used for spelling out names or words outside the 250-word dictionary.

  • Architecture: A highly optimized PyTorch feed-forward neural network.
  • Feature Engineering: Analyzes the 21 3D landmarks of a single hand, focusing on the relative geometry of the fingers.
  • Production Deployment: PyTorch weights saved as best_model.pth.
  • Performance:
    • Accuracy: 97%
    • F1-Score: 97%
    • Precision/Recall: Balanced high performance across all 26 alphabet classes (Avg. 0.96+).
  • Dataset Used: Sign Language Landmarks Dataset (Kaggle)
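
One way to encode the "relative geometry" of a hand is to express each landmark relative to the wrist and divide by the hand's size. The README does not say which transformation the model actually uses, so the sketch below is an assumption for illustration only, and `hand_features` is a hypothetical name.

```python
import math

WRIST_INDEX = 0  # MediaPipe Hands index 0 is the wrist


def hand_features(landmarks):
    """Flatten 21 (x, y, z) hand landmarks into 63 wrist-relative, scale-normalized values."""
    if len(landmarks) != 21:
        raise ValueError("expected 21 hand landmarks")

    # Make every landmark relative to the wrist (removes position in the frame).
    wx, wy, wz = landmarks[WRIST_INDEX]
    relative = [(x - wx, y - wy, z - wz) for x, y, z in landmarks]

    # Divide by the largest wrist-to-landmark distance (removes hand size and
    # camera distance). Fall back to 1.0 for a degenerate all-equal input.
    scale = max(math.sqrt(x * x + y * y + z * z) for x, y, z in relative) or 1.0

    return [c / scale for point in relative for c in point]
```

The resulting 63-element vector (21 landmarks × 3 coordinates) is the kind of input a small feed-forward classifier with 26 output classes could take.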

🚀 How to Run Locally

Prerequisites

Ensure you have Python 3.9+ installed. Install the core dependencies:

pip install opencv-python mediapipe tensorflow torch numpy
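
To confirm the installation before launching either recognizer, a short check like the following can help. It is a sketch using only the standard library; the helper name `missing_modules` is hypothetical, and the list simply mirrors the import names of the packages in the command above (`opencv-python` is imported as `cv2`).

```python
from importlib.util import find_spec

# Import names for the pip packages listed above.
REQUIRED_MODULES = ["cv2", "mediapipe", "tensorflow", "torch", "numpy"]


def missing_modules(modules=REQUIRED_MODULES):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in modules if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("All dependencies found." if not missing else f"Missing: {', '.join(missing)}")
```

`find_spec` only locates a package without importing it, so the check stays fast even for heavy libraries such as TensorFlow.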

Running the Word Detection System (Elite ResNet)

To launch the 250-word dynamic ASL recognizer:

python inference_resnet.py

Controls:

  • SPACE: Add a space to your sentence.
  • D / Backspace: Delete the last recorded word.
  • ESC: Exit the application.
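
The controls above could be handled with a small pure function fed by OpenCV's `cv2.waitKey`. This is a hypothetical sketch rather than the code in `inference_resnet.py`: the function name, the representation of the sentence as a list of words and spaces, and the exact key codes (Backspace is reported differently across platforms) are assumptions.

```python
KEY_ESC = 27
KEY_SPACE = 32
# Backspace arrives as 8 on most platforms and 127 on some; "d"/"D" also delete.
KEYS_DELETE = {8, 127, ord("d"), ord("D")}


def handle_key(key, sentence):
    """Apply one key press to the sentence (a list of words and spaces).

    Returns (new_sentence, keep_running).
    """
    if key == KEY_ESC:
        return sentence, False
    if key == KEY_SPACE:
        return sentence + [" "], True
    if key in KEYS_DELETE:
        trimmed = list(sentence)
        # Skip any trailing spaces, then drop the last recorded word.
        while trimmed and trimmed[-1] == " ":
            trimmed.pop()
        return trimmed[:-1], True
    return sentence, True
```

Inside a capture loop this would typically be called as `sentence, running = handle_key(cv2.waitKey(1) & 0xFF, sentence)`; when no key is pressed the masked value is 255 and the sentence is returned unchanged.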

Running the Alphabet Detection System

To launch the A-Z fingerspelling recognizer:

python inference.py

💡 How It Works Under The Hood

  1. Skeleton Extraction: As the user signs, MediaPipe Holistic tracks their body and produces a skeleton of 75 key points (21 for each hand plus 33 pose landmarks covering the face, torso, and limbs).
  2. Mathematical Transformation: Instead of processing heavy pixel data, SubAsL AI works on landmark coordinates alone. Coordinates are normalized against the user's nose, which makes predictions independent of where the user appears in the frame.
  3. Temporal Processing: For word detection, a buffer collects 60 frames of data. The ResNet model then analyzes how the landmarks move, accelerate, and stop across those frames to identify the sign.
  4. Real-Time Output: The predicted sign is written to the UI within milliseconds.
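
The 60-frame buffer from step 3 can be sketched with a fixed-length `deque`. Whether the real system uses a rolling window or collects frames in separate batches is not stated, so the rolling behaviour here is an assumption, and `FrameBuffer` is a hypothetical name.

```python
from collections import deque

WINDOW_SIZE = 60  # frames per prediction, as described in step 3


class FrameBuffer:
    """Rolling window that holds the most recent WINDOW_SIZE frames of landmarks."""

    def __init__(self, size=WINDOW_SIZE):
        self._frames = deque(maxlen=size)

    def push(self, landmarks):
        """Add one frame; once full, the oldest frame is dropped automatically."""
        self._frames.append(landmarks)

    def is_ready(self):
        """True once enough frames have arrived to run the word model."""
        return len(self._frames) == self._frames.maxlen

    def window(self):
        """Return the buffered frames, oldest first."""
        return list(self._frames)
```

In a capture loop, each frame's landmarks would be pushed into the buffer, and the model would be invoked on `window()` only when `is_ready()` returns true.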

🌍 Future Scope: SubAsL Chrome Extension

The lightweight nature of our TFLite and PyTorch models paves the way for direct browser integration.

Next Steps: SubAsL AI can be packaged into a JavaScript-based Chrome Extension using TensorFlow.js and MediaPipe's JavaScript (web) API. This would allow the AI to run in the browser alongside platforms like Google Meet or Zoom, providing live ASL-to-Text closed captioning completely offline, so that no video has to leave the user's device.


Built with ❤️ for a more accessible world.
