Advanced gaze estimation, head pose tracking, and cognitive assessment toolkit for real-world applications
State-of-the-art AI models optimized for mobile and edge devices
State-of-the-art head pose tracking using a continuous 6D rotation representation. Outperforms the RepVGG-B1 baseline while using 35% fewer parameters, thanks to the RepNeXt-M4 architecture.
Key Innovation: Replaces Euler angles with a continuous 6D representation to avoid gimbal lock, keeping rotation estimates stable across the full ±80° yaw/pitch range.
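As a rough illustration of how a 6D rotation representation is commonly turned back into a rotation matrix, the sketch below applies Gram-Schmidt orthogonalization to two 3D vectors. The function name `rotation_from_6d` and the column ordering are assumptions made for this example, not the toolkit's actual API.

```python
import math


def _normalize(v):
    """Scale a 3-vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    if n < 1e-12:
        raise ValueError("cannot normalize a near-zero vector")
    return [x / n for x in v]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]


def rotation_from_6d(rep):
    """Map six raw network outputs to a 3x3 rotation matrix.

    Assumption: the first three values are one raw axis and the last
    three are a second raw axis (hypothetical layout for this sketch).
    """
    a1, a2 = list(rep[:3]), list(rep[3:6])
    b1 = _normalize(a1)
    # Remove the component of a2 that lies along b1, then normalize.
    proj = _dot(b1, a2)
    b2 = _normalize([y - proj * x for x, y in zip(b1, a2)])
    # The third axis follows from the right-hand rule.
    b3 = _cross(b1, b2)
    # b1, b2, b3 become the columns of the rotation matrix.
    return [[b1[i], b2[i], b3[i]] for i in range(3)]
```

Because every 6D input with two non-parallel vectors maps to a valid rotation, a network can regress the six values directly without the discontinuities that Euler angles introduce.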
Stereo gaze estimation architecture leveraging 9-camera multi-view consistency. Combines a PANet feature pyramid with the EyeFLAME module for joint 2D-3D supervision, which resolves depth ambiguity.
Progressive Training: Phase 1 focuses on the 2D projection loss (weight=1.0), Phase 2 introduces the 3D loss (weight=0.1), and Phase 3 balances both (weight=0.5 each) for optimal convergence.
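The phase schedule can be sketched as a small lookup of loss weights. The function names are hypothetical, and two values are assumptions the text does not state: the 3D weight in Phase 1 is taken as 0.0, and the 2D weight in Phase 2 is taken to remain at 1.0.

```python
# (weight_2d, weight_3d) per training phase.
# Assumed values: Phase 1 3D weight = 0.0, Phase 2 2D weight = 1.0.
PHASE_WEIGHTS = {
    1: (1.0, 0.0),  # 2D projection only
    2: (1.0, 0.1),  # 3D supervision introduced with a small weight
    3: (0.5, 0.5),  # both terms balanced
}


def loss_weights(phase):
    """Return (weight_2d, weight_3d) for a training phase in {1, 2, 3}."""
    if phase not in PHASE_WEIGHTS:
        raise ValueError(f"unknown training phase: {phase}")
    return PHASE_WEIGHTS[phase]


def combined_loss(loss_2d, loss_3d, phase):
    """Weighted sum of the 2D and 3D loss terms for the given phase."""
    w2d, w3d = loss_weights(phase)
    return w2d * loss_2d + w3d * loss_3d
```

When each phase starts (for example, after how many epochs) is not specified here, so the caller is left to decide the phase.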
FLAME-inspired geometric module integrated with RayNet for anatomical eye reconstruction from multiple camera streams. Predicts 3D eyeball structures under true perspective projection, leveraging multi-view consistency.
Multi-View Approach: Uses the RayNet architecture to process 9 synchronized camera streams, resolving depth ambiguity through geometric triangulation rather than a single-view weak-perspective approximation.
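To illustrate the general idea of geometric triangulation, the sketch below finds the 3D point closest (in the least-squares sense) to a set of rays, one per camera. This is a generic textbook formulation, not necessarily what RayNet does internally; `triangulate_rays` and its arguments are names chosen for this example.

```python
import math


def _unit(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]


def _det3(m):
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


def triangulate_rays(origins, directions):
    """Least-squares point closest to rays o_k + t * d_k.

    Solves sum_k (I - d_k d_k^T) p = sum_k (I - d_k d_k^T) o_k.
    """
    A = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for o, d in zip(origins, directions):
        d = _unit(d)
        for i in range(3):
            for j in range(3):
                # Entry of the projector onto the plane orthogonal to d.
                m_ij = (1.0 if i == j else 0.0) - d[i] * d[j]
                A[i][j] += m_ij
                b[i] += m_ij * o[j]
    det = _det3(A)
    if abs(det) < 1e-12:
        raise ValueError("rays are (nearly) parallel; point is undetermined")
    # Cramer's rule is adequate for a single 3x3 system.
    point = []
    for k in range(3):
        A_k = [[b[i] if j == k else A[i][j] for j in range(3)] for i in range(3)]
        point.append(_det3(A_k) / det)
    return point
```

With a single ray the system is singular, which mirrors the depth ambiguity of a single view; two or more non-parallel rays pin the point down.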
Single-camera gaze point estimation on a target plane using RepNeXt-M3. Trained on the ARGaze dataset with a fixed 300 cm depth assumption for desktop interaction scenarios.
Ray Casting: The gaze point is the intersection of the gaze ray with the target plane: p + t·v, where p is the pupil center, v is the gaze direction, and t = (d_plane - p_z) / v_z for the fixed depth d_plane = 300 cm.
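The ray casting step can be sketched in a few lines. The function name, the coordinate convention (target plane at z = d_plane, units in centimeters), and the error handling are assumptions for illustration.

```python
def gaze_point_on_plane(pupil_center, gaze_direction, plane_depth=300.0):
    """Intersect the gaze ray with the plane z = plane_depth (in cm).

    pupil_center and gaze_direction are (x, y, z) tuples in the same
    coordinate frame; the frame itself is an assumption of this sketch.
    """
    px, py, pz = pupil_center
    vx, vy, vz = gaze_direction
    if abs(vz) < 1e-9:
        raise ValueError("gaze direction is parallel to the target plane")
    # t = (d_plane - p_z) / v_z
    t = (plane_depth - pz) / vz
    if t < 0:
        raise ValueError("target plane lies behind the gaze ray")
    return (px + t * vx, py + t * vy, pz + t * vz)
```

For example, a pupil at the origin looking along (0.5, -0.25, 1.0) gives t = 300 and a gaze point of (150.0, -75.0, 300.0).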
From mobile devices to cognitive analysis systems
Lightweight architectures designed for edge deployment with minimal latency and maximum accuracy.
Analyze iris patterns, pupil dynamics, and gaze behavior for cognitive load estimation.
True 3D reconstruction of eye anatomy with accurate depth perception and multi-view consistency.
Optimized inference pipelines for real-time applications in VR, AR, and interactive systems.
Models trained on state-of-the-art datasets including GazeGene, ETH-XGaze, and ARGaze.
Flexible architecture supporting multiple backbones and easy integration into existing pipelines.
Built on cutting-edge computer vision and deep learning research
EyeFlame combines multiple breakthrough techniques to achieve superior performance: