This paper proposes TrajMamba-3D, an architecture for pedestrian trajectory prediction that extends ego-motion-guided prediction with monocular depth awareness. It uses the Mamba state space model for efficient sequence modeling, integrates monocular depth estimation for 3D spatial awareness, and compensates for ego-motion. Its computational cost is designed to scale linearly with sequence length, and evaluation is planned on the ETH-UCY and PIE benchmarks.
Key findings
Proposes TrajMamba-3D, a three-stream architecture for pedestrian trajectory prediction.
Integrates Mamba-based temporal encoding, monocular depth-aware spatial modeling, and ego-motion compensation.
Is designed for computational complexity that is linear in sequence length, avoiding the quadratic cost of self-attention in transformer-based approaches.
Plans comprehensive evaluation including benchmarks, ablation studies, and comparison with state-of-the-art methods.
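To illustrate the linear-complexity point above, here is a minimal sketch of a state space recurrence in plain Python. It is not the TrajMamba-3D encoder: the function names `ssm_scan` and `pairwise_steps` and the fixed scalar parameters `a`, `b`, `c` are assumptions made for illustration, whereas Mamba uses learned, input-dependent parameters over vector-valued states. The sketch only shows that a recurrent scan touches each time step once, while full self-attention scores every pair of time steps.

```python
def ssm_scan(xs, a=0.9, b=1.0, c=1.0):
    """Run a scalar linear state space recurrence over a sequence.

    h_t = a * h_{t-1} + b * x_t   (state update)
    y_t = c * h_t                 (readout)

    The parameters a, b, c are fixed here for illustration only;
    a selective SSM such as Mamba computes them from the input.
    """
    h = 0.0
    ys = []
    steps = 0
    for x in xs:
        h = a * h + b * x      # one state update per time step
        ys.append(c * h)
        steps += 1
    return ys, steps


def pairwise_steps(n):
    """Number of query-key pairs scored by full self-attention on n tokens."""
    return n * n


if __name__ == "__main__":
    # Compare how the work grows as the observed sequence gets longer.
    for n in (8, 16, 32):
        _, steps = ssm_scan([1.0] * n)
        print(n, steps, pairwise_steps(n))
```

Doubling the sequence length doubles the number of recurrence steps but quadruples the number of attention pairs, which is the gap the proposal aims to exploit.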
Limitations & open questions
The paper is a research proposal: experimental validation has not yet been conducted, and no results are reported.