This paper introduces RT-BRIDGE, a framework that extends Bridge-Prompt to real-time broadcast analysis for live coaching feedback. It addresses the challenge of modeling ordinal relationships between actions in sports broadcasts when the video arrives as a stream and inference must respect a latency constraint. The framework has three components: a streaming prompt encoder, a latency-aware fusion module, and a coaching feedback generator. It is evaluated on sports datasets for both detection performance and coaching utility.
Key findings
RT-BRIDGE extends prompt-based learning to streaming video analysis for real-time coaching feedback.
It introduces a streaming prompt encoder that processes video chunks incrementally.
It includes a latency-aware fusion module that balances accuracy against response time.
It develops a coaching feedback generator that turns action predictions into actionable insights.
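
The summary gives no implementation details for these components, so the following is only a minimal sketch of the two general ideas named above: incremental processing of video chunks and a latency-aware choice between a cheaper and a costlier inference path. Every name here (`chunk_stream`, `StreamingState`, `choose_path`), the running-average state, and the budget rule are illustrative assumptions, not the paper's method.

```python
from typing import Iterable, Iterator, List, Optional, Sequence

Frame = Sequence[float]  # assumption: each frame is already a feature vector


def chunk_stream(frames: Iterable[Frame], chunk_size: int) -> Iterator[List[Frame]]:
    """Group an incoming frame stream into fixed-size chunks.

    The final chunk may be shorter if the stream ends mid-chunk.
    """
    buf: List[Frame] = []
    for frame in frames:
        buf.append(frame)
        if len(buf) == chunk_size:
            yield buf
            buf = []
    if buf:
        yield buf


def mean_feature(chunk: List[Frame]) -> List[float]:
    """Average the frame features of one chunk, dimension by dimension."""
    dim = len(chunk[0])
    return [sum(f[i] for f in chunk) / len(chunk) for i in range(dim)]


class StreamingState:
    """Hypothetical running summary: an exponential moving average of chunk features."""

    def __init__(self, alpha: float = 0.5) -> None:
        self.alpha = alpha
        self.state: Optional[List[float]] = None

    def update(self, chunk_feature: List[float]) -> List[float]:
        # The first chunk initializes the state; later chunks are blended in.
        if self.state is None:
            self.state = list(chunk_feature)
        else:
            self.state = [
                self.alpha * x + (1 - self.alpha) * s
                for x, s in zip(chunk_feature, self.state)
            ]
        return self.state


def choose_path(elapsed_ms: float, budget_ms: float, slow_cost_ms: float) -> str:
    """Pick the costlier path only if it still fits in the latency budget."""
    return "accurate" if elapsed_ms + slow_cost_ms <= budget_ms else "fast"
```

In such a setup, each chunk yielded by `chunk_stream` would be reduced with `mean_feature`, folded into `StreamingState`, and passed to whichever path `choose_path` selects for the time remaining. A real system would replace the averaging with a learned encoder and measure the costs rather than assume them.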
Limitations & open questions
The paper does not discuss specific limitations, though it mentions identifying implementation constraints and failure modes in live coaching environments.