The paper proposes AeroKV, an adaptive KV-cache compression framework for streaming aerial navigation under memory constraints. It targets the memory cost of storing the KV cache during streaming inference with large vision-language models on UAVs. AeroKV adjusts its compression strategy at run time according to flight phase, environmental complexity, and which information is critical to the task, and it reports up to 8x KV cache compression with minimal degradation in navigation accuracy.
Key findings
AeroKV achieves up to 8x KV cache compression with minimal navigation accuracy degradation.
The framework dynamically adjusts compression strategies based on flight phase and environmental complexity.
AeroKV preserves navigation-critical tokens while aggressively compressing redundant visual features.
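The findings above can be illustrated with a minimal sketch of phase-aware token selection. It is not the paper's algorithm, which this summary does not specify. The phase names, the keep ratios in KEEP_RATIO, the linear complexity adjustment, and the function select_tokens are all assumptions made for illustration. The only figure taken from the summary is the 8x upper bound, used here as the most aggressive setting.

```python
# Hypothetical keep ratios per flight phase (assumed, not from the paper).
# 1/8 corresponds to the "up to 8x" compression mentioned in the summary.
KEEP_RATIO = {"cruise": 1 / 8, "takeoff": 1 / 4, "landing": 1 / 2}


def select_tokens(scores, critical, phase, complexity=0.0):
    """Return the sorted indices of KV-cache entries to keep.

    scores     -- per-token importance scores (higher means more important)
    critical   -- indices of navigation-critical tokens that are always kept
    phase      -- a key of KEEP_RATIO
    complexity -- assumed scene-complexity estimate in [0, 1]; 1 disables compression
    """
    n = len(scores)
    base = KEEP_RATIO[phase]
    # Assumed rule: interpolate linearly from the phase ratio toward 1.0.
    ratio = base + (1.0 - base) * complexity
    budget = max(1, round(n * ratio))

    # Critical tokens are protected, even if they alone exceed the budget.
    keep = {i for i in critical if 0 <= i < n}

    # Spend whatever budget is left on the highest-scoring remaining tokens.
    rest = sorted(
        (i for i in range(n) if i not in keep),
        key=lambda i: scores[i],
        reverse=True,
    )
    keep.update(rest[: max(0, budget - len(keep))])
    return sorted(keep)


scores = [i / 16 for i in range(16)]
cruise_kept = select_tokens(scores, critical={0}, phase="cruise")
landing_kept = select_tokens(scores, critical={0}, phase="landing")
complex_kept = select_tokens(scores, critical={0}, phase="cruise", complexity=1.0)
```

In this sketch, cruise keeps 2 of 16 entries (the protected token plus the single highest-scoring one), landing keeps 8, and a complexity of 1.0 keeps all 16. A real system would evict the dropped indices from the key and value tensors of each layer, and would derive the scores from attention statistics or a learned estimator rather than taking them as given.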
Limitations & open questions
The framework's performance in highly dynamic or previously unseen environments has not yet been fully evaluated.
The long-term stability of the compression algorithms under varying operational conditions needs further testing.