This paper proposes LightDecomp, a specialized lightweight agent for real-time query decomposition in retrieval-augmented generation systems. Its modular architecture has three components: a Query Intent Analyzer, a Decomposition Planner, and a Sub-query Generator. Training starts with synthetic data generation and pre-training with structured knowledge distillation, followed by reinforcement learning from task-specific feedback. The target is sub-100ms latency on consumer hardware with decomposition accuracy within 5% of GPT-4.
Key findings
LightDecomp is a specialized lightweight agent for real-time query decomposition.
It uses a modular architecture with three specialized components: Query Intent Analyzer, Decomposition Planner, and Sub-query Generator.
The training paradigm combines structured knowledge distillation with reinforcement learning from decomposition feedback.
The target is sub-100ms latency on consumer hardware while keeping decomposition accuracy within 5% of GPT-4's.
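To make the three-stage flow concrete, the following is a minimal Python sketch of how an intent analyzer, a planner, and a sub-query generator might be chained and timed against a latency budget. It is an illustration under stated assumptions, not the paper's implementation: every function name, the Plan type, and the keyword-based splitting rule are hypothetical stand-ins for the learned components, which the summary does not specify.

```python
import re
import time
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical rule: treat the word "and" as a boundary between sub-questions.
# A real system would use learned models here; this is only a placeholder.
_SPLIT = re.compile(r"\s+and\s+", re.IGNORECASE)


@dataclass
class Plan:
    intent: str        # label produced by the intent analyzer
    parts: List[str]   # text fragments to turn into sub-queries


def analyze_intent(query: str) -> str:
    """Stand-in for the Query Intent Analyzer."""
    return "multi_part" if _SPLIT.search(query) else "single"


def plan_decomposition(query: str, intent: str) -> Plan:
    """Stand-in for the Decomposition Planner: split only multi-part queries."""
    if intent == "single":
        return Plan(intent, [query.strip()])
    parts = [p.strip(" ?.") for p in _SPLIT.split(query)]
    return Plan(intent, [p for p in parts if p])


def generate_subqueries(plan: Plan) -> List[str]:
    """Stand-in for the Sub-query Generator: emit each part as a question."""
    return [p if p.endswith("?") else p + "?" for p in plan.parts]


def decompose(query: str) -> List[str]:
    """Run the three stages in sequence."""
    intent = analyze_intent(query)
    plan = plan_decomposition(query, intent)
    return generate_subqueries(plan)


def timed_decompose(query: str) -> Tuple[List[str], float]:
    """Return the sub-queries and the wall-clock latency in milliseconds."""
    start = time.perf_counter()
    subqueries = decompose(query)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return subqueries, elapsed_ms


if __name__ == "__main__":
    subs, ms = timed_decompose("Who founded Tesla and when was SpaceX founded?")
    print(subs, f"{ms:.3f} ms", "within budget" if ms < 100.0 else "over budget")
```

Keeping each stage behind its own function means a rule-based placeholder can later be swapped for a trained model without changing the pipeline, and the same timing wrapper can then be used to compare measured latency with the 100 ms target.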
Limitations & open questions
The paper is a research proposal and does not yet report experimental results.
LightDecomp's effectiveness, including its latency and accuracy targets, still needs to be validated through comprehensive experiments on real-world datasets.