ABSTRACT
This paper establishes theoretical guarantees on the generalization error of retrieval-augmented classifiers operating with noisy sources: it develops a unified framework for modeling retrieval noise and derives PAC-Bayesian and information-theoretic bounds on excess risk.
Key findings
Retrieval-augmented classifiers can achieve consistent learning under proper regularization, even when some retrieved examples are corrupted.
The bounds quantify how retrieval accuracy and noise rates jointly determine classifier performance.
Theoretical guidance is provided for designing robust retrieval-augmented systems.
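As context for the findings above, a generic PAC-Bayesian generalization bound has the following shape; this is the standard McAllester-style form, not the paper's exact result, and the symbols (prior \(P\), posterior \(Q\), sample size \(n\), confidence \(\delta\)) are illustrative assumptions:

\[
R(Q) \;\le\; \hat{R}(Q) \;+\; \sqrt{\frac{\mathrm{KL}(Q \,\|\, P) + \ln\frac{2\sqrt{n}}{\delta}}{2n}}
\quad \text{with probability at least } 1-\delta,
\]

where \(R(Q)\) is the expected risk and \(\hat{R}(Q)\) the empirical risk of the posterior \(Q\). Per the abstract, the paper's bounds extend this type of guarantee with additional terms capturing retrieval accuracy and the noise rate of the retrieved sources.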
Limitations & open questions
The analysis assumes bounded noise conditions and may not extend to other noise regimes.
Practical implementation details are discussed but not fully explored.