This research introduces DepthCal, a novel framework that dynamically adjusts attention weights at different transformer depths to improve instruction following in large language models, accounting for representational maturity, attention-sink phenomena, and instruction saliency.
Key findings
Attention mechanisms exhibit distinct behaviors at different transformer depths; the diagnostic sketch after this list shows one way to quantify this.
DepthCal introduces depth-specific calibration functions for improved instruction adherence; candidate functional forms are sketched below.
The framework dynamically adjusts calibration based on real-time attention distributions; see the adjustment sketch following this list.
Comprehensive experimental validation across benchmarks and safety scenarios is planned.
Limitations & open questions
The effectiveness of DepthCal on non-instruction-following tasks remains to be evaluated.
Optimal calibration functions for different instruction types need further investigation.