Temporal Credit Assignment with Hierarchical Progress Mil...

ABSTRACT

This research proposes Hierarchical Progress Milestone Networks (HPMN) to address temporal credit assignment in reinforcement learning, particularly in long-horizon tasks with sparse rewards. HPMN structures credit assignment around learnable progress milestones, combining hierarchical value decomposition with explicit progress estimation so that credit propagates effectively across long delays.

Key findings

HPMN introduces hierarchical progress decomposition, dense progress signals, and multi-scale credit propagation.
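To make the idea of a dense progress signal concrete, here is a minimal sketch that turns a sparse reward sequence into a dense one using per-state progress estimates. It uses the standard potential-based shaping form, r + gamma * phi(s') - phi(s), as a stand-in; the summary does not state how HPMN actually computes or applies its progress signal. The function name `shaped_rewards` and the `progress` values are hypothetical.

```python
def shaped_rewards(rewards, progress, gamma=0.99):
    """Densify sparse rewards with a progress-based shaping term.

    rewards:  list of T environment rewards r_0 .. r_{T-1}
    progress: list of T+1 progress estimates phi(s_0) .. phi(s_T),
              assumed here to come from some learned estimator
              (hypothetical; not the paper's specified model)
    Returns a list of T shaped rewards r_t + gamma * phi(s_{t+1}) - phi(s_t).
    """
    if len(progress) != len(rewards) + 1:
        raise ValueError("progress needs one more entry than rewards")
    return [
        r + gamma * progress[t + 1] - progress[t]
        for t, r in enumerate(rewards)
    ]


# A sparse-reward episode: the only reward arrives at the final step.
sparse = [0.0, 0.0, 1.0]
# Illustrative progress estimates rising toward task completion.
phi = [0.0, 0.5, 1.0, 1.0]
dense = shaped_rewards(sparse, phi, gamma=1.0)
```

In this toy episode the first two steps, which receive no environment reward, each get a positive signal because the progress estimate increased.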

The method provides convergence guarantees and is evaluated on robotic manipulation, locomotion, and navigation benchmarks.

HPMN addresses limitations of existing methods, including the bias-variance tradeoff in return estimation and inefficient exploration in sparse-reward settings.
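The bias-variance tradeoff in return estimation can be illustrated with the textbook n-step return, which is general reinforcement learning background and not HPMN's estimator. A small n leans on the bootstrapped value estimate (lower variance, more bias); a large n leans on sampled rewards (less bias, more variance). The function name and the example numbers below are assumptions chosen for illustration.

```python
def n_step_return(rewards, values, t, n, gamma=0.99):
    """Textbook n-step return from time t.

    rewards: list of T rewards r_0 .. r_{T-1}
    values:  list of T+1 value estimates V(s_0) .. V(s_T),
             with V(s_T) = 0 for a terminal state
    Sums up to n discounted rewards, then bootstraps from the value
    estimate at the state reached (truncated at the episode end).
    """
    T = len(rewards)
    if len(values) != T + 1:
        raise ValueError("values needs one more entry than rewards")
    end = min(t + n, T)
    g = 0.0
    for k in range(t, end):
        g += gamma ** (k - t) * rewards[k]
    # Bootstrap from the critic's estimate at the cut-off state.
    g += gamma ** (end - t) * values[end]
    return g


rewards = [0.0, 0.0, 1.0]          # sparse reward at the final step
values = [0.2, 0.4, 0.8, 0.0]      # illustrative, imperfect value estimates
one_step = n_step_return(rewards, values, t=0, n=1, gamma=1.0)
full = n_step_return(rewards, values, t=0, n=3, gamma=1.0)
```

With these numbers the one-step target depends entirely on the value estimate at the next state, while the three-step target equals the observed return of the episode.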

Limitations & open questions

The paper does not discuss the computational complexity of HPMN or its scalability to very large tasks.

Temporal Credit Assignment with Hierarchical Progress Milestones
