This research proposal introduces Lang-Explore, a framework for active robot exploration that leverages natural language instructions to guide autonomous navigation in unknown environments. The method integrates Vision-Language Models with information-theoretic exploration through a Language-Guided Information Gain formulation that scores frontiers based on semantic relevance. The system maintains dense semantic maps by fusing CLIP-based visual features with geometric occupancy, enabling open-vocabulary spatial queries without pre-trained environment models.
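The semantic-mapping idea above can be sketched concretely. The code below is a minimal illustration, not the proposal's implementation: the `SemanticMap` class, its running-mean fusion rule, and the cosine-similarity query are all assumptions about how per-voxel CLIP features might be fused with occupancy and queried with an encoded text prompt.

```python
import numpy as np

class SemanticMap:
    """Minimal sketch (assumed interface): fuse per-voxel visual features
    with occupancy, then answer open-vocabulary queries by cosine
    similarity against a text embedding."""

    def __init__(self):
        self.features = {}   # voxel index -> running-mean visual feature
        self.counts = {}     # voxel index -> number of fused observations
        self.occupied = set()

    def fuse(self, voxel, feature, occupied=True):
        # Incremental running mean keeps memory constant per voxel.
        n = self.counts.get(voxel, 0)
        prev = self.features.get(voxel, np.zeros_like(feature))
        self.features[voxel] = (prev * n + feature) / (n + 1)
        self.counts[voxel] = n + 1
        if occupied:
            self.occupied.add(voxel)

    def query(self, text_embedding, top_k=1):
        # Rank voxels by cosine similarity to the encoded language query.
        t = text_embedding / (np.linalg.norm(text_embedding) + 1e-8)
        scored = []
        for v, f in self.features.items():
            f = f / (np.linalg.norm(f) + 1e-8)
            scored.append((float(f @ t), v))
        scored.sort(reverse=True)
        return [v for _, v in scored[:top_k]]
```

In a real system the features would come from a CLIP image encoder and the query vector from the matching text encoder; here both are stand-in vectors to show the data flow.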
Key findings
A Language-Guided Information Gain (LGIG) formulation scores exploration frontiers based on their semantic relevance to natural language instructions.
Integration of Vision-Language Models with information-theoretic exploration enables efficient task-relevant mapping in unknown environments.
Dense semantic mapping fuses CLIP visual features with 3D geometric occupancy to support open-vocabulary spatial queries.
The framework eliminates the dependency on pre-trained environment models and task-specific navigation graphs for goal-directed exploration.
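One plausible reading of the LGIG idea is a frontier score that blends geometric information gain with language-conditioned semantic relevance. The sketch below is an assumption, not the proposal's formula: `lgig_score`, the per-cell `occupancy_entropy` map, and the mixing weight `alpha` are all hypothetical names introduced for illustration.

```python
import numpy as np

def lgig_score(frontier_cells, feature_map, text_embedding,
               occupancy_entropy, alpha=0.5):
    """Score a frontier as a convex combination of geometric information
    gain (mean occupancy entropy of its cells) and semantic relevance
    (cosine similarity between fused visual features and the
    instruction's text embedding)."""
    # Geometric term: average entropy of the unknown cells at the frontier.
    geo_gain = float(np.mean([occupancy_entropy[c] for c in frontier_cells]))

    # Semantic term: cosine similarity between the frontier's mean
    # visual feature and the instruction embedding.
    feats = np.stack([feature_map[c] for c in frontier_cells])
    mean_feat = feats.mean(axis=0)
    mean_feat = mean_feat / (np.linalg.norm(mean_feat) + 1e-8)
    text = text_embedding / (np.linalg.norm(text_embedding) + 1e-8)
    semantic = float(mean_feat @ text)

    return alpha * geo_gain + (1.0 - alpha) * semantic
```

Under this scoring, a frontier whose fused features align with the instruction outranks an equally unexplored but task-irrelevant one, which is the behavior the findings above describe.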
Limitations & open questions
Traditional frontier-based exploration treats all unknown regions equally, potentially wasting effort on task-irrelevant areas.
Existing Vision-Language Navigation methods typically require pre-computed navigation graphs, limiting their deployment in truly unknown environments.
Current approaches often rely on passive perception rather than actively seeking viewpoints that maximize information gain for language understanding.
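The first limitation above can be made concrete: a classic frontier-based planner scores frontiers purely geometrically, so two frontiers with equal uncertainty score identically regardless of task relevance. The function name below is hypothetical, introduced only to illustrate the baseline that language-guided scoring improves on.

```python
import numpy as np

def geometric_frontier_score(frontier_cells, occupancy_entropy):
    """Classic frontier scoring: purely geometric information gain.
    With no notion of the task or the language instruction, all unknown
    regions of equal entropy are treated as equally worth exploring."""
    return float(np.mean([occupancy_entropy[c] for c in frontier_cells]))
```

A frontier next to the instructed target and one in an irrelevant corner of the map receive the same score whenever their occupancy entropies match, which is exactly the wasted effort this proposal targets.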