This paper proposes a research framework to understand the implicit biases governing symmetry learning in transformers, using a multi-level probing methodology.
Key findings
Transformers exhibit an inductive bias toward permutation-symmetric functions, particularly over sequence inputs.
Initial weights significantly influence the inductive biases of transformer architectures.
Current understanding of how transformers represent symmetry remains limited, and the representations themselves are opaque.
The proposed SymProbe framework combines sparse autoencoders, circuit tracing, and controlled interventions to analyze symmetry biases.
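One concrete symmetry underlying these findings: self-attention without positional encodings is permutation-equivariant, so permuting the input tokens permutes the outputs identically. A minimal NumPy sketch of the kind of controlled check such a probe might run (toy single-head attention with random weights; the function names are illustrative, not from the paper):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    # Single-head self-attention, no positional encodings.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    A = softmax(Q @ K.T / np.sqrt(K.shape[-1]))
    return A @ V

rng = np.random.default_rng(0)
n_tokens, d = 5, 8
X = rng.normal(size=(n_tokens, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))

perm = rng.permutation(n_tokens)
out = self_attention(X, Wq, Wk, Wv)
out_perm = self_attention(X[perm], Wq, Wk, Wv)

# Equivariance check: permuting inputs permutes outputs the same way.
equivariant = np.allclose(out[perm], out_perm)
print(equivariant)
```

Positional encodings break this exact equivariance, which is one reason the degree of symmetry bias depends on architectural choices.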
Limitations & open questions
Limited mechanistic understanding of how symmetry representations emerge in transformers.
Existing probing methodologies are insufficient for detecting symmetry-related representations.
The relationship between architectural design choices and symmetry learning biases remains uncertain.