Mathematical proof generation is a challenge for large language models, requiring logical reasoning and knowledge integration. GenCat models proofs as compositional structures in category theory and employs stepwise knowledge tracing for reasoning coherence. It represents proof steps as morphisms, enabling formal verification through commutative diagrams, and tracks concept mastery to identify error propagation points.
Key findings
GenCat models mathematical proofs as morphisms in a structured category for formal verification.
Stepwise knowledge tracing monitors proof construction, tracking concept mastery and detecting logical inconsistencies.
Process-supervision with process reward models enhances proof generation fidelity.
Extensive experiments show significant improvements over baseline models in proof completion accuracy.
Limitations & open questions
The approach may face scalability challenges for extremely complex proofs.
The integration of process reward models requires careful calibration to avoid overfitting.