This paper benchmarks Large Language Models on zero-shot citation extraction across multiple scholarly platforms, languages, and citation conventions, identifying systematic failure modes and proposing prompting strategies for robust multi-style parsing.
Key findings
LLMs like GPT-4 and Claude show potential for parsing non-English citation styles.
Baseline parsers struggle with compound surnames, German compound nouns, and archival citations.
Systematic failure modes include mis-segmenting compound surnames and mishandling multilingual content.
Prompting strategies can improve robustness across citation conventions (a minimal prompt sketch follows this list).
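To make the prompting idea concrete, here is a minimal sketch of a zero-shot extraction prompt that asks a model to return citation fields as JSON and explicitly warns it about compound surnames. The template wording, the field schema, and the `call`-free design (the model reply is mocked) are illustrative assumptions, not the paper's actual prompts.

```python
import json

# Hypothetical zero-shot prompt template; the paper's actual prompts may differ.
PROMPT_TEMPLATE = """Extract the bibliographic fields from the reference below.
Return ONLY a JSON object with the keys: authors (list of "Family, Given"
strings), year, title, container, pages. Use null for missing fields.
Do not split compound surnames (e.g. "García Márquez" is one family name).

Reference: {reference}"""

def build_prompt(reference: str) -> str:
    """Fill the zero-shot template with a single raw reference string."""
    return PROMPT_TEMPLATE.format(reference=reference)

def parse_response(raw: str) -> dict:
    """Parse the model's reply, tolerating stray text around the JSON object."""
    start, end = raw.find("{"), raw.rfind("}") + 1
    return json.loads(raw[start:end])

if __name__ == "__main__":
    ref = ("García Márquez, G. (1985). El amor en los tiempos del cólera. "
           "Editorial Oveja Negra.")
    print(build_prompt(ref))
    # A well-behaved model reply might look like this (mocked here):
    reply = ('{"authors": ["García Márquez, Gabriel"], "year": 1985, '
             '"title": "El amor en los tiempos del cólera", '
             '"container": "Editorial Oveja Negra", "pages": null}')
    print(parse_response(reply)["authors"])
```

Pinning the output to a fixed JSON schema is one simple way such a strategy can normalize behavior across citation conventions, since the model is steered toward field names rather than style-specific punctuation.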
Limitations & open questions
Evaluation is limited to three datasets, which may not capture the full range of citation style variation.
LLMs may struggle with precise field boundaries and formatting consistency (a boundary-checking sketch follows).
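One cheap way to surface the field-boundary issue is to check extracted values back against the source string. The sketch below assumes fields should appear verbatim in the reference, which real citation styles only approximately satisfy; the helper name and example data are hypothetical.

```python
def check_field_boundaries(reference: str, fields: dict) -> dict:
    """Flag extracted field values that do not occur verbatim in the source
    reference, a rough proxy for boundary or formatting errors."""
    problems = {}
    for name, value in fields.items():
        values = value if isinstance(value, list) else [value]
        for v in values:
            if v is not None and str(v) not in reference:
                problems.setdefault(name, []).append(v)
    return problems

if __name__ == "__main__":
    ref = ("Müller-Lüdenscheidt, H. (2019). Archivgut und "
           "Verwaltungsgeschichte. Zeitschrift für Archivwesen, 12(3), 45-67.")
    extracted = {"authors": ["Müller, H."],  # surname truncated by the model
                 "year": "2019",
                 "title": "Archivgut und Verwaltungsgeschichte"}
    print(check_field_boundaries(ref, extracted))
    # -> {'authors': ['Müller, H.']}: the compound surname was split
```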