ABSTRACT
This paper presents a comprehensive methodological framework for analyzing inter-rater reliability in expert-based compliance rule classification systems. It addresses challenges such as multi-label classification, missing data, and hierarchical category structures, and offers actionable guidance on reliability thresholds in high-stakes compliance contexts.
Key findings
Develops a multi-dimensional reliability assessment protocol tailored to compliance rule annotation.
Synthesizes statistical agreement metrics including Cohen's κ, Fleiss' κ, Krippendorff's α, and ICC.
Proposes a validation protocol involving domain experts across multiple regulatory domains.
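To make the agreement metrics concrete, here is a minimal sketch of Cohen's κ for two raters on nominal labels. The rater names and labels are invented for illustration; the paper's own protocol covers further metrics (Fleiss' κ, Krippendorff's α, ICC) for more than two raters, missing data, and non-nominal scales.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labeling the same items with nominal categories."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    # Observed agreement: fraction of items both raters labeled identically.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected chance agreement from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    labels = set(freq_a) | set(freq_b)
    p_e = sum((freq_a[lab] / n) * (freq_b[lab] / n) for lab in labels)
    # Kappa corrects observed agreement for agreement expected by chance.
    return (p_o - p_e) / (1 - p_e)

# Hypothetical example: two annotators classifying six compliance rules.
a = ["mandatory", "mandatory", "optional", "optional", "mandatory", "optional"]
b = ["mandatory", "optional", "optional", "optional", "mandatory", "optional"]
print(round(cohens_kappa(a, b), 3))  # → 0.667
```

Values near 1 indicate near-perfect agreement beyond chance; in high-stakes compliance annotation, thresholds such as κ ≥ 0.8 are commonly demanded before treating labels as reliable.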
Limitations & open questions
The framework's effectiveness has yet to be systematically evaluated across diverse compliance contexts.