This research proposes a framework to maintain tight cardinality bounds on recursive Datalog query results under high-velocity streaming updates. The key insight is that size bounds can be maintained incrementally using a semiring-structured provenance algebra, enabling constant-time bound updates per tuple insertion or deletion.
Key findings
StreamBound algorithm propagates bound constraints through Datalog rule dependency graph using differential dataflow principles.
Maintains upper bounds on intermediate relation sizes with O(1) amortized update cost.
Supports both monotone and non-monotone recursion, and provides probabilistic confidence intervals.
Limitations & open questions
The framework's performance under varying data distributions and streaming velocities needs further evaluation.