Legal eval
LABE measures whether legal AI workflows take justified high-impact actions and avoid unjustified ones in negotiation, compliance, signature routing, and orchestrated review flows.
The eval is based on public legal workflow classes and is open-sourced for inspection.
Headline result
Unjustified high-impact action points executed by baseline: 18
Unjustified high-impact action points executed with VerifiedX: 0
False blocks in the current legal suite: 0
Workflow completion after intervention: 41.7% -> 100%
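To make the three headline quantities concrete, here is a minimal sketch of how a scorecard of this shape could be tallied from per-scenario records. It is not taken from the LABE repository: the record fields, the `scorecard` function, and the toy data are all hypothetical, and the published methodology may weight or count things differently.

```python
from dataclasses import dataclass


@dataclass
class ScenarioResult:
    # All field names are hypothetical; they are not the eval's actual schema.
    action_justified: bool    # ground truth: was the high-impact action warranted?
    action_executed: bool     # did the system actually execute the action?
    impact_points: int        # weight assigned to the action
    workflow_completed: bool  # did the workflow reach a correct end state?


def scorecard(results):
    """Tally unjustified points executed, false blocks, and completion rate."""
    # Points for actions that were executed although they were not justified.
    unjustified_points = sum(
        r.impact_points
        for r in results
        if r.action_executed and not r.action_justified
    )
    # A false block is a justified action that was not executed.
    false_blocks = sum(
        1 for r in results if r.action_justified and not r.action_executed
    )
    # Share of scenarios whose workflow still finished correctly.
    completion_rate = (
        sum(r.workflow_completed for r in results) / len(results)
        if results
        else 0.0
    )
    return {
        "unjustified_points_executed": unjustified_points,
        "false_blocks": false_blocks,
        "completion_rate": completion_rate,
    }


# Toy records for illustration only; these are not results from the eval.
toy_results = [
    ScenarioResult(True, True, 2, True),     # justified and executed
    ScenarioResult(False, True, 3, False),   # unjustified but executed
    ScenarioResult(False, False, 3, True),   # unjustified and correctly blocked
    ScenarioResult(True, False, 1, False),   # justified but blocked (false block)
]

print(scorecard(toy_results))
```

On the toy records this yields 3 unjustified points executed, 1 false block, and a completion rate of 0.5. A system that blocks everything would drive the first number to zero while inflating the second, which is why the two are reported side by side.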
Negotiation
Accepting counterparty positions, applying redrafts, marking issues resolved, and routing to signature only when the workflow is actually ready.
Compliance
Marking agreements compliant, applying remediation markup, escalating failed checks, and blocking false clearance.
Composed systems
Intake agent to execution agent to upstream legal or compliance review, with the wrong action blocked and the workflow kept alive through the correct lane.
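The "blocked, but kept alive" behavior can be sketched as a gate that either executes a proposed action or hands it to a review lane instead of dropping it. This is an illustrative sketch under stated assumptions, not VerifiedX's implementation: `ProposedAction`, `Decision`, `gate`, and the lane name `legal_review` are hypothetical, and the `justified` flag stands in for whatever evidence check a real system performs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProposedAction:
    name: str
    justified: bool  # hypothetical stand-in for a real justification check


@dataclass
class Decision:
    executed: Optional[str]   # name of the action that ran, if any
    routed_to: Optional[str]  # review lane that received the work, if any


def gate(action: ProposedAction, review_lane: str = "legal_review") -> Decision:
    """Execute justified actions; reroute unjustified ones to a review lane."""
    if action.justified:
        return Decision(executed=action.name, routed_to=None)
    # Block the action, but keep the workflow moving through review.
    return Decision(executed=None, routed_to=review_lane)


blocked = gate(ProposedAction("route_to_signature", justified=False))
allowed = gate(ProposedAction("apply_redraft", justified=True))
print(blocked)
print(allowed)
```

The design point is that a block returns a destination rather than an error, so the scenario can still count as a completed workflow.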
The full eval lives on GitHub with the scorecard, scenario catalog, methodology, raw artifacts, and repro steps.