Live coverage
Covenant keeps default CI deterministic while tracking which surfaces have opt-in live coverage. Live tests are Rust tests named live_* and marked with #[ignore].
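As a minimal sketch of the naming and `#[ignore]` convention described above, the following shows a deterministic test next to an opt-in live test. The helper `parse_major` and both test names are hypothetical and exist only for illustration; they are not taken from the Covenant codebase.

```rust
/// Hypothetical helper under test: extracts the major version from "X.Y.Z".
fn parse_major(version: &str) -> Option<u32> {
    version.split('.').next()?.trim().parse().ok()
}

// Deterministic test: no external dependencies, runs on every `cargo test`.
#[test]
fn parses_major_version() {
    assert_eq!(parse_major("1.4.2"), Some(1));
}

// Live test: the `live_` prefix plus `#[ignore]` keeps it out of default runs.
// It executes only when ignored tests are requested and the name filter
// matches, e.g. `cargo test -- --ignored live_`.
#[test]
#[ignore = "live: spawns a real binary"]
fn live_example_reads_version_from_real_binary() {
    // Assumption for the sketch: `cargo` is on PATH in the live environment.
    let output = std::process::Command::new("cargo")
        .arg("--version")
        .output()
        .expect("cargo should be on PATH");
    let stdout = String::from_utf8_lossy(&output.stdout);
    // The output typically looks like "cargo X.Y.Z (...)"; take the second word.
    let version = stdout.split_whitespace().nth(1).expect("version field");
    assert!(parse_major(version).is_some());
}
```

Because `cargo test` skips `#[ignore]` tests unless `--ignored` is passed, the default run stays deterministic while the live path remains one flag away.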
Commands
node agent-os/scripts/validate-live-coverage.mjs
node agent-os/scripts/model-availability.mjs
bash agent-os/scripts/test-stats.sh
cd agent-os
cargo test --workspace --exclude covenant-settlement-program -- --ignored live_
# Before targeted live CLI tests:
cargo build -p covenant --locked
cargo test -p covenantd --test live_cli_version -- --ignored live_cli_version_reads_protocol_info_without_token
# Linux gVisor runtime validation:
COVENANT_LIVE_GVISOR_ROOTFS=/path/to/rootfs \
cargo test -p covenant-runtime --test live_gvisor -- --ignored live_gvisor_runner_dispatches_with_runsc
Linux gVisor coverage requires a Linux host with runsc and a rootfs containing /bin/sh. The repeatable setup lives in Linux gVisor runner.
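The gVisor preconditions listed above (Linux host, `runsc` available, rootfs with `/bin/sh`) could be checked before the live test does any real work. The sketch below is one possible preflight, not Covenant's actual implementation: the functions `rootfs_has_shell`, `find_on_path`, and `gvisor_preflight` are hypothetical names, and looking up `runsc` on `PATH` is an assumption about how the binary is located.

```rust
use std::env;
use std::ffi::OsStr;
use std::path::{Path, PathBuf};

/// Returns true when `rootfs` contains an entry at `bin/sh`.
/// `symlink_metadata` is used so a symlinked shell is accepted without
/// resolving its target against the host filesystem.
fn rootfs_has_shell(rootfs: &Path) -> bool {
    rootfs.join("bin").join("sh").symlink_metadata().is_ok()
}

/// Searches each directory of a PATH-style value for a file named `binary`.
fn find_on_path(binary: &str, path_var: &OsStr) -> Option<PathBuf> {
    env::split_paths(path_var)
        .map(|dir| dir.join(binary))
        .find(|candidate| candidate.is_file())
}

/// Collects every unmet precondition so they can be reported together.
fn gvisor_preflight() -> Vec<String> {
    let mut problems = Vec::new();
    if !cfg!(target_os = "linux") {
        problems.push("host is not Linux".to_string());
    }
    match env::var_os("COVENANT_LIVE_GVISOR_ROOTFS") {
        None => problems.push("COVENANT_LIVE_GVISOR_ROOTFS is not set".to_string()),
        Some(root) if !rootfs_has_shell(Path::new(&root)) => {
            problems.push(format!("{} has no bin/sh", Path::new(&root).display()));
        }
        Some(_) => {}
    }
    // Assumption: runsc is expected to be discoverable through PATH.
    let path = env::var_os("PATH").unwrap_or_default();
    if find_on_path("runsc", &path).is_none() {
        problems.push("runsc not found on PATH".to_string());
    }
    problems
}

// Hypothetical live test that fails fast with a readable list of gaps.
#[test]
#[ignore = "live: needs Linux, runsc, and a rootfs"]
fn live_gvisor_preflight_sketch() {
    let problems = gvisor_preflight();
    assert!(problems.is_empty(), "gVisor preconditions not met: {problems:?}");
}
```

Reporting all missing preconditions at once, rather than stopping at the first, tends to shorten host setup because a single run shows everything that still needs fixing.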
Matrix
| Surface | Status | Next gap |
|---|---|---|
| Daemon IPC core | covered | daemon IPC plus CLI intent/resume/version |
| State verifier | covered | typed repair hints |
| Memory retention | covered | record-to-receipt correlation |
| HTTP gateway | covered | audit and capabilities purge mutations |
| CLI capability lifecycle | covered | capability purge after retention defaults |
| CLI audit feed | covered | audit query filters after predicate support |
| Ignore policy gate | covered | scoped ignore override policy |
| Peer authentication | covered | forced self-revoke recovery fixture |
| Peer listing | covered | ambiguous-prefix listing |
| A2A mailbox | covered | per-peer repair visibility |
| MCP subprocess | covered | third-party fixture |
| Runtime subprocess | covered | daemon dispatch failure receipts |
| Linux gVisor runtime | external service | documented Linux runsc runner |
| Budget enforcement | covered | resume success after pause/resume policy |
| Settlement receipts | covered | scoped receipt filters |
| Local model | external service | pin model set before more coverage |
Related
- System architecture — the surfaces under test.
- Security model — why real boundary tests matter.
- Linux gVisor runner — host setup for the sandbox live path.
- Provenance — evidence attached to committed autonomous work.