Guardrails
Autonomous loops need safety floors. Sentigent enforces data-driven YAML guardrail packs on every lap — before any step executes. Opt-in, versioned in git, reviewable by anyone on the team.
The problem
A loop that can drive 15 steps autonomously can also autonomously drop a production table, force-push to main, or write secrets to a public file — if nothing is in the way. Static rules hardcoded into the loop itself are brittle: they can't be updated without a code change, they're invisible to teammates, and they don't compose across teams.
Sentigent separates the safety policy from the loop engine. Guardrail packs are plain YAML files checked into your repo — readable, diffable, and enforced identically on every lap of every loop that loads the pack.
How it works
At the start of each lap the loop driver loads all active guardrail packs and evaluates the pending step against every rule. If a rule matches, its action determines what happens next:
- block: the step is refused; the loop records the violation and either halts the lap or moves to the next step, depending on the on_block loop config.
- approve: the step is paused; a human sign-off is required before execution continues.
- warn: the step proceeds, but the violation is logged to the receipt and flagged in the FAP output.
If no rule matches, the loop proceeds normally. Guardrail evaluation happens before the claude -p subprocess is spawned for that step, so the policy floor is inviolable.
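The evaluation described above can be sketched in a few lines of Python. This is a minimal illustration, not Sentigent's implementation: the Rule class and evaluate_step function are hypothetical names, the pattern is assumed to be searched anywhere in the command or path, and "most restrictive action wins" is an assumed tie-break for the case where several rules match.

```python
import re
from dataclasses import dataclass

# Assumption: when several rules match, the most restrictive action wins.
PRECEDENCE = {"block": 3, "approve": 2, "warn": 1}


@dataclass(frozen=True)
class Rule:
    id: str
    match: str   # regex, searched anywhere in the target (assumption)
    scope: str   # "bash", "write", or "any"
    action: str  # "block", "approve", or "warn"


def evaluate_step(step_type, target, rules):
    """Return (action, matched_rule_ids); action is None when nothing matches."""
    matched = [
        r for r in rules
        if r.scope in (step_type, "any") and re.search(r.match, target)
    ]
    if not matched:
        return None, []
    strongest = max(matched, key=lambda r: PRECEDENCE[r.action])
    return strongest.action, [r.id for r in matched]


rules = [
    Rule("irreversible-recursive-delete", r"rm -rf|rm -fr", "bash", "block"),
    Rule("secrets-write", r"\.env|credentials|secrets\.json|\.pem|\.key", "write", "warn"),
]

print(evaluate_step("bash", "rm -rf build/", rules))
print(evaluate_step("write", "config/.env", rules))
print(evaluate_step("bash", "ls -la", rules))
```

The key property is that the function runs on the pending step's text alone, so it can be called before anything is executed.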
Pack format
# Sentigent guardrail pack
# Versioned in git. Loaded per-lap by the loop driver.
# action: block | approve | warn
# severity: critical | high | medium | low
rules:
  - id: irreversible-recursive-delete
    description: "rm -rf / rm -fr — cannot be undone"
    match: "rm -rf|rm -fr"
    scope: bash
    action: block
    severity: critical

  - id: production-deploy
    description: "Deploy to prod — requires human sign-off"
    match: "kubectl apply|deploy --prod|fly deploy --remote-only"
    scope: bash
    action: approve
    severity: high

  - id: force-push
    description: "git push --force — rewrites shared history"
    match: "push --force|push -f"
    scope: bash
    action: block
    severity: critical

  - id: secrets-write
    description: "Writing to secrets / credentials files"
    match: "\\.env|credentials|secrets\\.json|\\.pem|\\.key"
    scope: write
    action: warn
    severity: high

  - id: drop-table
    description: "Destructive DDL on production database"
    match: "DROP TABLE|TRUNCATE TABLE|DROP DATABASE"
    scope: bash
    action: block
    severity: critical
Rule fields
| Field | Type | Description |
|---|---|---|
| id | string | Unique rule identifier. Used in violation logs and the FAP receipt. |
| description | string | Human-readable explanation shown when the rule fires. |
| match | regex | Pattern matched against the step's command or file path, depending on scope. |
| scope | bash \| write \| any | Which step types to evaluate. any matches all. |
| action | block \| approve \| warn | What happens when the rule matches. |
| severity | critical \| high \| medium \| low | Shown in the receipt and violation log. Does not change enforcement: block always blocks. |
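A pack can be sanity-checked against the field table before it is committed. The sketch below is one way to do that in Python on an already-parsed rule mapping. validate_rule is a hypothetical helper, and treating all six fields as required is an assumption; the source does not say which fields are optional.

```python
import re

# Assumption: every field in the table is required.
REQUIRED = ("id", "description", "match", "scope", "action", "severity")
ALLOWED = {
    "scope": {"bash", "write", "any"},
    "action": {"block", "approve", "warn"},
    "severity": {"critical", "high", "medium", "low"},
}


def validate_rule(rule):
    """Return a list of problems found in one rule mapping (empty if none)."""
    problems = [f"missing field: {name}" for name in REQUIRED if name not in rule]
    for name, allowed in ALLOWED.items():
        if name in rule and rule[name] not in allowed:
            problems.append(f"{name}: unexpected value {rule[name]!r}")
    if "match" in rule:
        try:
            re.compile(rule["match"])
        except (re.error, TypeError) as exc:
            problems.append(f"match: invalid regex ({exc})")
    return problems


good = {
    "id": "force-push",
    "description": "git push --force rewrites shared history",
    "match": "push --force|push -f",
    "scope": "bash",
    "action": "block",
    "severity": "critical",
}
print(validate_rule(good))                                # no problems
print(validate_rule(dict(good, action="deny", match="(")))  # two problems
```

Running a check like this in CI would catch a mistyped action or an unbalanced regex before a loop ever loads the pack.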
Actions in detail
block
Step is refused immediately. The loop records the violation, skips the step, and either halts the lap or moves to the next step, depending on the on_block setting in the loop config. No claude -p subprocess is spawned.
approve
The loop pauses and writes an approval request to the receipt. Execution resumes only after a human confirms via loop_driver approve <step_id>. FAP is not penalised: the ask is attributed to a guardrail, not a blocker.
warn
Step proceeds. The violation is logged with a timestamp to the receipt and flagged in the FAP output. Useful for visibility without halting fast-moving loops.
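The three behaviours can be pictured as a small dispatch function. The following Python sketch is illustrative only: Receipt, apply_action, the returned state names, and the on_block values "halt" and "skip" are all hypothetical, since the source names the on_block setting but not its accepted values.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Receipt:
    entries: list = field(default_factory=list)

    def log(self, step_id, rule_id, event):
        # Every guardrail event is timestamped when it is recorded.
        self.entries.append({
            "step": step_id,
            "rule": rule_id,
            "event": event,
            "at": datetime.now(timezone.utc).isoformat(),
        })


def apply_action(action, step_id, rule_id, receipt, on_block="halt"):
    """Record the outcome and return what the loop should do next."""
    if action == "block":
        receipt.log(step_id, rule_id, "blocked")
        # Assumed on_block values: "halt" ends the lap, "skip" moves on.
        return "halt_lap" if on_block == "halt" else "next_step"
    if action == "approve":
        receipt.log(step_id, rule_id, "approval_requested")
        return "await_approval"
    if action == "warn":
        receipt.log(step_id, rule_id, "warned")
        return "execute"
    return "execute"  # no rule matched, nothing to record


receipt = Receipt()
print(apply_action("block", "step-1", "force-push", receipt))
print(apply_action("approve", "step-2", "production-deploy", receipt))
print(apply_action("warn", "step-3", "secrets-write", receipt))
```

Only the "execute" result would lead to a subprocess being spawned; the other three results stop before that point.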
Loading a pack
Packs are loaded automatically from guardrails/ in your project root.
You can also specify a pack explicitly:
# drive with a specific guardrail pack
python -m sentigent.operator.loop_driver drive <loop_id> \
  --execute \
  --guardrails guardrails/default.yaml

# drive with multiple packs (all rules are merged)
python -m sentigent.operator.loop_driver drive <loop_id> \
  --execute \
  --guardrails guardrails/default.yaml guardrails/prod-safety.yaml
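What "all rules are merged" could look like is sketched below in Python, operating on packs that have already been parsed into dictionaries. merge_packs is a hypothetical helper, and rejecting a rule id that appears in two packs is an assumption based on id being described as unique; the actual driver may resolve duplicates differently.

```python
def merge_packs(packs):
    """Concatenate the rules of several parsed packs, in load order.

    `packs` is a list of (pack_name, parsed_pack) pairs.
    """
    merged = []
    seen = {}  # rule id -> name of the pack that defined it
    for pack_name, pack in packs:
        for rule in pack.get("rules", []):
            rule_id = rule["id"]
            if rule_id in seen:
                # Assumption: duplicate ids across packs are an error.
                raise ValueError(
                    f"rule id {rule_id!r} defined in both "
                    f"{seen[rule_id]} and {pack_name}"
                )
            seen[rule_id] = pack_name
            merged.append(rule)
    return merged


default_pack = {"rules": [{"id": "force-push", "action": "block"}]}
prod_pack = {"rules": [{"id": "production-deploy", "action": "approve"}]}

merged = merge_packs([
    ("guardrails/default.yaml", default_pack),
    ("guardrails/prod-safety.yaml", prod_pack),
])
print([rule["id"] for rule in merged])
```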
Writing your own pack
Copy guardrails/default.yaml and adjust it to your org's invariants. Commit the file. Every engineer and every loop in the repo now uses the same safety floor, with no coordination overhead and no per-developer drift.
Good candidates for custom rules: internal deploy commands, protected branch patterns, sensitive file paths specific to your stack, compliance-required approval gates.
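Before committing a custom rule, it can help to try its match pattern against commands or paths you expect it to catch and ones you expect it to leave alone. The Python sketch below does that with a hypothetical check_pattern helper, assuming the pattern is searched anywhere in the target string.

```python
import re


def check_pattern(pattern, should_match, should_pass):
    """Return (missed, false_hits): the samples the pattern misclassifies."""
    regex = re.compile(pattern)
    missed = [s for s in should_match if not regex.search(s)]
    false_hits = [s for s in should_pass if regex.search(s)]
    return missed, false_hits


missed, false_hits = check_pattern(
    r"\.env|credentials|secrets\.json|\.pem|\.key",
    should_match=["config/.env", "deploy/server.pem"],
    should_pass=["README.md", "docs/credentials-rotation.md"],
)
print("missed:", missed)
print("false hits:", false_hits)
```

Under the search assumption, an unanchored alternative such as credentials also fires on a documentation path that merely contains the word, which is the kind of surprise this check is meant to surface. Whether that is acceptable depends on the rule's action; for a warn rule the extra noise may be fine.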
Relationship to learned judgment
Guardrails and the CloneResolver (learned push-vs-ask) operate at different layers. The guardrail pack is evaluated first, before the learned judgment. A block rule cannot be overridden by a high-confidence CloneResolver decision: the safety floor is inviolable. The learned judgment only operates on steps that cleared the guardrail gate.
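The layering can be sketched as a function that consults the learned judgment only after the guardrail verdict is known. This Python example is a simplified illustration: decide and the stand-in resolver are hypothetical, the return values are made up for the sketch, and treating warn as having cleared the gate is an assumption drawn from the fact that warned steps proceed.

```python
from typing import Callable, Optional


def decide(guardrail_action: Optional[str], learned: Callable[[], str]) -> str:
    """Guardrail verdict first; learned judgment only for steps that cleared it."""
    if guardrail_action == "block":
        return "blocked"          # the learned judgment is never consulted
    if guardrail_action == "approve":
        return "await_approval"   # a human decides, not the resolver
    # Assumption: "warn" and "no match" both count as clearing the gate.
    return learned()


calls = []


def resolver() -> str:
    """Stand-in for a learned push-vs-ask decision."""
    calls.append("consulted")
    return "push"


print(decide("block", resolver), len(calls))
print(decide(None, resolver), len(calls))
```

Because the resolver is passed in as a callable and invoked last, a confident learned decision has no path by which it could override a block.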