Guardrails — Sentigent Docs

The problem

A loop that can drive 15 steps autonomously can also autonomously drop a production table, force-push to main, or write secrets to a public file — if nothing is in the way. Static rules hardcoded into the loop itself are brittle: they can't be updated without a code change, they're invisible to teammates, and they don't compose across teams.

Sentigent separates the safety policy from the loop engine. Guardrail packs are plain YAML files checked into your repo — readable, diffable, and enforced identically on every lap of every loop that loads the pack.

How it works

At the start of each lap the loop driver loads all active guardrail packs and evaluates the pending step against every rule. If any rule matches:

block: the step is refused; the loop records the violation and then halts the lap or moves to the next step, depending on the on_block loop config.
approve: the step is paused; a human sign-off is required before execution continues.
warn: the step proceeds, but the violation is logged to the receipt and flagged in the FAP output.

If no rule matches, the loop proceeds normally. Guardrail evaluation happens before the claude -p subprocess is spawned for that step, so a blocked step never reaches execution.
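
The matching step described above can be pictured with a short Python sketch. It is illustrative only and is not Sentigent's implementation: the `matching_rules` helper, the representation of a step as a kind plus a text to match, and the use of an unanchored, case-sensitive `re.search` are all assumptions. Only the rule fields and patterns are taken from the default pack.

```python
import re

# Two rules copied from the default pack, written as plain dicts.
RULES = [
    {"id": "force-push", "match": r"push --force|push -f",
     "scope": "bash", "action": "block"},
    {"id": "secrets-write", "match": r"\.env|credentials|secrets\.json|\.pem|\.key",
     "scope": "write", "action": "warn"},
]


def matching_rules(step_kind, step_text, rules):
    """Return every rule whose scope covers the step and whose pattern occurs in it.

    step_kind is "bash" or "write" (assumed step types); step_text is the
    command for bash steps and the file path for write steps.
    """
    return [
        rule for rule in rules
        if rule["scope"] in (step_kind, "any")
        and re.search(rule["match"], step_text)
    ]


hits = matching_rules("bash", "git push --force origin main", RULES)
print([(rule["id"], rule["action"]) for rule in hits])
```

In this sketch a bash step such as `git status` matches nothing and would proceed, and the `secrets-write` rule is never consulted for bash steps because its scope is `write`.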

PolicyWall is sticky. If any hard rule matches, the step escalates regardless of the CloneResolver's learned confidence score. The inviolable safety floor has its own test suite and cannot be overridden by the learned push-vs-ask path.

Pack format

guardrails/default.yaml

# Sentigent guardrail pack
# Versioned in git. Loaded per-lap by the loop driver.
# action: block | approve | warn
# severity: critical | high | medium | low

rules:
  - id: irreversible-recursive-delete
    description: "rm -rf / rm -fr — cannot be undone"
    match: "rm -rf|rm -fr"
    scope: bash
    action: block
    severity: critical

  - id: production-deploy
    description: "Deploy to prod — requires human sign-off"
    match: "kubectl apply|deploy --prod|fly deploy --remote-only"
    scope: bash
    action: approve
    severity: high

  - id: force-push
    description: "git push --force — rewrites shared history"
    match: "push --force|push -f"
    scope: bash
    action: block
    severity: critical

  - id: secrets-write
    description: "Writing to secrets / credentials files"
    match: "\\.env|credentials|secrets\\.json|\\.pem|\\.key"
    scope: write
    action: warn
    severity: high

  - id: drop-table
    description: "Destructive DDL on production database"
    match: "DROP TABLE|TRUNCATE TABLE|DROP DATABASE"
    scope: bash
    action: block
    severity: critical

Rule fields

| Field | Type | Description |
| --- | --- | --- |
| `id` | string | Unique rule identifier. Used in violation logs and the FAP receipt. |
| `description` | string | Human-readable explanation shown when the rule fires. |
| `match` | regex | Pattern matched against the step's command or file path, depending on scope. |
| `scope` | bash \| write \| any | Which step types to evaluate. `any` matches all. |
| `action` | block \| approve \| warn | What happens when the rule matches. |
| `severity` | critical \| high \| medium \| low | Shown in the receipt and violation log. Does not change enforcement: `block` always blocks. |
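
To make the field constraints concrete, here is a minimal Python sketch of validating one parsed rule before use. The `GuardrailRule` class is a hypothetical name introduced for illustration, and whether Sentigent validates packs this way is not stated in these docs. The allowed values are the ones listed in the table.

```python
import re
from dataclasses import dataclass

SCOPES = {"bash", "write", "any"}
ACTIONS = {"block", "approve", "warn"}
SEVERITIES = {"critical", "high", "medium", "low"}


@dataclass(frozen=True)
class GuardrailRule:  # hypothetical name, not part of Sentigent's API
    id: str
    description: str
    match: str
    scope: str
    action: str
    severity: str

    def __post_init__(self) -> None:
        # Reject values outside the enumerations documented in the table.
        for name, allowed in (("scope", SCOPES), ("action", ACTIONS),
                              ("severity", SEVERITIES)):
            value = getattr(self, name)
            if value not in allowed:
                raise ValueError(
                    f"rule {self.id!r}: {name}={value!r} not in {sorted(allowed)}"
                )
        # Reject patterns that do not compile as a Python regex.
        try:
            re.compile(self.match)
        except re.error as exc:
            raise ValueError(f"rule {self.id!r}: invalid regex: {exc}") from exc


rule = GuardrailRule(
    id="drop-table",
    description="Destructive DDL on production database",
    match="DROP TABLE|TRUNCATE TABLE|DROP DATABASE",
    scope="bash",
    action="block",
    severity="critical",
)
print(rule.id, rule.action)
```

Failing at load time means a typo such as `action: deny` is caught when the pack is read, not when a rule is expected to fire.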

Actions in detail

block

Step is refused immediately. The loop records the violation, skips the step, and then either halts the lap or moves to the next step, depending on the on_block setting in the loop config. No claude -p subprocess is spawned.

approve

Loop pauses and writes an approval request to the receipt. Execution resumes only after a human confirms via loop_driver approve <step_id>. FAP is not penalised — the ask is attributed to a guardrail, not a blocker.

warn

Step proceeds. The violation is logged with timestamp to the receipt and flagged in the FAP output. Useful for visibility without halting fast-moving loops.
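
The three actions can be summarised as a small decision function. This is a sketch under stated assumptions, not the loop driver's code: the `Outcome` enum and `outcome_for` function are hypothetical, and the `on_block` values `"halt"` and `"skip"` are invented for illustration because the docs name the setting but not its allowed values.

```python
from enum import Enum
from typing import Optional


class Outcome(Enum):  # hypothetical names for what the loop does next
    RUN = "run the step"
    RUN_AND_LOG = "run the step and log the violation"
    WAIT_FOR_APPROVAL = "pause until a human approves"
    SKIP_AND_CONTINUE = "skip the step and move to the next one"
    SKIP_AND_HALT = "skip the step and halt the lap"


def outcome_for(action: Optional[str], on_block: str = "halt") -> Outcome:
    """Map a matched rule's action (or None for no match) to the loop's next move."""
    if action is None:
        return Outcome.RUN
    if action == "warn":
        return Outcome.RUN_AND_LOG
    if action == "approve":
        return Outcome.WAIT_FOR_APPROVAL
    if action == "block":
        # "halt" and "skip" are assumed spellings of the on_block setting.
        if on_block == "halt":
            return Outcome.SKIP_AND_HALT
        if on_block == "skip":
            return Outcome.SKIP_AND_CONTINUE
        raise ValueError(f"unknown on_block value: {on_block!r}")
    raise ValueError(f"unknown action: {action!r}")


print(outcome_for("block").name)
print(outcome_for("block", on_block="skip").name)
```

Note that only `RUN` and `RUN_AND_LOG` lead to a subprocess being spawned; the other outcomes stop before execution.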

Loading a pack

Packs are loaded automatically from guardrails/ in your project root. You can also specify a pack explicitly:

terminal

# drive with a specific guardrail pack
python -m sentigent.operator.loop_driver drive <loop_id> \
  --execute \
  --guardrails guardrails/default.yaml

# drive with multiple packs (all rules are merged)
python -m sentigent.operator.loop_driver drive <loop_id> \
  --execute \
  --guardrails guardrails/default.yaml guardrails/prod-safety.yaml
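
The docs say only that rules from multiple packs are merged. One plausible reading is sketched below in Python: rules are concatenated in pack order, and a repeated `id` is treated as an error because `id` is documented as unique. Both choices are assumptions, as is the `merge_packs` helper itself; the real driver may resolve duplicates differently.

```python
def merge_packs(packs):
    """Concatenate the rules of several parsed packs, rejecting duplicate ids.

    Each pack is a dict shaped like the parsed YAML: {"rules": [ {...}, ... ]}.
    Treating a duplicate id as an error is an assumption of this sketch.
    """
    merged = []
    first_seen = {}  # rule id -> index of the pack that defined it
    for index, pack in enumerate(packs):
        for rule in pack.get("rules", []):
            rule_id = rule["id"]
            if rule_id in first_seen:
                raise ValueError(
                    f"duplicate rule id {rule_id!r} in packs "
                    f"{first_seen[rule_id]} and {index}"
                )
            first_seen[rule_id] = index
            merged.append(rule)
    return merged


default_pack = {"rules": [{"id": "force-push", "action": "block"}]}
prod_pack = {"rules": [{"id": "production-deploy", "action": "approve"}]}

print([rule["id"] for rule in merge_packs([default_pack, prod_pack])])
```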

Writing your own pack

Copy guardrails/default.yaml and adjust to your org's invariants. Commit the file. Every engineer and every loop in the repo now uses the same safety floor — no coordination overhead, no per-developer drift.

Good candidates for custom rules: internal deploy commands, protected branch patterns, sensitive file paths specific to your stack, compliance-required approval gates.
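
Before committing a custom rule, it can help to check its pattern against commands that should and should not trigger it. The sketch below does this with Python's `re` module; the `check_pattern` helper is hypothetical, and it assumes the loop driver applies patterns as unanchored, case-sensitive searches, which these docs do not confirm.

```python
import re


def check_pattern(pattern, should_match, should_not_match):
    """Return a list of failures; an empty list means the pattern behaves as expected."""
    regex = re.compile(pattern)
    failures = []
    for text in should_match:
        if not regex.search(text):
            failures.append(f"missed: {text}")
    for text in should_not_match:
        if regex.search(text):
            failures.append(f"false positive: {text}")
    return failures


# The force-push pattern from the default pack.
failures = check_pattern(
    r"push --force|push -f",
    should_match=["git push --force origin main", "git push -f"],
    should_not_match=["git push origin main", "git push --force-with-lease"],
)
print(failures)
```

Under that assumption the check reports `git push --force-with-lease` as a false positive, because the text contains `push --force`. Whether that command should be blocked is a policy decision; if it should not, the pattern needs tightening before the pack is committed.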

Relationship to learned judgment

Guardrails and the CloneResolver (learned push-vs-ask) operate at different layers. The guardrail pack is evaluated first, before the learned judgment. A block rule cannot be overridden by a high-confidence CloneResolver decision — the safety floor is inviolable. The learned judgment only operates on steps that cleared the guardrail gate.