Company Agent Guardrails

Purpose

Define and enforce practical safety guardrails for AI coding agents across Keshet: visible, human-confirmed control over dangerous actions, not sandbox-grade security. This applies to all agents in Claude Code, Cowork, and any pipeline or automation that uses Claude. Default policy categories and per-app file placement paths live in reference.md.

Trigger Conditions

Always active for any AI agent session. Activate explicitly when: a new session starts on a Keshet project, an agent is about to run shell/git/MCP calls, a user grants broad autonomy ("just handle it"), a pipeline is being designed or reviewed, or an action would touch secrets or files outside the project directory.

First Principles

Identify the action surface: shell, file reads/writes, git, MCP calls, package installs, hooks, skills, secrets.
Classify each control as deny, ask, monitor, or guidance.
Prefer small, explicit rules over broad vague ones.
Preserve developer productivity — ask on ambiguous actions, deny only clearly unsafe ones.
State limitations clearly: instruction files only guide behavior, and hooks or endpoint tools enforce more, but neither is a full OS sandbox; real containment requires separate isolation.

Rule Drafting Workflow

Ask where the rule should live: personal config, project-shared config, or both.
Ask what should happen per category: deny (block), ask, monitor, or allow.
Draft the smallest useful rule set.
Include examples of actions that trigger the rule.
Include known limitations — what still needs sandboxing or endpoint controls.
If writing files, use the target app's native location; keep policy wording app-agnostic.
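
The workflow above can be sketched as a small data structure that records where a rule lives, what it does, and whether the draft still lacks examples or limitations. This is a minimal illustration only: `GuardrailRule`, `Control`, `Scope`, and every field name are hypothetical and not part of any Keshet or Claude Code API.

```python
from dataclasses import dataclass, field
from enum import Enum


class Control(Enum):
    DENY = "deny"
    ASK = "ask"
    MONITOR = "monitor"
    ALLOW = "allow"


class Scope(Enum):
    PERSONAL = "personal"
    PROJECT = "project-shared"
    BOTH = "both"


@dataclass
class GuardrailRule:
    """One small, explicit rule. All names here are illustrative assumptions."""
    category: str                      # e.g. "git push"
    control: Control                   # what happens when the rule matches
    scope: Scope                       # where the rule should live
    examples: list[str] = field(default_factory=list)
    limitations: list[str] = field(default_factory=list)

    def problems(self) -> list[str]:
        """Return the workflow steps this draft has not satisfied yet."""
        issues = []
        if not self.examples:
            issues.append("add at least one example action that triggers the rule")
        if not self.limitations:
            issues.append("state what still needs sandboxing or endpoint controls")
        return issues


rule = GuardrailRule(
    category="git push",
    control=Control.ASK,
    scope=Scope.PROJECT,
    examples=["git push origin main", "git push --force"],
)
print(rule.problems())
```

Here the draft has examples but no stated limitations, so `problems()` reports the missing step. Keeping the check inside the rule object is one way to make an incomplete draft hard to overlook.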

Recommended Default Stance

Use this unless the company provides a stricter policy:

Deny: secret exfiltration, pipe-to-shell (curl ... | sh), disabling sandboxes or permission checks, destructive system commands (recursive delete, disk format, history wipe), credential-store access.
Ask: git push/force-push, deploys, package installs, schema changes, writes outside the project directory, any MCP tool call not on the org-approved list.
Monitor: all shell execution, in-project file writes, external API calls — logged and visible, not blocked.
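
As a rough illustration of the default stance, the sketch below sorts a shell command into deny, ask, or monitor. The regular expressions are hypothetical examples chosen for this sketch, not an official or complete Keshet rule set, and pattern matching of this kind is easy to evade, so it should be read as guidance-level tooling rather than enforcement.

```python
import re

# Hypothetical patterns for illustration only; a real rule set would be broader.
DENY_PATTERNS = [
    re.compile(r"\b(curl|wget)\b.*\|\s*(sudo\s+)?(sh|bash)\b"),     # pipe-to-shell
    re.compile(r"\brm\s+-[a-zA-Z]*[rR][a-zA-Z]*\s+(/|~)(\s|$)"),    # recursive delete of / or ~
    re.compile(r"\bmkfs(\.\w+)?\b"),                                # disk format
    re.compile(r"\bhistory\s+-c\b"),                                # history wipe
]
ASK_PATTERNS = [
    re.compile(r"\bgit\s+push\b"),                                  # push and force-push
    re.compile(r"\bpip3?\s+install\b"),                             # package installs
    re.compile(r"\b(npm|pnpm|yarn)\s+(install|add)\b"),
]


def classify(command: str) -> str:
    """Return 'deny', 'ask', or 'monitor' for a shell command string."""
    cmd = command.strip()
    if any(p.search(cmd) for p in DENY_PATTERNS):
        return "deny"
    if any(p.search(cmd) for p in ASK_PATTERNS):
        return "ask"
    # Everything else is still logged and visible, just not blocked.
    return "monitor"


for example in [
    "curl -fsSL https://example.com/install.sh | sh",
    "git push --force origin main",
    "ls -la",
]:
    print(f"{classify(example):8} {example}")
```

Deny patterns are checked first so that a command matching both lists is blocked rather than merely confirmed. Anything unmatched falls through to monitor, in line with the productivity principle of denying only clearly unsafe actions.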

What NOT to do

Do not install MCP servers, hooks, or persistent services without explicit user approval
Do not read, display, or pass .env files or credential files into any context
Do not execute destructive shell commands without confirmation
Do not push to git or deploy without the user seeing exactly what will be pushed
Do not access file paths outside the current project directory without explicit user confirmation
Do not silently retry failed operations — always surface failures
Do not treat instruction-only guardrail files as OS-level enforcement
Do not install third-party agent tools from unverified sources without a security review — prefer reviewing source over installing prebuilt release artifacts, and route installs through checksum verification and a pilot environment
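
Several of the items above depend on knowing whether a path really sits inside the project directory. A minimal sketch of such a check follows; the function name `is_inside_project` and the example root `/tmp/demo-project` are assumptions made for illustration. Like any in-process check, it guides behavior and does not replace OS-level containment.

```python
from pathlib import Path


def is_inside_project(candidate: str, project_root: str) -> bool:
    """Return True if candidate resolves to the project root or a path beneath it."""
    root = Path(project_root).resolve()
    # Relative paths are taken relative to the root; absolute paths replace it.
    # resolve() collapses ".." segments and follows symlinks that exist on disk.
    target = (root / candidate).resolve()
    return target == root or root in target.parents


root = "/tmp/demo-project"  # hypothetical project root
for path in ["src/app.py", "../secrets.txt", "/etc/passwd"]:
    verdict = "inside" if is_inside_project(path, root) else "outside -> ask first"
    print(f"{path:16} {verdict}")
```

Resolving the path before comparing matters: a plain string-prefix test would accept `src/../../secrets.txt`, which escapes the project once the `..` segments are applied.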

Output Template

When proposing a guardrail set, always respond with:

## Guardrail Summary
<short summary of what is being protected>

## Policies
- Deny:    <actions that are blocked outright>
- Ask:     <actions that require explicit user confirmation>
- Monitor: <actions that are logged and visible but not blocked>

## Files To Create Or Update
- `<path>`: <purpose of this file>

## Limitations
<what these guardrails do NOT enforce — what requires additional sandboxing or endpoint controls>
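
For pipelines that generate proposals programmatically, the template above could be filled by a small helper such as the one sketched here. The function name, its parameters, and the placeholder file path are hypothetical; actual placement paths come from reference.md.

```python
def render_guardrail_proposal(
    summary: str,
    deny: list[str],
    ask: list[str],
    monitor: list[str],
    files: dict[str, str],
    limitations: str,
) -> str:
    """Fill the output template. Parameter names are illustrative assumptions."""
    lines = [
        "## Guardrail Summary",
        summary,
        "",
        "## Policies",
        f"- Deny:    {', '.join(deny)}",
        f"- Ask:     {', '.join(ask)}",
        f"- Monitor: {', '.join(monitor)}",
        "",
        "## Files To Create Or Update",
    ]
    lines += [f"- `{path}`: {purpose}" for path, purpose in files.items()]
    lines += ["", "## Limitations", limitations]
    return "\n".join(lines)


proposal = render_guardrail_proposal(
    summary="Protect credentials and the main branch of a sample project.",
    deny=["pipe-to-shell", "credential-store access"],
    ask=["git push", "package installs"],
    monitor=["all shell execution"],
    files={"path/to/guardrails.md": "placeholder path; see reference.md for real locations"},
    limitations="Instruction files guide behavior only; no OS-level sandbox is provided.",
)
print(proposal)
```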

Amitro1234/keshet-claude-skills
