Agent Contract Readiness

Give every agent PR a standard your reviewers can see.

Coding agents can move quickly across code, tests, config, and CI. ContractForge gives the team a local audit packet for one repeated workflow: expected scope, validation commands, approval-sensitive changes, final-response evidence, preflight notes, and clear limits.

For small engineering teams already using coding agents in active repositories.
  • $750 flat audit
  • One repository
  • One primary agent workflow
  • Local-first by default

What actually happens

One workflow goes in. A review packet comes out.

1

Run locally

contractforge init
contractforge audit
contractforge compile
contractforge eval-gen
contractforge eval
contractforge report

No source upload by default. The audit works from a clean local clone or worktree.

2

Writes repo artifacts

  • agent.contract.yaml for scope, commands, approvals, failure limits, and evidence rules.
  • AGENTS.md guidance compiled from the contract for team review.
  • .contractforge/eval_tasks.yaml for deterministic review-task skeletons.
  • .contractforge/report.md with traces, policy notes, check output, and limits.

3

Supports PR review

  • Reviewer sees expected files and commands for the selected workflow.
  • Known gated prompts can be blocked before an external agent command runs.
  • Forbidden paths are flagged by local checks.
  • The memo separates observed evidence from unsupported claims.
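
As an illustration of the last point, separating observed evidence from unsupported claims can be sketched as a comparison between the commands an agent's final response says passed and the commands recorded in a local run log. This is a minimal sketch, not ContractForge's implementation: the `CommandRun` record, the `split_claims` helper, and the sample commands are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommandRun:
    """One locally observed command execution (hypothetical record shape)."""
    command: str
    exit_code: int


def split_claims(claimed_commands, observed_runs):
    """Split claimed-passing commands into observed and unsupported lists.

    A claim counts as observed only if the same command string appears in the
    local run log with exit code 0. Anything else is reported as unsupported.
    """
    passed = {run.command for run in observed_runs if run.exit_code == 0}
    observed = [c for c in claimed_commands if c in passed]
    unsupported = [c for c in claimed_commands if c not in passed]
    return observed, unsupported


# Hypothetical data: the agent claims three checks passed, the log shows one.
runs = [CommandRun("npm test", 0), CommandRun("npm run lint", 1)]
claims = ["npm test", "npm run lint", "npm run typecheck"]
observed, unsupported = split_claims(claims, runs)
print("observed:", observed)
print("unsupported:", unsupported)
```

Exact string matching is deliberately strict here: a reviewer would rather see a claim listed as unsupported than have a near match silently accepted.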

Why teams care

The review should feel governed, not improvised.

The first friction with coding agents is rarely whether they can edit code. It is whether the team can see the rules that governed the edit, inspect the evidence, and decide what still needs human judgment.

Surprise scope

The PR includes CI, config, generated files, package metadata, or release-adjacent edits alongside the intended code change.

Unclear ownership

Dependency, migration, auth, billing, security, or production-facing changes arrive without a clear owner record.

Review evidence is thin

The final answer says tests passed, but the exact commands, skipped checks, changed files, and residual risks are hard to inspect.

Team standards are scattered

Agent expectations live across README text, AGENTS.md, tool-specific instructions, old PR comments, and reviewer memory.

Concrete scenarios

What changes in the reviewer’s hands.

Each scenario shows a PR moment that normally becomes debate, then the audit artifact that makes the next review easier to inspect.

01

Dependency added during a billing change

Review moment: A coding agent updates a billing handler and adds a package in the same PR. The reviewer cannot tell whether dependency work was approved.

Returned artifact:

agent.contract.yaml marks dependency, package metadata, billing, and release-adjacent work as approval-required for this workflow.

Run path: When run with --agent-command, contractforge eval checks the task prompt before the agent command starts. If the prompt involves dependency work, the run is blocked unless --approve-gated is supplied along with an approver, reason, and scope.

Reviewer sees: the blocked preflight note in run evidence and the approval requirement in the contract packet.
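
The preflight idea in this scenario can be sketched as a keyword check on the task prompt that blocks the run unless a complete approval record is present. This is a minimal sketch under stated assumptions: the `GATED_KEYWORDS` table, the `Approval` record, and the `preflight` function are hypothetical and do not describe how ContractForge detects gated work internally.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical keyword table; a real contract would define its own categories.
GATED_KEYWORDS = {
    "dependency": ("add a package", "add dependency", "npm install", "pip install"),
    "billing": ("billing",),
}


@dataclass(frozen=True)
class Approval:
    """Approval record with the three fields the scenario names."""
    approver: str
    reason: str
    scope: str


def preflight(prompt: str, approval: Optional[Approval] = None) -> dict:
    """Return the gated categories found in the prompt and whether to block."""
    text = prompt.lower()
    hits = sorted(
        category
        for category, keywords in GATED_KEYWORDS.items()
        if any(keyword in text for keyword in keywords)
    )
    # An approval only counts when approver, reason, and scope are all non-empty.
    complete = approval is not None and all(
        (approval.approver.strip(), approval.reason.strip(), approval.scope.strip())
    )
    return {"gated": hits, "blocked": bool(hits) and not complete}


prompt = "Fix the billing handler and add a package for decimal math"
blocked_note = preflight(prompt)
approved_note = preflight(
    prompt,
    Approval(approver="cto", reason="decimal rounding fix", scope="billing handler only"),
)
print(blocked_note)
print(approved_note)
```

Keyword matching of this kind only catches prompts that name the gated work explicitly, which is consistent with the page's wording that known gated prompts can be blocked.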

02

API behavior changes without review evidence

Review moment: An agent changes a route handler, type definitions, and tests. API compatibility expectations exist, but they are spread across docs and past reviewer comments.

Returned artifact:

AGENTS.md lists allowed files, expected validation commands, API-compatibility notes, skipped-check disclosure, and final-response evidence.

Run path: contractforge compile reads agent.contract.yaml and renders the AGENTS.md guidance the team reviews before use.

Reviewer sees: the exact paths, commands, and final-response evidence expected for the API-change workflow.
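
To make the compile step concrete, the sketch below renders a guidance section from a contract held as a plain dictionary. It is illustrative only: the key names (`workflow`, `allowed_paths`, `commands`, `approval_required`, `evidence`), the heading layout, and the `render_guidance` function are assumptions, not the actual agent.contract.yaml schema or the output format of contractforge compile.

```python
def render_guidance(contract: dict) -> str:
    """Render a reviewable guidance section from a contract dictionary."""
    lines = [f"## Agent guidance: {contract['workflow']}", ""]
    # Hypothetical section titles mapped to hypothetical contract keys.
    sections = [
        ("Allowed paths", "allowed_paths"),
        ("Validation commands", "commands"),
        ("Approval required", "approval_required"),
        ("Final-response evidence", "evidence"),
    ]
    for title, key in sections:
        lines.append(f"### {title}")
        lines.extend(f"- {item}" for item in contract.get(key, []))
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical contract for an API-change workflow.
contract = {
    "workflow": "api-change",
    "allowed_paths": ["src/api/*", "tests/api/*"],
    "commands": ["npm test", "npm run typecheck"],
    "approval_required": ["dependency", "billing"],
    "evidence": ["commands run", "skipped checks", "changed files"],
}
guidance = render_guidance(contract)
print(guidance)
```

Generating the guidance from a single source keeps the text reviewers read in step with the rules the checks apply, instead of letting the two drift apart.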

03

CI changes mixed into a routine fix

Review moment: A failing test is fixed, but the PR also changes package scripts or CI configuration. The reviewer needs a clean separation between the code fix and workflow changes.

Returned artifact:

Eval tasks and reports flag expected paths, forbidden paths, command results, policy notes, and owner-review requirements for script or CI changes.

Run path: contractforge eval checks commands, diff scope, forbidden paths, and policy notes. contractforge report writes the review packet.

Reviewer sees: whether the PR stayed inside the expected workflow and which CI or script changes need owner review.
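
A diff-scope check of this kind can be sketched as sorting each changed path into one bucket. This is a minimal sketch, assuming glob-style patterns and hypothetical bucket names; `classify_changes` and the sample paths are illustrative and do not show how contractforge eval is implemented.

```python
from fnmatch import fnmatchcase


def classify_changes(changed, expected, forbidden, owner_review):
    """Sort changed paths into forbidden, owner_review, expected, or unexpected.

    Patterns are shell-style globs. Note that fnmatchcase lets "*" match "/"
    as well, so "src/cart/*" also covers nested files.
    """
    def matches(path, patterns):
        return any(fnmatchcase(path, pattern) for pattern in patterns)

    report = {"expected": [], "forbidden": [], "owner_review": [], "unexpected": []}
    for path in changed:
        # The strictest category wins when a path matches more than one list.
        if matches(path, forbidden):
            report["forbidden"].append(path)
        elif matches(path, owner_review):
            report["owner_review"].append(path)
        elif matches(path, expected):
            report["expected"].append(path)
        else:
            report["unexpected"].append(path)
    return report


# Hypothetical PR: a test fix that also touches CI and package scripts.
report = classify_changes(
    changed=[
        "src/cart/total.ts",
        "tests/cart/total.test.ts",
        ".github/workflows/ci.yml",
        "package.json",
    ],
    expected=["src/cart/*", "tests/cart/*"],
    forbidden=["secrets/*"],
    owner_review=[".github/*", "package.json"],
)
for bucket, paths in report.items():
    print(bucket, paths)
```

Checking the forbidden list first means a path can never be reported as in scope merely because it also matches an expected pattern.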

What comes back

A concrete review packet, not another policy doc.

The audit is deliberately narrow. It gives the team files and notes that can sit next to the next agent-created PR, with limits stated plainly.

Instruction surface map: Where agent-facing instructions live today and where they conflict or leave gaps.
Contract scaffold: An editable agent.contract.yaml with commands, paths, approval-required categories, failure limits, and evidence rules.
Compiled guidance: A draft AGENTS.md section generated from the contract for reviewer inspection.
Eval skeletons: Deterministic task templates for scoped edits, probes, recovery, and read-only analysis.
Check report: Baseline/policy output, traces, preflight notes, and local command results.
Founder memo: A reviewed summary of findings, recommended next steps, and limitations.

How it plugs in

Local files first. PR review second.

ContractForge is used around your existing repository and agent command. It does not require source upload by default.

1

Inspect the repo

contractforge init and contractforge audit identify project signals and instruction surfaces.

2

Write the contract

The audit edits the scaffold into rules for one workflow: allowed paths, commands, approval-sensitive work, stop rules, and final evidence.
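
One way to picture this step is a completeness check that reports which of the five rule groups a draft contract still leaves empty. The sketch is hypothetical throughout: the key names in `REQUIRED_SECTIONS` mirror the list above but are not the real agent.contract.yaml schema, and `missing_sections` is not a ContractForge function.

```python
# Hypothetical keys for the five rule groups named in the step above.
REQUIRED_SECTIONS = (
    "allowed_paths",
    "commands",
    "approval_required",
    "stop_rules",
    "evidence",
)


def missing_sections(contract: dict) -> list:
    """Return the required sections that are absent or empty in the draft."""
    return [key for key in REQUIRED_SECTIONS if not contract.get(key)]


# Hypothetical draft that still lacks stop rules and evidence rules.
draft = {
    "allowed_paths": ["src/api/*"],
    "commands": ["npm test"],
    "approval_required": ["dependency"],
    "stop_rules": [],
}
print(missing_sections(draft))
```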

3

Generate review assets

contractforge compile, eval-gen, eval --plan, eval, and report create guidance, tasks, traces, and report output.

4

Use it in the next PR

Reviewers compare the agent-created PR against the compiled guidance, contract scope, command evidence, preflight notes, and stated limits.

Best fit

Small teams already using coding agents in active repositories.

The first rollout is built for founders, CTOs, heads of engineering, DevEx leads, and senior engineering owners at AI-native devtool, infrastructure, and SaaS teams. The buyer has one repo where agent-review friction is visible enough to justify a focused audit.

Good fit: 5-75 engineers, local validation commands, one repeated agent PR pattern, owner available to review the memo.
Poor fit: Needs hosted SaaS, production enforcement, compliance certification, broad agent benchmarking, or enterprise procurement first.

Getting started

Start with the PR pattern reviewers already recognize.

The first step is not a platform rollout. It is a focused fit check around one agent-created change that already creates review friction.

1

Pick the repeated PR pattern

Name the agent-created change that keeps raising scope, approval, command, or evidence questions.

2

Send the context

Share the repo name or safe public URL, agent used today, expected commands, current instruction files, and the recurring review question.

3

Confirm fit and access

The first reply confirms timing, authorization, local inspection path, and whether the $750 audit scope is a good fit.

4

Run the local audit

ContractForge runs in a clean local clone or worktree and produces the contract scaffold, compiled guidance, eval tasks, checks, traces, and report.

5

Review the packet

Your team inspects the artifacts, memo, and limitations before adopting any generated guidance into the repo.

Start here

Send the PR pattern your reviewers keep questioning.

The first reply confirms fit, timing, and how the repository can be inspected by someone authorized to review it.

Current limits: Production sandboxing, complete policy enforcement, compliance certification, reliability lift, productivity lift, security review, and broad cross-agent validation require separate evidence and separate work.