For teams wiring AI agents into systems that matter

When verification fails, does your agent stop — or proceed?

Most agent deployments fail open: a prompt injection, a stale permission, a confused agent — and the default behavior is to keep going. Fail-open is the default you get for free. Fail-closed is the default you have to build. The Agent Readiness Audit is a structured review of exactly where your deployment proceeds when it should stop, with a written plan to fix it.

$2,500 fixed fee

10 business days from evidence delivery

12-point scored framework

100% written — no calls, ever

Request the audit intake

Five questions. ~10 minutes. Free written assessment before you pay anything.

What we check

Your deployment — planned or live — is reviewed against a twelve-point framework spanning five domains. Each point is scored Pass / Conditional / Fail with evidence cited. The framework is derived from Prime Form Calculus (PFC), an open agent-governance architecture with public, tested reference implementations. You don't need to adopt PFC; it's the measuring stick, and the findings stand on their own.

DOMAIN A

Authority Issuance

Does every agent task chain trace back to a durable, attributable human authorization — with scope enumerated, not implied, and an expiry on every grant? Or does approval exist only as a checkbox click that evaporates after the session?

Points 1–3: authorization artifacts · enumerated scope · expiry & usage policy
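To make the Domain A questions concrete, here is a minimal sketch of an authorization artifact with enumerated scope and an expiry. The `Grant` class, its field names, and `is_authorized` are hypothetical illustrations, not PFC's actual data model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Grant:
    """Hypothetical authorization artifact; field names are illustrative."""
    grant_id: str
    issued_by: str          # the attributable human authorizer
    scope: frozenset        # enumerated actions, never a wildcard
    expires_at: datetime    # every grant carries an expiry


def is_authorized(grant: Grant, action: str, now: datetime) -> bool:
    """Allow only an enumerated action under an unexpired, attributed grant."""
    if not grant.issued_by:
        return False        # no human on record: deny
    if now >= grant.expires_at:
        return False        # expired: deny
    return action in grant.scope


issued = datetime(2025, 1, 1, 9, 0, tzinfo=timezone.utc)
grant = Grant(
    grant_id="g-001",
    issued_by="alice@example.com",
    scope=frozenset({"crm.read", "email.send"}),
    expires_at=issued + timedelta(hours=8),
)
```

A checkbox click leaves nothing like `grant` behind. A durable record like this one can be inspected, attributed, and expired.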

DOMAIN B

Credential Architecture

Where do the secrets physically live? If the agent holds credentials, a compromised agent is a compromised everything — the attacker doesn't need to escalate, the privileges are already in the context window. We check for credential starvation, key lifecycle management, and identity binding at every hop.

Points 4–6: credential starvation · key registry & lifecycle · per-hop identity binding
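Credential starvation can be sketched as a gateway that holds the secret and returns only results. This is an illustration under stated assumptions: `ConnectorGateway`, `fake_transport`, and the sample key are all hypothetical, and a real gateway would run in a separate process from the agent.

```python
from typing import Callable, Dict


class ConnectorGateway:
    """Hypothetical gateway: it holds the secrets so the agent never does."""

    def __init__(self, secrets: Dict[str, str],
                 transport: Callable[[str, dict, str], dict]):
        self._secrets = secrets        # lives only inside the gateway
        self._transport = transport    # performs the real outbound call

    def call(self, agent_id: str, system: str, payload: dict) -> dict:
        if system not in self._secrets:
            raise PermissionError(f"no connector configured for {system}")
        response = self._transport(system, payload, self._secrets[system])
        # Return only the result, tagged with the caller's identity for this hop.
        return {"agent_id": agent_id, "system": system,
                "result": response["result"]}


def fake_transport(system: str, payload: dict, secret: str) -> dict:
    """Stand-in for a real API client; it only confirms a secret was supplied."""
    return {"result": f"{system} accepted {payload['op']}",
            "authenticated": bool(secret)}


gateway = ConnectorGateway({"crm": "s3cr3t-key"}, fake_transport)
reply = gateway.call("agent-7", "crm", {"op": "update"})
```

The agent sees `reply` and nothing else, so a compromised context window has no key to leak.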

DOMAIN C

Delegation & Scope Containment

When agents delegate to agents, is the child's authority provably a subset of the parent's — enforced mechanically at delegation time, not described in a prompt? And can you revoke a single authorization without nuking unrelated work?

Points 7–8: scope containment · revocation with defined, tested semantics
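A minimal sketch of what "enforced mechanically at delegation time" and "revoke a single authorization" can look like. The `Registry` class and its method names are hypothetical; the subset check and the walk up the delegation chain are the point.

```python
from dataclasses import dataclass, field
from typing import Optional


class ScopeViolation(Exception):
    """Raised when a delegation asks for more than its parent holds."""


@dataclass
class Registry:
    scopes: dict = field(default_factory=dict)    # grant_id -> frozenset of actions
    parents: dict = field(default_factory=dict)   # grant_id -> parent id or None
    revoked: set = field(default_factory=set)

    def issue(self, grant_id: str, scope, parent: Optional[str] = None) -> None:
        scope = frozenset(scope)
        if parent is not None:
            if parent not in self.scopes:
                raise ScopeViolation(f"unknown parent: {parent}")
            extra = scope - self.scopes[parent]
            if extra:
                # Containment is checked here, not described in a prompt.
                raise ScopeViolation(f"child exceeds parent: {sorted(extra)}")
        self.scopes[grant_id] = scope
        self.parents[grant_id] = parent

    def revoke(self, grant_id: str) -> None:
        self.revoked.add(grant_id)

    def is_live(self, grant_id: str) -> bool:
        # A grant is live only if it and every ancestor are known and unrevoked.
        current = grant_id
        while current is not None:
            if current in self.revoked or current not in self.scopes:
                return False
            current = self.parents[current]
        return True


reg = Registry()
reg.issue("root", {"crm.read", "email.send"})
reg.issue("child", {"crm.read"}, parent="root")
reg.issue("other", {"deploy.staging"})
```

Revoking `root` stops `child` while `other` keeps running, which is the behavior the revocation question in the intake is probing for.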

DOMAIN D

Execution Boundary

Does every effectful action pass an admission check before the effect — and does the system deny on ambiguity? Is authority freshness bounded and matched to risk, so a revocation actually stops a long-running task? Are replays and duplicate effects structurally impossible, so a network blip can't become a double payment?

Points 9–11: fail-closed verification · bounded freshness · replay & idempotency
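The three Domain D checks can be sketched as a single admission function that runs before the effect. This is an illustration under stated assumptions: `admit`, its parameters, and the in-memory `seen_keys` set are hypothetical, and a production system would need a durable store with an atomic insert for the keys.

```python
from typing import Callable, Optional, Set


def admit(action: str,
          idempotency_key: str,
          verify: Callable[[str], Optional[bool]],
          seen_keys: Set[str],
          authority_checked_at: float,
          now: float,
          max_age_s: float) -> bool:
    """Return True only if every check positively succeeds; otherwise deny."""
    if idempotency_key in seen_keys:
        return False                  # replay or duplicate request
    if now - authority_checked_at > max_age_s:
        return False                  # authority too stale for this risk level
    try:
        verdict = verify(action)      # hypothetical verifier callback
    except Exception:
        return False                  # verifier crashed or timed out
    if verdict is not True:
        return False                  # None, False, or anything ambiguous
    seen_keys.add(idempotency_key)    # claim the key before the effect runs
    return True


seen: Set[str] = set()
first = admit("payments.move", "pay-123", lambda a: True, seen, 100.0, 130.0, 60.0)
retry = admit("payments.move", "pay-123", lambda a: True, seen, 100.0, 130.0, 60.0)
```

Every path that is not an explicit `True` returns `False`. A fail-open version of the same function would differ by one line: treating a verifier exception as permission to continue.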

DOMAIN E

Evidence & Audit

If an agent did something wrong last night, could you prove what happened without trusting the agent? Logs the actor controls are testimony, not evidence. We check for signed, immutable pre- and post-effect records that a third party can verify independently.

Point 12: independently verifiable receipts in an append-only store
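One way to make records tamper-evident is a hash chain, sketched below with the Python standard library. The function names and record fields are hypothetical, and the sketch leaves out signatures: third-party verification of who wrote each receipt would need asymmetric signing with a key the agent does not hold.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first receipt


def _digest(prev_hash: str, record: dict) -> str:
    body = json.dumps(record, sort_keys=True)  # canonical key order
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_receipt(log: list, record: dict) -> dict:
    """Append a receipt whose hash covers the record and the previous hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    entry = {"prev": prev_hash, "record": record,
             "hash": _digest(prev_hash, record)}
    log.append(entry)
    return entry


def verify_chain(log: list) -> bool:
    """Recompute every hash; any edit to an earlier record breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True


log: list = []
append_receipt(log, {"phase": "pre", "action": "email.send", "grant": "g-001"})
append_receipt(log, {"phase": "post", "action": "email.send", "status": "sent"})
```

Anyone holding a copy of `log` can run `verify_chain` without trusting the agent that produced it. Keeping that copy somewhere the agent cannot write to is what the append-only store provides.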

What you get

Every deliverable is a written artifact you keep. Nothing lives in a meeting, a screen-share, or anyone's memory.

Scored written report

All twelve points scored Pass / Conditional / Fail against your actual deployment, with the evidence for each score cited inline.

Prioritized remediation plan

Findings ordered by exploitability-weighted severity — execution-boundary failures first, because they're the difference between an incident that is blocked and an incident that is merely logged. Design changes (days to weeks) are distinguished from configuration fixes (same week).

Two rounds of written Q&A

After the report lands, you ask anything in writing — twice. Answers are documented artifacts, so the whole team gets the same answer and it's still there in six months.

Scoring note: the remediation plan also flags which findings would be resolved wholesale by adopting a mediated, receipt-chained execution model — so you can weigh one architectural fix against a dozen point fixes.

Timeline & fee

$2,500 fixed. One deployment per engagement; additional agent systems are quoted separately.

Day 0

You submit the five-question written intake below. About ten minutes of your time.

Within 2 business days

You receive a short written assessment: what I expect the audit to find, whether your deployment is in scope, and a payment link if you want to proceed. Before you've paid anything.

Evidence delivery

You provide the evidence requested per framework point: token lifetimes, a revocation runbook, a sample authorization record. Written requests, written responses.

10 business days later

Scored report and remediation plan delivered. Two rounds of written Q&A follow at your pace.

Why no calls?

This audit's central finding, in deployment after deployment, is the same: claims that live only in conversation don't survive contact with an incident. "We can revoke the agent" said on a call is testimony. A written revocation runbook with timings is evidence. It would be strange to sell you a receipts-based audit through a process that doesn't produce receipts.

So the entire engagement is written: intake, evidence requests, findings, remediation plan, both Q&A rounds. You get documented artifacts your whole team can read, forward, and act on — not notes from a meeting that three people attended and two remember differently. It's also faster for you: no scheduling, no time zones, answer when you have twenty minutes.

An audit whose own process isn't documented wouldn't be much of an audit.

Request the audit intake

Five questions, roughly ten minutes to answer. Your answers determine scope fit and let me tell you, before you pay anything, what I expect the audit to find. Copy these into an email, answer them inline, and send.

QUESTION 1

What can your agents actually do without a human in the loop?

List the systems your agents can write to or act in (send email, modify CRM records, move money, deploy code, message customers — whatever applies), and for each, whether a human approves the specific action or only the general task.

Why I ask: this separates read-path deployments (low urgency) from write-path deployments (where fail-open gets expensive). It also tells me which audit domains carry the most weight for you.

QUESTION 2

Where do the credentials physically live?

When your agent calls an external system, where is the API key, OAuth token, or password at that moment — in the agent's environment or context, in a vault the agent reads from, or behind a gateway the agent calls without ever seeing the secret?

Why I ask: this single answer predicts roughly a third of the findings. If the agent holds credentials, a compromised agent is a compromised everything.

QUESTION 3

Walk me through revoking the agent — for real.

Suppose at 2:14 p.m. you decide one agent task must stop immediately, but everything else should keep running. What exactly do you do, how long does it take to be effective, and what happens to actions already in flight? If the honest answer is "disable the API key and everything stops," say that — it's the most common answer and it's exactly what the audit is for.

Why I ask: revocation is where designs that look fine on a whiteboard fail in practice. The gap between "we decided to stop it" and "it actually stopped" is your real risk exposure window.

QUESTION 4

If an agent did something wrong last night, how would you prove what happened?

Describe what records exist for an agent action after the fact — what they contain, where they're stored, and who or what is capable of altering them. Could you demonstrate to a customer, regulator, or your own board what the agent was authorized to do versus what it actually did?

Why I ask: logs written by the same process being investigated are testimony, not evidence. This question reveals whether you have an audit trail or an audit story.

QUESTION 5

What's the scenario that worries you most?

In a sentence or two: what's the agent failure that would actually hurt — the one that costs you a customer, a compliance finding, or a news cycle? And what, if anything, currently stands between you and it?

Why I ask: the report prioritizes findings by your real exposure, not a generic checklist order. Your answer here decides what gets fixed first.

What happens next: within two business days of receiving your answers, you'll get a short written assessment — what I expect the audit to find, whether your deployment is in scope, and a payment link if you want to proceed. No calls at any stage; every deliverable is an artifact you keep.

Send your intake answers

Who's behind this

Dan Evans, creator of Prime Form Calculus — the open agent-governance architecture this audit's twelve-point framework is derived from. The public PFC reference implementations — the connector gateway proof and the execution boundary proof — are the working, tested measuring stick behind every finding.

Scope boundaries

The audit covers agent authority, delegation, execution boundaries, and auditability. It does not cover model-level alignment, prompt-injection resistance of specific models, general application security, or compliance certification (SOC 2, ISO 27001) — though findings frequently feed those efforts.