Agent-driven testing

Sandbox projects are designed for an AI coding agent to integrate @txnod/sdk end-to-end, exercise every webhook event deterministically, and verify the integration before any production wallet exists. This guide is the agent-loop deep dive — what the agent reads, how it writes the integration, how it drives the simulate-* matrix, how it verifies, and where it stops.

For the project-creation flow and safety analysis, read Sandbox projects and Sandbox safety first.

The agent loop

The loop is integrate → exercise → verify. The agent is given three credentials (TXNOD_API_KEY_ID; TXNOD_API_SECRET, which starts with sk_sandbox_; and a sandbox PAT) and a prompt. It produces a working webhook handler plus a Vitest suite that proves all 7 events round-trip with mode: 'sandbox' set, then stops at sandbox-green. TXNOD_API_SECRET is the only secret: it signs API requests and verifies webhook signatures. No production credentials, hardware wallet, or faucet are involved.

What the agent reads first

The bundled documentation is the authoritative entry point — frozen at the installed SDK version, network-free:

node_modules/@txnod/sdk/AGENTS.md — read order, non-negotiable invariants.
node_modules/@txnod/sdk/docs/05-sandbox.md — client.sandbox.* surface, environment-detection guards, layered defenses, per-chain testnet truth table.
node_modules/@txnod/sdk/docs/examples/sandbox-vitest-suite.md — the canonical 7-scenario Vitest harness an agent should mirror.
https://docs.txnod.com/llms.txt and llms-full.txt — the indexed corpus when an offline copy is not available.

How to write the integration

A complete worked example fits in two pieces: a shared TxnodClient constructed at module scope, and two route handlers (checkout to mint invoices, webhook to verify and dispatch events). The shape uses Web-standard Request/Response so it ports verbatim into Next.js App Router, Hono, Fastify (with request.raw), and any Web-Fetch runtime. For Express, mount express.raw({ type: 'application/json' }) on the webhook route and swap request.text() for req.body.toString('utf8'); express.raw() leaves the unparsed bytes on req.body as a Buffer.


import { TxnodClient } from '@txnod/sdk';
 
export const txnod = new TxnodClient({
  projectId: process.env['TXNOD_API_KEY_ID']!,
  apiSecret: process.env['TXNOD_API_SECRET']!, // sk_sandbox_...
  environment: 'non-production',
});


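import type { TxnodClient } from '@txnod/sdk';

// The shared client from the module above, declared here so the snippet type-checks on its own.
declare const txnod: TxnodClient;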
 
export async function checkout(request: Request): Promise<Response> {
  const { externalId, amountUsd } = (await request.json()) as {
    externalId: string;
    amountUsd: number;
  };
  const invoice = await txnod.sandbox.createInvoice({
    external_id: externalId,
    amount_usd: amountUsd,
    coin: 'usdt_trc20',
    callback_url: `${process.env['SITE_URL']!}/api/txnod-webhook`,
  });
  return Response.json({
    invoiceId: invoice.id,
    paymentUri: invoice.payment_uri,
  });
}


import {
  verifyWebhookSignature,
  TxnodHmacError,
  TxnodTimestampError,
} from '@txnod/sdk';
 
const seenEventIds = new Set<string>(); // replace with your dedupe store
 
export async function webhook(request: Request): Promise<Response> {
  const rawBody = await request.text();
  try {
    const event = verifyWebhookSignature(
      request.headers,
      rawBody,
      process.env['TXNOD_API_SECRET']!,
    );
    if (event.mode === 'sandbox' && process.env['NODE_ENV'] === 'production') {
      return Response.json(
        { error: 'refusing sandbox event in production' },
        { status: 400 },
      );
    }
    if (seenEventIds.has(event.event_id)) return Response.json({ ok: true });
    seenEventIds.add(event.event_id);
    if (event.event_type === 'invoice.paid') {
      // event.data.invoice_id is fully typed in this branch — fulfil order
    }
    return Response.json({ ok: true });
  } catch (err) {
    if (err instanceof TxnodHmacError || err instanceof TxnodTimestampError) {
      return Response.json({ error: 'signature' }, { status: 401 });
    }
    throw err;
  }
}

The pattern that makes this agent-friendly: dedupe on event.event_id (stable across retries and reorg-replays), branch on event.event_type for narrowed event.data types, fail-close when event.mode === 'sandbox' is observed in NODE_ENV=production. A single TXNOD_API_SECRET both signs your API requests and verifies inbound webhook signatures, so the handler verifies every callback with that one secret and branches on the event.mode discriminator only after the signature checks out.
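
The in-memory Set in the handler above is a placeholder. A minimal sketch of a dedupe store with expiry follows, assuming a single process. EventDedupeStore, claim, and the TTL are hypothetical names chosen for illustration and are not part of @txnod/sdk; a multi-instance deployment would need the same claim-once behaviour from a shared store.

```typescript
// Hypothetical in-process dedupe store keyed on event.event_id.
// Not part of @txnod/sdk; the class name, method name, and TTL are illustrative.
class EventDedupeStore {
  // event_id -> expiry timestamp in milliseconds
  private readonly seen = new Map<string, number>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns true the first time an event_id is claimed,
  // false for any repeat seen before the entry expires.
  claim(eventId: string): boolean {
    const t = this.now();
    // Evict expired entries so the map does not grow without bound.
    this.seen.forEach((expiresAt, id) => {
      if (expiresAt <= t) this.seen.delete(id);
    });
    if (this.seen.has(eventId)) return false;
    this.seen.set(eventId, t + this.ttlMs);
    return true;
  }
}
```

In the webhook handler, a single check such as `if (!store.claim(event.event_id)) return Response.json({ ok: true });` could replace the has/add pair, which keeps the check and the write in one step.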

The agent prompt structure that produces clean integrations is:


You are integrating @txnod/sdk into this project. Default to sandbox mode.

Credentials (already in .env.local):
  TXNOD_API_KEY_ID
  TXNOD_API_SECRET (starts with sk_sandbox_; signs API requests AND verifies webhooks)
  TXNOD_PAT (sandbox:simulate scope)

Steps:
  1. Read node_modules/@txnod/sdk/AGENTS.md and docs/05-sandbox.md.
  2. Mirror the route shape from docs/examples/nextjs-route-handler.md.
  3. Verify inbound webhook signatures with TXNOD_API_SECRET (your only secret).
  4. Idempotent-dedup on event.event_id.
  5. Branch handler logic on event.mode === 'sandbox' to fail-closed in production.
  6. Write a Vitest suite that drives all 7 simulate-* scenarios.
  7. Run pnpm test until green; STOP. Do not deploy, do not push.

A typical agent trace follows the seven-step prompt above: install @txnod/sdk, read the bundled AGENTS.md and docs/05-sandbox.md, scaffold the two route handlers shown above, implement an event.event_id dedupe set, write a 7-scenario Vitest suite that drives the simulate-* matrix below, run pnpm test until green, then stop. The agent never proceeds to mainnet without explicit human approval (see Stop condition).

How to drive the simulate-* loop

Sandbox state transitions are deterministic — every method call advances exactly one state and emits exactly one webhook (except simulateDuplicateDelivery, which re-fires the most recent terminal event with the same event_id). The 7 scenarios that cover the full event matrix:

Scenario	SDK calls	MCP tools	Webhook events delivered
1. detected → paid	`simulateDetect`, `simulatePaid`	`sandbox_simulate_detect`, `sandbox_simulate_paid`	`invoice.detected`, `invoice.paid`
2. detected → overpaid	`simulateDetect`, `simulateOverpaid`	`sandbox_simulate_detect`, `sandbox_simulate_overpaid`	`invoice.detected`, `invoice.overpaid`
3. detected → partial	`simulateDetect`, `simulatePartial`	`sandbox_simulate_detect`, `sandbox_simulate_partial`	`invoice.detected`, `invoice.partial`
4. pending → expired	`simulateExpire`	`sandbox_simulate_expire`	`invoice.expired`
5. expired → expired_paid_late	`simulateExpire`, `simulateLatePayment`	`sandbox_simulate_expire`, `sandbox_simulate_late_payment`	`invoice.expired`, `invoice.expired_paid_late`
6. paid → reverted → paid (reorg + reconfirm)	`simulateDetect`, `simulatePaid`, `simulateReorg`, `simulateReconfirm`	`sandbox_simulate_detect`, `sandbox_simulate_paid`, `sandbox_simulate_reorg`, `sandbox_simulate_reconfirm`	`invoice.detected`, `invoice.paid`, `invoice.reverted`, `invoice.paid` (re-emitted with stable `event_id`)
7. duplicate delivery (idempotency)	`simulateDuplicateDelivery`	`sandbox_simulate_duplicate_delivery`	re-fires most recent terminal event with the SAME `event_id`

clockAdvance(projectId, { chain, blocks }) increments per-chain confirmation counters across detected invoices — drive it when integration logic gates on event.data.confirmations.
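
The scenario table above can be captured as data so that a test suite iterates over it instead of repeating each sequence by hand. This is a sketch under assumptions: the method and event names are copied from the table, but the Scenario type, the SCENARIOS constant, and the invoke adapter are hypothetical, because the argument lists of the simulate* methods are not shown in this guide.

```typescript
// Method and event names are copied from the scenario table.
// The Scenario type, SCENARIOS constant, and helpers are illustrative only.
type SandboxCall =
  | 'simulateDetect' | 'simulatePaid' | 'simulateOverpaid'
  | 'simulatePartial' | 'simulateExpire' | 'simulateLatePayment'
  | 'simulateReorg' | 'simulateReconfirm' | 'simulateDuplicateDelivery';

interface Scenario {
  name: string;
  calls: readonly SandboxCall[];
  // Event types expected in delivery order. Left empty for duplicate delivery,
  // whose re-fired event depends on the invoice's most recent terminal event.
  expectedEvents: readonly string[];
}

const SCENARIOS: readonly Scenario[] = [
  { name: 'detected → paid', calls: ['simulateDetect', 'simulatePaid'],
    expectedEvents: ['invoice.detected', 'invoice.paid'] },
  { name: 'detected → overpaid', calls: ['simulateDetect', 'simulateOverpaid'],
    expectedEvents: ['invoice.detected', 'invoice.overpaid'] },
  { name: 'detected → partial', calls: ['simulateDetect', 'simulatePartial'],
    expectedEvents: ['invoice.detected', 'invoice.partial'] },
  { name: 'pending → expired', calls: ['simulateExpire'],
    expectedEvents: ['invoice.expired'] },
  { name: 'expired → expired_paid_late', calls: ['simulateExpire', 'simulateLatePayment'],
    expectedEvents: ['invoice.expired', 'invoice.expired_paid_late'] },
  { name: 'paid → reverted → paid',
    calls: ['simulateDetect', 'simulatePaid', 'simulateReorg', 'simulateReconfirm'],
    expectedEvents: ['invoice.detected', 'invoice.paid', 'invoice.reverted', 'invoice.paid'] },
  { name: 'duplicate delivery', calls: ['simulateDuplicateDelivery'],
    expectedEvents: [] },
];

// Collects every distinct event type the matrix is expected to deliver.
function coveredEventTypes(scenarios: readonly Scenario[]): Set<string> {
  const types = new Set<string>();
  scenarios.forEach((s) => s.expectedEvents.forEach((e) => types.add(e)));
  return types;
}

// Runs one scenario's calls in order. `invoke` is a hypothetical adapter that
// maps a method name to the real client.sandbox call with its arguments.
async function runScenario(
  scenario: Scenario,
  invoke: (call: SandboxCall) => Promise<void>,
): Promise<void> {
  for (const call of scenario.calls) {
    await invoke(call);
  }
}
```

A quick sanity check on the data is that coveredEventTypes(SCENARIOS) contains the 7 event types the suite is meant to prove.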

Call reset(projectId) in beforeAll and afterAll so each test run starts clean. destroy(projectId) deletes the entire project along with everything that belongs to it; use it only at end-of-life, because sandbox projects are usually kept long-lived.
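
The reset wiring described above can be sketched as a small helper. It takes the hook functions as parameters so it does not depend on a particular test runner. registerSandboxReset, SandboxResetter, and SuiteHook are hypothetical names; only reset(projectId) is taken from this guide, and its return type is assumed to be a promise.

```typescript
// Structural type for the single method this sketch relies on.
interface SandboxResetter {
  reset(projectId: string): Promise<unknown>;
}

// Assumed shape of a suite-level hook that accepts an async callback.
type SuiteHook = (fn: () => Promise<unknown>) => void;

function registerSandboxReset(
  hooks: { beforeAll: SuiteHook; afterAll: SuiteHook },
  sandbox: SandboxResetter,
  projectId: string,
): void {
  // Start from a clean project, and leave a clean one behind for the next run.
  hooks.beforeAll(() => sandbox.reset(projectId));
  hooks.afterAll(() => sandbox.reset(projectId));
}
```

In a Vitest file this would be called once at the top of the suite, passing the beforeAll and afterAll functions imported from 'vitest' together with the sandbox client and project ID.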

How to verify

The agent verifies correctness in three places:

assertSafeMode() at test boot. The byte-exact CI assertion (see Sandbox safety → Recommended CI assertion) imported from tests/setup.ts fails the run if the env wiring is wrong.
Expected event_type and terminal status per scenario. Each scenario polls client.sandbox.listWebhookEvents (the sandbox mirror of the kind-locked production delivery log) for the expected event-type set (invoice.detected, invoice.paid, etc.) and asserts the SDK’s returned status. The structural sandbox-mode invariant is enforced server-side by the dispatcher (every webhook from a kind='sandbox' project carries mode: 'sandbox'); receiver-side per-event mode checks live in the route handler at src/app/api/txnod-webhook/route.ts, which fail-closes when event.mode === 'sandbox' is observed in NODE_ENV=production.
The 7-scenario Vitest suite, all green. This is the operational definition of “agent-ready” — sandbox-green proves the integration handles every webhook type the dispatcher can emit.
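
The per-scenario event check above reduces to a set comparison. The sketch below is a pure helper that a poll loop could call on each pass until it returns an empty array or a timeout elapses. The DeliveredEvent shape assumes only the event_type field named in this guide, since the return type of client.sandbox.listWebhookEvents is not shown here; missingEventTypes is a hypothetical name.

```typescript
// Hypothetical minimal shape: only event_type is assumed.
interface DeliveredEvent {
  event_type: string;
}

// Returns the expected event types that have not been delivered yet,
// preserving the order in which they were expected.
function missingEventTypes(
  expected: readonly string[],
  delivered: readonly DeliveredEvent[],
): string[] {
  const seen = new Set(delivered.map((e) => e.event_type));
  return expected.filter((t) => !seen.has(t));
}
```

Returning the missing types rather than a boolean lets a failing test report exactly which event never arrived.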

The handler MUST dedupe on event.event_id, not on tx_hash and not on (invoice_id, event_type). Both simulateDuplicateDelivery and the paid → reverted → paid reconfirm flow re-fire the same event_id, so your idempotency layer is the device under test.

Stop condition

The agent stops at sandbox-green. Sandbox-green is the operational completion signal: all 7 scenarios pass, assertSafeMode() is wired, the handler dedupes on event_id, and the integration is committed to git. The agent does NOT proceed to mainnet, does NOT swap to a production secret, and does NOT register a real wallet; those steps belong to the human owner of the project. See Sandbox projects → Graduate to production for the human’s promotion checklist.

What NOT to do

These anti-patterns must never appear in agent-authored code:

Do NOT promote a sandbox xpub to a production env. Sandbox xpubs are testnet-derived; using them in production routes real customer funds to addresses an attacker could reach via the public testnet faucet.
Do NOT bypass iAcknowledgeRoutingRealCustomerFundsToSandboxAddresses without explicit human approval. The override exists for staging-replica setups that mirror production env vars; defaulting to it in agent-generated code is a category error.
Do NOT paste a sk_sandbox_... API secret into a production .env. Always keep the sandbox secret in .env.local (developer machines + CI) and the production secret in the deployed environment.
Do NOT ignore the mode field in the webhook handler. A handler that doesn’t branch on event.mode will happily process a sandbox event in production, defeating layer 5 of the seven-layer defense.
Do NOT proceed to mainnet without the human’s explicit approval. Sandbox-green is the stop condition; mainnet promotion is a separate human-driven step (see Sandbox projects → Graduate to production).
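
The rule about sandbox secrets in production can also be backed by a boot-time check. This is a minimal sketch, not the SDK's assertSafeMode() and not a replacement for it: assertNoSandboxSecretInProduction is a hypothetical helper that relies only on the sk_sandbox_ prefix and the NODE_ENV convention described in this guide.

```typescript
// Hypothetical boot-time guard, complementary to the SDK's own checks.
function assertNoSandboxSecretInProduction(
  env: Record<string, string | undefined>,
): void {
  const secret = env['TXNOD_API_SECRET'] ?? '';
  if (env['NODE_ENV'] === 'production' && secret.startsWith('sk_sandbox_')) {
    // The message names the variable but never echoes the secret itself.
    throw new Error(
      'TXNOD_API_SECRET is a sandbox secret (sk_sandbox_...) but NODE_ENV=production',
    );
  }
}
```

Calling assertNoSandboxSecretInProduction(process.env) once at process start would make a misplaced sandbox secret fail fast instead of surfacing later as rejected requests.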

For the full safety analysis with per-layer code examples and three failure-scenario walkthroughs, read Sandbox safety. For the project-creation flow, read Sandbox projects.