Agent-driven testing

Sandbox projects are designed for an AI coding agent to integrate @txnod/sdk end-to-end, exercise every webhook event deterministically, and verify the integration before any production wallet exists. This guide is the agent-loop deep dive — what the agent reads, how it writes the integration, how it drives the simulate-* matrix, how it verifies, and where it stops.

For the project-creation flow and safety analysis, read Sandbox projects and Sandbox safety first.

The agent loop

The loop is integrate → exercise → verify. The agent is given four credentials (TXNOD_PROJECT_ID, TXNOD_API_SECRET starting sk_sandbox_, TXNOD_WEBHOOK_SECRET, sandbox PAT) and a prompt; it produces a working webhook handler plus a Vitest suite that proves all 7 events round-trip with mode: 'sandbox' set, then stops at sandbox-green. No production credentials, no hardware wallet, no faucet are involved.

What the agent reads first

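The bundled documentation is the authoritative entry point: it is frozen at the installed SDK version and readable without network access. The agent prompt in this guide starts from node_modules/@txnod/sdk/AGENTS.md and docs/05-sandbox.md.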

How to write the integration

A complete worked example fits in three pieces: a shared TxnodClient constructed at module scope, a two-secret routing helper for dual-mode receivers, and two route handlers (checkout to mint invoices, webhook to verify and dispatch events). The shape uses Web-standard Request/Response so it ports verbatim into Next.js App Router, Hono, Fastify (with request.raw), and any Web-Fetch runtime. For Express, mount express.raw({ type: 'application/json' }) on the webhook route and swap request.text() for req.body.toString('utf8'); express.raw() leaves the unparsed body in req.body as a Buffer.

import { TxnodClient } from '@txnod/sdk';

export const txnod = new TxnodClient({
  projectId: process.env['TXNOD_PROJECT_ID']!,
  apiSecret: process.env['TXNOD_API_SECRET']!, // sk_sandbox_...
  environment: 'non-production',
});

// Two-secret routing for dual-mode receivers: sandbox callbacks reach the
// same URL as production. Peek the body for the "mode":"sandbox" literal
// before HMAC verification; if present, use the sandbox secret.
export function pickWebhookSecret(rawBody: string): string {
  const looksSandbox = /"mode"\s*:\s*"sandbox"/.test(rawBody);
  return looksSandbox
    ? process.env['TXNOD_WEBHOOK_SECRET_SANDBOX']!
    : process.env['TXNOD_WEBHOOK_SECRET']!;
}

import { TxnodClient } from '@txnod/sdk';

declare const txnod: TxnodClient;

export async function checkout(request: Request): Promise<Response> {
  const { externalId, amountUsd } = (await request.json()) as {
    externalId: string;
    amountUsd: number;
  };
  const invoice = await txnod.createInvoice({
    external_id: externalId,
    amount_usd: amountUsd,
    coin: 'usdt_trc20',
    callback_url: `${process.env['SITE_URL']!}/api/txnod-webhook`,
  });
  return Response.json({
    invoiceId: invoice.id,
    paymentUri: invoice.payment_uri,
  });
}

import {
  verifyWebhookSignature,
  TxnodHmacError,
  TxnodTimestampError,
} from '@txnod/sdk';

declare function pickWebhookSecret(rawBody: string): string;

const seenEventIds = new Set<string>(); // replace with your dedupe store

export async function webhook(request: Request): Promise<Response> {
  const rawBody = await request.text();
  try {
    const event = verifyWebhookSignature(
      request.headers,
      rawBody,
      pickWebhookSecret(rawBody),
    );
    if (event.mode === 'sandbox' && process.env['NODE_ENV'] === 'production') {
      return Response.json(
        { error: 'refusing sandbox event in production' },
        { status: 400 },
      );
    }
    if (seenEventIds.has(event.event_id)) return Response.json({ ok: true });
    seenEventIds.add(event.event_id);
    if (event.event_type === 'invoice.paid') {
      // event.data.invoice_id is fully typed in this branch; fulfil order
    }
    return Response.json({ ok: true });
  } catch (err) {
    if (err instanceof TxnodHmacError || err instanceof TxnodTimestampError) {
      return Response.json({ error: 'signature' }, { status: 401 });
    }
    throw err;
  }
}

The pattern that makes this agent-friendly: dedupe on event.event_id (stable across retries and reorg-replays), branch on event.event_type for narrowed event.data types, fail-close when event.mode === 'sandbox' is observed in NODE_ENV=production. The two-secret routing means a single deploy can verify both production and sandbox callbacks without per-route forking.
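The three rules above can be sketched as one pure function that is independent of any HTTP framework. This is a minimal illustration, not the SDK's implementation: `SketchEvent`, `makeHandler`, and the `'production'` mode literal are hypothetical names introduced here, and only `event_id`, `event_type`, `mode`, and `data.invoice_id` are taken from this guide.

```typescript
// Hypothetical, trimmed-down event shapes; the real SDK types may differ.
type SketchEvent =
  | { event_id: string; mode: 'sandbox' | 'production'; event_type: 'invoice.paid'; data: { invoice_id: string } }
  | { event_id: string; mode: 'sandbox' | 'production'; event_type: 'invoice.expired'; data: { invoice_id: string } };

type Outcome = 'rejected-sandbox' | 'duplicate' | 'fulfilled' | 'ignored';

function makeHandler(isProduction: boolean, fulfil: (invoiceId: string) => void) {
  const seen = new Set<string>(); // in-memory only; swap for a durable store in real code

  return function handle(event: SketchEvent): Outcome {
    // Rule 1: fail closed when a sandbox event shows up in production.
    if (event.mode === 'sandbox' && isProduction) return 'rejected-sandbox';

    // Rule 2: dedupe on event_id, which stays stable across retries.
    if (seen.has(event.event_id)) return 'duplicate';
    seen.add(event.event_id);

    // Rule 3: branch on event_type so event.data is narrowed per branch.
    switch (event.event_type) {
      case 'invoice.paid':
        fulfil(event.data.invoice_id);
        return 'fulfilled';
      default:
        return 'ignored';
    }
  };
}
```

Returning an outcome instead of a Response keeps the decision logic easy to unit-test; the route handler can map each outcome to a status code.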

The agent prompt structure that produces clean integrations is:

You are integrating @txnod/sdk into this project. Default to sandbox mode.

Credentials (already in .env.local):
  TXNOD_PROJECT_ID
  TXNOD_API_SECRET (starts with sk_sandbox_)
  TXNOD_WEBHOOK_SECRET
  TXNOD_WEBHOOK_SECRET_SANDBOX
  TXNOD_PAT (sandbox:simulate scope)

Steps:
  1. Read node_modules/@txnod/sdk/AGENTS.md and docs/05-sandbox.md.
  2. Mirror the route shape from docs/examples/nextjs-route-handler.md.
  3. Implement two-secret webhook routing (sandbox first, fall through to production).
  4. Idempotent-dedup on event.event_id.
  5. Branch handler logic on event.mode === 'sandbox' to fail-closed in production.
  6. Write a Vitest suite that drives all 7 simulate-* scenarios.
  7. Run pnpm test until green; STOP. Do not deploy, do not push.

A typical agent trace follows the seven-step prompt above: read the bundled AGENTS.md + docs/05-sandbox.md, install @txnod/sdk, scaffold the two route handlers shown above, wire the two-secret helper, implement an event.event_id dedupe set, write a 7-scenario Vitest suite that drives the simulate-* matrix below, run pnpm test until green, then stop. The agent never proceeds to mainnet without explicit human approval — see Stop condition.

How to drive the simulate-* loop

Sandbox state transitions are deterministic — every method call advances exactly one state and emits exactly one webhook (except simulateDuplicateDelivery, which re-fires the most recent terminal event with the same event_id). The 7 scenarios that cover the full event matrix:

| Scenario | SDK calls | MCP tools | Webhook events delivered |
| --- | --- | --- | --- |
| 1. detected → paid | simulateDetect, simulatePaid | sandbox_simulate_detect, sandbox_simulate_paid | invoice.detected, invoice.paid |
| 2. detected → overpaid | simulateDetect, simulateOverpaid | sandbox_simulate_detect, sandbox_simulate_overpaid | invoice.detected, invoice.overpaid |
| 3. detected → partial | simulateDetect, simulatePartial | sandbox_simulate_detect, sandbox_simulate_partial | invoice.detected, invoice.partial |
| 4. pending → expired | simulateExpire | sandbox_simulate_expire | invoice.expired |
| 5. expired → expired_paid_late | simulateExpire, simulateLatePayment | sandbox_simulate_expire, sandbox_simulate_late_payment | invoice.expired, invoice.expired_paid_late |
| 6. paid → reverted → paid (reorg + reconfirm) | simulateDetect, simulatePaid, simulateReorg, simulateReconfirm | sandbox_simulate_detect, sandbox_simulate_paid, sandbox_simulate_reorg, sandbox_simulate_reconfirm | invoice.detected, invoice.paid, invoice.reverted, invoice.paid (re-emitted with stable event_id) |
| 7. duplicate delivery (idempotency) | simulateDuplicateDelivery | sandbox_simulate_duplicate_delivery | re-fires most recent terminal event with the SAME event_id |
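One way to keep the matrix in code is a small data table that a test suite can iterate over. The sketch below covers scenarios 1 to 6 and derives the expected webhook sequence from the SDK calls, relying on the one-call-one-webhook rule. The method and event names come from the table; `eventForCall`, `scenarios`, and `expectedEvents` are hypothetical helpers, and the SDK method signatures are deliberately left out because this guide does not state them. Scenario 7 is omitted since its expected event depends on whichever terminal event preceded it.

```typescript
// Call-to-event mapping read off the scenario table.
const eventForCall = {
  simulateDetect: 'invoice.detected',
  simulatePaid: 'invoice.paid',
  simulateOverpaid: 'invoice.overpaid',
  simulatePartial: 'invoice.partial',
  simulateExpire: 'invoice.expired',
  simulateLatePayment: 'invoice.expired_paid_late',
  simulateReorg: 'invoice.reverted',
  simulateReconfirm: 'invoice.paid',
} as const;

type SimulateCall = keyof typeof eventForCall;

const scenarios: { name: string; calls: SimulateCall[] }[] = [
  { name: '1. detected -> paid', calls: ['simulateDetect', 'simulatePaid'] },
  { name: '2. detected -> overpaid', calls: ['simulateDetect', 'simulateOverpaid'] },
  { name: '3. detected -> partial', calls: ['simulateDetect', 'simulatePartial'] },
  { name: '4. pending -> expired', calls: ['simulateExpire'] },
  { name: '5. expired -> expired_paid_late', calls: ['simulateExpire', 'simulateLatePayment'] },
  {
    name: '6. paid -> reverted -> paid',
    calls: ['simulateDetect', 'simulatePaid', 'simulateReorg', 'simulateReconfirm'],
  },
];

// Each call emits exactly one webhook, so the expected sequence is a 1:1 map.
function expectedEvents(calls: SimulateCall[]): string[] {
  return calls.map((call) => eventForCall[call]);
}
```

A suite can loop over `scenarios`, issue each call against a fresh sandbox invoice, and compare the delivered event types with `expectedEvents(scenario.calls)`.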

clockAdvance(projectId, { chain, blocks }) increments per-chain confirmation counters across detected invoices — drive it when integration logic gates on event.data.confirmations.
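A receiver that gates on confirmations might look like the sketch below. Only `event.data.confirmations` is named in this guide; the `chain` field on the event, the `'example-chain'` identifier, and the threshold of 6 are assumptions for illustration. `blocksStillNeeded` returns the value a test could pass as `blocks` to clockAdvance to push an invoice across the threshold.

```typescript
// Hypothetical event shape: only `data.confirmations` is named in this guide.
interface ConfirmableEvent {
  data: { chain: string; confirmations: number };
}

// Hypothetical policy table; use your own thresholds and the SDK's chain names.
const requiredConfirmations: Record<string, number> = { 'example-chain': 6 };

// How many more blocks must pass before the event meets the policy.
function blocksStillNeeded(event: ConfirmableEvent): number {
  const required = requiredConfirmations[event.data.chain];
  if (required === undefined) {
    throw new Error(`no confirmation policy for chain ${event.data.chain}`);
  }
  return Math.max(0, required - event.data.confirmations);
}

// True once the event has at least the required confirmations.
function canRelease(event: ConfirmableEvent): boolean {
  return blocksStillNeeded(event) === 0;
}
```

Throwing on an unknown chain keeps the gate fail-closed: a missing policy entry never silently releases an order.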

Call reset(projectId) in beforeAll and afterAll so each test run starts clean. destroy(projectId) cascade-deletes the entire project; use it only at end-of-life, since sandbox projects are usually kept long-lived.

How to verify

The agent verifies correctness in three places:

  1. assertSafeMode() at test boot. The byte-exact CI assertion (see Sandbox safety → Recommended CI assertion) imported from tests/setup.ts fails the run if the env wiring is wrong.
  2. Expected event_type and terminal status per scenario. Each scenario polls listWebhookEvents for the expected event-type set (invoice.detected, invoice.paid, etc.) and asserts the SDK’s returned status. The structural sandbox-mode invariant is enforced server-side by the dispatcher (every webhook from a kind='sandbox' project carries mode: 'sandbox'); receiver-side per-event mode checks live in the route handler at src/app/api/txnod-webhook/route.ts, which fail-closes when event.mode === 'sandbox' is observed in NODE_ENV=production.
  3. The 7-scenario Vitest suite, all green. This is the operational definition of “agent-ready” — sandbox-green proves the integration handles every webhook type the dispatcher can emit.
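The polling step in item 2 can be sketched as a generic helper. The signature of listWebhookEvents is not given in this guide, so the sketch takes a hypothetical `fetchDelivered` callback that a test would implement by calling listWebhookEvents and mapping the result to event-type strings; `missingEventTypes`, `waitForEvents`, and the default attempt and delay values are likewise illustrative.

```typescript
// Which expected event types have not been delivered yet.
function missingEventTypes(expected: string[], delivered: string[]): string[] {
  const got = new Set(delivered);
  return expected.filter((type) => !got.has(type));
}

// Poll until every expected event type has been seen, or give up.
async function waitForEvents(
  fetchDelivered: () => Promise<string[]>, // hypothetical wrapper around listWebhookEvents
  expected: string[],
  { attempts = 20, delayMs = 250 } = {},
): Promise<void> {
  let missing = expected;
  for (let i = 0; i < attempts; i++) {
    missing = missingEventTypes(expected, await fetchDelivered());
    if (missing.length === 0) return;
    await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
  }
  throw new Error(`timed out waiting for: ${missing.join(', ')}`);
}
```

In a scenario test this would read along the lines of `await waitForEvents(fetchTypes, ['invoice.detected', 'invoice.paid'])`, followed by the assertion on the returned status.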

The handler MUST dedup on event.event_id (not on tx_hash, not on (invoice_id, event_type)) — simulateDuplicateDelivery and the paid → reverted → paid reconfirm flow both re-fire the same event_id and your idempotency layer is the device under test.
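A small simulation shows why the dedupe key matters. It replays scenario 1 followed by a duplicate delivery and compares two key choices. The `Delivery` shape and the sample IDs are hypothetical, and the sketch assumes that invoice.detected and invoice.paid for one payment carry the same tx_hash, which this guide does not state.

```typescript
// Hypothetical flattened view of a delivered webhook.
interface Delivery {
  event_id: string;
  event_type: string;
  tx_hash: string;
}

// Returns the event types that survive deduplication under a given key.
function processedTypes(deliveries: Delivery[], keyOf: (d: Delivery) => string): string[] {
  const seen = new Set<string>();
  const processed: string[] = [];
  for (const delivery of deliveries) {
    const key = keyOf(delivery);
    if (seen.has(key)) continue; // treated as a replay
    seen.add(key);
    processed.push(delivery.event_type);
  }
  return processed;
}

// detected -> paid, then the paid event re-fired with the same event_id.
const replay: Delivery[] = [
  { event_id: 'evt_a', event_type: 'invoice.detected', tx_hash: 'tx_1' },
  { event_id: 'evt_b', event_type: 'invoice.paid', tx_hash: 'tx_1' },
  { event_id: 'evt_b', event_type: 'invoice.paid', tx_hash: 'tx_1' },
];

const byEventId = processedTypes(replay, (d) => d.event_id); // detected and paid, each once
const byTxHash = processedTypes(replay, (d) => d.tx_hash); // only detected: paid is lost
```

Under that assumption, keying on tx_hash drops the legitimate invoice.paid event, while keying on event_id processes each distinct event once and discards only the replay.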

Stop condition

The agent stops at sandbox-green. Sandbox-green is the operational completion signal: all 7 scenarios pass, assertSafeMode() is wired, the handler dedups on event_id, and the integration is committed to git. The agent does NOT proceed to mainnet, does NOT swap to a production secret, and does NOT register a real wallet; those steps belong to the human owner of the project. See the Sandbox projects → Graduate to production section for the human’s promotion checklist.

What NOT to do

These anti-patterns must never appear in agent-authored code:

  • Do NOT promote a sandbox xpub to a production env. Sandbox xpubs are testnet-derived; using them in production routes real customer funds to addresses an attacker could reach via the public testnet faucet.
  • Do NOT bypass iAcknowledgeRoutingRealCustomerFundsToSandboxAddresses without explicit human approval. The override exists for staging-replica setups that mirror production env vars; defaulting to it in agent-generated code is a category error.
  • Do NOT paste a sk_sandbox_... API secret into a production .env. Always keep the sandbox secret in .env.local (developer machines + CI) and the production secret in the deployed environment.
  • Do NOT ignore the mode field in the webhook handler. A handler that doesn’t branch on event.mode will happily process a sandbox event in production, defeating layer 5 of the seven-layer defense.
  • Do NOT proceed to mainnet without the human’s explicit approval. Sandbox-green is the stop condition; mainnet promotion is a separate human-driven step (see Sandbox projects → Graduate to production).
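The secret-placement rule above lends itself to a startup check. The sketch below is a hypothetical guard, not the SDK's assertSafeMode() and not the byte-exact CI assertion from Sandbox safety; `findSandboxLeaks` and `assertNoSandboxLeaks` are names invented here. It relies only on the sk_sandbox_ prefix and the NODE_ENV convention used elsewhere in this guide, and takes the environment as a parameter so it can be tested without touching process.env.

```typescript
type Env = Record<string, string | undefined>;

// Lists problems when a sandbox API secret is present in a production environment.
function findSandboxLeaks(env: Env): string[] {
  if (env['NODE_ENV'] !== 'production') return [];
  const problems: string[] = [];
  if (env['TXNOD_API_SECRET']?.startsWith('sk_sandbox_')) {
    problems.push('TXNOD_API_SECRET is a sandbox secret (sk_sandbox_...)');
  }
  return problems;
}

// Throws at boot so a misconfigured deploy fails before serving traffic.
function assertNoSandboxLeaks(env: Env): void {
  const problems = findSandboxLeaks(env);
  if (problems.length > 0) {
    throw new Error(`unsafe production configuration: ${problems.join('; ')}`);
  }
}
```

Calling `assertNoSandboxLeaks(process.env)` once at server start would turn a pasted sandbox secret into a failed boot rather than a silent misconfiguration.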

For the full safety analysis with per-layer code examples and three failure-scenario walkthroughs, read Sandbox safety. For the project-creation flow, read Sandbox projects.