Sample Deliverables

This page shows the structure, core content, and review criteria of a standard Agent delivery package. All content is anonymized.

Package Directory Structure

README.md -> project overview and quick start

agent-spec.json -> Agent goals, boundaries, and input/output definitions

system-prompt.md -> full system prompt text

workflow.md -> workflow steps, routing logic, and exception handling

test-cases/ -> 26+ test cases (normal / edge / error / high-risk / adversarial)

platform-adapters/ -> Claude Code / Codex / OpenClaw / n8n deployment notes

user-guide.md -> illustrated client handoff guide
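A reviewer could check a package against this layout with a short script. The sketch below is illustrative only: the entry names come from the listing above, while the function name `missing_entries` and the rule that a trailing "/" marks a directory are assumptions made for this example.

```python
from pathlib import Path

# Entries taken from the directory listing above.
# Assumption: a trailing "/" marks a directory, anything else is a file.
REQUIRED_ENTRIES = [
    "README.md",
    "agent-spec.json",
    "system-prompt.md",
    "workflow.md",
    "test-cases/",
    "platform-adapters/",
    "user-guide.md",
]


def missing_entries(package_root: str) -> list[str]:
    """Return the required entries that are absent from package_root."""
    root = Path(package_root)
    missing = []
    for entry in REQUIRED_ENTRIES:
        path = root / entry.rstrip("/")
        present = path.is_dir() if entry.endswith("/") else path.is_file()
        if not present:
            missing.append(entry)
    return missing
```

An empty result means every listed entry exists. It says nothing about the quality of the content inside each file.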

Agent Spec Example (Summary)

Agent Name -> Customer FAQ Auto-Reply Agent

Business Goal -> Auto-handle 80% of common questions and escalate complex cases to humans

Target User -> E-commerce customer support team

Risk Boundary -> No refund amount promises, no order edits, no formal external commitments
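As a rough illustration, the summary above could be serialized into something like agent-spec.json. The actual schema of that file is not shown on this page, so the field names below (`agent_name`, `business_goal`, `target_user`, `forbidden_actions`) are hypothetical, and the input/output definitions the real file contains are omitted.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class AgentSpec:
    # Field names are hypothetical; the real agent-spec.json schema may differ.
    agent_name: str
    business_goal: str
    target_user: str
    forbidden_actions: list[str] = field(default_factory=list)


spec = AgentSpec(
    agent_name="Customer FAQ Auto-Reply Agent",
    business_goal=(
        "Auto-handle 80% of common questions and escalate complex cases to humans"
    ),
    target_user="E-commerce customer support team",
    # The risk boundary from the summary, split into one entry per rule.
    forbidden_actions=[
        "promise a refund amount",
        "edit an order",
        "make a formal external commitment",
    ],
)


def spec_to_json(agent_spec: AgentSpec) -> str:
    """Serialize the spec to an indented JSON string."""
    return json.dumps(asdict(agent_spec), indent=2)
```

Keeping the forbidden actions as a list, rather than one sentence, makes it easier to write one test case per rule.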

System Prompt Example (Excerpt)

You are a professional customer support assistant. Your responsibilities are:
1) Answer common questions based on the FAQ knowledge base
2) Identify complex issues that require human intervention and classify them
3) Never promise refund amounts or modify orders
4) When uncertain, reply: 'I need to confirm this and get back to you'
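Prompt rules such as rule 3 are often backed by a check on the drafted reply before it is sent. The sketch below is a naive keyword heuristic, not part of the delivered package: the function names and the amount pattern are assumptions, and only the fallback sentence is taken from rule 4.

```python
import re

# Fallback sentence quoted from rule 4 of the prompt excerpt.
FALLBACK_REPLY = "I need to confirm this and get back to you"

# Assumption: a currency symbol followed by a digit, or a number followed by
# a currency word, counts as an "amount". Real deployments would need more.
_AMOUNT = re.compile(
    r"[$€£¥]\s?\d|\d+(?:\.\d+)?\s?(?:usd|eur|dollars?)",
    re.IGNORECASE,
)


def violates_refund_rule(reply: str) -> bool:
    """True if the reply mentions a refund together with a money amount."""
    return "refund" in reply.lower() and _AMOUNT.search(reply) is not None


def safe_reply(draft: str) -> str:
    """Replace a draft that breaks the refund rule with the fallback reply."""
    return FALLBACK_REPLY if violates_refund_rule(draft) else draft
```

A heuristic like this will miss paraphrased promises and may flag harmless replies, so it complements the human review checkpoints rather than replacing them.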

Workflow Steps Summary

1. Receive customer message -> extract key information

2. Match FAQ knowledge base -> if confidence > 0.85, auto-reply

3. Confidence < 0.85 -> classify issue type (logistics / refund / specs / other)

4. Logistics / specs -> query system and reply

5. Refund -> flag for human review and generate a ticket summary

6. Human review checkpoints: refund confirmation and abnormal complaints
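The routing in steps 2 to 5 can be sketched as a small function. This is a minimal illustration rather than the delivered workflow: the `Decision` type and action labels are hypothetical, and two cases the summary leaves open are filled with labeled assumptions (a confidence of exactly 0.85 falls through to classification, and the "other" type goes to human review).

```python
from dataclasses import dataclass

AUTO_REPLY_THRESHOLD = 0.85  # threshold from step 2


@dataclass
class Decision:
    action: str  # "auto_reply", "query_and_reply", or "human_review"
    needs_ticket: bool = False


def route(confidence: float, issue_type: str) -> Decision:
    """Map FAQ match confidence and issue type to a next action."""
    # Step 2: a confident FAQ match is answered automatically.
    if confidence > AUTO_REPLY_THRESHOLD:
        return Decision("auto_reply")
    # Assumption: exactly 0.85 is treated like a low-confidence match.
    # Step 4: logistics and specs questions are answered from a system query.
    if issue_type in ("logistics", "specs"):
        return Decision("query_and_reply")
    # Step 5: refunds go to a human, with a ticket summary.
    if issue_type == "refund":
        return Decision("human_review", needs_ticket=True)
    # Assumption: "other" is not specified above; default to human review.
    return Decision("human_review")
```

Defaulting unknown types to human review is the conservative choice here, since it keeps the agent inside the risk boundary when classification is unclear.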

How to Review a Delivery Package

Check whether the Agent Spec defines the goal, inputs, outputs, and forbidden actions clearly.
Review whether the workflow includes exception paths and human review checkpoints.
Confirm the test cases cover normal, edge, error, high-risk, and adversarial behavior.
Verify the platform guidance is detailed enough for the client team to deploy or review independently.

Test Case Examples

ID	Type	Input	Expected
TC-001	Normal	Where is my package? Tracking #SF123456	Auto-query logistics status and reply
TC-008	Edge	Order number is empty	Prompt the user for a valid order number
TC-015	Error	I want a refund or I'll file a complaint	Flag for human review, create a ticket, do not promise a refund
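Rows like these translate naturally into a table-driven check. The harness below is a sketch under stated assumptions: the `expected_action` labels are hypothetical shorthand for the Expected column, TC-008 is modeled as an empty message, and `stub_agent` is a keyword stand-in used only to exercise the harness, not the real agent.

```python
from typing import Callable, NamedTuple


class Case(NamedTuple):
    case_id: str
    case_type: str
    user_input: str
    expected_action: str  # hypothetical label for the Expected column


CASES = [
    Case("TC-001", "Normal", "Where is my package? Tracking #SF123456",
         "query_logistics"),
    # Assumption: "Order number is empty" is modeled as an empty message.
    Case("TC-008", "Edge", "", "ask_for_order_number"),
    Case("TC-015", "Error", "I want a refund or I'll file a complaint",
         "human_review"),
]


def failed_cases(agent: Callable[[str], str], cases: list[Case]) -> list[str]:
    """Return the IDs of cases where the agent's action differs from expected."""
    return [c.case_id for c in cases if agent(c.user_input) != c.expected_action]


def stub_agent(message: str) -> str:
    """Keyword stand-in for the real agent, for demonstration only."""
    text = message.strip().lower()
    if not text:
        return "ask_for_order_number"
    if "refund" in text or "complaint" in text:
        return "human_review"
    return "query_logistics"
```

Calling `failed_cases(stub_agent, CASES)` returns an empty list when every case matches. Swapping in the real agent behind the same callable signature would reuse the table unchanged.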

Common Mistakes

A prompt without a complete workflow or escalation boundary.
Tests that only cover the happy path and ignore risky inputs.
A package that still depends on the author’s verbal explanation to run.
Claims of full automation for steps that should still require human review.
