Use Dify for the visible app, workflow canvas, publishing path, and operator-facing experience.
How to ship a Dify app with MCP tools.
The useful unit is one governed workflow: Dify as the visible app, MCP as the tool boundary, and Policy OS as the approval, eval, runbook, and proof layer.
Give the app only the server cards, tools, resources, and setup state needed for that workflow.
The app is not done until approvals, blocked actions, eval gates, runbook, and proof are reviewable.
Ship the smallest workflow that can be inspected.
Do not start with a generic agent. Start with a workflow whose owner, tools, approval path, and proof can be named.
Start with the business path, source systems, approval owner, failure mode, and first safe delegation point.
List allowed tools, read resources, setup steps, auth scopes, forbidden actions, and harmless smoke checks.
Build the Dify workflow or chatflow, connect MCP tools, publish the app, and preserve a DSL or manifest snapshot.
Run expected tool use, forbidden tool use, write confirmation, secret refusal, latency, and cost checks.
Share route health, gate names, pass/fail status, release notes, and sanitized examples without raw traces.
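The step that preserves a DSL or manifest snapshot can be made checkable with a fingerprint. This is a minimal sketch, not a Dify feature: it assumes the exported DSL is available as plain text, and the function name `snapshot_fingerprint` and the tool names are hypothetical.

```python
import hashlib
import json


def snapshot_fingerprint(dsl_text: str, tool_names: list[str]) -> str:
    """Return a stable SHA-256 hex digest over the DSL text and the tool list.

    Tool names are sorted so that reordering the same tools does not
    change the fingerprint; adding or swapping a tool does.
    """
    payload = json.dumps(
        {"dsl": dsl_text, "tools": sorted(tool_names)},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Hypothetical example values, for illustration only.
before = snapshot_fingerprint("app: support-triage", ["read_ticket", "draft_reply"])
reordered = snapshot_fingerprint("app: support-triage", ["draft_reply", "read_ticket"])
changed = snapshot_fingerprint("app: support-triage", ["read_ticket", "send_reply"])
```

Storing the fingerprint next to the release note lets a reviewer confirm that the published workflow and tool list match what was evaluated.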
This diagram keeps the implementation legible: Dify carries the app, MCP scopes capability, Policy OS governs behavior, and proof explains what changed.
Created by CREATE SOMETHING for this field guide.
The app ships with artifacts, not just prompts.
A Dify workflow becomes governable when its tool access, behavior, outcome, tests, and operating steps are separate enough to review.
Tool schemas, resource URIs, auth scopes, setup state, error model, and safe smoke input.
Allowed actions, approval-needed actions, blocked actions, escalation triggers, runtime surface, and graduation status.
The business result, fallback path, owner responsibility, review cadence, and measurable acceptance criteria.
Positive and negative examples that prove normal tool use, refusal behavior, and approval pauses still work.
How to publish, pause, rotate access, handle incidents, roll back, and hand the workflow to a new operator.
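The five artifacts above can be treated as a publish precondition. The sketch below assumes the bundle is held as a plain dictionary; the key names and the helper `missing_artifacts` are hypothetical and only illustrate the check.

```python
# Hypothetical keys for the five artifacts: MCP contract, agent contract,
# outcome contract, golden tasks, and runbook.
REQUIRED_ARTIFACTS = (
    "mcp_contract",
    "agent_contract",
    "outcome_contract",
    "golden_tasks",
    "runbook",
)


def missing_artifacts(bundle: dict) -> list[str]:
    """Return the required artifacts that are absent or empty in the bundle."""
    return [name for name in REQUIRED_ARTIFACTS if not bundle.get(name)]


# Illustrative bundle: the runbook is still empty and golden tasks are absent.
draft_bundle = {
    "mcp_contract": {"tools": ["read_ticket"], "scopes": ["tickets:read"]},
    "agent_contract": {"allowed": ["read_ticket"], "blocked": ["refund"]},
    "outcome_contract": {"owner": "support lead"},
    "runbook": "",
}
gaps = missing_artifacts(draft_bundle)
ready_to_publish = not gaps
```

An empty artifact counts as missing here, so a placeholder runbook does not pass the check.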
Run gates where the workflow can overreach.
The goal is not a generic benchmark. The goal is evidence that changes a publish, hold, rollback, or scope decision.
The app can call the expected Dify API path and the MCP server card can answer a harmless read.
Normal examples call the intended read, search, draft, classify, or handoff tool instead of improvising.
The app does not call post, delete, refund, export, broad-search, or account-mutation tools outside the contract.
Customer-facing, revenue-touching, or irreversible actions stop with context and options for the named owner.
The app refuses requests for tokens, private traces, broad account records, and prompt-injection disclosure.
The workflow stays inside the agreed response and spend range, or it narrows scope before launch.
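Several of the gates above reduce to checks over a recorded run. The sketch below assumes each golden-task run has already been captured as a list of tool names plus latency and cost; `TaskRun`, `run_gates`, the gate names, and the thresholds are all hypothetical, and the approval and refusal gates are left out because they need human or content review.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskRun:
    """One recorded golden-task run (hypothetical shape)."""
    tools_called: tuple
    latency_s: float
    cost_usd: float


def run_gates(run, expected, forbidden, max_latency_s, max_cost_usd):
    """Return a pass/fail flag per gate name for a single run."""
    called = set(run.tools_called)
    return {
        # Every expected tool was actually used.
        "expected_tool_use": set(expected) <= called,
        # No tool outside the contract was called.
        "forbidden_tool_use": not (called & set(forbidden)),
        # The run stayed inside the agreed response and spend range.
        "latency": run.latency_s <= max_latency_s,
        "cost": run.cost_usd <= max_cost_usd,
    }


good = TaskRun(("read_ticket", "draft_reply"), latency_s=2.5, cost_usd=0.04)
bad = TaskRun(("read_ticket", "refund"), latency_s=9.0, cost_usd=0.04)

limits = dict(
    expected={"read_ticket"},
    forbidden={"refund", "delete_account"},
    max_latency_s=5.0,
    max_cost_usd=0.10,
)
good_result = run_gates(good, **limits)
bad_result = run_gates(bad, **limits)
```

A per-gate result, rather than a single score, keeps the output usable for a publish, hold, rollback, or scope decision.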
For support triage, the same workflow has four different decisions.
This is the quality bar from the papers applied to one Dify app: every tool call needs a state, owner, evidence path, and stop condition.
The app can read scoped records and summarize the customer state when the MCP contract limits the account boundary.
The app can prepare a customer-safe draft with cited source records, but it does not send the message yet.
The app pauses for the support owner when the action becomes customer-facing or affects account trust.
The app refuses or escalates when the request exceeds the support lane, touches revenue, or asks for private evidence.
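The four decisions above can be written as one ordered rule set. This is a sketch under assumptions: the labels `RUN`, `DRAFT`, `WAIT`, and `STOP`, the `decide` function, and the action fields are hypothetical names chosen for illustration, and anything unrecognized falls through to stop.

```python
from enum import Enum


class Decision(Enum):
    RUN = "run"      # read scoped records and summarize
    DRAFT = "draft"  # prepare a draft, do not send
    WAIT = "wait"    # pause for the support owner
    STOP = "stop"    # refuse or escalate


def decide(action: dict) -> Decision:
    """Map a proposed action to one decision, checking stop conditions first."""
    if (
        not action.get("in_support_lane", False)
        or action.get("touches_revenue")
        or action.get("requests_private_evidence")
    ):
        return Decision.STOP
    if action.get("customer_facing") or action.get("affects_account_trust"):
        return Decision.WAIT
    if action.get("kind") == "draft":
        return Decision.DRAFT
    if action.get("kind") == "read" and action.get("scoped"):
        return Decision.RUN
    # Default deny: anything the rules do not recognize is stopped.
    return Decision.STOP


read = decide({"kind": "read", "scoped": True, "in_support_lane": True})
draft = decide({"kind": "draft", "in_support_lane": True})
send = decide({"kind": "send", "customer_facing": True, "in_support_lane": True})
refund = decide({"kind": "refund", "touches_revenue": True, "in_support_lane": True})
```

Checking the stop conditions before the permissive ones means a revenue-touching draft is stopped rather than drafted.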
Separate client-safe proof from private evidence.
Buyers need enough proof to trust the workflow. Operators still need private traces, eval runs, credentials, account records, and incident notes protected.
App purpose, workflow boundary, gate names, latest pass/fail status, release note, fallback path, and owner.
Dify DSL snapshots, Langfuse traces, Braintrust runs, prompt variants, credentials, account records, and incident notes.
Passing evidence can expand scope. Failing evidence narrows tools, adds approval, rolls back, or keeps a manual path.
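One way to keep the two lists apart is an allowlist: only named client-safe fields ever leave the private record. The field names and the `client_safe_view` helper below are hypothetical, and the record values are invented for illustration.

```python
# Hypothetical field names mirroring the client-safe list above.
CLIENT_SAFE_FIELDS = frozenset({
    "app_purpose",
    "workflow_boundary",
    "gate_names",
    "gate_status",
    "release_note",
    "fallback_path",
    "owner",
})


def client_safe_view(record: dict) -> dict:
    """Return only allowlisted fields; everything else stays private."""
    return {key: value for key, value in record.items() if key in CLIENT_SAFE_FIELDS}


# Illustrative private record mixing both kinds of evidence.
private_record = {
    "app_purpose": "support triage",
    "gate_names": ["expected_tool_use", "forbidden_tool_use"],
    "gate_status": "pass",
    "owner": "support lead",
    "trace_ids": ["trace-001"],
    "credentials": "REDACTED",
    "incident_notes": "internal only",
}
public_proof = client_safe_view(private_record)
```

An allowlist fails closed: a new private field added later is excluded by default, where a blocklist would leak it until someone updated the list.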
This guide is the implementation form of the current paper stack.
The `.io` papers define the operating model. This `.agency` guide turns that model into a route a builder, operator, or agency can use before shipping a Dify app.
The operating model behind run, wait, stop, approval owners, receipts, and runtime graduation.
Policy OS Contract Bundle: The artifact family this guide applies: MCP contract, agent contract, outcome contract, golden tasks, and runbook.
Eval Evidence Layer: The measurement split behind Langfuse runtime traces, Braintrust MCP gates, and release decisions.
Proof Surface: The public/private evidence boundary that keeps client-safe proof readable without leaking raw traces.
The experiment trail keeps the guide from becoming generic advice.
The quality standard is implementation evidence: MCP writes, agent continuity, agent-native tools, and review lineage all informed the current shipping path.
Validated MCP write work: human intent, agent interpretation, protocol boundary, and database state update.
Agent Continuity: Why long-running agent work needs durable artifacts so a new session can re-enter the workflow.
Webflow Plagiarism Detection: Classic algorithms and AI tiers became useful to agents once exposed through MCP tools.
Webflow Analyzer Lineage: A narrow analysis problem became governed review only after the evidence and policy surfaces separated.
Most failures come from collapsing the layers.
The app, tool boundary, evidence stream, and operating proof should stay separate enough that another operator can inspect them.
Broad tool access makes the app look powerful while making review, context, and liability harder to control.
Traces are evidence. The proof surface is the business-readable receipt that explains what the evidence means.
Move runtime-critical pieces to code only when the evidence justifies losing visual editing speed.
The recommendation follows the current Dify surfaces.
Dify provides the app and publishing model, MCP connects external tools, and Langfuse can monitor app performance. CREATE SOMETHING adds the workflow contract, eval gate, and proof surface.
Dify app types, published web/API access, and app-as-MCP-server positioning.
Use MCP tools: Dify documentation for connecting external MCP server tools to apps.
API publishing: Dify documentation for using an app as a backend API service.
Langfuse integration: Dify documentation for app performance tracing with Langfuse.
Bring one Dify workflow before adding more tools.
I'll map the workflow, MCP tool boundary, approval path, eval gates, and client-safe proof package before the app gets more autonomy.
Name the workflow, owner, tool boundary, and first safe delegation point.
Package the visible surface and connect only the scoped MCP tools.
Attach approvals, gates, runbook, and client-safe proof before launch.