
agora-creations/mikros

An agent-skills library that teaches AI coding agents how to author, test, validate, and deploy megálos workflows.

Works with: Claude Code, Codex CLI, Cursor
npx add-skill agora-creations/mikros

name: testing-workflows
version: "0.1.0"
megalos_version: ">=0.4.0"

Testing megálos workflows

1. Scope and non-goals

This skill teaches an AI coding agent how to verify the behaviour of a megálos workflow YAML file before it ships. The primary surface is the dry-run CLI (python -m megalos_server.dryrun), which walks a workflow through the production runtime with a mock input source in place of a real LLM. Dry-run answers the question "given these mock responses, does this workflow step, branch, gate, descend, and terminate the way the author intended?" It is fast, offline, and deterministic; it makes no LLM calls and no MCP round-trips. A live run against a real megálos deployment is available as a bounded secondary surface for the cases dry-run cannot reach (real-LLM output branches and real-registry mcp_tool_call digressions) and is covered in §6.

In scope.

  • The interactive dry-run recipe (§3).
  • Scripted-responses fixtures (§4) for regression replay.
  • Load-time cross-workflow errors (unknown_call_target and call_cycle_detected) and how dry-run surfaces them via create_app() bootstrap (§5).
  • Exit-code and error-banner contracts the agent must read.

Out of scope.

  • Per-file workflow validation: covered by the companion skill validating-workflows.
  • Workflow authoring: covered by authoring-workflows.
  • Domain-repo packaging, Horizon deployment, and registry setup: covered by deploying-workflows.
  • Framework-level tests and workflow-fixture contributions to the megálos repository itself. pytest/conftest is not a surface this skill teaches; framework test authoring is a distinct activity for contributors to megálos.
  • Real LLM invocation. Dry-run never calls an LLM; authors who need a real LLM in the loop reach for the live-run surface (§6) or a deployed instance.

Audience. The reader is an AI coding agent acting on behalf of a workflow author. The dry-run design explicitly frames this audience: authors are not test engineers. A workflow author needs to see the directive template rendered, the gates listed, and the schema feedback at each step — not write a pytest case. The interactive and scripted dry-run recipes below are shaped around that audience.

2. Dry-run as primary mode

python -m megalos_server.dryrun loads a workflow and drives it through the production execution path, reading each step's mock "LLM response" from stdin (or from a scripted YAML file — see §4) instead of calling a model. The dry-run is not a parallel simulator; it is the production runtime with a mock input source. The same start_workflow, submit_step, classification, output_schema validation, retry accounting, branch selection, and sub-workflow descent that a live server runs are what dry-run runs. Anything dry-run accepts, the server accepts; anything dry-run rejects at bootstrap, the server rejects at startup.

What dry-run exercises.

  • Step rendering (banner, precondition, gates, directive).
  • output_schema validation and the retry surface (validation hints, remaining-retry counts, budget exhaustion).
  • Branch selection and branch-default resolution.
  • Precondition skip detection: when a step's precondition: is unmet at runtime, dry-run prints Skipped: <step_id> on stdout and advances (see §3 for the surface contract; §9 for the common-mistake shape).
  • Sub-workflow descent and parent resume.
  • create_app() load-time checks over the full workflow set in the target's parent directory, including unknown_call_target and call_cycle_detected (§5).

What dry-run does not exercise. Real LLM generation quality. Real MCP registry round-trips (mcp_tool_call steps are covered by schema and structural checks at load time, but a real registry call is a live-run concern). Production-scale session persistence, latency, or concurrency. When the author needs to verify any of these, the live-run surface is the tool (§6).

Exit-code contract. Dry-run exits 0 on workflow_complete and 1 on any other terminal status — the production _TERMINAL_STATUSES frozenset is {"workflow_complete", "error", "session_escalated", "workflow_changed"}. A non-zero exit always pairs with a decoded error banner on stderr. Scripting around dry-run (CI smoke-tests, pre-commit checks) should treat exit 0 as the sole green.
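A CI wrapper around this contract can be sketched as follows. This is a minimal illustration, not part of megálos: the helper name dryrun_is_green is hypothetical, and the only facts it relies on are the ones stated above (exit 0 is the sole green, and a non-zero exit comes with a banner on stderr).

```python
import subprocess
import sys


def dryrun_is_green(cmd: list[str]) -> tuple[bool, str]:
    """Run a command and return (green, stderr).

    Green means exit code 0 and nothing else; any other exit code is red,
    and the captured stderr is returned so the caller can show the banner.
    """
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.returncode == 0, proc.stderr


# Example invocation (paths are placeholders, not real files):
# green, banner = dryrun_is_green([
#     sys.executable, "-m", "megalos_server.dryrun",
#     "workflows/my_workflow.yaml",
#     "--responses-file", "tests/responses/my_workflow_happy.yaml",
# ])
# if not green:
#     print(banner, file=sys.stderr)
```

Passing a responses file keeps the run non-interactive, so the wrapper never blocks waiting on stdin.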

3. Interactive dry-run recipe

The interactive mode is the common case: the agent (or its human operator) types a mock LLM response at each step and watches the workflow unfold. Invocation:

python -m megalos_server.dryrun workflows/my_workflow.yaml

Dry-run loads every *.yaml file in the target's parent directory (required so sub-workflow call: targets can resolve — see §5), then bootstraps the workflow and enters a REPL. At each step, the runtime renders a banner, the optional precondition line, any gates, and the step's directive template, then prompts with > . The agent types a mock LLM response and presses enter. The workflow advances.

A minimal three-step linear-workflow session looks like this:

$ python -m megalos_server.dryrun workflows/example.yaml

=== Step: alpha — First step ===
<directive template rendered here>
> ok

=== Step: bravo — Second step ===
<directive template rendered here>
> ok

=== Step: charlie — Third step ===
<directive template rendered here>
> ok

Workflow complete

Exit code 0. Three mock responses drove three steps to workflow_complete.

Prompts the agent may encounter.

  • Step response prompt (> ). The default prompt at every step. Whatever the agent types is fed to the runtime as if an LLM had produced it. For steps with an output_schema, the response must be valid JSON matching the schema; an invalid JSON payload triggers a validation error (with a Retries remaining: N line on stderr) and re-prompts at the same step. Exhausting the retry budget terminates the run with a Max retries (N) exceeded banner and exit 1.

  • Branch selection prompt. When a step with a branches: block is reached, dry-run first consumes a step response, then prints:

    Branches:
      1. first_branch_target
      2. second_branch_target [default]
    Choose branch [1-2, empty = default]: 
    

    The [default] tag marks the branch chosen when the selector evaluates against the step response. An empty input resolves to the default.

  • Precondition line. Steps with a precondition: render it at entry as Precondition: <ref> == "<value>" or Precondition: <ref> is present, before the directive.

  • Skip surface (Skipped: <step_id>). When a step's precondition is unmet at runtime, dry-run does not render that step's banner or prompt; instead it prints Skipped: <step_id> on stdout and advances to the next step. The earlier-rendered Precondition: line on the prior step is the implicit explanation. Skip detection only flags precondition-bearing steps; alternate branches that simply weren't taken are not "skipped" — they are an unreached path. A run that unintentionally skips a step still terminates with workflow_complete on exit 0, so the only signal of an unintended skip is the Skipped: line in stdout. Read every Skipped: line and confirm the precondition that suppressed the step was meant to suppress it.

  • EOF (Ctrl-D). Closing stdin mid-workflow aborts with Dry-run aborted by user (EOF) on stderr and exit 1.
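Because an unintended skip leaves the exit code at 0, one way to make the Skipped: lines hard to miss is to collect them from captured stdout. The sketch below assumes only the Skipped: <step_id> line shape described above; the function name skipped_steps is hypothetical.

```python
def skipped_steps(stdout: str) -> list[str]:
    """Return the step ids from every 'Skipped: <step_id>' line, in order."""
    prefix = "Skipped:"
    found = []
    for line in stdout.splitlines():
        # Strip leading whitespace so an indented line is still recognised
        # (whether skip lines are ever indented is not assumed here).
        stripped = line.strip()
        if stripped.startswith(prefix):
            found.append(stripped[len(prefix):].strip())
    return found
```

An agent can compare the returned list against the steps the author expected to be suppressed and flag anything extra.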

Sub-workflow descent. When a step's call: target resolves to a sibling workflow, dry-run prints Entering sub-workflow <name> and indents the child's banners by two spaces per nesting level. On the child's workflow_complete, dry-run prints Returned from sub-workflow and resumes the parent at the next step.

Clean-directory discipline. Because dry-run loads every *.yaml in the parent directory, a broken sibling YAML blocks the target from loading. If the bootstrap-time error banner names a file other than the target, a sibling is at fault — fix it, or move the target into a directory that contains only it and its call: targets. The framing paragraph dry-run prints on load failure is:

Failed to load workflows from <parent_dir>: <exception>
Note: dry-run loads all *.yaml files in the parent directory (required
for sub-workflow 'call' target resolution). If the error above names a
file other than <target.yaml>, a sibling workflow has a problem — fix
it, or move <target.yaml> to a directory containing only it and its
call targets.

4. Scripted-responses fixtures

For regression replay and CI smoke-tests, dry-run accepts a --responses-file pointing at a YAML fixture that drives the workflow non-interactively:

python -m megalos_server.dryrun workflows/my_workflow.yaml \
    --responses-file tests/responses/my_workflow_happy.yaml

The responses file has a fixed shape: a top-level mapping with a required version: 1 key and a required entries: list. Each entry is a mapping with a required step_id: and exactly one of response: (for step content) or branch: (for branch-selection prompts) — never both, never neither. A minimal three-step happy-path fixture:

version: 1
entries:
  - step_id: alpha
    response: ok
  - step_id: bravo
    response: ok
  - step_id: charlie
    response: ok
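The shape rules above can be checked locally before handing a fixture to dry-run. The following is a rough sketch that operates on an already-parsed mapping (for example, the result of loading the YAML with a parser of your choice). The function name and the problem strings are illustrative; they are not the banners dry-run prints.

```python
def check_responses_fixture(doc: dict) -> list[str]:
    """Return a list of shape problems; an empty list means the shape is fine."""
    problems = []
    if "version" not in doc:
        problems.append("missing 'version'")
    elif type(doc["version"]) is not int or doc["version"] != 1:
        problems.append(f"unsupported version: {doc['version']!r}")

    entries = doc.get("entries")
    if not isinstance(entries, list):
        problems.append("'entries' must be a list")
        return problems

    for i, entry in enumerate(entries):
        if not isinstance(entry, dict) or "step_id" not in entry:
            problems.append(f"entry {i}: missing 'step_id'")
            continue
        has_response = "response" in entry
        has_branch = "branch" in entry
        # Exactly one of response/branch: never both, never neither.
        if has_response and has_branch:
            problems.append(f"entry {i}: both 'response' and 'branch'")
        elif not (has_response or has_branch):
            problems.append(f"entry {i}: neither 'response' nor 'branch'")
    return problems
```

This only covers file shape. Ordering and step-id drift still need a real dry-run, because they depend on the workflow being executed.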

Ordered consumption. At the root workflow frame (descent depth 0), entries are consumed in order: the first entry must match the first step's step_id, the second entry the second step, and so on. A spelling or order drift aborts the run. During sub-workflow descent (depth > 0), dry-run walks past non-matching entries until a match is found, so parent-frame and child-frame entries may be interleaved in file order — but each frame's own entries must appear in that frame's execution order.

Drift-detection banners. The parser and scripted-consume layer print verbatim banners on stderr before exiting 1. Authors will see exactly these strings when a fixture is wrong:

  • Missing version field: Responses file missing required 'version' field. Expected: version: 1
  • Unknown version: Unknown responses-file version: <n>. Supported: [1]
  • Response/branch mutex (both present): ... must have exactly one of 'response' or 'branch', not both.
  • Response/branch mutex (neither present): ... must have exactly one of 'response' or 'branch'.
  • Step-id drift at root depth: Script entry expected step_id=<entry_step>, REPL at step_id=<repl_step>
  • Entry-type mismatch (step expected response, script gave branch): At step <repl_step>, expected step response but script provided branch selection (or, symmetrically, expected branch selection but script provided response)
  • Exhaustion (too few entries): Responses file exhausted at step <repl_step> (expecting: <expected_label>)
  • Unused entries at completion (too many entries): Responses file had <N> unused entries after workflow completion

The unused-entries guard runs on workflow_complete immediately before the run exits; a longer-than-needed fixture is a scripting error, not a silent pass. Every drift case exits 1, so CI that treats exit 0 as green will fail loudly on any of the above.
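Since the banners are verbatim, an agent can map captured stderr to a drift class with plain substring matching. The sketch below uses only the banner fragments listed above; the labels on the right and the name classify_banner are hypothetical conveniences, not identifiers from megálos.

```python
from typing import Optional

# Order matters: the "not both" variant must be tested before the shorter
# "neither" variant, because the latter is a prefix of the former.
_BANNERS = [
    ("Responses file missing required 'version' field", "missing_version"),
    ("Unknown responses-file version:", "unknown_version"),
    ("must have exactly one of 'response' or 'branch', not both.", "mutex_both"),
    ("must have exactly one of 'response' or 'branch'.", "mutex_neither"),
    ("Script entry expected step_id=", "step_id_drift"),
    ("expected step response but script provided branch selection", "entry_type_mismatch"),
    ("expected branch selection but script provided response", "entry_type_mismatch"),
    ("Responses file exhausted at step", "exhausted"),
    ("unused entries after workflow completion", "unused_entries"),
]


def classify_banner(stderr: str) -> Optional[str]:
    """Return a label for the first known drift banner found in stderr."""
    for needle, label in _BANNERS:
        if needle in stderr:
            return label
    return None
```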

Coverage parity with interactive mode. Scripted mode drives the same surface the interactive mode drives: output_schema validation and retry banners (an invalid response: triggers the same validation-error re-prompt), branch selection (via branch: entries), and sub-workflow descent (with interleaved parent/child entries). A scripted fixture that threads a workflow end-to-end is the canonical regression artifact for that workflow.

5. Load-time cross-workflow errors — unknown_call_target and call_cycle_detected

Two cross-workflow errors surface at workflow-set load time, not under per-file validation. The validating-workflows skill flags this pass as a forward reference to this skill, in these words:

"A nuance worth internalising: these two errors surface at workflow-set load time, not under a per-file python -m megalos_server.validate <file> run. The per-file validator call covers only the outer workflow's own structural and semantic rules; it cannot know about the other workflows the server will eventually load. The call-target and cycle checks run inside the server's create_app() after every workflow YAML in the workflow directory has been loaded individually, and they run before the MCP app is constructed. In practice this means: an agent that authors or modifies a workflow with a call: step must either load the full workflow set into the server to surface unknown_call_target and call_cycle_detected, or use the test surface covered in testing-workflows. A green per-file validate run is a necessary but not sufficient condition for a loadable server."

This section is that test surface. The verbatim f-string shapes the loader emits are:

Workflow '<parent_name>' step '<step_id>' calls unknown workflow '<target>' (code: unknown_call_target)
call cycle detected: <wf_a> -> <wf_b> -> ... -> <wf_a> (code: call_cycle_detected)

How dry-run surfaces them. Dry-run's bootstrap calls create_app(workflow_dir=<target's parent directory>). create_app() loads every *.yaml in that directory, then runs the cross-workflow call-resolution pass before the MCP app is constructed. If any call: target does not resolve to a loaded workflow, or if the call graph contains a cycle, the load fails with the error code above and dry-run exits 1 at bootstrap, before the REPL starts. The failure is wrapped in the framing paragraph shown in §3 ("dry-run loads all *.yaml files in the parent directory …") so the author can see which file is implicated.
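To make the two checks concrete, here is a toy model of a call-resolution pass over an in-memory call graph. It is not the megálos loader: the input shape, the function name check_call_graph, and the choice to collect errors rather than stop at the first one are all assumptions made for illustration. Only the message shapes are taken from the text above.

```python
def check_call_graph(calls: dict[str, list[tuple[str, str]]]) -> list[str]:
    """calls maps workflow name -> [(step_id, target_workflow), ...]."""
    errors = []

    # Pass 1: every call target must be a loaded workflow.
    for parent, steps in calls.items():
        for step_id, target in steps:
            if target not in calls:
                errors.append(
                    f"Workflow '{parent}' step '{step_id}' calls unknown "
                    f"workflow '{target}' (code: unknown_call_target)"
                )

    # Pass 2: depth-first search; meeting a node that is still on the
    # current path means the call graph has a cycle.
    UNSEEN, ON_PATH, DONE = 0, 1, 2
    state = {name: UNSEEN for name in calls}

    def visit(name, path):
        state[name] = ON_PATH
        path.append(name)
        for _, target in calls[name]:
            if target not in calls:
                continue  # already reported in pass 1
            if state[target] == ON_PATH:
                return path[path.index(target):] + [target]
            if state[target] == UNSEEN:
                cycle = visit(target, path)
                if cycle:
                    return cycle
        path.pop()
        state[name] = DONE
        return None

    for name in calls:
        if state[name] == UNSEEN:
            cycle = visit(name, [])
            if cycle:
                errors.append(
                    "call cycle detected: " + " -> ".join(cycle)
                    + " (code: call_cycle_detected)"
                )
                break
    return errors
```

The point of the model is the data dependency: both checks need the whole workflow set, which is why a validator that sees one file cannot run them.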

The practical implication. A per-file python -m megalos_server.validate <file> run cannot surface unknown_call_target or call_cycle_detected: it sees only the one file on the command line, not the workflow set. Dry-run on a single target does surface them, because it loads the target's entire parent directory. For a workflow with call: digressions, the agent's verdict loop is:

  1. Run per-file validate on each workflow individually — catches structural and semantic errors local to each file.
  2. Run python -m megalos_server.dryrun <target.yaml> on any workflow that has call: steps, or on one workflow that transitively reaches them — catches unknown_call_target and call_cycle_detected at bootstrap across the full directory.
  3. For an even stronger signal against the actual server's loader, run python main.py (domain-repo entrypoint) or fastmcp inspect main.py:mcp — both exercise create_app() the same way dry-run does. These are the deployment-side verdict tools (deploying-workflows §8) and surface exactly the same cross-workflow errors.
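Steps 1 and 2 of the loop above can be assembled into a command list roughly like this. The helper name verdict_commands is hypothetical, and the sketch only builds the commands; running them and checking for exit 0 is left to the caller.

```python
import sys
from pathlib import Path
from typing import Optional


def verdict_commands(target: Path, responses_file: Optional[Path] = None) -> list[list[str]]:
    """Per-file validate for every *.yaml beside the target, then one dry-run."""
    py = sys.executable
    siblings = sorted(target.parent.glob("*.yaml"))
    # Step 1: per-file validation, one command per workflow file.
    cmds = [[py, "-m", "megalos_server.validate", str(p)] for p in siblings]
    # Step 2: dry-run on the target, which loads the whole parent directory.
    dryrun = [py, "-m", "megalos_server.dryrun", str(target)]
    if responses_file is not None:
        dryrun += ["--responses-file", str(responses_file)]
    cmds.append(dryrun)
    return cmds
```

Supplying a responses file keeps the dry-run non-interactive when the loop runs unattended.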

A green dry-run at the target confirms per-workflow structural correctness and cross-workflow graph correctness for the workflow set in that target's parent directory. That is the gap the per-file validator leaves open, and closing it is the job of this section and of §§6+.

For the verbatim error messages, what each one means, and the fix for each, see references/load-time-errors.md.

6. Live-run as bounded secondary

Dry-run covers roughly 80% of the teach-value for a workflow author: the production execution path, schema validation, branch resolution, sub-workflow descent, and the load-time cross-workflow errors of §5 all surface in dry-run without an LLM, without an MCP round-trip, and without network cost. The remaining cases — where the agent needs real-LLM output to drive a branch the scripted file cannot fairly mock, or where an mcp_tool_call step must actually round-trip to a registered server — are what the live-run surface is for. Live-run is deliberately a secondary surface and is deliberately thin in this skill: it is a pointer, not a tutorial.

When to reach for live-run. Two cases only:

  1. LLM-output-dependent branches. The workflow has a branches: block whose selector depends on the content of a real LLM response in a way the agent cannot fairly script. In dry-run the agent is typing the mock response and then the branch selection — it is grading its own homework. A live run against a real model is the honest test.
  2. Real-registry mcp_tool_call digressions. The workflow has an mcp_tool_call step and the agent needs to confirm the call actually reaches the registered server, the auth env var resolves, and the response shape matches what the workflow expects downstream. Dry-run's structural checks on mcp_tool_call are real, but the network round-trip and env-var resolution are not exercised.

The recipe. Stand up a local megálos server from the domain repo that holds the workflow, then point any MCP-compatible client at it. The server side is the deploying-workflows skill's territory; this skill only names the shape:

$ python main.py

or, for the FastMCP CLI form,

$ fastmcp run main.py:mcp

Either command binds the HTTP transport on FASTMCP_HOST:FASTMCP_PORT and keeps the process up. From there, point an MCP-compatible client at the local endpoint (any MCP client works) and drive the workflow through the client the same way an end user would. Server-start mechanics, env-var wiring, and the deploy.sh pre-flight live in deploying-workflows §§3–6; client configuration and connection-add UX belong to whichever client the operator chose and are out of scope here.
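Before pointing a client at the server, it can help to confirm that something is actually listening on the configured address. The probe below is a generic TCP check using only the standard library; it assumes nothing about megálos beyond the FASTMCP_HOST and FASTMCP_PORT names mentioned above, and the function name is_listening is hypothetical.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out: treat all as "not listening".
        return False


# Example (reads the same variables the server binds on; no defaults assumed):
# import os
# print(is_listening(os.environ["FASTMCP_HOST"], int(os.environ["FASTMCP_PORT"])))
```

A successful connection shows only that the port is open, not that the workflow set loaded; the server's own startup output remains the authority on that.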

What live-run does not earn. Client authoring, SDK selection, client-library tutorials, LLM-integration patterns, and real-LLM cost/quality discussion are all out of scope for this skill. Live-run is a verification surface, not a development topic. If a workflow passes dry-run — interactive and scripted — and passes the per-file validator, a live run is confirmation, not discovery. Dry-run stays primary; live-run closes the real-LLM and real-network gap and nothing wider.

7. Relationship to sibling skills

This skill sits third in the four-skill author's loop — after authoring and validating, before deploying. The boundaries are sharp on purpose.

authoring-workflows. You write the workflow YAML against the exported JSON Schema and the authoring guidance there. When you are ready to check that it behaves the way you intended under the runtime, you hand it to this skill. Authoring produces the artifact; testing-workflows verifies its behaviour before it ships.

validating-workflows. Validation is the fast, offline, per-file gate: structural and semantic rules, JSON Schema conformance, step-reference resolution, single-file registry cross-check. It answers "is this one workflow loadable in isolation?" — nothing about behaviour, nothing about siblings. The testing-workflows surface picks up where validation stops: behavioural checks under the production runtime (output_schema retries, branch resolution, sub-workflow descent) and the cross-workflow load-time surface.

Closing the forward-reference. The validating-workflows skill flags unknown_call_target and call_cycle_detected as errors that surface at workflow-set load time, not under per-file validate, and forwards to this skill for the surfacing mechanism. §5 above is that forwarding target. Dry-run's create_app() bootstrap runs the cross-workflow resolution pass over the target's parent directory, so running python -m megalos_server.dryrun <target.yaml> on a workflow with call: digressions surfaces both error classes before the REPL starts. A reader arriving here from the validating-workflows Anti-scope paragraph should land on §5 for the verbatim error shapes and on references/load-time-errors.md for the catalogue.

deploying-workflows. Deployment is the ship gate: domain-repo layout, entrypoint contract, deploy.sh pre-flight, Horizon flow. A green dry-run is a pre-flight for ship, not a replacement for deploy.sh: the authoritative pre-flight still runs the python main.py + fastmcp inspect main.py:mcp verdict triple against the real domain repo. Dry-run catches the same load-time cross-workflow errors those commands catch (they share create_app()), but deploy.sh also checks domain-repo shape, which dry-run has no opinion about. Treat dry-run as the daily inner loop and the deploy verdict triple as the outer gate.

The author's day-to-day rhythm: edit in authoring, gate in validating, behave-check in testing, ship-gate in deploying. Each skill answers one question the others cannot.

8. Worked example — output_schema retry and a mis-defaulted branch

This example walks a narrower slice than the skill's empirical test: two small workflows that exercise two of the five stress classes — a mis-defaulted branch and an output_schema violation caught inside a scripted run. The worked example deliberately does not cover unknown_call_target (that is the empirical test's job) so the reader walks away having learned the shape without having copy-pasted the harder case.

8a. Starting conditions

Two workflows on disk in a clean directory:

  • collect.yaml — a two-step collect-and-summarise workflow, modelled on the megálos demo_validation fixture. Step 1 has an output_schema requiring title (string, 3+ chars), goals (array of 3+ strings), and confirmed (boolean true). Step 2 summarises what step 1 collected.
  • route.yaml — a single-step workflow with a branches: block routing to one of two terminal banners depending on the step response. The [default] tag is on the wrong branch (the author intended short as default but tagged long).

Both workflows pass per-file validation individually.
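The step 1 constraints described above can be mirrored in a small local check, useful for deciding whether a mock payload is meant to pass or meant to trigger a retry before it goes into a fixture. This is a hand-written approximation of the three rules as stated in the prose, not the workflow's actual output_schema and not the runtime's validator; the function name and message wording are illustrative.

```python
import json


def check_collect_payload(raw: str) -> list[str]:
    """Return the constraints a mock step-1 response violates (empty = valid)."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["payload must be a JSON object"]

    problems = []
    title = data.get("title")
    if not isinstance(title, str) or len(title) < 3:
        problems.append("title: must be a string of 3+ characters")
    goals = data.get("goals")
    goals_ok = (
        isinstance(goals, list)
        and len(goals) >= 3
        and all(isinstance(g, str) for g in goals)
    )
    if not goals_ok:
        problems.append("goals: must be an array of 3+ strings")
    if data.get("confirmed") is not True:
        problems.append("confirmed: must be boolean true")
    return problems
```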

8b. Class 4 — output_schema violation inside a scripted run

The agent writes a scripted-responses file to regress the collect-and-summarise flow. First attempt — collect_happy.yaml:

version: 1
entries:
  - step_id: collect_info
    response: '{"title": "xy", "goals": ["only one"]}'
  - step_id: summarize
    response: summary line

Running it:

$ python -m megalos_server.dryrun workflows/collect.yaml \
      --responses-file tests/responses/collect_happy.yaml

=== Step: collect_info — Collect Project Information ===
<directive rendered>

Validation failed:
  - title: string shorter than 3 characters
  - goals: array has fewer than 3 items
  - confirmed: required property missing
Hint: Submit JSON with title (string, 3+ chars), goals (array of 3+
strings), and confirmed (must be boolean true).
Retries remaining: 2

The scripted response: payload does not satisfy the step's output_schema, so dry-run emits the validation banner and re-prompts at the same step. In scripted mode, the next entry in the file must drive the same step again. The first attempt only had one entry per step, so the run fails with a scripted-exhaustion banner:

Responses file exhausted at step collect_info (expecting: step response)

Exit code 1. The fix is a scripted file that threads the retry explicitly — one invalid submission, then a valid one, then the summariser:

version: 1
entries:
  - step_id: collect_info
    response: '{"title": "xy", "goals": ["only one"]}'
  - step_id: collect_info
    response: '{"title": "Project X", "goals": ["a", "b", "c"], "confirmed": true}'
  - step_id: summarize
    response: summary line

Re-running:

$ python -m megalos_server.dryrun workflows/collect.yaml \
      --responses-file tests/responses/collect_happy.yaml

=== Step: collect_info — Collect Project Information ===
<directive rendered>
Validation failed:
  - title: string shorter than 3 characters
  ...
Retries remaining: 2

=== Step: collect_info — Collect Project Information ===
<directive rendered>

=== Step: summarize — Summarize the Project ===
<directive rendered>

Workflow complete

Exit code 0. The regression artifact captures both the failure and the recovery.

8c. Class 1 — mis-defaulted branch

The route.yaml workflow's author believes typing an empty branch selection resolves to short, because that was the intended default. In interactive dry-run:

$ python -m megalos_server.dryrun workflows/route.yaml

=== Step: route — Route Request ===
<directive rendered>
> pick
Branches:
  1. short
  2. long [default]
Choose branch [1-2, empty = default]: 

Pressing enter resolves to branch 2 (long) — the [default] tag is on long, not short. Dry-run surfaces the actual default the runtime will take, not the one the author intended. The fix is to swap the default: key on the branches: block in route.yaml and re-run — the [default] tag now sits on short, and empty input resolves there.
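The prompt semantics at play here (a 1-based choice, with empty input falling back to the tagged default) can be modelled in a few lines. This is a toy model of the behaviour described above, not dry-run's code; resolve_branch and its parameters are hypothetical names.

```python
def resolve_branch(choice: str, branches: list[str], default: str) -> str:
    """Map a branch-prompt input to a branch target.

    Empty input resolves to the default; otherwise the input is a 1-based
    index into the listed branches.
    """
    choice = choice.strip()
    if not choice:
        return default
    index = int(choice)
    if not 1 <= index <= len(branches):
        raise ValueError(f"choice must be between 1 and {len(branches)}")
    return branches[index - 1]
```

With branches ["short", "long"] and the default tagged on long, an empty input yields long regardless of what the author intended, which is exactly the mismatch this example surfaces.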

8d. What this example does not exercise

No call: step, therefore no unknown_call_target and no call_cycle_detected. No mcp_tool_call step. No sub-workflow descent and no precondition. The empirical test for this skill combines those axes to probe transfer; the worked example stays narrower so the reader walks through two clean failure classes without conflating five.

9. Common mistakes

Each item below names the symptom the author sees, the cause, and the fix. Most surface through stderr banners the dry-run prints verbatim before exiting 1.

  • Symptom: Responses file missing required 'version' field. Expected: version: 1 on stderr, exit 1. Cause: The scripted-responses YAML has no top-level version key. Fix: Add version: 1 as the top-level mapping's first key. The only supported value is 1.

  • Symptom: Unknown responses-file version: <n>. Supported: [1]. Cause: version: is present but holds a value other than 1. Fix: Set version: 1. The supported set is literally [1].

  • Symptom: ... must have exactly one of 'response' or 'branch', not both. (or ... must have exactly one of 'response' or 'branch'. when neither is present). Cause: A responses-file entry supplies both response: and branch: on the same mapping, or neither. Fix: Every entry is either a content response (response: <str>) or a branch selection (branch: <target>). Branch entries are only valid at the branch-selection prompt that follows a branching step's content response.

  • Symptom: Script entry expected step_id=<entry>, REPL at step_id=<repl>. Cause: At root depth, the scripted entries are out of order or the step_id: is misspelled against the workflow's actual step ids. Fix: At depth 0, entries must appear in the workflow's step order. Check the spelling of step_id: against the workflow YAML. During sub-workflow descent (depth > 0) dry-run walks past non-matching entries, but the root frame is strict.

  • Symptom: At step <id>, expected step response but script provided branch selection (or the reverse). Cause: A branching step's scripted entries are shaped wrong. A branching step needs a content response: entry first (the mock LLM output), then a branch: entry at the selection prompt. A plain (non-branching) step needs only a response: entry. Fix: For a branching step, write two entries: one response: and one branch:. For a plain step, write one response:.

  • Symptom: Responses file exhausted at step <id> (expecting: step response). Cause: The scripted file has too few entries to reach workflow_complete, typically because a retry was not accounted for. Fix: Count the steps the workflow will actually take, including any retries the author is deliberately driving, and provide one entry per prompt.

  • Symptom: Responses file had <N> unused entries after workflow completion. Cause: The scripted file has leftover entries after the workflow reached workflow_complete — a silent pass would mask a script-authoring error. Fix: Trim the file to exactly the entries the workflow consumes. The unused-entries guard runs immediately before exit, so a longer file always exits 1.

  • Symptom: Failed to load workflows from <parent_dir>: ... and the error implicates a sibling YAML, not the target. Cause: Dry-run loads every *.yaml in the target's parent directory (required for call: target resolution). A broken sibling blocks the target from loading. Fix: Fix the sibling, or move the target into a directory that contains only it and its call: targets. See §3's clean-directory discipline.

  • Symptom: Skipped: <step_id> appears on stdout, exit 0, but the author intended that step to run. Cause: The step has a precondition: whose runtime resolution is false — the referenced prior-step output was missing or did not match. Skip is silent on stderr and does not flip the exit code: a run with unintended skips still terminates workflow_complete, so the Skipped: line on stdout is the only signal. Fix: Re-read the prior step's output_schema and the precondition's <ref> clause. Confirm the referenced field is populated by some live path. If the precondition is correct but the prior mock response did not satisfy it, supply a different mock response in the next dry-run; if the precondition itself is wrong, fix it in the workflow YAML.

  • Symptom: Author treats the > prompt as a user-input prompt and types what a human end-user might say. Cause: Misreading the prompt's contract. The > prompt is the mock LLM response at the current step — the string a live model would have produced, not the string a human user would have typed into a client. Fix: Type what the LLM would have emitted. For steps with an output_schema, that is a JSON payload matching the schema. For free-form steps, it is whatever the runtime would read from an LLM's text output.

  • Symptom: Author assumes a green dry-run proves the workflow is correct under a real LLM. Cause: Dry-run never calls an LLM. It verifies the runtime's structural behaviour (transitions, validation, retries, branch resolution, descent) given mock responses the author chose. It does not prove that a real LLM will emit responses that drive the same path, and terminal envelopes prove only that the runtime's finish state is well-formed, not that the artifact is semantically right. Fix: Use dry-run for structural verdict; use the live-run surface (§6) when real-LLM behaviour or real-registry mcp_tool_call round-trip is the question.

10. References

  • references/scripted-responses-template.md — YAML shape, annotated example, and the verbatim drift-detection banners emitted by the responses-file parser and scripted-consume layer, each with Means and Fix.
  • references/load-time-errors.md — the two cross-workflow load-time errors (unknown_call_target and call_cycle_detected) with verbatim message shapes, a preamble pinning the create_app() boundary, and Means/Fix entries.

These references are consulted by message — when dry-run prints one of the banners they catalogue, the agent opens the matching file and looks up the entry.
