
openclaw-test-heap-leaks

Investigate OpenClaw pnpm test memory growth, Vitest OOMs, RSS spikes, and heap snapshot deltas.

Compatible platforms: Claude Code, Codex CLI, Cursor
npx add-skill https://github.com/clawdbot/clawdbot/tree/main/.agents/skills/openclaw-test-heap-leaks

OpenClaw Test Heap Leaks

Use this skill for test-memory investigations. Do not guess from RSS alone when heap snapshots are available. Treat snapshot-name deltas as triage evidence, not proof, until retainers or dominators support the call.

For runtime fixes (e.g., closure leaks in long-running services like the gateway), see Validating runtime fixes below. That path uses a dedicated harness, not the test-parallel snapshot machinery.

Workflow

  1. Reproduce the failing shape first.

    • Match the real entrypoint if possible. For Linux CI-style unit failures, start with:
      pnpm canvas:a2ui:bundle && OPENCLAW_TEST_MEMORY_TRACE=1 OPENCLAW_TEST_HEAPSNAPSHOT_INTERVAL_MS=60000 OPENCLAW_TEST_HEAPSNAPSHOT_DIR=.tmp/heapsnap OPENCLAW_TEST_WORKERS=2 OPENCLAW_TEST_MAX_OLD_SPACE_SIZE_MB=6144 pnpm test
    • Keep OPENCLAW_TEST_MEMORY_TRACE=1 enabled so the wrapper prints per-file RSS summaries alongside the snapshots.
    • If the report is about a specific shard or worker budget, preserve that shape.
    • Before you analyze snapshots, identify the real lane names from [test-parallel] start ... lines or pnpm test --plan. Do not assume a single unit-fast lane; local plans often split into unit-fast-batch-*.
  2. Wait for repeated snapshots before concluding anything.

    • Take at least two intervals from the same lane.
    • Compare snapshots from the same PID inside the real lane directory such as .tmp/heapsnap/unit-fast-batch-2/.
    • Use .agents/skills/openclaw-test-heap-leaks/scripts/heapsnapshot-delta.mjs to compare either two files directly or the earliest/latest pair per PID in one lane directory.
    • If the helper suggests transformed-module retention, confirm the top entries in DevTools retainers/dominators before calling it solved.
  3. Classify the growth before choosing a fix.

    • If growth is dominated by Vite/Vitest transformed source strings, Module, system / Context, bytecode, descriptor arrays, or property maps, treat it as likely retained module graph growth in long-lived workers.
    • If growth is dominated by app objects, caches, buffers, server handles, timers, mock state, sqlite state, or similar runtime objects, treat it as a likely cleanup or lifecycle leak.
    • If the names are ambiguous, stop short of a confident label and inspect retainers/dominators in DevTools for the top deltas.
  4. Fix the right layer.

    • For likely retained transformed-module growth in shared workers:
      • Prefer timing and hotspot-driven scheduling fixes first. Check whether the file is already represented in test/fixtures/test-timings.unit.json and whether scripts/test-update-memory-hotspots.mjs should refresh the measured hotspot manifest before hand-editing behavior overrides.
      • Move hotspot files out of the real shared lane by updating test/fixtures/test-parallel.behavior.json only when timing-driven peeling is insufficient.
      • Prefer singletonIsolated for files that are safe alone but inflate shared worker heaps.
      • If the file should already have been peeled out by timings but is absent from test/fixtures/test-timings.unit.json, call that out explicitly. Missing timings are a scheduling blind spot.
    • For real leaks:
      • Patch the implicated test or runtime cleanup path.
      • Look for missing afterEach/afterAll, module-reset gaps, retained global state, unreleased DB handles, or listeners/timers that survive the file.
  5. Verify with the most direct proof.

    • Re-run the targeted lane or file with heap snapshots enabled if the suite still finishes in reasonable time.
    • If snapshot overhead pushes tests over Vitest timeouts, fall back to the same lane without snapshots and confirm the RSS trend or OOM is reduced.
    • For wrapper-only changes, at minimum verify the expected lanes start and the snapshot files are written.
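
The classification in step 3 can be sketched as a small triage function. This is only an illustration: the name patterns, the 60% dominance threshold, and the `{ name, deltaKb }` row shape are assumptions, not the output format of heapsnapshot-delta.mjs. An "ambiguous" result means the retainers/dominators still need inspecting in DevTools.

```javascript
// Hypothetical name patterns; tune them against real delta output.
const MODULE_GRAPH_PATTERNS = [
  /transform/i, /^Module$/, /system \/ Context/,
  /bytecode/i, /descriptor array/i, /property map/i,
];
const RUNTIME_PATTERNS = [
  /cache/i, /buffer/i, /server/i, /timer|timeout/i, /mock/i, /sqlite/i,
];

// rows: [{ name: string, deltaKb: number }] (assumed shape).
function classifyGrowth(rows, dominance = 0.6) {
  let moduleKb = 0;
  let runtimeKb = 0;
  let otherKb = 0;
  for (const { name, deltaKb } of rows) {
    if (deltaKb <= 0) continue; // only positive growth matters for triage
    if (MODULE_GRAPH_PATTERNS.some((p) => p.test(name))) moduleKb += deltaKb;
    else if (RUNTIME_PATTERNS.some((p) => p.test(name))) runtimeKb += deltaKb;
    else otherKb += deltaKb;
  }
  const totalKb = moduleKb + runtimeKb + otherKb;
  let label = "ambiguous"; // default: go look at retainers/dominators
  if (totalKb === 0) label = "no-growth";
  else if (moduleKb / totalKb >= dominance) label = "likely-module-graph";
  else if (runtimeKb / totalKb >= dominance) label = "likely-runtime-leak";
  return { label, moduleKb, runtimeKb, otherKb };
}
```

The label is triage evidence only; it decides which layer to look at first, not what the root cause is.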

Heuristics

  • Do not call everything a leak. In this repo, large unit-fast or unit-fast-batch-* growth can be a worker-lifetime problem rather than an application object leak.
  • scripts/test-parallel.mjs and scripts/test-parallel-memory.mjs are the primary control points for wrapper diagnostics.
  • The lane names printed by [test-parallel] start ... and [test-parallel][mem] summary ... tell you where to focus.
  • When one or two files account for most of the delta and they are missing from timings, reducing impact by isolating them is usually the first pragmatic fix.
  • When the same retained object families grow across multiple intervals in the same worker PID, trust the snapshots over intuition, then confirm ambiguous calls with retainer evidence.
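
The last heuristic (the same families growing across multiple intervals in one worker PID) can be checked mechanically. The sketch below assumes a hypothetical sample shape, `{ pid, families: { name: sizeKb } }`, listed in chronological order, and treats "multiple intervals" as at least three snapshots; neither assumption comes from the repo's scripts.

```javascript
// Returns the object families that grew in EVERY consecutive interval.
// samples: [{ pid: number, families: { [name]: sizeKb } }] in time order.
function familiesGrowingEveryInterval(samples, minStepKb = 1) {
  if (samples.length < 3) return []; // fewer than two intervals: no call yet
  const pids = new Set(samples.map((s) => s.pid));
  if (pids.size !== 1) {
    throw new Error("samples must come from a single worker PID");
  }
  const last = samples[samples.length - 1].families;
  const first = samples[0].families;
  const result = [];
  for (const name of Object.keys(last)) {
    let growsEveryTime = true;
    for (let i = 1; i < samples.length; i++) {
      const prev = samples[i - 1].families[name] ?? 0;
      const next = samples[i].families[name] ?? 0;
      if (next - prev < minStepKb) {
        growsEveryTime = false;
        break;
      }
    }
    if (growsEveryTime) {
      result.push({ name, totalGrowthKb: last[name] - (first[name] ?? 0) });
    }
  }
  // Largest cumulative growth first.
  return result.sort((a, b) => b.totalGrowthKb - a.totalGrowthKb);
}
```

Families that only jump once and then plateau drop out of the result, which helps separate one-off warm-up allocations from steady retention.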

Snapshot Comparison

  • Direct comparison:
    • node .agents/skills/openclaw-test-heap-leaks/scripts/heapsnapshot-delta.mjs before.heapsnapshot after.heapsnapshot
  • Auto-select earliest/latest snapshots per PID within one lane:
    • node .agents/skills/openclaw-test-heap-leaks/scripts/heapsnapshot-delta.mjs --lane-dir .tmp/heapsnap/unit-fast-batch-2
  • Useful flags:
    • --top 40
    • --min-kb 32
    • --pid 16133

Read the top positive deltas first. Large positive growth in module-transform artifacts points to retained module graph growth, where lane isolation is the usual fix; large positive growth in runtime objects suggests a real leak. If the names alone do not settle it, open the same snapshot pair in DevTools and inspect retainers/dominators for the top rows before declaring root cause.
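
To make the idea of a snapshot-name delta concrete, here is a minimal sketch that sums self sizes per node name in two already-parsed V8 heap snapshots and lists the largest positive differences. It is not the repo's heapsnapshot-delta.mjs; the function names are hypothetical, and the `top` and `minKb` options only mimic what the `--top` and `--min-kb` flags presumably do.

```javascript
// Sum self_size per node name. V8 stores nodes as one flat integer array
// whose per-node layout is described by snapshot.meta.node_fields.
function sizeByName(heap) {
  const fields = heap.snapshot.meta.node_fields;
  const stride = fields.length;
  const nameIdx = fields.indexOf("name");
  const sizeIdx = fields.indexOf("self_size");
  const totals = new Map();
  for (let i = 0; i < heap.nodes.length; i += stride) {
    const name = heap.strings[heap.nodes[i + nameIdx]];
    totals.set(name, (totals.get(name) ?? 0) + heap.nodes[i + sizeIdx]);
  }
  return totals;
}

// Positive growth per name, largest first.
function topDeltas(before, after, { top = 40, minKb = 32 } = {}) {
  const a = sizeByName(before);
  const b = sizeByName(after);
  const rows = [];
  for (const name of new Set([...a.keys(), ...b.keys()])) {
    const deltaKb = ((b.get(name) ?? 0) - (a.get(name) ?? 0)) / 1024;
    if (deltaKb >= minKb) rows.push({ name, deltaKb });
  }
  return rows.sort((x, y) => y.deltaKb - x.deltaKb).slice(0, top);
}
```

A name-level delta like this says nothing about who retains the memory, which is why the top rows still need a retainer or dominator check. Very large snapshot files may also be too big for a single JSON.parse call and need a streaming parser instead.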

Validating runtime fixes (not test-memory)

The workflow above is for diagnosing Vitest worker memory growth. For validating that a runtime/closure fix actually releases captured state, use the dedicated harness:

  • pnpm leak:embedded-run — runs scripts/embedded-run-abort-leak.ts. Loops N aborted runs in a function-shaped scope mimicking runEmbeddedAttempt, writes heap snapshots, and reports a PASS/FAIL verdict on retention growth using FinalizationRegistry for tracked-instance counting plus RSS delta.

Modes:

  • closure-extracted (default) — production fix shape (helper at module scope).
  • closure-inline — pre-fix shape (closure inside the runner scope). Use as a sensitivity check: if it passes you've broken the harness, not fixed a bug.
  • synthetic-leak — deliberately retains via a module-level bucket. Use to confirm the harness can detect leaks before trusting a PASS on a real fix.
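
One way to read the three modes together is sketched below. It assumes each mode yields a "PASS" or "FAIL" string and that both closure-inline and synthetic-leak are expected to FAIL on a healthy harness; the function and the result labels are hypothetical, not part of the harness.

```javascript
// Decide whether a PASS on the real fix can be trusted.
// Each argument is "PASS" or "FAIL" (assumed verdict strings).
function interpretModeResults({ closureExtracted, closureInline, syntheticLeak }) {
  // The harness must flag a deliberate leak, or nothing it reports is useful.
  if (syntheticLeak !== "FAIL") return "harness-cannot-detect-leaks";
  // The pre-fix shape must still fail, or the harness has lost sensitivity.
  if (closureInline !== "FAIL") return "harness-insensitive-to-pre-fix-shape";
  // Only now does the verdict on the production fix shape carry weight.
  return closureExtracted === "PASS" ? "fix-validated" : "fix-still-leaks";
}
```

Checking the two control modes first keeps a broken harness from being mistaken for a working fix.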

Snapshots land in .tmp/embedded-run-abort-leak/. Diff with the same script as above:

node .agents/skills/openclaw-test-heap-leaks/scripts/heapsnapshot-delta.mjs \
  .tmp/embedded-run-abort-leak/baseline-*.heapsnapshot \
  .tmp/embedded-run-abort-leak/batch-N-*.heapsnapshot --top 30

When fixing a different runtime leak, add a new harness alongside this one rather than retrofitting it. The fixture function should mimic the lexical scope of the function where the leak lives, not be a generic abort-loop.
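
A skeleton for such a new harness might look like the sketch below. Everything in it is hypothetical (`InstanceTracker`, `leakyFixture`, `cleanFixture`, `runBatch`) and none of it is taken from scripts/embedded-run-abort-leak.ts; it only shows tracked-instance counting with FinalizationRegistry around a fixture that has per-run state and an abort handler. A real fixture would copy the lexical scope of the function where the leak lives.

```javascript
// Counts tracked objects that have not been finalized yet.
class InstanceTracker {
  constructor() {
    this.live = 0;
    this.registry = new FinalizationRegistry(() => {
      this.live -= 1; // runs some time after the object is collected
    });
  }
  track(obj) {
    this.live += 1;
    this.registry.register(obj, "tracked");
    return obj;
  }
}

const leakBucket = []; // module-level bucket that deliberately retains

// Leaking shape: the handler is created inside the run scope and
// captures runState, then parks it in the module-level bucket.
function leakyFixture(tracker) {
  const controller = new AbortController();
  const runState = tracker.track({ buffer: new Array(256).fill(0) });
  const onAbort = () => leakBucket.push(runState);
  controller.signal.addEventListener("abort", onAbort, { once: true });
  controller.abort(); // dispatches the abort event synchronously
}

// Fixed shape: the handler lives at module scope and captures nothing.
function onAbortExtracted() {}
function cleanFixture(tracker) {
  const controller = new AbortController();
  tracker.track({ buffer: new Array(256).fill(0) });
  controller.signal.addEventListener("abort", onAbortExtracted, { once: true });
  controller.abort();
}

// Run a fixture repeatedly and return the tracker for inspection.
function runBatch(fixture, iterations) {
  const tracker = new InstanceTracker();
  for (let i = 0; i < iterations; i++) fixture(tracker);
  return tracker;
}
```

Reading a meaningful `live` count for the clean shape requires forcing collection (for example, running Node with `--expose-gc`, calling `globalThis.gc()`, and yielding to the event loop before reading), and finalization timing is still not guaranteed. Treat the count as one signal alongside heap snapshots and RSS.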

Output Expectations

When using this skill, report:

  • The exact reproduce command.
  • Which lane and PID were compared.
  • The dominant retained object families from the snapshot delta.
  • Whether the issue is a likely real leak or likely shared-worker retained module growth, plus whether retainers/dominators confirmed it.
  • The concrete fix or impact-reduction patch.
  • What you verified, and what snapshot overhead prevented you from verifying.

Individual skills in this repo

This repo contains 20 individual skills — each has its own dedicated page.

1password

Set up and use 1Password CLI for sign-in, desktop integration, and reading or injecting secrets.

acp-router

Route plain-language requests for Claude Code, Cursor, Copilot, OpenClaw ACP, OpenCode, Gemini CLI, Qwen, Kiro, Kimi, iFlow, Factory Droid, Kilocode, or explicit ACP harness work into either OpenClaw ACP runtime sessions or direct acpx-driven sessions ("telephone game" flow). For coding-agent thread requests, read this skill first, then use only `sessions_spawn` for thread creation. Codex chat binding defaults to the native Codex app-server plugin unless ACP is explicit or background spawn needs ACP.

agent-transcript

Add a redacted agent transcript section to GitHub PR or issue bodies during OpenClaw agent-created PR/issue workflows.

apple-notes

Create, view, edit, delete, search, move, or export Apple Notes via the memo CLI on macOS.

apple-reminders

List, add, edit, complete, or delete Apple Reminders and reminder lists via remindctl.

autoreview

Auto Review closeout. Codex review is the default when no engine is set and is the recommended reviewer.

bear-notes

Create, search, and manage Bear notes via grizzly CLI.

blacksmith-testbox

Run Blacksmith Testbox for CI-parity checks, secrets, hosted services, migrations, or builds local cannot reproduce.

blogwatcher

Monitor blogs and RSS/Atom feeds for updates using the blogwatcher CLI.

blucli

BluOS CLI (blu) for discovery, playback, grouping, and volume.

bluebubbles

Send and manage iMessages via BlueBubbles, including attachments, tapbacks, edits, replies, and groups.

browser-automation

Use when controlling web pages with the OpenClaw browser tool, especially multi-step flows, login checks, tab management, or recovery from stale refs/timeouts.

camsnap

Capture frames or clips from RTSP/ONVIF cameras.

canvas

Present HTML on connected OpenClaw node canvases, navigate/eval/snapshot, and debug canvas host URLs.

channel-message-flows

Use when previewing local channel message flow fixtures.

clawdtributor

Use for OpenClaw clawtributors PR/issue triage: Discrawl discovery, live-open rechecks, deep review, topic grouping, and compact @handle/LOC/type/blast/verification summaries.

clawhub

Search, install, update, sync, or publish agent skills with the ClawHub CLI and registry.

clawsweeper

Use for all ClawSweeper work: OpenClaw issue/PR sweep reports, commit-review reports, repair jobs, cloud fix PRs, @clawsweeper maintainer mention commands, trusted ClawSweeper-reviewed autofix/automerge, GitHub Actions monitoring, permissions, gates, and manual backfills.

clownfish-cloud-pr

Use when launching Clownfish in GitHub Actions to create or update one guarded GitHub implementation PR from issue/PR refs, a ClawSweeper report, a custom maintainer prompt, or to opt an existing Clownfish PR into ClawSweeper-reviewed cloud automerge.

codex-review

Codex code review closeout: local dirty changes, PR branch vs main, parallel tests.
