
walidboulanouar/vault-organizer

Clean up, organize and index any markdown knowledge base (Obsidian/PARA) with Karpathy's 3-layer LLM-wiki method. Works as an AI agent skill or standalone Python.

Compatible platforms: Claude Code, Codex CLI, Cursor
npx skills add walidboulanouar/vault-organizer

Documentation


name: vault-organizer
description: >
  Clean up, organize, and build a retrieval registry for a markdown knowledge base (Obsidian / PARA second brain) using Karpathy's 3-layer LLM-wiki method: RAW sources, COMPILED wiki, SCHEMA/index. Runs a read-only audit (missing frontmatter, broken or missing [[wikilinks]], orphan pages, missing Related sections, stale files, duplicate names, folders without an INDEX), then fixes them, enforces consistent structure, and generates per-folder + a master INDEX so any file is fast to find. Use when the user says: "clean up my folders / vault", "organize my files", "build an index or registry", "make my notes retrievable", "lint my knowledge base", "tidy my second brain", "standardize frontmatter and wikilinks", or wants a folder reorganized for better recall.

Vault Organizer

Turn a messy markdown vault into a clean, cross-linked, indexed knowledge base that an LLM (or a teammate) can retrieve from fast. Built on Karpathy's principle: treat knowledge like code. Raw sources are source files, the compiled wiki is the build output, the INDEX is the schema, and this skill is the compiler + linter.

The model (3 layers)

| Layer | What | Where |
|---|---|---|
| RAW | Unprocessed: transcripts, clips, dumps, screenshots | 00_Inbox/raw/, closest topic folder |
| COMPILED | Synthesized pages, cross-referenced with [[wikilinks]] | PARA folders (01_ to 04_) |
| SCHEMA | INDEX files, CLAUDE.md, the rules | per-folder INDEX.md + a master index |

Every file carries a type field in its frontmatter, set to one of: raw, compiled, reference, framework, client, index. See references/standards.md for the frontmatter and INDEX templates.
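As a rough illustration of the type rule above, the sketch below reads the type key from a note's frontmatter and checks it against the six allowed values. It is a stdlib-only approximation, not the parser used by scripts/audit.py; the names `read_type` and `has_valid_type` are hypothetical, and it assumes frontmatter is a leading block delimited by `---` lines with simple `key: value` pairs.

```python
import re
from typing import Optional

# The six values listed in the type taxonomy.
ALLOWED_TYPES = {"raw", "compiled", "reference", "framework", "client", "index"}

# A leading block delimited by '---' lines (assumed frontmatter layout).
FRONTMATTER_RE = re.compile(r"\A---\s*\n(.*?)\n---\s*(?:\n|\Z)", re.DOTALL)


def read_type(text: str) -> Optional[str]:
    """Return the value of the `type:` key in the frontmatter, or None."""
    match = FRONTMATTER_RE.match(text)
    if not match:
        return None
    for line in match.group(1).splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "type":
            # Drop surrounding whitespace and optional quotes.
            return value.strip().strip("'\"") or None
    return None


def has_valid_type(text: str) -> bool:
    """True when the note declares one of the allowed types."""
    return read_type(text) in ALLOWED_TYPES
```

A real vault may use nested or multi-line YAML values, in which case a full YAML parser would be the safer choice.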

Workflow

Run these in order. Audit and propose before changing anything. Never delete or move without confirming. Prefer a git repo so changes are reversible.

1. Audit (read-only)

# whole vault:
python3 scripts/audit.py <vault_root> --stale-days 120 --report /tmp/vault-audit.md
# one subfolder, but resolve [[wikilinks]] against the whole vault (avoids false "broken" flags):
python3 scripts/audit.py <subfolder> --link-root <vault_root> --report /tmp/vault-audit.md

Returns a health report: missing frontmatter, broken/missing wikilinks, orphans, missing Related sections, duplicates, stale files, folders without an INDEX. Read it, then show the user the summary table and the top issues. Pass --json for machine-readable output.
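To make the --link-root idea concrete, here is a minimal sketch of a broken-wikilink check in which the set of known note names comes from a wider root than the folder being audited. This is not the code in scripts/audit.py; `link_targets`, `find_broken`, and `scan` are hypothetical names, and matching links by lowercase file stem is an assumption about how links resolve.

```python
import re
from pathlib import Path

# [[target]], [[target|alias]] and [[target#heading]] all resolve to `target`.
WIKILINK_RE = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def link_targets(text):
    """Return the page names referenced by wikilinks in `text`."""
    return [m.group(1).strip() for m in WIKILINK_RE.finditer(text)]


def find_broken(pages, known_names):
    """pages: {path: text}; known_names: lowercase note names that exist."""
    report = {}
    for path, text in pages.items():
        missing = [t for t in link_targets(text)
                   if t.split("/")[-1].lower() not in known_names]
        if missing:
            report[path] = missing
    return report


def scan(folder, link_root):
    """Audit `folder`, resolving links against every .md file under `link_root`."""
    folder, link_root = Path(folder), Path(link_root)
    known = {p.stem.lower() for p in link_root.rglob("*.md")}
    pages = {str(p.relative_to(folder)): p.read_text(encoding="utf-8")
             for p in folder.rglob("*.md")}
    return find_broken(pages, known)
```

Calling `scan(subfolder, subfolder)` would flag links to notes that live elsewhere in the vault, which is the false "broken" case the second command avoids.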

2. Triage

Group the findings and ask what to fix now vs later. Default priority:

  1. Folders without an INDEX (kills retrieval)
  2. Missing frontmatter (kills search + Properties)
  3. Broken wikilinks (kills the graph)
  4. Missing Related sections + orphans (kills discovery)
  5. Stale / duplicates (cleanup)
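The default priority above can be sketched as a small grouping step. The issue-kind strings here are hypothetical labels chosen for illustration; the actual keys in the audit output are not documented in this file.

```python
from collections import defaultdict

# Hypothetical issue kinds, listed in the default priority order above.
PRIORITY = [
    "missing_index",
    "missing_frontmatter",
    "broken_wikilink",
    "missing_related",
    "orphan",
    "stale",
    "duplicate",
]


def triage(findings):
    """Group (kind, path) findings by kind, highest priority first."""
    grouped = defaultdict(list)
    for kind, path in findings:
        grouped[kind].append(path)
    rank = {kind: i for i, kind in enumerate(PRIORITY)}
    # Unknown kinds sort after every known kind.
    ordered = sorted(grouped, key=lambda k: (rank.get(k, len(PRIORITY)), k))
    return [(kind, sorted(grouped[kind])) for kind in ordered]
```

The grouped list is what you would show the user when asking what to fix now and what to defer.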

3. Clean (per file)

For each flagged file:

  • Add YAML frontmatter (title, type, source, author, created, description, tags) — infer from content, use today's date for created if unknown.
  • Add [[wikilinks]] on the first mention of any concept, tool, person, or file that exists elsewhere in the vault.
  • Add a ## Related section at the bottom with 3–8 [[wikilinks]] to the most connected pages. Then link back to this file from the Related section of 2–3 of those pages, so the links are bidirectional.
  • Fix broken wikilinks: repoint to the correct slug, or leave as an intentional stub if the target is worth writing later.

Use the templates in references/standards.md. Do not remove existing frontmatter or links.
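The frontmatter step might look like the sketch below, which prepends a block with the seven listed keys only when a note has none, so existing frontmatter is never touched. The function name `ensure_frontmatter` and the empty placeholder values are assumptions; the authoritative template is the one in references/standards.md.

```python
import json
from datetime import date


def ensure_frontmatter(text, title, note_type="compiled", created=None):
    """Prepend frontmatter if the note has none; otherwise return it unchanged."""
    if text.startswith("---\n"):
        return text  # existing frontmatter is left alone
    created = created or date.today().isoformat()  # today's date if unknown
    lines = [
        "---",
        # json.dumps yields a double-quoted string, which is safe for titles
        # that contain colons or quotes.
        f"title: {json.dumps(title, ensure_ascii=False)}",
        f"type: {note_type}",
        'source: ""',
        'author: ""',
        f"created: {created}",
        'description: ""',
        "tags: []",
        "---",
        "",
    ]
    return "\n".join(lines) + "\n" + text
```

The empty source, author, description, and tags fields are placeholders to fill in from the note's content.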

4. Organize (propose first)

  • Move misfiled items to the right layer/folder (raw → 00_Inbox/raw/, compiled → PARA).
  • Normalize filenames to kebab-case; merge or rename duplicates.
  • Keep moves small and reversible. List every move and get a yes before running it.
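One way to propose renames before running them is a dry-run planner like this sketch. It only computes a plan and never touches the filesystem. The names `to_kebab` and `plan_renames`, and the choice to leave INDEX.md, README.md, and CLAUDE.md untouched, are assumptions made for illustration.

```python
import os
import re

# Schema files keep their conventional upper-case names (assumption).
KEEP_AS_IS = {"INDEX.md", "README.md", "CLAUDE.md"}


def to_kebab(filename):
    """Return a kebab-case version of `filename`, extension lower-cased."""
    if filename in KEEP_AS_IS:
        return filename
    stem, ext = os.path.splitext(filename)
    stem = re.sub(r"([a-z0-9])([A-Z])", r"\1-\2", stem)     # split camelCase
    stem = re.sub(r"[\W_]+", "-", stem).strip("-").lower()  # spaces, _, punctuation
    return stem + ext.lower()


def plan_renames(filenames):
    """Return (plan, collisions) as lists of (old, new); nothing is renamed."""
    plan, collisions = [], []
    taken = set(filenames)
    for name in filenames:
        new = to_kebab(name)
        if new == name:
            continue
        if new in taken:
            collisions.append((name, new))  # needs a merge-or-rename decision
        else:
            plan.append((name, new))
            taken.add(new)
    return plan, collisions
```

Collisions are the duplicate cases to raise with the user; the plan is the list of moves to confirm.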

5. Register (the deliverable)

  • For each content folder, generate or update INDEX.md: a one-line pointer per file (- [[file-name]] — short hook), grouped by type, newest or most-important first.
  • Build/refresh a master index (e.g. INDEX.md at the root, or a meta/ schema folder) that links to every folder INDEX.
  • This registry is what makes retrieval fast and what you hand to a teammate.
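A per-folder INDEX could be rendered along the lines of this sketch, which emits one pointer line per file, grouped by type. The heading layout, the group order in `TYPE_ORDER`, and the entry dictionary shape are assumptions; the real INDEX template lives in references/standards.md.

```python
# Order in which type groups appear in the rendered index (assumption).
TYPE_ORDER = ["compiled", "framework", "reference", "client", "raw"]


def render_index(folder, entries):
    """entries: list of {"name", "type", "hook"} dicts, already sorted
    newest or most-important first by the caller."""
    lines = [f"# {folder} INDEX", ""]
    for note_type in TYPE_ORDER:
        group = [e for e in entries if e["type"] == note_type]
        if not group:
            continue
        lines.append(f"## {note_type}")
        # One-line pointer per file: "- [[file-name]] <dash> short hook".
        lines += [f"- [[{e['name']}]] \u2014 {e['hook']}" for e in group]
        lines.append("")
    return "\n".join(lines)
```

A master index can then be a similar list with one wikilink per folder INDEX.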

6. Lint (confirm)

Re-run audit.py. Show the before/after counts. The vault is "clean" when the counts for missing frontmatter, folders without an INDEX, and broken links are all zero (orphans/stale may remain by choice).
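The "clean" rule can be written as a small check over two sets of counts. The key names below are hypothetical, since the shape of the --json output is not shown here; a key that is absent is treated as zero.

```python
# Counts that must reach zero for the vault to be considered clean.
# Key names are hypothetical; map them to whatever the audit reports.
MUST_BE_ZERO = ("missing_frontmatter", "folders_without_index", "broken_wikilinks")


def is_clean(counts):
    """True when every must-fix count is zero; orphans and stale are ignored."""
    return all(counts.get(key, 0) == 0 for key in MUST_BE_ZERO)


def before_after(before, after):
    """Return one 'key: before -> after (delta)' line per issue kind."""
    rows = []
    for key in sorted(set(before) | set(after)):
        b, a = before.get(key, 0), after.get(key, 0)
        rows.append(f"{key}: {b} -> {a} ({a - b:+d})")
    return rows
```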

Sending it to a teammate

When the goal is to share the organized vault with a teammate: make sure the master INDEX is current, every folder has its own INDEX, and a short README.md at the root explains the 3-layer model and how to navigate. The README + master INDEX is the entry point.

Rules

  • Read-only audit first, always. Surface, don't surprise.
  • Never delete or bulk-move without explicit confirmation; recommend committing first.
  • Don't touch .obsidian/, .git/, or Claude Code config (settings.json, hooks).
  • Match the vault's existing conventions (this one: PARA + the standards in references/standards.md).
  • Work in batches; report what changed after each batch.

Files

  • scripts/audit.py — read-only scanner, the engine. Run it first and last.
  • references/standards.md — frontmatter spec, type taxonomy, INDEX + README templates.
