Finance Audit — Personal CFO Report

Turn a folder of raw statement files into a Personal CFO Report: verified data plus a narrative layer that answers What happened? Why? What should I do next?

Accuracy-first: amounts are read verbatim and reconciled against the running balance — nothing is invented.

Bank-agnostic. Works with any bank or account export, given a transaction list. Columns, header rows and date order (DMY/MDY/YMD) are auto-detected per file; the merchant/counterparty cleaner is rail-agnostic (UPI, POS, ACH, SEPA, card, wire, …). Nothing is hardcoded to a specific bank — you confirm the auto-detected mapping and tune the category keywords.

Pipeline (5 stages)

ingest  ->  parse  ->  categorize (loop to >=99%)  ->  analyze  ->  report (serve + hot reload)

The scripts live in this skill. Run them against the user's audit folder (the folder containing the statement files). Set two variables first:

SKILL="<absolute path to this skill directory>"   # the folder containing this SKILL.md
ROOT="<absolute path to the audit folder>"        # where the statement files are

Privacy — never commit real data

All statement files, balances, narrations, and generated output stay local in ROOT/. The skill repo only ships templates, example config, and dummy fixtures (assets/fixtures/sample_analysis.json).

Gitignored (never push): output/ (including report.html, analysis.json, audit.db), statement files, and live config/mapping.json / overrides.

Optional labels in config/mapping.json: account_label or account_last4 for the sidebar footer (defaults to "Primary account"). Business display name in config/business.json → display_name (defaults to "Business").

Assume statement files are already downloaded into ROOT (PDF / XLSX / XLS / CSV / TXT).

Pick the runtime (the user prefers Node.js):

Use Node when node is available AND no input is a PDF.
Use Python when any input is a PDF (Node path can't parse PDF) or node is unavailable.

Detect and install dependencies:

# Node path (preferred)
command -v node && node --version            # needs Node >= 22 for built-in sqlite
( cd "$ROOT" && npm install xlsx )           # ONLY for .xls/.xlsx; CSV/TXT need nothing

# Python path (fallback / required for PDF)
python3 --version
pip install openpyxl "xlrd==2.0.1" pdfplumber pandas  # + pandas for analyze stage

Define how you'll invoke a stage ($STAGE = ingest|parse|categorize|analyze|report):

# Node:    node   "$SKILL/scripts/node/$STAGE.mjs" --root "$ROOT" [args]
# Python:  python3 "$SKILL/scripts/python/$STAGE.py" --root "$ROOT" [args]

Step 1 — Ingest & detect

node "$SKILL/scripts/node/ingest.mjs" --root "$ROOT"      # or python3 .../ingest.py --root "$ROOT"

This previews every file and writes config/mapping.json with an auto-detected header row, column roles, and an explicit date_format. Read the preview and verify mapping.json — banks differ. If an entry has a _date_note, the date order was ambiguous (e.g. 03/04/2024); confirm date_format with the user. Each file entry needs: header_row, data_start_row, date_format (a strptime pattern like %d/%m/%Y, or auto), columns (role → 0-based index), and amount_mode:

debit_credit — separate withdrawal + deposit columns (most common)
signed — one amount column whose sign indicates direction
amount_type — an amount column plus a type column holding DR/CR

Fix any wrong indices before parsing.

Step 2 — Parse, normalize, dedupe, validate

node "$SKILL/scripts/node/parse.mjs" --root "$ROOT"       # or .../parse.py

Writes output/audit.db (SQLite). The debit/credit split is derived from the signed amount so it always matches the balance chain (correctly handling refunds encoded as negative withdrawals). Require the integrity check to PASS (or n/a when there is no balance column). If it FAILs:

Re-check mapping.json columns / data_start_row / date_format.
Re-read the ingest stdout preview (re-run ingest if needed); confirm you didn't include header/footer rows.
Per-file chain_issues > 0 points to the exact rows that mis-parsed.

Do not proceed to categorization until parsing reconciles.

Step 3 — Categorize (loop until >= 99% by value)

cp "$SKILL/config/categories.example.json" "$ROOT/config/categories.json"   # first run only
# For a business/freelancer account, also:
cp "$SKILL/config/business.example.json"   "$ROOT/config/business.json"     # then edit it
node "$SKILL/scripts/node/categorize.mjs" --root "$ROOT"   # or .../categorize.py

Rule precedence: overrides.json > business.json > categories.json; highest priority wins. Categorization is script-first; reach 99% like this:

Run categorize. Read the coverage summary and output/uncategorized.csv (merchants grouped, largest value first).
For merchants whose identity is obvious from the narration, add rules to config/overrides.json yourself (see format below) — no need to ask.
For ambiguous merchants (unclear who/what, or personal-vs-business unclear), ask the user a concise question, then add the rule.
Re-run categorize. Repeat until value coverage ≥ 99%.

config/overrides.json format:

{ "rules": [
  { "keywords": ["ACME", "ACME CORP"], "primary": "Income", "sub": "Acme retainer", "when": "credit" },
  { "keywords": ["BIGRENT"], "primary": "Housing", "sub": "Rent", "is_business": 0 }
] }

Aim for every transaction to carry a primary_cat and sub_cat. The factual P2P-counterparty fallback labels transfers but counts as "needs review" — convert the important ones into real rules.

Step 4 — Analyze (build `output/analysis.json`)

python3 "$SKILL/scripts/python/analyze.py" --root "$ROOT"

Computes FY slices, category reports, CFO narrative, income engine, burn summary, problem areas, and business post-mortem from output/audit.db. Writes output/analysis.json (and optional findings.md locally — never commit).

Requires pandas. Tune config/business.json → display_name for the business section label (default "Business").

Step 5 — Generate the report (serve + hot reload + auto-open)

python3 "$SKILL/scripts/python/report.py" --root "$ROOT" --serve

Builds output/report.html from analysis.json + ledger rows embedded inline. With --serve, opens http://127.0.0.1:8765/ and hot-reloads when db/config changes (re-runs analyze + re-renders). Use --refresh-analysis to force re-analyze on a one-shot build.

Report UI (`output/report.html`) — Money.Audit hybrid dashboard

Shell: rounded app frame, light sidebar with FY list + section nav (scroll-spy), full-width main column. Title: What matters about your money.

Look: Fraunces serif + Spline Sans UI; warm cream shell (#f7f6f3), gold accent. Not generic gray SaaS.

Sections (sidebar): Snapshot, Income engine, Monthly burn, Problem areas, Business post-mortem, Cash & movement, Action plan, All categories, Ledger (search + filters).

Data embedded: const ANALYSIS = … and const LEDGER = … — no external fetch. Sidebar footer shows account_label from mapping (never hardcode account numbers in the template).

Motion: emil-design-eng — :active scale(0.98), bar width transitions, reduced-motion safe. Style guide: references/ui_style_guide.md.

Template: assets/report_template.html. Dummy fixture for docs: assets/fixtures/sample_analysis.json.

When changing report UI: edit the template + style guide; never commit real output/report.html. Regenerate locally via analyze → report.

Report content

Snapshot — health score, metrics, three things CFO quote.
Income / burn / problems — decision-first sections with drill-down to categories.
Business review — when config/business.json is present.
Ledger — searchable transaction evidence (first 350 matches shown).

Flags: --port N, --no-open, --refresh-analysis. Without --serve, writes output/report.html once (runs analyze first if analysis.json is missing).

Methodology: references/financial_frameworks.md.

Note: The hybrid Money.Audit UI uses the Python analyze + report path. Node report.mjs still renders the legacy CFO template for basic audits without analyze.

Layout produced in the audit folder

ROOT/
├── <statement files>            # input (PDF/XLSX/XLS/CSV/TXT)
├── config/
│   ├── mapping.json             # column mapping (verify after ingest)
│   ├── categories.json          # category rules (from the example)
│   ├── business.json            # optional: business income/payroll/infra
│   └── overrides.json           # your labels added during the loop
└── output/
    ├── audit.db                 # SQLite source of truth
    ├── analysis.json            # analyze stage (local only — PII)
    ├── uncategorized.csv        # review queue (regenerated each categorize run)
    └── report.html              # Money.Audit dashboard (local only — PII)

The skill only writes what's needed: editable config, the SQLite db that carries state between stages, the review queue, and the final report. Intermediate dumps (the ingest preview, the report JSON) are printed/embedded, not written as extra files. mapping.json and audit.db must persist on disk because each stage runs as a separate process.

Notes

The same mapping.json / config / SQLite schema work across both runtimes; you can switch runtimes mid-audit.
Keep raw statement files out of version control (they contain PII).
For full schema, validation math, and config reference, see reference.md.
For analysis frameworks (savings rate, 50/30/20, benchmarks, recommendation logic), see references/financial_frameworks.md.

Common requests this skill handles

"Audit / categorize my bank statements and make a report" — full pipeline.
"What's my Personal CFO report for FY2024-25?" — use period selector after report.
"What's my savings rate and how do I improve it?" — executive summary + action plan.
"Where am I overspending?" — spending intelligence + counterfactuals.
"What should I stop/start/continue?" — action plan section.
"Ask my CFO about runway / debt / investments" — use Ask your CFO prompts with your agent.
"Improve the report UI / make it full width / fix the layout" — edit assets/report_template.html per references/ui_style_guide.md; regenerate locally with analyze → report. Never commit output/report.html.

sdrplk/finance-audit

Ask in your favorite AI

Documentation

Finance Audit — Personal CFO Report

Pipeline (5 stages)

Privacy — never commit real data

Step 1 — Ingest & detect

Step 2 — Parse, normalize, dedupe, validate

Step 3 — Categorize (loop until >= 99% by value)

Step 4 — Analyze (build `output/analysis.json`)

Step 5 — Generate the report (serve + hot reload + auto-open)

Report UI (`output/report.html`) — Money.Audit hybrid dashboard

Report content

Layout produced in the audit folder

Notes

Common requests this skill handles

Related Skills

fluyeporlaweb/mcp-x-intelligence

yoakev/nitrotype-tps

VoyagerXvoyagerx/modelscope-notebook-adapter

OpenBB-finance/OpenBB

jagavaibhav2007-beep/my-agent-skills

ignazsoured466/claw-ds

Ask in your favorite AI

Documentation

Finance Audit — Personal CFO Report

Pipeline (5 stages)

Privacy — never commit real data

Step 1 — Ingest & detect

Step 2 — Parse, normalize, dedupe, validate

Step 3 — Categorize (loop until >= 99% by value)

Step 4 — Analyze (build output/analysis.json)

Step 5 — Generate the report (serve + hot reload + auto-open)

Report UI (output/report.html) — Money.Audit hybrid dashboard

Report content

Layout produced in the audit folder

Notes

Common requests this skill handles

Related Skills

fluyeporlaweb/mcp-x-intelligence

yoakev/nitrotype-tps

VoyagerXvoyagerx/modelscope-notebook-adapter

OpenBB-finance/OpenBB

jagavaibhav2007-beep/my-agent-skills

ignazsoured466/claw-ds

Step 4 — Analyze (build `output/analysis.json`)

Report UI (`output/report.html`) — Money.Audit hybrid dashboard