
sulipapa/localize-infographic-pdf

Artwork-preserving PDF localization (e.g. EN→ZH) for design-exported infographics/one-pagers — a Claude Code skill that erases vector/Type3 source text and redraws the translation in place.

Compatible with: Claude Code, Codex CLI, Cursor
npx skills add sulipapa/localize-infographic-pdf

Documentation

Localize Infographic PDF (artwork-preserving)

Produce a translated version of a design PDF that looks identical to the original except that every piece of running text is swapped to the target language. Works by rasterizing the page, erasing the source text, and compositing the translation back at the same coordinates — the artwork (photos, charts, gradients, icons, logos) is never regenerated.

Why this pipeline (read first)

Design-exported PDFs usually store the visible text as vector outlines (text converted to curves) or Type3 glyphs. It is NOT editable text, and there is no clean way to "replace the text". But these PDFs almost always also carry a hidden, searchable text layer whose word boxes are pixel-aligned with the visible text, and that layer gives exact positions and sizes for free. So do not try to edit the text, and do not use a generative image model to "repaint" the background: that alters the artwork and cannot be pixel-faithful. Instead, cover and overlay: erase each source glyph run and redraw the translation in place. Everything outside the erased text regions stays pixel-identical to the rendered original. See references/pipeline.md for the deeper rationale and failure modes.
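To make the "positions for free" point concrete, here is a minimal sketch of how a word box from the hidden text layer, given in PDF points, could map onto the rasterized page. The helper names, the pad parameter, and the sample box are illustrative assumptions, not the skill's actual code; the scale of 6.0 is the default render scale listed under Defaults & requirements.

```python
import math

POINTS_PER_INCH = 72  # PDF user space uses 72 points per inch


def bbox_to_pixels(bbox, scale=6.0, pad=0.0):
    """Convert a text-layer box (x0, y0, x1, y1) in PDF points into an
    integer pixel rectangle on a page rendered at `scale`.

    `pad` (in points) grows the box so anti-aliased glyph edges are covered.
    Hypothetical helper, for illustration only.
    """
    x0, y0, x1, y1 = bbox
    left = max(0, math.floor((x0 - pad) * scale))
    top = max(0, math.floor((y0 - pad) * scale))
    right = math.ceil((x1 + pad) * scale)
    bottom = math.ceil((y1 + pad) * scale)
    return left, top, right, bottom


def scale_to_dpi(scale):
    """Render scale 1.0 corresponds to 72 dpi."""
    return scale * POINTS_PER_INCH


word_box = (72.0, 100.5, 130.25, 112.0)  # made-up sample box, in points
print(bbox_to_pixels(word_box))          # (432, 603, 782, 672)
print(scale_to_dpi(6.0))                 # 432.0
```

Rounding outward (floor on the top-left corner, ceil on the bottom-right) means the erase rectangle never ends up smaller than the source glyph run.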

Workflow (run scripts in order from the skill's scripts/)

Pick a work dir (default loc_work). All scripts take SOURCE.pdf [OUTDIR] [PAGE].

  1. Analyze: 01_analyze.py SOURCE.pdf OUTDIR. Confirms how text is stored, lists embedded images, and renders the page. Look at the render and note any text that is baked into a chart/logo/photo image (it will NOT be in the text layer and must be handled as manual_labels).

  2. Extract: 02_extract.py SOURCE.pdf OUTDIR (set env LOC_KEEP='brandA|brandB' to keep more strings in the source language). Produces units.json (one entry per physical text line, with bbox/colour/size), translator_source.txt (a readable, id-tagged payload), and half_*.png.

  3. Translate — dispatch a translator agent. Give it translator_source.txt and half_0/1.png. It must write OUTDIR/translation.json = {"<id>":"译文", ...} covering every id (echo KEEP-AS-IS ids unchanged). Use the prompt in references/agent-prompts.md.

  4. Overrides (only if needed) — write OUTDIR/overrides.json for the per-document bits the auto-engine can't see: text baked into images (manual_labels), text inside thin stroked controls like pills/badges (solid_fill_labels), centered label zones (centered_id_ranges), and drop_ids. Schema + how to measure coordinates: references/overrides.md. Most simple flyers need no overrides; chart-heavy ones need a handful.

  5. Compose: 03_compose.py SOURCE.pdf OUTDIR → final_zh.png. This is the deterministic engine (background reconstruction + size-unified redraw). It needs no tuning per document.

  6. PDF + crops: 04_pdf.py SOURCE.pdf OUTDIR OUT.pdf → the PDF plus zh_L.png and zh_R.png for QA.

  7. Review & iterate — dispatch a reviewer agent to compare zh_L/zh_R.png against half_0/1.png (prompt in references/agent-prompts.md). Fix findings by editing overrides.json / translation.json and re-running 03→04. Repeat until clean. Typical issues and their fixes are listed in references/pipeline.md.
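Before composing, it can help to confirm that translation.json really covers every id, as step 3 requires. The sketch below is one way to do that check; it takes the unit ids as a plain list because the exact layout of units.json is not specified here, and the function name and sample data are hypothetical.

```python
import json


def check_translation(unit_ids, translation):
    """Compare the ids expected from extraction with the keys the translator
    wrote. Returns three lists: missing ids, unexpected extra ids, and ids
    whose translation is blank. Hypothetical helper, not part of the skill.
    """
    expected = [str(i) for i in unit_ids]
    expected_set = set(expected)
    missing = [i for i in expected if i not in translation]
    extra = sorted(k for k in translation if k not in expected_set)
    empty = [i for i in expected
             if i in translation and not str(translation[i]).strip()]
    return missing, extra, empty


# Inline sample standing in for OUTDIR/translation.json
unit_ids = ["1", "2", "3"]
translation = json.loads('{"1": "季度报告", "2": "  ", "4": "多余"}')

missing, extra, empty = check_translation(unit_ids, translation)
print(missing, extra, empty)  # ['3'] ['4'] ['2']
```

Any non-empty list is worth sending back to the translator agent before running 03_compose.py.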

The 3-role loop (translator → compositor → reviewer) mirrors a human localization team; running the reviewer 1–2 times catches the long tail (residual source text, mis-sized blocks).
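The script order above could be wrapped in a small driver so the 03→04 re-run after each review round is a single call. This is only a sketch: the function names are made up, it assumes the scripts are invoked with the current Python interpreter from the skill's scripts/ directory, and it assumes absolute paths for the PDF and work dir.

```python
import subprocess
import sys


def pipeline_commands(source, outdir, out_pdf, first_pass=True):
    """Return the script invocations as argument lists.

    first_pass=True  -> 01 and 02 (then pause for translation/overrides)
    first_pass=False -> 03 and 04 (re-run after each review round)
    """
    py = sys.executable
    if first_pass:
        return [[py, "01_analyze.py", source, outdir],
                [py, "02_extract.py", source, outdir]]
    return [[py, "03_compose.py", source, outdir],
            [py, "04_pdf.py", source, outdir, out_pdf]]


def run_all(commands, scripts_dir):
    """Run each command from the skill's scripts/ directory and stop on the
    first failure. Because cwd changes, pass absolute paths in the commands."""
    for cmd in commands:
        subprocess.run(cmd, cwd=scripts_dir, check=True)


# Building the commands has no side effects; run_all() executes them.
for cmd in pipeline_commands("/tmp/flyer.pdf", "/tmp/loc_work",
                             "/tmp/flyer_zh.pdf", first_pass=False):
    print(" ".join(cmd[1:]))
```

Splitting the run into two halves mirrors the workflow: the translator and any overrides sit between extraction and composition, while reviewer findings only ever require the second half.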

What the engine handles automatically (no config)

  • Light-text-on-dark AND dark-text-on-light panels (picks the right background colour).
  • Flat panels, smooth gradients, and photo backgrounds (chooses fill vs inpaint per region).
  • "Ghost" text-layer entries that have no visible glyphs (auto-skipped — never drawn).
  • Bulleted lists where the bullet got swallowed into the first row's box (auto-realigned).
  • Font size: original size, unified per block so wrapped paragraphs/quotes don't go ragged.
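As an illustration of the last point, per-block size unification could look like the sketch below. The engine's actual rule is not documented here; using the median of the line sizes is an assumption, as are the "block" and "size" keys and the sample values.

```python
from collections import defaultdict
from statistics import median


def unify_sizes(lines):
    """Give every line in a block the same font size (the block's median),
    so small per-line measurement differences do not make a wrapped
    paragraph look ragged. Hypothetical sketch, not the engine's code.
    """
    sizes_by_block = defaultdict(list)
    for line in lines:
        sizes_by_block[line["block"]].append(line["size"])
    block_size = {b: median(s) for b, s in sizes_by_block.items()}
    # Return copies so the measured sizes in the input stay untouched.
    return [{**line, "size": block_size[line["block"]]} for line in lines]


lines = [
    {"id": "1", "block": "quote", "size": 11.8},
    {"id": "2", "block": "quote", "size": 12.0},
    {"id": "3", "block": "quote", "size": 12.1},
    {"id": "4", "block": "title", "size": 28.0},
]
print([l["size"] for l in unify_sizes(lines)])  # [12.0, 12.0, 12.0, 28.0]
```

The median is less sensitive than the mean to one badly measured line, which is why it is used in this sketch.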

Defaults & requirements

  • Python: pymupdf (fitz), numpy, opencv-python (cv2), Pillow.
  • Default CJK font: Hiragino Sans GB.ttc (macOS; W3 regular at index 0, W6 bold at index 2). Override via the "font" key in overrides.json for other languages or fonts.
  • Render scale default 6.0 (≈432 dpi). Lower to 4 for speed while iterating; raise for print.
  • Kept in the source language by default: URLs, emails, brand/product trademarks, text baked into logos, bare numbers/units, and proper names. The translator makes the final call; the KEEP regex pre-flags the obvious cases.
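A KEEP pre-flag of the kind described in the last bullet might be sketched as follows. The patterns, the unit list, and the whole-line matching are assumptions made for illustration; only the LOC_KEEP variable and its 'brandA|brandB' form come from the workflow above, and how 02_extract.py actually applies it is not shown here.

```python
import os
import re

URL = r"https?://\S+|www\.\S+"
EMAIL = r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"
# A bare number with an optional unit from a short, illustrative list.
NUMBER = r"[+-]?\d[\d.,]*\s*(?:%|kg|km|cm|mm|ms|GB|MB|px|pt)?"


def build_keep_regex(extra=None):
    """Combine the default patterns with extra alternatives.

    `extra` uses the same 'brandA|brandB' form as LOC_KEEP; when it is None
    the environment variable is read instead.
    """
    if extra is None:
        extra = os.environ.get("LOC_KEEP", "")
    parts = [URL, EMAIL, NUMBER]
    if extra:
        parts.append(extra)
    # Wrap each part so its internal '|' cannot leak into its neighbours.
    return re.compile("|".join(f"(?:{p})" for p in parts))


def keep_as_is(text, pattern):
    """True when the whole line is something to leave untranslated."""
    return pattern.fullmatch(text.strip()) is not None


pattern = build_keep_regex(extra="Acme|Globex")  # hypothetical brand names
for sample in ["https://example.com/pricing", "info@example.com",
               "1,200 kg", "Acme", "Quarterly revenue grew 42%"]:
    print(sample, keep_as_is(sample, pattern))
```

Only the last sample is flagged for translation; the regex just pre-flags the obvious cases, and the translator still decides the rest.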

Reference files

  • references/pipeline.md — rationale, the removal strategy decision tree, and a catalogue of common visual defects with their exact fixes (streaks, residue, broken pills, size collapse).
  • references/agent-prompts.md — ready-to-paste translator and reviewer agent prompts.
  • references/overrides.md: the overrides.json schema and how to measure chart/pill coordinates.
