Localize Infographic PDF (artwork-preserving)
Produce a translated version of a design PDF that looks identical to the original except that every piece of running text is swapped to the target language. Works by rasterizing the page, erasing the source text, and compositing the translation back at the same coordinates — the artwork (photos, charts, gradients, icons, logos) is never regenerated.
Why this pipeline (read first)
Design-exported PDFs usually store the visible text as vector outlines (text-to-curves) or
Type3 glyphs — it is NOT editable text, and there is no clean way to "replace the text".
But these PDFs almost always also carry a hidden, searchable text layer whose word boxes are
pixel-aligned with the visible text. That layer gives exact positions/sizes for free.
So: do not try to edit text, and do not use a generative image model to "repaint" the
background — that alters the artwork and cannot be pixel-faithful. Instead, cover-and-overlay:
erase each source glyph run and redraw the translation in place. Pixels outside the erased runs are left untouched.
See references/pipeline.md for the deeper rationale and failure modes.
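As a rough illustration of how the hidden text layer can supply positions, the sketch below merges word boxes into one box per physical line. It assumes the layer is read with PyMuPDF's page.get_text("words"), which yields tuples of (x0, y0, x1, y1, text, block_no, line_no, word_no). The function names are hypothetical and this is not necessarily how 02_extract.py does it.

```python
from collections import defaultdict


def group_words_into_lines(words):
    """Merge word tuples into one record per physical line.

    Each word is (x0, y0, x1, y1, text, block_no, line_no, word_no),
    the shape PyMuPDF returns for page.get_text("words").
    """
    lines = defaultdict(list)
    for w in words:
        lines[(w[5], w[6])].append(w)  # key: (block_no, line_no)

    result = []
    for key in sorted(lines):
        ws = sorted(lines[key], key=lambda w: w[7])  # order by word_no
        result.append({
            # Union of the word boxes gives the line box.
            "bbox": (min(w[0] for w in ws), min(w[1] for w in ws),
                     max(w[2] for w in ws), max(w[3] for w in ws)),
            "text": " ".join(w[4] for w in ws),
        })
    return result


def read_line_boxes(pdf_path, page_index=0):
    """Hypothetical helper: read line boxes from one page of a PDF."""
    import fitz  # PyMuPDF, imported lazily so the grouping works without it

    with fitz.open(pdf_path) as doc:
        return group_words_into_lines(doc[page_index].get_text("words"))
```

The boxes come back in PDF points; they still need to be scaled to the render resolution before anything is erased or drawn.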
Workflow (run scripts in order from the skill's scripts/)
Pick a work dir (default loc_work). All scripts take SOURCE.pdf [OUTDIR] [PAGE].
- Analyze: 01_analyze.py SOURCE.pdf OUTDIR. Confirms how text is stored, lists embedded images, and renders the page. Look at the render and note any text that is baked into a chart/logo/photo image (it will NOT be in the text layer and must be handled as manual_labels).
- Extract: 02_extract.py SOURCE.pdf OUTDIR (set env LOC_KEEP='brandA|brandB' to keep more strings in the source language). Produces units.json (one entry per physical text line, with bbox/colour/size), translator_source.txt (a readable, id-tagged payload), and half_*.png.
- Translate: dispatch a translator agent. Give it translator_source.txt and half_0/1.png. It must write OUTDIR/translation.json = {"<id>":"译文", ...} (译文 = the translated text) covering every id (echo KEEP-AS-IS ids unchanged). Use the prompt in references/agent-prompts.md.
- Overrides (only if needed): write OUTDIR/overrides.json for the per-document bits the auto-engine can't see: text baked into images (manual_labels), text inside thin stroked controls like pills/badges (solid_fill_labels), centered label zones (centered_id_ranges), and drop_ids. Schema + how to measure coordinates: references/overrides.md. Most simple flyers need no overrides; chart-heavy ones need a handful.
- Compose: 03_compose.py SOURCE.pdf OUTDIR → final_zh.png. This is the deterministic engine (background reconstruction + size-unified redraw). It needs no tuning per document.
- PDF + crops: 04_pdf.py SOURCE.pdf OUTDIR OUT.pdf → the PDF plus zh_L/zh_R.png for QA.
- Review & iterate: dispatch a reviewer agent to compare zh_L/zh_R.png against half_0/1.png (prompt in references/agent-prompts.md). Fix findings by editing overrides.json/translation.json and re-running 03→04. Repeat until clean. Typical issues and their fixes are listed in references/pipeline.md.
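Because translation.json has to cover every id, a quick check before composing can save a round trip. The sketch below is one possible way to do it. The helper names are hypothetical, and since the formats of units.json and translator_source.txt are not shown here, the expected ids are passed in by the caller.

```python
import json


def missing_translation_ids(expected_ids, translation):
    """Return the ids that have no entry, or only a blank entry."""
    return [i for i in expected_ids
            if not str(translation.get(i, "")).strip()]


def check_translation_file(path, expected_ids):
    """Load translation.json and fail loudly if any id is uncovered."""
    with open(path, encoding="utf-8") as fh:
        translation = json.load(fh)
    missing = missing_translation_ids(expected_ids, translation)
    if missing:
        raise ValueError(f"translation.json is missing ids: {missing}")
    return translation
```

A blank string counts as missing here, on the assumption that KEEP-AS-IS ids are echoed with their original text and never left empty.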
The 3-role loop (translator → compositor → reviewer) mirrors a human localization team; running the reviewer 1–2 times catches the long tail (residual source text, mis-sized blocks).
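The re-run half of that loop (03 then 04 after each round of fixes) could be wrapped roughly as follows. The argument order matches the commands listed in the workflow. The function names and the scripts directory argument are assumptions for illustration only.

```python
import subprocess
import sys
from pathlib import Path


def rerun_commands(scripts_dir, source_pdf, outdir, out_pdf):
    """Build the argv lists for the compose and PDF steps, in order."""
    scripts = Path(scripts_dir)
    return [
        [sys.executable, str(scripts / "03_compose.py"), source_pdf, outdir],
        [sys.executable, str(scripts / "04_pdf.py"), source_pdf, outdir, out_pdf],
    ]


def rerun(scripts_dir, source_pdf, outdir, out_pdf):
    """Run 03 then 04, stopping at the first failure."""
    for cmd in rerun_commands(scripts_dir, source_pdf, outdir, out_pdf):
        subprocess.run(cmd, check=True)
```

Using check=True stops the loop if the compose step fails, so the PDF step never runs against a stale final_zh.png.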
What the engine handles automatically (no config)
- Light-text-on-dark AND dark-text-on-light panels (picks the right background colour).
- Flat panels, smooth gradients, and photo backgrounds (chooses fill vs inpaint per region).
- "Ghost" text-layer entries that have no visible glyphs (auto-skipped — never drawn).
- Bulleted lists where the bullet got swallowed into the first row's box (auto-realigned).
- Font size: original size, unified per block so wrapped paragraphs/quotes don't go ragged.
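To make the first two items more concrete, here is a minimal sketch of one way such decisions could be taken from pixels sampled just outside a text box. It is not the engine's actual logic: the threshold, the sampling strategy, and all function names are assumptions.

```python
from statistics import median, pstdev


def luminance(rgb):
    """Approximate luminance using the Rec. 709 weights."""
    r, g, b = rgb
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def choose_removal(border_pixels, flat_threshold=6.0):
    """Pick a removal strategy from (r, g, b) samples around the text box.

    A near-uniform border suggests a flat panel, so a solid fill with the
    median colour is enough; anything busier is left to inpainting.
    """
    spread = pstdev(luminance(p) for p in border_pixels)
    if spread <= flat_threshold:
        fill = tuple(int(median(channel)) for channel in zip(*border_pixels))
        return "fill", fill
    return "inpaint", None


def text_is_light(text_rgb, background_rgb):
    """True for light-on-dark text, False for dark-on-light."""
    return luminance(text_rgb) > luminance(background_rgb)
```

A smooth gradient would likely land in the "inpaint" branch under this simple test; a real engine would probably need a third case for it.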
Defaults & requirements
- Python: pymupdf (fitz), numpy, opencv-python (cv2), Pillow.
- CJK font default: Hiragino Sans GB.ttc (macOS; W3 regular idx 0 / W6 bold idx 2). Override via overrides.json "font" for other languages/fonts.
- Render scale default 6.0 (≈432 dpi). Lower to 4 for speed while iterating; raise for print.
- Keep in source language by default: URLs, emails, brand/product trademarks, logos' baked text, bare numbers/units, and proper names. The translator decides; the KEEP regex pre-flags the obvious.
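The render-scale figure follows from PDF user space being 72 points per inch, so a scale of 6.0 gives 6 × 72 = 432 dpi. The sketch below shows that arithmetic and how a text box in points might be mapped to pixel coordinates. The function names are hypothetical, and rounding outward is a design assumption so that the erased area never clips a glyph edge.

```python
import math

PDF_POINTS_PER_INCH = 72


def scale_to_dpi(scale):
    """Convert a render scale factor to dots per inch."""
    return scale * PDF_POINTS_PER_INCH


def bbox_to_pixels(bbox, scale):
    """Map a (x0, y0, x1, y1) box in PDF points to pixel coordinates.

    The top-left corner is floored and the bottom-right corner is
    ceiled, so the pixel box always contains the original box.
    """
    x0, y0, x1, y1 = bbox
    return (math.floor(x0 * scale), math.floor(y0 * scale),
            math.ceil(x1 * scale), math.ceil(y1 * scale))
```

At the faster iteration setting of 4 the same formula gives 288 dpi.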
Reference files
- references/pipeline.md: rationale, the removal strategy decision tree, and a catalogue of common visual defects with their exact fixes (streaks, residue, broken pills, size collapse).
- references/agent-prompts.md: ready-to-paste translator and reviewer agent prompts.
- references/overrides.md: overrides.json schema and how to measure chart/pill coordinates.