agentskills.codes

Install

mkdir -p .claude/skills/richpdf && curl -L -o skill.zip "https://agentskills.codes/api/skills/download/15625" && unzip -o skill.zip -d .claude/skills/richpdf && rm skill.zip

Installs to .claude/skills/richpdf

Activation

This is the description your AI agent reads to decide when to run this skill — the better it matches your request, the more reliably it fires.

Extract structured wiki-ready markdown from any document — PDF (text or image/scanned), markdown clippings, or plain text. Captures text, embedded images/diagrams, and code samples per page. Images saved to raw/assets/ and referenced inline with the page they came from. Use when user says "/richpdf", "ingest", "process this pdf", "ingest this document", or provides a file path.

About this skill

richpdf / ingest

Extracts structured, wiki-ready markdown from any document. Per page it captures:

  • 5 key ideas
  • All embedded images and diagrams (saved to raw/assets/, described by vision)
  • All code samples (detected by font analysis or fenced blocks, language identified)

Each image and code sample is placed inline immediately after the key ideas of the page it came from — so the extracted knowledge stays in context.

Requirements (PDF extraction)

pip install pymupdf anthropic

Usage (PDFs)

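python .agent/skills/richpdf/extract.py <file_path> [output_dir]
  • file_path: path to the document (PDF, .md, .txt)
  • output_dir: where to save the wiki .md (default: wiki/sources/)
  • Images are always saved to raw/assets/

The install command above unpacks the skill into .claude/skills/richpdf, so point the command at that directory if your setup does not use .agent/skills/.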

For clippings and articles: read the file directly — no script needed.
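When a clipping is read directly, its fenced code blocks and image references can be picked out with a few regular expressions. The following is a minimal sketch, not the skill's actual logic: the function name `scan_clipping` and all three patterns are assumptions made for illustration.

```python
import re

# Hypothetical patterns; the skill's real parsing rules are not shown in the source.
FENCE_RE = re.compile(r"^```(\w*)[^\n]*\n(.*?)^```[ \t]*$", re.DOTALL | re.MULTILINE)
WIKILINK_IMG_RE = re.compile(r"!\[\[([^\]]+)\]\]")
MD_IMG_RE = re.compile(r"!\[([^\]]*)\]\(([^)\s]+)[^)]*\)")


def scan_clipping(text: str) -> dict:
    """Collect fenced code blocks and image references from markdown text."""
    code_blocks = [
        {"language": m.group(1) or "unknown", "code": m.group(2)}
        for m in FENCE_RE.finditer(text)
    ]
    # Remove fenced blocks first so image syntax inside code is not counted.
    prose = FENCE_RE.sub("", text)
    images = [m.group(1) for m in WIKILINK_IMG_RE.finditer(prose)]
    images += [m.group(2) for m in MD_IMG_RE.finditer(prose)]
    return {"code_blocks": code_blocks, "images": images}
```

Wikilink targets are listed before standard markdown image URLs; a real ingest step would probably want to keep document order instead.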

How assets are detected

| Source type | Text | Code blocks | Images/Diagrams |
|---|---|---|---|
| PDF (text-based) | pymupdf text extraction | Font analysis: monospace spans flagged as code | pymupdf image extraction per page |
| PDF (image/scanned) | Vision reads whole page | Vision wraps code in `` ``` `` fences | Whole page sent to vision; individual images also extracted |
| Markdown clipping | Direct read | `` ``` `` fenced blocks | `![[wikilink]]` and `![alt](url)` references captured |
| Plain text | Direct read | `` ``` `` fenced blocks | n/a |
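The font-analysis row for text-based PDFs can be illustrated with a short sketch. It assumes the page has already been converted to the nested dict that PyMuPDF's `page.get_text("dict")` returns (blocks, then lines, then spans), where bit 3 of a span's `flags` marks a monospaced font. The helper names and the font-name hints are hypothetical; extract.py may detect code differently.

```python
MONOSPACE_FLAG = 1 << 3  # PyMuPDF sets this bit in span["flags"] for monospaced fonts

# Hypothetical fallback hints for fonts that do not set the flag.
MONO_NAME_HINTS = ("mono", "courier", "consolas", "menlo")


def is_code_span(span: dict) -> bool:
    """Treat a span as code if its font is flagged or named as monospaced."""
    if span.get("flags", 0) & MONOSPACE_FLAG:
        return True
    font = span.get("font", "").lower()
    return any(hint in font for hint in MONO_NAME_HINTS)


def code_lines(page_dict: dict) -> list[str]:
    """Return text lines whose non-blank spans are all monospaced."""
    found = []
    for block in page_dict.get("blocks", []):
        for line in block.get("lines", []):  # image blocks carry no "lines"
            spans = [s for s in line["spans"] if s["text"].strip()]
            if spans and all(is_code_span(s) for s in spans):
                found.append("".join(s["text"] for s in line["spans"]))
    return found
```

Requiring every non-blank span on a line to be monospaced keeps inline identifiers in ordinary prose from being flagged as code blocks; a looser rule would catch more code at the cost of false positives.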

Images under 5KB (icons, bullets, decorations) are skipped automatically.
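The size filter amounts to a single comparison on the extracted image bytes. This sketch assumes "5KB" means 5 × 1024 bytes and that the check runs on the raw image data; `keep_image` and `MIN_IMAGE_BYTES` are illustrative names, not taken from the skill.

```python
MIN_IMAGE_BYTES = 5 * 1024  # assumption: "5KB" is read as 5 * 1024 bytes


def keep_image(image_bytes: bytes, min_bytes: int = MIN_IMAGE_BYTES) -> bool:
    """Return False for images below the threshold (icons, bullets, decorations)."""
    return len(image_bytes) >= min_bytes


def filter_images(images: list[bytes]) -> list[bytes]:
    """Drop undersized images while preserving the original order."""
    return [data for data in images if keep_image(data)]
```

With PyMuPDF, the bytes would typically come from `doc.extract_image(xref)["image"]` for each xref listed by `page.get_images()`.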

Asset file naming

raw/assets/{Source-Title}-p{page}-img{index}.{ext}
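The naming pattern can be filled in with a small helper. This is a sketch under stated assumptions: the source does not say how `{Source-Title}` is derived or whether page and image numbers start at 0 or 1, so the rule here (whitespace becomes hyphens, other path-unsafe characters are dropped) and the name `asset_path` are hypothetical.

```python
import re
from pathlib import Path


def asset_path(source_title: str, page: int, index: int, ext: str,
               assets_dir: str = "raw/assets") -> Path:
    """Build raw/assets/{Source-Title}-p{page}-img{index}.{ext}."""
    # Assumption: whitespace runs become hyphens and unsafe characters are dropped.
    slug = re.sub(r"\s+", "-", source_title.strip())
    slug = re.sub(r"[^\w\-]", "", slug)
    return Path(assets_dir) / f"{slug}-p{page}-img{index}.{ext.lstrip('.')}"
```

For example, `asset_path("Attention Is All You Need", 3, 1, "png")` yields `raw/assets/Attention-Is-All-You-Need-p3-img1.png`.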

After extraction

Follow all steps in the Ingest operation defined in AGENT.md.
