richpdf

Name: richpdf
Author: 0r1xByte

Install

mkdir -p .claude/skills/richpdf && curl -L -o skill.zip "https://agentskills.codes/api/skills/download/15625" && unzip -o skill.zip -d .claude/skills/richpdf && rm skill.zip

Installs to .claude/skills/richpdf

Activation

This is the description your AI agent reads to decide when to run this skill — the better it matches your request, the more reliably it fires.

Extract structured wiki-ready markdown from any document — PDF (text or image/scanned), markdown clippings, or plain text. Captures text, embedded images/diagrams, and code samples per page. Images saved to raw/assets/ and referenced inline with the page they came from. Use when user says "/richpdf", "ingest", "process this pdf", "ingest this document", or provides a file path.

380 chars; has a "when" trigger; longer than Claude Code's old 250-char listing cap (fine on current versions)

About this skill

richpdf / ingest

Extracts structured, wiki-ready markdown from any document. Per page it captures:

- 5 key ideas
- All embedded images and diagrams (saved to raw/assets/, described by vision)
- All code samples (detected by font analysis or fenced blocks, language identified)

Each image and code sample is placed inline immediately after the key ideas of the page it came from — so the extracted knowledge stays in context.
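The per-page ordering described above can be sketched as follows. This is a minimal illustration, not the skill's actual output code: the `Page` dataclass, the `render_page` function, and the "Page N" heading are hypothetical names and layout chosen for the example.

```python
from dataclasses import dataclass, field

TICKS = "`" * 3  # fence marker, built this way to keep the example readable


@dataclass
class Page:
    """Hypothetical container for what was extracted from one page."""
    number: int
    key_ideas: list[str]
    images: list[tuple[str, str]] = field(default_factory=list)  # (asset path, vision description)
    code: list[tuple[str, str]] = field(default_factory=list)    # (language, source)


def render_page(page: Page) -> str:
    """Emit key ideas first, then that page's images, then its code samples."""
    lines = [f"Page {page.number}", ""]
    lines += [f"- {idea}" for idea in page.key_ideas]
    for path, description in page.images:
        lines += ["", f"![{description}]({path})"]
    for language, source in page.code:
        lines += ["", f"{TICKS}{language}", source, TICKS]
    return "\n".join(lines)
```

Keeping each asset directly under the ideas of its own page means a reader of the wiki entry never has to jump to an appendix to see the diagram or snippet a key idea refers to.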

Requirements (PDF extraction)

pip install pymupdf anthropic

Usage (PDFs)

python .agent/skills/richpdf/extract.py <file_path> [output_dir]

- file_path: path to the document (PDF, .md, .txt)
- output_dir: where to save the wiki .md (default: wiki/sources/)
- Images are always saved to raw/assets/
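The argument handling described above could look roughly like this. It is a sketch only: how extract.py actually parses its arguments is not shown in the source, and `parse_args` is a hypothetical helper name.

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    """Parse one required path and one optional output directory."""
    parser = argparse.ArgumentParser(prog="extract.py")
    parser.add_argument("file_path", type=Path,
                        help="path to document (PDF, .md, .txt)")
    # nargs="?" makes the second positional optional, so the default applies
    # whenever only the file path is given.
    parser.add_argument("output_dir", nargs="?", type=Path,
                        default=Path("wiki/sources"),
                        help="where to save the wiki .md")
    return parser.parse_args(argv)
```

Calling `parse_args(["book.pdf"])` would leave `output_dir` at `wiki/sources`, matching the documented default.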

For clippings and articles: read the file directly — no script needed.

How assets are detected

| Source type | Text | Code blocks | Images/Diagrams |
| --- | --- | --- | --- |
| PDF (text-based) | pymupdf text extraction | Font analysis: monospace spans flagged as code | pymupdf image extraction per page |
| PDF (image/scanned) | Vision reads whole page | Vision wraps code in ``` fences | Whole page sent to vision; individual images also extracted |
| Markdown clipping | Direct read | ``` fenced blocks | `![[wikilink]]` and `![alt](url)` references captured |
| Plain text | Direct read | ``` fenced blocks | n/a |
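Two of the detection strategies in the table can be sketched with the standard library alone. The first function walks the dictionary that PyMuPDF's `page.get_text("dict")` returns and relies on bit 3 of each span's `flags` marking a monospaced font; whether the skill checks this flag or matches font names instead is an assumption. The second uses regular expressions for fenced blocks and image references. Function names are hypothetical.

```python
import re

MONOSPACED = 8      # bit 3 of a PyMuPDF span's "flags" field
TICKS = "`" * 3     # fence marker

FENCE = re.compile(rf"^{TICKS}(\w*)\n(.*?)^{TICKS}", re.DOTALL | re.MULTILINE)
IMAGE_REF = re.compile(r"!\[\[([^\]]+)\]\]|!\[[^\]]*\]\(([^)\s]+)\)")


def monospace_lines(page_dict):
    """Return the text of lines whose non-blank spans are all monospaced."""
    found = []
    for block in page_dict.get("blocks", []):
        for line in block.get("lines", []):  # image blocks carry no "lines"
            spans = [s for s in line["spans"] if s["text"].strip()]
            if spans and all(s["flags"] & MONOSPACED for s in spans):
                found.append("".join(s["text"] for s in line["spans"]))
    return found


def markdown_assets(text):
    """Return (code blocks, image references) found in markdown or plain text."""
    code = [(lang or "text", body) for lang, body in FENCE.findall(text)]
    images = [wiki or url for wiki, url in IMAGE_REF.findall(text)]
    return code, images
```

With PyMuPDF installed, the first function would be fed `page.get_text("dict")` for each page; adjacent flagged lines would then be merged into a single code sample before language identification.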

Images under 5KB (icons, bullets, decorations) are skipped automatically.
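The size filter amounts to a one-line check on the raw image bytes. The sketch below assumes "5KB" means 5 × 1024 bytes and that the threshold applies to the extracted file size; neither detail is confirmed by the source, and `keep_image` is a hypothetical name.

```python
MIN_IMAGE_BYTES = 5 * 1024  # assumption: "5KB" is read as 5 * 1024 bytes


def keep_image(data: bytes) -> bool:
    """Skip small images such as icons, bullets, and decorations."""
    return len(data) >= MIN_IMAGE_BYTES
```

With PyMuPDF, `data` would typically be the `"image"` entry of the dictionary returned by `Document.extract_image(xref)`.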

Asset file naming

raw/assets/{Source-Title}-p{page}-img{index}.{ext}
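A small helper that fills in this pattern might look like the following. How the skill derives `{Source-Title}` is not stated; replacing runs of non-alphanumeric characters with hyphens is an assumption made for the example, and `asset_name` is a hypothetical name.

```python
import re


def asset_name(source_title: str, page: int, index: int, ext: str) -> str:
    """Build raw/assets/{Source-Title}-p{page}-img{index}.{ext}."""
    # Assumed slug rule: collapse anything that is not a letter or digit into "-".
    slug = re.sub(r"[^A-Za-z0-9]+", "-", source_title).strip("-")
    return f"raw/assets/{slug}-p{page}-img{index}.{ext.lstrip('.')}"
```

For example, `asset_name("Deep Learning Basics", 12, 2, "png")` gives `raw/assets/Deep-Learning-Basics-p12-img2.png`.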

After extraction

Follow all steps in the Ingest operation defined in AGENT.md.

More by 0r1xByte

to-prd

0r1xByte

Turn the current conversation context into a PRD and save it to wiki/llm_conversations/. Use when user wants to create a PRD from the current context.

Safety

Review before install

Runs shell / code

Automated static scan of the SKILL.md and repo. A flag describes what the skill can do — not a verdict. Always review code before installing.

Source & maintenance

Updated: 1mo ago

Repo stars

Loads: ~418 tokens

Stars are for the whole repository, not this skill alone.

Author

0r1xByte

2 skills published
