research-gather

Name: research-gather
Author: YuriNakayama

Gathers and lists research resources (academic papers, patents, websites, business cases) for specified research domains. Works as the "resource collection" phase after domain mapping: takes clustering results, user keywords, or domain descriptions as input and produces structured resource lists per domain.

Install

mkdir -p .claude/skills/research-gather && curl -L -o skill.zip "https://agentskills.codes/api/skills/download/14453" && unzip -o skill.zip -d .claude/skills/research-gather && rm skill.zip

Installs to .claude/skills/research-gather

Activation

This is the description your AI agent reads to decide when to run this skill — the better it matches your request, the more reliably it fires.

Gathers and lists research resources (academic papers, patents, websites, business cases) for specified research domains. Works as the "resource collection" phase after domain mapping — takes clustering results, user keywords, or domain descriptions as input and produces structured resource lists per domain. Use this skill when the user wants to "collect papers for each area", "find patents in this domain", "gather resources for these topics", "list relevant papers and patents", "arXivで論文を集めて", "各領域のリソースを収集", "特許と論文のリストを作って", "この分野の文献を集めて", or any request to systematically find and list research materials across multiple domains. Also triggers when the user has clustering output and wants to proceed to resource collection, or when they provide keywords and want a literature/patent list.

797 chars · ✓ has a “when” trigger · longer than Claude Code's old 250-char listing cap (fine on current versions)

About this skill

Research Gather — Resource Collection by Domain

Collects academic papers, patents, websites, and business cases for specified research domains and produces structured resource lists. This skill sits between domain mapping (research-clustering) and detailed reports (research-retrieval) in the research pipeline.

Auto Mode (`--auto`)

When $ARGUMENTS contains --auto, run the entire workflow non-interactively — skip ALL AskUserQuestion calls and use the following defaults:

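Parameter	Default Value
Resource Types	学術論文 + 特許 (academic papers + patents)
Time Range	直近4年 (last 4 years)
Collection Depth	標準（各5〜10件） (standard, 5 to 10 items per domain)
Domain Selection	すべてのクラスタ (all clusters)
Next Action (Step 6)	完了（自動終了） (done, exit automatically)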

In --auto mode, the remaining text in $ARGUMENTS (after removing --auto) is used as the input (file path or keywords). For example: /research-gather --auto docs/research/clustering-result.md → input is the clustering result file.

If $ARGUMENTS does NOT contain --auto, proceed with the normal interactive workflow below.
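The flag handling described above can be sketched in Python. This is only an illustration of the rule "remove --auto, treat the rest as input"; the names parse_arguments, ParsedArguments, and AUTO_DEFAULTS are hypothetical and do not come from the skill, and the dictionary keys are invented labels for the rows of the defaults table.

```python
from dataclasses import dataclass

# Defaults copied from the Auto Mode table; the key names are hypothetical.
AUTO_DEFAULTS = {
    "resource_types": ["学術論文", "特許"],
    "time_range": "直近4年",
    "collection_depth": "標準（各5〜10件）",
    "domain_selection": "すべてのクラスタ",
    "next_action": "完了（自動終了）",
}


@dataclass
class ParsedArguments:
    auto: bool        # True when --auto was present
    input_text: str   # file path or keywords left after removing --auto


def parse_arguments(arguments: str) -> ParsedArguments:
    """Split the raw argument string and separate the --auto flag from the input."""
    tokens = arguments.split()
    auto = "--auto" in tokens
    remaining = [token for token in tokens if token != "--auto"]
    return ParsedArguments(auto=auto, input_text=" ".join(remaining))


parsed = parse_arguments("--auto docs/research/clustering-result.md")
settings = dict(AUTO_DEFAULTS) if parsed.auto else {}
```

With the example invocation from the text, parsed.input_text is the clustering result path and settings holds the table defaults; without --auto, settings stays empty and the interactive hearings would fill it in.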

Pipeline Position

research-clustering → research-gather → research-retrieval
(domain mapping)      (resource lists)   (paper deep-dive)

Workflow

Step 1: Parse Input

Determine the input type and extract domain information.

Supported input types:

Clustering output file — Markdown file generated by research-clustering. Parse the cluster structure (names, keywords, overview) directly.
User keywords/text — Keywords, phrases, or natural-language descriptions provided in conversation. Extract domains and search terms from these.
Existing Markdown file — A user-prepared file listing research domains or topics.

Detect clustering output by its characteristic structure: a "Cluster Summary" table and "Cluster Details" sections containing keywords and a research strategy. Use the cluster names, keywords, and strategies as the basis for resource collection.

For user keywords/text, group related terms into tentative domains before proceeding. If the grouping is ambiguous, confirm with the user.
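As a rough sketch of the input-type decision in this step, the check below looks for the two marker strings named above. Requiring both markers, the function names, and the returned labels are assumptions made for illustration; the skill itself does not prescribe an implementation.

```python
# Marker strings taken from the description of clustering output above.
CLUSTERING_MARKERS = ("Cluster Summary", "Cluster Details")


def looks_like_clustering_output(markdown_text: str) -> bool:
    """Assumption: a file counts as clustering output only if both markers appear."""
    return all(marker in markdown_text for marker in CLUSTERING_MARKERS)


def classify_input(text: str, from_file: bool) -> str:
    """Return one of three hypothetical labels for the supported input types."""
    if from_file and looks_like_clustering_output(text):
        return "clustering_output"
    if from_file:
        return "markdown_file"
    return "user_keywords"
```

Text typed in conversation is always treated as keywords here, even if it happens to contain the marker strings; that is a simplification, not a rule from the skill.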

Step 2: User Hearing

--auto mode: Skip this entire step. Use the default values from the Auto Mode table above.

Confirm research parameters via AskUserQuestion. Skip hearings for parameters already specified by the user in their request.

Hearing 1: Resource Types

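AskUserQuestion:
  question: "どの種類のリソースを収集しますか？（複数選択可）"  # "Which types of resources should be collected? (multiple selections allowed)"
  header: "リソース種別"  # "Resource types"
  multiSelect: true
  options:
    - label: "学術論文"  # "Academic papers"
      description: "arXiv、IEEE、ACM等の学術論文を検索"  # "Search academic papers on arXiv, IEEE, ACM, etc."
    - label: "特許"  # "Patents"
      description: "Google Patents、USPTO、J-PlatPat、Espacenet等から検索"  # "Search Google Patents, USPTO, J-PlatPat, Espacenet, etc."
    - label: "技術情報"  # "Technical information"
      description: "技術ブログ、カンファレンス発表、OSSプロジェクト等"  # "Technical blogs, conference talks, OSS projects, etc."
    - label: "ビジネス事例"  # "Business cases"
      description: "企業導入事例、市場レポート、業界動向"  # "Enterprise adoption cases, market reports, industry trends"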

Hearing 2: Time Range

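AskUserQuestion:
  question: "対象期間を指定してください"  # "Specify the target period"
  header: "対象期間"  # "Target period"
  multiSelect: false
  options:
    - label: "直近4年（推奨）"  # "Last 4 years (recommended)"
      description: "2022年〜現在の結果を対象"  # "Covers results from 2022 to the present"
    - label: "直近2年"  # "Last 2 years"
      description: "最新トレンドに絞る"  # "Narrow to the latest trends"
    - label: "直近7年"  # "Last 7 years"
      description: "より広い範囲をカバー"  # "Cover a wider range"
    - label: "カスタム"  # "Custom"
      description: "任意の期間を指定"  # "Specify any period"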

If "カスタム" (Custom) is selected, ask a follow-up question for the specific year range.

Hearing 3: Collection Depth

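AskUserQuestion:
  question: "各領域あたりの収集件数はどの程度にしますか？"  # "About how many items should be collected per domain?"
  header: "収集件数"  # "Number of items"
  multiSelect: false
  options:
    - label: "標準（各5〜10件）（推奨）"  # "Standard (5 to 10 per domain) (recommended)"
      description: "主要なリソースを網羅。バランスの良い量"  # "Covers the main resources. A well-balanced amount"
    - label: "広範（各10〜20件）"  # "Broad (10 to 20 per domain)"
      description: "できるだけ多くのリソースを収集。時間がかかる場合あり"  # "Collect as many resources as possible. May take longer"
    - label: "簡潔（各3〜5件）"  # "Concise (3 to 5 per domain)"
      description: "代表的なリソースのみ。素早く概観を得たい場合"  # "Representative resources only. For a quick overview"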

Hearing 4: Domain Selection (clustering input only)

If the input is from clustering and contains multiple clusters, ask which domains to investigate:

AskUserQuestion:
  question: "どのクラスタのリソースを収集しますか？"  # "Which clusters should resources be collected for?"
  header: "対象クラスタ"  # "Target clusters"
  multiSelect: true
  options:
    (dynamically generated from cluster names; show up to 4, and if there are more than 4 clusters, group them or offer "すべて" ("All") as the first option)
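One possible way to generate that option list is sketched below. The text allows either grouping clusters or offering "すべて" first when there are more than four; this sketch assumes the second choice and fills the remaining three slots with the first three cluster names. The function name and that fill-in policy are hypothetical.

```python
MAX_OPTIONS = 4       # the "show up to 4" limit from the text above
ALL_LABEL = "すべて"   # "All", offered first when there are too many clusters


def build_cluster_options(cluster_names: list[str]) -> list[str]:
    """Return at most MAX_OPTIONS labels for the domain selection question."""
    if len(cluster_names) <= MAX_OPTIONS:
        return list(cluster_names)
    # Assumption: keep "すべて" plus the first three clusters; grouping is not modeled.
    return [ALL_LABEL] + cluster_names[: MAX_OPTIONS - 1]
```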

Step 3: Resource Collection

For each target domain, search for resources in parallel using the Agent tool to spawn subagents.

3a: Academic Papers (arXiv-first)

Papers are searched with arXiv as the primary source because it provides open-access full text, stable URLs, and consistent metadata.

Search strategy:

arXiv search via WebSearch: Query site:arxiv.org "{domain keyword}" {year range} to find relevant papers. Also search for survey/review papers: site:arxiv.org "{domain keyword}" survey OR review.
Semantic Scholar / Google Scholar fallback: If arXiv results are insufficient (e.g., the domain is not well-represented on arXiv), broaden to "{domain keyword}" paper {year} on general web search.
IEEE/ACM for specific domains: For domains where conference proceedings are important (networking, systems, HCI), also search site:ieee.org or site:dl.acm.org.
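The query templates in the search strategy above can be assembled as plain strings. In this sketch the year range is passed in as already-formatted text, because the skill does not say how {year range} should be written; the function names and the exact shape of the IEEE/ACM queries are assumptions.

```python
def build_paper_queries(keyword: str, year_range: str,
                        include_proceedings: bool = False) -> list[str]:
    """Build the arXiv-first queries, optionally adding IEEE and ACM site searches."""
    queries = [
        f'site:arxiv.org "{keyword}" {year_range}',
        f'site:arxiv.org "{keyword}" survey OR review',
    ]
    if include_proceedings:
        # Assumption: reuse the quoted keyword for the proceedings sites.
        queries.append(f'site:ieee.org "{keyword}"')
        queries.append(f'site:dl.acm.org "{keyword}"')
    return queries


def build_fallback_query(keyword: str, year: int) -> str:
    """General web search used when arXiv coverage is thin."""
    return f'"{keyword}" paper {year}'
```

Each returned string would be handed to WebSearch as-is; running the searches is outside this sketch.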

For each paper, collect:

Title
Authors (first author + "et al." for >3 authors)
Year
Venue (arXiv, conference name, journal)
arXiv ID or DOI
URL (prefer arxiv.org/abs/ format)
1-2 sentence summary
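A minimal record for the fields listed above might look like the following. The class name, field names, and the comma-joined format for three or fewer authors are illustrative assumptions; only the "first author + et al. for more than 3 authors" rule comes from the list.

```python
from dataclasses import dataclass


def format_authors(authors: list[str]) -> str:
    """Abbreviate to 'First Author et al.' when there are more than 3 authors."""
    if len(authors) > 3:
        return f"{authors[0]} et al."
    return ", ".join(authors)


@dataclass
class PaperRecord:
    title: str
    authors: str      # already formatted with format_authors
    year: int
    venue: str        # arXiv, conference name, or journal
    identifier: str   # arXiv ID or DOI, copied from a search result
    url: str          # prefer the arxiv.org/abs/ form
    summary: str      # 1-2 sentences
```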

CRITICAL — Anti-hallucination rule for URLs:

Only record URLs that appear verbatim in WebSearch results. NEVER construct or guess arXiv IDs.
If a search result shows a title but no direct URL, run a follow-up WebSearch for site:arxiv.org "{exact paper title}" to obtain the real URL.
Do NOT fabricate arXiv IDs by combining partial numbers. Every URL must come from a search result or a WebFetch response.
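The "verbatim only" rule can be enforced mechanically: keep a candidate only if its URL literally occurs in the raw search result text. The sketch below assumes results are available as plain strings and covers only new-style arXiv IDs (four digits, a dot, four or five digits, optional version); the function names are hypothetical and the IDs in the usage lines are placeholders, not real papers.

```python
import re

# New-style arXiv abstract URLs only; old-style IDs such as hep-th/... are not covered.
ARXIV_ABS_RE = re.compile(r"https?://arxiv\.org/abs/\d{4}\.\d{4,5}(?:v\d+)?")


def urls_seen_in_results(result_texts: list[str]) -> set[str]:
    """Collect every arXiv abstract URL that appears verbatim in the search output."""
    seen: set[str] = set()
    for text in result_texts:
        seen.update(ARXIV_ABS_RE.findall(text))
    return seen


def filter_to_seen(candidates: list[dict], seen: set[str]) -> tuple[list[dict], list[dict]]:
    """Split candidates into those whose URL was seen verbatim and those to drop."""
    kept = [c for c in candidates if c["url"] in seen]
    dropped = [c for c in candidates if c["url"] not in seen]
    return kept, dropped


results = [
    "Example Paper One - https://arxiv.org/abs/2301.00001 - abstract text",
    "Example Paper Two (https://arxiv.org/abs/2302.00002v2)",
]
candidates = [
    {"title": "Example Paper One", "url": "https://arxiv.org/abs/2301.00001"},
    {"title": "Guessed Paper", "url": "https://arxiv.org/abs/2303.00003"},
]
kept, dropped = filter_to_seen(candidates, urls_seen_in_results(results))
```

Here the second candidate is dropped because its URL never appeared in any result; under the rule above it would need a follow-up search rather than a guessed ID.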

Quality signals to prioritize:

High citation count (if visible in search results)
Survey/review papers (valuable for overview)
Papers from top venues (NeurIPS, ICML, CVPR, ACL, etc.)
Recent papers with significant attention

3b: Patents

Search across multiple patent databases to get broad coverage.

Search strategy:

Google Patents (primary): site:patents.google.com "{domain keyword}" — provides international coverage with English abstracts
USPTO: Search for US patents when the domain has strong US presence
J-PlatPat: Search in Japanese for Japan-specific patents — useful when keywords have Japanese equivalents
Espacenet: Search for European patents when relevant

For each patent, collect:

Title
Patent number (e.g., US11234567B2, JP2023-123456)
Assignee/Applicant
Filing year
Patent office (USPTO/JPO/EPO/WIPO)
URL
1 sentence summary of the invention

CRITICAL — Anti-hallucination rule for URLs:

Only record patent numbers and URLs that appear verbatim in search results or WebFetch responses.
NEVER fabricate patent numbers. If a search result mentions a patent without a clear number, run a follow-up search to obtain the exact number and URL.

Prioritize:

Patents from major companies in the domain
Recent patents (within the specified time range)
Patents with many citations or family members

3c: Technical Resources

Search for high-quality technical content.

Search targets:

Technical blogs from major companies (Google AI Blog, Meta Research, Microsoft Research, etc.)
Conference talks and presentations (from slides/video sharing sites)
Notable OSS projects on GitHub
Technical standards and specifications

For each resource, collect:

Title
Source/Author
Year/Date
Type (blog/talk/OSS/standard)
URL
1 sentence description

3d: Business Cases

Search for enterprise adoption and market information.

Search targets:

Case studies from consulting firms and vendors
Industry reports and market analysis
Press releases about deployments
Industry conference presentations

For each case, collect:

Title
Company/Organization
Year
Type (case study/report/press release)
URL
1 sentence summary

Step 4: Organize, Deduplicate, and Verify

After collection:

Remove duplicate entries (same paper appearing from different searches)
Sort within each category by year (newest first), then by relevance
Verify URLs are properly formatted (especially arXiv links — ensure arxiv.org/abs/ format)
URL Verification (MANDATORY) — verify every collected resource against its URL:
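The deduplication and sorting items in this list can be sketched as follows, before the verification sub-steps begin. The sketch assumes each entry is a dictionary with url and year keys and an optional numeric relevance score; that score, the URL-based duplicate key, and the function name are assumptions, since the skill does not define how relevance or identity is measured.

```python
def dedupe_and_sort(entries: list[dict]) -> list[dict]:
    """Drop entries with a repeated URL, then sort by year (newest first) and relevance."""
    unique: dict[str, dict] = {}
    for entry in entries:
        key = entry["url"].rstrip("/")   # treat a trailing slash as the same URL
        unique.setdefault(key, entry)    # keep the first occurrence
    return sorted(
        unique.values(),
        key=lambda e: (-e["year"], -e.get("relevance", 0.0)),
    )
```

The URLs are only compared, never rewritten, so the anti-hallucination rule about recording URLs verbatim still holds.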

4a: Academic Paper URL Verification

For each paper with an arXiv URL:

WebFetch the arXiv abstract page (arxiv.org/abs/XXXX.XXXXX)
Compare the fetched title with the collected title
Apply one of the following actions:
- Match: Title matches (allowing minor formatting differences) → mark as verified
- Mismatch: Title does not match → discard the entry and log a warning. Do NOT keep entries with mismatched title/URL pairs.
- Fetch failed: URL is unreachable or returns an error → discard the entry

For papers with non-arXiv URLs (IEEE, ACM, etc.):

WebFetch the URL
Verify the page contains the expected paper title
Apply the same match/mismatch/failed logic above
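The match / mismatch / fetch-failed decision can be expressed as a small pure function. This sketch assumes the title has already been extracted from the WebFetch response (or is None when the fetch failed); the normalization used for "minor formatting differences" (Unicode NFKC, case folding, punctuation and whitespace collapsing) and the status labels are assumptions.

```python
import re
import unicodedata
from typing import Optional


def normalize_title(title: str) -> str:
    """Fold case, unify Unicode forms, and collapse punctuation and whitespace."""
    text = unicodedata.normalize("NFKC", title).casefold()
    text = re.sub(r"[^\w\s]", " ", text)
    return " ".join(text.split())


def verify_entry(collected_title: str, fetched_title: Optional[str]) -> str:
    """Return 'verified', 'mismatch', or 'fetch_failed' for one collected entry."""
    if fetched_title is None:
        return "fetch_failed"
    if normalize_title(collected_title) == normalize_title(fetched_title):
        return "verified"
    return "mismatch"
```

Only entries that come back as verified would be kept; the other two outcomes lead to discarding the entry, as described above.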

4b: Patent URL Verification

For each patent:

WebFetch the patent URL (Google Patents, USPTO, etc.)
Verify the patent number and title match
Discard entries where the patent number or title does not match
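Patent numbers are often printed with spaces, commas, or hyphens, so a comparison usually needs light normalization first. The sketch below assumes the number and title have already been extracted from the fetched page; the normalization rule, dictionary keys, and function names are hypothetical.

```python
import re


def normalize_patent_number(number: str) -> str:
    """Strip spaces, commas, hyphens, and slashes, then uppercase."""
    return re.sub(r"[\s,\-/]", "", number).upper()


def patent_entry_matches(collected: dict, fetched: dict) -> bool:
    """True only when both the patent number and the title agree."""
    same_number = (
        normalize_patent_number(collected["number"])
        == normalize_patent_number(fetched["number"])
    )
    same_title = (
        collected["title"].strip().casefold() == fetched["title"].strip().casefold()
    )
    return same_number and same_title
```

Entries for which patent_entry_matches returns False would be discarded under the rule above.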

4c: Technical Resource / Business Case URL Verification

For each resource:

WebFetch the URL
Verify the page is accessible and contains content related to the collected title
Discard entries where the URL is unreachable or content is unrelated

4d: Verification Summary

After verification, log the results:

Total collected: N entries
Verified: M entries
Discarded (mismatch): X entries
Discarded (unreachable): Y entries
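The four-line log above can be produced by tallying one status per entry. The status strings used here ("verified", "mismatch", "fetch_failed") and the function name are assumptions for illustration.

```python
from collections import Counter


def summarize_verification(statuses: list[str]) -> str:
    """Format the verification summary from a list of per-entry status labels."""
    counts = Counter(statuses)
    lines = [
        f"Total collected: {len(statuses)} entries",
        f"Verified: {counts['verified']} entries",
        f"Discarded (mismatch): {counts['mismatch']} entries",
        f"Discarded (unreachable): {counts['fetch_failed']} entries",
    ]
    return "\n".join(lines)
```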

Important: It is better to have fewer verified entries than many unverified ones. Never include an entry in the output unless its URL has been verified. This prevents hallucinated or mismatched URL/title pairs from propagating to downstream tools (CSV lists, daily research pipeline).

If verification reduces the result set below the requested collection depth, run additional searches to find replacement resources, then verify th

Content truncated.

Safety

Review before install

Runs shell / code

Automated static scan of the SKILL.md and repo. A flag describes what the skill can do — not a verdict. Always review code before installing.

Source & maintenance

Updated

2mo ago

Repo stars

Loads

~3,950 tokens

Stars are for the whole repository, not this skill alone.
