research-gather
Gathers and lists research resources (academic papers, patents, websites, business cases) for specified research domains. Works as the "resource collection" phase after domain mapping — takes clustering results, user keywords, or domain descriptions as input and produces structured resource lists pe
Install
mkdir -p .claude/skills/research-gather && curl -L -o skill.zip "https://agentskills.codes/api/skills/download/14453" && unzip -o skill.zip -d .claude/skills/research-gather && rm skill.zip
Installs to .claude/skills/research-gather
Activation
This is the description your AI agent reads to decide when to run this skill — the better it matches your request, the more reliably it fires.
Gathers and lists research resources (academic papers, patents, websites, business cases) for specified research domains. Works as the "resource collection" phase after domain mapping: it takes clustering results, user keywords, or domain descriptions as input and produces structured resource lists per domain. Use this skill when the user wants to "collect papers for each area", "find patents in this domain", "gather resources for these topics", "list relevant papers and patents", "arXivで論文を集めて", "各領域のリソースを収集", "特許と論文のリストを作って", "この分野の文献を集めて", or any request to systematically find and list research materials across multiple domains. Also triggers when the user has clustering output and wants to proceed to resource collection, or when they provide keywords and want a literature/patent list.
About this skill
Research Gather — Resource Collection by Domain
Collects academic papers, patents, websites, and business cases for specified research domains and produces structured resource lists. This skill sits between domain mapping (research-clustering) and detailed reports (research-retrieval) in the research pipeline.
Auto Mode (--auto)
When $ARGUMENTS contains --auto, run the entire workflow non-interactively — skip ALL AskUserQuestion calls and use the following defaults:
| Parameter | Default Value |
|---|---|
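| Resource Types | 学術論文 + 特許 (academic papers + patents) |
| Time Range | 直近4年 (last 4 years) |
| Collection Depth | 標準(各5〜10件) (standard, 5 to 10 items per domain) |
| Domain Selection | すべてのクラスタ (all clusters) |
| Next Action (Step 6) | 完了(自動終了) (done, exit automatically) |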
In --auto mode, the remaining text in $ARGUMENTS (after removing --auto) is used as the input (file path or keywords). For example: /research-gather --auto docs/research/clustering-result.md → input is the clustering result file.
If $ARGUMENTS does NOT contain --auto, proceed with the normal interactive workflow below.
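The flag handling described above can be sketched in Python. This is a minimal illustration, not the skill's actual implementation: the function name `parse_arguments` and the `AUTO_DEFAULTS` dictionary keys are hypothetical, and the sketch assumes `$ARGUMENTS` arrives as a single whitespace-separated string. The default values are copied from the Auto Mode table.

```python
# Hypothetical keys; the values are the defaults from the Auto Mode table.
AUTO_DEFAULTS = {
    "resource_types": ["学術論文", "特許"],
    "time_range": "直近4年",
    "collection_depth": "標準(各5〜10件)",
    "domain_selection": "すべてのクラスタ",
    "next_action": "完了(自動終了)",
}


def parse_arguments(arguments: str) -> tuple[bool, str]:
    """Split the raw argument string into (auto_mode, remaining_input).

    Assumption: tokens are separated by whitespace, so runs of spaces in
    the remaining input are collapsed to single spaces.
    """
    tokens = arguments.split()
    auto_mode = "--auto" in tokens
    remaining = " ".join(token for token in tokens if token != "--auto")
    return auto_mode, remaining


auto_mode, user_input = parse_arguments("--auto docs/research/clustering-result.md")
# In auto mode the defaults replace every AskUserQuestion answer.
settings = dict(AUTO_DEFAULTS) if auto_mode else {}
print(auto_mode, user_input)
```

Matching on whole tokens means a path that merely contains the substring `--auto` is not mistaken for the flag.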
Pipeline Position
research-clustering → research-gather → research-retrieval
(domain mapping)      (resource lists)  (paper deep-dive)
Workflow
Step 1: Parse Input
Determine the input type and extract domain information.
Supported input types:
- Clustering output file — Markdown file generated by research-clustering. Parse the cluster structure (names, keywords, overview) directly.
- User keywords/text — Keywords, phrases, or natural-language descriptions provided in conversation. Extract domains and search terms from these.
- Existing Markdown file — A user-prepared file listing research domains or topics.
For clustering output, detect it by looking for the characteristic structure: "Cluster Summary" table, "Cluster Details" sections with keywords and research strategy. Use the cluster names, keywords, and strategies as the basis for resource collection.
For user keywords/text, group related terms into tentative domains before proceeding. If the grouping is ambiguous, confirm with the user.
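As a rough sketch of the detection logic, the function below classifies an input into the three supported types. It assumes the two section names quoted above ("Cluster Summary" and "Cluster Details") appear literally in a clustering output file; the function name and the returned labels are hypothetical.

```python
from typing import Optional

# Section names taken from the description of clustering output above.
CLUSTERING_MARKERS = ("Cluster Summary", "Cluster Details")


def classify_input(user_text: str, file_text: Optional[str] = None) -> str:
    """Return 'clustering_output', 'markdown_file', or 'keywords'.

    file_text is the content of a file the user pointed to, or None when
    the input is only conversational text.
    """
    if file_text is None:
        return "keywords"
    if all(marker in file_text for marker in CLUSTERING_MARKERS):
        return "clustering_output"
    return "markdown_file"


sample = "## Cluster Summary\n| Cluster | Keywords |\n## Cluster Details\n..."
print(classify_input("docs/research/clustering-result.md", sample))
```

Requiring both markers keeps a user-prepared Markdown file that happens to mention one of the phrases from being parsed as clustering output.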
Step 2: User Hearing
--auto mode: Skip this entire step. Use the default values from the Auto Mode table above.
Confirm research parameters via AskUserQuestion. Skip hearings for parameters already specified by the user in their request.
Hearing 1: Resource Types
AskUserQuestion:
  question: "どの種類のリソースを収集しますか?(複数選択可)"
  header: "リソース種別"
  multiSelect: true
  options:
    - label: "学術論文"
      description: "arXiv、IEEE、ACM等の学術論文を検索"
    - label: "特許"
      description: "Google Patents、USPTO、J-PlatPat、Espacenet等から検索"
    - label: "技術情報"
      description: "技術ブログ、カンファレンス発表、OSSプロジェクト等"
    - label: "ビジネス事例"
      description: "企業導入事例、市場レポート、業界動向"
Hearing 2: Time Range
AskUserQuestion:
  question: "対象期間を指定してください"
  header: "対象期間"
  multiSelect: false
  options:
    - label: "直近4年(推奨)"
      description: "2022年〜現在の結果を対象"
    - label: "直近2年"
      description: "最新トレンドに絞る"
    - label: "直近7年"
      description: "より広い範囲をカバー"
    - label: "カスタム"
      description: "任意の期間を指定"
If "カスタム" is selected, ask a follow-up for the specific year range.
Hearing 3: Collection Depth
AskUserQuestion:
  question: "各領域あたりの収集件数はどの程度にしますか?"
  header: "収集件数"
  multiSelect: false
  options:
    - label: "標準(各5〜10件)(推奨)"
      description: "主要なリソースを網羅。バランスの良い量"
    - label: "広範(各10〜20件)"
      description: "できるだけ多くのリソースを収集。時間がかかる場合あり"
    - label: "簡潔(各3〜5件)"
      description: "代表的なリソースのみ。素早く概観を得たい場合"
Hearing 4: Domain Selection (clustering input only)
If the input is from clustering and contains multiple clusters, ask which domains to investigate:
AskUserQuestion:
  question: "どのクラスタのリソースを収集しますか?"
  header: "対象クラスタ"
  multiSelect: true
  options:
    (dynamically generated from cluster names; show up to 4. If there are more than 4 clusters, group them or offer "すべて" as the first option)
Step 3: Resource Collection
For each target domain, search for resources in parallel using the Agent tool to spawn subagents.
3a: Academic Papers (arXiv-first)
Papers are searched with arXiv as the primary source because it provides open-access full text, stable URLs, and consistent metadata.
Search strategy:
- arXiv search via WebSearch: Query site:arxiv.org "{domain keyword}" {year range} to find relevant papers. Also search for survey/review papers: site:arxiv.org "{domain keyword}" survey OR review.
- Semantic Scholar / Google Scholar fallback: If arXiv results are insufficient (e.g., the domain is not well-represented on arXiv), broaden to "{domain keyword}" paper {year} on general web search.
- IEEE/ACM for specific domains: For domains where conference proceedings are important (networking, systems, HCI), also search site:ieee.org or site:dl.acm.org.
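The query templates in the search strategy above can be assembled mechanically. The sketch below is illustrative only: `build_paper_queries` is a hypothetical helper, and rendering `{year range}` as a space-separated list of years is an assumption, since the source does not specify the format.

```python
def build_paper_queries(keyword: str, start_year: int, end_year: int) -> dict[str, list[str]]:
    """Build WebSearch query strings for one domain keyword."""
    years = list(range(start_year, end_year + 1))
    quoted = f'"{keyword}"'
    # Assumption: the year range is expressed as a plain list of years.
    year_range = " ".join(str(year) for year in years)
    return {
        # Primary source: arXiv, plus a dedicated survey/review query.
        "arxiv": [
            f"site:arxiv.org {quoted} {year_range}",
            f"site:arxiv.org {quoted} survey OR review",
        ],
        # Fallback when arXiv coverage is thin: one general query per year.
        "fallback": [f"{quoted} paper {year}" for year in years],
        # Proceedings-heavy domains (networking, systems, HCI).
        "proceedings": [f"site:ieee.org {quoted}", f"site:dl.acm.org {quoted}"],
    }


for group, queries in build_paper_queries("federated learning", 2022, 2023).items():
    print(group, queries)
```

Grouping the queries by tier makes it easy to run the fallback and proceedings searches only when the arXiv tier returns too few results.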
For each paper, collect:
- Title
- Authors (first author + "et al." for >3 authors)
- Year
- Venue (arXiv, conference name, journal)
- arXiv ID or DOI
- URL (prefer arxiv.org/abs/ format)
- 1-2 sentence summary
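One way to hold the fields listed above is a small record type. The sketch below is a suggestion rather than a format the skill prescribes: `PaperRecord` and `format_authors` are hypothetical names, joining three or fewer authors with commas is an assumption, and the sample values are placeholders, not a real paper.

```python
from dataclasses import dataclass


def format_authors(authors: list[str]) -> str:
    """First author plus 'et al.' when there are more than three authors."""
    if len(authors) > 3:
        return f"{authors[0]} et al."
    # Assumption: shorter author lists are written out in full.
    return ", ".join(authors)


@dataclass
class PaperRecord:
    title: str
    authors: str
    year: int
    venue: str        # arXiv, conference name, or journal
    identifier: str   # arXiv ID or DOI, copied from a search result
    url: str          # prefer the arxiv.org/abs/ form
    summary: str      # 1-2 sentences


record = PaperRecord(
    title="Placeholder Title",
    authors=format_authors(["A. Author", "B. Author", "C. Author", "D. Author"]),
    year=2024,
    venue="arXiv",
    identifier="0000.00000",  # placeholder, not a real arXiv ID
    url="https://arxiv.org/abs/0000.00000",
    summary="Placeholder summary.",
)
print(record.authors)
```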
CRITICAL — Anti-hallucination rule for URLs:
- Only record URLs that appear verbatim in WebSearch results. NEVER construct or guess arXiv IDs.
- If a search result shows a title but no direct URL, run a follow-up WebSearch for site:arxiv.org "{exact paper title}" to obtain the real URL.
- Do NOT fabricate arXiv IDs by combining partial numbers. Every URL must come from a search result or a WebFetch response.
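The "verbatim only" rule above can be enforced with a simple filter: keep a candidate only if its URL literally appears in the raw text returned by a search or fetch. The following is a minimal sketch under that assumption. The function names, the dictionary shape of a candidate, and the URL regular expression are all illustrative, and the arXiv IDs are placeholders.

```python
import re

# Deliberately simple URL matcher for this sketch; it stops at whitespace,
# quotes, and closing brackets.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>)\]]+")


def urls_seen_in_results(result_texts: list[str]) -> set[str]:
    """Collect every URL that appears verbatim in the raw result texts."""
    seen = set()
    for text in result_texts:
        for match in URL_PATTERN.findall(text):
            seen.add(match.rstrip(".,;"))  # drop trailing sentence punctuation
    return seen


def keep_grounded(candidates: list[dict], result_texts: list[str]) -> list[dict]:
    """Drop candidates whose URL was never observed in a search result."""
    seen = urls_seen_in_results(result_texts)
    return [candidate for candidate in candidates if candidate["url"] in seen]


results = ["Placeholder Title https://arxiv.org/abs/0000.00000 snippet text."]
candidates = [
    {"title": "Placeholder Title", "url": "https://arxiv.org/abs/0000.00000"},
    {"title": "Guessed Entry", "url": "https://arxiv.org/abs/0000.00001"},
]
print(keep_grounded(candidates, results))
```

The second candidate is dropped because its URL was constructed rather than observed, which is exactly the case the rule is meant to catch.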
Quality signals to prioritize:
- High citation count (if visible in search results)
- Survey/review papers (valuable for overview)
- Papers from top venues (NeurIPS, ICML, CVPR, ACL, etc.)
- Recent papers with significant attention
3b: Patents
Search across multiple patent databases to get broad coverage.
Search strategy:
- Google Patents (primary): site:patents.google.com "{domain keyword}" (provides international coverage with English abstracts)
- USPTO: Search for US patents when the domain has strong US presence
- J-PlatPat: Search in Japanese for Japan-specific patents — useful when keywords have Japanese equivalents
- Espacenet: Search for European patents when relevant
For each patent, collect:
- Title
- Patent number (e.g., US11234567B2, JP2023-123456)
- Assignee/Applicant
- Filing year
- Patent office (USPTO/JPO/EPO/WIPO)
- URL
- 1 sentence summary of the invention
CRITICAL — Anti-hallucination rule for URLs:
- Only record patent numbers and URLs that appear verbatim in search results or WebFetch responses.
- NEVER fabricate patent numbers. If a search result mentions a patent without a clear number, run a follow-up search to obtain the exact number and URL.
Prioritize:
- Patents from major companies in the domain
- Recent patents (within the specified time range)
- Patents with many citations or family members
3c: Technical Resources
Search for high-quality technical content.
Search targets:
- Technical blogs from major companies (Google AI Blog, Meta Research, Microsoft Research, etc.)
- Conference talks and presentations (from slides/video sharing sites)
- Notable OSS projects on GitHub
- Technical standards and specifications
For each resource, collect:
- Title
- Source/Author
- Year/Date
- Type (blog/talk/OSS/standard)
- URL
- 1 sentence description
3d: Business Cases
Search for enterprise adoption and market information.
Search targets:
- Case studies from consulting firms and vendors
- Industry reports and market analysis
- Press releases about deployments
- Industry conference presentations
For each case, collect:
- Title
- Company/Organization
- Year
- Type (case study/report/press release)
- URL
- 1 sentence summary
Step 4: Organize, Deduplicate, and Verify
After collection:
- Remove duplicate entries (same paper appearing from different searches)
- Sort within each category by year (newest first), then by relevance
- Verify URLs are properly formatted (especially arXiv links: ensure the arxiv.org/abs/ format)
- URL Verification (MANDATORY): verify every collected resource against its URL:
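The deduplication and sorting items in the list above can be sketched as follows. This is an illustration under stated assumptions: duplicates are detected by normalized title, the first occurrence is kept, and each entry carries a numeric `relevance` score (higher is better). The source does not define how relevance is measured, so that field is hypothetical.

```python
import re


def normalize_title(title: str) -> str:
    """Case-fold and collapse whitespace so trivially different titles match."""
    return re.sub(r"\s+", " ", title).strip().casefold()


def dedupe_and_sort(entries: list[dict]) -> list[dict]:
    """Remove duplicate titles, then sort newest first, then by relevance."""
    unique: dict[str, dict] = {}
    for entry in entries:
        # setdefault keeps the first entry seen for each normalized title.
        unique.setdefault(normalize_title(entry["title"]), entry)
    return sorted(
        unique.values(),
        key=lambda entry: (-entry["year"], -entry.get("relevance", 0.0)),
    )


entries = [
    {"title": "Alpha  Study", "year": 2023, "relevance": 0.5},
    {"title": "alpha study", "year": 2023, "relevance": 0.9},  # duplicate
    {"title": "Beta", "year": 2024, "relevance": 0.1},
    {"title": "Gamma", "year": 2023, "relevance": 0.8},
]
print([entry["title"] for entry in dedupe_and_sort(entries)])
```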
4a: Academic Paper URL Verification
For each paper with an arXiv URL:
- WebFetch the arXiv abstract page (arxiv.org/abs/XXXX.XXXXX)
- Compare the fetched title with the collected title
- Apply one of the following actions:
  - Match: Title matches (allowing minor formatting differences) → mark as verified
  - Mismatch: Title does not match → discard the entry and log a warning. Do NOT keep entries with mismatched title/URL pairs.
  - Fetch failed: URL is unreachable or returns an error → discard the entry
For papers with non-arXiv URLs (IEEE, ACM, etc.):
- WebFetch the URL
- Verify the page contains the expected paper title
- Apply the same match/mismatch/failed logic above
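The match/mismatch/failed decision can be expressed as a small pure function that takes the collected title and whatever text WebFetch returned. The sketch below makes several assumptions: "minor formatting differences" is interpreted as differences in case, punctuation, and whitespace; a containment check is used so the same function covers both arXiv abstract pages and other publisher pages; and the status labels are hypothetical names.

```python
import re
import unicodedata
from typing import Optional


def normalize(text: str) -> str:
    """Unify Unicode forms and case, strip punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKC", text).casefold()
    text = re.sub(r"[^\w\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def verification_status(collected_title: str, fetched_text: Optional[str]) -> str:
    """Return 'verified', 'mismatch', or 'fetch_failed' for one entry.

    fetched_text is the page text from the fetch, or None when the URL was
    unreachable or returned an error.
    """
    if fetched_text is None:
        return "fetch_failed"
    needle = normalize(collected_title)
    # An empty title must never count as a match.
    if needle and needle in normalize(fetched_text):
        return "verified"
    return "mismatch"


print(verification_status("A Placeholder Title: Part One",
                          "a placeholder title - part one | abstract page"))
```

Only entries that come back as verified should be kept; both other statuses lead to discarding the entry, as described above.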
4b: Patent URL Verification
For each patent:
- WebFetch the patent URL (Google Patents, USPTO, etc.)
- Verify the patent number and title match
- Discard entries where the patent number or title does not match
4c: Technical Resource / Business Case URL Verification
For each resource:
- WebFetch the URL
- Verify the page is accessible and contains content related to the collected title
- Discard entries where the URL is unreachable or content is unrelated
4d: Verification Summary
After verification, log the results:
- Total collected: N entries
- Verified: M entries
- Discarded (mismatch): X entries
- Discarded (unreachable): Y entries
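The four counters above can be produced by tallying per-entry verification outcomes. In this sketch the status strings `verified`, `mismatch`, and `fetch_failed` are hypothetical labels for the three outcomes described in Step 4; the output lines follow the log format listed above.

```python
from collections import Counter


def verification_summary(statuses: list[str]) -> str:
    """Format the verification log from a list of per-entry statuses."""
    counts = Counter(statuses)  # missing statuses count as 0
    return "\n".join([
        f"- Total collected: {len(statuses)} entries",
        f"- Verified: {counts['verified']} entries",
        f"- Discarded (mismatch): {counts['mismatch']} entries",
        f"- Discarded (unreachable): {counts['fetch_failed']} entries",
    ])


statuses = ["verified", "verified", "mismatch", "fetch_failed", "verified"]
print(verification_summary(statuses))
```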
Important: It is better to have fewer verified entries than many unverified ones. Never include an entry in the output unless its URL has been verified. This prevents hallucinated or mismatched URL/title pairs from propagating to downstream tools (CSV lists, daily research pipeline).
If verification reduces the result set below the requested collection depth, run additional searches to find replacement resources, then verify th
Content truncated.