
opg

Academic literature discovery and citation network analysis. Multi-source search across arXiv, DBLP, Semantic Scholar, and Google Scholar. Build citation networks (references from PDF parsing, citations from Google Scholar), get recommendations, monitor new papers, analyze topics, parse PDFs, import from Zotero, generate research summaries, export as BibTeX/CSV/Markdown/JSON, and generate interactive HTML graph visualizations. Use when user asks about finding papers, literature review, citation

Author: admin | Source: ClawHub
Version: v1.0.0
Security check: passed
Downloads: 78
Favorites: 0


# OpenPaperGraph — Literature Discovery & Citation Analysis

You are a research assistant with access to a CLI tool for academic literature discovery and analysis.

## Setup

The CLI is located at: `SKILL_DIR/openpapergraph_cli.py`

Before first use, ensure dependencies are installed:

```bash
pip install httpx pymupdf scholarly
```

All commands output JSON to stdout. Run from the `SKILL_DIR` directory.

## Architecture: Multi-Source

This tool reduces dependency on any single data source:

| Task | Primary Sources | Fallback |
|---|---|---|
| Search | arXiv + DBLP + S2 | Deduplicated, sorted by citations |
| References | Download PDF → parse reference list | S2 API |
| Citations | Google Scholar | S2 API |
| Citation counts | Google Scholar | S2 |
| Recommendations | S2 Recommendations API | — |
| Reference resolution | arXiv → S2 → CrossRef → OpenAlex | Multi-cascade |

## Available Commands

### 1. Search Papers

Multi-source search across arXiv, DBLP, and Semantic Scholar. Supports conference filtering.

```bash
python SKILL_DIR/openpapergraph_cli.py search "QUERY" --source SOURCE --venue VENUE --limit N
```

- `--source`: `all` (default, multi-source), `arxiv`, `dblp`, or `s2`
- `--venue`: Filter by conference — `ICLR`, `NeurIPS`, `ICML`, `ACL`, `EMNLP`, `NAACL`, `WebConf`, `KDD`
- `--limit`: Max results (default 20)

**When to use**: User asks to find papers, search for literature, or look up specific topics/conferences.

### 2. Build Citation Network

Construct a citation graph from seed papers. References come from PDF parsing (downloaded from arXiv/Unpaywall), citations from Google Scholar. Falls back to S2 when needed.
```bash
python SKILL_DIR/openpapergraph_cli.py graph PAPER_ID1 PAPER_ID2 --depth 1 --output graph.json
```

- Paper IDs can be: S2 hex ID (`204e3073...`), arXiv ID (`ARXIV:1706.03762`), DOI (`DOI:10.1145/...`), paper title (`"attention is all you need"`), PDF path (`paper.pdf`), BibTeX file (`refs.bib`), or Zotero CSL-JSON export (`zotero.json`)
- `--depth`: Expansion depth (1 or 2, default 1)
- `--output`: Save graph to file for later analysis/export

**When to use**: User wants to explore the citation landscape around specific papers.

### 3. Paper Recommendations

Get related paper recommendations based on one or more papers (via the S2 Recommendations API).

```bash
python SKILL_DIR/openpapergraph_cli.py recommend PAPER_ID1 PAPER_ID2 --limit 10
```

- Also accepts paper titles and PDF paths as input

**When to use**: User wants to discover related or similar papers they may have missed.

### 4. Monitor New Papers

Check for recently published papers on a research topic (multi-source: arXiv + DBLP + S2, citation counts enriched via Google Scholar).

```bash
python SKILL_DIR/openpapergraph_cli.py monitor "TOPIC" --year-from 2025 --limit 20
```

**When to use**: User wants to stay updated on the latest publications in a field.

### 5. Topic Analysis

Analyze a citation graph for topics, keyword distribution, year trends, and top authors.

```bash
python SKILL_DIR/openpapergraph_cli.py analyze graph.json
```

**When to use**: User wants to understand the thematic structure of a set of papers.

### 6. Research Summary

Generate a research summary from a citation graph. Uses an LLM if any provider is configured, otherwise falls back to extractive analysis.

```bash
python SKILL_DIR/openpapergraph_cli.py summary graph.json --style STYLE
python SKILL_DIR/openpapergraph_cli.py summary graph.json --provider deepseek --model deepseek-chat
```

- `--style`: `overview` (default), `trends`, or `gaps`
- `--provider`: LLM provider name (e.g. `openai`, `deepseek`, `qwen`, `zhipu`, `moonshot`)
- `--model`: Override the provider's default model

**When to use**: User wants a quick overview of a research area or to identify trends/gaps.

### 7. PDF Reference Extraction

Extract references from a PDF paper, resolving via a multi-source cascade (arXiv → S2 → CrossRef → OpenAlex).

```bash
python SKILL_DIR/openpapergraph_cli.py pdf /path/to/paper.pdf
python SKILL_DIR/openpapergraph_cli.py pdf /path/to/paper.pdf --use-grobid
```

- `--use-grobid`: Use GROBID for structured extraction (requires a Docker service on port 8070)
- Returns: resolved papers, unresolved references, and the resolve rate

**When to use**: User provides a PDF and wants to find/analyze its references.

### 7b. Build Graph from PDF Reference Lists

Build a citation graph directly from one or more PDF papers' reference lists.

```bash
python SKILL_DIR/openpapergraph_cli.py graph-from-pdf paper.pdf [paper2.pdf ...] --output graph.json
python SKILL_DIR/openpapergraph_cli.py graph-from-pdf paper.pdf --depth 1 --include-unresolved -o graph.json
```

- `--depth 0` (default): Only PDF references. `--depth 1`: Also expand resolved papers.
- `--include-unresolved`: Keep unresolved references as nodes in the graph (marked `resolved=false`)
- `--use-grobid`: Use GROBID for structured extraction
- References resolved via: arXiv → Semantic Scholar → CrossRef → OpenAlex (multi-source cascade)

**When to use**: User has PDF papers and wants a citation graph faithful to the actual reference lists.

### 8. Zotero Import

Import papers from a Zotero library or collection.

```bash
python SKILL_DIR/openpapergraph_cli.py zotero --user-id ID --api-key KEY [--collection KEY] [--list-collections]
```

**When to use**: User wants to import their existing Zotero library for analysis.

### 9. Export

Export a citation graph as BibTeX, CSV, Markdown, or JSON. All formats sort papers by year descending.
```bash
python SKILL_DIR/openpapergraph_cli.py export graph.json --format bibtex --output refs.bib
python SKILL_DIR/openpapergraph_cli.py export graph.json --format csv --output papers.csv
python SKILL_DIR/openpapergraph_cli.py export graph.json --format markdown --output papers.md
python SKILL_DIR/openpapergraph_cli.py export graph.json --format json --output papers.json
```

- `--format`: `bibtex` (default), `csv`, `markdown`, or `json`
- CSV/Markdown/JSON include full fields: id, title, authors, year, citations, source, url, doi, arxiv_id, abstract

**When to use**: User wants to save results for use in a reference manager, spreadsheet, or documentation.

### 9b. Export Interactive HTML Graph

Export a citation graph as a self-contained interactive HTML visualization.

```bash
python SKILL_DIR/openpapergraph_cli.py export-html graph.json --output graph.html
python SKILL_DIR/openpapergraph_cli.py export-html graph.json --output graph.html --title "My Research" --summary --inline
```

- `--title`: Custom page title (default: "Paper Graph")
- `--summary`: Pre-generate the AI summary at export time (requires an LLM API key in the environment). The result is embedded; the API key is NOT.
- `--inline`: Inline the vis-network JS for fully offline use (~500 KB larger, no CDN needed)
- `--provider` / `--model`: Override the LLM provider/model for `--summary`
- **Layout**: Semantic left-to-right hierarchy — References (LEFT) → Seeds (CENTER) → Citations (RIGHT)
- **Node types**: Seeds (purple stars), References (blue circles), Citations (green diamonds), with a legend
- **Features**: bidirectional hover linking, type filter, search/filter, in-page export, seed source management (add/remove seeds)
- **Summary modes**: (A) pre-generate with `--summary`, (B) runtime API key (20+ providers), (C) manual copy/paste (CORS-proof)
- Security: API keys are **never** embedded in the HTML output

**When to use**: User wants a visual, interactive exploration of the citation network, or wants to share a browsable graph.

### 9c. Interactive Graph Server (`serve`)

Start a local HTTP server for interactive graph management. Unlike `export-html` (static, read-only), `serve` lets users add papers, convert nodes to seeds, and remove seeds, and all changes persist to the graph JSON file.

```bash
python SKILL_DIR/openpapergraph_cli.py serve graph.json --port 8787
```

- `--port`: Server port (default: 8787)
- `--title`: Custom page title
- **Add papers**: "+ Add Paper" button in the toolbar. Input via title/ID, BibTeX, or PDF upload. Toggle "Treat as Seed Paper" to control expansion.
- **Seed**: Full expansion — fetches references + citations from S2/Google Scholar, adds nodes + edges
- **Non-seed**: Lightweight — only checks relationships with existing seeds, no expansion
- **Convert to seed**: Click any non-seed paper in the list → a "⬆ Convert to Seed" button appears. Also available in the node tooltip when clicking graph nodes.
- **Remove seed**: Seeds/Sources tab → "Remove" button. Deletes the seed + its exclusive connections.
- **Persistent**: All changes are immediately written to the graph JSON file and survive page refresh.
- **Dedup**: Papers matched by DOI > arXiv ID > title+year similarity (no duplicates)

**When to use**: User wants to interactively build and manage a citation network through the browser, with all changes persisted. Use `export-html` instead when you want a static file for sharing.

### 10. Remove Seed Paper

Remove a seed paper, and all papers exclusively connected to it, from a graph.

```bash
python SKILL_DIR/openpapergraph_cli.py remove-seed graph.json "paper_id_or_title"
```

- Accepts a paper ID or title substring (fuzzy match)
- Removes the seed + papers connected only to that seed (not shared with other seeds)
- Cleans up all incident edges
- Overwrites the graph file (use `-o` to save to a different file)

### 11. Remove Non-Seed Paper

Remove a single non-seed paper from a graph.
```bash
python SKILL_DIR/openpapergraph_cli.py remove-paper graph.json "paper_id_or_title"
```

- Accepts a paper ID or title substring (fuzzy match)
- Only works for non-seed papers (use `remove-seed` for seeds)
- Cleans up all incident edges
- Overwrites the graph file (use `-o` to save to a different file)

### 12. List Conferences

Show supported conference venues for filtering.

```bash
python SKILL_DIR/openpapergraph_cli.py conferences
```

### 13. List LLM Providers

Show all 20 supported LLM providers and whether their API key is configured.

```bash
python SKILL_DIR/openpapergraph_cli.py llm-providers
```

## Workflow Guidelines

1. **Start with search** — Help the user find relevant seed papers first (default: multi-source)
2. **Build a graph** — Use seed paper IDs to construct a citation network and save it to a `.json` file
3. **Explore interactively** — Open the graph in the browser, add papers, convert nodes to seeds (`serve`)
4. **Analyze** — Run topic analysis or generate a summary on the saved graph
5. **Discover more** — Use recommendations to find papers the user may have missed
6. **Export** — Save results as BibTeX/CSV/Markdown/JSON for the user's reference manager
7. **Share** — Generate a static HTML graph for sharing/viewing (`export-html`)

## Output Format

All commands output JSON to stdout. When presenting results to the user:

- Show paper titles, authors, year, and citation counts in a readable format
- For large result sets, summarize the top results and mention the total count
- Paper IDs can be: S2 hex IDs, arXiv IDs (`ARXIV:xxxx`), DOIs (`DOI:xxxx`), paper titles, or PDF file paths
- The `source` field in results indicates where each paper came from (arxiv, semantic_scholar, google_scholar, crossref, openalex, dblp)

## Environment Variables

### `S2_API_KEY` (Recommended)

Semantic Scholar API key. Free at [semanticscholar.org/product/api](https://www.semanticscholar.org/product/api).
- **Purpose**: Authenticates requests to the S2 API (paper search, citation data, recommendations)
- **Why needed**: Without it, S2 enforces strict rate limiting — frequent calls return 429 errors
- **Role**: S2 is the **fallback** in the multi-source architecture — when PDF download or Google Scholar fails, the system falls back to S2. It is also the **exclusive source** for the `recommend` command

### LLM Provider API Key (Optional — any one of 20 providers)

The `summary` command supports **20 LLM providers**. Set any one API key to enable LLM-powered summaries:

**US**: `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, `GEMINI_API_KEY`, `DEEPSEEK_API_KEY`, `GROQ_API_KEY`, `TOGETHER_API_KEY`, `FIREWORKS_API_KEY`, `MISTRAL_API_KEY`, `XAI_API_KEY`, `PERPLEXITY_API_KEY`, `OPENROUTER_API_KEY`

**Chinese**: `ZHIPUAI_API_KEY` (Zhipu), `MOONSHOT_API_KEY` (Moonshot AI), `BAICHUAN_API_KEY` (Baichuan), `YI_API_KEY` (01.AI), `DASHSCOPE_API_KEY` (Tongyi Qianwen), `ARK_API_KEY` (Doubao), `MINIMAX_API_KEY`, `STEPFUN_API_KEY` (StepFun), `SENSENOVA_API_KEY` (SenseNova)

**Custom**: Set `LLM_API_KEY` + `LLM_BASE_URL` + `LLM_MODEL` for any OpenAI-compatible endpoint.

**Additional environment variables:**

- `LLM_PROVIDER`: Explicitly select the LLM provider (alternative to the `--provider` CLI flag)
- `LLM_MODEL`: Override the default model for the selected provider (alternative to the `--model` CLI flag)
- `TMPDIR`: Custom directory for the PDF download cache (defaults to the system temp directory)

Without any LLM key, `summary` uses extractive analysis and `export-html` hides the AI summary panel. All other commands are unaffected. Run `llm-providers` to check status.

## Cross-Tool Compatibility

This CLI is designed to be called by any AI coding tool (Claude Code, OpenClaw, Codex, etc.):

- All output is structured JSON on stdout
- Errors go to stderr
- Exit code 0 = success, 1 = argument error, 2 = runtime error
- No interactive input required — all parameters via command-line flags
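A calling tool can rely on that contract (JSON on stdout, errors on stderr, exit codes 0/1/2) without knowing anything else about the CLI. A minimal Python sketch; the `run_cli` helper and the stand-in command are illustrative, not part of the skill:

```python
import json
import subprocess
import sys

def run_cli(cmd):
    """Run a CLI that follows the contract above: JSON on stdout,
    errors on stderr, exit code 0 = success, 1 = argument error,
    2 = runtime error."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    if proc.returncode != 0:
        kind = {1: "argument error", 2: "runtime error"}.get(
            proc.returncode, "unknown error")
        raise RuntimeError(f"{kind}: {proc.stderr.strip()}")
    return json.loads(proc.stdout)

# Stand-in command that mimics the contract; a real call would be e.g.
# ["python", "SKILL_DIR/openpapergraph_cli.py", "search", "QUERY"].
fake = [sys.executable, "-c",
        'import json; print(json.dumps('
        '{"papers": [{"title": "Attention Is All You Need", "year": 2017}]}))']
result = run_cli(fake)
print(result["papers"][0]["year"])  # -> 2017
```

Because errors are separated onto stderr and signaled by exit code, the wrapper never has to guess whether stdout holds a result or an error message.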

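The dedup rule described under `serve` (match by DOI, then arXiv ID, then title+year similarity) can be sketched as follows. The field names mirror the export fields listed above (`doi`, `arxiv_id`, `title`, `year`), but the function name, the 0.9 similarity threshold, and the sample records are assumptions for illustration, not the tool's actual implementation:

```python
from difflib import SequenceMatcher

def same_paper(a, b, title_sim=0.9):
    """Match two paper records by DOI > arXiv ID > title+year similarity.

    The 0.9 threshold is an assumed value for illustration."""
    if a.get("doi") and b.get("doi"):
        return a["doi"].lower() == b["doi"].lower()
    if a.get("arxiv_id") and b.get("arxiv_id"):
        return a["arxiv_id"] == b["arxiv_id"]
    if a.get("year") != b.get("year"):
        return False  # same title but different year: treat as distinct
    ratio = SequenceMatcher(None, a.get("title", "").lower(),
                            b.get("title", "").lower()).ratio()
    return ratio >= title_sim

# Toy records (made-up DOI) showing the cascade:
a = {"doi": "10.1000/X", "title": "Attention Is All You Need", "year": 2017}
b = {"doi": "10.1000/x", "title": "Attention is all you need", "year": 2017}
c = {"title": "Attention is all you need", "year": 2017}
print(same_paper(a, b))  # DOI match -> True
print(same_paper(a, c))  # no shared identifier, falls through to title+year -> True
```

Checking strong identifiers before fuzzy title matching is what lets records from different sources (arXiv, S2, CrossRef, OpenAlex) collapse into one node.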
Tags

skill ai

Install via conversation

This skill supports conversational installation on the following platforms:

OpenClaw WorkBuddy QClaw Kimi Claude

Option 1: Install SkillHub and the skill

Help me install SkillHub and the release20260324-1776021183 skill

Option 2: Set SkillHub as the preferred skill source

Set SkillHub as my preferred skill installation source, then help me install the release20260324-1776021183 skill

Install via command line

skillhub install release20260324-1776021183

Download Zip package

⬇ Download opg v1.0.0

File size: 120.5 KB | Published: 2026-4-13 11:47

v1.0.0 (latest) 2026-4-13 11:47
Initial release of OpenPaperGraph: literature discovery & citation analysis tool.

- Multi-source academic search across arXiv, DBLP, Semantic Scholar, and Google Scholar.
- Build citation networks from various paper identifiers, PDFs, BibTeX/Zotero, with reference/citation extraction.
- Analyze topics, trends, and top authors in paper graphs.
- Generate research summaries (LLM or extractive), monitor new papers, and get recommendations.
- Export citation graphs as BibTeX/CSV/Markdown/JSON and generate interactive HTML visualizations.
- Import papers from Zotero and extract references from PDFs with multi-cascade resolution.
