gmail-link-archiver

Author: admin | Source: ClawHub
Version: 1.1.0
Security check: Passed
Downloads: 80
Favorites: 0

# Gmail Link Archiver

Archive web content from your email links. This skill connects to Gmail via IMAP, filters emails by a subject prefix keyword, crawls every link using Playwright (headless Chromium), converts pages to Markdown, and saves them to your OpenClaw workspace.

## Quick Start

### 1. Install dependencies (one-time)

```bash
bash references/setup.sh
```

This automatically installs:

- `playwright` (Python) + Chromium browser binary
- `html2text` for HTML→Markdown conversion

### 2. First run: interactive setup

```bash
python3 references/gmail_link_archiver.py
```

The first run will prompt you for:

| Setting | Description | Default |
|---------|-------------|---------|
| IMAP server | Gmail IMAP host | `imap.gmail.com` |
| IMAP port | SSL port | `993` |
| Gmail address | Your full email address | (none) |
| App password | Gmail App Password (NOT your regular password) | (none) |
| Default mailbox | IMAP folder to search | `INBOX` |
| Subject prefix | Filter emails whose subject starts with this | (none) |
| Workspace path | Where to save Markdown files | `~/openclaw-workspace/mail-archive` |

Credentials are saved locally to `~/.config/gmail-link-archiver/config.json` with `0600` permissions. They are **never logged** and are sent only to the IMAP server.

> **Gmail App Password**: You need to generate an App Password at
> https://myaccount.google.com/apppasswords (requires 2-Step Verification to be enabled).

### 3. Subsequent runs

After the first setup, runs read credentials from the saved config:

```bash
# Use saved config defaults
python3 references/gmail_link_archiver.py

# Override mailbox and prefix on the fly
python3 references/gmail_link_archiver.py --mailbox "INBOX" --subject-prefix "[Newsletter]"

# Save to a different workspace
python3 references/gmail_link_archiver.py --workspace ~/my-archive

# Limit number of links to crawl
python3 references/gmail_link_archiver.py --max-links 10

# Re-run the setup interview
python3 references/gmail_link_archiver.py --reconfigure
```

## How It Works

1. **Connect**: Authenticates to Gmail via IMAP SSL
2. **Filter**: Searches the specified mailbox for emails matching the subject prefix
3. **Extract**: Parses email bodies (HTML + plain text) to find HTTP/HTTPS links
4. **Crawl**: Opens each link in headless Chromium via Playwright (renders JavaScript, and is less likely to be blocked than a plain HTTP client)
5. **Convert**: Transforms the crawled HTML into clean Markdown with metadata headers
6. **Save**: Writes each Markdown file to the workspace directory

## Pipeline Diagram

```
Gmail IMAP ──► Filter by Subject ──► Extract Links
                                          │
                                          ▼
                          Playwright + Chromium (headless)
                                          │
                                          ▼
                            HTML → Markdown (html2text)
                                          │
                                          ▼
                             Save to OpenClaw Workspace
```

## CLI Reference

```
usage: gmail_link_archiver.py [-h] [--mailbox MAILBOX] [--subject-prefix PREFIX]
                              [--workspace PATH] [--max-links N] [--reconfigure]

Options:
  --mailbox, -m          IMAP mailbox to search (default: from config)
  --subject-prefix, -s   Subject prefix to filter emails
  --workspace, -w        Directory to save Markdown files
  --max-links            Max number of links to crawl (default: 50)
  --reconfigure          Re-run the setup interview
```

## Output Format

Each crawled page is saved as a Markdown file with YAML frontmatter:

```markdown
---
source: https://example.com/article
crawled_at: 2026-03-27T12:00:00Z
---

# Article Title

Article content converted to clean Markdown...
```

Files are named using a sanitized version of the URL plus a short hash for uniqueness.

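The naming scheme and frontmatter described above could be produced along these lines. This is a minimal sketch, not the archiver's actual code: the helper names `filename_for_url` and `with_frontmatter`, the 8-character SHA-256 prefix, and the 80-character slug limit are all assumptions made for illustration.

```python
import hashlib
import re
from datetime import datetime, timezone
from urllib.parse import urlparse


def filename_for_url(url, max_len=80):
    """Hypothetical helper: sanitized URL + short hash + .md extension."""
    parsed = urlparse(url)
    # Keep host and path; collapse anything outside [A-Za-z0-9._-] into "-".
    slug = re.sub(r"[^A-Za-z0-9._-]+", "-", parsed.netloc + parsed.path).strip("-")
    slug = slug[:max_len] or "page"
    # Hash the full URL so that query-string variants get distinct names.
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()[:8]
    return f"{slug}-{digest}.md"


def with_frontmatter(url, markdown_body, now=None):
    """Hypothetical helper: prepend the YAML frontmatter shown above."""
    # Normalize to UTC so the trailing "Z" is accurate.
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    stamp = now.strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"---\nsource: {url}\ncrawled_at: {stamp}\n---\n\n{markdown_body}\n"
```

Hashing the full URL rather than the slug means two links that differ only in their query string do not overwrite each other, even though their sanitized names are identical.
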
## Example Usage with Claude

Ask Claude to run the archiver:

> "Run the Gmail Link Archiver to crawl links from my emails with subject starting with '[ReadLater]'"

Claude will execute:

```bash
python3 references/gmail_link_archiver.py --subject-prefix "[ReadLater]"
```

Or to set up fresh:

> "Set up the Gmail Link Archiver with my credentials"

```bash
python3 references/gmail_link_archiver.py --reconfigure
```

## Troubleshooting

**"App password" rejected?**

- Ensure 2-Step Verification is enabled on your Google account
- Generate a new App Password at https://myaccount.google.com/apppasswords
- Use the 16-character password without spaces

**Playwright/Chromium issues?**

```bash
# Reinstall Chromium
python3 -m playwright install chromium

# Install system dependencies (Linux)
sudo python3 -m playwright install-deps chromium
```

**No emails found?**

- Check the mailbox name (use `INBOX`, `[Gmail]/All Mail`, etc.)
- Verify the subject prefix matches exactly (case-sensitive)
- Try a broader prefix

**Permission denied on config file?**

```bash
chmod 600 ~/.config/gmail-link-archiver/config.json
```

## Security

- Credentials are stored locally at `~/.config/gmail-link-archiver/config.json`
- File permissions are set to `0600` (owner read/write only)
- Credentials are **never** transmitted anywhere except to the IMAP server
- Credentials are **never** logged or printed to stdout
- Use Gmail App Passwords (not your main Google password)
- The config directory has `0700` permissions

## Requirements

- Python 3.8+
- Linux (Ubuntu/Debian) for MVP
- Gmail account with IMAP enabled and App Password
- Internet connection for IMAP and web crawling
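
To illustrate the permissions described in the Security section, here is one way a config file could be written with `0600` mode inside a `0700` directory. It is a sketch under assumptions: `save_config` and the JSON keys in the usage note are hypothetical, and the real script may handle this differently.

```python
import json
import os
from pathlib import Path


def save_config(config, path):
    """Hypothetical helper: write config as JSON, accessible to the owner only."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    os.chmod(path.parent, 0o700)
    # Create the file with 0600 from the start instead of chmod-ing afterwards,
    # so a newly created file is never briefly readable by other users.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(config, fh, indent=2)
    # A pre-existing file keeps its old mode under os.open, so tighten it too.
    os.chmod(path, 0o600)
```

A call such as `save_config({"imap_server": "imap.gmail.com", "imap_port": 993}, Path.home() / ".config" / "gmail-link-archiver" / "config.json")` would then produce a file with the modes listed above on Linux.
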

Tags

skill ai

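Install via chat

This skill can be installed via chat on the following platforms:

OpenClaw WorkBuddy QClaw Kimi Claude

Option 1: Install SkillHub and the skill

Help me install SkillHub and the gmail-link-archiver-1775988061 skill

Option 2: Set SkillHub as the preferred skill installation source

Set SkillHub as my preferred skill installation source, then help me install the gmail-link-archiver-1775988061 skill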

Install via command line

skillhub install gmail-link-archiver-1775988061

Download Zip package

⬇ Download gmail-link-archiver v1.1.0

File size: 9.31 KB | Published: 2026-4-13 10:28

v1.1.0 (latest) 2026-4-13 10:28
Version 1.1.0

- Added version field to SKILL.md and updated documentation to reflect version 1.1.0.
- Made minor updates and maintenance to documentation and source files.
- No major functional changes to codebase; primarily a metadata and docs update.
