html-to-html

Clean and restructure HTML documents using MinerU. Takes messy or complex HTML and produces clean, well-formatted HTML output with proper structure preserved. Features: HTML cleanup and restructuring. Removes unnecessary markup and noise. Preserves core content structure. Produces clean HTML from cluttered web pages. Use when you need to: clean up messy HTML, restructure an HTML document, convert complex HTML to clean HTML, sanitize HTML content. Use when asked: 'how do I clean this HTML', 'make

Author: admin | Source: ClawHub
Version: v0.4.0
Security check: Passed
Downloads: 120
Favorites: 0


# HTML to HTML

Fetch a remote web page or local HTML file and convert it to clean structured HTML using MinerU. Strips noise and preserves semantic content.

## Install

```bash
npm install -g mineru-open-api
# or via Go (macOS/Linux):
go install github.com/opendatalab/MinerU-Ecosystem/cli/mineru-open-api@latest
```

## Quick Start

```bash
# Crawl a web page and output clean HTML (requires token)
mineru-open-api crawl https://example.com/article -f html -o ./out/

# Re-extract a local HTML file to clean HTML (requires token)
mineru-open-api extract page.html -f html -o ./out/

# Batch crawl multiple URLs to HTML (requires token)
mineru-open-api crawl url1 url2 -f html -o ./pages/
```

## Authentication

Token required:

```bash
mineru-open-api auth                  # Interactive token setup
export MINERU_TOKEN="your-token"      # Or via environment variable
```

Create token at: https://mineru.net/apiManage/token

## Capabilities

- Input: remote web page URL or local .html file
- Output: clean structured HTML (`-f html`)
- For remote URLs: use `crawl -f html`
- For local HTML files: use `extract -f html`
- Requires token; not available in `flash-extract`

## Notes

- HTML output (`-f html`) requires token; not available in `flash-extract`
- `crawl` supports output formats: md, html, json
- `extract` supports output formats: md, html, latex, docx, json
- Output goes to stdout by default; use `-o <dir>` to save to a file or directory
- All progress/status messages go to stderr; document content goes to stdout
- MinerU is open-source by OpenDataLab (Shanghai AI Lab): https://github.com/opendatalab/MinerU
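The rule stated under Capabilities (`crawl` for remote URLs, `extract` for local HTML files) can be wrapped in a small shell helper. The following is a minimal sketch and not part of MinerU: the function names `mineru_subcommand` and `to_clean_html` are hypothetical, and it assumes `mineru-open-api` is on `PATH` with a token already configured. It uses only the subcommands and flags listed in this README.

```bash
#!/usr/bin/env bash

# Hypothetical helper: pick the subcommand from the input type.
# Remote URLs (http:// or https://) map to `crawl`; anything else is
# treated as a local file and maps to `extract`.
mineru_subcommand() {
  case "$1" in
    http://*|https://*) echo "crawl" ;;
    *)                  echo "extract" ;;
  esac
}

# Hypothetical wrapper: convert one input to clean HTML in an output directory.
# Assumes mineru-open-api is installed and authenticated.
to_clean_html() {
  local input="$1" outdir="$2"
  mineru-open-api "$(mineru_subcommand "$input")" "$input" -f html -o "$outdir"
}
```

With these definitions, `to_clean_html https://example.com/article ./out/` would invoke `crawl`, while `to_clean_html page.html ./out/` would invoke `extract`.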

Tags: skill, ai

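Install via chat

This skill can be installed via chat on the following platforms:

OpenClaw, WorkBuddy, QClaw, Kimi, Claude

Option 1: Install SkillHub and the skill

Help me install SkillHub and the html-to-html-1775983143 skill

Option 2: Set SkillHub as the preferred skill installation source

Set SkillHub as my preferred skill installation source, then help me install the html-to-html-1775983143 skill

Install via command line

skillhub install html-to-html-1775983143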

Download Zip package

Download html-to-html v0.4.0

File size: 1.89 KB | Published: 2026-4-13 10:35

v0.4.0 (latest) 2026-4-13 10:35
SEO: expand description for better ClawHub vector search discovery
