
doc-to-text

Extract plain readable text from Word documents (.doc, .docx) using MinerU. Outputs Markdown (the closest plain-text format supported) for easy reading and processing. Features: quick text extraction from .docx without token (flash-extract). Full extraction for .doc and .docx with token. JSON output mode with dedicated text fields for true plain text. Language support for English, Chinese, and more. Use when you need to: get plain text from a Word file, extract readable content from .docx, conve

Author: admin | Source: ClawHub
Version: v0.4.0
Security check: Passed
Downloads: 138
Favorites: 0


# Doc To Text

Extract plain readable text from Word (.doc/.docx) documents using MinerU. MinerU outputs Markdown, which is the closest format to plain text it supports.

## Install

```bash
npm install -g mineru-open-api

# or via Go (macOS/Linux):
go install github.com/opendatalab/MinerU-Ecosystem/cli/mineru-open-api@latest
```

## Quick Start

```bash
# Extract text from .docx to stdout (no token required)
mineru-open-api flash-extract report.docx

# Save to file
mineru-open-api flash-extract report.docx -o ./out/

# Extract .doc (requires token)
mineru-open-api extract report.doc -o ./out/

# JSON output contains plain text fields (requires token)
mineru-open-api extract report.docx -f json -o ./out/
```

## Authentication

No token needed for `flash-extract` on `.docx`. Token required for `.doc` and `extract`:

```bash
# Interactive token setup
mineru-open-api auth

# Or via environment variable
export MINERU_TOKEN="your-token"
```

Create token at: https://mineru.net/apiManage/token

## Capabilities

- Supported input: .doc, .docx (local file or URL)
- `.docx`: supports `flash-extract` (no token, Markdown output to stdout)
- `.doc`: requires `extract` with token
- For truly plain text: use `extract -f json` and read the text fields from the JSON output
- Language hint with `--language` (default: `ch`, use `en` for English)

## Notes

- MinerU does not have a `-f text` option; Markdown is the closest to plain text
- `.doc` requires `extract` with token; `.docx` works with `flash-extract`
- Output goes to stdout by default; use `-o <dir>` to save to a file or directory
- All progress/status messages go to stderr; document content goes to stdout
- MinerU is open-source by OpenDataLab (Shanghai AI Lab): https://github.com/opendatalab/MinerU
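The rule that `.docx` can use `flash-extract` while `.doc` needs `extract` with a token can be wrapped in a small helper. The following is a minimal sketch, not part of the skill itself: `pick_subcommand` and `build_command` are hypothetical names, and the sketch assumes the input is a local file path whose extension is the last dot-separated segment (URLs with query strings are not handled). It only prints the command line for review and does not call the CLI.

```shell
# Hypothetical helper: map a file extension to the mineru-open-api
# subcommand described in the notes above.
pick_subcommand() {
  local file="$1"
  local ext
  # Lowercase the extension so REPORT.DOCX is treated like report.docx.
  ext="$(printf '%s' "${file##*.}" | tr '[:upper:]' '[:lower:]')"
  case "$ext" in
    docx) echo "flash-extract" ;;            # no token needed
    doc)  echo "extract" ;;                  # token required
    *)    echo "unsupported"; return 1 ;;    # not a Word document
  esac
}

# Hypothetical helper: print (not run) the command for a file and an
# output directory. The printed line is for display only; names that
# contain spaces are not quoted.
build_command() {
  local file="$1" outdir="$2" sub
  sub="$(pick_subcommand "$file")" || {
    echo "unsupported file: $file" >&2
    return 1
  }
  echo "mineru-open-api $sub $file -o $outdir"
}
```

To run the extraction rather than print it, the subcommand can be passed directly, for example `mineru-open-api "$(pick_subcommand report.docx)" report.docx -o ./out/`. When the helper selects `extract`, a token still has to be configured first, either with `mineru-open-api auth` or through `MINERU_TOKEN`.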

Tags

skill ai

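Install via chat

This skill can be installed via chat on the following platforms:

OpenClaw WorkBuddy QClaw Kimi Claude

Option 1: Install SkillHub and the skill

Help me install SkillHub and the doc-to-text-1775985262 skill

Option 2: Set SkillHub as the preferred skill installation source

Set SkillHub as my preferred skill installation source, then help me install the doc-to-text-1775985262 skill

Install via command line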

skillhub install doc-to-text-1775985262

Download Zip package

⬇ Download doc-to-text v0.4.0

File size: 1.95 KB | Published: 2026-4-13 10:04

v0.4.0 (latest), 2026-4-13 10:04
SEO: expand description for better ClawHub vector search discovery
