
web-search-scraper-api-skill

This skill helps users automatically extract complete Markdown content from any website via the BrowserAct Web Search Scraper API. The Agent should proactively apply this skill when users express needs such as: extract complete markdown from a specific website, scrape the content of an article link, get the text from a target URL, convert a webpage to markdown format, fetch the main content of a blog post, extract data from a given web page, parse the HTML of a website into markdown, or download the readable text of a page.

Author: admin | Source: ClawHub
Version: V 1.0.0
Security check: Passed
Downloads: 126
Favorites: 0


# Web Search Scraper API Skill

## 📖 Introduction

This skill provides a one-stop web-page extraction service through the BrowserAct Web Search Scraper API template. It extracts structured Markdown content directly from any given URL: simply input the target URL and you get clean, usable Markdown data.

## ✨ Features

1. **No hallucinations, ensuring stable and precise data extraction**: pre-set workflows avoid AI generative hallucinations.
2. **No human-machine verification issues**: no need to deal with reCAPTCHA or other verification challenges.
3. **No IP access restrictions or geofencing**: no need to handle regional IP limitations.
4. **Faster execution**: compared with purely AI-driven browser-automation solutions, tasks complete more quickly.
5. **High cost-effectiveness**: compared with AI solutions that consume many tokens, it can significantly reduce the cost of data acquisition.

## 🔑 API Key Guidance

Before running, you must check the `BROWSERACT_API_KEY` environment variable. If it is not set, take no other action; ask the user to provide it and wait. **The Agent must inform the user at this point**:

> "Since you have not configured the BrowserAct API Key, please go to the [BrowserAct Console](https://www.browseract.com/reception/integrations) first to get your Key."

## 🛠️ Input Parameters

The Agent should configure the following parameters based on user needs when calling the script:

1. **target_url**
   - **Type**: `string`
   - **Description**: the website URL to extract content from. Any HTTP/HTTPS URL is supported.
   - **Example**: `https://www.browseract.com`

## 🚀 Invocation Method (Recommended)

The Agent should execute the following standalone script to get the result with one command:

```bash
# Example invocation
python -u ./scripts/web_search_scraper_api.py "target_url"
```

### ⏳ Execution Status Monitoring

Because the task drives an automated browser, it may take several minutes. While running, the script **continuously prints timestamped status logs** (e.g., `[14:30:05] Task Status: running`).

**Notice for the Agent**:

- While waiting for the script to return a result, keep watching the terminal output.
- As long as new status logs keep appearing, the task is running normally; do not misjudge it as a deadlock or unresponsiveness.
- If the status stays unchanged for a long time, or the script stops printing output without returning a result, the retry mechanism can be triggered.

## 📊 Data Output

On success, the script parses and prints the result from the API response. The result contains:

- `content`: the complete Markdown content of the webpage.

## ⚠️ Error Handling & Retry Mechanism

If an error occurs during execution (such as a network fluctuation or task failure), the Agent should follow this logic:

1. **Check the output**:
   - If the output **contains** `"Invalid authorization"`, the API Key is invalid or expired. **Do not retry**; guide the user to re-check and provide the correct API Key.
   - If the output **does not contain** `"Invalid authorization"` but the task fails (for example, the output starts with `Error:`, or the returned result is empty), the Agent should **automatically re-run the script once**.
2. **Retry limit**: automatic retry is limited to **once**. If the second attempt also fails, stop retrying and report the specific error message to the user.
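The API-key check described above can be sketched as a small helper. This is a minimal illustration, not part of the skill's shipped script: the environment variable name and Console URL come from the skill, while the function name `require_api_key` is hypothetical.

```python
import os


def require_api_key() -> str:
    """Return the BrowserAct API key, or raise with the skill's guidance message."""
    key = os.environ.get("BROWSERACT_API_KEY", "").strip()
    if not key:
        # Per the skill: take no other action; ask the user for the key first.
        raise RuntimeError(
            "BROWSERACT_API_KEY is not set. Please get your Key from the "
            "BrowserAct Console: https://www.browseract.com/reception/integrations"
        )
    return key
```

An Agent would call this once before any invocation and surface the error text to the user verbatim when it raises.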
## 🌟 Typical Use Cases

1. **Article Extraction**: scrape the main content of a news-article link into Markdown.
2. **Blog Post Parsing**: download the readable text from a target blog-post URL.
3. **Webpage to Markdown**: convert any given website URL into clean Markdown format.
4. **Documentation Scraping**: fetch the contents of a tutorial or documentation page for offline reading.
5. **Content Monitoring**: automatically extract the text from a specific webpage to track updates.
6. **Data Processing**: parse the HTML of an arbitrary HTTP/HTTPS URL to structure its content.
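The error-handling and retry policy documented above (retry once, but never on `"Invalid authorization"`) can be sketched as a wrapper around a single script invocation. This is an illustrative sketch: `run_with_retry` and the injected `run_once` callable are hypothetical names, and in practice `run_once` would shell out to `python -u ./scripts/web_search_scraper_api.py <url>` and return its combined output.

```python
from typing import Callable


def run_with_retry(run_once: Callable[[], str], max_attempts: int = 2) -> str:
    """Apply the skill's retry policy around one script invocation.

    run_once() returns the script's output, or raises on an execution error.
    max_attempts=2 encodes "automatic retry is limited to once".
    """
    last_error = "no output"
    for attempt in range(1, max_attempts + 1):
        try:
            output = run_once()
        except Exception as exc:  # e.g. network fluctuation
            last_error = str(exc)
            continue
        if "Invalid authorization" in output:
            # Invalid/expired API key: never retry; ask the user instead.
            raise RuntimeError("API Key invalid or expired; ask the user to re-check it.")
        if output.strip() and not output.startswith("Error:"):
            return output
        last_error = output.strip() or "empty result"
    raise RuntimeError(f"Task failed after {max_attempts} attempts: {last_error}")
```

Keeping the policy separate from the subprocess call makes the retry rules easy to verify independently of BrowserAct itself.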

Tags

skill ai

Install via Conversation

This skill supports installation via conversation on the following platforms:

OpenClaw WorkBuddy QClaw Kimi Claude

Method 1: Install SkillHub and the skill

Help me install SkillHub and the web-search-scraper-api-skill-1776122128 skill

Method 2: Set SkillHub as the preferred skill installation source

Set SkillHub as my preferred skill installation source, then help me install the web-search-scraper-api-skill-1776122128 skill

Install via Command Line

skillhub install web-search-scraper-api-skill-1776122128

Download Zip Package

⬇ Download web-search-scraper-api-skill v1.0.0

File size: 4.56 KB | Published: 2026-4-14 10:15

v1.0.0 (latest) 2026-4-14 10:15
- Initial release of the Web Search Scraper API Skill.
- Enables automatic extraction of complete markdown content from any website via BrowserAct API.
- Offers guidance for API key setup and error handling.
- Includes clear input parameter and usage instructions.
- Implements execution status monitoring and retry logic for robust data extraction.
- Supports diverse use cases such as article scraping, documentation fetching, and webpage-to-markdown conversion.
