filechat-rag

Author: admin | Source: ClawHub
Version: V 1.0.4
Security check: Passed
Downloads: 136
Favorites: 0

# FileChat RAG Skill

Your personal RAG (Retrieval-Augmented Generation) document library backed by Google Drive. Supports multiple Google Drive folders dynamically, interactive folder routing, incremental sync, a choice of Gemini or OpenAI for embeddings, and an optional Qdrant connection.

## Setup & Bootstrap

FIRST verify that the required environment variables are set in `/workspace/skills/filechat/.env`:

1. `EMBEDDING_PROVIDER` (either `gemini` or `openai`)
2. `GEMINI_API_KEY` or `OPENAI_API_KEY`
3. Optional: `QDRANT_URL` and `QDRANT_API_KEY` (if absent, the skill uses local disk-based JSON storage).

Create the `.env` file like this:

```bash
echo "EMBEDDING_PROVIDER=gemini" > ./skills/filechat/.env
echo "GEMINI_API_KEY=your_key_here" >> ./skills/filechat/.env
```

**Google Workspace Authentication:** Before running any commands, check whether the system is authenticated by running:

```bash
npx @googleworkspace/cli auth status
```

If it returns an auth error or indicates no token, you MUST prompt the user to authenticate. Trigger the interactive login flow:

```bash
npx @googleworkspace/cli auth login --services drive
```

Wait for the user to complete the browser OAuth flow before proceeding.

## Folder Management

The user can sync any number of folders. You manage them using `folders.js`.

- **List Folders:** `cd ./skills/filechat && node folders.js list`
- **Add a Folder:** `node folders.js add "Taxes 2026" <FOLDER_ID>` (Auto-discovers the ID via `gws drive files list` if you don't know it!)
- **Set Default Folder:** `node folders.js default "Taxes 2026"`

If the user asks to do something with a file/folder but doesn't specify which one, run `node folders.js get-default` to find the default ID. If no folders exist, ask them to set one up!

## How to Sync the Library

When the user asks to "sync", "flush", or "update", you must run the ingestion script.

To sync a specific folder:

```bash
cd ./skills/filechat && node sync.js <FOLDER_ID>
```

To sync EVERYTHING (all folders in the registry):

```bash
cd ./skills/filechat && node sync-all.js
```

*Note: Syncs are incremental and use a local cache. If a file hasn't been modified in Drive, the script skips it immediately and reports "0 chunks" embedded. This is NORMAL behavior. If you are debugging, testing, or the user specifically requests a hard flush, you MUST delete the cache files first:*

```bash
rm ./skills/filechat/meta_<FOLDER_ID>.json
rm ./skills/filechat/vector_db_<FOLDER_ID>.json
```

## How to Answer User Questions (RAG)

Query the local vector store or Qdrant for the target Folder ID to fetch relevant text chunks:

```bash
cd ./skills/filechat && node query.js <FOLDER_ID> "What does my medical discharge say?"
```

Use the snippets returned to answer the user.

## How to Retrieve and Send a Physical File

Find the `File ID` using the query script, then download it:

```bash
gws drive files get --params '{"fileId": "<FILE_ID>", "alt": "media"}' --output /workspace/discharge.pdf
```

Reply using the media tag: `MEDIA:/workspace/discharge.pdf`.

## How to Store a New File for the User

If the user uploads a file and asks you to save it (or implicitly sends a file per your automatic processing rules):

1. Check their folders (`node folders.js list`).
2. If they didn't specify which folder, use the default folder. If no default is set, ask them!
3. **Notify the user** exactly which folder the file is being saved to.
4. **Tell the user** that you are now extracting the information and saving it in a vectordb.
5. If the file is an image or scanned document, make sure the text is extracted with a vision model or OCR before it is embedded. (The sync script handles this natively.)
6. Upload it to the correct folder using `gws`:

   ```bash
   gws drive files create --json '{"name": "filename.pdf", "parents": ["<FOLDER_ID>"]}' --upload /path/to/uploaded/file.pdf
   ```

7. Trigger `node sync.js <FOLDER_ID>` so the file is chunked and embedded into the corresponding vectordb.

## How to Test & Validate the Skill

If the user asks you to verify the skill is working, or if you just set it up and want to ensure end-to-end functionality, follow these exact steps:

1. **Verify Auth:** Run `npx @googleworkspace/cli auth status`. Ensure it shows a valid token.
2. **Verify Drive Access:** Do a dry-run fetch of the target folder to ensure GWS can see the files.

   ```bash
   npx @googleworkspace/cli drive files list --params '{"q": "'\''<FOLDER_ID>'\'' in parents and trashed = false"}'
   ```

   *(If this fails, check folder permissions or GWS credentials.)*
3. **Force a Clean Sync:** Clear the cache for the test folder to guarantee a fresh run, then sync.

   ```bash
   rm -f ./skills/filechat/meta_<FOLDER_ID>.json ./skills/filechat/vector_db_<FOLDER_ID>.json
   node ./skills/filechat/sync.js <FOLDER_ID>
   ```

   *(You should see files being downloaded, OCR'd, and chunks being embedded. If it says "0 chunks", verify the folder isn't empty.)*
4. **Test the Vector Query:** Run a generic query to verify the embeddings were saved and cosine similarity works.

   ```bash
   node ./skills/filechat/query.js <FOLDER_ID> "hello"
   ```

   *(You should see a list of "Top matches" with similarity scores and text snippets. If you do, the RAG pipeline is fully operational.)*
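The "Top matches" ranking mentioned in the validation steps relies on cosine similarity between a query embedding and the stored chunk embeddings. The skill's actual `query.js` is not shown here, so the following is only a minimal sketch of that idea. The chunk shape (`fileId`, `text`, `embedding`) and the function names `cosineSimilarity` and `topMatches` are assumptions made for illustration, and toy 3-dimensional vectors stand in for real embeddings.

```javascript
// Hypothetical sketch: NOT the skill's actual query.js.
// Assumes each stored chunk looks like { fileId, text, embedding: number[] }.

// Cosine similarity: dot(a, b) / (|a| * |b|).
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error("Embedding dimensions differ");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every chunk against the query and keep the k best.
function topMatches(queryEmbedding, chunks, k = 3) {
  return chunks
    .map((chunk) => ({
      fileId: chunk.fileId,
      text: chunk.text,
      score: cosineSimilarity(queryEmbedding, chunk.embedding),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Toy data standing in for a per-folder vector store.
const chunks = [
  { fileId: "file-a", text: "Discharge summary", embedding: [1, 0, 0] },
  { fileId: "file-b", text: "Tax receipt", embedding: [0, 1, 0] },
  { fileId: "file-c", text: "Follow-up visit", embedding: [1, 1, 0] },
];

const matches = topMatches([1, 0, 0], chunks, 2);
console.log(matches);
```

With this toy data the two best matches are `file-a` (score 1) and `file-c` (score about 0.707). A brute-force scan like this is reasonable for a small local JSON store; a dedicated vector database such as Qdrant performs the equivalent search server-side.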
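The sync note and the "Force a Clean Sync" step both depend on the incremental cache: unchanged files are skipped, and deleting the cache forces everything to be processed again. The sketch below illustrates one way such a check could work. It is not the skill's `sync.js`; the cache layout (a map from Drive file ID to the last seen `modifiedTime`) and the helper names `filesNeedingSync` and `updateCache` are assumptions.

```javascript
// Hypothetical sketch: NOT the skill's actual sync.js.
// Assumes the cache maps each Drive file ID to the modifiedTime seen at the last sync.

// Keep only files that are new or whose modifiedTime changed since the last sync.
function filesNeedingSync(driveFiles, cache) {
  return driveFiles.filter((file) => cache[file.id] !== file.modifiedTime);
}

// Return a new cache that records the files just processed.
function updateCache(cache, syncedFiles) {
  const next = { ...cache };
  for (const file of syncedFiles) {
    next[file.id] = file.modifiedTime;
  }
  return next;
}

const driveFiles = [
  { id: "file-a", modifiedTime: "2026-04-01T10:00:00Z" },
  { id: "file-b", modifiedTime: "2026-04-10T08:30:00Z" },
];

// First run: the cache is empty, so every file is processed.
const firstRun = filesNeedingSync(driveFiles, {});
const cacheAfterFirstRun = updateCache({}, firstRun);

// Second run with no changes in Drive: nothing is left to embed.
const secondRun = filesNeedingSync(driveFiles, cacheAfterFirstRun);

// Deleting the cache (a hard flush) is equivalent to starting from {} again.
console.log(firstRun.length, secondRun.length); // prints: 2 0
```

Under these assumptions, a second run that reports nothing to embed is the expected outcome, which matches the "0 chunks" behavior described in the sync note.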

Tags: skill, ai

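Install via chat

This skill can be installed via chat on the following platforms:

OpenClaw, WorkBuddy, QClaw, Kimi, Claude

Option 1: Install SkillHub and the skill

Help me install SkillHub and the filechat-rag-1776078373 skill

Option 2: Set SkillHub as the preferred skill installation source

Set SkillHub as my preferred skill installation source, then help me install the filechat-rag-1776078373 skill

Install via command line

skillhub install filechat-rag-1776078373

Download Zip package

Download filechat-rag v1.0.4

File size: 61.2 KB | Published: 2026-4-14 14:03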

v1.0.4 (latest), 2026-4-14 14:03
- Added developer tests and setup files for validation and automated testing (`tests/credentials.json`, `tests/setup.js`, `tests/skill.test.js`).
- Added `_meta.json` to support additional skill metadata or configuration.
- Removed an obsolete or auto-generated metadata file.
- Updated documentation with a new section on Google Workspace authentication, including how to check and prompt for Drive OAuth before usage.
- Expanded documentation with a full end-to-end testing and validation workflow for setup, authentication, cache cleaning, and RAG functionality.
