qwen-ai

## 0. First Contact When the user opens this skill or sends their first message, **greet them immediately**: > 💬 Ready to qwen ai! Just send me a video or describe your project. **Try saying:** - "edit my video" - "add effects to this clip" - "help me create a short video" **IMPORTANT**: Always greet the user proactively on first contact. Let them know you're setting up while connecting. Always greet the user proactively on first contact. ### Auto-Setup When the user first interacts, set up the connection: 1. **Check token**: If `NEMO_TOKEN` env var is set, use it. Otherwise: 2. **Read or generate Client-ID**: - Read `~/.config/nemovideo/client_id` if it exists - Otherwise generate a UUID, save it to `~/.config/nemovideo/client_id` 3. **Acquire anonymous token**: ```bash curl -s -X POST "$API/api/auth/anonymous-token" -H "X-Client-Id: $CLIENT_ID" ``` Store the returned `token` as `NEMO_TOKEN` for this session. You get 100 free credits. 4. **Create a session** (§3.0) so you're ready to work immediately. Let the user know briefly: "Setting things up… ready!" then proceed with their request. # Ask Your Videos Anything With Qwen AI The qwen-ai skill brings Alibaba's Qwen multimodal model into your video workflow, letting you interact with video content the way you'd chat with a knowledgeable colleague. Rather than scrubbing through footage manually, you simply describe what you're looking for — a product mention, a scene transition, a speaker's argument — and qwen-ai surfaces it with context and precision. Under the hood, the OpenClaw agent orchestrates the exchange between your uploaded video and the Qwen model. It handles frame sampling, audio transcription alignment, and prompt construction so the model receives the right context at the right granularity. This means you get answers that are grounded in actual video content, not hallucinated summaries. Whether you're a researcher tagging interview footage, a marketer reviewing campaign reels, or a developer building video-aware pipelines, qwen-ai adapts to your query style. You can ask broad questions like 'What is this video about' or precise ones like 'At what point does the presenter mention pricing' The skill handles mp4, mov, avi, webm, and mkv files, making it compatible with virtually any production or consumer video source. ### Environment Variables | Variable | Required | Default | |----------|----------|--------| | `NEMO_TOKEN` | No | Auto-generated (100 free credits, expires in 7 days, revocable via Settings → API Tokens) | | `NEMO_API_URL` | No | `https://mega-api-prod.nemovideo.ai` | | `NEMO_WEB_URL` | No | `https://nemovideo.com` | | `NEMO_CLIENT_ID` | No | Auto-generated UUID, persisted to `~/.config/nemovideo/client_id` (UUID only, no secrets) | | `SKILL_SOURCE` | No | Auto-detected from install path, fallback `unknown` | If `NEMO_TOKEN` is not set, get one (requires `X-Client-Id` header): ```bash # Generate or read persisted Client-Id CLIENT_ID="${NEMO_CLIENT_ID:-$(cat ~/.config/nemovideo/client_id 2>/dev/null)}" if [ -z "$CLIENT_ID" ]; then CLIENT_ID=$(uuidgen 2>/dev/null || echo "client-$(date +%s)-$RANDOM") mkdir -p ~/.config/nemovideo & echo "$CLIENT_ID" > ~/.config/nemovideo/client_id fi curl -s -X POST "$API/api/auth/anonymous-token" -H "X-Client-Id: $CLIENT_ID" # → {"code":0,"data":{"token":"nmv_usr_xxx","credits":100,...}} ``` Save `token` as `NEMO_TOKEN`, `CLIENT_ID` as `NEMO_CLIENT_ID`. Anonymous: 1 token per client per 7 days; token expires in 7 days and can be revoked at any time via **Settings → API Tokens** on nemovideo.com. If your token expires, request a new one with the same `X-Client-Id`. **Local persistence:** This skill writes `~/.config/nemovideo/client_id` to persist the Client-Id across sessions. This avoids generating a new ID on every request, which would hit the per-IP rate limit quickly (default 10 tokens per 7 days per IP). The file contains only a UUID — no credentials are stored locally. ## 2. Routing Logic for Incoming Requests Use the following table to determine which endpoint each type of user request should be directed to. | User says... | Action | Skip SSE | |-------------|--------|----------| | "export" / "导出" / "download" / "send me the video" | → §3.5 Export | ✅ | | "credits" / "积分" / "balance" / "余额" | → §3.3 Credits | ✅ | | "status" / "状态" / "show tracks" | → §3.4 State | ✅ | | "upload" / "上传" / user sends file | → §3.2 Upload | ✅ | | Everything else (generate, edit, add BGM…) | → §3.1 SSE | ❌ | ## 3. Primary Operational Flows $API = `${NEMO_API_URL:-https://mega-api-prod.nemovideo.ai}`, $TOKEN = `${NEMO_TOKEN}`, $WEB = `${NEMO_WEB_URL:-https://nemovideo.com}`. Skill attribution variables — do NOT hardcode: - `$SKILL_NAME` = this file's YAML frontmatter `name` - `$SKILL_VERSION` = this file's YAML frontmatter `version` - `$SKILL_SOURCE` = detect at runtime using the first match: 1. Env `SKILL_SOURCE` if set 2. If this file's path contains a known platform directory (e.g. `~/.clawhub/` → `clawhub`, `~/.cursor/skills/` → `cursor`) 3. Fallback: `unknown` **CRITICAL**: ALL API requests (including render/export/upload/state/credits) MUST include these headers. Missing them will cause export to fail with 402. ``` X-Skill-Source: $SKILL_NAME X-Skill-Version: $SKILL_VERSION X-Skill-Platform: $SKILL_SOURCE ``` ### 3.0 Initializing a New Session ```bash curl -s -X POST "$API/api/tasks/me/with-session/nemo_agent" \ -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \ -H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE" \ -d '{"task_name":"project","language":"<lang>"}' # → {"code":0,"data":{"task_id":"...","session_id":"..."}} ``` Before any interaction can begin, a session must be established with the backend. This session identifier is required for all subsequent requests within the conversation. **Open in browser**: After creating a session, give the user a link to view/edit the task in NemoVideo: `$WEB/workspace/claim?task={task_id}&session={session_id}&skill_name=$SKILL_NAME&skill_version=$SKILL_VERSION&skill_source=$SKILL_SOURCE` ### 3.1 Delivering Messages Over SSE ```bash curl -s -X POST "$API/run_sse" \ -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \ -H "Accept: text/event-stream" -H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE" --max-time 900 \ -d '{"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}}' ``` All chat messages are transmitted to the backend and responses are streamed back to the client using Server-Sent Events. #### SSE Handling | Event | Action | |-------|--------| | Text response | Apply GUI translation (§4), present to user | | Tool call/result | Wait silently, don't forward | | `heartbeat` / empty `data:` | Keep waiting. Every 2 min: "⏳ Still working..." | | Stream closes | Process final response | Typical durations: text 5-15s, video generation 100-300s, editing 10-30s. **Timeout**: 10 min heartbeats-only → assume timeout. **Never re-send** during generation (duplicates + double-charge). Ignore trailing "I encountered a temporary issue" if prior responses were normal. #### Silent Response Fallback (CRITICAL) Approximately 30% of edit-type requests will result in the backend returning no visible text content. When this occurs: (1) do not treat the empty response as an error, (2) poll the session state endpoint to confirm the edit was accepted, (3) synthesize a brief confirmation message for the user based on the returned state, and (4) proceed with the next step in the flow as normal. **Two-stage generation**: When a raw video is submitted, the backend automatically initiates a two-stage enhancement pipeline. In the first stage, the unprocessed video is analyzed and accepted. In the second stage, the system independently appends background music and generates a title without any additional prompt from the user. Both stages must complete before the session state is considered ready for export. ### 3.2 Handling File Uploads **File upload**: `curl -s -X POST "$API/api/upload-video/nemo_agent/me/<sid>" -H "Authorization: Bearer $TOKEN" -H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE" -F "files=@/path/to/file"` **URL upload**: `curl -s -X POST "$API/api/upload-video/nemo_agent/me/<sid>" -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" -H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE" -d '{"urls":["<url>"],"source_type":"url"}'` Use **me** in the path; backend resolves user from token. Supported: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac. The upload endpoint accepts video and image files, returning a reference identifier that must be included in subsequent message payloads. ### 3.3 Checking Available Credits ```bash curl -s "$API/api/credits/balance/simple" -H "Authorization: Bearer $TOKEN" \ -H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE" # → {"code":0,"data":{"available":XXX,"frozen":XX,"total":XXX}} ``` Query the credits endpoint before initiating any generation task to confirm the user has a sufficient balance to proceed. ### 3.4 Retrieving Current Session State ```bash curl -s "$API/api/state/nemo_agent/me/<sid>/latest" -H "Authorization: Bearer $TOKEN" \ -H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE" ``` Use **me** for user in path; backend resolves from token. Key fields: `data.state.draft`, `data.state.video_infos`, `data.state.canvas_config`, `data.state.generated_media`. **Draft field mapping**: `t`=tracks, `tt`=track type (0=video, 1=audio, 7=text), `sg`=segments, `d`=duration(ms), `m`=metadata. **Draft ready for export** when `draft.t` exists with at least one track with non-empty `sg`. **Track summary format**: ``` Timeline (3 tracks): 1. Video: city timelapse (0-10s) 2. BGM: Lo-fi (0-10s, 35%) 3. Title: "Urban Dreams" (0-3s) ``` ### 3.5 Exporting and Delivering the Final Output **Export does NOT cost credits.** Only generation/editing consumes credits. Triggering an export does not deduct from the user's credit balance. The export flow proceeds as follows: (a) confirm the session state shows a completed edit, (b) call the export endpoint with the session identifier, (c) poll for the export job status until it resolves, (d) retrieve the download URL from the completed job response, and (e) present the URL to the user with a clear download prompt. **b)** Submit: `curl -s -X POST "$API/api/render/proxy/lambda" -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" -H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE" -d '{"id":"render_<ts>","sessionId":"<sid>","draft":<json>,"output":{"format":"mp4","quality":"high"}}'` Note: `sessionId` is **camelCase** (exception). On failure → new `id`, retry once. **c)** Poll (every 30s, max 10 polls): `curl -s "$API/api/render/proxy/lambda/<id>" -H "Authorization: Bearer $TOKEN" -H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE"` Status at top-level `status`: pending → processing → completed / failed. Download URL at `output.url`. **d)** Download from `output.url` → send to user. Fallback: `$API/api/render/proxy/<id>/download`. **e)** When delivering the video, **always also give the task detail link**: `$WEB/workspace/claim?task={task_id}&session={session_id}&skill_name=$SKILL_NAME&skill_version=$SKILL_VERSION&skill_source=$SKILL_SOURCE` Progress messages: start "⏳ Rendering ~30s" → "⏳ 50%" → "✅ Video ready!" + file + **task detail link**. ### 3.6 Recovering from SSE Disconnections If the SSE stream drops unexpectedly, follow these recovery steps: (1) detect the disconnection event and pause any UI updates immediately, (2) wait a minimum of two seconds before attempting to reconnect to avoid rapid retry loops, (3) re-establish the SSE connection using the existing session identifier rather than creating a new session, (4) request the current session state to reconcile any events that may have been missed during the outage, and (5) resume normal operation and notify the user only if the interruption lasted long enough to affect their experience. ## 4. Translating Backend GUI References The backend is designed around a graphical interface and will occasionally reference UI elements in its responses — these references must never be passed through to the user as-is. | Backend says | You do | |-------------|--------| | "click [button]" / "点击" | Execute via API | | "open [panel]" / "打开" | Show state via §3.4 | | "drag/drop" / "拖拽" | Send edit via SSE | | "preview in timeline" | Show track summary | | "Export button" / "导出" | Execute §3.5 | | "check account/billing" | Check §3.3 | **Keep** content descriptions. **Strip** GUI actions. ## 5. Recommended Interaction Patterns • Always confirm a session is active before forwarding any user message to the backend. • When the user's intent involves editing, check the credit balance first and surface any shortfall before the request is submitted. • After each backend response, validate the session state and use it to inform the next conversational turn rather than relying solely on the streamed text. • Translate all GUI-specific language from backend responses into plain, action-oriented instructions before presenting them to the user. • If a two-stage processing pipeline is detected, keep the user informed of progress between stages rather than waiting silently for final completion. ## 6. Known Constraints and Limitations • The backend does not support real-time collaborative sessions; only one active session per user token is permitted at a time. • File uploads are subject to size and format restrictions defined by the upload endpoint; exceeding these limits will return a validation error. • Credit balances are read-only through the API and cannot be topped up programmatically. • SSE streams may silently drop under poor network conditions; the disconnect recovery flow must always be implemented. • Export URLs are time-limited and will expire after the period specified in the job response; users should be advised to download promptly. ## 7. Error Handling Reference The table below maps common HTTP error codes returned by the backend to their likely causes and the recommended recovery action for each. | Code | Meaning | Action | |------|---------|--------| | 0 | Success | Continue | | 1001 | Bad/expired token | Re-auth via anonymous-token (tokens expire after 7 days) | | 1002 | Session not found | New session §3.0 | | 2001 | No credits | Anonymous: show registration URL with `?bind=<id>` (get `<id>` from create-session or state response when needed). Registered: "Top up at nemovideo.ai" | | 4001 | Unsupported file | Show supported formats | | 4002 | File too large | Suggest compress/trim | | 400 | Missing X-Client-Id | Generate Client-Id and retry (see §1) | | 402 | Free plan export blocked | Subscription tier issue, NOT credits. "Register at nemovideo.ai to unlock export." | | 429 | Rate limit (1 token/client/7 days) | Retry in 30s once | **Common**: no video → generate first; render fail → retry new `id`; SSE timeout → §3.6; silent edit → §3.1 fallback. ## 8. API Version and Token Scopes Before going live, verify that the integration is targeting the correct API version by inspecting the version field in the root endpoint response. Token scopes must include read access for session state and credits, write access for messages and uploads, and export access for the delivery endpoint. Tokens missing any required scope will receive a 403 response on the affected endpoint. If a version mismatch is detected, halt further requests and surface an actionable error to the operator.

qwen-ai

标签

通过对话安装

方式一：安装 SkillHub 和技能

方式二：设置 SkillHub 为优先技能安装源

通过命令行安装

下载 Zip 包

qwen-ai

qwen-ai

标签

通过对话安装

方式一：安装 SkillHub 和技能

方式二：设置 SkillHub 为优先技能安装源

通过命令行安装

下载 Zip 包

相关推荐

self-improvement

self-improvement

self-improvement

self-improvement