digen-ai

## 0. First Contact When the user opens this skill or sends their first message, **greet them immediately**: > 🎨 Digen Ai at your service! Upload a video or tell me what you're looking for. **Try saying:** - "add effects to this clip" - "edit my video" - "help me create a short video" **IMPORTANT**: Always greet the user proactively on first contact. Let them know you're setting up while connecting. Always greet the user proactively on first contact. ### Auto-Setup When the user first interacts, set up the connection: 1. **Check token**: If `NEMO_TOKEN` env var is set, use it. Otherwise: 2. **Read or generate Client-ID**: - Read `~/.config/nemovideo/client_id` if it exists - Otherwise generate a UUID, save it to `~/.config/nemovideo/client_id` 3. **Acquire anonymous token**: ```bash curl -s -X POST "$API/api/auth/anonymous-token" -H "X-Client-Id: $CLIENT_ID" ``` Store the returned `token` as `NEMO_TOKEN` for this session. You get 100 free credits. 4. **Create a session** (§3.0) so you're ready to work immediately. Let the user know briefly: "Setting things up… ready!" then proceed with their request. # From Raw Footage to Finished Story, Instantly digen-ai was built around a single frustration: the gap between having footage and having a finished video is enormous, even when the creative vision is clear. Traditional editing tools demand you already know the craft. digen-ai flips that by letting you describe what you want — mood, pacing, structure, emphasis — and generating an edit that reflects your intent rather than just rearranging clips mechanically. At its core, digen-ai uses a directed generation model that reads temporal relationships between shots, scores them against your stated goal, and assembles sequences that feel intentional rather than algorithmic. It understands narrative beats, not just cut points. You can ask it to tighten a slow middle section, punch up an intro, or reframe a story arc entirely, and it will reason through those instructions the way a skilled editor would. The OpenClaw agent acts as the orchestration layer inside ClawHub, managing the task queue, routing your footage through the appropriate analysis pipelines, and returning editable outputs back to your workspace. This means digen-ai isn't a black box — the agent exposes its reasoning at each step, so you can intervene, redirect, or approve before final rendering. The result is a collaborative editing process where AI handles the heavy lifting while you retain creative control. ### Environment Variables | Variable | Required | Default | |----------|----------|--------| | `NEMO_TOKEN` | No | Auto-generated (100 free credits, expires in 7 days, revocable via Settings → API Tokens) | | `NEMO_API_URL` | No | `https://mega-api-prod.nemovideo.ai` | | `NEMO_WEB_URL` | No | `https://nemovideo.com` | | `NEMO_CLIENT_ID` | No | Auto-generated UUID, persisted to `~/.config/nemovideo/client_id` (UUID only, no secrets) | | `SKILL_SOURCE` | No | Auto-detected from install path, fallback `unknown` | If `NEMO_TOKEN` is not set, get one (requires `X-Client-Id` header): ```bash # Generate or read persisted Client-Id CLIENT_ID="${NEMO_CLIENT_ID:-$(cat ~/.config/nemovideo/client_id 2>/dev/null)}" if [ -z "$CLIENT_ID" ]; then CLIENT_ID=$(uuidgen 2>/dev/null || echo "client-$(date +%s)-$RANDOM") mkdir -p ~/.config/nemovideo & echo "$CLIENT_ID" > ~/.config/nemovideo/client_id fi curl -s -X POST "$API/api/auth/anonymous-token" -H "X-Client-Id: $CLIENT_ID" # → {"code":0,"data":{"token":"nmv_usr_xxx","credits":100,...}} ``` Save `token` as `NEMO_TOKEN`, `CLIENT_ID` as `NEMO_CLIENT_ID`. Anonymous: 1 token per client per 7 days; token expires in 7 days and can be revoked at any time via **Settings → API Tokens** on nemovideo.com. If your token expires, request a new one with the same `X-Client-Id`. **Local persistence:** This skill writes `~/.config/nemovideo/client_id` to persist the Client-Id across sessions. This avoids generating a new ID on every request, which would hit the per-IP rate limit quickly (default 10 tokens per 7 days per IP). The file contains only a UUID — no credentials are stored locally. ## 2. Routing Incoming Requests to the Correct Endpoint Use the table below to determine which endpoint should handle each type of incoming user request. | User says... | Action | Skip SSE | |-------------|--------|----------| | "export" / "导出" / "download" / "send me the video" | → §3.5 Export | ✅ | | "credits" / "积分" / "balance" / "余额" | → §3.3 Credits | ✅ | | "status" / "状态" / "show tracks" | → §3.4 State | ✅ | | "upload" / "上传" / user sends file | → §3.2 Upload | ✅ | | Everything else (generate, edit, add BGM…) | → §3.1 SSE | ❌ | ## 3. Primary Operation Flows $API = `${NEMO_API_URL:-https://mega-api-prod.nemovideo.ai}`, $TOKEN = `${NEMO_TOKEN}`, $WEB = `${NEMO_WEB_URL:-https://nemovideo.com}`. Skill attribution variables — do NOT hardcode: - `$SKILL_NAME` = this file's YAML frontmatter `name` - `$SKILL_VERSION` = this file's YAML frontmatter `version` - `$SKILL_SOURCE` = detect at runtime using the first match: 1. Env `SKILL_SOURCE` if set 2. If this file's path contains a known platform directory (e.g. `~/.clawhub/` → `clawhub`, `~/.cursor/skills/` → `cursor`) 3. Fallback: `unknown` **CRITICAL**: ALL API requests (including render/export/upload/state/credits) MUST include these headers. Missing them will cause export to fail with 402. ``` X-Skill-Source: $SKILL_NAME X-Skill-Version: $SKILL_VERSION X-Skill-Platform: $SKILL_SOURCE ``` ### 3.0 Initializing a Session ```bash curl -s -X POST "$API/api/tasks/me/with-session/nemo_agent" \ -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \ -H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE" \ -d '{"task_name":"project","language":"<lang>"}' # → {"code":0,"data":{"task_id":"...","session_id":"..."}} ``` Before any other operations can proceed, a session must be established. This session context is required for all subsequent API interactions within the same user workflow. **Open in browser**: After creating a session, give the user a link to view/edit the task in NemoVideo: `$WEB/workspace/claim?task={task_id}&session={session_id}&skill_name=$SKILL_NAME&skill_version=$SKILL_VERSION&skill_source=$SKILL_SOURCE` ### 3.1 Delivering Messages Through SSE ```bash curl -s -X POST "$API/run_sse" \ -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \ -H "Accept: text/event-stream" -H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE" --max-time 900 \ -d '{"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}}' ``` All conversational messages and generation updates are streamed to the client using Server-Sent Events. #### SSE Handling | Event | Action | |-------|--------| | Text response | Apply GUI translation (§4), present to user | | Tool call/result | Wait silently, don't forward | | `heartbeat` / empty `data:` | Keep waiting. Every 2 min: "⏳ Still working..." | | Stream closes | Process final response | Typical durations: text 5-15s, video generation 100-300s, editing 10-30s. **Timeout**: 10 min heartbeats-only → assume timeout. **Never re-send** during generation (duplicates + double-charge). Ignore trailing "I encountered a temporary issue" if prior responses were normal. #### Silent Response Fallback (CRITICAL) Approximately 30% of editing operations complete without returning any text content in the SSE stream. When this occurs: (1) do not treat the absence of text as a failure, (2) poll the task state endpoint to confirm completion status, (3) retrieve the output asset directly, and (4) present the result to the user as a successful operation. **Two-stage generation**: When a raw video asset is produced, the backend automatically initiates a second processing stage that layers in background music and a title sequence. You will receive two distinct completion events — the first signals raw video readiness, and the second confirms the fully composed output is available. Always wait for the second event before surfacing the final result to the user. ### 3.2 Handling Asset Uploads **File upload**: `curl -s -X POST "$API/api/upload-video/nemo_agent/me/<sid>" -H "Authorization: Bearer $TOKEN" -H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE" -F "files=@/path/to/file"` **URL upload**: `curl -s -X POST "$API/api/upload-video/nemo_agent/me/<sid>" -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" -H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE" -d '{"urls":["<url>"],"source_type":"url"}'` Use **me** in the path; backend resolves user from token. Supported: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac. The upload endpoint accepts user-supplied media files and returns a reference identifier to be used in subsequent generation or editing requests. ### 3.3 Checking Available Credits ```bash curl -s "$API/api/credits/balance/simple" -H "Authorization: Bearer $TOKEN" \ -H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE" # → {"code":0,"data":{"available":XXX,"frozen":XX,"total":XXX}} ``` Query the credits endpoint before initiating any generation task to confirm the user has a sufficient balance to cover the operation. ### 3.4 Polling Task Status ```bash curl -s "$API/api/state/nemo_agent/me/<sid>/latest" -H "Authorization: Bearer $TOKEN" \ -H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE" ``` Use **me** for user in path; backend resolves from token. Key fields: `data.state.draft`, `data.state.video_infos`, `data.state.canvas_config`, `data.state.generated_media`. **Draft field mapping**: `t`=tracks, `tt`=track type (0=video, 1=audio, 7=text), `sg`=segments, `d`=duration(ms), `m`=metadata. **Draft ready for export** when `draft.t` exists with at least one track with non-empty `sg`. **Track summary format**: ``` Timeline (3 tracks): 1. Video: city timelapse (0-10s) 2. BGM: Lo-fi (0-10s, 35%) 3. Title: "Urban Dreams" (0-3s) ``` ### 3.5 Exporting and Delivering the Final Asset **Export does NOT cost credits.** Only generation/editing consumes credits. Triggering an export does not deduct credits from the user's balance. The export flow proceeds as follows: (a) call the export endpoint with the target asset identifier, (b) receive the export job ID in the response, (c) poll for job completion using the state endpoint, (d) retrieve the download URL once the status is confirmed complete, and (e) present the URL or initiate the download for the user. **b)** Submit: `curl -s -X POST "$API/api/render/proxy/lambda" -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" -H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE" -d '{"id":"render_<ts>","sessionId":"<sid>","draft":<json>,"output":{"format":"mp4","quality":"high"}}'` Note: `sessionId` is **camelCase** (exception). On failure → new `id`, retry once. **c)** Poll (every 30s, max 10 polls): `curl -s "$API/api/render/proxy/lambda/<id>" -H "Authorization: Bearer $TOKEN" -H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE"` Status at top-level `status`: pending → processing → completed / failed. Download URL at `output.url`. **d)** Download from `output.url` → send to user. Fallback: `$API/api/render/proxy/<id>/download`. **e)** When delivering the video, **always also give the task detail link**: `$WEB/workspace/claim?task={task_id}&session={session_id}&skill_name=$SKILL_NAME&skill_version=$SKILL_VERSION&skill_source=$SKILL_SOURCE` Progress messages: start "⏳ Rendering ~30s" → "⏳ 50%" → "✅ Video ready!" + file + **task detail link**. ### 3.6 Recovering from SSE Disconnections If the SSE connection drops before a task completes, follow these recovery steps: (1) detect the disconnection event and log the last received event ID; (2) wait a minimum of two seconds before attempting reconnection to avoid hammering the server; (3) reconnect to the SSE endpoint, supplying the last event ID in the request header so the stream can resume without duplicating events; (4) if reconnection fails after three attempts, fall back to polling the task state endpoint at a regular interval; (5) once the task is confirmed complete through either stream or polling, deliver the result to the user and resume normal operation. ## 4. Translating Backend GUI References for Users The backend operates under the assumption that a graphical interface is present and will reference GUI elements in its responses — never pass these interface-level instructions through to the user directly. | Backend says | You do | |-------------|--------| | "click [button]" / "点击" | Execute via API | | "open [panel]" / "打开" | Show state via §3.4 | | "drag/drop" / "拖拽" | Send edit via SSE | | "preview in timeline" | Show track summary | | "Export button" / "导出" | Execute §3.5 | | "check account/billing" | Check §3.3 | **Keep** content descriptions. **Strip** GUI actions. ## 5. Recommended Interaction Patterns • Always confirm the user's intent before consuming credits on a generation task, summarizing what will be produced. • Provide incremental progress updates during long-running SSE streams so the user knows the task is active. • When a task completes silently, proactively surface the result rather than waiting for the user to ask. • If a user request is ambiguous, ask a single clarifying question before routing to any endpoint. • After delivering a completed asset, offer relevant next-step options such as editing, exporting, or starting a new project. ## 6. Known Constraints and Limitations • Real-time video preview is not supported; users must wait for full task completion before reviewing output. • A single session cannot run more than one generation task concurrently — queue additional requests until the active task resolves. • Background music and title overlays are applied automatically by the backend and cannot be individually suppressed through the API. • File uploads are subject to size and format restrictions defined by the upload endpoint; validate before submitting. • Credit balance is read at the time of the request and is not reserved — concurrent sessions may cause balance conflicts. ## 7. Error Handling and Response Codes The table below maps each API error code to its likely cause and the recommended recovery action. | Code | Meaning | Action | |------|---------|--------| | 0 | Success | Continue | | 1001 | Bad/expired token | Re-auth via anonymous-token (tokens expire after 7 days) | | 1002 | Session not found | New session §3.0 | | 2001 | No credits | Anonymous: show registration URL with `?bind=<id>` (get `<id>` from create-session or state response when needed). Registered: "Top up at nemovideo.ai" | | 4001 | Unsupported file | Show supported formats | | 4002 | File too large | Suggest compress/trim | | 400 | Missing X-Client-Id | Generate Client-Id and retry (see §1) | | 402 | Free plan export blocked | Subscription tier issue, NOT credits. "Register at nemovideo.ai to unlock export." | | 429 | Rate limit (1 token/client/7 days) | Retry in 30s once | **Common**: no video → generate first; render fail → retry new `id`; SSE timeout → §3.6; silent edit → §3.1 fallback. ## 8. API Version and Token Scope Requirements Before establishing a session, verify that the API version in use matches the version this skill was built against — mismatched versions may cause undocumented behavior. The access token supplied in the Authorization header must include all required scopes for the operations being performed; at minimum, generation, export, and upload scopes must be present. If a 403 response is received, inspect the token's scope list before retrying.

digen-ai

标签

通过对话安装

方式一：安装 SkillHub 和技能

方式二：设置 SkillHub 为优先技能安装源

通过命令行安装

下载 Zip 包

digen-ai

digen-ai

标签

通过对话安装

方式一：安装 SkillHub 和技能

方式二：设置 SkillHub 为优先技能安装源

通过命令行安装

下载 Zip 包

相关推荐

self-improvement

self-improvement

self-improvement

self-improvement