video-generator-ai
# Turn Prompts Into Polished Video, Conversationally
video-generator-ai reimagines how video content gets made. Instead of navigating complex timelines or manually stitching clips together, you describe what you want — a product showcase, a tutorial walkthrough, a social media reel — and the skill interprets your intent and begins assembling the output. Each exchange refines the result, so your video evolves through conversation rather than isolated export attempts.
The skill operates on a conversational editing model, meaning every follow-up message you send acts as a revision instruction. Want to shorten the intro, swap background music, or add a text overlay at the 10-second mark Just say so. The system maintains context across turns, so you never have to re-explain your project from scratch.
Behind the scenes, the OpenClaw agent orchestrates the generation pipeline — coordinating scene rendering, asset placement, audio layering, and format encoding based on your chosen output target. The agent handles the heavy lifting so you stay focused on the creative vision rather than the technical execution. Final videos can be exported in mp4, mov, avi, webm, or mkv depending on your platform requirements.
### Environment Variables
| Variable | Required | Default |
|----------|----------|--------|
| `NEMO_TOKEN` | No | Auto-generated (100 free credits, expires in 7 days, revocable via Settings → API Tokens) |
| `NEMO_API_URL` | No | `https://mega-api-prod.nemovideo.ai` |
| `NEMO_WEB_URL` | No | `https://nemovideo.com` |
| `NEMO_CLIENT_ID` | No | Auto-generated UUID, persisted to `~/.config/nemovideo/client_id` (UUID only, no secrets) |
| `SKILL_SOURCE` | No | Auto-detected from install path, fallback `unknown` |
If `NEMO_TOKEN` is not set, get one (requires `X-Client-Id` header):
```bash
# Generate or read persisted Client-Id
CLIENT_ID="${NEMO_CLIENT_ID:-$(cat ~/.config/nemovideo/client_id 2>/dev/null)}"
if [ -z "$CLIENT_ID" ]; then
CLIENT_ID=$(uuidgen 2>/dev/null || echo "client-$(date +%s)-$RANDOM")
mkdir -p ~/.config/nemovideo & echo "$CLIENT_ID" > ~/.config/nemovideo/client_id
fi
curl -s -X POST "https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token" -H "X-Client-Id: $CLIENT_ID"
# → {"code":0,"data":{"token":"nmv_usr_xxx","credits":100,...}}
```
Save `token` as `NEMO_TOKEN`, `CLIENT_ID` as `NEMO_CLIENT_ID`. Anonymous: 1 token per client per 7 days; token expires in 7 days and can be revoked at any time via **Settings → API Tokens** on nemovideo.com. If your token expires, request a new one with the same `X-Client-Id`.
**Local persistence:** This skill writes `~/.config/nemovideo/client_id` to persist the Client-Id across sessions. This avoids generating a new ID on every request, which would hit the per-IP rate limit quickly (default 10 tokens per 7 days per IP). The file contains only a UUID — no credentials are stored locally.
## 2. Routing Incoming Requests to the Correct Endpoint
Use the table below to determine which API endpoint should handle each type of user request.
| User says... | Action | Skip SSE |
|-------------|--------|----------|
| "export" / "导出" / "download" / "send me the video" | → §3.5 Export | ✅ |
| "credits" / "积分" / "balance" / "余额" | → §3.3 Credits | ✅ |
| "status" / "状态" / "show tracks" | → §3.4 State | ✅ |
| "upload" / "上传" / user sends file | → §3.2 Upload | ✅ |
| Everything else (generate, edit, add BGM…) | → §3.1 SSE | ❌ |
## 3. Essential Operation Flows
$API = `${NEMO_API_URL:-https://mega-api-prod.nemovideo.ai}`, $TOKEN = `${NEMO_TOKEN}`, $WEB = `${NEMO_WEB_URL:-https://nemovideo.com}`.
Skill attribution variables — do NOT hardcode:
- `$SKILL_NAME` = this file's YAML frontmatter `name`
- `$SKILL_VERSION` = this file's YAML frontmatter `version`
- `$SKILL_SOURCE` = detect at runtime using the first match:
1. Env `SKILL_SOURCE` if set
2. If this file's path contains a known platform directory (e.g. `~/.clawhub/` → `clawhub`, `~/.cursor/skills/` → `cursor`)
3. Fallback: `unknown`
**CRITICAL**: ALL API requests (including render/export/upload/state/credits) MUST include these headers. Missing them will cause export to fail with 402.
```
X-Skill-Source: $SKILL_NAME
X-Skill-Version: $SKILL_VERSION
X-Skill-Platform: $SKILL_SOURCE
```
### 3.0 Establishing a New Session
```bash
curl -s -X POST "https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent" \
-H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
-H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE" \
-d '{"task_name":"project","language":"<lang>"}'
# → {"code":0,"data":{"task_id":"...","session_id":"..."}}
```
Before any other operation can occur, a session must be initialized. This session ID ties all subsequent requests together within the same working context.
**Open in browser**: After creating a session, give the user a link to view/edit the task in NemoVideo:
`$WEB/workspace/claim?task={task_id}&session={session_id}&skill_name=$SKILL_NAME&skill_version=$SKILL_VERSION&skill_source=$SKILL_SOURCE`
### 3.1 Delivering Messages Over SSE
```bash
curl -s -X POST "https://mega-api-prod.nemovideo.ai/run_sse" \
-H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
-H "Accept: text/event-stream" -H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE" --max-time 900 \
-d '{"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}}'
```
All conversational messages and generation instructions are transmitted to the backend through a Server-Sent Events connection.
#### SSE Handling
| Event | Action |
|-------|--------|
| Text response | Apply GUI translation (§4), present to user |
| Tool call/result | Wait silently, don't forward |
| `heartbeat` / empty `data:` | Keep waiting. Every 2 min: "⏳ Still working..." |
| Stream closes | Process final response |
Typical durations: text 5-15s, video generation 100-300s, editing 10-30s.
**Timeout**: 10 min heartbeats-only → assume timeout. **Never re-send** during generation (duplicates + double-charge).
Ignore trailing "I encountered a temporary issue" if prior responses were normal.
#### Silent Response Fallback (CRITICAL)
Approximately 30% of edit operations return no text in the SSE stream. When this occurs: (1) do not treat the absence of text as a failure, (2) poll the task state endpoint to confirm completion, (3) retrieve the output asset directly, (4) present the result to the user as if a normal response was received.
**Two-stage generation**: After a raw video is produced, the backend automatically triggers a second processing stage that layers in background music and a title sequence. Both stages must reach a completed state before the final asset is surfaced to the user — treat this as a two-part pipeline rather than a single operation.
### 3.2 Handling File Uploads
**File upload**: `curl -s -X POST "https://mega-api-prod.nemovideo.ai/api/upload-video/nemo_agent/me/<sid>" -H "Authorization: Bearer $TOKEN" -H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE" -F "files=@/path/to/file"`
**URL upload**: `curl -s -X POST "https://mega-api-prod.nemovideo.ai/api/upload-video/nemo_agent/me/<sid>" -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" -H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE" -d '{"urls":["<url>"],"source_type":"url"}'`
Use **me** in the path; backend resolves user from token.
Supported: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.
Reference footage, images, and audio files provided by the user must be uploaded to the platform before they can be referenced in any generation or edit request.
### 3.3 Verifying Available Credits
```bash
curl -s "https://mega-api-prod.nemovideo.ai/api/credits/balance/simple" -H "Authorization: Bearer $TOKEN" \
-H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE"
# → {"code":0,"data":{"available":XXX,"frozen":XX,"total":XXX}}
```
Query the credits endpoint prior to initiating any generation task to confirm the user has a sufficient balance to cover the operation.
### 3.4 Polling Task Progress
```bash
curl -s "https://mega-api-prod.nemovideo.ai/api/state/nemo_agent/me/<sid>/latest" -H "Authorization: Bearer $TOKEN" \
-H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE"
```
Use **me** for user in path; backend resolves from token.
Key fields: `data.state.draft`, `data.state.video_infos`, `data.state.canvas_config`, `data.state.generated_media`.
**Draft field mapping**: `t`=tracks, `tt`=track type (0=video, 1=audio, 7=text), `sg`=segments, `d`=duration(ms), `m`=metadata.
**Draft ready for export** when `draft.t` exists with at least one track with non-empty `sg`.
**Track summary format**:
```
Timeline (3 tracks): 1. Video: city timelapse (0-10s) 2. BGM: Lo-fi (0-10s, 35%) 3. Title: "Urban Dreams" (0-3s)
```
### 3.5 Exporting and Delivering the Final Asset
**Export does NOT cost credits.** Only generation/editing consumes credits.
Triggering an export does not deduct any credits from the user's balance. The delivery sequence is: (a) call the export endpoint with the completed project ID, (b) wait for the export job to reach a finished state, (c) retrieve the download URL from the response, (d) confirm the URL is accessible, then (e) present the link to the user with a clear call to action.
**b)** Submit: `curl -s -X POST "https://mega-api-prod.nemovideo.ai/api/render/proxy/lambda" -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" -H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE" -d '{"id":"render_<ts>","sessionId":"<sid>","draft":<json>,"output":{"format":"mp4","quality":"high"}}'`
Note: `sessionId` is **camelCase** (exception). On failure → new `id`, retry once.
**c)** Poll (every 30s, max 10 polls): `curl -s "https://mega-api-prod.nemovideo.ai/api/render/proxy/lambda/<id>" -H "Authorization: Bearer $TOKEN" -H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE"`
Status at top-level `status`: pending → processing → completed / failed. Download URL at `output.url`.
**d)** Download from `output.url` → send to user. Fallback: `https://mega-api-prod.nemovideo.ai/api/render/proxy/<id>/download`.
**e)** When delivering the video, **always also give the task detail link**: `$WEB/workspace/claim?task={task_id}&session={session_id}&skill_name=$SKILL_NAME&skill_version=$SKILL_VERSION&skill_source=$SKILL_SOURCE`
Progress messages: start "⏳ Rendering ~30s" → "⏳ 50%" → "✅ Video ready!" + file + **task detail link**.
### 3.6 Recovering from an SSE Disconnection
If the SSE stream drops unexpectedly, follow these steps: (1) capture the last known task ID before the connection was lost; (2) wait a minimum of three seconds before attempting any recovery action; (3) use the task state endpoint to check whether processing has already completed; (4) if the task is still running, re-establish the SSE connection using the same session ID; (5) resume normal event handling once the connection is confirmed stable.
## 4. Translating GUI Elements for the User
The backend operates under the assumption that users are interacting through a graphical interface, so GUI-specific instructions must never be passed through to the backend — translate them into plain conversational language before forwarding.
| Backend says | You do |
|-------------|--------|
| "click [button]" / "点击" | Execute via API |
| "open [panel]" / "打开" | Show state via §3.4 |
| "drag/drop" / "拖拽" | Send edit via SSE |
| "preview in timeline" | Show track summary |
| "Export button" / "导出" | Execute §3.5 |
| "check account/billing" | Check §3.3 |
**Keep** content descriptions. **Strip** GUI actions.
## 5. Recommended Interaction Patterns
• Confirm intent before starting any long-running generation task, especially when credits are involved.
• After submitting a request, provide the user with a status update at regular polling intervals so they know work is in progress.
• When a task completes, immediately surface the output rather than waiting for the user to ask.
• If a user's instruction is ambiguous, ask one focused clarifying question before proceeding rather than making assumptions.
• Always acknowledge upload completion explicitly before referencing the uploaded asset in a subsequent request.
## 6. Known Constraints and Limitations
• Generation tasks can take several minutes; the system is not optimized for real-time or instant output.
• A valid active session is required for every operation — there is no sessionless mode.
• Credit balances cannot be modified or topped up through this skill; direct users to the billing portal for that.
• Uploaded files must conform to the platform's supported format and size restrictions before they can be used.
• Concurrent generation jobs within a single session may produce unpredictable ordering of results.
## 7. Error Recognition and Recovery
The table below maps common HTTP error codes and platform-specific error states to the appropriate recovery action.
| Code | Meaning | Action |
|------|---------|--------|
| 0 | Success | Continue |
| 1001 | Bad/expired token | Re-auth via anonymous-token (tokens expire after 7 days) |
| 1002 | Session not found | New session §3.0 |
| 2001 | No credits | Anonymous: show registration URL with `?bind=<id>` (get `<id>` from create-session or state response when needed). Registered: "Top up at nemovideo.ai" |
| 4001 | Unsupported file | Show supported formats |
| 4002 | File too large | Suggest compress/trim |
| 400 | Missing X-Client-Id | Generate Client-Id and retry (see §1) |
| 402 | Free plan export blocked | Subscription tier issue, NOT credits. "Register at nemovideo.ai to unlock export." |
| 429 | Rate limit (1 token/client/7 days) | Retry in 30s once |
**Common**: no video → generate first; render fail → retry new `id`; SSE timeout → §3.6; silent edit → §3.1 fallback.
## 8. API Version and Required Token Scopes
Always verify that requests target the current supported API version before executing any call — using a deprecated version may result in silent failures or unexpected behavior. The access token provided must include all required scopes for the operations being performed; missing scopes will result in authorization errors that cannot be resolved without re-authenticating with the correct permissions.
标签
skill
ai