tokenranger

# TokenRanger **TokenRanger** compresses session context through a local Ollama SLM before sending to cloud LLMs — reducing input token costs by **50–80%** per turn with graceful fallthrough if anything goes wrong. - **Plugin repo:** https://github.com/peterjohannmedina/openclaw-plugin-tokenranger - **npm:** `openclaw-plugin-tokenranger` - **Maintained by:** [@peterjohannmedina](https://github.com/peterjohannmedina) --- ## When to Load This Skill - User asks to install, configure, or troubleshoot TokenRanger - User wants to reduce token costs or enable context compression - User runs `/tokenranger` commands and needs help interpreting output - User wants to switch compression strategy (GPU/CPU/off) - User asks about upgrading or uninstalling TokenRanger --- ## How It Works ``` User message → OpenClaw gateway → before_agent_start hook → Turn 1: skip (full fidelity) → Turn 2+: send history to localhost:8100/compress → FastAPI sidecar runs LangChain LCEL chain via Ollama → Compressed summary prepended to context → Cloud LLM receives compressed context instead of full history ``` Inference strategy is auto-selected by GPU availability: | Strategy | Trigger | Model | Approach | |---|---|---|---| | `full` | GPU available | `mistral:7b` | Deep semantic summarization | | `light` | CPU only | `phi3.5:3b` | Extractive bullet points | | `passthrough` | Ollama unreachable | — | Truncate to last 20 lines | --- ## Install ### Step 1 — Install the plugin ```bash openclaw plugins install openclaw-plugin-tokenranger ``` To pin an exact version: ```bash openclaw plugins install openclaw-plugin-tokenranger@1.0.0 --pin ``` ### Step 2 — First-time setup ```bash openclaw tokenranger setup ``` This pulls Ollama models, creates the Python venv, installs FastAPI/LangChain deps, and registers the sidecar as a system service (systemd on Linux, launchd on macOS). ### Step 3 — Restart gateway ```bash openclaw gateway restart ``` ### Step 4 — Verify ```bash openclaw tokenranger ``` Should show current settings and sidecar status (reachable / unreachable). --- ## Configuration Set config values with: ```bash openclaw config set plugins.entries.tokenranger.config.<key> <value> openclaw gateway restart ``` | Key | Default | Description | |---|---|---| | `serviceUrl` | `http://127.0.0.1:8100` | TokenRanger sidecar URL | | `timeoutMs` | `10000` | Max wait before fallthrough | | `minPromptLength` | `500` | Min chars before compressing | | `ollamaUrl` | `http://127.0.0.1:11434` | Ollama API URL | | `preferredModel` | `mistral:7b` | Model for GPU strategy | | `compressionStrategy` | `auto` | `auto` / `full` / `light` / `passthrough` | | `inferenceMode` | `auto` | `auto` / `cpu` / `gpu` / `remote` | **Force CPU-only mode:** ```bash openclaw config set plugins.entries.tokenranger.config.compressionStrategy light openclaw config set plugins.entries.tokenranger.config.inferenceMode cpu openclaw gateway restart ``` --- ## Commands | Command | Description | |---|---| | `/tokenranger` | Show current settings and sidecar health | | `/tokenranger mode gpu` | Force GPU (full) compression | | `/tokenranger mode cpu` | Force CPU (light) compression | | `/tokenranger mode off` | Disable compression (passthrough) | | `/tokenranger model` | List available Ollama models | | `/tokenranger toggle` | Enable / disable the plugin | --- ## Upgrading ```bash # Check for updates (dry run) openclaw plugins update tokenranger --dry-run # Apply update openclaw plugins update tokenranger openclaw tokenranger setup # re-runs setup if sidecar deps changed openclaw gateway restart ``` To pin a specific version: ```bash openclaw plugins install openclaw-plugin-tokenranger@2026.3.1 --pin openclaw tokenranger setup openclaw gateway restart ``` List all published versions: ```bash npm view openclaw-plugin-tokenranger versions --json ``` --- ## Uninstalling ```bash openclaw plugins uninstall tokenranger openclaw gateway restart ``` Remove the sidecar service manually: ```bash # Linux systemctl --user stop tokenranger && systemctl --user disable tokenranger rm ~/.config/systemd/user/tokenranger.service # macOS launchctl unload ~/Library/LaunchAgents/com.peterjohannmedina.tokenranger.plist rm ~/Library/LaunchAgents/com.peterjohannmedina.tokenranger.plist ``` --- ## Troubleshooting **Sidecar unreachable after setup:** ```bash # Linux systemctl --user status tokenranger journalctl --user -u tokenranger -n 50 # macOS launchctl list | grep tokenranger cat ~/Library/Logs/tokenranger.log # Manual start (any platform) ~/.openclaw/extensions/tokenranger/service/start.sh ``` **Ollama not found:** ```bash curl http://127.0.0.1:11434/api/tags # If not running: ollama serve ``` **Compression not reducing tokens:** - Check `minPromptLength` — default 500 chars; short conversations are skipped by design - Run `/tokenranger` to confirm strategy is not `passthrough` - Check sidecar logs for errors **Graceful degradation:** TokenRanger never blocks a message. Any failure → silent fallthrough to uncompressed cloud LLM call. --- ## Performance Reference 5-turn Discord benchmark (GPU, `mistral:7b-instruct`): | Turn | Input tokens | Compressed | Reduction | |---|---|---|---| | 2 | 732 | 125 | 82.9% | | 3 | 1,180 | 150 | 87.3% | | 4 | 1,685 | 212 | 87.4% | | 5 | 2,028 | 277 | 86.3% | **Cumulative: 5,866 → 885 tokens (84.9% reduction)**

tokenranger

标签

通过对话安装

方式一：安装 SkillHub 和技能

方式二：设置 SkillHub 为优先技能安装源

通过命令行安装

下载 Zip 包

tokenranger

tokenranger

标签

通过对话安装

方式一：安装 SkillHub 和技能

方式二：设置 SkillHub 为优先技能安装源

通过命令行安装

下载 Zip 包

相关推荐

self-improvement

self-improvement

self-improvement

self-improvement