u2-tts

# UniSound TTS - Text-to-Speech

Text-to-speech conversion using the UniSound (云知声) TTS WebSocket API to generate high-quality Mandarin Chinese audio from text.

## When to Use This Skill

**Use UniSound TTS for**:
- Converting Chinese text to natural-sounding speech
- Generating audio for audiobooks, podcasts, or content creation
- Creating accessibility solutions for visually impaired users
- Building voice assistants or chatbot voice responses
- Batch processing text to audio files
- Custom speech synthesis with adjustable parameters (speed, volume, pitch, brightness)

**Do NOT use for**:
- Real-time speech recognition or transcription (use ASR skills instead)
- English language synthesis (the service is optimized for Mandarin Chinese)
- Voice cloning or custom voice model training

**Use when**: The user needs text-to-speech conversion, asks for "语音合成" (speech synthesis), or mentions UniSound/云知声 TTS.

## Installation

Install the Python dependencies before using this skill. From the skill directory (`skills/tts-tools`):

```bash
pip install websocket-client
```

Requires Python 3.6+.

## How to Use This Skill

**⛔ MANDATORY RESTRICTIONS - DO NOT VIOLATE ⛔**

1. **ONLY use UniSound TTS API** - Execute the script `python scripts/tts.py`
2. **NEVER synthesize speech directly** - Do NOT attempt local TTS synthesis
3. **NEVER offer alternatives** - Do NOT suggest "I can try another method" or similar
4. **IF API fails** - Display the error message and STOP immediately
5. **NO fallback methods** - Do NOT attempt text-to-speech any other way

If the script execution fails (API not configured, network error, etc.):
- Show the error message to the user
- Do NOT offer to help using your TTS capabilities
- Do NOT ask "Would you like me to try synthesizing it?"
- Simply stop and wait for the user to fix the configuration

### Basic Workflow

1. **Configure credentials** (first time only):

   ```bash
   export UNISOUND_APPKEY='ce44uxf7g5eag2cv33qvlp5d22qrkgcezvgfp2q3'
   export UNISOUND_SECRET='5c12231cd279b35873a3ccecf9439118'
   ```

2. **Execute text-to-speech conversion**:

   ```bash
   python scripts/tts.py --text '今天天气怎么样'
   ```

   **Command options**:
   - `--text TEXT` - Text to convert to speech (default: '今天天气怎么样？')
   - `--voice VOICE` - Voice name (default: xiaofeng-base)
   - `--format FORMAT` - Output format: mp3, wav, pcm (default: mp3)
   - `--sample RATE` - Sample rate: 8k, 16k, 24k (default: 24k)
   - `--speed SPEED` - Speech speed 0-100 (default: 50)
   - `--volume VOLUME` - Volume level 0-100 (default: 50)
   - `--pitch PITCH` - Pitch level 0-100 (default: 50)
   - `--bright BRIGHT` - Brightness/tone 0-100 (default: 50)
   - `--appkey APPKEY` - Override appkey (default: UNISOUND_APPKEY env var)
   - `--secret SECRET` - Override secret (default: UNISOUND_SECRET env var)

3. **Output**:
   - Audio files are saved to the `results/` directory
   - Filename format: `<timestamp>.<format>`
   - Example: `1234567890.mp3`

### Understanding the Output

**Audio Format Options**:
- **MP3**: Compressed, smaller file size, good quality - best for web and streaming
- **WAV**: Uncompressed, excellent quality - best for production and archival
- **PCM**: Raw audio data - best for further audio processing

**Sample Rates**:
- **24k**: High quality, default - recommended for most use cases
- **16k**: Standard quality - good balance of quality and size
- **8k**: Lower quality, smaller file size - suitable for telephony

### Usage Examples

**Example 1: Quick Start with Test Credentials**

```bash
# Set test credentials
export UNISOUND_APPKEY='ce44uxf7g5eag2cv33qvlp5d22qrkgcezvgfp2q3'
export UNISOUND_SECRET='5c12231cd279b35873a3ccecf9439118'

# Convert text to speech
python scripts/tts.py --text '你好世界'
```

Output: `results/1234567890.mp3`

**Example 2: Custom Voice and Format**

```bash
python scripts/tts.py --text '今天天气怎么样' --voice xiaofeng-base --format wav
```

Output: High-quality WAV file with male voice

**Example 3: Adjusted Speech Parameters**

```bash
python scripts/tts.py --text '快速朗读' --speed 70 --volume 60 --pitch 50
```

Output: Faster speech with increased volume

**Example 4: High-Quality Audio Production**

```bash
python scripts/tts.py --text '高质量音频' --format wav --sample 24k --volume 60
```

Output: Production-quality WAV file at 24kHz

**Example 5: Command-line Credential Override**

```bash
python scripts/tts.py \
  --text '测试' \
  --appkey 'ce44uxf7g5eag2cv33qvlp5d22qrkgcezvgfp2q3' \
  --secret '5c12231cd279b35873a3ccecf9439118'
```

### How It Works

The script uses the UniSound TTS WebSocket API with the following workflow:

1. **Authenticate** using a SHA256 signature (appkey + timestamp + secret)
2. **Establish a WebSocket connection** to `wss://ws-stts.hivoice.cn/v1/tts`
3. **Send a TTS request** with text and voice parameters
4. **Receive streaming audio data** in binary chunks
5. **Save the audio file** to the results directory

### Available Voices

| Voice | Type | Description |
|-------|------|-------------|
| xiaofeng-base | Male | Standard male voice, clear and natural |
| xiaoyan | Female | Female voice options |
| xiaomei | Female | Alternative female voice |
| Custom voices | Various | Contact UniSound for more options |

### Adjustable Parameters

| Parameter | Range | Default | Description |
|-----------|-------|---------|-------------|
| speed | 0-100 | 50 | Speech speed (50 = normal, higher = faster) |
| volume | 0-100 | 50 | Volume level (50 = normal, higher = louder) |
| pitch | 0-100 | 50 | Pitch level (50 = normal, higher = higher) |
| bright | 0-100 | 50 | Brightness/tone (50 = normal) |

**Recommended settings**:
- Audiobooks: speed 45, pitch 50
- News/announcements: speed 55, volume 60, bright 60
- Accessibility: speed 35-40, volume 70
- Normal conversation: speed 50, all parameters 50

## First-Time Configuration

**When credentials are not configured**:

The script will show:

```
Error: AppKey and Secret are required!
Set them via --appkey/--secret arguments or UNISOUND_APPKEY/UNISOUND_SECRET environment variables.
```

### Test Credentials

For testing and evaluation, use these credentials:

```bash
export UNISOUND_APPKEY='ce44uxf7g5eag2cv33qvlp5d22qrkgcezvgfp2q3'
export UNISOUND_SECRET='5c12231cd279b35873a3ccecf9439118'
```

> **⚠️ Important Security Notice**
>
> - **Test credentials only**: These are for testing and evaluation purposes
> - **No sensitive data**: Never use with production or sensitive content
> - **Get your own credentials**: For production use, contact UniSound
> - **Data privacy**: Text is sent to UniSound servers for processing

### Obtaining Production Credentials

For production use, obtain API credentials from UniSound (云知声):

1. **Contact UniSound** to obtain your API credentials. Visit: https://www.unisound.com/
2. **You will receive**:
   - **AppKey**: Application key
   - **Secret**: Secret key for authentication

### Configuration Methods

**Method 1: Environment Variables (Recommended)**

*Linux/macOS:*

```bash
export UNISOUND_APPKEY='ce44uxf7g5eag2cv33qvlp5d22qrkgcezvgfp2q3'
export UNISOUND_SECRET='5c12231cd279b35873a3ccecf9439118'
python scripts/tts.py --text '你好'
```

*Windows (PowerShell):*

```powershell
$env:UNISOUND_APPKEY='ce44uxf7g5eag2cv33qvlp5d22qrkgcezvgfp2q3'
$env:UNISOUND_SECRET='5c12231cd279b35873a3ccecf9439118'
python scripts/tts.py --text '你好'
```

*Windows (CMD):*

```cmd
set UNISOUND_APPKEY=ce44uxf7g5eag2cv33qvlp5d22qrkgcezvgfp2q3
set UNISOUND_SECRET=5c12231cd279b35873a3ccecf9439118
python scripts/tts.py --text '你好'
```

**Method 2: .env File (Recommended for Development)**

Create a `.env` file in the project root:

```
UNISOUND_APPKEY=ce44uxf7g5eag2cv33qvlp5d22qrkgcezvgfp2q3
UNISOUND_SECRET=5c12231cd279b35873a3ccecf9439118
```

Then load it with `python-dotenv` or source it in your shell.

> **Security Note**: Never commit `.env` files or actual production credentials to version control.

**Method 3: Command-Line Arguments**

```bash
python scripts/tts.py \
  --text '你好世界' \
  --appkey 'ce44uxf7g5eag2cv33qvlp5d22qrkgcezvgfp2q3' \
  --secret '5c12231cd279b35873a3ccecf9439118'
```

### Required Environment Variables

| Variable | Required | Description |
|----------|----------|-------------|
| `UNISOUND_APPKEY` | **Yes** | Application key |
| `UNISOUND_SECRET` | **Yes** | Secret key |

### Python API Usage

**Basic Python API**:

```python
import os
from scripts.tts import Ws_parms, do_ws, write_results

# Get credentials from environment variables (falls back to the test credentials)
appkey = os.getenv('UNISOUND_APPKEY', 'ce44uxf7g5eag2cv33qvlp5d22qrkgcezvgfp2q3')
secret = os.getenv('UNISOUND_SECRET', '5c12231cd279b35873a3ccecf9439118')

# Configure TTS parameters
ws_parms = Ws_parms(
    url='wss://ws-stts.hivoice.cn/v1/tts',
    appkey=appkey,
    secret=secret,
    pid=1,
    vcn='xiaofeng-base',
    text='你好，欢迎使用云知声语音合成服务！',
    tts_format='mp3',
    tts_sample='24k',
    user_id='my-app',
)

# Execute TTS conversion
do_ws(ws_parms)

# Save result to file
write_results(ws_parms)
print('Audio saved to results/ directory!')
```

## Error Handling

**Authentication failed**:
```
Error: AppKey and Secret are required!
```
→ Credentials were not provided
→ Set the UNISOUND_APPKEY and UNISOUND_SECRET environment variables

**WebSocket connection error**:
```
WebSocket error: ...
```
→ Check network connectivity to the UniSound API
→ Verify the API endpoint URL is correct
→ Check whether a firewall is blocking WebSocket connections

**No audio data received**:
```
Error: No audio data received
```
→ Text may be empty or contain invalid characters
→ Check that the text parameter is not empty
→ Verify the text encoding is UTF-8
→ Credentials may be invalid

**Invalid speech parameter**:
```
Error: speed must be between 0 and 100, got 150
```
→ Speech parameters must be between 0 and 100

**WebSocket connection timeout**:
```
WebSocket error: timeout
```
→ Network connection issue
→ API service may be temporarily unavailable
→ Check the internet connection

## Advanced Usage

### Custom Speech Parameters

```python
import os
from scripts.tts import Ws_parms, do_ws, write_results

appkey = os.getenv('UNISOUND_APPKEY', 'ce44uxf7g5eag2cv33qvlp5d22qrkgcezvgfp2q3')
secret = os.getenv('UNISOUND_SECRET', '5c12231cd279b35873a3ccecf9439118')

ws_parms = Ws_parms(
    url='wss://ws-stts.hivoice.cn/v1/tts',
    appkey=appkey,
    secret=secret,
    pid=1,
    vcn='xiaofeng-base',
    text='这是自定义参数的语音合成示例',
    tts_format='wav',
    tts_sample='24k',
    user_id='demo',
)

# Customize speech parameters
ws_parms.tts_speed = 60   # Faster speech (0-100)
ws_parms.tts_volume = 70  # Louder volume (0-100)
ws_parms.tts_pitch = 40   # Lower pitch (0-100)
ws_parms.tts_bright = 60  # Brighter tone (0-100)

do_ws(ws_parms)
write_results(ws_parms)
```

### Batch Text Processing

```python
import os
from scripts.tts import Ws_parms, do_ws, write_results

def batch_tts(text_list):
    """Convert multiple texts to audio files"""
    appkey = os.getenv('UNISOUND_APPKEY', 'ce44uxf7g5eag2cv33qvlp5d22qrkgcezvgfp2q3')
    secret = os.getenv('UNISOUND_SECRET', '5c12231cd279b35873a3ccecf9439118')

    for i, text in enumerate(text_list):
        ws_parms = Ws_parms(
            url='wss://ws-stts.hivoice.cn/v1/tts',
            appkey=appkey,
            secret=secret,
            pid=i,
            vcn='xiaofeng-base',
            text=text,
            tts_format='mp3',
            tts_sample='24k',
            user_id=f'batch-{i}',
        )

        do_ws(ws_parms)
        write_results(ws_parms)
        print(f"Generated: {text[:30]}...")

# Usage
texts = [
    "第一段文字",
    "第二段文字",
    "第三段文字"
]
batch_tts(texts)
```

### Audiobook Chapter Converter

```python
import os
from scripts.tts import Ws_parms, do_ws, write_results

def convert_chapter(chapter_text, chapter_num, voice='xiaofeng-base'):
    """Convert a book chapter to audio file"""
    # Add chapter announcement
    intro = f"第{chapter_num}章。"
    full_text = intro + chapter_text

    appkey = os.getenv('UNISOUND_APPKEY', 'ce44uxf7g5eag2cv33qvlp5d22qrkgcezvgfp2q3')
    secret = os.getenv('UNISOUND_SECRET', '5c12231cd279b35873a3ccecf9439118')

    ws_parms = Ws_parms(
        url='wss://ws-stts.hivoice.cn/v1/tts',
        appkey=appkey,
        secret=secret,
        pid=chapter_num,
        vcn=voice,
        text=full_text,
        tts_format='mp3',
        tts_sample='24k',
        user_id=f'audiobook-ch{chapter_num}',
    )

    # Slower, clearer reading for books
    ws_parms.tts_speed = 45
    ws_parms.tts_pitch = 50

    do_ws(ws_parms)
    write_results(ws_parms)
    print(f"Chapter {chapter_num} converted")

# Usage
chapter = """这是第一章的内容。在一个阳光明媚的早晨，主人公开始了他的冒险之旅。"""
convert_chapter(chapter, 1)
```

### Accessibility Helper

```python
import os
from scripts.tts import Ws_parms, do_ws, write_results

def accessibility_reader(text, speed='normal', voice='xiaofeng-base'):
    """
    Text-to-speech for accessibility (visually impaired users)
    with customizable reading speed
    """
    speed_map = {
        'slow': 35,
        'normal': 50,
        'fast': 65
    }

    appkey = os.getenv('UNISOUND_APPKEY', 'ce44uxf7g5eag2cv33qvlp5d22qrkgcezvgfp2q3')
    secret = os.getenv('UNISOUND_SECRET', '5c12231cd279b35873a3ccecf9439118')

    ws_parms = Ws_parms(
        url='wss://ws-stts.hivoice.cn/v1/tts',
        appkey=appkey,
        secret=secret,
        pid=1,
        vcn=voice,
        text=text,
        tts_format='mp3',
        tts_sample='24k',
        user_id='accessibility',
    )

    ws_parms.tts_speed = speed_map.get(speed, 50)
    ws_parms.tts_volume = 70  # Higher volume for accessibility

    do_ws(ws_parms)
    write_results(ws_parms)
    return ws_parms.tts_stream

# Usage
article = "这是一篇重要的新闻文章。"
accessibility_reader(article, speed='slow')
```

## Important Notes

- **Chinese language optimized** - Best results with Simplified Chinese text
- **Requires a stable internet connection** for WebSocket streaming
- **Audio files saved locally** - Check the `results/` directory for output
- **Text encoding** - Ensure text is UTF-8 encoded
- **Default sample rate is 24k** - Higher quality than standard 16k
- **Test credentials** - Provided for testing and evaluation only

## Security Best Practices

- **For testing** - Use the provided test credentials
- **For production** - Always obtain your own credentials from UniSound
- **Use environment variables** - Store credentials securely in environment variables
- **Never hardcode credentials** - Don't embed production credentials in code
- **Use .env files** - For local development (add to .gitignore)
- **Rotate credentials regularly** - In production environments

## Troubleshooting

**Issue**: Script fails with import error
→ Ensure dependencies are installed: `pip install websocket-client`
→ Ensure you are using Python 3.6 or later

**Issue**: "AppKey and Secret are required!" error
→ Set the UNISOUND_APPKEY and UNISOUND_SECRET environment variables
→ Or use the --appkey and --secret command-line arguments

**Issue**: Poor audio quality
→ Try using WAV format with the 24k sample rate
→ Adjust speech parameters for your use case

**Issue**: WebSocket connection timeout
→ Check network connectivity
→ Verify the firewall allows WebSocket connections
→ Check whether the API service is operational

**Issue**: Generated audio sounds unnatural
→ Adjust the speed parameter (try the 45-55 range)
→ Check the text for proper punctuation
→ Consider breaking long sentences into shorter ones

**Issue**: Test credentials stopped working
→ Test credentials may have expired or hit rate limits
→ Contact UniSound to obtain your own credentials

## Tips and Best Practices

- **For audiobooks**: Use speed 45, add chapter announcements
- **For accessibility**: Use speed 35-40, higher volume (70)
- **For news**: Use speed 55, brighter tone (60)
- **For batch processing**: Implement delays between requests
- **For production**: Add error handling and retry logic
- **For best quality**: Use 24k sample rate with WAV format

## Reference Documentation

- [UniSound Official Site](https://www.unisound.com/)
- [WebSocket Client Documentation](https://websocket-client.readthedocs.io/)
- [TTS API Documentation](https://www.unisound.com/tts-api)

Load these reference documents when:
- Debugging API connection issues
- Understanding advanced features
- Looking up detailed API parameter information

## Authentication Details

The UniSound TTS API uses SHA256 signature-based authentication:

```python
# Signature format (automatically generated by Ws_parms class)
# SHA256(appkey + timestamp + secret).upper()

# Manual signature example (if needed):
import hashlib
import time

def generate_signature(appkey, secret):
    # Millisecond timestamp, sent alongside the signature
    timestamp = str(int(time.time() * 1000))
    hs = hashlib.sha256()
    hs.update((appkey + timestamp + secret).encode('utf-8'))
    signature = hs.hexdigest().upper()
    return timestamp, signature
```

**WebSocket URL format**:

```
wss://ws-stts.hivoice.cn/v1/tts?time={timestamp}&appkey={appkey}&sign={signature}
```

> **Note**: API capabilities, available voices, and rate limits are determined by your UniSound TTS API service configuration and subscription plan.
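
The signature scheme and the WebSocket URL format from the Authentication Details section can be combined into one helper. The following is a minimal sketch, not the code in `scripts/tts.py`: the function name `build_signed_url` and the optional `timestamp_ms` argument are hypothetical and exist only for illustration, and the use of `urlencode` assumes the service accepts standard URL-encoded query values.

```python
import hashlib
import time
from urllib.parse import urlencode

# Endpoint as given in this document
TTS_ENDPOINT = "wss://ws-stts.hivoice.cn/v1/tts"


def build_signed_url(appkey, secret, endpoint=TTS_ENDPOINT, timestamp_ms=None):
    """Return the endpoint URL with time, appkey and sign query parameters.

    Hypothetical helper: timestamp_ms can be fixed to make the output
    reproducible; by default the current time in milliseconds is used.
    """
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)
    timestamp = str(timestamp_ms)

    # sign = SHA256(appkey + timestamp + secret) as upper-case hex
    digest = hashlib.sha256((appkey + timestamp + secret).encode("utf-8"))
    signature = digest.hexdigest().upper()

    # Query parameter names follow the documented URL format
    query = urlencode({"time": timestamp, "appkey": appkey, "sign": signature})
    return f"{endpoint}?{query}"


if __name__ == "__main__":
    # Placeholder values only; substitute your own credentials
    print(build_signed_url("demo-appkey", "demo-secret", timestamp_ms=1700000000000))
```

Passing a fixed `timestamp_ms` is useful when comparing a locally computed signature against one logged by another client; in normal use, omit it so each connection gets a fresh timestamp.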
