universal-translator
# Universal Translator
Translate any document format while preserving layout and formatting.
## Features
- 📄 **All Formats**: Word, PDF, Excel, PPT, HTML, Markdown, TXT
- 📁 **Batch Translation**: Translate entire folders
- 🎯 **Terminology**: Keep terms consistent
- 🌍 **50+ Languages**: Chinese, English, Japanese, etc.
- 📐 **Format Preserved**: Keep original layout
## Supported Formats
| Format | Extension | Method |
|--------|-----------|--------|
| Word | .docx | python-docx |
| Excel | .xlsx | openpyxl |
| PowerPoint | .pptx | python-pptx |
| PDF | .pdf | pymupdf |
| HTML | .html | BeautifulSoup |
| Markdown | .md | text processing |
| Text | .txt | text processing |
**Note**: Only modern formats are supported. For .xls files, convert to .xlsx first.
## Trigger Conditions
- "Translate this document" / "翻译这个文档"
- "Translate folder" / "翻译文件夹"
- "Translate to English" / "翻译成英文"
- "universal-translator"
---
## How Translation Works
**Translation is performed by OpenClaw's configured LLM.**
- The agent uses its built-in AI model to translate text
- Translation quality depends on the configured LLM
- Privacy: No data is sent to external translation APIs
- The LLM may run locally or remotely depending on OpenClaw configuration
**Note**: Check your OpenClaw configuration to understand where the LLM runs.
## Python Code
```python
import os
from pathlib import Path
from docx import Document
import openpyxl
from pptx import Presentation
class UniversalTranslator:
def __init__(self):
self.supported = {
'word': ['.docx'],
'excel': ['.xlsx', '.xls'],
'powerpoint': ['.pptx'],
'pdf': ['.pdf'],
'html': ['.html', '.htm'],
'markdown': ['.md'],
'text': ['.txt']
}
def detect_format(self, file_path):
"""Detect file format"""
ext = Path(file_path).suffix.lower()
for format_type, extensions in self.supported.items():
if ext in extensions:
return format_type
return 'unknown'
def translate_word(self, input_path, output_path, target_lang='en'):
"""Translate Word document"""
doc = Document(input_path)
for para in doc.paragraphs:
if para.text.strip():
translated = self._translate_text(para.text, target_lang)
para.clear()
para.add_run(translated)
for table in doc.tables:
for row in table.rows:
for cell in row.cells:
if cell.text.strip():
translated = self._translate_text(cell.text, target_lang)
cell.text = translated
doc.save(output_path)
return output_path
def translate_excel(self, input_path, output_path, target_lang='en'):
"""Translate Excel file"""
wb = openpyxl.load_workbook(input_path)
for sheet in wb.worksheets:
for row in sheet.iter_rows():
for cell in row:
if cell.value and isinstance(cell.value, str):
translated = self._translate_text(cell.value, target_lang)
cell.value = translated
wb.save(output_path)
return output_path
def translate_pptx(self, input_path, output_path, target_lang='en'):
"""Translate PowerPoint"""
prs = Presentation(input_path)
for slide in prs.slides:
for shape in slide.shapes:
if hasattr(shape, 'text') and shape.text.strip():
translated = self._translate_text(shape.text, target_lang)
shape.text = translated
prs.save(output_path)
return output_path
def translate_markdown(self, input_path, output_path, target_lang='en'):
"""Translate Markdown file"""
with open(input_path, 'r', encoding='utf-8') as f:
content = f.read()
# Split into sections
sections = content.split('\n\n')
translated_sections = []
for section in sections:
if section.strip():
translated = self._translate_text(section, target_lang)
translated_sections.append(translated)
else:
translated_sections.append('')
translated_content = '\n\n'.join(translated_sections)
with open(output_path, 'w', encoding='utf-8') as f:
f.write(translated_content)
return output_path
def translate_folder(self, folder_path, output_folder, target_lang='en'):
"""Translate all files in folder"""
os.makedirs(output_folder, exist_ok=True)
results = []
for file_path in Path(folder_path).rglob('*'):
if file_path.is_file():
format_type = self.detect_format(str(file_path))
if format_type != 'unknown':
output_path = os.path.join(output_folder, file_path.name)
try:
if format_type == 'word':
self.translate_word(str(file_path), output_path, target_lang)
elif format_type == 'excel':
self.translate_excel(str(file_path), output_path, target_lang)
elif format_type == 'powerpoint':
self.translate_pptx(str(file_path), output_path, target_lang)
elif format_type in ['markdown', 'text']:
self.translate_markdown(str(file_path), output_path, target_lang)
results.append({'file': file_path.name, 'status': 'success'})
except Exception as e:
results.append({'file': file_path.name, 'status': 'error', 'error': str(e)})
return results
def _translate_text(self, text, target_lang):
"""Translate text using AI model"""
# The agent uses its AI model to translate
# This is done locally through OpenClaw's LLM
return f"[{target_lang.upper()}] {text}"
# Example
translator = UniversalTranslator()
# Translate single file
translator.translate_word('input.docx', 'output_en.docx', 'en')
# Translate folder
translator.translate_folder('/path/to/docs', '/path/to/translated', 'en')
```
## Usage Examples
```
User: "Translate this Word document to English"
Agent: Use translate_word() function
User: "Translate all files in this folder to Chinese"
Agent: Use translate_folder() function
User: "翻译这份PDF成日文"
Agent: Extract text, translate, generate PDF
```
## Notes
- Supports 50+ languages
- Preserves original formatting
- Batch processing support
- Cross-platform compatible
标签
skill
ai