turboquant-memory

# TurboQuant Memory

Compress embedding vectors 5-8x with 98%+ search accuracy using TurboQuant (Google, ICLR 2026).

## Quick Start

### 1. Run tests

```bash
python3 scripts/turboquant.py
```

Runs 15 built-in tests: FWHT correctness, MSE distortion, IP correlation, recall, compression ratio, determinism.

### 2. Validate on your data

```bash
python3 scripts/validate.py --db /path/to/memory.sqlite --auto-detect --bits 5
```

Auto-detects sqlite-vec `vec0` tables, analyzes the value distribution, and reports quantization quality and recall.

### 3. Quantize a memory database

```bash
python3 scripts/memory_quantize.py --db /path/to/memory.db --bits 5 --benchmark
python3 scripts/memory_quantize.py --db /path/to/memory.db --bits 5 --migrate
```

### 4. Integrate into code

```python
import numpy as np
from turboquant import TurboQuantMSE

# Initialize (deterministic: same seed = same quantization)
tq = TurboQuantMSE(dim=3072, bits=5)

# Quantize for storage
stored = tq.quantize(embedding_vector)  # float32 → compressed

# Reconstruct
reconstructed = tq.dequantize(stored)  # compressed → float32

# Search: query stays float32, database is quantized
q_rot = tq.rotation.apply(query)
for doc in database:
    score = doc['norm'] * doc['scale'] * np.dot(q_rot, tq.codebook[doc['indices']])
```

## Recommended Configuration

| Preset | Mode | Bits | R@1 | Compression | Use Case |
|--------|------|------|-----|-------------|----------|
| **Default** | MSE | **5** | **98%** | **6.4x** | Most memory/RAG search |
| Conservative | MSE | 6 | 98%+ | 5.3x | High-fidelity retrieval |
| Aggressive | MSE | 4 | 92% | 8.0x | Large-scale, storage-constrained |

## Parameters

| Parameter | Default | Description |
|-----------|---------|-------------|
| `dim` | auto-detect | Embedding dimension (768, 1536, 3072, etc.) |
| `bits` | 5 | Bits per coordinate. See table above. |
| `seed` | 42 | Rotation seed. Same seed = reproducible quantization. |

## Algorithm

**Blockwise Hadamard Rotation → Lloyd-Max Scalar Quantization**

1. Split vector into power-of-2 blocks (e.g., 3072 = 3 × 1024)
2. Per block: random sign flip + Fast Walsh-Hadamard Transform (fully invertible)
3. Per-vector scale normalization
4. Lloyd-Max optimal scalar quantizer per coordinate (precomputed codebook for N(0,1))
5. Pack indices into compact bit representation

Key properties:

- **Data-oblivious**: no training or calibration needed
- **Fully invertible**: zero information loss from rotation
- **Near-optimal**: distortion within a factor of about 2.7 of the Shannon information-theoretic lower bound
- **Deterministic**: same seed = same output

See [references/algorithm.md](references/algorithm.md) for full details.

## Benchmark (Gemini embedding-001, 3072-dim, 112 vectors)

| Bits | MSE | Cosine | R@1 | R@5 | R@10 | Bytes/vec | Compression |
|------|-----|--------|-----|-----|------|-----------|-------------|
| 3 | 1.1e-5 | 0.982 | 88% | 90% | 91% | 1,160 | 10.6x |
| 4 | 3.2e-6 | 0.995 | 92% | 93% | 93% | 1,544 | 8.0x |
| **5** | **8.2e-7** | **0.999** | **98%** | **96%** | **96%** | **1,928** | **6.4x** |
| 6 | 2.2e-7 | 1.000 | 96% | 98% | 98% | 2,312 | 5.3x |
| 7 | 8e-8 | 1.000 | 100% | 98% | 99% | 2,696 | 4.6x |
| 8 | 3e-8 | 1.000 | 98% | 98% | 99% | 3,080 | 4.0x |

## Compatibility

- Python 3.9+, **numpy only** (no scipy, no GPU)
- Any embedding dimension ≥ 128
- Any embedding model (Gemini, OpenAI, Cohere, sentence-transformers, etc.)
- SQLite / sqlite-vec `vec0` tables (auto-detected)

## References

- TurboQuant paper: [arXiv:2504.19874](https://arxiv.org/abs/2504.19874) (ICLR 2026)
- PolarQuant paper: [arXiv:2502.02617](https://arxiv.org/abs/2502.02617) (AISTATS 2026)
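
## Appendix: Illustrative Sketch

The five steps listed under Algorithm can be sketched in plain numpy as follows. This is a minimal illustration under stated assumptions, not the `turboquant` package's implementation: every function name here is hypothetical, the block size is assumed to be the largest power of 2 dividing `dim`, the per-vector scale is assumed to be the RMS of the rotated coordinates, the codebook is fitted by Lloyd iterations on Gaussian samples rather than loaded from a precomputed table, and the bit layout produced by `np.packbits` is only one possible packing.

```python
import numpy as np

def fwht(x):
    """Orthonormal fast Walsh-Hadamard transform (length must be a power of 2)."""
    x = np.asarray(x, dtype=np.float64)
    n, h = x.size, 1
    while h < n:
        pairs = x.reshape(-1, 2, h)            # butterfly over chunks of 2h
        a, b = pairs[:, 0, :], pairs[:, 1, :]
        x = np.stack([a + b, a - b], axis=1).reshape(n)
        h *= 2
    return x / np.sqrt(n)                      # with this scaling the transform is its own inverse

def rotate(v, signs, block):
    """Steps 1-2: sign flip, then FWHT on each power-of-2 block."""
    return np.concatenate([fwht(b) for b in (v * signs).reshape(-1, block)])

def unrotate(r, signs, block):
    """Inverse of rotate: FWHT each block again, then undo the sign flip."""
    return np.concatenate([fwht(b) for b in r.reshape(-1, block)]) * signs

def lloyd_max_codebook(bits, n_samples=200_000, iters=50, seed=0):
    """Step 4 codebook (assumption: fitted on N(0,1) samples, not a precomputed table)."""
    samples = np.random.default_rng(seed).standard_normal(n_samples)
    levels = np.quantile(samples, (np.arange(2**bits) + 0.5) / 2**bits)
    for _ in range(iters):
        edges = (levels[:-1] + levels[1:]) / 2             # nearest-level decision boundaries
        cell = np.searchsorted(edges, samples)
        sums = np.bincount(cell, weights=samples, minlength=levels.size)
        counts = np.bincount(cell, minlength=levels.size)
        levels = sums / np.maximum(counts, 1)              # move each level to its cell mean
    return levels

def quantize(v, signs, block, codebook):
    """Steps 1-4: rotate the unit vector, normalize by its RMS, pick nearest levels."""
    norm = float(np.linalg.norm(v))
    r = rotate(v / norm, signs, block)
    scale = float(np.sqrt(np.mean(r**2)))                  # assumed definition of the scale
    edges = (codebook[:-1] + codebook[1:]) / 2
    indices = np.searchsorted(edges, r / scale).astype(np.uint8)
    return {"norm": norm, "scale": scale, "indices": indices}

def dequantize(q, signs, block, codebook):
    """Rebuild the rotated vector from the codebook, then invert the rotation."""
    return unrotate(q["norm"] * q["scale"] * codebook[q["indices"]], signs, block)

def pack_indices(indices, bits):
    """Step 5: keep only `bits` bits per index (one possible layout)."""
    bit_rows = (indices.astype(np.int64)[:, None] >> np.arange(bits)) & 1
    return np.packbits(bit_rows.astype(np.uint8))

dim, bits, seed = 3072, 5, 42
block = dim & -dim                                         # largest power of 2 dividing dim
rng = np.random.default_rng(seed)
signs = rng.choice([-1.0, 1.0], size=dim)
codebook = lloyd_max_codebook(bits)

v = rng.standard_normal(dim).astype(np.float32)            # synthetic stand-in for an embedding
q = quantize(v, signs, block, codebook)
v_hat = dequantize(q, signs, block, codebook)
cosine = float(np.dot(v, v_hat) / (np.linalg.norm(v) * np.linalg.norm(v_hat)))

# Search in the rotated domain, mirroring the scoring expression in Quick Start step 4.
query = rng.standard_normal(dim)
score = q["norm"] * q["scale"] * np.dot(rotate(query, signs, block), codebook[q["indices"]])

packed = pack_indices(q["indices"], bits)
print(f"block={block}, cosine={cosine:.4f}, packed bytes={packed.size}")
```

Because the rotation is orthonormal, `score` equals `np.dot(query, v_hat)` up to floating-point error, which is why the query can stay in float32 and be rotated once per search. At 5 bits the packed indices occupy 3072 × 5 / 8 = 1,920 bytes; the 1,928 bytes per vector in the benchmark table would be consistent with 8 further bytes of per-vector metadata such as the norm and scale, though the storage layout is not stated here.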
