
vector-databases

Deep vector database workflow—embedding choice, index algorithms, recall/latency trade-offs, hybrid search, filtering, operational tuning, and cost. Use when selecting or optimizing Pinecone, Milvus, Qdrant, Weaviate, pgvector, OpenSearch kNN, etc.

Author: admin | Source: ClawHub
Source: ClawHub
Version: V 1.0.0
Security check: Passed
Downloads: 136
Favorites: 0


# Vector Databases (Deep Workflow)

Vector search is **approximate nearest neighbor (ANN)** at scale—**not** magic semantic understanding. Success requires **embedding model alignment**, **index parameters**, **metadata filters**, and **evaluation** against real queries.

## When to Offer This Workflow

**Trigger conditions:**

- Building RAG, similarity search, dedup, recommendations, anomaly clustering
- Comparing a **managed vector DB** vs **pgvector** vs **search engine kNN**
- **Recall** issues, **stale** vectors, **slow** queries, or **cost** explosions

**Initial offer:** Use **six stages**: (1) problem & metrics, (2) embeddings & schema, (3) index & parameters, (4) hybrid & filtering, (5) operations & cost, (6) evaluation & iteration. Confirm **scale** (vectors, QPS, dimension) and **latency SLO**.

---

## Stage 1: Problem & Metrics

**Goal:** Define **what “similar” means** for the product—not only cosine similarity.

### Questions

1. **Query types**: short keyword vs long paragraph? multilingual?
2. **Precision vs recall** preference: legal/medical may need **high precision**
3. **Freshness**: how often do vectors change? **Real-time** upserts?
4. **Ground truth**: any labeled **relevant** pairs for eval?

### Metrics

- **Recall@k**, **MRR**, **nDCG** when relevance judgments exist; otherwise **human** spot checks + **proxy** tasks

**Exit condition:** **Success metric** and **minimum acceptable** recall/latency stated.

---

## Stage 2: Embeddings & Schema

**Goal:** A **stable embedding pipeline** with **versioning** and **metadata** design.
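The Stage 1 metrics above can be computed in a few lines of Python once you have judged (query, relevant-ids) pairs. A minimal sketch, assuming binary relevance judgments; the document ids are illustrative:

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of the judged-relevant ids found in the top-k retrieved list."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & set(relevant))
    return hits / len(relevant)

def mrr(retrieved, relevant):
    """Reciprocal rank of the first relevant hit (0.0 if none retrieved)."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0

# Example: ANN returned [d3, d1, d9]; the judged-relevant set is {d1, d7}.
print(recall_at_k(["d3", "d1", "d9"], {"d1", "d7"}, k=3))  # 0.5 (1 of 2 found)
print(mrr(["d3", "d1", "d9"], {"d1", "d7"}))               # 0.5 (first hit at rank 2)
```

Averaging these over a golden query set gives the single numbers to track across index-parameter sweeps.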
### Embeddings

- **Model choice**: domain fit (code vs general text); **dimension**; **distance metric** (cosine vs dot vs L2)—**match** the DB's defaults
- **Chunking** strategy upstream—bad chunks → bad retrieval regardless of DB

### Schema

- **Payload/metadata** per vector: `doc_id`, `tenant_id`, `acl`, `source`, **timestamps**
- **Multi-vector** per doc (passages) vs a single centroid—**trade-offs**

### Versioning

- **Re-embed** everything on model change—**plan** downtime or a **dual-write** period

**Exit condition:** **ID strategy** + **metadata filter** needs documented.

---

## Stage 3: Index & Parameters

**Goal:** Pick the **index type** and **build params** for your data size and recall target.

### Common families (vendor-specific names)

- **HNSW**: strong latency/recall; **memory** hungry; **tunable** `efConstruction`, `M`
- **IVF**: better memory footprint; needs **training**; tune `nlist` and the **probe** count
- **PQ/OPQ**: compression—takes a **recall** hit; good for **huge** scale

### Tuning loop

- Start with **defaults**; **sweep** parameters against **benchmark** queries
- Watch **insert** throughput during **index build** on large backfills

**Exit condition:** **Benchmark** results: p95 latency vs recall at fixed **k**.

---

## Stage 4: Hybrid Search & Filtering

**Goal:** Combine **vector** similarity with **structured** constraints—most production systems need this.

### Patterns

- **Pre-filter** metadata (tenant, date) **before** ANN when supported—**verify** filter selectivity
- **Hybrid**: BM25 + vector with **weighted** fusion or a **rerank** stage
- **Reranking**: cross-encoder on the **top-k** candidates—**quality** boost, **latency** cost

### Pitfalls

- **Filtering** that leaves **too few** candidates—empty results despite “similar” items existing in other tenants

**Exit condition:** **Query plan** documented: ANN → filter → rerank (as applicable).

---

## Stage 5: Operations & Cost

**Goal:** **Reliable** ingestion, **monitoring**, and **predictable** bills.
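The Stage 4 hybrid pattern is commonly implemented with Reciprocal Rank Fusion (RRF), which merges ranked lists (e.g. BM25 and vector results) without score normalization. A minimal sketch; the constant `k=60` is the value commonly cited in the RRF literature, not a vendor setting, and the doc ids are illustrative:

```python
def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion: each doc scores sum(1 / (k + rank)) over
    every ranked list it appears in; higher total = better fused rank."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25_top = ["d2", "d5", "d1"]    # keyword hits
vector_top = ["d1", "d2", "d7"]  # ANN hits
# d2 and d1 appear near the top of both lists, so they lead the fused ranking
print(rrf_fuse([bm25_top, vector_top]))
```

RRF is robust when BM25 and cosine scores live on incomparable scales; weighted score fusion needs per-source normalization first.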
### Ops

- **Upsert** idempotency; **delete** tombstones for compliance
- **Backups**, **multi-region** if needed—check each vendor's **eventual consistency** semantics
- **Capacity**: memory per node vs **sharding**; **replication** factor

### Cost

- **Managed** pricing scales with **dimension × count**, plus **egress** and **query** units—**estimate** from **peak QPS**

**Exit condition:** A **runbook** for **reindex**, **scaling**, and the **incident** “search degraded.”

---

## Stage 6: Evaluation & Iteration

**Goal:** **Continuous** improvement with **labeled** or **proxy** eval.

### Loop

- **Golden** query set updated when the product changes
- **A/B** embedding models or **rerankers** with **guardrails** on latency
- **Monitor** **click-through**, **thumbs**, or **human** grading in RAG

### Debugging bad retrieval

- **Chunk** inspection, **metadata** leaks, **wrong** tenant filter, **stale** index

---

## Final Review Checklist

- [ ] Metrics and embedding/model versioning plan
- [ ] Index family chosen with benchmark evidence
- [ ] Hybrid/filter strategy matches product needs
- [ ] Ops: upsert, delete, scaling, backup understood
- [ ] Eval set and iteration process in place

## Tips for Effective Guidance

- **Never** promise “semantic search understands intent”—**ground** claims with eval.
- **pgvector** vs a **specialized** engine: trade-offs on **scale**, **ops**, **features**—state them honestly.
- Warn: **high-cardinality** filters + ANN can be **slow**—**design** metadata carefully.

## Handling Deviations

- **Tiny corpus**: **brute force** or a **simple** index may suffice—avoid over-engineering.
- **Multimodal**: **separate** embedding spaces or a **unified** model—a **fusion** strategy is required.
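The Stage 5 capacity question yields to back-of-envelope arithmetic. A sketch assuming float32 vectors and the commonly used HNSW link-overhead estimate of `M * 2 * 4` bytes per element (as documented for hnswlib); real overhead varies by engine, so treat the constants as assumptions:

```python
def hnsw_memory_gib(n_vectors, dim, m=16, bytes_per_float=4):
    """Rough RAM estimate for an HNSW index: raw float32 vectors plus
    graph links (~ M * 2 links/vector * 4 bytes/link). A planning
    heuristic, not any vendor's exact formula."""
    raw = n_vectors * dim * bytes_per_float
    graph = n_vectors * m * 2 * 4
    return (raw + graph) / (1024 ** 3)

# 50M vectors at 768 dims: raw vector storage dominates the graph links
print(round(hnsw_memory_gib(50_000_000, 768), 1))  # 149.0 (GiB)
```

Numbers like this drive the shard-vs-bigger-node decision and make managed-service quotes auditable before signing.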

Tags

skill ai

Install via Conversation

This skill can be installed via conversation on the following platforms:

OpenClaw WorkBuddy QClaw Kimi Claude

Method 1: Install SkillHub and the skill

Help me install SkillHub and the vector-databases-1776031277 skill

Method 2: Set SkillHub as the preferred skill installation source

Set SkillHub as my preferred skill installation source, then help me install the vector-databases-1776031277 skill

Install via Command Line

skillhub install vector-databases-1776031277

Download Zip Package

⬇ Download vector-databases v1.0.0

File size: 3.24 KB | Published: 2026-4-13 12:26

v1.0.0 (latest) 2026-4-13 12:26
Initial release with a comprehensive workflow for selecting and optimizing vector databases.

- Covers end-to-end process: problem definition, embeddings, index algorithms, hybrid search, operations, and evaluation.
- Supports decision-making for Pinecone, Milvus, Qdrant, Weaviate, pgvector, OpenSearch kNN, and similar tools.
- Emphasizes recall, latency, cost trade-offs, tuning, and metadata filtering.
- Includes actionable checklists, metrics, and guidance for troubleshooting and iteration.
