RAG with pgvector and Supabase: From Zero to Semantic Search
How to build a production RAG pipeline using pgvector on Supabase — chunking strategy, embedding batches, cosine similarity search, and a cache layer to cut API costs.
Retrieval-Augmented Generation (RAG) is how you give an LLM memory without fine-tuning. This is the exact implementation we use in our Company Brain, a searchable knowledge base of 300+ Twitter bookmarks and internal docs.

Why pgvector Over a Dedicated Vector DB

Pinecone, Weaviate, and Qdrant are all great, but each adds infrastructure cost and complexity. If you're already on Supabase, pgvector gives you vector search inside your existing Postgres. At under 10k rows, performance is identical.

Schema

No index is needed under 1,000 rows: a sequential scan is fast enough and simpler.

Embedding with Gemini (Free)

Gemini's gemini-embedding-001 model produces 3072-dimensional embeddings and is available on the free tier. We batch in groups of 50 (the API limit is 100; we use 50 for safety). Partial failures return zero-vectors and set metadata.needs_reembed=True, so the affected rows are fixed on the next sync without being dropped.

Cache Layer to Avoid Re-embedding Queries

The same question asked twice should not hit the embedding API twice. The similarity threshold is critical: 0.95 means only near-identical queries hit the cache. Lower it and you get false matches, such as "what services do we offer" matching "what are the biggest AI trends". We learned this the hard way.

Full Query Pipeline

This pattern (embed → cache check → retrieve → generate → cache write) handles 90% of RAG use cases.

---

We Run This in Production

This RAG pipeline is the Company Brain behind our AI assistant. It answers questions about our services, past projects, and industry knowledge by searching a vector database of 300+ curated sources, with a cache layer that cuts API costs to near zero.

If you want a knowledge base that actually answers questions about your business, rather than hallucinating, this is the architecture to build on.

Talk to us about building your Company Brain → https://appopoleis.com/contact

---

Related Reading

- LLM Router with Fallback Chains and Circuit Breakers: https://appopoleis.com/blog/llm-router-fallback-chains-ci
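Appendix: a pipeline sketch

To make the flow described in this article concrete, here is a minimal, self-contained Python sketch of two pieces: batching embeddings in groups of 50 with a zero-vector fallback, and the embed → cache check → retrieve → generate → cache write sequence with a 0.95 cosine-similarity threshold. It is not our production code. Every name here (`embed_in_batches`, `QueryCache`, `answer_query`, and the `embed`, `retrieve`, and `generate` callables) is hypothetical, the cache is an in-memory list standing in for a database table, and a whole failed batch is assumed to be the unit of failure.

```python
import math
from typing import Callable, Dict, List, Optional, Tuple

Vector = List[float]

BATCH_SIZE = 50         # from the article: API limit is 100, 50 used for safety
CACHE_THRESHOLD = 0.95  # from the article: only near-identical queries hit cache


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def embed_in_batches(
    texts: List[str],
    embed_batch: Callable[[List[str]], List[Vector]],  # hypothetical API wrapper
    dim: int,
    batch_size: int = BATCH_SIZE,
) -> List[Dict]:
    """Embed texts batch by batch. A failed batch (assumption: the whole
    batch fails together) yields zero-vectors flagged needs_reembed=True,
    so the rows are kept and can be retried on the next sync."""
    rows = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        try:
            vectors = embed_batch(batch)
            needs_reembed = False
        except Exception:
            vectors = [[0.0] * dim for _ in batch]
            needs_reembed = True
        for text, vec in zip(batch, vectors):
            rows.append({
                "content": text,
                "embedding": vec,
                "metadata": {"needs_reembed": needs_reembed},
            })
    return rows


class QueryCache:
    """In-memory stand-in for a cache table keyed by query embedding."""

    def __init__(self, threshold: float = CACHE_THRESHOLD) -> None:
        self.threshold = threshold
        self._entries: List[Tuple[Vector, str]] = []

    def lookup(self, query_vec: Vector) -> Optional[str]:
        """Return the best cached answer at or above the threshold, if any."""
        best_answer, best_score = None, self.threshold
        for vec, answer in self._entries:
            score = cosine_similarity(query_vec, vec)
            if score >= best_score:
                best_answer, best_score = answer, score
        return best_answer

    def store(self, query_vec: Vector, answer: str) -> None:
        self._entries.append((query_vec, answer))


def answer_query(query, embed, retrieve, generate, cache: QueryCache) -> str:
    """embed -> cache check -> retrieve -> generate -> cache write."""
    query_vec = embed(query)
    cached = cache.lookup(query_vec)
    if cached is not None:
        return cached                   # cache hit: skip retrieval and the LLM
    context = retrieve(query_vec)       # e.g. a pgvector cosine-distance query
    answer = generate(query, context)   # e.g. an LLM call with retrieved chunks
    cache.store(query_vec, answer)
    return answer
```

In a real deployment, `embed` would wrap the embedding API, `retrieve` would run a similarity search against the pgvector column (pgvector's `<=>` operator computes cosine distance), and the cache would live in its own table rather than in process memory.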
Appopoleis Team