Vector Databases

Vector databases store embeddings and provide fast similarity search. Most use approximate nearest neighbor (ANN) algorithms to find the closest vectors in milliseconds, even across millions of documents. This guide covers four popular options: FAISS, Chroma, PGVector, and Qdrant.
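
To make "similarity search" concrete, here is a minimal sketch of the exact, brute-force version of the problem that ANN algorithms approximate. The vectors are made-up 3-dimensional toys rather than real embeddings, and the function names are illustrative only.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def brute_force_search(query, vectors, k):
    """Score every stored vector against the query and return the top k (index, score) pairs."""
    scored = [(i, cosine_similarity(query, v)) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 3-dimensional "embeddings" (made up for illustration)
vectors = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
query = [1.0, 0.05, 0.0]

top = brute_force_search(query, vectors, k=2)
print([i for i, _ in top])  # prints [0, 1]
```

This scan touches every stored vector, so its cost grows linearly with the collection size. ANN indexes trade a little accuracy to avoid that full scan.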

FAISS

FAISS (Facebook AI Similarity Search) is a library for efficient similarity search and clustering of dense vectors. It runs entirely in memory and is best for single-machine use cases.

bash

uv add faiss-cpu

python

import faiss
import numpy as np
from openai import OpenAI

client = OpenAI()

# Sample documents
documents = [
    "Python is a popular programming language for data science.",
    "Docker containers package applications with their dependencies.",
    "Machine learning models learn patterns from training data.",
    "FastAPI is a modern web framework for building APIs.",
    "Neural networks are inspired by the human brain's architecture.",
    "Git tracks changes in source code over time.",
]

# Get embeddings
def get_embeddings(texts: list[str]) -> np.ndarray:
    response = client.embeddings.create(
        input=texts, model="text-embedding-3-small"
    )
    return np.array([item.embedding for item in response.data], dtype=np.float32)

embeddings = get_embeddings(documents)

# Build a FAISS index
dimension = embeddings.shape[1]
index = faiss.IndexFlatL2(dimension)
index.add(embeddings)

print(f"Indexed {index.ntotal} documents")

# Search
query_embedding = get_embeddings(["How do AI models learn?"])
distances, indices = index.search(query_embedding, k=3)

print("\nTop 3 results:")
for dist, idx in zip(distances[0], indices[0]):
    print(f"  [distance={dist:.3f}] {documents[idx]}")
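
A note on reading the distances printed above: IndexFlatL2 reports squared Euclidean distances. If the embeddings are unit length (an assumption here; check your embedding model's documentation), squared L2 distance and cosine similarity carry the same ranking information, because d² = 2 − 2·cos. The sketch below checks that identity on two hand-made vectors using only the standard library.

```python
import math


def normalize(v: list[float]) -> list[float]:
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def squared_l2(a: list[float], b: list[float]) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def dot(a: list[float], b: list[float]) -> float:
    """Inner product of two vectors."""
    return sum(x * y for x, y in zip(a, b))


a = normalize([3.0, 4.0])  # [0.6, 0.8]
b = normalize([4.0, 3.0])  # [0.8, 0.6]

d2 = squared_l2(a, b)  # squared Euclidean distance
cos = dot(a, b)        # equals cosine similarity because both vectors are unit length

print(round(d2, 6), round(2 - 2 * cos, 6))  # prints 0.08 0.08
```

Under that assumption, sorting by ascending L2 distance and sorting by descending inner product return neighbors in the same order.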

FAISS with L2 Distance vs Inner Product

python

# L2 distance (Euclidean) — smaller is more similar
index_l2 = faiss.IndexFlatL2(dimension)

# Inner product — larger is more similar (use with normalized vectors)
index_ip = faiss.IndexFlatIP(dimension)

# For large datasets, use IVF (Inverted File) for faster search
nlist = 100  # Number of clusters
quantizer = faiss.IndexFlatL2(dimension)
index_ivf = faiss.IndexIVFFlat(quantizer, dimension, nlist)

# Train on the data (required for IVF). Training needs at least nlist vectors,
# so this fails on the six sample documents above; use a real corpus here.
index_ivf.train(embeddings)
index_ivf.add(embeddings)

# Number of clusters to probe per query: higher is more accurate but slower
index_ivf.nprobe = 10
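
The nlist and nprobe trade-off is easier to see in a toy model. The following is a simplified pure-Python sketch of the inverted-file idea, not FAISS's implementation: the two centroids are hand-picked stand-ins for the k-means step that training performs, and all names are illustrative.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(items, query, n):
    """Indices of the n entries in `items` closest to `query`."""
    order = sorted(range(len(items)), key=lambda i: squared_l2(items[i], query))
    return order[:n]


# Hand-picked centroids stand in for the clustering that training would learn.
centroids = [[0.0, 0.0], [10.0, 0.0]]
vectors = [[1.0, 0.0], [2.0, 1.0], [5.5, 0.0], [9.0, 0.0], [10.0, 1.0]]

# "add": put each vector id into the inverted list of its nearest centroid.
inverted_lists = {c: [] for c in range(len(centroids))}
for vid, vec in enumerate(vectors):
    inverted_lists[nearest(centroids, vec, 1)[0]].append(vid)


def ivf_search(query, k, nprobe):
    """Scan only the nprobe inverted lists whose centroids are closest to the query."""
    candidates = []
    for c in nearest(centroids, query, nprobe):
        candidates.extend(inverted_lists[c])
    candidates.sort(key=lambda vid: squared_l2(vectors[vid], query))
    return candidates[:k]


query = [4.8, 0.0]
print(ivf_search(query, k=1, nprobe=1))  # prints [1]: only one cluster is scanned
print(ivf_search(query, k=1, nprobe=2))  # prints [2]: every cluster is scanned
```

The query sits near a cluster boundary, so probing a single list misses the true nearest neighbor (vector 2) and returns vector 1 instead. Raising nprobe recovers it at the cost of scanning more vectors.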

When to use FAISS

FAISS is ideal when all your data fits in memory on a single machine. For datasets under 10M vectors, it is often faster than a dedicated vector database. For multi-user access, use a server-backed store such as Chroma or Qdrant; for distributed deployments, use Qdrant.
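
Whether the data "fits in memory" can be estimated up front. As a rough sketch, a flat float32 index needs about 4 bytes per dimension per vector; the helper name below is hypothetical, and compressed index types or index overhead would change the numbers.

```python
def flat_index_bytes(num_vectors: int, dimension: int, bytes_per_value: int = 4) -> int:
    """Raw storage for a flat float32 index: one value per dimension per vector."""
    return num_vectors * dimension * bytes_per_value


# 1536 is the default output size of text-embedding-3-small
for n in (100_000, 1_000_000, 10_000_000):
    gib = flat_index_bytes(n, 1536) / 2**30
    print(f"{n:>10,} vectors: {gib:6.2f} GiB")
```

At 1536 dimensions, 10M uncompressed vectors come to roughly 57 GiB, which is why larger collections usually need quantization, smaller embeddings, or a different store.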

Chroma

Chroma is an open-source embedding database designed for building LLM applications. It handles embedding, storage, and retrieval in one package.

bash

uv add chromadb

python

import chromadb

# Create a persistent client
client = chromadb.PersistentClient(path="./chroma_db")

# Create or get a collection
collection = client.get_or_create_collection(
    name="tds_docs",
    metadata={"hnsw:space": "cosine"}  # Use cosine similarity
)

# Add documents (Chroma can generate embeddings automatically)
documents = [
    "Python is a popular programming language for data science.",
    "Docker containers package applications with their dependencies.",
    "Machine learning models learn patterns from training data.",
]
ids = ["doc1", "doc2", "doc3"]

collection.add(
    documents=documents,
    ids=ids,
    metadatas=[
        {"source": "python.md", "category": "programming"},
        {"source": "docker.md", "category": "devops"},
        {"source": "ml.md", "category": "ai"},
    ]
)

# Query
results = collection.query(
    query_texts=["How do AI models learn?"],
    n_results=3,
    where={"category": "ai"},  # Optional metadata filter
)

for doc, dist in zip(results["documents"][0], results["distances"][0]):
    print(f"[distance={dist:.3f}] {doc}")

# Update documents
collection.update(
    ids=["doc1"],
    documents=["Python is the most popular programming language for data science and ML."],
)

# Delete documents
collection.delete(ids=["doc3"])
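
Conceptually, a metadata filter such as the where clause in the query above narrows the candidate set, and only the surviving records are ranked by distance. The sketch below models that behavior in plain Python with made-up 2-D vectors. It illustrates the semantics only and is not how Chroma implements filtering internally.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def filtered_query(query, records, where, n_results):
    """Keep records whose metadata matches every key in `where`, then rank by distance."""
    matches = [
        r for r in records
        if all(r["metadata"].get(key) == value for key, value in where.items())
    ]
    matches.sort(key=lambda r: squared_l2(r["vector"], query))
    return [r["id"] for r in matches[:n_results]]


# Toy 2-D vectors (made up); ids and categories mirror the example above
records = [
    {"id": "doc1", "vector": [0.9, 0.1], "metadata": {"category": "programming"}},
    {"id": "doc2", "vector": [0.1, 0.9], "metadata": {"category": "devops"}},
    {"id": "doc3", "vector": [0.8, 0.3], "metadata": {"category": "ai"}},
]

print(filtered_query([1.0, 0.0], records, where={"category": "ai"}, n_results=3))
# prints ['doc3']
print(filtered_query([1.0, 0.0], records, where={}, n_results=3))
# prints ['doc1', 'doc3', 'doc2']
```

The filtered query asks for three results but returns one, because only one record has the "ai" category. The same applies to the Chroma query above.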

PGVector

PGVector adds vector similarity search to PostgreSQL. If you already use Postgres, PGVector is the simplest way to add vector search.

sql

-- Enable the extension
CREATE EXTENSION IF NOT EXISTS vector;

-- Create a table with a vector column
CREATE TABLE documents (
    id SERIAL PRIMARY KEY,
    content TEXT NOT NULL,
    source TEXT,
    embedding VECTOR(1536)
);

-- Insert with embedding
INSERT INTO documents (content, source, embedding)
VALUES (
    'Python is a popular programming language.',
    'python.md',
    '[0.0023, -0.0147, 0.0381, ...]'::vector
);

-- Create an index for fast search
-- (ivfflat picks its clusters from existing rows, so build it after loading data)
CREATE INDEX ON documents USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);

-- Query for similar documents
SELECT content, source, 1 - (embedding <=> '[0.01, -0.02, ...]'::vector) AS similarity
FROM documents
ORDER BY embedding <=> '[0.01, -0.02, ...]'::vector
LIMIT 5;

Python usage:

python

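import psycopg2
from pgvector.psycopg2 import register_vector

conn = psycopg2.connect("postgresql://user:pass@localhost/tds_db")
register_vector(conn)  # Lets psycopg2 send and receive NumPy arrays as vectors

# embedding_array and query_embedding are 1-D NumPy arrays of length 1536,
# e.g. one row of the get_embeddings() output from the FAISS example

# Insert
cur = conn.cursor()
cur.execute(
    "INSERT INTO documents (content, source, embedding) VALUES (%s, %s, %s)",
    ("Python is popular", "python.md", embedding_array)
)
conn.commit()  # psycopg2 does not autocommit by default

# Search
cur.execute(
    """SELECT content, 1 - (embedding <=> %s) AS similarity
       FROM documents ORDER BY embedding <=> %s LIMIT 5""",
    (query_embedding, query_embedding)
)
for content, similarity in cur.fetchall():
    print(f"[similarity={similarity:.3f}] {content}")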
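
If you prefer not to register the NumPy adapter, a vector can also be passed as text in the bracketed form used by the SQL examples above and cast with ::vector. The two helpers below are hypothetical names for a minimal sketch of that conversion.

```python
def to_pgvector_literal(values: list[float]) -> str:
    """Format a sequence of numbers in pgvector's text form, e.g. '[0.1,0.2,0.3]'."""
    return "[" + ",".join(repr(float(v)) for v in values) + "]"


def from_pgvector_literal(text: str) -> list[float]:
    """Parse pgvector's text form back into a list of floats."""
    body = text.strip()[1:-1]
    return [float(part) for part in body.split(",")] if body else []


literal = to_pgvector_literal([0.25, -1, 3.5])
print(literal)                         # prints [0.25,-1.0,3.5]
print(from_pgvector_literal(literal))  # prints [0.25, -1.0, 3.5]
```

The resulting string would be bound as an ordinary query parameter, with the SQL written as %s::vector so Postgres performs the cast.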

Qdrant

Qdrant is a high-performance vector database with advanced filtering and payload support.

bash

uv add qdrant-client

python

from qdrant_client import QdrantClient
from qdrant_client.models import (
    Distance, FieldCondition, Filter, MatchValue, PointStruct, VectorParams,
)

# Connect to local Qdrant (or use Qdrant Cloud)
client = QdrantClient(":memory:")  # In-memory for testing
# client = QdrantClient(url="http://localhost:6333")  # Production

# Create a collection
client.create_collection(
    collection_name="tds_docs",
    vectors_config=VectorParams(size=1536, distance=Distance.COSINE),
)

# Insert points
# (embedding1, embedding2 and query_embedding are lists of 1536 floats)
points = [
    PointStruct(
        id=1,
        vector=embedding1,
        payload={"content": "Python is popular", "source": "python.md"}
    ),
    PointStruct(
        id=2,
        vector=embedding2,
        payload={"content": "Docker packages apps", "source": "docker.md"}
    ),
]
client.upsert(collection_name="tds_docs", points=points)

# Search with metadata filtering
# Note: newer qdrant-client releases deprecate search() in favor of query_points()
results = client.search(
    collection_name="tds_docs",
    query_vector=query_embedding,
    query_filter=Filter(
        must=[FieldCondition(key="source", match=MatchValue(value="python.md"))]
    ),
    limit=5,
)

for hit in results:
    print(f"[score={hit.score:.3f}] {hit.payload['content']}")

Comparison

Feature	FAISS	Chroma	PGVector	Qdrant
Setup	Library	Server/Embedded	Postgres extension	Server/Docker
Persistence	Manual	Built-in	Postgres	Built-in
Metadata filter	No	Yes	SQL WHERE	Yes
Distributed	No	No	Postgres replication	Yes
Best for	Prototyping	Small apps	Existing Postgres	Production RAG
