Upload and Query
- 1 Upload documents via API gateway
- 2 Documents chunked and embedded automatically
- 3 Ask natural language questions
- 4 Get sourced answers with citations
Enterprise Document Intelligence
A 10-component distributed document intelligence system built with Kotlin API gateway, Python Haystack RAG pipeline, and full Docker orchestration. Demonstrates production-grade patterns: hybrid search with RRF fusion, multi-tenant isolation, semantic caching, and LiteLLM provider abstraction.
// screenshots
// features
Combines BM25 keyword search and vector similarity for superior retrieval quality.
Document-level access control via Qdrant payload filtering with pre-filter ACLs.
Qdrant dual-use for vector storage and query caching, reducing LLM costs.
Same code for Ollama (local) or any cloud LLM provider — no vendor lock-in.
Self-hosted tracing and monitoring for the entire RAG pipeline.
// architecture
10-component distributed system with Kotlin API gateway, Python RAG pipeline, and Docker orchestration.

// user journeys
// tech stack
Ready when you are