DocIntel

In Progress

Enterprise Document Intelligence

A 10-component distributed document intelligence system built with Kotlin API gateway, Python Haystack RAG pipeline, and full Docker orchestration. Demonstrates production-grade patterns: hybrid search with RRF fusion, multi-tenant isolation, semantic caching, and LiteLLM provider abstraction.

GitHub ↗ Medium

// screenshots

// features

What it does

🔍

Hybrid Search with RRF Fusion

Combines BM25 keyword search and vector similarity for superior retrieval quality.

🔒

Multi-Tenant Document Isolation

Document-level access control via Qdrant payload filtering with pre-filter ACLs.

⚡

Semantic Caching

Qdrant dual-use for vector storage and query caching, reducing LLM costs.

🔄

LiteLLM Provider Abstraction

Same code for Ollama (local) or any cloud LLM provider — no vendor lock-in.

📊

Langfuse Observability

Self-hosted tracing and monitoring for the entire RAG pipeline.

// architecture

Under the hood

10-component distributed system with Kotlin API gateway, Python RAG pipeline, and Docker orchestration.

API Gateway

Kotlin/Spring Boot

REST API, authentication, rate limiting

Document Service

Kotlin/Spring Boot

Document ingestion, chunking, processing

RAG Service

Python/Haystack + LiteLLM

Query pipeline with hybrid retrieval and reranking

Admin Service

Kotlin/Spring Boot

Tenant management and system administration

Qdrant

Vector DB

Vector storage and semantic cache

PostgreSQL + pgvector

Database

Metadata storage and fallback vector search

// user journeys

How it gets used

Upload and Query

1 Upload documents via API gateway
2 Documents chunked and embedded automatically
3 Ask natural language questions
4 Get sourced answers with citations

// tech stack

HaystackKotlinQdrantLiteLLMDocker