Home
Welcome to the LLM-QA-System wiki!
- Project Purpose
- Why This Project?
- What Makes This Project Different?
- How It Works (Architecture)
- How We Created It
- Getting Started
- API Reference
- Troubleshooting & FAQ
Project Purpose
The LLM-QA-System was created to make technical documentation truly searchable and interactive.
It allows users and teams to ask natural language questions about large, complex documentation and receive context-aware answers.
Motivation:
- Technical docs become huge and hard to search.
- Keyword search is too basic; it misses synonyms and context.
- LLMs and vector search enable "semantic" Q&A.
Goals:
- High-quality, interactive Q&A over any docs.
- Modular, open-source, and easy deployment.
- Extensible for any domain or team.
Why This Project?
- Existing solutions are often closed, expensive, or hard to extend.
- Keyword search misses intent and context.
- This project is fast, open, modular, and easy to run anywhere.
- Vision: Empower everyone to "chat with their docs" using state-of-the-art, open technology.
What Makes This Project Different?
- Modular: Rust backend and Python ML service are independent.
- Open: All code is open-source and easily extensible.
- Fast: Async Rust, efficient Python, and FAISS for semantic search.
- Trustworthy: Returns both answers and the source snippets for verification.
- Easy to Deploy: One-command Docker Compose setup.
- Developer Friendly: Clear and documented API, easy to extend or swap out any component.
How It Works (Architecture)
Components:
- Rust Backend API (Actix-web): Receives REST requests, forwards queries to Python, and returns answers plus sources.
- Python ML Service (FastAPI): Loads a transformer model (e.g., BERT), embeds docs and queries, uses FAISS for similarity search, and returns the best matches.
- FAISS Vector Database: Stores vector embeddings of doc chunks for fast semantic search.
Typical Flow:
- User sends a question to the Rust backend (/api/query).
- Rust backend forwards the question to the Python ML service.
- Python ML service embeds the question, searches the FAISS database, and finds the best matching doc snippets.
- Python service returns the answer and sources to the Rust backend.
- Rust backend responds to the user.
Diagram:
Client (UI/CLI) <---> Rust Backend API <---> Python ML Inference Service <---> FAISS Vector DB
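To make the flow concrete, here is a minimal sketch of the ML service's query path, assuming a sentence-transformers style embedding model and the prebuilt FAISS index. The model name and helper function below are illustrative, not the repository's exact code:

```python
# Illustrative sketch of the ML service's query path (not the exact repository code).
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")   # assumed embedding model; the repo may use another
index = faiss.read_index("data/faiss.index")      # vectors produced by the indexing script
chunks = open("data/docs.txt", encoding="utf-8").read().splitlines()

def answer(question: str, k: int = 3) -> dict:
    # Embed the question into the same vector space as the doc chunks.
    query_vec = np.asarray(model.encode([question]), dtype=np.float32)
    # FAISS returns the k nearest chunks by vector distance.
    _, ids = index.search(query_vec, k)
    sources = [chunks[i] for i in ids[0] if i != -1]
    # The response mirrors the public API: an answer built from the best snippet, plus sources.
    return {"answer": "Relevant info: " + sources[0], "sources": sources}
```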
How We Created It
- Requirements:
  - Needed a better documentation search/Q&A tool.
  - Decided on features: semantic search, modular, open, easy to deploy.
- Technology Choices:
  - Rust (backend): performance, reliability, async support.
  - Python (ML): best ML libraries.
  - FAISS: fast vector search.
  - Docker Compose: easy orchestration.
- System Design:
  - RESTful APIs for clear separation.
  - Clean contracts between backend and ML service.
- Implementation:
  - Rust backend with /api/query and /api/status.
  - Python FastAPI service for model inference and FAISS search.
  - Preprocessing/indexing script for docs (see the sketch after this list).
  - Dockerfiles for both services and a shared data volume.
  - CI workflow for automated build/test.
- Documentation:
  - GitHub Wiki for step-by-step documentation.
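For reference, the indexing step works roughly as follows. This is a minimal sketch, assuming line-based chunking and a sentence-transformers embedding model; the real scripts/build_index.py may differ in model choice and index type:

```python
# Illustrative sketch of doc preprocessing/indexing (details may differ from scripts/build_index.py).
from pathlib import Path
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

chunks = []
for path in Path("data/raw").glob("*"):
    if path.suffix in {".txt", ".md"}:
        # Each non-empty line becomes one chunk, matching the "line or paragraph" chunking described below.
        chunks += [line.strip() for line in path.read_text(encoding="utf-8").splitlines() if line.strip()]

model = SentenceTransformer("all-MiniLM-L6-v2")   # assumption: any sentence-embedding model works here
vectors = np.asarray(model.encode(chunks), dtype=np.float32)

index = faiss.IndexFlatL2(vectors.shape[1])       # simple exact-search index
index.add(vectors)

faiss.write_index(index, "data/faiss.index")
Path("data/docs.txt").write_text("\n".join(chunks), encoding="utf-8")
```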
Getting Started
Prerequisites:
- Docker & Docker Compose
- Python 3.8+ (for preprocessing)
- Git
- Clone the repository:
  git clone https://github.com/c0d3h01/llm-qa-system.git
  cd llm-qa-system
- Add your documentation:
  - Place .txt or .md files into data/raw/.
  - Each line or paragraph will be a chunk.
- Build the vector index (a quick sanity check of the resulting index is sketched after these steps):
  pip install -r ml-inference/requirements.txt
  python scripts/build_index.py
- Start the system:
  docker compose up --build
- Test the API:
  curl http://localhost:8080/api/status
  curl -X POST http://localhost:8080/api/query \
    -H "Content-Type: application/json" \
    -d '{"question": "What is OAuth2?"}'
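If you want to confirm that the index from the build step is in sync with the doc chunks before querying, a small check like the following works (a convenience snippet, not part of the repository):

```python
# Quick sanity check: the number of indexed vectors should match the number of doc chunks.
import faiss

index = faiss.read_index("data/faiss.index")
chunks = open("data/docs.txt", encoding="utf-8").read().splitlines()

print(f"{index.ntotal} vectors indexed, {len(chunks)} chunks in docs.txt")
assert index.ntotal == len(chunks), "index and docs.txt are out of sync - rebuild the index"
```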
API Reference
- GET /api/status
  Health check. Returns "LLM-QA Backend is running".
- POST /api/query
  Request: { "question": "How do I reset my password?" }
  Response: { "answer": "Relevant info: ...", "sources": [ "source snippet 1", "source snippet 2" ] }
- POST /inference
  Internal endpoint for answering queries (used by the backend).
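For programmatic access, a minimal Python client can exercise the two public endpoints using the request and response shapes documented above (the requests library is assumed to be installed):

```python
# Minimal client for the public API, using the documented request/response shapes.
import requests

BASE_URL = "http://localhost:8080"

# Health check.
status = requests.get(f"{BASE_URL}/api/status", timeout=10)
print(status.text)  # "LLM-QA Backend is running"

# Ask a question and print the answer with its source snippets.
resp = requests.post(
    f"{BASE_URL}/api/query",
    json={"question": "How do I reset my password?"},
    timeout=30,
)
resp.raise_for_status()
body = resp.json()
print(body["answer"])
for source in body["sources"]:
    print("-", source)
```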
Troubleshooting & FAQ
- ML service fails to start:
  - Ensure Python dependencies are installed: pip install -r ml-inference/requirements.txt
  - Check that data/faiss.index and data/docs.txt exist.
- Rust backend can't connect:
  - Check the ML_INFERENCE_URL environment variable.
  - Use docker compose logs for debugging.
- Query returns no results:
  - Make sure docs are in data/raw/ and the index has been rebuilt.
- Can I use my own transformer model?
  Yes, change MODEL_NAME in ml-inference/app.py or set it via an environment variable (see the sketch below).
- How do I add new docs?
  Put them in data/raw/, rebuild the index, and restart the stack.
- Is this production ready?
  See Improvement Ideas for steps to productionize.
- Can I run this on Windows/Mac?
  Yes, with Docker Desktop.
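As an illustration of the environment-variable route mentioned in the first FAQ entry, ml-inference/app.py presumably reads MODEL_NAME along these lines; the default model name below is only a placeholder, not necessarily the repository's default:

```python
# Hypothetical illustration of how ml-inference/app.py could read MODEL_NAME;
# the actual default model in the repository may differ.
import os
from sentence_transformers import SentenceTransformer

MODEL_NAME = os.environ.get("MODEL_NAME", "all-MiniLM-L6-v2")  # placeholder default
model = SentenceTransformer(MODEL_NAME)
```

With Docker Compose, setting MODEL_NAME in the ML service's environment then switches models without editing the code.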