Agentic Search Engine
- Overview
- Features
- Architecture
- Technology Stack
- Quick Start
- Installation
- Configuration
- API Reference
- Frontend Components
- Usage Examples
- Multi-Agent Workflow
- File Processing
- Document Generation
- Development
- Deployment
NEXUS AI is an agentic search engine that combines LangChain, LangGraph, and Tavily to deliver fast, well-sourced answers. Built on a 13-agent architecture, it provides intelligent content analysis, multi-modal file processing, and professional document generation.
- 13-Agent Architecture: Specialized AI agents for different aspects of search and analysis
- Advanced Search: Powered by Tavily with intelligent content validation
- Multi-Modal Processing: Handle PDFs, documents, images, and videos seamlessly
- Professional UI: Modern React interface with dark/light modes
- Credibility Scoring: AI-powered source validation and fact-checking
- PDF Generation: Create professional research reports automatically
- Multi-Engine Search: Web, Academic, News, Images, Videos
- Intelligent Summarization: AI-powered content analysis
- Real-Time Processing: Live updates on search progress
- Follow-Up Questions: AI-generated exploration suggestions
- Multimodal Processor: Handles uploaded files and context
- Query Analyzer: Understands intent and complexity
- Search Strategist: Plans optimal search approach
- Content Validator: Scores credibility and quality
- Fact Checker: Validates information accuracy
- Synthesis Expert: Creates comprehensive answers
- Quality Assurance: Ensures output excellence
- File Upload: Support for PDF, DOC, TXT, Images
- Text Extraction: OCR and document parsing
- Thumbnail Generation: Visual previews
- Content Analysis: Intelligent metadata extraction
- PDF Generation: Create detailed research reports
- Custom Sections: Configurable document structure
- Media Integration: Include images and charts
- Citation Management: Automatic source attribution
- Modern Design: Clean, professional interface
- Dark/Light Mode: Customizable themes
- Responsive Layout: Works on all devices
- Real-Time Updates: Live search progress indicators
graph TB
A[Frontend - React] --> B[FastAPI Backend]
B --> C[LangGraph Workflow]
C --> D[13 Specialized Agents]
B --> E[Tavily Search Engine]
B --> F[Gemini LLM]
B --> G[File Processing]
B --> H[PDF Generation]
D --> D1[Multimodal Processor]
D --> D2[Query Analyzer]
D --> D3[Search Strategist]
D --> D4[Enhanced Search]
D --> D5[Content Validator]
D --> D6[Media Extractor]
D --> D7[Fact Checker]
D --> D8[Synthesis Expert]
D --> D9[Multimodal Synthesizer]
D --> D10[Summarization]
D --> D11[Citation Specialist]
D --> D12[Insight Generator]
D --> D13[Quality Assurance]
- Framework: FastAPI 0.104+
- AI/ML: LangChain, LangGraph, Google Gemini
- Search: Tavily Search API
- File Processing: PyPDF2, python-docx, Pillow, pytesseract
- PDF Generation: ReportLab
- Database: In-memory (production: PostgreSQL/MongoDB)
- Framework: React 18+
- Styling: Tailwind CSS
- Animations: Framer Motion
- State Management: React Hooks
- HTTP Client: Fetch API
- CORS: FastAPI CORS middleware
- File Storage: Local filesystem (production: AWS S3/GCS)
- Environment: Python 3.8+, Node.js 16+
- Python 3.8+
- Node.js 16+
- npm or yarn
git clone https://github.com/yourusername/nexus-ai.git
cd nexus-ai
# Navigate to backend directory
cd backend
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Set environment variables
export GEMINI_API_KEY="your_gemini_api_key"
export TAVILY_API_KEY="your_tavily_api_key"
# Start the backend server
uvicorn main:app --reload --host 0.0.0.0 --port 8000
# Navigate to frontend directory
cd ../frontend
# Install dependencies
npm install
# Start the development server
npm start
- Frontend: http://localhost:3000
- Backend API: http://localhost:8000
- API Documentation: http://localhost:8000/api/docs
Create a requirements.txt file:
fastapi==0.104.1
uvicorn[standard]==0.24.0
python-multipart==0.0.6
pydantic==2.5.0
langchain==0.0.345
langgraph==0.0.20
langchain-google-genai==0.0.6
langchain-community==0.0.10
tavily-python==0.3.0
requests==2.31.0
beautifulsoup4==4.12.2
PyPDF2==3.0.1
python-docx==1.1.0
pandas==2.1.4
numpy==1.24.3
Pillow==10.1.0
pytesseract==0.3.10
opencv-python==4.8.1.78
moviepy==1.0.3
SpeechRecognition==3.10.0
pydub==0.25.1
reportlab==4.0.7
psutil==5.9.6
Create a package.json file:
{
  "name": "nexus-ai-frontend",
  "version": "1.0.0",
  "private": true,
  "dependencies": {
    "react": "^18.2.0",
    "react-dom": "^18.2.0",
    "react-scripts": "5.0.1",
    "framer-motion": "^10.16.4",
    "tailwindcss": "^3.3.6",
    "autoprefixer": "^10.4.16",
    "postcss": "^8.4.32"
  },
  "scripts": {
    "start": "react-scripts start",
    "build": "react-scripts build",
    "test": "react-scripts test",
    "eject": "react-scripts eject"
  },
  "eslintConfig": {
    "extends": [
      "react-app",
      "react-app/jest"
    ]
  },
  "browserslist": {
    "production": [
      ">0.2%",
      "not dead",
      "not op_mini all"
    ],
    "development": [
      "last 1 chrome version",
      "last 1 firefox version",
      "last 1 safari version"
    ]
  }
}
Create a .env file in the backend directory:
# API Keys
GEMINI_API_KEY=your_google_gemini_api_key
TAVILY_API_KEY=your_tavily_search_api_key
# Server Configuration
API_BASE_URL=http://localhost:8000
CORS_ORIGINS=["http://localhost:3000", "http://127.0.0.1:3000"]
# File Upload Configuration
MAX_FILE_SIZE=104857600 # 100MB
UPLOAD_DIR=uploads
THUMBNAILS_DIR=thumbnails
GENERATED_DOCS_DIR=generated_docs
# Processing Configuration
MAX_SOURCES=12
DEFAULT_TIMEOUT=30
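
A minimal sketch of how these settings could be read in the backend, assuming plain os.getenv (main.py may load and validate them differently):

```python
import os

# Read API keys and processing limits from the environment,
# falling back to the development defaults from .env above.
GEMINI_API_KEY = os.getenv("GEMINI_API_KEY", "")
TAVILY_API_KEY = os.getenv("TAVILY_API_KEY", "")
MAX_FILE_SIZE = int(os.getenv("MAX_FILE_SIZE", str(100 * 1024 * 1024)))  # 100MB
MAX_SOURCES = int(os.getenv("MAX_SOURCES", "12"))
DEFAULT_TIMEOUT = int(os.getenv("DEFAULT_TIMEOUT", "30"))

if not GEMINI_API_KEY or not TAVILY_API_KEY:
    raise RuntimeError("GEMINI_API_KEY and TAVILY_API_KEY must be set")
```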
- Google Gemini API Key:
  - Visit Google AI Studio
  - Create a new API key
  - Add to environment variables
- Tavily Search API Key:
  - Visit Tavily
  - Sign up and get your API key
  - Add to environment variables
POST /search
Request Body:
{
  "query": "string",
  "search_mode": "comprehensive",
  "search_type": "web",
  "uploaded_file_ids": ["string"],
  "extract_media": true,
  "max_sources": 10,
  "stream": false
}
Response:
{
  "answer": "string",
  "sources": [
    {
      "id": "string",
      "title": "string",
      "url": "string",
      "snippet": "string",
      "credibility_score": 0.8,
      "domain": "string",
      "is_academic": false
    }
  ],
  "extracted_media": [
    {
      "type": "image",
      "title": "string",
      "url": "string",
      "thumbnail": "string"
    }
  ],
  "follow_up_questions": ["string"],
  "credibility_score": 0.85,
  "processing_time": 2.5
}
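
For a quick check outside the frontend, the endpoint can be called from Python with requests (already in requirements.txt); the localhost URL assumes the Quick Start setup:

```python
import requests

# Assumes the backend from the Quick Start is running on localhost:8000.
payload = {
    "query": "artificial intelligence trends 2024",
    "search_mode": "comprehensive",
    "search_type": "web",
    "max_sources": 10,
}

resp = requests.post("http://localhost:8000/search", json=payload, timeout=60)
resp.raise_for_status()
data = resp.json()

print(data["answer"][:500])
for source in data["sources"]:
    print(f'{source["credibility_score"]:.2f}  {source["title"]} ({source["url"]})')
```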
POST /upload
Request: Multipart form data with file
Response:
{
  "file_id": "string",
  "filename": "string",
  "file_type": "pdf",
  "file_size": 1024,
  "upload_time": "2024-01-01T00:00:00Z",
  "extracted_text": "string",
  "thumbnail_url": "string"
}
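
A minimal Python upload example (the sample filename is just a placeholder):

```python
import requests

# Upload a local PDF; the response includes a file_id that can be passed
# to /search via uploaded_file_ids.
with open("report.pdf", "rb") as f:
    resp = requests.post(
        "http://localhost:8000/upload",
        files={"file": ("report.pdf", f, "application/pdf")},
    )
resp.raise_for_status()
file_info = resp.json()
print(file_info["file_id"], file_info["file_type"], file_info["file_size"])
```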
POST /generate-document
Request Body:
{
  "title": "string",
  "query": "string",
  "sections": [
    {
      "id": 1,
      "title": "string",
      "content": "string",
      "type": "text"
    }
  ],
  "searchResults": {},
  "options": {
    "includeImages": true,
    "includeCharts": true,
    "includeSources": true
  }
}
Response: PDF file download
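
Since the endpoint returns the PDF itself, a client just writes the response body to disk; a minimal sketch in Python:

```python
import requests

doc_request = {
    "title": "AI Research Report",
    "query": "artificial intelligence",
    "sections": [
        {"id": 1, "title": "Executive Summary", "content": "", "type": "text"},
        {"id": 2, "title": "Sources", "content": "", "type": "sources"},
    ],
    "searchResults": {},  # pass the JSON returned by /search here
    "options": {"includeImages": True, "includeCharts": True, "includeSources": True},
}

resp = requests.post("http://localhost:8000/generate-document", json=doc_request, timeout=120)
resp.raise_for_status()

# The response body is the generated PDF; save it locally.
with open("ai_research_report.pdf", "wb") as f:
    f.write(resp.content)
```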
- Manages global state and routing
- Handles dark/light mode
- Coordinates search and file operations
<SourceCard
  source={sourceData}
  index={0}
  darkMode={true}
  onSummaryRequest={handleSummary}
/>

<SummaryPanel
  results={searchResults}
  query={searchQuery}
  darkMode={darkMode}
  onGenerateDocument={handleGenerate}
/>

<DocumentGenerator
  isOpen={isOpen}
  onClose={handleClose}
  searchResults={results}
  query={query}
  darkMode={darkMode}
/>

<FileCard
  file={fileData}
  darkMode={darkMode}
  onDelete={handleDelete}
/>
const { getRootProps, getInputProps, isDragActive } = useFileUpload({
  onDrop: handleFileDrop,
  accept: { 'application/pdf': ['.pdf'] },
  multiple: true
});
const { copied, copyToClipboard } = useCopyToClipboard();
// Perform a simple web search
const searchRequest = {
  query: "artificial intelligence trends 2024",
  search_type: "web",
  max_sources: 10
};

const response = await fetch('/search', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify(searchRequest)
});

const results = await response.json();
// Upload file and include in search
const formData = new FormData();
formData.append('file', file);

const uploadResponse = await fetch('/upload', {
  method: 'POST',
  body: formData
});

const fileData = await uploadResponse.json();

// Use file in search
const searchRequest = {
  query: "analyze this document for key insights",
  uploaded_file_ids: [fileData.file_id]
};
// Generate comprehensive PDF report
const documentRequest = {
  title: "AI Research Report",
  query: "artificial intelligence",
  sections: [
    { id: 1, title: "Executive Summary", content: "", type: "text" },
    { id: 2, title: "Sources", content: "", type: "sources" }
  ],
  searchResults: results,
  options: {
    includeImages: true,
    includeSources: true
  }
};

const pdfResponse = await fetch('/generate-document', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify(documentRequest)
});
The 13-agent architecture processes queries through specialized stages:
- Multimodal Processor - Processes uploaded files and context
- Query Analyzer - Analyzes intent, complexity, and domains
- Search Strategist - Plans optimal search approach
- Enhanced Search - Executes search using Tavily
- Content Validator - Validates and scores content credibility
- Media Extractor - Extracts relevant media from sources
- Fact Checker - Performs fact-checking and verification
- Synthesis Expert - Creates comprehensive answers
- Multimodal Synthesizer - Enhances with multimodal insights
- Summarization - Creates executive summaries
- Citation Specialist - Manages source citations
- Insight Generator - Generates follow-up questions
- Quality Assurance - Final quality check and scoring
# Example agent configuration
workflow = StateGraph(AdvancedAgentState)
# Add all agents
workflow.add_node("multimodal_processor", multimodal_processor_agent)
workflow.add_node("query_analyzer", query_analyzer_agent)
# ... other agents
# Define execution flow
workflow.set_entry_point("multimodal_processor")
workflow.add_edge("multimodal_processor", "query_analyzer")
# ... other edges
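
The graph is then terminated and compiled before use; a minimal sketch assuming the standard LangGraph API and that the final node is registered as quality_assurance:

```python
from langgraph.graph import END

# Close the graph at the final agent and compile it into a runnable app.
workflow.add_edge("quality_assurance", END)
app = workflow.compile()

# Invoke the compiled workflow with an initial state for a single query
# (the exact state fields depend on AdvancedAgentState).
result = app.invoke({"query": "artificial intelligence trends 2024", "uploaded_file_ids": []})
```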
| Type      | Extensions               | Features                                     |
|-----------|---------------------------|----------------------------------------------|
| PDF       | .pdf                      | Text extraction, thumbnail generation        |
| Documents | .doc, .docx               | Content parsing, metadata extraction         |
| Text      | .txt                      | Direct content reading                       |
| Images    | .png, .jpg, .jpeg, .gif   | OCR text extraction, thumbnails              |
| Videos    | .mp4, .avi, .mov          | Metadata extraction (future: transcription)  |
# Example file processing
async def process_file(file_content, file_type):
    if file_type == "pdf":
        return await FileProcessor.process_pdf(file_content)
    elif file_type == "image":
        return await ImageProcessor.process_image(file_content)
    # ... other processors
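
As a concrete illustration, PDF text extraction with PyPDF2 (pinned in requirements.txt) might look like the sketch below; the helper name is illustrative rather than the actual FileProcessor method:

```python
import io
from PyPDF2 import PdfReader

def extract_pdf_text(file_content: bytes) -> str:
    """Extract plain text from an in-memory PDF (illustrative helper)."""
    reader = PdfReader(io.BytesIO(file_content))
    pages = [page.extract_text() or "" for page in reader.pages]
    return "\n".join(pages)
```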
- Professional Layout: Multi-column layouts with proper typography
- Table of Contents: Automatic generation with page numbers
- Source Citations: Properly formatted academic citations
- Media Integration: Embedded images and charts
- Custom Sections: Configurable document structure
- Metadata: Document properties and generation info
# Example document sections
sections = [
    {"id": 1, "title": "Executive Summary", "type": "text"},
    {"id": 2, "title": "Key Findings", "type": "text"},
    {"id": 3, "title": "Source Analysis", "type": "sources"},
    {"id": 4, "title": "Media Gallery", "type": "media"},
    {"id": 5, "title": "Conclusion", "type": "text"}
]
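
A minimal sketch of rendering such sections with ReportLab (the library used for PDF generation); the real report layout in main.py is more elaborate:

```python
from reportlab.lib.pagesizes import A4
from reportlab.lib.styles import getSampleStyleSheet
from reportlab.platypus import Paragraph, SimpleDocTemplate, Spacer

def build_report(path, title, sections):
    """Render text sections into a simple PDF (illustrative only)."""
    styles = getSampleStyleSheet()
    story = [Paragraph(title, styles["Title"]), Spacer(1, 24)]
    for section in sections:
        story.append(Paragraph(section["title"], styles["Heading2"]))
        if section.get("content"):
            story.append(Paragraph(section["content"], styles["BodyText"]))
        story.append(Spacer(1, 12))
    SimpleDocTemplate(path, pagesize=A4).build(story)
```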
nexus-ai/
├── backend/
│   ├── main.py              # FastAPI application
│   ├── requirements.txt     # Python dependencies
│   ├── uploads/             # File upload directory
│   ├── thumbnails/          # Generated thumbnails
│   └── generated_docs/      # Generated PDF reports
├── frontend/
│   ├── src/
│   │   ├── App.js           # Main React component
│   │   ├── index.js         # React entry point
│   │   └── index.css        # Tailwind styles
│   ├── public/              # Static assets
│   └── package.json         # Node dependencies
├── README.md                # This file
└── .gitignore               # Git ignore rules
- Backend Development:

  cd backend
  uvicorn main:app --reload --log-level debug

- Frontend Development:

  cd frontend
  npm start

- Testing:

  # Backend tests
  pytest tests/

  # Frontend tests
  npm test
- Python: Follow PEP 8, use Black formatter
- JavaScript: Use ESLint and Prettier
- Commits: Conventional Commits format
- Backend Deployment:

  # Using Docker
  docker build -t nexus-ai-backend .
  docker run -p 8000:8000 nexus-ai-backend

  # Using Gunicorn
  gunicorn main:app -w 4 -k uvicorn.workers.UvicornWorker

- Frontend Deployment:

  # Build for production
  npm run build

  # Deploy to static hosting (Vercel, Netlify, etc.)
  npm install -g vercel
  vercel --prod
# Production API Keys
GEMINI_API_KEY=prod_gemini_key
TAVILY_API_KEY=prod_tavily_key
# Database
DATABASE_URL=postgresql://user:pass@host:port/db
# File Storage
AWS_ACCESS_KEY_ID=your_aws_key
AWS_SECRET_ACCESS_KEY=your_aws_secret
AWS_BUCKET_NAME=nexus-ai-files
# Security
SECRET_KEY=your_secret_key
ALLOWED_HOSTS=["yourdomain.com"]
Dockerfile (Backend):
FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
EXPOSE 8000
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
docker-compose.yml:
version: '3.8'

services:
  backend:
    build: ./backend
    ports:
      - "8000:8000"
    environment:
      - GEMINI_API_KEY=${GEMINI_API_KEY}
      - TAVILY_API_KEY=${TAVILY_API_KEY}
    volumes:
      - ./uploads:/app/uploads

  frontend:
    build: ./frontend
    ports:
      - "3000:3000"
    depends_on:
      - backend
# tests/test_search.py
import pytest
from fastapi.testclient import TestClient
from main import app
client = TestClient(app)
def test_search_endpoint():
    response = client.post("/search", json={
        "query": "test query",
        "search_type": "web"
    })
    assert response.status_code == 200
    assert "answer" in response.json()

def test_file_upload():
    with open("test_file.pdf", "rb") as f:
        response = client.post("/upload", files={"file": f})
    assert response.status_code == 200
// src/App.test.js
import { render, screen } from '@testing-library/react';
import App from './App';
test('renders NEXUS AI title', () => {
  render(<App />);
  const titleElement = screen.getByText(/NEXUS AI/i);
  expect(titleElement).toBeInTheDocument();
});

test('search functionality', () => {
  render(<App />);
  const searchInput = screen.getByPlaceholderText(/ask me anything/i);
  expect(searchInput).toBeInTheDocument();
});
- Async Processing: All I/O operations are asynchronous
- Caching: Redis for search result caching (see the sketch after this list)
- Connection Pooling: Database connection optimization
- File Compression: Automatic image and document compression
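
A minimal sketch of the result cache mentioned above, assuming the redis client package (not yet listed in requirements.txt) and a local Redis instance:

```python
import hashlib
import json

import redis

cache = redis.Redis(host="localhost", port=6379, db=0)

def cached_search(query, search_fn, ttl=3600):
    """Return a cached result for the query, or compute and cache it."""
    key = "search:" + hashlib.sha256(query.encode()).hexdigest()
    hit = cache.get(key)
    if hit is not None:
        return json.loads(hit)
    result = search_fn(query)
    cache.set(key, json.dumps(result), ex=ttl)
    return result
```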
- Lazy Loading: Components and images loaded on demand
- Memoization: React.memo for expensive components
- Virtual Scrolling: For large result sets
- Code Splitting: Bundle optimization
# Example JWT authentication (sketch)
from fastapi import Depends, HTTPException
from fastapi.security import HTTPBearer, HTTPAuthorizationCredentials

security = HTTPBearer()

async def get_current_user(credentials: HTTPAuthorizationCredentials = Depends(security)):
    # Decode and verify the JWT in credentials.credentials; raise
    # HTTPException(status_code=401) if invalid, then return the user
    ...
- File type validation (see the validation sketch below)
- Size limitations
- Malware scanning
- Secure file storage
- Rate limiting
- CORS configuration
- Input validation
- SQL injection prevention
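
A minimal sketch of the file type and size checks, using FastAPI's UploadFile; the actual validation in main.py may differ:

```python
import os
from fastapi import HTTPException, UploadFile

ALLOWED_EXTENSIONS = {".pdf", ".doc", ".docx", ".txt",
                      ".png", ".jpg", ".jpeg", ".gif",
                      ".mp4", ".avi", ".mov"}
MAX_FILE_SIZE = 100 * 1024 * 1024  # 100MB, matching MAX_FILE_SIZE in .env

async def validate_upload(file: UploadFile) -> bytes:
    """Reject unsupported extensions and oversized files before processing."""
    extension = os.path.splitext(file.filename or "")[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        raise HTTPException(status_code=400, detail="Unsupported file type")
    content = await file.read()
    if len(content) > MAX_FILE_SIZE:
        raise HTTPException(status_code=413, detail="File exceeds the 100MB limit")
    return content
```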
We welcome contributions! Please follow these guidelines:
- Fork the repository
- Create a feature branch: git checkout -b feature/amazing-feature
- Make your changes
- Add tests for new functionality
- Ensure all tests pass: pytest and npm test
- Commit changes: git commit -m 'Add amazing feature'
- Push to branch: git push origin feature/amazing-feature
- Open a Pull Request
- Follow existing code style
- Add tests for new features
- Update documentation
- Use conventional commit messages
- Keep PRs focused and small
- New File Processors: Add support for more file types
- Additional Search Engines: Integrate more search APIs
- UI Improvements: Enhance user interface components
- Performance: Optimize search and processing speed
- Documentation: Improve guides and examples
- Initial release
- 13-agent architecture
- Multi-modal file processing
- PDF report generation
- Professional UI with dark/light modes
- Beta release
- Core search functionality
- Basic file upload
- Initial agent implementation
Backend won't start:
# Check Python version
python --version # Should be 3.8+
# Install dependencies
pip install -r requirements.txt
# Check API keys
echo $GEMINI_API_KEY
echo $TAVILY_API_KEY
Frontend build errors:
# Clear cache
npm cache clean --force
# Reinstall dependencies
rm -rf node_modules package-lock.json
npm install
# Check Node version
node --version # Should be 16+
File upload issues:
- Check file size (max 100MB)
- Verify file format support
- Ensure upload directory permissions
- LangChain - AI application framework
- Tavily - Search API provider
- Google Gemini - Large language model
- FastAPI - Modern web framework
- React - Frontend library
- Tailwind CSS - Utility-first CSS