Skip to content

🧡 ThreadNavigatorAI 2.0 Multi-agent Reddit thread analyzer with semantic summarization, tool-augmented fact-checking, and LLM-as-a-Judge rubric evaluation. πŸ”§ Configurable via config.yaml 🧠 Agent stack with OpenRouter LLMs πŸ–ΌοΈ Polished Streamlit UI (tabbed, latency-aware) 🌐 Deployed on Hugging Face Spaces (free-tier)

License

Notifications You must be signed in to change notification settings

rajesh1804/ThreadNavigatorAI2.0

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

19 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

title emoji colorFrom colorTo sdk sdk_version app_file pinned
ThreadNavigatorAI2.0
🧡
red
green
streamlit
1.33.0
app.py
true

🧡 ThreadNavigatorAI 2.0 β€” Multi-Agent Reddit Thread Analyzer with LLM-as-a-Judge

Built with Streamlit
Multi-Agent LLMs
Deployed on Hugging Face
License: MIT

πŸ” ThreadNavigatorAI 2.0 is a modular, multi-agent Reddit analyzer that summarizes threads, verifies claims, and scores insights using LLM-as-a-Judge. Built for high-scale moderation and discourse comprehension under real-world latency and token constraints.

🎯 Designed for folks evaluating top-tier AI engineering talent.


πŸš€ What’s New in 2.0

ThreadNavigatorAI 2.0 builds directly on v1.0 by introducing:

Feature 1.0 πŸš€ 2.0 Upgrade
Agent Orchestration LangGraph (Reply, Mod) Modular pipeline (Summarizer, FC, Eval)
Summarization Prompt + RAG Semantic + config-driven transformer
Moderation Rule-based + LLM πŸ” Fact Checker Agent (tool-augmented)
Evaluation Manual 🧠 LLM-as-a-Judge (rubric-based)
UI Basic Streamlit ✨ Tabbed layout, latency toggle, download
Scalability Single-thread only 100 threads (real + mock hybrid)

πŸ“Š Architecture Overview

ThreadNavigatorAI Multi-Agent Flow

Thread JSON β†’ [Multi-Agent Orchestration]
              β”œβ”€β”€ πŸ” Summarizer Agent β†’ Extracts core insights
              β”œβ”€β”€ πŸ§ͺ Fact Checker Agent β†’ Verifies claims via tools
              └── 🧠 Evaluator Agent β†’ LLM-as-a-Judge on rubric (Relevance, Factuality...)

              ↳ Config-driven model orchestration (via config.yaml)
              ↳ Latency + model tracking in UI
              ↳ Deployed on Hugging Face Spaces (Streamlit + Docker)

🌟 Key Features

βœ… Multi-Agent Stack (Summarizer, Fact Checker, Evaluator)
βœ… Semantic Summarization + Retrieval
βœ… LLM-as-a-Judge via rubric-based evaluation (like RAGAS)
βœ… Hybrid Simulation Mode (10 real + 90 mock = 100 total threads)
βœ… Live Latency Display + Model Attribution
βœ… Streamlit UX with onboarding, source links, toggles, download
βœ… Free-tier Deployable (OpenRouter APIs, Streamlit, Hugging Face)


🧠 Agent Stack (Config-Driven)

Modular multi-agent architecture modeled after real-world Reddit analysis workflows.

Agent Role Model Used (Free-tier)
🧠 Summarizer Extracts semantic insights from messy Reddit posts using transformer-based compression moonshotai/kimi-k2:free
πŸ§ͺ Fact Checker Verifies factuality using retrieval tools deepseek/deepseek-r1:free
πŸ“Š Evaluator Scores output on Relevance, Coherence, and Factuality (LLM-as-a-Judge) openrouter/mistralai/mistral-7b-instruct:free
πŸ”§ Tools External retrieval & KB (Serper, Wikipedia) β€”

All agents use individual OpenRouter models, with latency tracked per call.


πŸ”Ž How It Works

Select a Reddit-style thread from 100 examples.

Agents are triggered in sequence:

  • 🧠 Summarizer β†’ condenses the entire thread
  • πŸ§ͺ Fact Checker β†’ verifies major claims with external data
  • πŸ“Š Evaluator β†’ scores the summary based on LLM rubrics

Output is displayed in a polished tabbed UI with latency + model trace.
Optionally download the result as JSON.

All agents are modular β€” swap-in / swap-out via config.yaml.


πŸ–ΌοΈ UI Preview

Tabbed layout includes:

  • 🧠 Summary (with model and source)
  • πŸ”Ž Fact Checks (claim + judgment)
  • ⚑ Latency (per agent)
  • πŸ“Š Evaluation (scored by LLM)

🚒 Live Demo

πŸ“Œ Try the fully working UI on Hugging Face:
πŸ‘‰ ThreadNavigatorAI 2.0 – Live Space

No login, no billing β€” real inference with OpenRouter free-tier models.


πŸ§ͺ Sample Output (Thread ID: threadid_011)

  • 🧠 Summary: Debate around GPT-4 vs Gemini, with user opinions and factual misunderstandings
  • πŸ”Ž Claim: β€œGemini was trained on YouTube comments” β†’ πŸ”΄ Incorrect

πŸ“Š Eval Score:

  • Relevance: 🟒 5 β€” Very relevant
  • Coherence: 🟑 3 β€” Some redundancy
  • Factuality: 🟒 4 β€” Mostly accurate

⏱️ Latency:

  • Summarizer: 2.1s
  • FactChecker: 3.4s
  • Evaluator: 1.9s

πŸ“ Project Structure

ThreadNavigatorAI2.0/                      
β”œβ”€β”€ ui/                        
β”‚   └── app.py                     # Streamlit frontend
β”œβ”€β”€ cli/ 
β”‚   └── batch_run.py               # Multi-thread processor
β”œβ”€β”€ config/                        # Models and tool settings
β”‚   └── config.yaml
β”œβ”€β”€ agents/
β”‚   β”œβ”€β”€ summarizer_agent.py
β”‚   β”œβ”€β”€ factchecker_agent.py
β”‚   └── evaluator_agent.py
β”œβ”€β”€ utils/
β”‚   β”œβ”€β”€ llm_utils.py
β”œβ”€β”€ data/
β”‚   β”œβ”€β”€ batch_output.json          # 10 threads real inference and 90 threads simulated inference
β”‚   └── threads_100.json           # 100 simulated threads
β”œβ”€β”€ assets/
β”‚   β”œβ”€β”€ threadnavigator_flow.png   # Architecture Diagram
β”‚   └── threadnavigator-demo.gif   # UI Demo
└── requirements.txt

πŸ’Ό Why This Project Stands Out

  • βœ… Represents real-world pipeline for Reddit moderation/analysis
  • βœ… Dynamic toggles: latency, evaluation, download
  • βœ… Hybrid inference simulates scale under free-tier
  • βœ… LLM-as-a-Judge with model attribution & traceability
  • βœ… Fully local, frictionless UI with onboarding
  • βœ… Built like an internal debugging tool for social platforms

🧰 Run Locally (for Devs)

git clone https://github.com/rajesh1804/ThreadNavigatorAI2.0
cd ThreadNavigatorAI2.0
pip install -r requirements.txt

Create .env:

OPENROUTER_API_KEY=sk-
SERPER_API_KEY=

Then run:

streamlit run app.py

🧠 Linked Projects

Project Description Link
🧡 ThreadNavigatorAI 1.0 RAG-based summarizer + moderator + reply agent using LangGraph πŸ”— View
πŸš• RideCastAI Real-time ride fare/ETA prediction with spatial heatmaps πŸ”— View
πŸ›’ GroceryGPT+ Vector search + reranking grocery assistant with fuzzy recall πŸ”— View
🎬 StreamWiseAI Netflix-style movie recommender with Retention Coach Agent πŸ”— View

⚠️ Known Challenges

  • πŸ§ͺ gemma-3-4b-it rejected evaluation due to instruction tuning being disabled β€” switched to mistralai-7b:free
  • πŸ“‰ Reddit data volume limited to 10 threads (real) due to OpenRouter limits β€” mock data generated for scalability tests
  • ⏱️ Tool-based fact-checking adds ~2–3 seconds latency β€” handled with async retries + optional toggle
  • 🧡 No full LangGraph orchestration in this version (ThreadNavigator 1.0 had it) β€” but this version enables more precise per-agent control

πŸ§‘β€πŸ’Ό About Me

Rajesh Marudhachalam β€” AI/ML Engineer with a focus on building LLM-native applications.
πŸ“ GitHub | LinkedIn

Projects: RideCastAI, StreamWiseAI, GroceryGPT+


πŸ™Œ Acknowledgments


πŸ“œ License

MIT License

⭐️ Star this repo if it impressed you. Follow for more elite-level ML + LLM builds.

About

🧡 ThreadNavigatorAI 2.0 Multi-agent Reddit thread analyzer with semantic summarization, tool-augmented fact-checking, and LLM-as-a-Judge rubric evaluation. πŸ”§ Configurable via config.yaml 🧠 Agent stack with OpenRouter LLMs πŸ–ΌοΈ Polished Streamlit UI (tabbed, latency-aware) 🌐 Deployed on Hugging Face Spaces (free-tier)

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages