PersonaRAG - A Conversation-Aware, Self-Correcting AI Twin
Date
January 2026
Project Overview
I’ve developed a "digital twin" RAG chatbot that answers portfolio queries using a FAISS index optimized for retrieving full content blocks rather than fragments. The pipeline ensures conversational continuity by rewriting multi-turn prompts into standalone questions and maintains high reliability through an LLM-as-judge evaluator that triggers bounded self-correction to prevent hallucinations. To stay current, the system automatically re-indexes my website daily, ensuring all responses are grounded in my latest professional data.
https://huggingface.co/spaces/ritup3/PersonaRag
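The conversational-continuity step mentioned above rewrites each follow-up into a standalone question before retrieval. The sketch below shows the general shape of that step; it assumes an OpenAI-style chat-completions client and a placeholder model name, and is not the exact prompt or client used in PersonaRAG.

```python
# Minimal sketch of contextual query rewriting (assumed OpenAI-style client;
# the model name and prompt wording here are placeholders).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

REWRITE_PROMPT = (
    "Given the chat history and the latest user message, rewrite the message "
    "as a single standalone question that can be answered without the history. "
    "Return only the rewritten question."
)

def rewrite_query(chat_history: list[dict], latest_message: str) -> str:
    """Turn a conversational follow-up into a retrieval-ready question."""
    messages = [{"role": "system", "content": REWRITE_PROMPT}]
    messages += chat_history  # e.g. [{"role": "user", "content": "..."}, ...]
    messages.append({"role": "user", "content": latest_message})
    try:
        response = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model name
            messages=messages,
            temperature=0,
        )
        return response.choices[0].message.content.strip()
    except Exception:
        # Safe fallback: retrieve with the raw message if the rewrite fails.
        return latest_message
```

For example, after "Tell me about PersonaRAG" followed by "What did you build it with?", the rewriter might return something like "What technologies was PersonaRAG built with?", which retrieves far better than the bare follow-up.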
Key Highlights
Contextual Query Rewriting: Transforms conversational follow-ups into standalone, retrieval-ready questions using chat history.
Header-Based Chunking: Indexes documents by HTML headers (H1, H2) to retrieve complete sections (e.g., "Work Experience") rather than fragmented snippets (sketched after this list).
Zero-Shot Section Routing: Uses a classifier (BART-MNLI) to map natural-language queries to specific section labels, with vector search as a fallback (see the routing sketch below).
Confidence-Ranked Retrieval: Prioritizes context based on routing confidence scores to minimize "noise" and improve LLM accuracy.
Evaluator-Driven Self-Correction: Employs a judge model to score grounding; triggers a single feedback-guided retry if hallucinations are detected (see the self-correction sketch below).
Production Fault Tolerance: Implements comprehensive exception handling and safe fallbacks across all API-dependent steps to prevent crashes.
Atomic Re-Indexing: Features a background scheduler that rebuilds and swaps the FAISS index daily for zero-downtime content updates (see the re-indexing sketch below).
Full-Stack Observability: Logs rewrites, metadata previews, and evaluator scores to simplify debugging and quality assurance.
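A rough sketch of the header-based chunking step, using LangChain's HTMLHeaderTextSplitter as a stand-in; the splitter choice and the URL shown are illustrative assumptions, not the exact implementation.

```python
# Sketch of header-based chunking: split a page into complete H1/H2 sections
# so each chunk keeps its section title as metadata.
import requests
from langchain_text_splitters import HTMLHeaderTextSplitter

splitter = HTMLHeaderTextSplitter(
    headers_to_split_on=[("h1", "section"), ("h2", "subsection")]
)

html = requests.get("https://example.com/portfolio", timeout=10).text  # placeholder URL
sections = splitter.split_text(html)

for doc in sections:
    # Each Document is a full section (e.g. "Work Experience"); the header
    # text lands in doc.metadata, which later feeds routing and display.
    print(doc.metadata, doc.page_content[:80])
```

Because each chunk is a complete section, a query like "Where have you worked?" can be answered from a single retrieved block instead of stitching together fragments.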
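The section router can be sketched with the Hugging Face zero-shot-classification pipeline and a BART-MNLI checkpoint (facebook/bart-large-mnli here). The label set and confidence threshold below are assumptions for illustration; only the model family comes from the project description.

```python
# Sketch of zero-shot section routing with BART-MNLI. A query is mapped to a
# section label; low-confidence results fall back to plain vector search.
from transformers import pipeline

SECTION_LABELS = ["work experience", "projects", "skills", "education", "contact"]  # illustrative

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

def route_query(query: str, threshold: float = 0.5):
    """Return (section_label, score), or (None, score) to trigger the FAISS fallback."""
    result = classifier(query, candidate_labels=SECTION_LABELS)
    top_label, top_score = result["labels"][0], result["scores"][0]
    if top_score >= threshold:
        return top_label, top_score
    return None, top_score
```

The same scores drive the confidence-ranked retrieval: retrieved sections are ordered by routing confidence so that low-relevance chunks do not crowd the LLM's context window.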
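The self-correction loop is deliberately bounded to a single retry. The sketch below captures that control flow; `generate` and `judge` are hypothetical callables standing in for the actual generation and LLM-as-judge calls, and the grounding threshold is an assumed value.

```python
# Sketch of evaluator-driven self-correction: a judge scores how well the
# draft answer is grounded in the retrieved context, and at most one
# feedback-guided retry is attempted.
from typing import Callable, Tuple

def answer_with_self_correction(
    question: str,
    context: str,
    generate: Callable[..., str],                          # hypothetical generator
    judge: Callable[[str, str, str], Tuple[float, str]],   # hypothetical judge -> (score, feedback)
    threshold: float = 0.7,                                 # illustrative threshold
) -> str:
    draft = generate(question, context)
    score, feedback = judge(question, context, draft)
    if score >= threshold:
        return draft

    # Bounded self-correction: exactly one retry, guided by the judge's feedback.
    retry = generate(question, context, feedback=feedback)
    retry_score, _ = judge(question, context, retry)
    if retry_score >= threshold:
        return retry

    # Safe fallback rather than risking a hallucinated answer.
    return "I'm not confident I can answer that from my indexed content."
```

Capping the loop at one retry keeps latency and cost predictable while still giving the model a chance to fix an ungrounded first draft.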
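The daily refresh can be sketched with APScheduler plus an in-memory index swap. Here `crawl_and_embed` is a hypothetical helper that crawls the site and returns chunk texts with their embeddings, and the 3 a.m. schedule is an assumption; the swap-under-a-lock pattern is what makes the update zero-downtime.

```python
# Sketch of atomic daily re-indexing: build a fresh FAISS index in the
# background, then swap it in under a lock so queries never see a
# half-built index.
import threading
import numpy as np
import faiss
from apscheduler.schedulers.background import BackgroundScheduler

_index_lock = threading.Lock()
_index = None        # FAISS index currently being served
_chunks: list = []   # chunk texts/metadata aligned with the index rows

def rebuild_index() -> None:
    global _index, _chunks
    chunks, embeddings = crawl_and_embed()            # hypothetical helper
    vectors = np.asarray(embeddings, dtype="float32")
    new_index = faiss.IndexFlatIP(vectors.shape[1])   # inner-product index
    new_index.add(vectors)
    with _index_lock:                                 # atomic swap
        _index, _chunks = new_index, chunks

scheduler = BackgroundScheduler()
scheduler.add_job(rebuild_index, "cron", hour=3)      # once a day
scheduler.start()
```

Readers take the same lock briefly when grabbing a reference to `_index`, so an in-flight query always sees either the old index or the new one, never a partial rebuild.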