PersonaRAG - A Conversation-Aware, Self-Correcting AI Twin
Date
January 2026
Project Overview
I’ve developed a "digital twin" RAG chatbot that answers portfolio queries using a FAISS index optimized for retrieving full content blocks rather than fragments. The pipeline ensures conversational continuity by rewriting multi-turn prompts into standalone questions and maintains high reliability through an LLM-as-judge evaluator that triggers bounded self-correction to prevent hallucinations. To stay current, the system automatically re-indexes my website daily, ensuring all responses are grounded in my latest professional data.
https://huggingface.co/spaces/ritup3/PersonaRag
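The conversational-continuity step mentioned above rewrites each follow-up into a standalone question before retrieval. The sketch below shows the general shape of that step; it assumes an OpenAI-style chat-completions client and a placeholder model name, and is not the exact prompt or client used in PersonaRAG.

```python
# Minimal sketch of contextual query rewriting (assumed OpenAI-style client;
# the model name and prompt wording here are placeholders).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

REWRITE_PROMPT = (
    "Given the chat history and the latest user message, rewrite the message "
    "as a single standalone question that can be answered without the history. "
    "Return only the rewritten question."
)

def rewrite_query(chat_history: list[dict], latest_message: str) -> str:
    """Turn a conversational follow-up into a retrieval-ready question."""
    messages = [{"role": "system", "content": REWRITE_PROMPT}]
    messages += chat_history  # e.g. [{"role": "user", "content": "..."}, ...]
    messages.append({"role": "user", "content": latest_message})
    try:
        response = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model name
            messages=messages,
            temperature=0,
        )
        return response.choices[0].message.content.strip()
    except Exception:
        # Safe fallback: retrieve with the raw message if the rewrite fails.
        return latest_message
```

For example, after "Tell me about PersonaRAG" followed by "What did you build it with?", the rewriter might return something like "What technologies was PersonaRAG built with?", which retrieves far better than the bare follow-up.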
Key Highlights
Contextual Query Rewriting: Transforms conversational follow-ups into standalone, retrieval-ready questions using chat history.
Header-Based Chunking: Indexes documents by HTML headers (H1, H2) to retrieve complete sections (e.g., "Work Experience") rather than fragmented snippets (sketched after this list).
Zero-Shot Section Routing: Uses a classifier (BART-MNLI) to map natural-language queries to specific section labels, with vector search as a fallback (see the routing sketch below).
Confidence-Ranked Retrieval: Prioritizes context based on routing confidence scores to minimize "noise" and improve LLM accuracy.
Evaluator-Driven Self-Correction: Employs a judge model to score grounding; triggers a single feedback-guided retry if hallucinations are detected (see the self-correction sketch below).
Production Fault Tolerance: Implements comprehensive exception handling and safe fallbacks across all API-dependent steps to prevent crashes.
Atomic Re-Indexing: Features a background scheduler that rebuilds and swaps the FAISS index daily for zero-downtime content updates (see the re-indexing sketch below).
Full-Stack Observability: Logs rewrites, metadata previews, and evaluator scores to simplify debugging and quality assurance.
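A rough sketch of the header-based chunking step, using LangChain's HTMLHeaderTextSplitter as a stand-in; the splitter choice and the URL shown are illustrative assumptions, not the exact implementation.

```python
# Sketch of header-based chunking: split a page into complete H1/H2 sections
# so each chunk keeps its section title as metadata.
import requests
from langchain_text_splitters import HTMLHeaderTextSplitter

splitter = HTMLHeaderTextSplitter(
    headers_to_split_on=[("h1", "section"), ("h2", "subsection")]
)

html = requests.get("https://example.com/portfolio", timeout=10).text  # placeholder URL
sections = splitter.split_text(html)

for doc in sections:
    # Each Document is a full section (e.g. "Work Experience"); the header
    # text lands in doc.metadata, which later feeds routing and display.
    print(doc.metadata, doc.page_content[:80])
```

Because each chunk is a complete section, a query like "Where have you worked?" can be answered from a single retrieved block instead of stitching together fragments.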
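The section router can be sketched with the Hugging Face zero-shot-classification pipeline and a BART-MNLI checkpoint (facebook/bart-large-mnli here). The label set and confidence threshold below are assumptions for illustration; only the model family comes from the project description.

```python
# Sketch of zero-shot section routing with BART-MNLI. A query is mapped to a
# section label; low-confidence results fall back to plain vector search.
from transformers import pipeline

SECTION_LABELS = ["work experience", "projects", "skills", "education", "contact"]  # illustrative

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

def route_query(query: str, threshold: float = 0.5):
    """Return (section_label, score), or (None, score) to trigger the FAISS fallback."""
    result = classifier(query, candidate_labels=SECTION_LABELS)
    top_label, top_score = result["labels"][0], result["scores"][0]
    if top_score >= threshold:
        return top_label, top_score
    return None, top_score
```

The same scores drive the confidence-ranked retrieval: retrieved sections are ordered by routing confidence so that low-relevance chunks do not crowd the LLM's context window.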
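The self-correction loop is deliberately bounded to a single retry. The sketch below captures that control flow; `generate` and `judge` are hypothetical callables standing in for the actual generation and LLM-as-judge calls, and the grounding threshold is an assumed value.

```python
# Sketch of evaluator-driven self-correction: a judge scores how well the
# draft answer is grounded in the retrieved context, and at most one
# feedback-guided retry is attempted.
from typing import Callable, Tuple

def answer_with_self_correction(
    question: str,
    context: str,
    generate: Callable[..., str],                          # hypothetical generator
    judge: Callable[[str, str, str], Tuple[float, str]],   # hypothetical judge -> (score, feedback)
    threshold: float = 0.7,                                 # illustrative threshold
) -> str:
    draft = generate(question, context)
    score, feedback = judge(question, context, draft)
    if score >= threshold:
        return draft

    # Bounded self-correction: exactly one retry, guided by the judge's feedback.
    retry = generate(question, context, feedback=feedback)
    retry_score, _ = judge(question, context, retry)
    if retry_score >= threshold:
        return retry

    # Safe fallback rather than risking a hallucinated answer.
    return "I'm not confident I can answer that from my indexed content."
```

Capping the loop at one retry keeps latency and cost predictable while still giving the model a chance to fix an ungrounded first draft.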
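The daily refresh can be sketched with APScheduler plus an in-memory index swap. Here `crawl_and_embed` is a hypothetical helper that crawls the site and returns chunk texts with their embeddings, and the 3 a.m. schedule is an assumption; the swap-under-a-lock pattern is what makes the update zero-downtime.

```python
# Sketch of atomic daily re-indexing: build a fresh FAISS index in the
# background, then swap it in under a lock so queries never see a
# half-built index.
import threading
import numpy as np
import faiss
from apscheduler.schedulers.background import BackgroundScheduler

_index_lock = threading.Lock()
_index = None        # FAISS index currently being served
_chunks: list = []   # chunk texts/metadata aligned with the index rows

def rebuild_index() -> None:
    global _index, _chunks
    chunks, embeddings = crawl_and_embed()            # hypothetical helper
    vectors = np.asarray(embeddings, dtype="float32")
    new_index = faiss.IndexFlatIP(vectors.shape[1])   # inner-product index
    new_index.add(vectors)
    with _index_lock:                                 # atomic swap
        _index, _chunks = new_index, chunks

scheduler = BackgroundScheduler()
scheduler.add_job(rebuild_index, "cron", hour=3)      # once a day
scheduler.start()
```

Readers take the same lock briefly when grabbing a reference to `_index`, so an in-flight query always sees either the old index or the new one, never a partial rebuild.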