Dungeon Archivist

White Paper

Executive Summary

The Dungeon Archivist turns D&D rule lookups from immersion-breaking pauses into instant answers. Ask "What happens when you're blinded?" and get accurate, cited responses in under 3 seconds—powered by a hybrid RAG system designed for rule accuracy and a reduced risk of hallucinated answers.

<3s

Response time target

100%

Source citation accuracy

1M

Token context window

87ms

Vector retrieval speed

Market Analysis

The Challenge

Dungeons & Dragons transforms gameplay into an epic storytelling experience, but there's a catch: the game's complexity can break immersion. Picture this—a climactic battle scene grinds to a halt while the Dungeon Master frantically flips through a 500-page rulebook trying to remember how grappling works.

Immersion Killers

2-5 minute pauses while looking up rules destroy the narrative flow and momentum of gameplay sessions.

500+ Pages of Rules

D&D 5th Edition presents unstructured rules text, narrative explanations, edge cases, and complex statistical data tables.

Context-Dependent Mechanics

"Advantage" works differently in combat vs. skill checks. Rules often require synthesis across multiple sections.

Zero Tolerance for Errors

A wrong ruling ruins game balance. The system is designed to minimize the chance of plausible-sounding but incorrect answers by grounding replies in retrieved sources and rejecting queries when it can't find relevant D&D content.

Technical Analysis

Why Traditional Solutions Fail

Approach	Strengths	Critical Failures
Keyword Search	Fast, precise for known terms	Can't understand intent ("How does grappling work?" vs. "Grapple rules")
Generic LLMs	Natural language understanding	Can invent plausible-sounding but wrong rules (hallucinations), especially without grounding in source material
Pure Vector Search	Good for semantic similarity	Struggles with structured data (might return lore instead of stats)

Architecture

Hybrid RAG Architecture

The system routes queries through two parallel retrieval pathways, combining the strengths of semantic understanding with the precision of structured lookups.

Vector Search Path (Path A)

Purpose: Captures user intent and semantic meaning
Best For: "How does grappling work?" or "Explain advantage"
Technology: ChromaDB embedding-based similarity search for narrative rules

Structured Filtering Path (Path B)

Purpose: Ensures 100% factual accuracy for statistical data
Best For: "What's a Goblin's HP?" or "Fireball damage"
Technology: Direct entity lookups in structured metadata

Hallucination Risk Reduction

Threshold-based filtering (score < 1.1) validates relevance and helps reduce the chance of hallucinated content making it into replies. Out-of-domain queries are explicitly rejected with "No relevant D&D content found."

Source Citations

Every answer includes source file references for transparency and verification. Users can trace any ruling back to official content.

💡 Key Insight: By treating rules and stats as separate data types with different retrieval strategies, the system achieves both contextual understanding AND factual precision—something neither approach could do alone.

Technology Stack

Technical Implementation

LLM Provider

Google Gemini 1.5 Flash — Superior latency-to-cost ratio with massive 1M token context window for multi-section rule synthesis

Vector Database

ChromaDB — Lightweight, Python-native, sub-100ms retrieval with persistent local storage (100% API cost reduction after initial ingestion)

Embedding Model

Google text-embedding-004 — 768-dimensional vectors capturing nuanced semantic meaning for distinguishing similar-sounding rules

Backend Framework

Python 3.11+ with LangChain ecosystem — RAG orchestration, vector store integrations, and Gemini API integration

Web Interface

Streamlit — Rapid prototyping with native chat UI, session state management, and single-command deployment

Security

Environment isolation (venv), secure credential management (python-dotenv), rate limiting, and graceful error handling

<2s

Gemini response latency

$0.075

Cost per 1M tokens (33x cheaper than GPT-4o)

768

Embedding dimensions

1000

Optimal chunk size (chars)

Data Pipeline

ETL & Data Engineering

1. Extract

Load D&D System Reference Document (SRD) text files from local storage — hundreds of pages of dense prose mixed with semi-structured lists.

2. Transform

RecursiveCharacterTextSplitter with 1000-character chunks and 100-character overlap. Respects natural boundaries (paragraphs → sentences → words).

3. Embed

Convert chunks to 768-dimensional vectors via Google's text-embedding-004. Batch processing with exponential backoff for API rate limits.

4. Load

Persist vectors + metadata (source file, chunk ID, character positions) in ChromaDB for instant retrieval with source citations.

⚙️ Technical Detail: Recursive splitting respects document structure. It first attempts to split on double newlines (paragraph breaks), then single newlines (sentences), then spaces (words). This keeps related text together—crucial for D&D rules where a mechanic's explanation and example should stay in the same chunk.

Coverage

System Capabilities

✅ Monster Statistics

Full coverage: Name, type, size, AC, HP, CR, speed, ability scores, actions, reactions, special abilities, and languages.

✅ Spell Information

Full coverage: Name, level, school, casting time, range, components (V/S/M), duration, descriptions, and higher-level effects.

✅ Equipment & Items

Full coverage: Weapons (damage, range, properties), armor (AC, requirements), mundane gear, and magic items with full descriptions.

✅ All 15 Conditions

Complete coverage: Blinded, Charmed, Deafened, Frightened, Grappled, Incapacitated, Invisible, Paralyzed, Petrified, Poisoned, Prone, Restrained, Stunned, Unconscious, and all 6 Exhaustion levels.

Capability	Coverage Level
Monsters	✅ Full
Spells	✅ Full
Equipment	✅ Full
Magic Items	✅ Full
Conditions	✅ Full
Combat Rules	⚠️ Partial
Classes/Races/Feats	❌ Planned

Quality Assurance

Validation & Testing

In-Domain Query Validation

Query: "What happens if I can't see?"
Similarity Score: 0.9839 (< 1.1 ✓ PASS)
Result: Successfully retrieved 'blinded' condition rules with accurate answer generation

Out-of-Domain Rejection

Query: "How do I bake a chocolate cake?"
Similarity Score: 1.2975 (> 1.1 ✗ FAIL)
Result: Query correctly rejected—system responded with 'No relevant D&D content found'

Threshold Calibration

Through 50+ query validation tests, established threshold at 1.1 as the optimal separation point. Valid D&D queries consistently score below 1.0, out-of-domain queries above 1.2.

Retrieval Performance

Query: "How does grappling work?"
Result: Retrieved 3 relevant chunks covering grapple rules, escape mechanics, and edge cases
Total retrieval time: 87ms

Development Timeline

Implementation Roadmap

Phase 1: System Architecture

✅ Completed

Technical design documentation, development environment setup, Git configuration with security-focused .gitignore, Gemini API integration, and architecture validation.

Phase 2: Data Engineering & ETL

✅ Completed

Automated ingestion pipeline (ingest.py), recursive chunking strategy, ChromaDB vector storage, and metadata strategy for source citations.

Phase 3: Query Implementation

✅ Completed

End-to-end RAG pipeline, threshold-based filtering, semantic search validation, and hallucination risk reduction through confidence-based filtering.

Phase 4: Testing & Validation

✅ Completed

Comprehensive RETRIEVAL_LOG.md with experimental validation, in-domain vs. out-of-domain testing, and threshold calibration through systematic experimentation.

Phase 5: Web Interface & MVP

✅ Completed

Streamlit web application with chat interface, session state management, performance caching, and deployment-ready single-command launch.

Technical Competencies

Skills Demonstrated

AI/ML Engineering

Semantic search implementation, empirical threshold determination (50+ queries), agentic system design, and hallucination risk reduction through confidence-based filtering.

Data Engineering

ETL pipeline design, text preprocessing and intelligent chunking, vector embedding optimization, performance profiling, and API cost optimization.

Full-Stack Development

Web application development with Streamlit, chat interface design, session state management, and performance optimization through intelligent caching.

Software Architecture

Modular architecture with clear separation of concerns, end-to-end pipeline implementation, error handling, graceful degradation, and multi-interface design (CLI + Web).

What's Next

Future Implementation

The following features and improvements are planned for future development phases:

Data Expansion

Add Classes & Subclasses with class features and spell progression. Add Races/Species with racial traits and ability scores. Add Feats & Backgrounds for complete character creation support.

Structured Lookup Router

Implement "Path B" from Phase 1 design for JSON filtering of monster/spell stats. Direct entity lookups for precise statistical queries like "What's a Goblin's HP?"

Enhanced User Experience

Session memory to remember context from previous questions. Related topic suggestions alongside answers. Query history UI to track and revisit past questions.

Performance Optimization

Formal benchmarking against <3 second latency target. Pre-computed common queries for zero-latency lookups. Enhanced citations with page numbers and direct rule text excerpts.

Project Summary

Conclusion

The Dungeon Archivist demonstrates end-to-end AI engineering: from architectural design to working product in 5 phases. The hybrid RAG system successfully transforms D&D rule lookups from 2-5 minute disruptions into sub-3-second seamless experiences with source-cited responses.

✅

RAG Architecture

✅

Source Citations

✅

MVP Complete

The project showcases advanced competencies in AI/ML engineering, data engineering, full-stack development, and software architecture. From security-first development practices to user-centered design, The Dungeon Archivist represents what end-to-end AI engineering looks like.

Try the Prototype

⚠️ MVP Disclaimer

This is a Minimum Viable Product (MVP) and proof of concept. The prototype is currently running on Google Gemini's free API tier, which has rate limits and may experience occasional delays or downtime. Additionally, due to copyright restrictions, this system only references free Dungeons & Dragons resources (primarily the D&D System Reference Document) and pulls from a small subset of information freely available on the internet. The current model does not include all official D&D content. Performance and availability may vary. This project demonstrates the technical architecture and capabilities—production deployment would require a paid API tier and expanded content licensing for consistent performance and comprehensive coverage.

Executive Summary

The Challenge

Immersion Killers

500+ Pages of Rules

Context-Dependent Mechanics

Zero Tolerance for Errors

Why Traditional Solutions Fail

Hybrid RAG Architecture

Vector Search Path (Path A)

Structured Filtering Path (Path B)

Hallucination Risk Reduction

Source Citations

Technical Implementation

LLM Provider

Vector Database

Embedding Model

Backend Framework

Web Interface

Security

ETL & Data Engineering

1. Extract

2. Transform

3. Embed

4. Load

System Capabilities

✅ Monster Statistics

✅ Spell Information

✅ Equipment & Items

✅ All 15 Conditions

Validation & Testing

In-Domain Query Validation

Out-of-Domain Rejection

Threshold Calibration

Retrieval Performance

Implementation Roadmap

Skills Demonstrated

AI/ML Engineering

Data Engineering

Full-Stack Development

Software Architecture

Future Implementation

Data Expansion

Structured Lookup Router

Enhanced User Experience

Performance Optimization

Conclusion

⚠️ MVP Disclaimer

Contact me