GPTWeb Two-Phase RAG Engine Routing

4 views
the GPTWeb RAG engine isn't a single lookup — it's a two-phase intelligent routing system that first figures out where to look, then figures out what to return. Phase 1 is semantic routing: the visitor's question is converted into a vector embedding and matched against collection descriptions to identify which knowledge container holds the most relevant content. Phase 2 is semantic retrieval: within that matched collection, the engine performs similarity search across chunked document segments to surface the most contextually accurate passages, which are then passed as context to the LLM to generate a grounded, accurate response.
Here's what makes this powerful: - Collections act as semantic namespaces — the richer your collection description, the more accurately Phase 1 routes the query to the right knowledge bucket
  • Documents are auto-chunked into ~1,500 character segments with overlapping context windows to preserve meaning across boundaries

  • Vector embeddings (via OpenAI or compatible model) power both phases — meaning the engine finds intent and meaning, not just keyword matches

  • Phase 2 retrieval returns top-K chunks ranked by cosine similarity — the LLM only sees the most relevant segments, keeping responses precise and grounded

  • The LLM never hallucinates beyond your content because it is constrained to the retrieved context window, not its general training data

  • Multi-collection queries are supported — if semantic routing scores two collections similarly, both are queried and results are merged before LLM generation

GPTWeb Two-Phase RAG Routing Architecture

Match Score High
Match Score Medium
Match Score Low
Visitor Question
Embed Query as Vector
Phase 1: Semantic Collection Routing
Collection A: Product Knowledge
Collection B: Support Docs
Collection C: Blog Content
Phase 2: Chunk Similarity Search
Top-K Ranked Chunks Returned
Context Window Assembled
LLM Generates Grounded Response
Visitor Receives Accurate Answer
DQL Scoring + Session Analytics
Image
The practical takeaway: write your collection descriptions carefully — they are the routing signal for Phase 1. A vague description like 'company docs' will route poorly. A specific description like 'product features, pricing tiers, onboarding steps, and integration capabilities for [Company]' routes with high precision. You can get started building your first collection and seeding it via the GPTWeb Crawler Wizard in App Configuration under KB Tuning. GPTWeb is the future of engagement — websites and marketing automation combined, built for the AI era, built for now.

Need more help?

Our AI assistant can answer any question instantly.

Continue This Conversation