The GPTWeb RAG engine is a two-phase intelligent routing system, not a simple keyword lookup. It separates the problem of where to look from what to return, which keeps responses accurate and grounded in your actual content.

Phase 1, Semantic Routing, converts the visitor's query into a vector embedding and matches it against your configured knowledge collections. The engine scores each collection by semantic similarity to the query and routes to the best-matching knowledge namespace. The richer your collection description, the more precise this routing becomes.

Phase 2, Semantic Retrieval, then operates inside the matched collection. Documents are pre-chunked into overlapping segments (~1,500 characters) so that meaning is preserved across chunk boundaries. The engine runs a cosine similarity search across all chunks in that collection and surfaces the top-K most relevant segments, which are passed directly to the LLM as a constrained context window.
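The two phases described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not GPTWeb's actual implementation: the function names (embed, chunk, route, retrieve), the 200-character overlap, and the sample collections are all hypothetical, and a toy word-count vector stands in for a real embedding model. Only the ~1,500-character chunk size, the cosine scoring, and the route-then-retrieve order come from the description.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a sparse word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def chunk(text, size=1500, overlap=200):
    """Split text into overlapping character windows (overlap is an assumed value)."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]


def route(query, collections):
    """Phase 1: pick the collection whose description best matches the query."""
    q = embed(query)
    return max(
        collections,
        key=lambda name: cosine(q, embed(collections[name]["description"])),
    )


def retrieve(query, documents, top_k=3):
    """Phase 2: rank every chunk in the routed collection and keep the top K."""
    q = embed(query)
    chunks = [c for doc in documents for c in chunk(doc)]
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]


# Made-up sample data, for illustration only.
collections = {
    "billing": {
        "description": "pricing plans invoices refunds and payment methods",
        "documents": [
            "Refunds are issued within 14 days of purchase.",
            "Invoices are emailed monthly.",
        ],
    },
    "setup": {
        "description": "installation onboarding and knowledge base configuration",
        "documents": ["Install the widget by adding one script tag to your site."],
    },
}

query = "how do refunds work"
best = route(query, collections)  # Phase 1: where to look
context = retrieve(query, collections[best]["documents"], top_k=1)  # Phase 2: what to return
print(best, context)
```

In this sketch the query is compared only against collection descriptions first, and chunk-level search then runs inside a single collection. That is why a well-written description matters: it is the only signal the routing step sees.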
The two-phase approach is what separates GPTWeb from simple chatbot bolt-ons. By decoupling routing from retrieval, the engine scales gracefully as your knowledge base grows: because retrieval only ever searches the one matched collection, adding more collections sharpens answers rather than slowing them down. Every answer your visitors receive is grounded, traceable, and scoped to what you've curated.

Learn more about Getting Started with your knowledge base setup, or explore Use Cases that leverage this engine for Discussion Qualified Leads and visitor engagement.
GPTWeb is the future of engagement, websites, and marketing automation combined — built for the AI era, built for now.