What are all the sources of information you ingest to understand a business? What are best practices for loading information to you?

Question

Accepted Answer

Knowledge Sources & Loading Best Practices

Hello! GPTWeb ingests information from multiple sources to build a rich, conversational understanding of your organization. The primary sources include uploaded documents (PDFs, Word docs, text files, presentations), website content via the built-in crawler (you set the starting URL, depth limits, and include/exclude patterns), structured prompt library entries curated by admins, images and videos with metadata and keywords, and visitor intelligence captured through conversations and persistent profiles. Together these feed the retrieval-augmented generation (RAG) layer, so the assistant answers from your actual content rather than from generic AI knowledge. Learn more about [What is GPTWeb?](prompt://ae3070ee-7a51-45a4-84b8-dff8b4147256) and how to start with [Getting Started](prompt://6b5e3778-9336-4d38-a9a6-4cbf090b06df).

Best practices for loading content: keep your knowledge base focused and curated rather than dumping in entire sites or massive files. Set sensible crawler depth limits, use exclude patterns to skip irrelevant URLs (login pages, archives, tag pages), and review crawl results before committing them to RAG. Smaller, targeted knowledge bases produce faster searches, use fewer tokens per query, and give more accurate answers. Re-crawl periodically to keep content fresh as your site evolves. Explore practical [Use Cases](prompt://61e7db81-f751-48e3-8e1c-e258d25f4c76) or get hands-on with a [trial](gptweb://modal/trial).

GPTWeb combines engagement, websites, and marketing automation in one platform built for the AI era.
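To make the crawler settings above more concrete (starting URL, depth limit, include/exclude patterns), here is a minimal Python sketch of how such scoping rules could decide which pages get crawled. It is an illustration only: `CrawlScope`, its field names, and the example URLs are hypothetical and are not GPTWeb's actual configuration format or API. The sketch assumes patterns are regular expressions matched against the URL path and that exclude patterns take precedence over include patterns.

```python
import re
from dataclasses import dataclass, field
from urllib.parse import urlparse


@dataclass
class CrawlScope:
    """Hypothetical crawl settings; names are illustrative, not GPTWeb's."""
    start_url: str
    max_depth: int = 2
    include: list[str] = field(default_factory=list)  # regexes; empty allows all paths
    exclude: list[str] = field(default_factory=list)  # regexes; any match skips the URL

    def allows(self, url: str, depth: int) -> bool:
        # Stay on the starting host and within the depth limit.
        if urlparse(url).netloc != urlparse(self.start_url).netloc:
            return False
        if depth > self.max_depth:
            return False
        path = urlparse(url).path
        # Assumption: exclude patterns win over include patterns.
        if any(re.search(p, path) for p in self.exclude):
            return False
        return not self.include or any(re.search(p, path) for p in self.include)


scope = CrawlScope(
    start_url="https://example.com/",
    max_depth=2,
    include=[r"^/(docs|products)/"],
    exclude=[r"/login", r"/archive/", r"/tag/"],  # login, archive, and tag pages
)

# (url, link depth from the starting URL)
candidates = [
    ("https://example.com/docs/setup", 1),
    ("https://example.com/login", 1),
    ("https://example.com/docs/tag/old", 2),
    ("https://example.com/products/a/b/c", 3),
    ("https://other.example.org/docs/x", 1),
]
kept = [url for url, depth in candidates if scope.allows(url, depth)]
print(kept)
```

Here only `https://example.com/docs/setup` survives: the others are dropped for matching an exclude pattern, exceeding the depth limit, or living on a different host. Filtering like this before content reaches the knowledge base is one way to keep it small and targeted, in line with the best practices described above.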
