From docs to live chatbot in 5 minutes
A step-by-step breakdown of how Orac transforms your content into an intelligent, context-aware AI chatbot.
Upload your content
PDFs, URLs, or full site crawl
Start by giving Orac your content. You have three options: upload PDF or DOCX files directly, paste specific page URLs, or enter your domain for a full site crawl.
Full site crawl is the most powerful option. Orac discovers all pages via your sitemap.xml, crawls each one, and extracts clean text content. It works with WordPress, Shopify, Webflow, Next.js, and any publicly accessible website.
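The sitemap-discovery step can be sketched in a few lines. This is illustrative only (Orac's crawler internals are not public), assuming a fetched sitemap.xml body as input:

```typescript
// Minimal sketch of sitemap-based page discovery. A standard sitemap.xml
// lists one <loc> entry per page; extracting those gives the crawl queue.
function extractSitemapUrls(xml: string): string[] {
  const matches = xml.matchAll(/<loc>\s*([^<]+?)\s*<\/loc>/g);
  return Array.from(matches, (m) => m[1]);
}

// Hypothetical sitemap body for demonstration
const sampleSitemap = `
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/docs/getting-started</loc></url>
</urlset>`;

const pages = extractSitemapUrls(sampleSitemap);
```

Because the sitemap protocol is a plain XML standard, this same approach works regardless of whether the site runs WordPress, Shopify, or a custom stack.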
For SaaS products, you can create separate projects — one trained on marketing content (for your public site) and another trained on help docs (for in-app support).
Orac learns your docs
RAG pipeline: chunk → embed → store
Your content goes through Orac's RAG (Retrieval-Augmented Generation) pipeline. First, text is split into chunks of 500-1000 tokens with overlap to preserve context across boundaries.
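The chunk-with-overlap step looks roughly like this sketch. It uses words as a stand-in for tokens (real pipelines use a tokenizer such as tiktoken), and the sizes are illustrative, not Orac's actual parameters:

```typescript
// Sliding-window chunking: each chunk shares `overlap` words with the
// previous one, so a sentence that straddles a boundary still appears
// intact in at least one chunk.
function chunkText(text: string, chunkSize = 200, overlap = 50): string[] {
  const words = text.split(/\s+/).filter(Boolean);
  const chunks: string[] = [];
  for (let start = 0; start < words.length; start += chunkSize - overlap) {
    chunks.push(words.slice(start, start + chunkSize).join(" "));
    if (start + chunkSize >= words.length) break; // final window reached the end
  }
  return chunks;
}
```

The overlap is what preserves context across boundaries: without it, a question whose answer spans two chunks could retrieve neither.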
Each chunk is then converted into a vector embedding using OpenAI's text-embedding-3-small model (or your own API key via BYOK). These embeddings capture the semantic meaning of your content.
Embeddings are stored in a pgvector database alongside metadata: source URL, page title, section heading, and content hash. The hash is used for auto-sync — Orac can detect exactly which content has changed.
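A sketch of the per-chunk record and its content hash follows. The docs only say a hash is stored; SHA-256 and the field names here are assumptions for illustration:

```typescript
import { createHash } from "node:crypto";

// Hash the chunk text so a later crawl can detect changes without
// comparing full page bodies. (SHA-256 is an assumed choice.)
function contentHash(text: string): string {
  return createHash("sha256").update(text).digest("hex");
}

// Shape of what lands in the vector store, per chunk; the embedding
// itself would live in a pgvector column alongside these fields.
interface StoredChunk {
  sourceUrl: string;
  pageTitle: string;
  sectionHeading: string;
  hash: string;
}
```

A hash comparison is cheap (one string equality per page), which is what makes the later auto-sync diff so fast.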
Embed & go live
One script tag on any website
Add a single script tag to your website. The widget renders inside an iframe for complete CSS isolation — your site's styles never conflict with the widget, and vice versa.
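The embed looks something like the snippet below. The script URL and attribute names are placeholders, not Orac's actual embed code:

```html
<!-- Illustrative only: URL and data attributes are placeholders -->
<script
  src="https://example.com/widget.js"
  data-project-id="YOUR_PROJECT_ID"
  async
></script>
```

The `async` attribute keeps the widget from blocking page load, and the iframe it renders into is what guarantees the CSS isolation described above.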
The widget is fully customizable: primary color, position (left/right), welcome message, avatar, and suggested starter questions. On paid plans, the 'Powered by Orac' branding is removed at no extra cost.
For React/Next.js apps, use the npm package for tighter integration. For advanced use cases, the REST API gives you full programmatic control over conversations.
Visitors ask questions
Contextual answers with source citations
When a visitor asks a question, the widget sends it along with the current page URL. Orac embeds the question and runs a similarity search against your vector database.
The search is context-aware: chunks from the visitor's current page are boosted in the ranking, then the broader corpus is searched. The top 5-10 most relevant chunks are selected and fed to the LLM along with the question.
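The ranking step can be sketched as cosine similarity plus a bonus for chunks from the visitor's current page. The boost value and top-k are illustrative; Orac's actual ranking is not public:

```typescript
interface Chunk {
  sourceUrl: string;
  embedding: number[];
  text: string;
}

// Standard cosine similarity between two equal-length vectors
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Score every chunk, adding a flat bonus (assumed value) when the chunk
// comes from the page the visitor is currently viewing, then take top-k.
function retrieve(query: number[], chunks: Chunk[], currentPage: string, topK = 5): Chunk[] {
  return chunks
    .map((c) => ({
      c,
      score: cosine(query, c.embedding) + (c.sourceUrl === currentPage ? 0.1 : 0),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK)
    .map((x) => x.c);
}
```

In production this ranking runs inside pgvector rather than in application code, but the effect is the same: on-page content wins ties against similar chunks elsewhere on the site.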
The LLM generates a grounded answer — meaning it only uses information from your actual content, not its general training data. Every response includes source citations linking back to the original pages on your site.
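Grounding comes down to how the prompt is assembled: the retrieved chunks are inlined as context and the model is told to answer only from them. The wording below is a generic sketch, not Orac's actual prompt:

```typescript
// Build a grounded prompt: numbered chunks with their source URLs, so the
// model can cite [n] and the widget can link citations back to pages.
function buildPrompt(
  question: string,
  chunks: { text: string; sourceUrl: string }[]
): string {
  const context = chunks
    .map((c, i) => `[${i + 1}] (${c.sourceUrl})\n${c.text}`)
    .join("\n\n");
  return [
    "Answer using ONLY the context below. Cite sources as [n].",
    "If the context does not contain the answer, say you don't know.",
    "",
    "Context:",
    context,
    "",
    `Question: ${question}`,
  ].join("\n");
}
```

Carrying the source URL through the prompt is what makes per-answer citations possible: the model references chunk numbers, and each number maps back to a page.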
Auto-sync keeps it current
24h re-crawl with content hash diffing
Every 24 hours, Orac re-crawls your site and compares content hashes. New pages are added, updated pages are re-embedded with fresh content, and deleted pages have their chunks removed.
Only the delta is processed — if 5 out of 500 pages changed, only those 5 are re-embedded. This makes auto-sync fast and cost-effective (5 pages ≈ $0.001 in embedding costs).
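The diff itself is a straightforward comparison of stored hashes against freshly crawled ones. A minimal sketch, with assumed field names:

```typescript
interface SyncDelta {
  added: string[];   // new pages to embed
  updated: string[]; // changed pages to re-embed
  removed: string[]; // deleted pages whose chunks are purged
}

// Compare URL -> contentHash maps from the previous and current crawls;
// only entries in `added` and `updated` incur embedding costs.
function diffHashes(
  stored: Map<string, string>,
  crawled: Map<string, string>
): SyncDelta {
  const delta: SyncDelta = { added: [], updated: [], removed: [] };
  for (const [url, hash] of crawled) {
    if (!stored.has(url)) delta.added.push(url);
    else if (stored.get(url) !== hash) delta.updated.push(url);
  }
  for (const url of stored.keys()) {
    if (!crawled.has(url)) delta.removed.push(url);
  }
  return delta;
}
```

Unchanged pages fall through both branches untouched, which is why a 500-page site with 5 edits costs only 5 pages' worth of embedding.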
You can also trigger a manual sync anytime via the dashboard. A sync history log shows exactly what changed: pages added, updated, and removed.
Ready to let your docs do the talking?
Start free. No credit card required. Your first chatbot is live in under 5 minutes.