Website Training
Crawl websites and add web pages to train your workspace
Website training allows you to crawl and extract content from websites to build your workspace's knowledge base.
Adding a Website
Enter Website URL
- Click the "Add Website" button
- Enter the website URL (e.g., https://example.com or example.com)
- The system will automatically crawl all accessible pages
Configure URL Patterns (Optional)
Include Patterns: Only crawl URLs matching these patterns
- Enter one pattern per line
- Use * as a wildcard (e.g., https://example.com/docs/*)
- Useful for targeting specific sections or languages
Exclude Patterns: Skip URLs matching these patterns
- Enter one pattern per line
- Use * as a wildcard (e.g., https://example.com/admin/*)
- Useful for excluding private pages or unwanted sections
Examples:
Include: https://example.com/en/*
Exclude: https://example.com/*/old/*
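To make the include and exclude rules concrete, here is a minimal Python sketch of how such patterns could be evaluated. It is an illustration only, not the product's actual matcher: the function names (pattern_to_regex, should_crawl) are hypothetical, and it assumes that * matches any run of characters (including "/"), that an empty include list allows every URL, and that exclude patterns take priority over include patterns.

```python
import re


def pattern_to_regex(pattern):
    """Compile a wildcard pattern into a regex (assumption: * matches any characters)."""
    # Escape every regex metacharacter, then turn each escaped "*" into ".*".
    return re.compile(re.escape(pattern.strip()).replace(r"\*", ".*"))


def should_crawl(url, include_text="", exclude_text=""):
    """Decide whether a URL passes the include/exclude patterns (one pattern per line)."""
    includes = [pattern_to_regex(p) for p in include_text.splitlines() if p.strip()]
    excludes = [pattern_to_regex(p) for p in exclude_text.splitlines() if p.strip()]

    # Assumption: with no include patterns, every URL is a candidate.
    if includes and not any(r.fullmatch(url) for r in includes):
        return False
    # Assumption: an exclude match always wins over an include match.
    return not any(r.fullmatch(url) for r in excludes)


if __name__ == "__main__":
    include = "https://example.com/en/*"
    exclude = "https://example.com/*/old/*"
    for candidate in (
        "https://example.com/en/guide",
        "https://example.com/en/old/guide",
        "https://example.com/fr/guide",
    ):
        print(candidate, should_crawl(candidate, include, exclude))
```

With the example patterns above, this sketch would keep https://example.com/en/guide, skip https://example.com/en/old/guide (excluded), and skip https://example.com/fr/guide (not included).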
Start Crawling
Click "Add" to begin. The system will:
- Discover and crawl all pages
- Extract text content
- Count each crawled page toward your URL limit
URL Limit: Each crawled page counts as 1 URL toward your plan limit. Check your workspace info panel for current usage.
Auto-Sync Schedules
Keep your website content automatically updated with scheduled syncs.
Available Sync Options
- Daily: Sync every 24 hours
- Weekly: Sync once per week
- Monthly: Sync once per month
- None: Manual sync only (default)
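The sync options can be pictured as a mapping from schedule name to interval. The Python sketch below is a rough illustration, not the service's scheduler: the names SYNC_INTERVALS and next_sync are hypothetical, and "monthly" is approximated as 30 days because the exact monthly rule is not documented here.

```python
from datetime import datetime, timedelta

# Assumption: "monthly" is treated as 30 days; the real rule may follow calendar months.
SYNC_INTERVALS = {
    "daily": timedelta(hours=24),
    "weekly": timedelta(weeks=1),
    "monthly": timedelta(days=30),
    "none": None,  # manual sync only (the default)
}


def next_sync(last_sync, schedule="none"):
    """Return when the next automatic sync would run, or None for manual-only."""
    interval = SYNC_INTERVALS[schedule.lower()]
    if interval is None:
        return None
    return last_sync + interval


if __name__ == "__main__":
    last = datetime(2024, 1, 1, 9, 0)
    for name in ("daily", "weekly", "monthly", "none"):
        print(name, next_sync(last, name))
```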
Setting Up Auto-Sync
- Expand a website in your training list
- Click the schedule dropdown
- Select sync frequency
- Confirm the change
Plan Restrictions: Auto-sync frequency depends on your plan. Upgrade to access daily and weekly sync options.
Manual Sync
Update website content manually anytime:
Sync Website: Re-crawl the entire website from scratch
Sync Links: Update all previously crawled pages
Adding Individual Links
To add specific pages without crawling a full website:
- Click "Add Links" button
- Enter URLs (one per line or comma-separated)
- Click "Add" to process
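If you prepare a link list in a script before pasting it in, the accepted input format (one URL per line or comma-separated) can be normalized as in the Python sketch below. The function name parse_links is hypothetical, and two behaviors are assumptions rather than documented product behavior: adding https:// to entries without a scheme, and dropping duplicates.

```python
import re
from typing import List


def parse_links(raw: str) -> List[str]:
    """Split pasted text into a clean list of URLs (newline- or comma-separated)."""
    urls: List[str] = []
    for item in re.split(r"[,\n]", raw):
        url = item.strip()
        if not url:
            continue  # skip blank lines and stray commas
        # Assumption: entries without a scheme are treated as https.
        if not re.match(r"https?://", url, re.IGNORECASE):
            url = "https://" + url
        # Assumption: duplicates are dropped, keeping the first occurrence.
        if url not in urls:
            urls.append(url)
    return urls


if __name__ == "__main__":
    pasted = "https://example.com/pricing, example.com/faq\nhttps://example.com/pricing\n"
    print(parse_links(pasted))
```

Since each crawled page counts as 1 URL toward the plan limit, the length of the returned list gives a rough estimate of how many URLs the batch would use.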
Managing Website Content
Edit Mode
- Click "Edit" button
- Select websites or individual pages
- Choose action:
- Delete: Remove permanently
- Assign: Assign to specific BV applications
Search
Use the search bar to filter websites and pages by URL.