What Are User-Agents?
A user-agent is a string sent with every HTTP request that identifies which browser or bot is accessing your website. AI search engines crawl with specific, documented user-agents, so you can control their access in robots.txt. For example, OpenAI's crawler announces itself with the token GPTBot in its User-Agent header.
Understanding these user-agents helps you configure GEO-friendly access. This reference lists the major AI bots you should know about.
Major AI Bot User-Agents
| Bot Name | User-Agent String | Company |
|---|---|---|
| GPTBot | GPTBot | OpenAI (ChatGPT) |
| ChatGPT-User | ChatGPT-User | OpenAI (ChatGPT Plugins/Browse) |
| Claude-Web | Claude-Web | Anthropic (Claude) |
| anthropic-ai | anthropic-ai | Anthropic (Training) |
| Google-Extended | Google-Extended | Google (Gemini, formerly Bard) |
| PerplexityBot | PerplexityBot | Perplexity AI |
| YouBot | YouBot | You.com |
| Applebot-Extended | Applebot-Extended | Apple (AI features) |
| Diffbot | Diffbot | Diffbot (Knowledge Graph) |
| cohere-ai | cohere-ai | Cohere |
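If you want to flag these bots server-side, for analytics or log review, a minimal sketch like the one below works. The token list mirrors the table above; matching is case-insensitive and by substring, because the full User-Agent header usually wraps the token in extra version and contact details.

```python
# Detect known AI bots by the token in their User-Agent header.
# The token list mirrors the table above; full headers usually add
# version and contact info, so we match by case-insensitive substring.
AI_BOT_TOKENS = (
    "GPTBot", "ChatGPT-User", "Claude-Web", "anthropic-ai",
    "Google-Extended", "PerplexityBot", "YouBot",
    "Applebot-Extended", "Diffbot", "cohere-ai",
)

def detect_ai_bot(user_agent: str) -> str | None:
    """Return the matching bot token, or None for ordinary traffic."""
    ua = user_agent.lower()
    for token in AI_BOT_TOKENS:
        if token.lower() in ua:
            return token
    return None

# Illustrative header; the exact string a bot sends varies by version.
print(detect_ai_bot("Mozilla/5.0; compatible; GPTBot/1.1; +https://openai.com/gptbot"))
```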
robots.txt Examples
Allow All AI Bots (Recommended for GEO)
This configuration allows all AI bots to crawl your entire site:
```
User-agent: GPTBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: Claude-Web
Allow: /

User-agent: anthropic-ai
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: YouBot
Allow: /

User-agent: Applebot-Extended
Allow: /

User-agent: Diffbot
Allow: /

User-agent: cohere-ai
Allow: /
```
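Maintaining twenty near-identical lines by hand is error-prone. If your site has a build step, a short script can generate the block from a single list. This is a minimal sketch; the bot list and output path are assumptions to adapt to your setup:

```python
# Generate an allow-all robots.txt section from a single bot list.
# The list and the output path are assumptions; adapt them to your build.
AI_BOTS = [
    "GPTBot", "ChatGPT-User", "Claude-Web", "anthropic-ai",
    "Google-Extended", "PerplexityBot", "YouBot",
    "Applebot-Extended", "Diffbot", "cohere-ai",
]

rules = "\n\n".join(f"User-agent: {bot}\nAllow: /" for bot in AI_BOTS)

with open("robots.txt", "w", encoding="utf-8") as f:
    f.write(rules + "\n")
```

Adding a new bot then means appending one name to the list.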
Block All AI Bots (Not Recommended)
Only use this if you want to opt out of AI search completely. The example shows the pattern for the first few bots; extend it to every bot in the table above:
```
User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: Claude-Web
Disallow: /

User-agent: Google-Extended
Disallow: /
```
Partial Access Example
Allow AI bots access to public content while blocking private areas:
```
User-agent: GPTBot
Allow: /blog/
Allow: /docs/
Disallow: /admin/
Disallow: /user/
Disallow: /api/
```
Testing Your Configuration
Check Your robots.txt
Visit: https://yoursite.com/robots.txt
Verify the file loads and contains your AI bot configurations.
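To automate that check, here is a minimal sketch using only the Python standard library (replace yoursite.com with your own domain):

```python
from urllib.request import urlopen

# Fetch robots.txt and confirm it loads with rules in it.
# Replace yoursite.com with your own domain.
with urlopen("https://yoursite.com/robots.txt", timeout=10) as resp:
    assert resp.status == 200, f"unexpected HTTP status {resp.status}"
    body = resp.read().decode("utf-8", errors="replace")

assert "User-agent:" in body, "robots.txt loaded but contains no rules"
print(body)
```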
Use Robots.txt Testers
Google Search Console includes robots.txt reporting, and standalone robots.txt testers let you check specific user-agents against your rules. You can also test locally, as shown below.
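Python's built-in urllib.robotparser gives a quick local check. The sketch below verifies the partial-access rules from earlier; note that Python applies rules in file order, which can differ from Google's longest-match behavior, so treat it as a sanity check rather than the final word:

```python
from urllib.robotparser import RobotFileParser

# Check which paths a given bot may fetch under your rules.
# Paste your robots.txt content here, or use set_url() and read()
# to fetch the live file instead.
rules = """\
User-agent: GPTBot
Allow: /blog/
Allow: /docs/
Disallow: /admin/
Disallow: /user/
Disallow: /api/
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

for path in ("/blog/post-1", "/docs/setup", "/admin/settings"):
    print(path, "->", rp.can_fetch("GPTBot", f"https://yoursite.com{path}"))
# Expected: /blog/ and /docs/ paths True, /admin/ path False
```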
Monitor Server Logs
Check your server logs to see which bots are actually crawling. Look for the user-agent strings in access logs.
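A quick way to do that is to count hits per bot token in the access log. The sketch below assumes an nginx-style log at a typical path; adjust both to your server:

```python
from collections import Counter

# Count requests per AI bot token in an access log.
# The log path and format are assumptions; adjust to your server.
AI_BOT_TOKENS = (
    "GPTBot", "ChatGPT-User", "Claude-Web", "anthropic-ai",
    "Google-Extended", "PerplexityBot", "YouBot",
    "Applebot-Extended", "Diffbot", "cohere-ai",
)

hits = Counter()
with open("/var/log/nginx/access.log", encoding="utf-8", errors="replace") as log:
    for line in log:
        lowered = line.lower()
        for token in AI_BOT_TOKENS:
            if token.lower() in lowered:
                hits[token] += 1
                break  # count each request line once

for bot, count in hits.most_common():
    print(f"{bot}: {count}")
```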
Best Practices
- Allow all AI bots unless you have a specific reason not to
- Keep your robots.txt simple and well documented
- Test changes before deploying them to production
- Monitor your logs to see which bots actually visit
- Update your robots.txt as new AI bots emerge
- Block only specific sensitive directories, not the entire site
Related Resources
- Robots.txt Complete Guide: full guide to robots.txt configuration
- AI Bot Access: understanding bot access for GEO
- Technical SEO Checklist: complete technical optimization checklist