Build robots.txt Disallow rules for AI training crawlers, AI search crawlers, and user-triggered fetchers.
Choose the AI training crawlers, search crawlers, and user-triggered fetchers you want to disallow, then copy or download the generated block for your root robots.txt file.
Use this builder when you want a clear starting point for controlling how AI crawlers, AI search systems, and training-data crawlers interact with public pages on your site.
An AI crawler policy is usually expressed through robots.txt, the plain-text file placed at the root of a website to give automated crawlers instructions about which URLs they may fetch. Search engines have used this convention for decades. AI companies, search engines, data providers, and research crawlers now publish their own user-agent tokens so site owners can signal whether public content should be crawled for search, model training, retrieval, summaries, or other product features.
The important word is signal. A robots.txt rule is not a password, firewall, or access-control system. Well-behaved crawlers fetch the file, parse the matching User-agent group, and follow its Allow and Disallow directives. A crawler that does not check robots.txt, spoofs a user agent, uses a third-party fetcher, or accesses cached copies may not behave the way your file asks it to behave. Sensitive, private, paid, or legally restricted material should be protected with real access controls instead of relying on crawler instructions.
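As a rough illustration of that parsing step, the sketch below uses Python's standard-library urllib.robotparser to evaluate a small policy the way a compliant crawler might. The policy text, the example.com URLs, and the choice of GPTBot and Googlebot as test tokens are assumptions made for illustration. Real crawlers run their own parsers and may differ in edge cases.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: block one training crawler site-wide,
# and keep every other crawler out of /private/ only.
POLICY = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(policy_text: str, user_agent: str, url: str) -> bool:
    """Return True if a crawler that honors robots.txt may fetch the URL."""
    parser = RobotFileParser()
    # parse() accepts the file content as an iterable of lines,
    # so no network request is made here.
    parser.parse(policy_text.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    urls = ("https://example.com/articles/1", "https://example.com/private/x")
    for agent in ("GPTBot", "Googlebot"):
        for url in urls:
            print(agent, url, is_allowed(POLICY, agent, url))
```

The check only models a crawler that chooses to comply. Nothing in it prevents a request from a client that ignores the file, which is why access controls remain necessary for restricted material.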
The best policy depends on your tradeoff. A publisher may block training crawlers while still allowing search crawlers that send referral traffic. A SaaS company may allow broad indexing for documentation but block training use for gated templates. A community site may block archive-style crawlers to reduce load. A product catalog may permit traditional search but review AI answer crawlers carefully if summaries could replace visits. This builder makes those choices visible by separating crawler tokens and paths before you publish anything.
Crawler names also change. OpenAI documents separate tokens for search, training, and user-triggered requests. Google uses Google-Extended as a robots.txt control token rather than a separate HTTP user-agent string. Apple documents Applebot-Extended for foundation-model training opt-out, while Applebot itself is the search crawler. Anthropic and Amazon document separate search and user-triggered agents alongside training crawlers. Treat the generated policy as a starting point that should be checked against current vendor documentation and your own server logs.
The builder collects selected crawler tokens, additional crawler tokens, target paths, and an optional sitemap URL. It removes duplicate tokens, normalizes blank lines, and writes one robots.txt group per crawler. Each group outputs Disallow directives for the selected paths. The default selections favor blocking training and data-use tokens while leaving search and user-triggered fetchers available unless you choose to block them.
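The steps described above can be sketched roughly as follows. This is not the builder's actual source: the function name build_policy, its parameters, the case-insensitive duplicate check, and the fallback to "/" when no path is given are all assumptions chosen for illustration.

```python
from typing import Iterable, List, Optional


def build_policy(
    selected: Iterable[str],
    extra: Iterable[str] = (),
    paths: Iterable[str] = ("/",),
    sitemap: Optional[str] = None,
) -> str:
    """Return a robots.txt block with one Disallow group per crawler token."""
    # Merge both token lists, dropping blanks and duplicates.
    # Assumption: duplicates are compared case-insensitively.
    seen = set()
    tokens: List[str] = []
    for raw in list(selected) + list(extra):
        token = raw.strip()
        if token and token.lower() not in seen:
            seen.add(token.lower())
            tokens.append(token)

    # Assumption: an empty path list falls back to blocking the whole site.
    clean_paths = [p.strip() for p in paths if p.strip()] or ["/"]

    blocks: List[str] = []
    for token in tokens:
        lines = [f"User-agent: {token}"]
        lines += [f"Disallow: {path}" for path in clean_paths]
        blocks.append("\n".join(lines))

    if sitemap and sitemap.strip():
        blocks.append(f"Sitemap: {sitemap.strip()}")

    # Exactly one blank line between groups, one trailing newline.
    return "\n\n".join(blocks) + "\n" if blocks else ""


if __name__ == "__main__":
    print(build_policy(
        ["GPTBot", "Google-Extended"],
        extra=["gptbot"],  # duplicate of a selected token, so it is dropped
        paths=["/"],
        sitemap="https://example.com/sitemap.xml",
    ))
```

Writing a separate group per token keeps each crawler's rules easy to find and edit later, at the cost of a longer file than a single group with several User-agent lines.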
When you append the generated block to an existing robots.txt file, keep rule order and matching behavior in mind. A crawler generally follows the group whose User-agent line most specifically matches its token, and parsers differ in how they handle a token that appears in more than one group: some merge the rules, while others read only the first group. If your current file already has a group for one of these tokens, merge the generated directives into that existing group instead of publishing conflicting sections. After publishing, request /robots.txt directly or use search-console tools to confirm the file is reachable, then check server logs to confirm that crawlers are fetching it.
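One way to spot tokens that need merging before you append anything is to scan the existing file for User-agent lines. The sketch below is a minimal check under stated assumptions: the helper names tokens_with_groups and conflicting_tokens are hypothetical, the sample file content is invented, and tokens are compared case-insensitively.

```python
from typing import Iterable, List, Set


def tokens_with_groups(robots_text: str) -> Set[str]:
    """Collect the lowercased User-agent tokens that already have a group."""
    tokens: Set[str] = set()
    for line in robots_text.splitlines():
        # Drop trailing comments, then split "field: value".
        line = line.split("#", 1)[0].strip()
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "user-agent" and value.strip():
            tokens.add(value.strip().lower())
    return tokens


def conflicting_tokens(existing_text: str, generated: Iterable[str]) -> List[str]:
    """Return generated tokens that the existing file already covers."""
    existing = tokens_with_groups(existing_text)
    return [token for token in generated if token.lower() in existing]


if __name__ == "__main__":
    # Invented example of a file that is already published.
    existing_file = (
        "User-agent: *\n"
        "Disallow: /admin/\n"
        "\n"
        "User-agent: GPTBot  # training crawler\n"
        "Disallow: /drafts/\n"
    )
    print(conflicting_tokens(existing_file, ["GPTBot", "Applebot-Extended"]))
```

Any token this returns already has a group in the published file, so its new Disallow lines belong inside that group rather than in a second one.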
The builder runs in your browser. It does not fetch your domain, inspect your server logs, verify IP ranges, or make claims about legal enforceability. Those steps may still matter for larger publishers, high-value content sites, or domains with unusual crawler traffic. For production policies, keep a dated internal note explaining why each crawler is allowed or blocked so future teams can update the file deliberately instead of copying old rules forward forever.
Built and maintained by utilkit. Found an issue? Send corrections to contact@utilkit.com