# ==========================================================
# robots.txt - LawFirmCentral.com
# Note: robots.txt is advisory. Malicious bots ignore it;
# Cloudflare WAF / firewall rules handle real enforcement
# (geo-block CN/HK, bad ASNs, rate limiting, bot fight mode).
# ==========================================================

# Cloudflare Content Signals (EU DSM Directive Art. 4 reservation):
# search = yes → indexing & search results allowed
# ai-input = yes → real-time AI grounding / RAG allowed
# ai-train = no → no model training on this content
User-agent: *
Content-Signal: search=yes,ai-input=yes,ai-train=no
Allow: /
Disallow: /admin
Disallow: /admin/
Disallow: /auth
Disallow: /auth/
Disallow: /unsubscribe
Disallow: /lovable/
Disallow: /api/
Disallow: /*/claim$

# ----- Allowed: reputable web search engines -----
User-agent: Googlebot
Allow: /

User-agent: Googlebot-Image
Allow: /

User-agent: Googlebot-News
Allow: /

User-agent: Bingbot
Allow: /

User-agent: Slurp
Allow: /

User-agent: DuckDuckBot
Allow: /

User-agent: Twitterbot
Allow: /

User-agent: facebookexternalhit
Allow: /

User-agent: LinkedInBot
Allow: /

User-agent: Pinterestbot
Allow: /

# ----- Allowed: reputable AI / LLM crawlers (AEO / GEO / AIO) -----
User-agent: GPTBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: Claude-Web
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Perplexity-User
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: Applebot
Allow: /

User-agent: Applebot-Extended
Allow: /

User-agent: cohere-ai
Allow: /

User-agent: Meta-ExternalAgent
Allow: /

User-agent: Meta-ExternalFetcher
Allow: /

User-agent: DuckAssistBot
Allow: /

User-agent: YouBot
Allow: /

User-agent: Mistralbot
Allow: /

# ----- Blocked: Chinese / Hong Kong crawlers -----
User-agent: Baiduspider
Disallow: /

User-agent: Baiduspider-image
Disallow: /

User-agent: Baiduspider-video
Disallow: /

User-agent: Baiduspider-news
Disallow: /

User-agent: Baiduspider-render
Disallow: /

User-agent: Sogou
Disallow: /

User-agent: Sogou web spider
Disallow: /

User-agent: Sogou inst spider
Disallow: /

User-agent: Sogou Pic Spider
Disallow: /

User-agent: Sogou Orion spider
Disallow: /

User-agent: Sogou spider2
Disallow: /

User-agent: 360Spider
Disallow: /

User-agent: 360Spider-Image
Disallow: /

User-agent: Haosouspider
Disallow: /

User-agent: YisouSpider
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: PetalBot
Disallow: /

User-agent: AspiegelBot
Disallow: /

# ----- Blocked: Russian / other regional crawlers -----
User-agent: Yandex
Disallow: /

User-agent: YandexBot
Disallow: /

User-agent: YandexImages
Disallow: /

User-agent: Mail.RU_Bot
Disallow: /

User-agent: Naver
Disallow: /

User-agent: Yeti
Disallow: /

User-agent: Daum
Disallow: /

User-agent: SeznamBot
Disallow: /

# ----- Blocked: SEO scrapers / competitive intel -----
User-agent: AhrefsBot
Disallow: /

User-agent: SemrushBot
Disallow: /

User-agent: SemrushBot-SA
Disallow: /

User-agent: MJ12bot
Disallow: /

User-agent: DotBot
Disallow: /

User-agent: rogerbot
Disallow: /

User-agent: BLEXBot
Disallow: /

User-agent: SerpstatBot
Disallow: /

User-agent: DataForSeoBot
Disallow: /

User-agent: barkrowler
Disallow: /

User-agent: SiteAuditBot
Disallow: /

User-agent: LinkpadBot
Disallow: /

User-agent: spbot
Disallow: /

User-agent: linkdexbot
Disallow: /

User-agent: Sistrix
Disallow: /

User-agent: SISTRIX Crawler
Disallow: /

User-agent: SEOkicks
Disallow: /

User-agent: Screaming Frog SEO Spider
Disallow: /

User-agent: NetcraftSurveyAgent
Disallow: /

User-agent: ZoominfoBot
Disallow: /

User-agent: Bloomberg
Disallow: /

User-agent: TurnitinBot
Disallow: /

User-agent: GrapeshotCrawler
Disallow: /

# ----- Blocked: aggressive / unwanted AI scrapers -----
User-agent: CCBot
Disallow: /

User-agent: Diffbot
Disallow: /

User-agent: Omgili
Disallow: /

User-agent: Omgilibot
Disallow: /

User-agent: ImagesiftBot
Disallow: /

User-agent: FacebookBot
Disallow: /

User-agent: Amazonbot
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: Timpibot
Disallow: /

User-agent: ICC-Crawler
Disallow: /

User-agent: PanguBot
Disallow: /

User-agent: Kangaroo Bot
Disallow: /

User-agent: AwarioRssBot
Disallow: /

User-agent: AwarioSmartBot
Disallow: /

User-agent: Magpie-Crawler
Disallow: /

User-agent: peer39_crawler
Disallow: /

User-agent: peer39_crawler/1.0
Disallow: /

User-agent: FriendlyCrawler
Disallow: /

Sitemap: https://lawfirmcentral.com/sitemap.xml