User-agent: *
Allow: /

# Allow crawling of all important content
Allow: /article/
Allow: /category/
Allow: /about
Allow: /contact
Allow: /privacy
Allow: /terms

# Block admin and sensitive areas
Disallow: /admin
Disallow: /login
Disallow: /api
Disallow: /*.json$
Disallow: /search?
Disallow: /*?*utm_
Disallow: /*?*fbclid=
Disallow: /*?*gclid=

# Block duplicate URLs with language parameters
Disallow: /*?lang=*&

# Crawl-delay for respectful crawling
Crawl-delay: 1

# Sitemap locations (both dynamic and static)
Sitemap: https://impaxnews.com/sitemap.xml
Sitemap: https://impaxnews.com/static/sitemap.xml

# A crawler that matches one of the named groups below follows only that
# group and does not inherit the Disallow rules from the * group above.

# Special rules for news crawlers (faster crawling for news content)
# Google's crawlers do not honor Crawl-delay.
User-agent: Googlebot-News
Allow: /
Allow: /article/
Allow: /category/
Crawl-delay: 0

User-agent: Bingbot
Allow: /
Crawl-delay: 2

# Social media crawlers
User-agent: facebookexternalhit
Allow: /

User-agent: Twitterbot
Allow: /

User-agent: LinkedInBot
Allow: /

User-agent: WhatsApp
Allow: /

User-agent: Telegrambot
Allow: /