# Best Practices robots.txt Example

# 1. Sitemap Declaration(s)
# Always declare your sitemap(s) to help search engines discover your important pages.
# Use the full URL to your sitemap(s). If you have multiple, list them all.
Sitemap: https://attendanceguru.com/subdomain.xml
# Sitemap: https://website-sitemap.s3.ap-south-1.amazonaws.com/subdomain.xml
# Sitemap: https://website-sitemap.s3.ap-south-1.amazonaws.com/conversationseo.xml
# Sitemap: https://website-sitemap.s3.ap-south-1.amazonaws.com/sitemap.xml

# 2. User-Agent Directives
# Apply directives to all crawlers unless a specific crawler needs different rules.
User-Agent: *
#
# 3. General Allowance (often implicit or good for clarity)
# Allow crawling of the entire site by default. More specific Disallow rules will override this for specific paths.
Allow: /
#
# 4. Disallow Directives (Commonly Blocked Areas)
# Block areas that are not intended for public search results or are purely functional.
# - Administrative areas (e.g., login, admin dashboards)
# - User-specific pages (e.g., user profiles, settings) that are not public
# - Internal search result pages (can create infinite crawl loops and low-value content)
# - Shopping cart/checkout processes (once the user starts them)
# - Development/staging environments
Disallow: /login/
# NOTE: "#" starts a comment in robots.txt, so "Disallow: /#/edzon/" is parsed as
# "Disallow: /", which contradicts "Allow: /" above. URL fragments (the part after "#")
# are never sent to the server, so hash routes cannot be blocked here.
# Disallow: /#/edzon/
# Specific disallows from your original list (adjust as needed based on intent)
# Disallow: /#/edzon/attendanceguru/ # Only if this path is truly not meant for indexing
Disallow: /signinupextended
#
# 5. Handling of CSS, JavaScript, and Images (CRITICAL FOR RENDERING)
# Google explicitly recommends *not* blocking CSS, JavaScript, or images that are
# essential for rendering the page's content or understanding its layout.
# Blocking them can lead to "degraded" or "incomplete" rendering by Googlebot.
# If you have non-essential JS/CSS (e.g., very large analytics files that don't affect content),
# you *could* disallow them, but it's often not necessary.
# ALLOW all CSS and JS for proper rendering.
Allow: /*.css
Allow: /*.js
Allow: /*.png
Allow: /*.jpg
Allow: /*.gif
Allow: /*.svg
#
# 6. Specific Allowances for Third-Party Scripts (like AdSense, Google Analytics)
# These are often allowed even if there's a broader disallow that might accidentally catch them.
# Your original file had good examples of these.
Allow: /ads.txt
Allow: /ads/preferences/
Allow: /gpt/
Allow: /pagead/show_ads.js
Allow: /pagead/js/adsbygoogle.js
Allow: /pagead/js/*/show_ads_impl.js
Allow: /static/glade.js
Allow: /static/glade/
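#
# 7. Example: Crawler-Specific Group (illustrative sketch, commented out)
# A minimal sketch of the "specific crawler needs different rules" case from section 2.
# "ExampleBot" and "/reports/" are hypothetical placeholders, not names taken from this site.
# A crawler that matches a named group follows only that group and ignores the
# "User-Agent: *" rules, so any shared rules would need to be repeated in the named group.
# Uncomment and adapt only if a real crawler needs its own rules.
# User-Agent: ExampleBot
# Disallow: /login/
# Disallow: /signinupextended
# Disallow: /reports/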