Every website should have a robots.txt file: a plain text file at the root of your domain that tells search engine crawlers which pages they are allowed to visit and which to skip. Without one, Google will attempt to crawl every URL it discovers through links, including admin pages, duplicate content, and staging areas that should stay private. This guide shows you how to create a correct robots.txt file in under 10 minutes using our free generator.
What Is a Robots.txt File?
A robots.txt file is a plain text file located at the root of your website (e.g., https://yoursite.com/robots.txt) that uses the Robots Exclusion Protocol to communicate with search engine crawlers. When Googlebot or any other crawler visits your site, it first checks this file to see what it is and is not allowed to access.
The file follows a simple format: you specify a user-agent (which crawler the rule applies to) and then Disallow or Allow directives listing paths the crawler should skip or access.
Robots.txt is a request, not a lock. Well-behaved crawlers like Googlebot follow robots.txt instructions, but malicious bots may ignore them. For pages you want kept truly private (login areas, admin panels), use proper server-level authentication — do not rely on robots.txt alone.
Robots.txt Syntax Explained
A robots.txt file is made up of groups of directives. Each group starts with a User-agent line followed by one or more Disallow or Allow lines:
User-agent: *
Disallow: /admin/
Disallow: /private/
Allow: /
Sitemap: https://yoursite.com/sitemap.xml
Breaking down each directive:
- User-agent: * — the asterisk (*) means "all crawlers." You can replace * with a specific crawler name like "Googlebot" to target just that bot (see the combined example after this list).
- Disallow: /admin/ — tells crawlers not to access any URL beginning with /admin/
- Allow: / — explicitly allows access to everything else. Often unnecessary (allowing is the default) but useful for clarity.
- Sitemap: — an optional but highly recommended line pointing crawlers to your XML sitemap. Google uses this to find all your pages.
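You can also combine several groups in one file, with general rules for every crawler and stricter rules for a specific bot. Here is a minimal sketch; the paths and sitemap URL are placeholders:

User-agent: *
Disallow: /admin/
Allow: /

User-agent: Googlebot-Image
Disallow: /private-images/

Sitemap: https://yoursite.com/sitemap.xml

A crawler follows the most specific group that matches its name, so Googlebot-Image would obey the second group and ignore the first, while every other crawler falls back to the * group.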
Key syntax rules:
- Each directive must be on its own line
- Paths are case-sensitive: /Admin/ and /admin/ are different
- A blank Disallow: with no path means "allow everything", not "block everything"
- Lines starting with # are comments (ignored by crawlers, useful for notes); the short example below illustrates these rules
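To see those rules in practice, here is a short illustrative snippet with made-up paths:

# Printable page copies are blocked for every crawler (this line is a comment)
User-agent: *
Disallow: /Print/
Disallow: /print/

# A blank Disallow means this crawler may access the whole site
User-agent: Bingbot
Disallow:

Because paths are case-sensitive, /Print/ and /print/ each need their own Disallow line if both versions exist on your site.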
How to Create One Free in 3 Steps
Our free robots.txt generator builds the file for you without needing to write any code:
Step 1: Open the generator
Go to searchranktool.com/robots-txt-generator. No signup required.
Step 2: Configure your rules
Select which crawlers to target, which directories to block, and whether to include your sitemap URL. The generator builds the correct syntax as you make selections.
Step 3: Copy and upload
Copy the generated robots.txt content. Create a plain text file named exactly robots.txt (lowercase, no spaces) and upload it to the root directory of your web server — the same folder that contains your index.html or index.php file.
Once uploaded, verify it is accessible by visiting https://yoursite.com/robots.txt in your browser. You should see the plain text content of the file.
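If you prefer checking from a script rather than a browser, a small Python sketch like this fetches the file and prints what the server returns; https://yoursite.com is a placeholder for your own domain:

from urllib.request import urlopen

# Fetch the live robots.txt and confirm it is served as plain text
with urlopen("https://yoursite.com/robots.txt") as response:
    print("Status:", response.status)                             # expect 200
    print("Content-Type:", response.headers.get("Content-Type"))  # expect text/plain
    print(response.read().decode("utf-8"))                        # the file contents

If the file is missing, the request fails with a 404 error instead of printing the contents.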
What to Block vs What to Allow
The decision of what to block depends on your site type, but here are standard recommendations:
Typically block (add to Disallow):
- /admin/ or /wp-admin/ — WordPress admin area
- /login, /logout, /register — authentication pages
- /cart, /checkout — e-commerce transaction pages
- /search — internal search result pages (duplicate content risk)
- /tag/, /author/ — WordPress taxonomy pages with thin content
- /cdn-cgi/ — Cloudflare system pages
- Any staging, test, or development directories
Always keep allowed (never block):
- Your homepage (/)
- All blog posts, articles, and product pages
- Your sitemap (/sitemap.xml)
- CSS and JavaScript files (blocking these prevents Google from rendering your pages correctly)
- Any pages you want to rank in Google
Critical mistake to avoid: Never add Disallow: / to your live site. This blocks all well-behaved crawlers from your entire website and will eventually cause your pages to drop out of Google's search results.
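Putting these recommendations together, a typical file for a small blog or shop might look like the sketch below; the directories are placeholders, so adjust them to match your own structure:

User-agent: *
Disallow: /wp-admin/
Disallow: /cart
Disallow: /checkout
Disallow: /search
Disallow: /staging/
# Never uncomment the next line on a live site: it blocks everything
# Disallow: /

Sitemap: https://yoursite.com/sitemap.xml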
Robots.txt for WordPress
WordPress sites have a virtual robots.txt generated automatically. To customise it, you have two clean options:
Option A — Use a plugin: Yoast SEO and Rank Math both have robots.txt editors under their SEO settings. This is the easiest approach for WordPress users.
Option B — Upload a physical file: Upload a robots.txt file to your WordPress root directory (the folder containing wp-config.php). A physical file overrides the WordPress virtual robots.txt.
A standard WordPress robots.txt looks like:
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
Disallow: /wp-login.php
Disallow: /search
Disallow: /?s=
Sitemap: https://yoursite.com/sitemap.xml
The Allow: /wp-admin/admin-ajax.php line is important — it allows crawlers to access the AJAX endpoint used by many plugins, preventing rendering issues.
How to Test Your Robots.txt
After uploading your robots.txt, test it in Google Search Console:
- Open Search Console for your property
- Go to Settings (gear icon) → Robots.txt
- Google shows your current robots.txt content and when it was last fetched
- For per-URL checks, use the URL Inspection tool (the search bar at the top of Search Console)
The URL Inspection tool is worth running on your key pages: inspect any URL on your site and the crawl details show whether Googlebot is allowed to fetch it under your current robots.txt rules. This catches mistakes before they affect your rankings.
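You can also run a quick local check without waiting on Search Console. Python's built-in urllib.robotparser applies standard Robots Exclusion Protocol matching (it does not support every Google extension, such as * wildcards inside paths), so treat it as a sanity check; https://yoursite.com is a placeholder:

from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
parser.set_url("https://yoursite.com/robots.txt")
parser.read()  # downloads and parses the live file

# True means the named crawler may fetch the URL, False means it is blocked
print(parser.can_fetch("Googlebot", "https://yoursite.com/blog/my-post"))
print(parser.can_fetch("Googlebot", "https://yoursite.com/wp-admin/"))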
Also check your XML sitemap is referenced in your robots.txt. Google crawlers use the Sitemap line to discover all your pages efficiently.
Frequently Asked Questions
Does every website need a robots.txt file?
Not strictly required, but strongly recommended. Without a robots.txt file, search engine crawlers will attempt to access every page they can find via links, including admin areas, login pages, and duplicate content. A robots.txt file gives you control over what gets crawled, keeping low-value pages out of the crawl and making better use of your crawl budget.
What happens if I block the wrong page in robots.txt?
If you accidentally block a page that should be indexed, Google will stop crawling it and eventually remove it from search results. This can cause a sudden ranking drop. Always test your robots.txt in Google Search Console after making changes, and check that your most important pages (homepage, blog posts, product pages) are not inadvertently blocked.
Can robots.txt hide my pages from Google?
Robots.txt prevents Google from crawling a page, but not necessarily from indexing it. If other websites link to a blocked page, Google may still add it to its index based on those links — it just won't have crawled the content. To prevent a page from appearing in search results entirely, use a noindex meta tag on the page itself rather than (or in addition to) blocking it in robots.txt.
How often should I update my robots.txt file?
Update your robots.txt whenever you add new sections to your site that should be blocked (new admin tools, staging areas, search result pages) or when you change your URL structure. After any major site rebuild or platform migration, review and update your robots.txt to match the new structure. Submit to Google Search Console after each update.