robots.txt Tester Tool
The robots.txt Tester is a tool that checks the validity and effectiveness of a website's robots.txt file and its directives. The robots.txt file is crucial for controlling how search engines and other web robots, including AI/LLM crawlers, interact with a site.
Download robots.txt
You can fetch any live robots.txt file and load it into the editor by using the form below.
Important: This action will override and discard any content currently in the editor. Save your edits before fetching a live robots.txt file.
What is a robots.txt file?
A robots.txt file is a simple text file placed in the root directory of a website. It contains instructions for web robots, telling them which parts of the site they can access and which parts they should ignore.
Why is robots.txt important?
The robots.txt file serves several important purposes:
- Prevents search engines from indexing private or sensitive content
- Reduces server load by blocking crawlers from non-essential areas
- Helps manage duplicate content issues
- Guides search engines to focus on the most important parts of a site
- Controls access for AI and Large Language Model (LLM) training, helping to protect copyrighted or sensitive information from being used in AI training datasets
How does this robots.txt Tester work?
This robots.txt Tester simulates how a search engine crawler interprets the robots.txt file. It uses Google's official open-sourced robots.txt parsing code and complies with RFC 9309, ensuring accuracy and compliance with the current standard. Enter a URL and a user agent, and the tester shows whether that URL is allowed or disallowed based on the robots.txt directives.
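The allow/disallow check at the heart of such a tester can be sketched with Python's standard-library parser. Note that `urllib.robotparser` predates RFC 9309 and differs from Google's open-sourced parser in some edge cases; it is used here only to illustrate the basic idea, and the rules and URLs below are made up for the example.

```python
# Illustrative sketch of a robots.txt allow/disallow check using the
# Python standard library (not Google's RFC 9309 parser).
from urllib.robotparser import RobotFileParser

# A minimal, hypothetical robots.txt: block everyone from /private/.
rules = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# can_fetch(user_agent, url) answers the tester's core question.
print(parser.can_fetch("Googlebot", "https://example.com/private/data.html"))  # False
print(parser.can_fetch("Googlebot", "https://example.com/public/page.html"))   # True
```

The real tester performs the same kind of check, but with a parser that matches Google's production behavior.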
Examples of robots.txt directives
Example 1: Blocking all robots from the entire site
User-agent: *
Disallow: /
Example 2: Allowing all robots, but blocking a specific directory
User-agent: *
Disallow: /private/
Example 3: Blocking a specific robot from a specific file
User-agent: BadBot
Disallow: /secret-file.html
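Example 3 above can be verified programmatically. The sketch below feeds those two lines to Python's standard-library parser (again, not fully RFC 9309-compliant, but sufficient for simple directives like these) and confirms that only the named robot is blocked; the URLs are placeholders.

```python
# Verify Example 3: BadBot is blocked from one file, other robots are not.
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: BadBot
Disallow: /secret-file.html
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# The directive applies only to BadBot; unnamed robots fall through and
# are allowed because no "User-agent: *" group exists.
print(parser.can_fetch("BadBot", "https://example.com/secret-file.html"))   # False
print(parser.can_fetch("GoodBot", "https://example.com/secret-file.html"))  # True
```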
Using the robots.txt Tester
To use this robots.txt Tester:
- Enter the absolute URL of the website to test.
- Select or enter a user agent.
- Click the icon to retrieve the live robots.txt file from the specified URL.
- Review the fetched robots.txt content in the provided text area.
- Modify the robots.txt content if desired for testing purposes.
- Click "Test" to see the results.
Fetching and modifying the robots.txt content in the robots.txt Tester does not make any changes to the live website. This tool is for testing and analysis purposes only.
Interpreting the results
The robots.txt Tester will show which URLs are allowed or disallowed for the specified user agent. This helps in verifying that the robots.txt file is correctly configured and effectively managing web robot access to the site.
Disclaimer
The robots.txt tools and information shared on this website are under constant development, and improvements will be added over time. Any data uploaded to this website or its APIs is deleted within a few days after its purpose has ended, including personal information such as email addresses. For example, email addresses and all related identifiable data are deleted shortly after unsubscribing from the notifications. Email addresses are not distributed or sold to third parties, or used for any purpose other than the one stated (notifying of results). Only on the rare occasion that a bug appears may some uploaded data be stored slightly longer for debugging purposes, after which it is still completely disregarded and deleted.
Bugs will happen. Despite best efforts to maintain the code base and the quality of the data shared on this website, no guarantees can or will be given. Data, information, and results may be incomplete, and errors may occur. This is a personal, for-fun project. Use at your own risk.