Robots.txt file not available

Table of Contents

Cause:
How to resolve it:

Cause:

A robots.txt file tells search engine crawlers which URLs the crawler can access on your site. This is used mainly to avoid overloading your site with requests; it is not a mechanism for keeping a web page out of Google. To keep a web page out of Google, block indexing with noindex or password-protect the page.

The robots.txt file is fetched automatically by crawlers when they arrive at your website. It should contain directives for crawlers, such as which pages may or may not be crawled. It must be well-formatted so that search engines can read and apply it.

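If you want to stop crawlers from fetching some content (for example, pages with duplicate content), add an appropriate Disallow rule to the robots.txt file. A Disallow rule blocks crawling only; a blocked URL can still be indexed if other pages link to it, so protect private pages with noindex or a password instead.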

How to resolve it:

If the robots.txt file is not available on your website, create one containing the appropriate rules and upload it to the root folder of the website.
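As a rough illustration of the "create the file" step, the Python sketch below writes a minimal robots.txt into a local folder that stands in for the website root. The function name `write_robots_txt`, the `DEFAULT_RULES` constant, and the permissive default rule set are assumptions made for this example. How the file is then uploaded depends on your hosting setup and is not shown.

```python
from pathlib import Path

# Hypothetical minimal rule set: every crawler may fetch everything.
DEFAULT_RULES = "User-agent: *\nAllow: /\n"


def write_robots_txt(web_root: str, rules: str = DEFAULT_RULES) -> Path:
    """Write robots.txt into web_root unless one already exists; return its path."""
    target = Path(web_root) / "robots.txt"
    if not target.exists():
        # Plain UTF-8 text, one directive per line.
        target.write_text(rules, encoding="utf-8")
    return target
```

The sketch deliberately leaves an existing robots.txt untouched, so rules you have already written are not overwritten.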

You can control which files crawlers may access on your site with a robots.txt file. A robots.txt file lives at the root of your site. So, for the site www.example.com, the robots.txt file lives at www.example.com/robots.txt. robots.txt is a plain text file that follows the Robots Exclusion Standard. A robots.txt file consists of one or more rules. Each rule blocks or allows access for a given crawler to a specified file path in that website. Unless you specify otherwise in your robots.txt file, all files are implicitly allowed for crawling.
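Because the file always sits at the root of the site, its location can be derived from any page URL. The Python sketch below shows one way to do that and to check whether the file is actually being served. The helper names `robots_txt_url` and `robots_txt_available` are made up for this example, the page URL is assumed to include a scheme such as https://, and treating only an HTTP 200 response as "available" is a simplification.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt URL for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep scheme and host; replace path, query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def robots_txt_available(page_url: str, timeout: float = 10.0) -> bool:
    """Return True if the site's robots.txt answers with HTTP 200."""
    try:
        with urlopen(robots_txt_url(page_url), timeout=timeout) as response:
            return response.status == 200
    except (OSError, ValueError):
        # OSError covers HTTP errors, DNS failures and timeouts;
        # ValueError covers URLs without a usable scheme.
        return False
```

For example, `robots_txt_url("https://www.example.com/blog/post?id=1")` yields `https://www.example.com/robots.txt`.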

Here is a simple robots.txt file with two rules:

User-agent: Googlebot
Disallow: /nogooglebot/

User-agent: *
Allow: /

Sitemap: http://www.example.com/sitemap.xml
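
To see what these rules mean in practice, the Python sketch below feeds the same file to the standard library's `urllib.robotparser` and asks which URLs each crawler may fetch. The `allowed` helper and the crawler name `OtherBot` are hypothetical and used only for illustration. Python's parser applies the first matching rule in a group, which may differ from a particular search engine's matching logic on more complex files.

```python
from urllib.robotparser import RobotFileParser

# The sample robots.txt from above, inlined so no network access is needed.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /nogooglebot/

User-agent: *
Allow: /

Sitemap: http://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(user_agent: str, path: str) -> bool:
    """Return True if user_agent may crawl path on www.example.com."""
    return parser.can_fetch(user_agent, "http://www.example.com" + path)


# Googlebot is blocked only under /nogooglebot/.
print(allowed("Googlebot", "/nogooglebot/page.html"))
print(allowed("Googlebot", "/index.html"))
# Any other crawler (hypothetical name) falls under the "*" group.
print(allowed("OtherBot", "/nogooglebot/page.html"))
# Sitemap lines are collected separately (Python 3.8+).
print(parser.site_maps())
```

The first check reports that Googlebot may not crawl under /nogooglebot/, while the other two checks report that crawling is allowed.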