The robots.txt file is an important concept to learn, because it controls which parts of your website search engine crawlers are allowed to crawl.

What is the Robots.txt file?

Robots.txt is the file that implements the Robots Exclusion Protocol (REP). By placing this file on your website, you can tell crawlers not to crawl specific pages or sections of your website.

When a crawler visits your website, it first checks for this file. If the file is not present, the crawler will crawl all the crawlable content on your website.
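The default behaviour described above can be sketched with Python's standard `urllib.robotparser` module. This is only an illustration: the helper name `allowed_without_rules` is hypothetical, and an empty rule list is used here as a stand-in for a website that has no robots.txt file. Real crawlers use their own implementations.

```python
from urllib.robotparser import RobotFileParser


def allowed_without_rules(url: str, agent: str = "*") -> bool:
    """Return whether `agent` may fetch `url` when no rules exist."""
    parser = RobotFileParser()
    # Assumption: an empty list of lines stands in for a missing robots.txt.
    parser.parse([])
    return parser.can_fetch(agent, url)


# With no rules at all, nothing is blocked.
print(allowed_without_rules("https://www.example.com/services/"))  # True
```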


Syntax for Robots.txt files:

User-agent: name of the crawler the rules apply to (required; use * for all crawlers)
Disallow: URL path that should not be crawled

Meaning of the technical terms used in the robots.txt syntax:

User-agent: Here you specify the name of the user-agent, that is, the bot name (for example Googlebot or Bingbot). Use * to address all crawlers.

Disallow: Here you specify the URL path or file you do not want to be crawled

Allow: Lets the crawler crawl a subfolder or page even though its parent folder is disallowed (supported by Googlebot and most other major crawlers)

Crawl-delay: The number of seconds a crawler should wait between requests (Googlebot ignores this directive; some other crawlers, such as Bingbot, honor it)

Sitemap: Tells the crawler the location of the sitemap for the website, given as a full URL
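To see how these directives fit together, here is a minimal sketch that parses a sample rule set with Python's standard `urllib.robotparser` (Python 3.8 or later for `site_maps()`). The rule set, the `/services/public/` subfolder, the sitemap URL, and the helper `load_rules` are all made up for illustration. Note that this particular parser applies the first matching rule, so the Allow line is placed before the Disallow line; other crawlers may resolve conflicts differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule set for www.example.com, used only for illustration.
RULES = """\
User-agent: *
Allow: /services/public/
Disallow: /services/
Crawl-delay: 10
Sitemap: https://www.example.com/sitemap.xml
"""


def load_rules(text: str) -> RobotFileParser:
    """Parse robots.txt content given as a string."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = load_rules(RULES)

# Disallow blocks the folder, Allow re-opens one subfolder inside it.
print(parser.can_fetch("*", "https://www.example.com/services/pricing"))     # False
print(parser.can_fetch("*", "https://www.example.com/services/public/faq"))  # True

# Crawl-delay and Sitemap are exposed as plain values.
print(parser.crawl_delay("*"))  # 10
print(parser.site_maps())       # ['https://www.example.com/sitemap.xml']
```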

Now let’s take some examples to better understand the robots.txt file:

Case 1: We allow the crawlers to crawl the entire website. An empty Disallow value blocks nothing, so use this only when no part of the website needs to stay uncrawled.

User-agent: *
Disallow:

Case 2: We block the crawlers from crawling the services section of our website, www.example.com/services/

User-agent: *
Disallow: /services/
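A quick way to check what a rule like this covers is to test a few URLs against it. The sketch below uses Python's standard `urllib.robotparser`; the helper `is_allowed` and the sample URLs are hypothetical. It also shows that the rule is a path prefix match: in this parser, `/services` without the trailing slash is not covered by `Disallow: /services/`.

```python
from urllib.robotparser import RobotFileParser

# The Case 2 rules, written as consecutive lines.
CASE2_RULES = ["User-agent: *", "Disallow: /services/"]


def is_allowed(rules, agent, url):
    """Return True if `agent` may crawl `url` under the given rules."""
    parser = RobotFileParser()
    parser.parse(rules)
    return parser.can_fetch(agent, url)


for url in (
    "https://www.example.com/services/pricing",  # inside the blocked folder
    "https://www.example.com/about",             # outside the blocked folder
    "https://www.example.com/services",          # no trailing slash: not matched
):
    print(url, is_allowed(CASE2_RULES, "Googlebot", url))
```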

Case 3: Apply the rule to the entire website and block all crawlers from crawling it

User-agent: *
Disallow: /

Case 4: Block a specific crawler from crawling your website

User-agent: Bingbot
Disallow: /
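To confirm that a rule group like this affects only the named crawler, the same rules can be tested against several user-agents. This is a minimal sketch using Python's standard `urllib.robotparser`; the helper `blocked_agents` and the list of bot names are chosen only for illustration.

```python
from urllib.robotparser import RobotFileParser

# The Case 4 rules, written as consecutive lines.
CASE4_RULES = ["User-agent: Bingbot", "Disallow: /"]


def blocked_agents(rules, agents, url):
    """Return the agents from `agents` that may NOT crawl `url`."""
    parser = RobotFileParser()
    parser.parse(rules)
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


agents = ["Googlebot", "Bingbot", "DuckDuckBot"]
# Only the crawler named in the User-agent line is blocked.
print(blocked_agents(CASE4_RULES, agents, "https://www.example.com/"))  # ['Bingbot']
```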


Where to insert the robots.txt file:

After creating this file (for example in Notepad, saved in .txt format under the name robots.txt), you must upload it to the root directory of your website, so that it is reachable at www.example.com/robots.txt. Crawlers look for the robots.txt file only in the root directory; if they don’t find it there, they assume that the website does not have this file.
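Because the file always lives at the root, its address can be derived from any page URL on the same website. The sketch below does this with Python's standard `urllib.parse`; the function name `robots_txt_url` and the sample page URL are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Build the robots.txt URL for the website that hosts `page_url`."""
    parts = urlsplit(page_url)
    # Keep scheme and host, replace the path, and drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("https://www.example.com/services/pricing?plan=pro"))
# https://www.example.com/robots.txt
```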

How to insert the robots.txt file?

To learn how to insert it, click on this link.

