If you own a website, you should have a robots.txt file. A robots.txt file gives search engine spiders instructions on what to look at and what to skip. The file is easy to set up, and you will rarely need to edit it once it is done. Here is a short how-to guide to making a robots.txt file.
Basically, a robots.txt file tells search engines where not to go. Maybe your site has a page or folder with statistics or scripts, or, if you use a CMS, an admin folder that you do not want to show up in search engine listings. Whatever the reason, if you do not want spiders crawling part of your site, robots.txt is where you say so.
This file can be made with any plain text editor (like Notepad) and is saved as a plain text file named robots.txt. A word processor such as Word works only if you save the result as plain text. The file is then uploaded to the root folder of your site. Search engine spiders look for it there and read the commands in it before continuing through your site.
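Because the file lives in the root folder, its address is always the site's domain followed by /robots.txt, no matter which page a spider starts from. A minimal sketch of that rule in Python, where the function name robots_url and the example.com address are placeholders chosen for illustration:

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the address where spiders expect to find robots.txt
    for the site that page_url belongs to."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("https://www.example.com/blog/post.html"))
# prints https://www.example.com/robots.txt
```

If that address does not load in your browser after uploading, the file is probably in a subfolder rather than the root.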
The robots.txt file is a really simple file built from two kinds of lines:
User-agent:
Disallow:
The first line identifies the search engine spider you want to communicate with.
- User-agent: * – tells all spiders to follow the commands in the file
- User-agent: googlebot – tells Google only
- User-agent: scooter – tells Alta Vista only
You can find a list of specific search engine spider names here.
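To see how the User-agent line decides which spider a rule applies to, you can test a set of rules with the robots.txt parser in Python's standard library. This is only a sketch: the rules, the spider name somebot, and the example.com domain are made up for illustration, and real search engines may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: one group for Google's spider, one for everyone else.
RULES = """\
User-agent: googlebot
Disallow: /privatestats.htm

User-agent: *
Disallow: /administrator/
"""

SITE = "https://www.example.com"  # placeholder domain

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def allowed(spider: str, path: str) -> bool:
    """True if the named spider may fetch the given path under RULES."""
    return parser.can_fetch(spider, SITE + path)


for spider in ("googlebot", "somebot"):
    for path in ("/privatestats.htm", "/administrator/"):
        print(spider, path, allowed(spider, path))
```

In this sketch googlebot follows only the group that names it, so the /administrator/ rule in the * group does not apply to it. If you want a rule to cover a named spider as well as everyone else, repeat it in both groups.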
The second line tells the spider(s) to skip a certain URL on your site:
- Disallow: /privatestats.htm – will tell spider(s) to skip the web page privatestats.htm
You can also tell the spider(s) to skip whole folders of your site:
- Disallow: /administrator/ – will tell spider(s) to skip the folder administrator on your site
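The two Disallow examples above can be checked the same way before you upload the file. The following is a rough sketch using Python's built-in robots.txt parser; the helper name is_blocked, the spider name anybot, and the example.com domain are assumptions made for the example.

```python
from urllib.robotparser import RobotFileParser

# The two Disallow lines from above, applied to all spiders.
RULES = """\
User-agent: *
Disallow: /privatestats.htm
Disallow: /administrator/
"""


def is_blocked(path: str) -> bool:
    """True if a spider obeying RULES would skip the given path."""
    parser = RobotFileParser()
    parser.parse(RULES.splitlines())
    # "anybot" is a made-up spider name; it falls under the * group.
    return not parser.can_fetch("anybot", "https://www.example.com" + path)


print(is_blocked("/privatestats.htm"))         # the single page
print(is_blocked("/administrator/login.php"))  # a file inside the folder
print(is_blocked("/index.html"))               # not mentioned in RULES
```

A Disallow value is matched against the start of the path, which is why the folder rule also covers every file inside that folder.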
The robots.txt file is a very useful and powerful tool that lets you interact with the search engines that come to your site. It gives you a little control over the direction they take through your site. If you own a website, take the time to make a robots.txt file; it will be well worth it.











