What does robots.txt mean?


The robots exclusion standard, also known as the robots exclusion protocol or simply robots.txt, is a standard used by websites to communicate with web crawlers and other web robots. The standard specifies how to inform a web robot about which areas of the website should not be processed or scanned.



A slash after "Disallow" tells the robot not to visit any pages on the site.
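For example, a minimal robots.txt that blocks all crawlers from the entire site might look like this (the wildcard `*` in "User-agent" matches every crawler):

```
User-agent: *
Disallow: /
```

Leaving the "Disallow" value empty instead (`Disallow:`) would allow crawlers to visit everything.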


Understand the limitations of robots.txt

  • Robots.txt directives may not be supported by all search engines

The instructions in robots.txt files cannot enforce crawler behavior on your site; it is up to each crawler to obey them. While Googlebot and other reputable web crawlers obey the instructions in a robots.txt file, other crawlers might not. Therefore, if you want to keep information secure from web crawlers, it's better to use other blocking methods, such as password-protecting private files on your server.
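To see what "obeying" robots.txt looks like in practice, here is a sketch of how a well-behaved crawler might check the rules before fetching a URL, using Python's standard-library `urllib.robotparser`. The rules and the `example.com` URLs are made up for illustration:

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt body, parsed directly without a network fetch.
rules = """
User-agent: *
Disallow: /private/
""".splitlines()

parser = RobotFileParser()
parser.parse(rules)

# can_fetch(user_agent, url) reports whether the rules permit crawling that URL.
print(parser.can_fetch("*", "https://example.com/private/page.html"))  # False
print(parser.can_fetch("*", "https://example.com/index.html"))         # True
```

Note that nothing forces a crawler to run a check like this; a misbehaving bot can simply ignore the file, which is why robots.txt is not a security mechanism.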


  • Different crawlers interpret syntax differently

Although reputable web crawlers follow the directives in a robots.txt file, each crawler might interpret the directives differently. You should know the proper syntax for addressing different web crawlers, as some might not understand certain instructions.
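One common way to address crawlers individually is with separate "User-agent" groups. In this hypothetical example, a crawler identifying itself as Googlebot is kept out of `/private/`, while all other crawlers are allowed everywhere:

```
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow:
```

A crawler typically applies the most specific group that matches its name, so testing your file against the crawlers you care about is worthwhile.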







