What does robots.txt mean?
The Robots Exclusion Standard, also known as the Robots Exclusion Protocol or simply robots.txt, is a standard used by websites to communicate with web crawlers and other web robots. The standard specifies how to inform a web robot about which areas of the website should not be processed or scanned.
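For example, a site can tell all crawlers to stay away from every page by serving a robots.txt file like the following at its root (https://example.com/robots.txt, where example.com stands in for the real domain):

```
User-agent: *
Disallow: /
```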
In the example above, the slash after "Disallow" tells the robot not to visit any pages on the site.
Understand the limitations of robots.txt
- Robots.txt directives may not be supported by all search engines
The instructions in robots.txt files cannot enforce crawler behavior on your site; it is up to the crawler to obey them. While Googlebot and other respectable web crawlers obey the instructions in a robots.txt file, other crawlers might not. Therefore, if you want to keep information secure from web crawlers, it's better to use other blocking methods, such as password-protecting private files on your server (a sketch of this voluntary check appears after this list).
- Different crawlers interpret syntax differently
Although respectable web crawlers follow the directives in a robots.txt file, each crawler might interpret the directives differently. You should know the proper syntax for addressing different web crawlers, as some might not understand certain instructions (see the per-crawler example below).
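To see why robots.txt cannot enforce anything, here is a minimal sketch of the voluntary check a well-behaved crawler performs, using Python's standard urllib.robotparser module; the bot name, rules, and URLs are placeholders, not any real crawler's configuration:

```python
# Sketch: a polite crawler checks robots.txt before fetching a URL.
# The rules below are illustrative; a real crawler would download
# them from https://<site>/robots.txt instead of hard-coding them.
from urllib import robotparser

rules = """
User-agent: *
Disallow: /private/
""".splitlines()

parser = robotparser.RobotFileParser()
parser.parse(rules)

# The check is purely advisory: a misbehaving crawler can simply
# skip it and fetch /private/ anyway.
print(parser.can_fetch("MyBot", "https://example.com/private/secret.html"))  # False
print(parser.can_fetch("MyBot", "https://example.com/index.html"))           # True
```

Nothing in the protocol stops a crawler from skipping this check, which is why password protection, not robots.txt, is the right tool for sensitive files.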
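As for syntax differences, the safest approach is to address crawlers explicitly with separate User-agent groups and to keep the directives simple, since extensions such as wildcards in paths are not supported by every crawler. A hypothetical example (the paths are placeholders):

```
# Rules that apply only to Google's crawler
User-agent: Googlebot
Disallow: /drafts/

# Fallback rules for all other crawlers
User-agent: *
Disallow: /private/
```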