Google Search: “robots.txt” “Disallow:” filetype:txt
James rates this entry 5 out of 10.
Submitted: 2004-03-04 14:36:03
Added by: James
The robots.txt file serves as a set of instructions for web crawlers. The “Disallow” directive tells a crawler which paths NOT to visit, for whatever reason. Hackers will always look in those places first!
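To see what a Disallow line actually exposes, here is a minimal Python sketch that pulls the disallowed paths out of a robots.txt body. The sample file contents are hypothetical, invented for illustration — not from any real site.

```python
# Hypothetical robots.txt contents for demonstration only.
sample = """\
User-agent: *
Disallow: /admin/
Disallow: /backup/
Allow: /public/
"""

def disallowed_paths(robots_txt: str) -> list[str]:
    """Return the paths listed under Disallow directives."""
    paths = []
    for line in robots_txt.splitlines():
        # Directive names are case-insensitive by convention.
        key, _, value = line.partition(":")
        if key.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths

print(disallowed_paths(sample))  # → ['/admin/', '/backup/']
```

Each printed path is a directory the site owner asked crawlers to skip — exactly the list an attacker reads first.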
2004-06-18 03:16:09 (Coldhak): Awesome, yields some very “useful” information
2004-06-18 19:42:35 (Anonymous): second item in results is: www.whitehouse.gov/robots.txt
2004-09-28 11:10:11 (wittaboom): very good stuff to find out here... thankx
2006-02-26 13:45:40 (viral169): interesting paradox that we're using a bot-based search engine to find sites that are trying to repel bots/crawlers :P
2006-12-27 22:22:50 (BeenJammin): yeah, I don't quite get the point of this one. I know a site can keep certain pages out of Google by listing them in its robots.txt, but on most of these sites, if you put the disallowed directory into the URL it usually says you're unauthorized anyway