[Info-vax] proper file format, attributes for non-binary files served by a web server

Phillip Helbig---undress to reply helbig at astro.multiCLOTHESvax.de
Mon Jan 23 03:30:29 EST 2012


In article <4f1cdd3a$0$283$14726298 at news.sunsite.dk>,
Arne Vajhøj <arne at vajhoej.dk> writes:

> On 1/21/2012 11:20 AM, Phillip Helbig---undress to reply wrote:
> > It would be nice if the access log of the web server provided an
> > indication of whether something was a robot or not.  (For the common
> > ones, one can see this from the name, of course.)  One could then use
> > SEARCH to come up with a list of pages the search engines are hitting.
> 
> Impossible.
> 
> Anyone can write a crawler putting whatever they want in UA.

Right.  And anyone can write a crawler which ignores robots.txt.  But if
it is a polite crawler which abides by robots.txt, it isn't too much to
ask that it also follow some other rule which makes its presence obvious
in the access logs.
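
As a rough illustration of the log-filtering idea discussed above, here is
a minimal sketch in Python.  It assumes the access log is in the common
"combined" format (the server in question may well use a different one),
and the marker list ROBOT_MARKERS and the function name robot_hits are
made up for this example.  As noted in the thread, it can only catch
crawlers that identify themselves honestly in the User-Agent field.

```python
import re
from collections import Counter

# Hypothetical markers; real crawlers may use other names or lie entirely.
ROBOT_MARKERS = ("bot", "crawler", "spider", "slurp")

# Assumes the "combined" log format:
#   ... "METHOD path PROTO" status size "referer" "user-agent"
LINE_RE = re.compile(
    r'"(?:GET|HEAD|POST) (?P<path>\S+)[^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)

def robot_hits(lines):
    """Count requested paths for lines whose User-Agent looks like a robot."""
    hits = Counter()
    for line in lines:
        m = LINE_RE.search(line)
        if m is None:
            continue  # line is not in the assumed format
        ua = m.group("ua").lower()
        if any(marker in ua for marker in ROBOT_MARKERS):
            hits[m.group("path")] += 1
    return hits

# Two invented sample lines: one self-identified crawler, one browser.
sample = [
    '192.0.2.1 - - [21/Jan/2012:11:20:00 +0100] "GET /index.html HTTP/1.1" '
    '200 1234 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; '
    '+http://www.google.com/bot.html)"',
    '192.0.2.2 - - [21/Jan/2012:11:21:00 +0100] "GET /about.html HTTP/1.1" '
    '200 987 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
]

for path, count in robot_hits(sample).items():
    print(count, path)
```

Matching on substrings of the User-Agent is the same thing one would do
by eye or with SEARCH; a crawler that sends a browser-like string will
pass through unnoticed.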



