Tuesday, October 31, 2006

Plea for Help

{ Dodger - Our good friend DH has a problem that he brought up in the comments on the previous post, and I thought it deserved a post of its own. This problem mainly affects those who run their blogs on their own domains, but that is an increasing number of us. Anyone who knows of something that can help, please let us know. DH has promised to also share the solutions he has found. Thanks. }

This is more of a plea for help and a rant from an upset, abused person! I'm probably one week away from shutting down my blog and moving it somewhere else. Another server and domain, perhaps? Why, you ask, and why am I discussing it here? Well, it is a form of censorship. Bots...

Name                                     Hits   Bandwidth   Date
Inktomi Slurp                            1315   47.10 MB    23 Oct 2006 - 20:24
MSNBot                                    815   30.45 MB    27 Oct 2006 - 09:24
WISENutbot                                438   16.67 MB    27 Oct 2006 - 09:50
Unknown robot (identified by 'spider')    364    2.97 MB    24 Oct 2006 - 23:04
Unknown robot (identified by 'crawl')     345   13.27 MB    26 Oct 2006 - 18:50
LinkWalker                                209    2.78 MB    02 Oct 2006 - 03:57
AskJeeves                                 147    5.02 MB    27 Oct 2006 - 07:01

Trying to get a handle on this through server control files like .htaccess or robots.txt, working out at what level they should reside and how long the bots take to obey them, is like performing magic.

While I'm mastering this, users are often kept out, I lose posts, and basically, when something is not working right, whether it is speed or comments, people just go away. In my mind it is a form of censorship, both for me and for other users. Sure, I'm flattered to have references to my blog sought out by the different engines, but:
A. I never asked for it.
B. For some people it is a bandwidth problem.
C. It slows the system to a crawl.
D. It deters users from revisiting.

Just to name a few.
So can we get all the brainiacs together and put up a formula for people who spend US$125 or more to have access to their own server and register a domain? What is the proper syntax, which files should be modified, and where should they reside? Are there other tricks out there that deter the bots? Can you turn them all off, or just some? Can you let them crawl in certain places and not others?
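
One possible starting point for the syntax question, offered as a sketch rather than a tested recipe: robots.txt is a plain-text file that crawlers look for at the root of the domain (for example /robots.txt). Each block names a crawler with User-agent and lists the path prefixes it should stay out of with Disallow. The Python snippet below uses the standard library's urllib.robotparser to check how a sample file would be read. The /comments/ path and the 10-second Crawl-delay are made-up values for illustration, and Crawl-delay is a non-standard directive that only some crawlers honor. Compliance is voluntary, so a bot that ignores robots.txt has to be blocked on the server side instead.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents. The bot name comes from the stats above;
# the /comments/ path and the delay value are assumptions for illustration.
ROBOTS_TXT = """\
User-agent: LinkWalker
Disallow: /

User-agent: *
Crawl-delay: 10
Disallow: /comments/
"""


def build_parser(text):
    """Parse robots.txt text and return a parser that can answer queries."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Turn one bot off completely.
print("LinkWalker may fetch /index.html:",
      parser.can_fetch("LinkWalker", "/index.html"))

# Let every other bot crawl some places and not others.
print("msnbot may fetch /index.html:",
      parser.can_fetch("msnbot", "/index.html"))
print("msnbot may fetch /comments/1:",
      parser.can_fetch("msnbot", "/comments/1"))

# Requested pause between hits, in seconds, for bots that honor it.
print("msnbot crawl delay:", parser.crawl_delay("msnbot"))
```

Read this way, LinkWalker is asked to stay off the whole site, while every other bot is only asked to stay out of /comments/ and to slow down. That covers "all or just some" and "certain places and not others", at least for the bots that play by the rules.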

Why don't they schedule them by time zone and hit at 3 a.m., when there may be less traffic?

This may be only indirectly related to your cause, but I'm sure solutions could help many out there. I hope we get something going on this.


