All Narfed Up photography and words by Bryan Villarin

All bots banned

Update: Harsh ban removed

For search engine bots and anything else that obeys robots.txt, I’ve disallowed access to my whole server. I’ll keep this up for a month, to let the cached copies clear out, then re-enable access (except for any image directories). I’ll be better about this stuff in the future.
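Here’s a rough sketch of what those two setups mean for a well-behaved bot, using Python’s standard `urllib.robotparser`. The `example.com` host, the `/images/` directory, and the `can_crawl` helper are made up for illustration; they aren’t my real paths.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# The month-long ban: every path is off limits to every obedient bot.
BAN_ALL = """\
User-agent: *
Disallow: /
"""

# What the file might look like afterwards (assumed directory name).
IMAGES_ONLY = """\
User-agent: *
Disallow: /images/
"""

SITE = "http://example.com"  # placeholder host

for label, rules in (("ban all", BAN_ALL), ("images only", IMAGES_ONLY)):
    for path in ("/", "/blog/post", "/images/photo.jpg"):
        print(label, path, can_crawl(rules, SITE + path))
```

This only says what a bot that obeys robots.txt would do. Anything that ignores the file isn’t affected.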

Why am I doing this? Maybe it’ll clean up my logs, especially entries for pages I probably don’t want accessed anymore. Besides, I think most of my traffic comes from people going straight to my blog. The forums are pretty much dead. The old site is still up because some of its pages are still being accessed, but I’ll probably just redirect them all to a single page stating that it was taken offline.

So again, all bots are banned. I’ll rethink and redo everything a month from now.

If this is all jumbled and confusing, it’s because I’m kind of doing all this off the top of my head. If you want to take a look at my tiny robots.txt file, go ahead. Do I need to do anything different for subdomains? Do they need their own robots.txt file? Or will that single slash cover everything under the sun?

 

6 Comments

And, for those robots that disobey robots.txt, I highly recommend Bad Behavior: http://www.ioerror.us/software/bad-behavior/

Posted by MacManX on 25 May 2005 @ 12am

Oh, and subdomains will need their own robots.txt file.

Posted by MacManX on 25 May 2005 @ 12am

(sorry for typing faster than I think)

And, that single slash will cover everything under each domain, so your robots.txt is kind of redundant. All you need is this (one robots.txt file for each domain/subdomain):

User-agent: *
Disallow: /

Posted by MacManX on 25 May 2005 @ 12am
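To illustrate the one-file-per-subdomain point: crawlers look for robots.txt at the root of whatever host they are visiting, so each subdomain resolves to its own file. A minimal Python sketch, where the hostnames and the `robots_url` helper are placeholders rather than anything from this site:

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Build the robots.txt location that governs `page_url`."""
    parts = urlsplit(page_url)
    # Keep scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Hypothetical hosts: the main domain and a forums subdomain.
print(robots_url("http://example.com/blog/post?x=1"))
print(robots_url("http://forums.example.com/thread/1"))
```

The two calls give different locations, which is why a `Disallow: /` on the main domain says nothing about the subdomain.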

I can’t help but think this will have ill effects.

Anyways. What is wrong with people coming in from pages that do not exist? I suppose it matters much less in my case, because all 404 pages are custom. I even create custom ones for certain subdomains.

Posted by Anthony on 25 May 2005 @ 3am

@James: all the changes you’ve suggested have been implemented. I installed Bad Behavior right afterwards, too. Do you check your logs at all?

@Anth: What do you think will go wrong?

Posted by Bryan on 25 May 2005 @ 6am

I check my Bad Behavior logs every day and have them set to only store entries that are up to 2 days old. Some days, the log only has 20 entries in it. On days like yesterday, it may fill up with over 600. Sometimes you’ll see things that look like false positives, but if you send the developer the MySQL data for that entry, he’ll look into it for you. So far, everything that I thought was a false positive turned out to be a bad bot.

Posted by MacManX on 25 May 2005 @ 11am
