The Attack of the Spiders from the Clouds

We have seen a lot of discussion of cloud computing in the news recently, described as a technology that permits “users to access technology-enabled services without knowledge of, expertise with, nor control over the technology infrastructure that supports them.” This sounds great, doesn’t it? Users with little to no IT expertise can log into the cloud and launch 8 server instances with the equivalent of 16 high-performance CPU cores. However, as we all know, all things, including cool technologies, have the potential for both good and evil, opportunity and threat, and cloud computing is no different.
It just so happens that I have been experimenting with Amazon Elastic Compute Cloud (EC2), documented in Computing in the Clouds with AWS over at The CEP Blog. The server over at The UNIX and Linux Forums has been experiencing high, hardware-limited load averages recently, so we thought we should take a look at moving the forum server up to the clouds.
Then a fellow system admin over at the forums suggested that maybe some rogue bots were causing the high server loads, so I wrote a one-line command to do a bit of real-time spider hunting in the Apache2 logfiles. Surprise! I found a number of rogue, hungry spiders that would not follow our robots.txt directive not to crawl the site. One of the bots was from Russia, one was from China, and another was from Korea. There were spiders from places I had never heard of, all consuming precious resources and denying them to our users!
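The one-liner itself is not reproduced in this post, so the following is only a minimal sketch of that kind of spider hunting, not the command actually used. It assumes the Apache "combined" log format, where the client address is the first whitespace-separated field and the user agent is the sixth double-quote-delimited field. The function names top_clients and top_agents are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers for spotting heavy crawlers in an Apache access log.
# Assumption: the log uses the "combined" format.

# Print the busiest client addresses, most active first.
# $1 = log file, $2 = how many rows to show (defaults to 10)
top_clients() {
    log_file="$1"
    limit="${2:-10}"
    awk '{ print $1 }' "$log_file" | sort | uniq -c | sort -rn | head -n "$limit"
}

# Print the most frequent user-agent strings, most active first.
# Splitting on double quotes makes the user agent field 6 in the combined format.
top_agents() {
    log_file="$1"
    limit="${2:-10}"
    awk -F'"' '{ print $6 }' "$log_file" | sort | uniq -c | sort -rn | head -n "$limit"
}
```

After sourcing the file, something like top_clients /var/log/apache2/access.log 20 lists the twenty busiest addresses. That path is the usual Debian/Ubuntu default and is an assumption here. Addresses with very high counts that also ignore robots.txt are candidates for a closer look.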
So, I did what any Linux admin would do. I used iptables to block the networks of these rogue, hungry spiders (sorry, I was not very kind to these cyber creatures). It probably comes as no surprise at this point in the story that four of the spiders were from the Amazon EC2 cloud. Here is a sample of the output from iptables -L:
root@www:~# iptables -L
Chain INPUT (policy ACCEPT)
target prot opt source destination
DROP all -- ec2-67-202-45-0.compute-1.amazonaws.com/24
DROP all -- ec2-75-101-243-0.compute-1.amazonaws.com/24
DROP all -- ec2-75-101-197-0.compute-1.amazonaws.com/24
DROP all -- ec2-75-101-213-0.compute-1.amazonaws.com/24
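The post shows only the resulting rules, not the commands that created them. As a rough sketch, rules like these are typically added with iptables -A INPUT -s NETWORK -j DROP. The snippet below only prints such commands and does not run them. The /24 networks are inferred from the EC2 hostnames in the listing (for example, ec2-67-202-45-0 read as 67.202.45.0), which is an assumption, and the helper name block_cmd is hypothetical.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print (not execute) the iptables command
# that would drop all inbound traffic from one source network.
block_cmd() {
    printf 'iptables -A INPUT -s %s -j DROP\n' "$1"
}

# Networks inferred from the hostnames in the iptables -L listing (assumption).
for net in 67.202.45.0/24 75.101.243.0/24 75.101.197.0/24 75.101.213.0/24; do
    block_cmd "$net"
done
```

Reviewing the printed commands before running them as root is a cheap safeguard, since a mistyped network can lock out legitimate users. Whether the original rules were appended with -A or inserted with -I is not stated in the post.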

Author: wpadmin