Ask.com and Their bot!
by Harun Yayli on Wednesday, April 18th, 2007 at 6:25 pm under Ask, Internet, Sitemaps
After their initiative to put sitemaps into robots.txt, I recently posted about how to identify robots and decide whether to serve them the sitemap or not. I believe it’s extremely important for webmasters to protect themselves from site scrapers. A sitemap listed in robots.txt is like a highway sign pointing to the easy way to scrape a site.
With this idea in mind, I also added a small snippet to my sitemap that checks whether the bot comes from a company I’m willing to serve the sitemap to.
It’s been 3-4 days, and I received a warning from my webserver that someone I don’t know tried to get my sitemap.
The IP was 65.119.214.9. Its reverse DNS resolves to ext9.eds.jeeves.ask.info, but there is no IP defined for that host name!
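The check that failed here is usually called forward-confirmed reverse DNS: look up the host name for the IP, make sure it sits under a domain you trust, then look the host name back up and confirm it returns the same IP. Below is a minimal sketch of that idea in Python. It is not the snippet I actually run on my sitemap; the function name, the status strings and the `.ask.info` suffix are assumptions made for illustration, and it only covers IPv4.

```python
import socket


def check_crawler_ip(ip, allowed_suffixes, reverse_lookup=None, forward_lookup=None):
    """Forward-confirmed reverse DNS check; returns a short status string.

    Hypothetical helper: the name, parameters and status strings are
    illustrative, not taken from any crawler's documentation.
    """
    # Default resolvers use the system DNS; they can be swapped out for testing.
    if reverse_lookup is None:
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]

    # Step 1: IP -> host name (PTR record).
    try:
        host = reverse_lookup(ip).lower().rstrip(".")
    except OSError:
        return "no-reverse-dns"

    # Step 2: the host name must end with a suffix we trust.
    # Suffixes should start with a dot so "evilask.info" does not match ".ask.info".
    if not any(host.endswith(suffix) for suffix in allowed_suffixes):
        return "untrusted-domain"

    # Step 3: host name -> IPv4 addresses (A records); the original IP must be among them.
    try:
        addresses = forward_lookup(host)
    except OSError:
        return "no-forward-dns"
    return "verified" if ip in addresses else "forward-mismatch"
```

Called as `check_crawler_ip("65.119.214.9", [".ask.info"])`, a setup like the one described above, where the host name has no address record, would be expected to end up in the "no-forward-dns" branch, so the request would not be treated as a verified bot.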
So I started to dig around to see who the IP belongs to.
Well, it actually was from Ask.com, or from someone who appears to be related to Ask.com.
I was surprised by the misconfiguration, so I filled out the crawler feedback form at Ask.com to report it.
I received a response from them on Sunday in which they claimed the bot as theirs, but I got hit by the same bot again today. Guess what! Still no IP defined for the rDNS host name of the same IP.
Their webmaster’s guide shows how to identify the bot, but their own DNS isn’t even properly configured! What a shame.
I’ve just sent an email to their DNS admin about this. Let’s see what their response will be, or whether I’ll ever get a response at all.
I’ll post the updates…