Ask.com and Their bot!

by Harun Yayli on Wednesday, April 18th, 2007 at 6:25 pm under Ask, Internet, Sitemaps

After their initiative to put sitemaps into robots.txt, I recently posted about how to identify robots and decide whether to serve them the sitemap. I believe it’s extremely important for webmasters to protect themselves from site scrapers. A sitemap listed in robots.txt is like a highway sign pointing to the easy way to scrape a site.
With this idea in mind, I also added a small snippet to my sitemap that checks whether the bot is from a company I’m willing to serve the sitemap to.

It’s been 3-4 days, and I received a warning from my web server that someone I don’t know tried to fetch my sitemap.
The IP was 65.119.214.9. Reverse DNS resolves it to ext9.eds.jeeves.ask.info, but no IP is defined for that host name!
So I started digging around the IP to see who it belongs to.
Well, it actually was from Ask.com, or from someone who looks related to Ask.com.
I was surprised by the misconfiguration, so I filled out the crawler feedback form at Ask.com to report it.
I received a response from them on Sunday claiming the bot, but I got hit by the same bot again today. Guess what! Still no IP defined for the rDNS name of the same IP.
They show how to identify the bot in the webmaster’s guide on their site, but they are not even properly configured themselves! What a shame.
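For anyone who wants to run the same kind of check, here is a minimal sketch of a forward-confirmed reverse DNS test in Python. It is not the snippet I use on my sitemap, and the function name `verify_crawler_ip`, the `allowed_suffixes` list and the two injectable lookup parameters are all made up for illustration. The suffix you pass in is your own assumption about which domain the crawler should come from.

```python
import socket


def verify_crawler_ip(ip, allowed_suffixes, reverse_lookup=None, forward_lookup=None):
    """Return (ok, detail) for a forward-confirmed reverse DNS check.

    All names here are hypothetical. The lookups are injectable so the
    logic can be exercised without touching the network.
    """
    if reverse_lookup is None:
        # gethostbyaddr returns (hostname, aliases, addresses)
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        # gethostbyname_ex returns (hostname, aliases, ipv4_addresses)
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]

    # Step 1: IP -> host name (the rDNS lookup).
    try:
        hostname = reverse_lookup(ip)
    except OSError:  # socket.herror and socket.gaierror are OSError subclasses
        return False, "no reverse DNS record"
    hostname = hostname.rstrip(".").lower()

    # Step 2: the host name must belong to a domain we accept.
    if not any(hostname == s or hostname.endswith("." + s) for s in allowed_suffixes):
        return False, "host name %s is not in an allowed domain" % hostname

    # Step 3: host name -> IPs (the forward lookup). This is the step
    # that fails when no IP is defined for the host name.
    try:
        addresses = forward_lookup(hostname)
    except OSError:
        return False, "host name %s does not resolve" % hostname

    # Step 4: the forward lookup must lead back to the original IP.
    if ip not in addresses:
        return False, "host name %s does not map back to %s" % (hostname, ip)
    return True, hostname
```

Calling `verify_crawler_ip("65.119.214.9", ["ask.info"])` would do the real lookups. The point of steps 3 and 4 is that anyone who controls the reverse zone for an IP can make it claim any host name, so the reverse lookup alone proves nothing.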
I’ve just sent an email to their DNS admin about this. Let’s see what their response will be, or whether I’ll ever get a response at all.
I’ll post the updates…
