Why on earth does Google want to FTP into your server?

Recently I discovered that Google bots were being blocked after visiting my server.

When I examined the log files, I found that the Google bots were not just visiting; they were attempting to log in to the server over FTP, which perplexed me.

After several such blocks, my firewall’s rules blocked an entire class C range (a /24) of Google’s network.

The logs look like this:

66.249.71.13 # lfd: (ftpd) Failed FTP login from 66.249.71.13 (US/United States/crawl-66-249-71-13.googlebot.com): 1 in the last 300 secs – Tue Apr 17 04:58:43 2012

66.249.71.238 # lfd: (ftpd) Failed FTP login from 66.249.71.238 (US/United States/crawl-66-249-71-238.googlebot.com): 1 in the last 300 secs – Tue Apr 17 05:03:23 2012

66.249.71.73 # lfd: (ftpd) Failed FTP login from 66.249.71.73 (US/United States/crawl-66-249-71-73.googlebot.com): 1 in the last 300 secs – Tue Apr 17 05:34:46 2012

66.249.71.0/24 # lfd: (NETBLOCK) 66.249.71.0/24 has had more than 4 blocks in the last 86400 secs – Tue Apr 17 05:34:36 2012
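To make the escalation from single IPs to a whole /24 easier to follow, here is a minimal Python sketch that groups failed FTP logins by their /24 network and reports any network over a threshold. It is not how lfd itself works internally. The function names are hypothetical, the regular expression only assumes the line format shown above, and the 86400-second window is ignored for simplicity. The default threshold of 4 is taken from the NETBLOCK line.

```python
import ipaddress
import re
from collections import defaultdict

# Matches the leading IPv4 address of a "Failed FTP login" line
# in the format shown in the log excerpt (an assumption about the format).
FAILED_FTP = re.compile(
    r"^(\d{1,3}(?:\.\d{1,3}){3}) # lfd: \(ftpd\) Failed FTP login"
)


def blocks_per_netblock(lines):
    """Count failed-FTP entries per /24 network (hypothetical helper)."""
    counts = defaultdict(int)
    for line in lines:
        match = FAILED_FTP.match(line.strip())
        if not match:
            continue  # skip NETBLOCK lines and anything unrecognized
        # strict=False lets us pass a host address and get its /24 back
        network = ipaddress.ip_network(match.group(1) + "/24", strict=False)
        counts[str(network)] += 1
    return dict(counts)


def netblocks_over_threshold(lines, threshold=4):
    """Return the /24 networks with more than `threshold` entries."""
    counts = blocks_per_netblock(lines)
    return sorted(net for net, n in counts.items() if n > threshold)


sample = [
    "66.249.71.13 # lfd: (ftpd) Failed FTP login from 66.249.71.13 (US/United States/crawl-66-249-71-13.googlebot.com): 1 in the last 300 secs – Tue Apr 17 04:58:43 2012",
    "66.249.71.238 # lfd: (ftpd) Failed FTP login from 66.249.71.238 (US/United States/crawl-66-249-71-238.googlebot.com): 1 in the last 300 secs – Tue Apr 17 05:03:23 2012",
    "66.249.71.73 # lfd: (ftpd) Failed FTP login from 66.249.71.73 (US/United States/crawl-66-249-71-73.googlebot.com): 1 in the last 300 secs – Tue Apr 17 05:34:46 2012",
    "66.249.71.0/24 # lfd: (NETBLOCK) 66.249.71.0/24 has had more than 4 blocks in the last 86400 secs – Tue Apr 17 05:34:36 2012",
]

print(blocks_per_netblock(sample))
```

All three addresses in the excerpt fall into 66.249.71.0/24, which is why a handful of individual blocks was enough to take out the whole range.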

As I have anonymous FTP turned off, I resorted to searching Google about its own snooping behavior. It turns out this is not uncommon: others have reported Google bots attempting to FTP into their servers.

A response to a similar question was given by a Google technician from Switzerland, who stated:

“When we find links to FTP content, we’ll generally attempt to crawl those URLs. If they’re publicly accessible and return normal content, we may choose to index them as well. While it’s not that common, there are occasionally queries where a file on an FTP server is a good result. […] If you wish to block crawling of a public FTP server, you can use the robots.txt file just as you would on a normal website. If your FTP server isn’t publicly accessible, then you wouldn’t need to do anything specific to prevent that content from being indexed (as it can’t be accessed).”

The problem is that there is no public FTP on my server, so neither the stated reason for the Google bots’ visits nor the suggested method of blocking them applies.

I resorted to checking my logs frequently and manually removing the entries where Google had been blocked. It seems that Google keeps trying to unlock a virtual door that has no keyhole.

Comments

  1. Wouldn’t surprise me if they were trying to get into your cell phone to sniff out who you talk to, when, for how long and how to best use your personal information to profit. Oh wait.. they’re already doing that (http://androidandme.com/2012/03/news/google-wants-to-monitor-your-phone-calls-background-noise-to-better-serve-you-ads/).
    There is no limit to how much data they want about you.

  2. Wow, Luc, now that’s creepy.

  3. Creepy indeed. The article is interesting, but the parody video by The Onion at the bottom is extremely funny (http://www.youtube.com/watch?feature=player_embedded&v=Xtuxax8Dtk4)

  4. Luc – Parody often tells the truth 😉 Quite scary.
