It looks like someone at Apple is running a web crawler written in Go.
The first hit I could find was from 17.147.18.33 on the 15th of October:
17.147.18.33 - - [15/Oct/2014:03:05:42 +0200] "GET /robots.txt HTTP/1.1" 301 185 "-" "Mozilla/5.0 (compatible; Fetcher/0.1)"
All IPs starting with "17." are Apple's:
NetRange: 17.0.0.0 - 17.255.255.255
CIDR: 17.0.0.0/8
NetName: APPLE-WWNET
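A quick way to check whether a logged address falls inside that block is a CIDR containment test. The following is a minimal Go sketch; the function name inAppleRange is an illustrative choice, and the range is hard-coded from the WHOIS output above.

```go
package main

import (
	"fmt"
	"net"
)

// inAppleRange reports whether addr lies inside 17.0.0.0/8,
// the range listed as APPLE-WWNET in the WHOIS output.
// The function name is hypothetical, chosen for this example.
func inAppleRange(addr string) bool {
	_, appleNet, err := net.ParseCIDR("17.0.0.0/8")
	if err != nil {
		return false
	}
	ip := net.ParseIP(addr)
	return ip != nil && appleNet.Contains(ip)
}

func main() {
	// The first two addresses are the ones seen in the log excerpts;
	// the third is an arbitrary address outside the range.
	for _, addr := range []string{"17.147.18.33", "17.147.18.35", "192.0.2.1"} {
		fmt.Println(addr, inAppleRange(addr))
	}
}
```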
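It identifies itself as "Mozilla/5.0 (compatible; Fetcher/0.1)". Interestingly,
it seems to have a bug, which is how I found out it was written in Go. When
following redirects, it does not set the "User-Agent" header, so the default
value of Go's http package shows up instead. In the case of catenacycling.com,
the canonical domain name does not have the "www". Requesting
http://www.catenacycling.com/robots.txt redirects you to the same path on the
host without the "www":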
17.147.18.33 - - [15/Oct/2014:03:05:44 +0200] "GET /robots.txt HTTP/1.1" 200 404 "http://www.catenacycling.com/robots.txt" "Go 1.1 package http"
I am not sure if that is a bug in the crawler, or in the Go http package.
So far, I have seen requests from two IPs: 17.147.18.33 (7 on 2014-10-15) and
17.147.18.35 (about 3000 and counting on 2014-11-06, today).
17.147.18.35 - - [06/Nov/2014:08:10:29 +0100] "GET /robots.txt HTTP/1.1" 301 185 "-" "Mozilla/5.0 (compatible; Fetcher/0.1)"
17.147.18.35 - - [06/Nov/2014:08:10:30 +0100] "GET /robots.txt HTTP/1.1" 200 403 "http://www.catenacycling.com/robots.txt" "Go 1.1 package http"
17.147.18.35 - - [06/Nov/2014:08:10:36 +0100] "GET / HTTP/1.1" 301 185 "-" "Mozilla/5.0 (compatible; Fetcher/0.1)"
17.147.18.35 - - [06/Nov/2014:08:10:40 +0100] "GET /en HTTP/1.1" 200 93350 "http://www.catenacycling.com" "Go 1.1 package http"
17.147.18.35 - - [06/Nov/2014:08:11:03 +0100] "GET /robots.txt HTTP/1.1" 200 403 "-" "Mozilla/5.0 (compatible; Fetcher/0.1)"
17.147.18.35 - - [06/Nov/2014:08:11:05 +0100] "GET /en/ride-the-world/routes HTTP/1.1" 200 6379 "-" "Mozilla/5.0 (compatible; Fetcher/0.1)"
17.147.18.35 - - [06/Nov/2014:08:11:09 +0100] "GET /en/ride-the-world/climbs HTTP/1.1" 200 6841 "-" "Mozilla/5.0 (compatible; Fetcher/0.1)"
17.147.18.35 - - [06/Nov/2014:08:11:14 +0100] "GET /en/calendar/events HTTP/1.1" 200 9333 "-" "Mozilla/5.0 (compatible; Fetcher/0.1)"
…
The requests are only for the HTML pages, not the CSS, JavaScript or image files.
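Counts like the ones above can be pulled out of the access log with a small parser. The sketch below assumes the log layout seen in the excerpts (IP, timestamp, request, status, size, referer, user agent); the hit type, parseLine and countByAgent are illustrative names, and the regular expression is only as strict as those excerpts require.

```go
package main

import (
	"fmt"
	"regexp"
)

// logLine matches the layout of the excerpts:
// ip - - [time] "request" status bytes "referer" "user-agent"
var logLine = regexp.MustCompile(
	`^(\S+) \S+ \S+ \[([^\]]+)\] "([^"]*)" (\d{3}) (\S+) "([^"]*)" "([^"]*)"$`)

// hit holds the fields of one parsed log line (hypothetical type).
type hit struct {
	IP, Time, Request, Status, Referer, UserAgent string
}

// parseLine returns the parsed fields and false if the line does not match.
func parseLine(line string) (hit, bool) {
	m := logLine.FindStringSubmatch(line)
	if m == nil {
		return hit{}, false
	}
	return hit{
		IP: m[1], Time: m[2], Request: m[3],
		Status: m[4], Referer: m[6], UserAgent: m[7],
	}, true
}

// countByAgent tallies requests per (IP, user agent) pair,
// skipping lines that do not parse.
func countByAgent(lines []string) map[[2]string]int {
	counts := make(map[[2]string]int)
	for _, line := range lines {
		if h, ok := parseLine(line); ok {
			counts[[2]string{h.IP, h.UserAgent}]++
		}
	}
	return counts
}

func main() {
	lines := []string{
		`17.147.18.33 - - [15/Oct/2014:03:05:42 +0200] "GET /robots.txt HTTP/1.1" 301 185 "-" "Mozilla/5.0 (compatible; Fetcher/0.1)"`,
		`17.147.18.33 - - [15/Oct/2014:03:05:44 +0200] "GET /robots.txt HTTP/1.1" 200 404 "http://www.catenacycling.com/robots.txt" "Go 1.1 package http"`,
	}
	for key, n := range countByAgent(lines) {
		fmt.Printf("%s %q: %d\n", key[0], key[1], n)
	}
}
```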
Does anyone have an idea whether this is an official Apple project, or just
someone crawling the web from their workplace at Apple? And what is its
purpose?
https://twitter.com/janmoesen or apple-crawler@moesen.nu