I wrote about rolling your own server logs a while back, but it only occurred to me while at GDC that by updating the .htaccess file I can redirect 404 errors to my own PHP document. That document displays the 404 error and also logs when misdirections occur on my site (since the requested document is in the GET request).
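A minimal sketch of the kind of .htaccess change described above, assuming an Apache server and a handler script named /404.php (a made-up name for illustration, not necessarily the one used on this site):

```apache
# Assumption: the PHP handler lives at /404.php in the site root (hypothetical name).
# Using a local path rather than a full http:// URL makes Apache serve the handler
# internally, so the visitor still receives a 404 status instead of a redirect.
ErrorDocument 404 /404.php
```

With a local path like this, the originally requested path is typically available to the PHP script as $_SERVER['REQUEST_URI'], which is what makes the logging possible.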
The conversation with Rob went something like this:
Jeff: Man, why didn't I think of this before???
Rob: 'cuz you're an idiot.
Jeff: Right, and why didn't you think of it before, ass?
Rob: 'cuz you're an idiot.
That's the kind of trip it was...wisecracks all the way around.
Anyway, having made this modification Sunday night after getting back from GDC, I'm seeing a couple of interesting things:
- I started seeing all the requests for favicon.ico
This started out as an Internet Explorer feature and has become a de facto standard in most browsers: the browser requests the favicon.ico file and shows that icon next to the entry in your bookmarks/favourites menu. As a result, I made a quickie icon (that looks like crap, sorry). Maybe I'll make a nicer one when I have some more time to blow.
- I see requests for "private" files that haven't existed for a long time.
These are coming from Googlebot and MSNBot. What's interesting is that the files were known only to me and one other person. That other person had installed the Google Page Rank utility, which feeds Google the locations of the pages he was browsing. I can understand how Google came up with those URLs, but how did MSNBot? Is MSNBot spidering Google content?
It makes me wonder what other things the bots are spidering that I don't know about...I should move some of my "private" web files around and see what other 404's pop up.
Anyway, I'm happy with the little modification that I made. It only took one line of code in .htaccess and a couple of copy/paste actions to create the new 404 document, and now the logs give me more information than I had before. Of course, I'm still missing image downloads and the full spread of things that the bots are spidering. Oh, and I should secure the server log web interface too...a cheap-ass web developer's work is never done...
… OR you can upgrade to better hosting. 🙂 As Rob says, we love re-inventing the wheel.