trygvis at opencsw.org
Tue Sep 1 01:47:59 CEST 2009
Philip Brown wrote:
> 2009/8/31 Trygve Laugstøl <trygvis at opencsw.org>:
>> Philip Brown wrote:
>> The fact that they list both is, to me, a
>> sign that they store both. How would they determine the canonical one
> I would say that the common-sense answer is, if "page", and
> "page.anything" exists (and the content is the same), then "page"
> should always be canonical.
> after all, if "site.com/docs" exists, and "site.com/docs.php"
> exists... odds are that "site.com/docs" will ALWAYS exist.. but
> someday, they may change the backend to be .pl, or .cgi, or
>>> Hence I still stand by my statement that intelligent search engines
>>> already handle this sort of thing "properly".
>>> It is more appropriate for us to fix our internal links, rather than tell
>>> search engines, "ignore our mangled links" !
>> I think that in this case fixing the references is the right solution.
> I'm glad we agree there. The annoying thing is that I'm not sure where
> the references are, at this point. From my searching through the
> pages, I don't see any obvious references to ".php" in our site. So
> assistance from other folks in hunting those down, would be
It may well be old links that got indexed at some point, and since we
still return 200 OK on those URLs, Google will keep re-indexing them.
Try adding a permanent (301) redirect from "page.php" to "page" and
they will disappear after a while.
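
To make the redirect idea concrete, here is a minimal sketch in Python. It
is not tied to whatever the site actually runs; the function names
(canonical_redirect, respond) and the ".php" default are assumptions made
up for illustration. It only shows the decision: a path ending in ".php"
gets a 301 pointing at the extensionless path, everything else is served
as usual.

```python
from typing import Dict, Optional, Tuple


def canonical_redirect(path: str, suffix: str = ".php") -> Optional[str]:
    """Return the extensionless path to redirect to, or None if none applies.

    Hypothetical helper: 'path' is the request path without a query string.
    """
    if path.endswith(suffix) and len(path) > len(suffix):
        target = path[: -len(suffix)]
        # Skip odd cases such as "/dir/.php", which would map to "/dir/".
        if not target.endswith("/"):
            return target
    return None


def respond(path: str) -> Tuple[int, Dict[str, str]]:
    """Return (status, headers): 301 for a ".php" path, 200 otherwise."""
    target = canonical_redirect(path)
    if target is not None:
        # 301 tells crawlers the move is permanent, so the old URL
        # should eventually drop out of the index.
        return 301, {"Location": target}
    return 200, {}
```

In practice the same rule would more likely live in the web server's
rewrite configuration than in application code.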
Another option might be to grep the logs for ".php" and check the
referer field, if that is logged. That might give you a clue as well.
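
A rough sketch of that log search, in Python. It assumes the access log is
in the common "combined" format (request line, status, size, then the
referer in quotes); if the server logs something else, the regular
expression will need adjusting. The function name php_referers is made up
for this example.

```python
import re
from collections import Counter
from typing import Iterable, Tuple

# Matches: "GET /path HTTP/1.1" 200 1234 "http://referer/"
# Assumes the combined log format; other formats need a different pattern.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD|POST) (?P<path>\S+)[^"]*" \d{3} \S+ "(?P<referer>[^"]*)"'
)


def php_referers(lines: Iterable[str]) -> "Counter[Tuple[str, str]]":
    """Count (path, referer) pairs for requests whose path ends in ".php"."""
    counts: "Counter[Tuple[str, str]]" = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match is None:
            continue  # Not a line in the expected format.
        # Drop any query string before checking the extension.
        path = match.group("path").split("?", 1)[0]
        if path.endswith(".php"):
            counts[(path, match.group("referer"))] += 1
    return counts
```

Feeding it the lines of the access log and printing
php_referers(lines).most_common(20) should show which pages, if any, still
link to the ".php" URLs. A referer of "-" means the client sent none,
which is typical for crawlers.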
This is where the Webmaster toolkit comes in handy: it can show you how
Google views your site (not that I've used it myself, but that's what
I've been told).