Using cfheader to guide the Search Engines
In my last blog post, I mentioned that I recently finished a site redesign project where maintaining search engine rankings was a big concern. Since many of the files on the new site had been relocated or renamed, I had to make sure that the search engines knew: 1) where the existing content had moved to, and 2) which pages should be removed from the search engines' indexes. This was accomplished easily enough using cfheader and onMissingTemplate in Application.cfc.
If you aren't familiar with cfheader, it's basically a tag that allows you to specify HTTP response headers to be returned to the client. For this particular situation, I am passing HTTP status codes back to the client. You can find a list of status code definitions here. This list was very helpful, and showed me which status codes should be returned for different situations. When used correctly, these status codes allow you to tell search engines exactly what you want them to do with specific pages on your site.
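As a quick illustration of the two ways the tag can be used, here is a minimal sketch. The header name and values are only examples and are not taken from the redesign project itself.

```cfml
<!--- Form 1: set a named response header (example header and value) --->
<cfheader name="Cache-Control" value="no-cache" />

<!--- Form 2: set the HTTP status code and its text on the response --->
<cfheader statuscode="404" statustext="Not Found" />
```

The name/value form adds a header to the response, while the statuscode/statustext form changes the status line that spiders and browsers see. The rest of this post uses both forms together.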
For example, let's say we had two pages on our site, newbooks.cfm and usedbooks.cfm, which are used to display new and used books respectively. Now, with our site redesign, we've decided that we want just one page, books.cfm, which will display both new and used books. Since we don't want to confuse our users or lose the search engine rankings that our new and used book pages already have, we need to tell the spiders that newbooks.cfm and usedbooks.cfm have moved to a new page, and also redirect our users when they try to pull up one of the old pages. This is done through the Application.cfc onMissingTemplate method:
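<cffunction name="onMissingTemplate" returntype="boolean" output="true">
    <cfargument name="targetpage" required="true" type="string" />
    <!--- targetpage may arrive with a leading path (e.g. /newbooks.cfm), so compare on the file name only --->
    <cfswitch expression="#ListLast(ARGUMENTS.targetpage, '/')#">
        <cfcase value="newbooks.cfm,usedbooks.cfm">
            <!--- Both retired pages now live at books.cfm --->
            <cfheader statuscode="301" statustext="Moved Permanently" />
            <cfheader name="Location" value="books.cfm" />
            <cfreturn true />
        </cfcase>
    </cfswitch>
    <!--- Not one of our moved pages: let ColdFusion fall back to its normal missing-template handling --->
    <cfreturn false />
</cffunction>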
Now that newbooks.cfm and usedbooks.cfm no longer exist on our site, onMissingTemplate is automatically called whenever a person or spider tries to access them. In our onMissingTemplate method we check to see which pages were being called, and if it's one of our two book pages, we provide the "301 Moved Permanently" status code as well as the location that these pages have moved to. The 301 status code tells the search engine spiders that the requested page has been assigned a new permanent URI and any future references to this page should use the new URI, which is specified in our second cfheader tag.
But what if we've removed a page and didn't replace it with a new one? This is where we would pass a "410 Gone" status code back to the client. Most people incorrectly use a 404 status for this instead of the 410. While you may eventually achieve the same results with a 404 status code, by using it you're telling the search engines to keep your old, dead links indexed (for a while, anyway). Wikipedia does a good job of explaining the differences between the 404 and the 410 status codes. Basically, you use a 404 status code when "The requested resource could not be found but may be available again in the future." If you've removed a page and you know it's not coming back (and you're not doing a 301 redirect), then you should use the 410 status code because it "Indicates that the resource requested is no longer available and will not be available again. This should be used when a resource has been intentionally removed."
Using the above example, we're now going to add the 410 status code for any missing pages that aren't getting a 301 redirect.
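<cffunction name="onMissingTemplate" returntype="boolean" output="true">
    <cfargument name="targetpage" required="true" type="string" />
    <!--- targetpage may arrive with a leading path (e.g. /newbooks.cfm), so compare on the file name only --->
    <cfswitch expression="#ListLast(ARGUMENTS.targetpage, '/')#">
        <cfcase value="newbooks.cfm,usedbooks.cfm">
            <!--- Both retired pages now live at books.cfm --->
            <cfheader statuscode="301" statustext="Moved Permanently" />
            <cfheader name="Location" value="books.cfm" />
        </cfcase>
        <cfdefaultcase>
            <!--- Any other missing page was removed on purpose and is not coming back --->
            <cfheader statuscode="410" statustext="Gone" />
            <p>We're sorry, but the page you have requested no longer exists on our site.</p>
        </cfdefaultcase>
    </cfswitch>
    <!--- Tell ColdFusion the missing template has been handled --->
    <cfreturn true />
</cffunction>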
And that's it. Now our users are either redirected to the content they were after or notified that it no longer exists, and the search engines are told either where the content has moved to, so they can update their indexes, or that it no longer exists, so they can remove the old entry entirely.
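If the list of relocated pages grows, the same logic could be driven by a lookup structure instead of an ever-longer cfcase list. The following is only a sketch of that idea, not the code used on the redesigned site: the movedPages struct and the requestedFile variable are hypothetical names introduced for illustration.

```cfml
<cffunction name="onMissingTemplate" returntype="boolean" output="true">
    <cfargument name="targetpage" required="true" type="string" />
    <!--- movedPages is a hypothetical lookup: old file name -> new location --->
    <cfset var movedPages = StructNew() />
    <!--- Compare on the file name only, in case targetpage includes a leading path --->
    <cfset var requestedFile = ListLast(ARGUMENTS.targetpage, "/") />

    <cfset movedPages["newbooks.cfm"] = "books.cfm" />
    <cfset movedPages["usedbooks.cfm"] = "books.cfm" />

    <cfif StructKeyExists(movedPages, requestedFile)>
        <!--- Page has a new home: permanent redirect --->
        <cfheader statuscode="301" statustext="Moved Permanently" />
        <cfheader name="Location" value="#movedPages[requestedFile]#" />
    <cfelse>
        <!--- Page was removed on purpose: tell clients it is gone --->
        <cfheader statuscode="410" statustext="Gone" />
        <p>We're sorry, but the page you have requested no longer exists on our site.</p>
    </cfif>

    <cfreturn true />
</cffunction>
```

With this layout, adding another moved page means adding one more entry to the struct, and everything not listed still receives the 410 response.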

