Wednesday, June 27, 2012

Google Webmaster Crawl Errors - A Solution

Google may be known for being secretive about many things, but they have blessed us with a glimpse of what they see under the hood of our precious websites. It behooves every webmaster on the planet to make use of Google's free Webmaster Tools, a collection of data and actual testing tools you can use to improve the way Google sees your site, which in the end will help your ranking in their search engine.

One of the biggest issues webmasters face is crawl errors. A crawl error is any error Google's "bot" or "spider" finds on your website. Most often it is a "page not found" error, also called a 404 error.

If you design using WordPress, one of the most useful tools for clearing the "not found" crawl errors in Google Webmaster Tools is a plugin called Permalink Finder. Permalink Finder takes the URL the bot is requesting and, if that URL is not found on your website, provides the closest match based on the words in the URL.

Did I lose you?








In the image above, note how the bot wanted to crawl:

http://www.threebestbeaches.com/northamerica/mexico/cancun/2009/01/playa-tortugas-beach-cancun.html

But that page no longer exists, because I removed the date from the URLs about a year ago. Yet the bot is still looking for it. This is because somewhere, somebody still has a link to that old URL, and the bot follows it.

Permalink Finder handles this nicely: it takes the words in the URL, "playa tortugas beach cancun", and finds the current match. Look at the image below:







Above we see the correct URL and the bot goes along its merry way - no errors!
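
To make the idea concrete, here is a minimal Python sketch of word-based matching. It is not Permalink Finder's actual code, just an illustration of the approach described above. The function names `url_words` and `closest_match` are made up, the first "current" URL assumes the date segments were simply dropped from the old address, and the second one is a hypothetical neighbor page.

```python
import re
from urllib.parse import urlparse


def url_words(url):
    """Return the set of words in the last path segment (the slug) of a URL."""
    path = urlparse(url).path.rstrip("/")
    slug = path.rsplit("/", 1)[-1].lower()
    slug = re.sub(r"\.(html?|php)$", "", slug)  # drop a common file extension
    # Split on anything that is not a letter or digit; ignore pure numbers.
    return {w for w in re.split(r"[^a-z0-9]+", slug) if w and not w.isdigit()}


def closest_match(requested_url, current_urls):
    """Return the current URL sharing the most slug words, or None if none share any."""
    wanted = url_words(requested_url)
    best_url, best_score = None, 0
    for candidate in current_urls:
        score = len(wanted & url_words(candidate))
        if score > best_score:
            best_url, best_score = candidate, score
    return best_url


# The old URL the bot asked for (from this post).
old_url = ("http://www.threebestbeaches.com/northamerica/mexico/cancun/"
           "2009/01/playa-tortugas-beach-cancun.html")

# Assumed current pages: illustrative only, not read from the real site.
current_pages = [
    "http://www.threebestbeaches.com/northamerica/mexico/cancun/"
    "playa-tortugas-beach-cancun.html",
    "http://www.threebestbeaches.com/northamerica/mexico/cancun/"
    "playa-delfines-beach-cancun.html",
]

print(closest_match(old_url, current_pages))
```

Under these assumptions the first page wins because it shares all four words (playa, tortugas, beach, cancun) with the old URL, while the second shares only three.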

I was working on page speed recently and disabled this plugin on www.ThreeBestBeaches.com. You can see the results!

As you can see, the errors spiked until I turned the plugin back on. Keeping the plugin active will dramatically help you with crawl errors in Google Webmaster Tools.

My suggestion is to set the plugin to require a match on at least two words in the URL. Anything less will produce errors.
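
Here is a rough Python sketch of why a two-word minimum can matter. It is only an illustration, not the plugin's code: the `min_words` parameter and the function names are hypothetical stand-ins for whatever the plugin's setting is actually called, and the "waikiki" slug is an invented example.

```python
import re


def slug_words(slug):
    """Split a URL slug such as 'playa-tortugas-beach-cancun' into a set of words."""
    return {w for w in re.split(r"[^a-z0-9]+", slug.lower()) if w}


def accept_match(requested_slug, candidate_slug, min_words=2):
    """Accept a candidate only if it shares at least min_words words with the request."""
    shared = slug_words(requested_slug) & slug_words(candidate_slug)
    return len(shared) >= min_words


requested = "playa-tortugas-beach-cancun"
unrelated = "waikiki-beach-honolulu"  # hypothetical page; shares only "beach"

# With a one-word minimum, the unrelated page counts as a match.
print(accept_match(requested, unrelated, min_words=1))  # True
# With a two-word minimum, it is rejected.
print(accept_match(requested, unrelated, min_words=2))  # False
```

On a site where many pages share a common word like "beach", a one-word minimum could send the bot to an unrelated page, which is one plausible way a lower setting ends up causing problems.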

Let me know if you have questions about how this can help you.
