A custom query on the WordPress Redirection plugin’s data will find and rank your top 404 pages for you.
The Redirection plugin for WordPress offers indispensable flexibility and speed in handling redirects for common user mistakes and in manually or automatically 301ing old content. It’s become one of my must-have default plugins on any WordPress installation I do. However, finding the pages that most deserve a 301 becomes difficult after a while, especially if your site is pulling some serious traffic. No one wants to fish through that log one entry at a time. An overall look at my problem:
- Bots hit random pages that clearly don’t exist and clutter your 404 entries.
- Legitimate users sometimes hit 404 pages simply because they mangled the URL.
- Google Webmaster Tools’ Diagnostics -> Crawl Errors report seldom lists the pages that were legitimately hit the most, and the list is incomplete overall. You fix a few, and a week later there’s a fresh batch that you’d think would have been included up front.
Custom Query Solution
I came up with this solution because I needed something more comprehensive than what the Redirection plugin offers out of the box. Short of writing my own plugin, a quick query is easy enough. So, assuming the Redirection plugin is installed and has been active for a while, paste this into your favorite MySQL GUI and run it:
SELECT COUNT(url) AS count, created, url, agent, referrer
FROM wp_redirection_logs
WHERE sent_to IS NULL
  AND agent NOT LIKE '%bot%'
  AND agent NOT LIKE '%java%'
  AND agent NOT LIKE '%crawl%'
GROUP BY url, ip
ORDER BY count DESC;
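This returns several columns from the wp_redirection_logs table plus an added count column, with the most frequently hit entries first. If your install uses a table prefix other than wp_, adjust the table name to match.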
Filters and Column Analysis
The query filters out three of the most common bot agent keywords: bot, crawl, and java. You’ll still have some bot data in there, but that’s inevitable short of a much more elaborate solution with a tediously maintained tracking list of bot IPs.
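If you export the log and want to apply the same keyword filter outside the database, a minimal Python sketch might look like the following. The function name and the sample user-agent strings are made up for illustration; only the three keywords come from the query.

```python
# Keywords taken from the query's NOT LIKE filters.
BOT_KEYWORDS = ("bot", "crawl", "java")


def looks_like_bot(agent):
    """Return True if the user-agent string contains any bot keyword."""
    # Lowercase first so the match ignores case, which is how LIKE
    # behaves under MySQL's default case-insensitive collations.
    agent = agent.lower()
    return any(keyword in agent for keyword in BOT_KEYWORDS)


# Hypothetical user-agent strings, for illustration only.
samples = [
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "Java/1.6.0_26",
    "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/535.1 Chrome/14.0 Safari/535.1",
]

# Keep only the agents that do not match any keyword.
human_agents = [agent for agent in samples if not looks_like_bot(agent)]
```

Adding another keyword to the tuple is the equivalent of adding one more NOT LIKE clause to the query.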
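When you look at the results, remember that each row is grouped data. The query groups on URL and IP, so a row represents every hit on one URL from one IP address, and a popular 404 can show up in more than one row. While the URL is the same for everything combined into a row, the remaining columns (created, agent, referrer) come from a single entry in that group. MySQL does not guarantee which entry that is, so don’t assume it is the most recent one; select MAX(created) if you need the exact time of the last hit. Newer MySQL versions that enable ONLY_FULL_GROUP_BY by default will reject the ungrouped columns outright unless you aggregate them. Even so, leaving columns like created in the results is helpful for judging roughly when that 404 page was hit. You can also see whether a URL tends to be hit by a particular agent or to arrive from a certain referrer, assuming there is any consistency in those areas.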
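To see how the grouping behaves, here is a rough sketch that runs a variant of the query against a throwaway in-memory SQLite database from Python. Everything about the setup is an assumption made for illustration: SQLite stands in for MySQL, the table holds only the columns the query touches, and the sample rows are invented. The variant groups on URL alone and aggregates the other columns, so each 404 appears once with an exact last-hit time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Assumed, simplified schema: only the columns the query touches.
conn.execute(
    "CREATE TABLE wp_redirection_logs ("
    "created TEXT, url TEXT, sent_to TEXT, agent TEXT, referrer TEXT, ip TEXT)"
)

# Invented sample log entries: (created, url, sent_to, agent, referrer, ip).
rows = [
    ("2011-09-01 10:00:00", "/old-post", None, "Mozilla/5.0 Firefox/6.0", "", "10.0.0.1"),
    ("2011-09-02 11:00:00", "/old-post", None, "Mozilla/5.0 Chrome/14.0", "", "10.0.0.2"),
    ("2011-09-03 12:00:00", "/old-post", None, "Mozilla/5.0 Firefox/6.0", "", "10.0.0.1"),
    ("2011-09-02 09:00:00", "/typo-page", None, "Mozilla/5.0 Safari/535.1", "", "10.0.0.3"),
    ("2011-09-04 08:00:00", "/wp-login.php", None, "Googlebot/2.1", "", "10.0.0.9"),
    ("2011-09-04 08:30:00", "/moved", "/new-home", "Mozilla/5.0 Firefox/6.0", "", "10.0.0.1"),
]
conn.executemany("INSERT INTO wp_redirection_logs VALUES (?, ?, ?, ?, ?, ?)", rows)

# Same WHERE filters as the original query, but grouped on url only.
# Every selected column is either grouped or aggregated.
query = """
SELECT url,
       COUNT(*)           AS hits,
       COUNT(DISTINCT ip) AS visitors,
       MAX(created)       AS last_hit
FROM wp_redirection_logs
WHERE sent_to IS NULL
  AND agent NOT LIKE '%bot%'
  AND agent NOT LIKE '%java%'
  AND agent NOT LIKE '%crawl%'
GROUP BY url
ORDER BY hits DESC
"""

top_404s = conn.execute(query).fetchall()
for row in top_404s:
    print(row)
# ('/old-post', 3, 2, '2011-09-03 12:00:00')
# ('/typo-page', 1, 1, '2011-09-02 09:00:00')
```

The Googlebot row is dropped by the agent filter and the /moved row is dropped because it already has a redirect target. The SELECT statement itself should also run in MySQL, since it uses only standard aggregates.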
Create Your 301 Redirect
Now all you have to do is go into the Redirection admin area at /wp-admin/tools.php?page=redirection.php and enter the source and target URLs. From then on you can track continued hits on the old URL through the 301 entry that is created.