SEO: How to Avoid Bad Requests to Improve Page Load Speed of Your Webpage
As webpages and websites change and evolve over time, it is inevitable that resources will be moved, deleted or renamed. Whenever a resource changes, every reference to it in your pages should be updated to reflect the change. But complex web pages often contain so many references that it is very easy to miss some of them.
You need to understand what exactly a bad request is, and how to find one, before you can work towards fixing it.
What is a Bad Request
A Bad Request is usually defined as a web request by the client for a resource that does not exist on the server, so that the request consistently returns an error or times out without an OK response. Usually, the error is one of the HTTP 4xx status codes, such as 404 or 410.
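The definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `fetch_status` and `is_bad_request` are hypothetical, a timeout or connection failure is represented as `None`, and only the 4xx range is treated as "bad", following the definition in this article.

```python
import urllib.error
import urllib.request
from typing import Optional


def fetch_status(url: str, timeout: float = 10.0) -> Optional[int]:
    """Return the final HTTP status for url, or None if no response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx and 5xx responses.
        return err.code
    except OSError:
        # Assumption: DNS failures, refused connections and timeouts
        # are all reported as "no response" (None).
        return None


def is_bad_request(status: Optional[int]) -> bool:
    """True for a missing response (None) or any 4xx status code."""
    if status is None:
        return True
    return 400 <= status <= 499
```

A call such as `is_bad_request(fetch_status(url))` would then flag a URL for a closer look. Note that `urlopen` follows redirects, so the status reported is that of the final response.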
It is quite possible for a valid and working web page to have references to dead resources. These are usually referred to as bad or broken links. These can be references to any of the static resources such as images, audio, video, JS or CSS files.
We modify the definition of Bad Request slightly from the perspective of SEO. A Bad Request does not necessarily have to be a broken link; it can also include resources that do not result in an error but are not necessary for rendering the web page, such as resources that are not actively used on the page.
For example, your CSS file could be fetching an image that you used to use as a background image. You have since changed the background, but forgot to remove the reference. The image can still be requested and downloaded by the client, but it will never be used when the page is rendered.
The same can be true of other static web resources, such as JS or CSS files that are not actively used in the current web page. Removing such unused resources and requests from the page is essentially the same as removing broken resources or bad requests.
Why are Bad Requests not desirable
There are several different reasons as to why it is not desirable to have bad requests…
Broken Visual Elements: A bad request could cause a visual element, such as an image, to not render correctly. This can distort the page layout and lead to a bad user experience. It can also give the website an unkempt look and feel.
High Bandwidth: Depending on how the server is configured, you might be returning an unnecessary 404 page which could increase the bandwidth usage of the site.
Rendering Issues: If the bad request is for a synchronous or blocking resource such as a JS or CSS file, it can prevent the page from rendering at all. Even if the page does render, the rendering time can be substantially affected.
Network and Resource Hog: Every bad request still requires the client to perform several round trips to the server, which adds to the network latency as well as the unwanted CPU and memory utilization by the server.
DNS Lookup: A request to a hostname that the client has not resolved yet also causes a DNS lookup. This means you may be doing more DNS lookups than necessary, which adds up even with DNS caching enabled.
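As a rough way to gauge the DNS point, you can count how many distinct hostnames a page's resource URLs touch. The sketch below uses only the Python standard library; the function name `distinct_hosts` and the sample URLs are made up for illustration.

```python
from urllib.parse import urlparse


def distinct_hosts(urls):
    """Return the sorted set of hostnames found in a list of URLs.

    Relative URLs have no hostname of their own and are skipped.
    """
    hosts = set()
    for url in urls:
        host = urlparse(url).hostname
        if host:
            hosts.add(host)
    return sorted(hosts)


# Hypothetical resource list for one page.
sample = [
    "https://example.com/css/site.css",
    "https://cdn.example.net/lib.js",
    "https://example.com/img/logo.png",
]
print(distinct_hosts(sample))
```

If a hostname appears in this list only because of a broken or unused resource, removing that reference also removes the need to resolve that hostname.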
How to Find Bad Requests
As mentioned, a bad request may reference either a visible or a non-visible resource on the webpage. If it affects a visible element, such as a broken image or a distorted page layout, it is quite easy to spot and fix.
It is also quite possible that these requests are hidden inside the JS or CSS files. This makes it much more difficult to track these down.
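For CSS files, one simple way to surface such hidden requests is to list every `url(...)` reference in the stylesheet and review the result by hand. The following is a minimal sketch, not a full CSS parser: the regular expression and the name `css_urls` are assumptions made for illustration, and inline `data:` URIs are skipped because they do not trigger a network request.

```python
import re

# Matches url(...) with optional single or double quotes around the value.
CSS_URL = re.compile(r"""url\(\s*(['"]?)(.*?)\1\s*\)""", re.IGNORECASE)


def css_urls(css_text):
    """Return the resource references found in url(...) expressions."""
    found = []
    for match in CSS_URL.finditer(css_text):
        target = match.group(2)
        if target and not target.startswith("data:"):
            found.append(target)
    return found


# Hypothetical stylesheet with a leftover background image.
stylesheet = """
body  { background: url("img/old-bg.png"); }
.logo { background-image: url(/img/logo.svg); }
"""
print(css_urls(stylesheet))
```

Each reference in the output can then be checked against what the page actually uses.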
Link Checker Tools
There are quite a few link checker tools available, both online services and installable applications. These link checkers are usually very efficient at tracking down straightforward broken links. Hidden requests, such as the ones in JS and CSS files, are usually beyond the scope of these tools. However, they are a good place to start.
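To give an idea of what a basic link checker does in its first step, here is a minimal sketch that collects the resource URLs referenced directly in a page's HTML. It relies on Python's standard `html.parser` module; the class name, the tag-to-attribute table and the sample markup are assumptions chosen for illustration, and real tools cover many more cases.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ResourceCollector(HTMLParser):
    """Collect URLs from common resource-bearing tags."""

    # Assumed mapping of tag name to the attribute that holds its URL.
    URL_ATTRS = {
        "a": "href",
        "link": "href",
        "img": "src",
        "script": "src",
        "audio": "src",
        "video": "src",
        "source": "src",
    }

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        wanted = self.URL_ATTRS.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            if name == wanted and value:
                # Resolve relative references against the page URL.
                self.urls.append(urljoin(self.base_url, value))


def collect_resources(html, base_url):
    """Return absolute URLs of the resources referenced in html."""
    collector = ResourceCollector(base_url)
    collector.feed(html)
    return collector.urls
```

Each collected URL would then be requested and its status inspected. As noted above, this approach only sees references that are present in the HTML itself.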
Browser Tools
Most modern browsers include built-in development tools that can be used to track down bad requests. Mozilla Firefox has a built-in tool called Web Console. You can also use a third-party tool such as Firebug for this. Google Chrome has a similar tool named Developer Tools.
All of these tools work in much the same way. Once you get the hang of one, you should be able to find your way around the others easily. In Mozilla Firefox, open the Web Console by going to the Menu, clicking the Developer button, and then choosing Network to open the Network tab in the Web Console. You can also open it with the keyboard shortcut Ctrl+Shift+K or Ctrl+Shift+Q.
- Open the website in a browser window or tab
- Open the Web Console using the menu or keyboard shortcut
- Click on the Network tab in the Web Console
- Click on the Timer button to start analysis or Refresh the webpage in the tab
- You will see a list of all network activity in the console
- Look for any row whose status is in the 400s, like 404 or 410
The process is similar in Google Chrome. Open the Developer Tools from the tools menu or use the keyboard shortcut Ctrl+Shift+I. Now click on the Network tab and refresh the page. Look for any request that returned an error; these are usually shown in red, with a status code in the 400s.
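On pages with many requests, scanning the list by eye can be tedious. Both browsers can export the recorded network activity as a HAR file (a JSON format), and the failed requests can then be filtered with a short script. The sketch below assumes you have saved such an export yourself; the function names and the file name in the usage note are hypothetical.

```python
import json


def load_har(path):
    """Read a HAR export (JSON) from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def failed_requests(har):
    """Return (status, url) pairs for every entry with a 4xx status."""
    entries = har.get("log", {}).get("entries", [])
    failures = []
    for entry in entries:
        status = entry["response"]["status"]
        if 400 <= status <= 499:
            failures.append((status, entry["request"]["url"]))
    return failures
```

For example, `failed_requests(load_har("mypage.har"))` would list the same 404 and 410 rows that you would otherwise look for in the Network tab.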
Once you have found the web request that is causing the error, you will have to find where exactly it is referenced. It could be anywhere in the source that generates the webpage. This depends very much on the framework and technology used to create the webpage. Generally, there are four places where it can occur…
HTML Source: The reference could be in the HTML source, which is the most straightforward case.
JS/CSS Files: Static resources can be referenced from JS and CSS files. JS files are also capable of generating resource links dynamically.
Template and Include Files: This is not much different from the HTML source. Some frameworks can include or inject template files dynamically to generate pages and sections. There could also be included templates, such as the header and footer.
Third Party Plugins: Many frameworks, such as WordPress and Blogger, support plugins. Many third party plugins dynamically include calls to resource files, which could be erroneous.
While these tools are good at finding broken and missing resources, they are not good at finding unused resources. You will need to hunt those down manually.
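Part of that manual hunt can be scripted. Once you know the broken or unused URL, a plain text search across your source files shows where it is referenced. The sketch below is one possible approach, assuming the site's source lives in a local directory; the function name `find_references` and the list of file extensions are assumptions you would adapt to your own framework.

```python
from pathlib import Path


def find_references(root, needle, suffixes=(".html", ".css", ".js", ".php")):
    """Return (relative_path, line_number) for each line containing needle."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix.lower() not in suffixes:
            continue
        # Ignore undecodable bytes so one odd file does not stop the scan.
        text = path.read_text(encoding="utf-8", errors="ignore")
        for line_no, line in enumerate(text.splitlines(), start=1):
            if needle in line:
                hits.append((path.relative_to(root).as_posix(), line_no))
    return hits
```

Searching for just the file name (for example `old-bg.png`) rather than the full URL tends to catch relative references as well. Links that are built dynamically in JS or injected by plugins will not show up in a search like this.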
How to Fix Bad Requests
Fixing bad requests is often much easier than finding them, well…most of the time. Once you have identified the bad request URL and where it is referenced, you have one of two options…
Remove the Reference: If the resource is not used anymore or is non-existent, then you can completely remove the code without any side effects.
Correct the Reference: If the resource has just been moved or renamed, then you will need to correct the reference so that it points to the correct name of the resource.
Many times it is very tempting to fix the reference with an HTTP redirect to the new or updated resource URI, especially when there are several references across pages to the same broken URI. This should, however, be avoided at all costs. Each redirect adds an extra round trip, which further degrades your page load speed, and multiple unnecessary redirects are one of the major factors affecting SEO.
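When the same broken URI appears in many places, correcting the references in bulk is an alternative to a redirect. The following is a minimal sketch of the idea, working on the text of a single source file; the function name `rewrite_references` and the example paths are hypothetical, and the replacement is a plain string match, so the result should be reviewed before it is published.

```python
def rewrite_references(text, renames):
    """Replace each old reference with its new one.

    renames maps old URI -> new URI. Returns the updated text and the
    number of references that were rewritten.
    """
    total = 0
    for old, new in renames.items():
        total += text.count(old)
        text = text.replace(old, new)
    return text, total


# Hypothetical page fragment with two references to a renamed image.
page = '<img src="/img/logo-old.png"><img src="/img/logo-old.png">'
updated, count = rewrite_references(page, {"/img/logo-old.png": "/img/logo.png"})
print(count, updated)
```

Applied to every file that references the old URI, this fixes the source directly, so no redirect is needed.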
Other Caveats
If you use plugins that perform some kind of content or resource rewriting, then it is quite possible that you will find some URLs that do not look familiar or are not directly used inside your source. This is also the case when you use a caching plugin that rewrites resources. Many Content Delivery Networks (CDNs) also have optimization features that can potentially rewrite content. These unfamiliar URLs can be due to features that minify and optimize content.
When you come across errors in such cases, your best bet is to flush the cache/CDN and force a refresh of the content.