SEO: How to Remove Query Strings from Static Resources

When you analyze your page with a page analyzer such as PageSpeed, YSlow or Pingdom, you are very likely to see a suggestion to remove query strings from static resources. This is an easy task, and it can help your pages and resources cache better so that the pages load faster.

A query string is the last part of a URL, the part that follows a ? (question mark). The query string is optional, and many of your links will not have one. Its function is to pass values to the page, which can then be used to render the page. This is especially useful when you have a dynamic page that needs to render differently, or with different content, depending on the context.

If you have a dynamic page that changes its content based on a dynamic value, such as the user name, geographic location or any other parameter, then it needs that value at the time of generating and serving the file. This is accomplished using query parameters. A typical query could look something like:

http://domainname/products/newproducts.php?user=tom&location=seattle&prevbuyer=true
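To make the role of the query string concrete, here is a minimal sketch of how a script could pull the values out of the sample URL above. It uses only PHP's built-in parse_url() and parse_str(); the variable names are illustrative and not part of any particular site.

```php
<?php
// Sample URL from the text above.
$url = 'http://domainname/products/newproducts.php?user=tom&location=seattle&prevbuyer=true';

// parse_url() with PHP_URL_QUERY returns only the part after the "?".
$query = parse_url($url, PHP_URL_QUERY);

// parse_str() splits the query string into an associative array.
parse_str($query, $params);

print_r($params);
```

Every value arrives as a string, so prevbuyer is the text "true" rather than a boolean. The page has to interpret the values itself.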

Static resources are files that are delivered to the client exactly as they are stored on the web server. These usually include HTML files, images, and JS and CSS files. They do not change very often, and definitely not with every request. As these files do not undergo any server-side processing, they do not need any parameters or values passed to them.

One of the advantages of static resources is that they can be cached. They can be cached by the web server, a proxy server, the web browser, or all of them. Caching resources that do not change can dramatically reduce page load times, because these resources have to be downloaded just once and can then be reused repeatedly.

Furthermore, caching reduces the load on the server, as the content can be saved and served by a proxy server or a content delivery network. This allows the server to handle more traffic and also decreases the time it takes each page to load on the client.

First of all, you should reduce the amount of dynamic content on your website. Use dynamic content and pages only when absolutely necessary. Also, do not use query strings when creating internal links or when accessing static resources. If you have been careful, then it is likely that you do not have any bad URLs with unwanted query strings.

Sometimes it is not very obvious how and why some resources have query strings. You will first need to identify these resources and then work towards removing the query strings. The best way to find the resources that are being accessed with query strings is to use a web page analyzer. Most analyzers will give you a list of the URLs that use query strings.
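If you want a quick check without an analyzer, a rough sketch along the following lines can list the resource URLs in a page that carry a query string. The function name find_query_string_resources and the sample markup are made up for illustration, and the sketch only looks at script, link and img tags, so it is not a replacement for a full analyzer.

```php
<?php
/**
 * Return the URLs of scripts, links and images in $html that contain a "?".
 * Hypothetical helper for illustration only.
 */
function find_query_string_resources(string $html): array
{
    $doc = new DOMDocument();
    // Keep warnings about imperfect real-world markup out of the output.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    // Tag name => attribute that holds the resource URL.
    $targets = ['script' => 'src', 'link' => 'href', 'img' => 'src'];
    $found = [];
    foreach ($targets as $tag => $attr) {
        foreach ($doc->getElementsByTagName($tag) as $node) {
            $url = $node->getAttribute($attr);
            if ($url !== '' && strpos($url, '?') !== false) {
                $found[] = $url;
            }
        }
    }
    return $found;
}

// Made-up sample page: two resources have query strings, one does not.
$html = '<html><head>'
    . '<link rel="stylesheet" href="/css/style.css?ver=4.2">'
    . '<script src="/js/app.js"></script>'
    . '</head><body><img src="/img/logo.png?v=3"></body></html>';

print_r(find_query_string_resources($html));
```

For a live page you could pass in the result of file_get_contents() on the page URL, assuming your PHP configuration allows remote URLs. Note that link tags also cover things other than style sheets, so review the list before changing anything.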

Remove Query Strings

How to remove these query strings will depend very much on how, where and why they are used. We will explore some of the main reasons for these URLs and find ways to remove them when appropriate.

First and foremost, there are reasonable use cases for having query strings in URLs. As mentioned before, this is especially true of dynamic pages. These are pages that generate their content on the fly when the request is received, based on the parameters or query string in the request. Such pages are generally not meant to be cached, because the content can differ with each request.

For the purpose of SEO, we are mostly concerned with the query string in URLs that refer to the static resources.

If these URLs occur in a static resource, such as HTML or JavaScript source code, then you can remove them manually. That is fairly trivial: open the source file, find the URL and remove the query string.

If you use WordPress plugins or themes, you might see that many of your CSS and JS file links have query strings appended that refer to the version of the script. You can remove them dynamically using a WordPress filter.

You can add a snippet of code to the theme's functions file, named functions.php, to remove the query strings that start with ?ver. This is much safer than removing all query strings from all URLs. You can get to functions.php from the WordPress Admin by going to Appearance -> Editor -> Theme Functions (functions.php).

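// Strip the "?ver=..." version query string from script and style URLs.
function remove_query_string($src){
    // Everything before the first "?ver" is the clean URL.
    $parts = explode('?ver', $src);
    return $parts[0];
}
// Hook into both loader filters: priority 20, 1 accepted argument.
add_filter('script_loader_src', 'remove_query_string', 20, 1);
add_filter('style_loader_src', 'remove_query_string', 20, 1);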

This will remove the query strings from both the JS script URLs and the style sheet URLs. If you would like to remove all query strings, whether or not they start with ?ver, replace the explode line in the snippet above with the line shown below.

$parts = explode('?', $src);

The number 20 is the priority of the filter relative to the other filters attached to the same hook. WordPress runs filters with lower numbers first, and the default priority is 10. You want this filter to run towards the end of the filter chain, so a higher number is appropriate. The final argument, 1, is the number of arguments the function accepts. You can try tweaking the priority if you find that the filter misbehaves with a plugin.
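To see how the two explode() variants discussed above differ, the following sketch runs both outside of WordPress. The function names strip_ver_query and strip_all_query and the sample URLs are hypothetical; only the explode() calls come from the snippets above.

```php
<?php
// Standalone copies of the two variants, so they can run without WordPress.
function strip_ver_query(string $src): string
{
    // Cuts the URL only where the query string begins with "?ver".
    return explode('?ver', $src)[0];
}

function strip_all_query(string $src): string
{
    // Cuts the URL at the first "?", whatever follows it.
    return explode('?', $src)[0];
}

// Hypothetical sample URLs.
$versioned = 'http://domainname/wp-includes/js/jquery.js?ver=1.12.4';
$other     = 'http://domainname/css/fonts.css?family=Roboto';

echo strip_ver_query($versioned), "\n"; // http://domainname/wp-includes/js/jquery.js
echo strip_ver_query($other), "\n";     // http://domainname/css/fonts.css?family=Roboto
echo strip_all_query($other), "\n";     // http://domainname/css/fonts.css
```

The second line of output shows why the ?ver variant is the safer choice: a query string that the resource may genuinely need is left alone. The same holds for a URL where ver is not the first parameter, such as ?family=Roboto&ver=4.2, which the first variant also leaves untouched.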