seo: what is HTTP keep alive and how to enable it on your server

HTTP Keep-Alive, also known as a persistent connection, lets the client machine and the web server reuse one connection for several requests instead of opening a new connection each time. The name comes from keep-alive messages: extremely small messages, taking up very little bandwidth, that two devices send each other in order to maintain the connection between them and prevent it from being broken.

Keep-alive messages or signals are sent at predefined intervals from one device to another, and if a response signal is not received the connection is considered broken or closed. The keep-alive functionality thus performs two important functions in device communication.

Connection Testing: It checks that the connection is still open and good to use. Even with keep-alive in place, the connection can still be dropped by either device or by an intermediate device, for reasons such as load on the devices, a physical disruption or some other unforeseen event.

Prevents Breaking of Connections: Sending these messages at regular intervals signals to the other device, and to intermediate devices such as routers, that you expect to use the connection again soon and that it should not be timed out and broken. It basically extends the timeout period of the connection.

One of the most time-consuming and expensive parts of inter-device communication is creating a connection. For this reason, almost all protocols have some form of keep-alive functionality built in, so as to avoid the latency of creating a new communication channel for every single request and response.

Advantages of HTTP Keep Alive

Reduced server CPU usage: Creating new TCP connections consumes resources such as CPU and memory. Keeping connections alive longer can reduce this cost, especially if a lot of traffic comes from the same client.

Improved Web Page Speed: Serving multiple files over the same connection avoids the latency of creating a new connection for each file, which improves the download time of the web page.

Disadvantages of HTTP Keep Alive

Reduced number of concurrent connections: Keeping connections alive longer reduces the number of connections available at any one time. At peak times, this can reduce the number of concurrent users that can be served.

In SEO, the keep-alive that is mentioned refers to the HTTP Keep-Alive functionality. It is distinct from TCP keep-alive, which sends probe messages on an idle connection; HTTP Keep-Alive simply leaves the underlying TCP connection open so that it can be reused. Enabling this between the server and client allows the client to download all the resources required to render the page over a single connection rather than having to initiate a new connection for each resource.

A single HTML page can contain several different resources that together make up the entire page: the HTML code itself, several CSS files, several JavaScript files, and several images and multimedia files. Creating a new connection for each of these resources is not only expensive but also takes longer than downloading them over a single connection.

Keeping the connection alive while the client parses the HTML file and finds all the dependent resources referenced in it allows the client to then request those files over the same connection.
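
To make the saving concrete, here is a minimal Python sketch, not tied to any particular web server. It starts a small local test server and counts how many TCP connections are opened when three resources are fetched with and without connection reuse. The handler class, the count_connections helper and the three paths are made up for illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class CountingHandler(BaseHTTPRequestHandler):
    """Hypothetical test handler: answers every GET and counts connections."""

    protocol_version = "HTTP/1.1"  # HTTP/1.1 keeps connections open by default
    connections = 0

    def setup(self):
        # setup() runs once per accepted TCP connection, not once per request.
        super().setup()
        CountingHandler.connections += 1

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence the default per-request logging


def count_connections(paths, reuse):
    """Fetch every path from a local server; return the connections it saw."""
    CountingHandler.connections = 0
    server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)  # port 0 = any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    try:
        if reuse:
            # One connection carries all the requests, one after another.
            conn = http.client.HTTPConnection(host, port, timeout=5)
            for path in paths:
                conn.request("GET", path)
                conn.getresponse().read()  # finish reading before the next request
            conn.close()
        else:
            # A fresh connection per resource, closed after a single response.
            for path in paths:
                conn = http.client.HTTPConnection(host, port, timeout=5)
                conn.request("GET", path, headers={"Connection": "close"})
                conn.getresponse().read()
                conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return CountingHandler.connections


if __name__ == "__main__":
    resources = ["/index.html", "/style.css", "/app.js"]
    print("with keep-alive:   ", count_connections(resources, reuse=True))
    print("without keep-alive:", count_connections(resources, reuse=False))
```

With connection reuse the server sees a single connection for all three resources; without it, it sees one connection per resource.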

Since this function has more advantages than disadvantages, most modern web servers that follow the HTTP/1.1 specification keep connections alive by default. But it does not hurt to enable keep-alive explicitly and to verify that it works.

Enabling HTTP Keep-Alive

Keep-alive is enabled using the HTTP headers on the connection: you explicitly include the header that specifies the value as enabled. We will look at how to enable it on Apache web servers. You can do this using the .htaccess file found in the root directory of your website. The IfModule wrapper applies the setting only when Apache's mod_headers module is loaded.

<IfModule mod_headers.c>
Header set Connection keep-alive
</IfModule>

If you cannot use the .htaccess file, you can configure keep-alive in the Apache web server's main configuration file, httpd.conf. In the configuration file, three directives affect the HTTP keep-alive functionality.

KeepAlive: This specifies whether HTTP keep-alive is turned on or off. KeepAlive On turns it on, while KeepAlive Off turns it off.
MaxKeepAliveRequests: This specifies the maximum number of requests that will be served over a single persistent connection. It deters clients from keeping a connection alive for too long. A value of around 100, which is also Apache's default, is usually good enough for almost any scenario.
KeepAliveTimeout: This specifies how many seconds the server waits for the next request on an idle connection before closing it, which prevents unused connections from hanging around for too long. A value between 5 and 10 seconds is usually ideal.

These extra configuration settings make sure that the keep-alive functionality is not misused and prevent one or a few clients from becoming resource hogs.

Verify that Keep-alive is enabled

You will need some kind of web tool to view and test the HTTP headers returned when a request is made, to verify that the appropriate headers are included. You could use either a browser add-on/extension or a simple command-line tool like curl for this.

bash$ curl -I http://www.mydomain.com/mypage.html

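Here, look for the HTTP header, which should state something like: Connection: keep-alive. A server that answers with HTTP/1.1 may leave this header out, because persistent connections are the default in HTTP/1.1; in that case, check that the response does not contain Connection: close.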

bash$ curl -I -v http://www.mydomain.com/mypage.html

Here, in addition to the HTTP header, also look for the last line of the output, which should state something like: Connection #0 to host www.mydomain.com left intact.
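
The same header check can be scripted. The following is a rough Python sketch that applies the HTTP rules for persistence: an HTTP/1.1 connection stays open unless the response says Connection: close, while an HTTP/1.0 connection stays open only if the response says Connection: keep-alive. The function names are made up for illustration, and the sketch only handles plain HTTP, not HTTPS.

```python
import http.client


def is_persistent(http_version, connection_header):
    """Decide whether a response leaves the connection open.

    http_version is 10 or 11, as reported by HTTPResponse.version;
    connection_header is the Connection header value, or None if absent.
    """
    tokens = {t.strip().lower() for t in (connection_header or "").split(",")}
    if "close" in tokens:
        return False
    if http_version >= 11:
        return True  # persistent by default in HTTP/1.1
    return "keep-alive" in tokens  # HTTP/1.0 needs an explicit opt-in


def check_keep_alive(host, path="/", port=80):
    """Send one HEAD request (like curl -I) and report on persistence."""
    conn = http.client.HTTPConnection(host, port, timeout=10)
    try:
        conn.request("HEAD", path)
        response = conn.getresponse()
        response.read()
        return is_persistent(response.version, response.getheader("Connection"))
    finally:
        conn.close()
```

Calling check_keep_alive("www.mydomain.com", "/mypage.html"), with the placeholder host from the curl commands replaced by your own, returns True when the server's response leaves the connection open.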

Many web page analysis tools, like those from webpagetest.org, tools.pingdom.com and Google PageSpeed, will also tell you if HTTP keep-alive is enabled for your pages.

Most of the time, enabling keep-alive is definitely the way to go. But it does depend on some other factors as well, and taking the following into consideration will let you make a good decision for your website.

SSL: If you serve content over SSL (read: HTTPS) in addition to HTTP, then definitely enable keep-alive for HTTPS connections. Each new HTTPS connection needs a TLS handshake on top of the TCP handshake, so creating one is resource intensive and adds noticeable latency.

Web Site setup: Does your web page reference a lot of resources? How many requests does it take for a single web page to render completely? Reducing the extra requests will significantly reduce the server load and the need to keep connections alive for long periods. Ideally, I would say that you should not have more than 50 to 75 requests per page, but this depends heavily on the website itself.

Traffic Patterns: Is your server load distributed more or less evenly across all hours of the day? Then enable keep alive and keep the persistent connections. If it is heavily skewed towards certain times of the day and your web server falls over during these peak hours, then turn it off or at least reduce the MaxKeepAliveRequests and KeepAliveTimeout values significantly to absorb the load.

Server Hardware: This is mostly related to the above two, web page setup and traffic. Is your hardware well enough equipped to handle the traffic? If your server is running at very high CPU usage, then turning off keep-alive might help a little, but it is no substitute for upgrading the hardware.
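
The Traffic Patterns trade-off above can be put into rough numbers. The Python sketch below is a back-of-the-envelope estimate based on Little's law (average number in a system = arrival rate x time spent in the system). It assumes that every connection occupies a server slot for its whole lifetime and that clients leave idle connections open until the server times them out. The function name and the traffic figures are invented for illustration; measure your own server before changing any settings.

```python
def average_open_connections(new_connections_per_second, active_seconds,
                             keep_alive_timeout):
    """Estimate the average number of simultaneously open connections.

    Each connection is assumed to be busy for active_seconds and then to sit
    idle for keep_alive_timeout seconds before the server closes it.
    """
    seconds_open = active_seconds + keep_alive_timeout
    return new_connections_per_second * seconds_open


if __name__ == "__main__":
    # Hypothetical peak load: 50 new connections per second, 0.5 s of work each.
    for timeout in (10, 5, 2):
        estimate = average_open_connections(50, 0.5, timeout)
        print(f"KeepAliveTimeout {timeout:>2}: about {estimate:.0f} open connections")
```

With these made-up figures, lowering the timeout from 10 to 2 seconds cuts the estimate from 525 to 125 open connections, which is why shortening KeepAliveTimeout can help a server get through its peak hours.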