Varnish Cache and ESI

Having started moving my site to the new domains, it is time to look at my Varnish Cache and Edge Side Includes.

Varnish Cache

Although changing my Varnish Cache has been on my mind during the process of moving to new domain names, I have not really changed how the cache works.

While writing this paragraph, my Varnish cache is working as I previously described in Moving Domains - Part 3: Googlebot and the Google Index.

That is, my Varnish configuration varies the hash of a page as follows:

  • User-Agent: Internet Explorer, Outdated Opera, Outdated Firefox, Outdated IceWeasel, Outdated Chrome, Outdated Internet Explorer, Search Engine Crawlers, Everything Else.
  • Protocol: HTTPS, HTTP.
  • Cookie: fonts=1 (fonts presumed loaded), Everything Else (JavaScript font loading if not a Search Engine Crawler).
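Expressed as VCL, the hashing amounts to something like this (a simplified sketch, assuming the User-Agent header has already been normalised in sub vcl_recv { } and that the X-Forwarded-Proto and X-Has-Fonts headers have been set upstream - both of which turn up later on this page):

sub vcl_hash {
  hash_data(req.url);
  hash_data(req.http.host);
  # Vary the hash by normalised User-Agent class, protocol, and fonts cookie:
  hash_data(req.http.User-Agent);
  hash_data(req.http.X-Forwarded-Proto);
  hash_data(req.http.X-Has-Fonts);
  return (hash);
}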

Also, my cache is very simplistic:

sub vcl_fetch {
	if (beresp.ttl <= 0s || beresp.http.Vary == "*") {
		set beresp.ttl = 120 s;
		return (hit_for_pass);
	}
	unset beresp.http.set-cookie;
	return (deliver);
}

This is pretty much the default, only I have removed || beresp.http.Set-Cookie as my site uses cookies for an exceptional case where caching content with a Set-Cookie header is desirable (I only use one cookie, and it simply tells the server to put the fonts in the <head> section rather than using a JavaScript font loader).

Although I vary by cookie, I remove the fonts cookie (and any others) before running deliver because I set the cookie within sub vcl_deliver { }.

Basically, I want the cookie to expire 30 minutes after a page is requested, even if a cached page is served. By pulling a "with cookie" (i.e. fonts loaded within <head>) object (Web page) from the cache, and then adding the Set-Cookie header before delivery, I avoid actually storing pages in the cache with the cookie.
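In VCL that looks something like this (a minimal sketch, assuming the fonts cookie has been normalised into an X-Has-Fonts request header before hashing; the cookie attributes match the Set-Cookie header shown later on this page):

sub vcl_deliver {
  # Re-attach the fonts cookie on the way out, so the cached object
  # itself is stored without a Set-Cookie header:
  if (req.http.X-Has-Fonts == "1") {
    set resp.http.Set-Cookie = "fonts=1; Domain=." + req.http.host + "; Path=/; Max-Age=1800";
  }
  return (deliver);
}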

I am even using the default 120 second cache expiry time (TTL). As nginx goes straight to lighttpd, bypassing Varnish, for static content (for which long expiry times should be getting passed), the only things going through Varnish are dynamic PHP pages, and possibly some things in the image gallery.

Yes, upon closer inspection items from the image gallery are indeed going through nginx -> varnish -> lighttpd, because gallery items are not in one of the directories I test for in nginx. Images have a 2,592,000 second expiry, or 30 days. I need to modify nginx so those images don't go through Varnish (I am sure there was a reason I only wanted PHP files to go through Varnish).
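The fix will be something along these lines in my nginx site configuration (a hypothetical sketch - the extension test and the upstream ports are assumptions, not my actual configuration):

# Static content, including gallery images, straight to lighttpd:
location ~* \.(jpg|jpeg|png|gif|ico|svg)$ {
	proxy_pass http://127.0.0.1:8080;
}
# Everything else (dynamic PHP pages) via Varnish:
location / {
	proxy_pass http://127.0.0.1:6081;
}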

OK, I started tailing my nginx access log and monitoring varnishhist at 11:00 UTC+0 (2015-01-22) and am going to let them run in the background while I'm doing things.

Based on a sample I took earlier (unknown timespan), and with those images being served via Varnish (not sure how many times, although I did see Googlebot-Image in the logs), there were 458 entries, of which 353 were cache misses (77.1%) and 105 were cache hits (22.9%). These results are not particularly helpful, however, as I was making a lot of site changes (mostly accessibility related) during that time, so my refreshing of pages to test changes is included in those results.

I think I'll give it a couple of hours without unnecessarily loading pages on my site. At 01:10 UTC+0 (2015-01-23, approximately 14.25 hours later) there were 125 records in varnishhist. Of those, 108 (86.4%) were cache misses, and 17 (13.6%) were cache hits. 0 cache hits and 2 cache misses within this timeframe were me (I was watching iPlayer and YouTube, and then fell asleep).

Considering I have not made any modifications to the caching time, and I have modified my nginx configuration so 100% of these results are for PHP files, around 1 in 7 page loads are cache hits.

The Varnish histogram has these hits around the high 1e-5 (1.0*10^-5 seconds, or hundred-thousandths of a second: high double-digit microseconds) to low 1e-4 (1.0*10^-4 seconds, or ten-thousandths of a second: low treble-digit microseconds).

Put another way, the majority of cache hits at this load/memory usage require approximately 50-500 microseconds (0.05-0.5 milliseconds) which is entirely within my overall aim... to reduce PHP processing time to less than 1 millisecond.


Fixing Existing Issues

A command I have just found, and which is really useful, is varnishlog. By using the -d option you can go through the entire log since Varnish started (or as far back as the shared memory log goes), not just entries from now onwards.

This option is also available for varnishhist. By using the varnishlog command you can see what tags there are. Thus, if I use the command varnishhist -d -m Hash:/ -m Hash:web.watfordjc.uk -m Hash:http I can see exactly what cache hits and misses there were for the home page of Web.WatfordJC.UK over the HTTP protocol (the result is, surprisingly, zero total).

varnishhist -d -m Hash:/robots.txt on the other hand shows that the robots.txt file (really a PHP file) has many more cache misses than ideal - ideally each domain should only have one cache miss.

robots.txt

The first thing to do is to take a look at my robots.php file:

<?php
header("Content-Type: text/plain");
switch ($_SERVER['HTTP_HOST']) {
	case "web.johncook.uk":
	case "johncook.uk":
	case "web.watfordjc.uk":
	case "watfordjc.uk":
	// TODO: Disallow indexing by robots of johncook.co.uk after 301'ing URLs.
	case "johncook.co.uk":
	case "web.johncook.co.uk":
		echo "User-agent: ia_archiver
Disallow: /
User-agent: archive.org_bot
Disallow: /
User-Agent: *
Disallow: /*.git/
Disallow: /bower_components/
Disallow: /inc/
Disallow: /node_modules/
Disallow: /scss/";
		break;
	case "dev.johncook.co.uk":
	case "www.johncook.co.uk":
	default:
		echo "User-agent: *
Disallow: /";
		break;
}
?>

Also, a look at some of the headers received:

Accept-Ranges: "bytes"
Age: "13"
Content-Encoding: "gzip"
Content-Type: "text/plain"
Date: "Fri, 23 Jan 2015 03:45:24 GMT"
Set-Cookie: "fonts=1; Domain=.web.johncook.uk; Path=/; Max-Age=1800"
Strict-Transport-Security: "max-age=0"
Vary: "Accept-Encoding"
Via: "1.1 varnish"

There are a number of obvious issues the headers are telling me I need to investigate:

  1. The fonts cookie is being set, but this is not a page that includes the JavaScript font loader.
  2. There is no Cache-Control header.
  3. There is no Pragma header.
  4. There is no Last-Modified header.
  5. There is no Expires header.

The first issue is simple: in my Varnish user.vcl file I can amend the sub vcl_recv { } and sub vcl_deliver { } sections:

sub vcl_recv {
...
  if (req.url ~ "^/robots\.txt") {
    // Normalise URL, removing any extraneous parameters:
    set req.url = "/robots.txt";
    // Remove User-Agent:
    remove req.http.User-Agent;
  }
  
  return (lookup);
}

sub vcl_deliver {
...
  if (req.url ~ "^/robots\.txt$") {
    // Remove all cookies:
    unset resp.http.Set-Cookie;
  }
  
  return (deliver);
}

Next, I am going to add a Last-Modified header to robots.php:

<?php
header("Last-Modified: ".gmdate("D, d M Y H:i:s",filemtime($_SERVER['DOCUMENT_ROOT'].$_SERVER['SCRIPT_NAME']))." GMT");
header("Content-Type: text/plain");
switch ($_SERVER['HTTP_HOST']) {
	case "web.johncook.uk":
	case "johncook.uk":
	case "web.watfordjc.uk":
	case "watfordjc.uk":
	// TODO: Disallow indexing by robots of johncook.co.uk after 301'ing URLs.
	case "johncook.co.uk":
	case "web.johncook.co.uk":
		echo "User-agent: ia_archiver
Disallow: /
User-agent: archive.org_bot
Disallow: /
User-Agent: *
Disallow: /*.git/
Disallow: /bower_components/
Disallow: /inc/
Disallow: /node_modules/
Disallow: /scss/";
		break;
	case "dev.johncook.co.uk":
	case "www.johncook.co.uk":
	default:
		echo "User-agent: *
Disallow: /";
		break;
}
?>

The reason I am using filemtime() instead of getlastmod() is that filemtime() seems to be faster for me.
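For anyone wanting to compare the two on their own setup, a rough loop is enough (a quick sketch, not rigorous benchmarking; clearstatcache() stops filemtime() winning purely through PHP's stat cache):

<?php
$file = $_SERVER['DOCUMENT_ROOT'].$_SERVER['SCRIPT_NAME'];
$start = microtime(true);
for ($i = 0; $i < 10000; $i++) {
	clearstatcache();
	filemtime($file);
}
echo "filemtime: ".(microtime(true) - $start)." seconds\n";
$start = microtime(true);
for ($i = 0; $i < 10000; $i++) {
	getlastmod();
}
echo "getlastmod: ".(microtime(true) - $start)." seconds\n";
?>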

At this point, I have removed the fonts cookie and have set a Last-Modified header for robots.php. I now need to decide how to tell caches (and robots) how to cache /robots.txt.

The Expires header tells caches when to expire the cache of an object. As I want the latest robots.txt to always be served, I am not going to use the Expires header yet. I am currently ignoring the two minutes all PHP files are cached for by Varnish.

I'm going to ignore Pragma.

For Cache-Control, I'm going to use public, a max-age of 60 seconds (for testing), and must-revalidate (i.e. once the cached copy is more than 60 seconds old, send an If-Modified-Since rather than reusing it):

<?php
header("Last-Modified: ".gmdate("D, d M Y H:i:s",filemtime($_SERVER['DOCUMENT_ROOT'].$_SERVER['SCRIPT_NAME']))." GMT");
header("Cache-Control: public, max-age=60, must-revalidate");
header("Content-Type: text/plain");
switch ($_SERVER['HTTP_HOST']) {
...
}
?>

And as it works, I can make some temporary changes (until my robots.txt becomes stable):

<?php
header("Last-Modified: ".gmdate("D, d M Y H:i:s",filemtime($_SERVER['DOCUMENT_ROOT'].$_SERVER['SCRIPT_NAME']))." GMT");
header("Cache-Control: public, max-age=7200, must-revalidate, s-maxage=3600, proxy-revalidate");
header("Content-Type: text/plain");
switch ($_SERVER['HTTP_HOST']) {
...
}
?>

Now my robots.txt is cached by proxies (such as Varnish) for an hour (3600 seconds), at which point they should send an If-Modified-Since (Last-Modified value, proxy-revalidate) and not serve stale content.

Clients (such as Googlebot), on the other hand, should cache the robots.txt for 2 hours (7200 seconds), at which point they should send an If-Modified-Since (Last-Modified value, must-revalidate) and not use stale content.

This basically means that a client behind a cache might get served content that is up to 3600 seconds old, with the Age response-header indicating the total time since the first caching server in the chain checked if the file has changed.

For example, say it has been 3610 seconds since Varnish last asked lighttpd If-Modified-Since. A new request comes in and Varnish checks with lighttpd (the origin server) whether the file has changed. If it hasn't, Age gets set to zero and the response to the request says "if you are a caching server, this is valid for [s-maxage]-[Age]=3600 seconds; if you are a client, this is valid for [max-age]-[Age]=7200 seconds".

If, on the other hand, a client asks an ISP caching proxy (once that is correctly configured, that is) for robots.txt and the ISP's copy has an Age of 3500 seconds, it will be sent to the client without revalidation. The client will then use the copy for 7200-3500 seconds before asking the ISP if the file is still valid.

What should be noted, however, is that I have gone from a 2 minute Varnish cache of robots.txt and clients always requesting the latest file, to a 1 hour Varnish cache and clients only requesting the latest file if (a) they have a copy in their cache that is over 2 hours old and the file has been modified, or (b) if they do not have a cached copy.

I still have one modification to make to robots.php:

<?php
$if_modified = isset($_SERVER['HTTP_IF_MODIFIED_SINCE']) ? $_SERVER['HTTP_IF_MODIFIED_SINCE'] : "nothing";
$last_modified = gmdate("D, d M Y H:i:s",filemtime($_SERVER['DOCUMENT_ROOT'].$_SERVER['SCRIPT_NAME']))." GMT";
header("Last-Modified: ".$last_modified);
header("Cache-Control: public, max-age=7200, must-revalidate, s-maxage=3600, proxy-revalidate");
if ($if_modified == $last_modified) {
	header("HTTP/1.1 304 Not Modified");
	header("Status: 304 Not Modified");
	exit();
}
header("Content-Type: text/plain");
switch ($_SERVER['HTTP_HOST']) {
...
}
?>

I am basically using If-Modified-Since like an ETag - if the strings do not exactly match then a 304 is not returned.
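If I ever want stricter HTTP date handling, the comparison could be done on timestamps rather than strings, so any valid HTTP-date a client sends still matches (a sketch, not something I am using):

<?php
$mtime = filemtime($_SERVER['DOCUMENT_ROOT'].$_SERVER['SCRIPT_NAME']);
// strtotime() returns false for an unparseable date, which fails the test:
$if_modified = isset($_SERVER['HTTP_IF_MODIFIED_SINCE']) ? strtotime($_SERVER['HTTP_IF_MODIFIED_SINCE']) : false;
if ($if_modified !== false && $if_modified >= $mtime) {
	header("HTTP/1.1 304 Not Modified");
	header("Status: 304 Not Modified");
	exit();
}
?>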

It should be noted at this point that I am not using Edge Side Includes (ESIs) anywhere yet. That is because I am making optimisations that don't really call for ESIs.

favicon.ico

Something else I noticed while going through the testing of robots.txt is that /img/favicon.ico is being served as Content-Type: application/octet-stream.

sudo nano /etc/lighttpd/conf.d/mime.conf
...
".ico" => "image/x-icon",
...

Edge Side Includes

Edge Side Includes (ESIs) are a way of chopping up a page into pieces and getting caches, like Varnish, to reassemble them. By chopping up pages just right, and making good changes to code, the amount of RAM needed for caching pages can be reduced.

Varnish currently supports 3 ESI instructions:

  1. <esi:include/>
  2. <esi:remove></esi:remove>
  3. <!--esi ...-->

Varnish cannot yet do ESI based on cookies or variables, and it does not implement the rest of the ESI specification (such as esi:choose conditionals). That is why I mentioned at the start of this section that "good changes to code" may be required to fully utilise Varnish.
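For illustration, the three instructions can be combined so that anything which does not process ESI still produces usable output (the include URL here is one of mine from later on this page; the fallback text is made up):

<esi:remove>
  <p>Fallback shown only if ESI is not processed.</p>
</esi:remove>
<!--esi
  <esi:include src="/inc/esi/site_name"/>
-->

When Varnish processes the page, everything inside <esi:remove> is stripped and the <!--esi --> wrapper is removed so the include gets executed; anything that does not understand ESI treats the wrapper as an ordinary HTML comment and shows the fallback instead.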

copyright.php

The only piece of code on my site that I can think of that is wholly reused no matter which domain has been requested is /inc/copyright.php:

</main>
<footer class="row copyright">
<p id="copyright">Copyright &copy; <span class="author">John Cook</span> 2004-15. All rights reserved.</p>
<p id="license">All original content is licensed under a <a href="http://creativecommons.org/licenses/by-sa/2.0/uk/"><img src="/img/by-sa.svg" alt="Creative Commons Attribution-ShareAlike" width="80" height="15" /></a> 2.0 UK: England &amp; Wales License unless otherwise specified.</p>

Yes, it is one of the few Server Side Includes (SSIs) on my site, if not the only one, that is plain old HTML with no PHP at all.

The problem I will have to deal with first, however, is the fact that Varnish normalises Accept-Encoding, so I can't use ESI until I move compression away from lighttpd and into nginx.

sudo nano /etc/php5/cgi/php.ini
...
zlib.output_compression = Off
...
sudo ~/Scripts/sync-webroot-after-update.sh
sudo service lighttpd restart
sudo nano /etc/varnish/user.vcl
...
sub vcl_recv {
...
# Normalise Vary: Accept-Encoding (I believe I have disabled deflate in lighttpd).
#  if (req.http.Accept-Encoding ~ "gzip") {
#    set req.http.Accept-Encoding = "gzip";
#  } else {
#    remove req.http.Accept-Encoding;
#  }

# Normalise Vary: Move Accept-Encoding completely to nginx:
  if (req.http.Accept-Encoding) {
    remove req.http.Accept-Encoding;
  }
...
sudo service varnish restart

Test that pages are coming back from the server uncompressed. And finally...

sudo nano /etc/nginx/nginx.conf
...
http {

gzip on;
gzip_disable "msie6";

gzip_vary on;
gzip_proxied any;
gzip_comp_level 6;
gzip_buffers 16 8k;
gzip_http_version 1.1;
gzip_types text/plain text/css application/json application/x-javascript text/xml application/xml application/xml+rss text/javascript;
...
sudo service nginx restart

At this point, another quick test to confirm pages are being compressed again, and I can now start to look at testing Edge Side Includes (ESIs).

First, I am going to enable ESI processing:

sudo nano /etc/varnish/user.vcl
...
sub vcl_fetch {
...
  set beresp.do_esi = true;
  return(deliver);
}
sudo service varnish reload

And secondly, I am going to modify copyright.php:

<?php $file_is_included = (isset($script_name));?>
<?php include_once $_SERVER['DOCUMENT_ROOT'].'/inc/htmlheader.php';?>
<?php
if (!$file_is_included) {
	$if_modified = isset($_SERVER['HTTP_IF_MODIFIED_SINCE']) ? $_SERVER['HTTP_IF_MODIFIED_SINCE'] : "nothing";
	$last_modified = gmdate("D, d M Y H:i:s",filemtime($_SERVER['DOCUMENT_ROOT'].$_SERVER['SCRIPT_NAME']))." GMT";
	header("Last-Modified: ".$last_modified);
	header("Cache-Control: public, max-age=30, must-revalidate, s-maxage=10, proxy-revalidate");
	if ($if_modified == $last_modified) {
		header("HTTP/1.1 304 Not Modified");
		header("Status: 304 Not Modified");
		exit();
	} else {
?>
</main>
<footer class="row copyright">
<p id="copyright">Copyright &copy; <span class="author">John Cook</span> 2004-15. All rights reserved.</p>
<p id="license">All original content is licensed under a <a href="http://creativecommons.org/licenses/by-sa/2.0/uk/"><img src="/img/by-sa.svg" alt="Creative Commons Attribution-ShareAlike" width="80" height="15" /></a> 2.0 UK: England &amp; Wales License unless otherwise specified.</p>
<?php
	}
} else {
?>
<esi:include src="http://web.johncook.uk/inc/copyright"/>
<?php
}
?>

I am using even shorter cache lifetimes here for testing purposes. With debugging complete, and having seen /inc/copyright being accessed on every page load (when leaving 10+ seconds between loads), I can up the caching times. 3600 seconds max-age and 1800 seconds s-maxage sound good enough for the time being.

varnishlog is showing that I can make some improvements specifically for this ESI include.

sudo nano /etc/varnish/user.vcl
...
sub vcl_recv {
...
  if (req.url == "/inc/copyright") {
    remove req.http.User-Agent;
    remove req.http.Cookie;
    remove req.http.X-Has-Fonts;
    remove req.http.X-Forwarded-Proto;
  }
  
  return (lookup);
}

By removing everything except req.host and req.url, only one item should be in the cache for /inc/copyright per domain. Although I could remove req.host as well, it results in an infinite redirect loop due to a 307 redirect that is in place for bare IP access. I have, instead, used an absolute rather than relative URL.

At this point everything seems to be working, although varnishhist no longer works properly as it appears to only be recognising the ESI requests. Interestingly, however, a direct access of /inc/copyright results in a processing time of 1.0*10^-4 seconds and an ESI include results in a processing time of 1.0*10^-6 seconds (approximately 1 microsecond). I don't know how accurate those numbers are.

Something that will need further investigation is gzip compression. As Varnish is receiving everything uncompressed, that would increase bandwidth use significantly if I were to move the PHP processing in-house. Varnish looks like it can do ESI with gzip-compressed content, but I need to understand the performance recommendations first.

Long-Term Site Parameters

Some parameters are hard-coded in my PHP files because they are only ever going to change rarely. The variable $site_name, for example, is John Cook UK or WatfordJC UK depending on the domain requested. By creating a file, and using a relative URL, I will be able to convert all instances of $site_name to <esi:include src="/inc/esi/site_name"/>.

On the Home Page, for example, the string "John Cook UK & WatfordJC UK" is a hard-coded parameter: $my_sites. Likewise, the string "John Cook (WatfordJC)" is another hard-coded parameter: $my_name.

Something I should remember to be careful about is ESIs inside ESIs. With a relative URL, fixed parameters like /inc/esi/site_name included within an ESI that has been referenced by an absolute URL will presumably take on the fixed parameters for that domain, instead of the originally requested domain.
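To illustrate with hypothetical markup: a WatfordJC UK page contains the absolute include, and the included fragment itself contains a relative include:

<!-- In the page; fetched with Host: web.johncook.uk because the URL is absolute: -->
<esi:include src="http://web.johncook.uk/inc/copyright"/>

<!-- Inside that fragment; now also resolved against web.johncook.uk,
     not the domain the visitor actually requested: -->
<esi:include src="/inc/esi/site_name"/>

The site name delivered inside that fragment would therefore always be John Cook UK, even on WatfordJC UK pages.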

Code duplication is going to become an issue if I am wrapping every ESI page within if/else statements. Can a PHP include be incomplete? If so, I could shift the code to a function within htmlheader.php:

function cache_headers($cache_public,$client_cache_for,$client_revalidate,$proxy_cache_for,$proxy_revalidate) {
	$if_modified = isset($_SERVER['HTTP_IF_MODIFIED_SINCE']) ? $_SERVER['HTTP_IF_MODIFIED_SINCE'] : "nothing";
	$last_modified = gmdate("D, d M Y H:i:s",filemtime($_SERVER['DOCUMENT_ROOT'].$_SERVER['SCRIPT_NAME']))." GMT";
	header("Last-Modified: ".$last_modified);
	$caching = $cache_public ? "Public" : "Private";
	$c_cache_secs = $client_cache_for !== false ? ", max-age=".$client_cache_for : "";
	$c_revalidate = $client_revalidate ? ", must-revalidate" : "";
	$p_cache_secs = $proxy_cache_for !== false ? ", s-maxage=".$proxy_cache_for : "";
	$p_revalidate = $proxy_revalidate ? ", proxy-revalidate" : "";
	header("Cache-Control: ".$caching.$c_cache_secs.$c_revalidate.$p_cache_secs.$p_revalidate);
	if ($if_modified == $last_modified) {
		header("HTTP/1.1 304 Not Modified");
		header("Status: 304 Not Modified");
		exit();
	}
}

I can now simply replace copyright.php with the following:

<?php $file_is_included = (isset($script_name));?>
<?php include_once $_SERVER['DOCUMENT_ROOT'].'/inc/htmlheader.php';?>
<?php
if (!$file_is_included) {
	cache_headers(true,3600,true,1800,true);
?>
</main>
<footer class="row copyright">
<p id="copyright">Copyright &copy; <span class="author">John Cook</span> 2004-15. All rights reserved.</p>
<p id="license">All original content is licensed under a <a href="http://creativecommons.org/licenses/by-sa/2.0/uk/"><img src="/img/by-sa.svg" alt="Creative Commons Attribution-ShareAlike" width="80" height="15" /></a> 2.0 UK: England &amp; Wales License unless otherwise specified.</p>
<?php
} else {
?>
<esi:include src="http://web.johncook.uk/inc/copyright"/>
<?php
}
?>

If the file is not included, then it runs the cache_headers() function. If the file has not been modified, a 304 is sent and all processing halts (the exit function is called). If we're not sending a 304, the end of cache_headers() is reached and the HTML is sent. Otherwise, if the file is included, the ESI include tag is sent.

And with this function created, static variables are now extremely simple:

<?php $file_is_included = (isset($script_name));?>
<?php include_once $_SERVER['DOCUMENT_ROOT'].'/inc/htmlheader.php';?>
<?php
if (!$file_is_included) {
	cache_headers(true,3600,true,1800,true);
	header("Content-Type: text/plain");
	echo $site_name;
}
?>

That is, I think, the most I can condense the code down to.

nav.php

The navigation bar on my site is not completely static. Here is the previous code:

<nav role="navigation" class="top-bar bottom-bar" data-topbar>
<ul class="title-area">
<li class="name">
<a href="/" class="fi-home"><span><esi:include src="/inc/esi/site_name"/></span></a>
</li>
<li class="toggle-topbar menu-icon"><a href="#">Menu</a></li>
</ul>
<section class="top-bar-section">
	<ul class="right">
		<li title="Articles"><a href="/articles" class="fi-book"><span class="text">Articles</span></a></li>
		<li title="Blog Posts"><a href="/blogs" class="fi-pencil"><span class="text">Blogs</span></a></li>
		<li title="Image Gallery"><a href="/gallery" class="fi-photo"><span class="text">Gallery</span></a></li>
		<li title="Music"><a href="/music" class="fi-music"><span class="text">Music</span></a></li>
		<li title="Downloads"><a href="/downloads" class="fi-download"><span class="text">Downloads</span></a></li>
		<li title="Site Links"><a href="/links" class="fi-link"><span class="text">Links</span></a></li>
		<li title="About"><a href="/about" class="fi-info"><span class="text">About</span></a></li>
		<li title="Network Status"><a href="/status" class="fi-graph-bar"><span class="text">Status</span></a></li>
		<li title="Secure Site">
<?php
$link_url = $_SERVER['REQUEST_URI'];
$link_class = "fi-lock HTTPS".$is_HTTPS;
$https_aria_label = "";

switch ($is_HTTPS) {
	case "isNotUsed":
		$link_url = "https://".$site_domain.$link_url;
		$https_aria_label = "This page is also available securely.";
		break;
	case "isUsed":
		$link_url = $link_url."#";
		$https_aria_label = "This page is secure.";
		break;
}
echo "<a href=\"".$link_url."\" class=\"".$link_class."\" rel=\"nofollow\" aria-label=\"".$https_aria_label."\"><span class=\"text\">Secure Site</span></a></li>";
?>
	</ul>
</section>
</nav>
</footer>

There is quite a lot of static HTML in the navigation bar. The only thing that isn't static is the padlock icon, because it is either a link to "#" or a link to the current page over HTTPS (with the domain changed if applicable).

Thus, I can move the first section of static HTML to /inc/esi/nav_head.php:

<?php $file_is_included = (isset($script_name));?>
<?php include_once $_SERVER['DOCUMENT_ROOT'].'/inc/htmlheader.php';?>
<?php
if (!$file_is_included) {
	cache_headers(true,30,true,10,true);
?>
<nav role="navigation" class="top-bar bottom-bar" data-topbar>
<ul class="title-area">
<li class="name">
<a href="/" class="fi-home"><span><esi:include src="/inc/esi/site_name"/></span></a>
</li>
<li class="toggle-topbar menu-icon"><a href="#">Menu</a></li>
</ul>
<section class="top-bar-section">
	<ul class="right">
		<li title="Articles"><a href="/articles" class="fi-book"><span class="text">Articles</span></a></li>
		<li title="Blog Posts"><a href="/blogs" class="fi-pencil"><span class="text">Blogs</span></a></li>
		<li title="Image Gallery"><a href="/gallery" class="fi-photo"><span class="text">Gallery</span></a></li>
		<li title="Music"><a href="/music" class="fi-music"><span class="text">Music</span></a></li>
		<li title="Downloads"><a href="/downloads" class="fi-download"><span class="text">Downloads</span></a></li>
		<li title="Site Links"><a href="/links" class="fi-link"><span class="text">Links</span></a></li>
		<li title="About"><a href="/about" class="fi-info"><span class="text">About</span></a></li>
		<li title="Network Status"><a href="/status" class="fi-graph-bar"><span class="text">Status</span></a></li>
		<li title="Secure Site">
<?php
}
?>

I can likewise move the end section of static HTML content to /inc/esi/nav_tail.php:

<?php $file_is_included = (isset($script_name));?>
<?php include_once $_SERVER['DOCUMENT_ROOT'].'/inc/htmlheader.php';?>
<?php
if (!$file_is_included) {
	cache_headers(true,30,true,10,true);
?>
	</ul>
</section>
</nav>
</footer>
<?php
}
?>

And with that done, /inc/nav.php can now become:

<esi:include src="/inc/esi/nav_head"/>
<?php
$link_url = $_SERVER['REQUEST_URI'];
$link_class = "fi-lock HTTPS".$is_HTTPS;
$https_aria_label = "";

switch ($is_HTTPS) {
	case "isNotUsed":
		$link_url = "https://".$site_domain.$link_url;
		$https_aria_label = "This page is also available securely.";
		break;
	case "isUsed":
		$link_url = $link_url."#";
		$https_aria_label = "This page is secure.";
		break;
}
echo "<a href=\"".$link_url."\" class=\"".$link_class."\" rel=\"nofollow\" aria-label=\"".$https_aria_label."\"><span class=\"text\">Secure Site</span></a></li>";
?>
<esi:include src="/inc/esi/nav_tail"/>

As I said, good changes to code can cut a lot of duplicated code from the Varnish cache. The navigation bar has been reduced to two ESI includes and one hyperlink, with almost everything in the navigation bar now contained in just two files in the cache per domain.

It is a small change, but this one example (nav.php) shows how something that is almost always unique per page can have the surrounding non-unique stuff moved out to an ESI.

One further optimisation that could be made would be that if the page is accessed over HTTPS then another ESI include statement could be used for the padlock icon link. But I'll come back to that as it will require a bit more familiarity with Varnish ESI so I can see just what things are sent in the headers when fetching an ESI.

Actually, the code is right in front of me. If I create /inc/esi/nav_page_secure.php:

<?php $file_is_included = (isset($script_name));?>
<?php include_once $_SERVER['DOCUMENT_ROOT'].'/inc/htmlheader.php';?>
<?php
if (!$file_is_included) {
	cache_headers(true,30,true,10,true);
?>
<a href="#" class="fi-lock HTTPSisUsed" rel="nofollow" aria-label="This page is secure."><span class="text">Secure Site</span></a></li>
<?php
}
?>

I can now rearrange nav.php so that it does what I want:

<esi:include src="/inc/esi/nav_head"/>
<?php
$link_url = $_SERVER['REQUEST_URI'];
$link_class = "fi-lock HTTPS".$is_HTTPS;
$https_aria_label = "";

switch ($is_HTTPS) {
	case "isNotUsed":
		$link_url = "https://".$site_domain.$link_url;
		$https_aria_label = "This page is also available securely.";
echo "<a href=\"".$link_url."\" class=\"".$link_class."\" rel=\"nofollow\" aria-label=\"".$https_aria_label."\"><span class=\"text\">Secure Site</span></a></li>";
		break;
	case "isUsed":
echo "<esi:include src=\"/inc/esi/nav_page_secure\"/>";
		break;
}

?>
<esi:include src="/inc/esi/nav_tail"/>

Now the entire navigation bar is cacheable by Varnish for secure pages. One further optimisation I did was to move the Secure Site span from nav.php and nav_page_secure.php to nav_tail.php.

With everything working properly, I set the caching times for the 3 ESI files to 3600 and 1800 seconds.


This is Just the Start

This is only the start of modifying the code on this site for use of Varnish ESI. What I have done so far is not too complex.

What will require a lot more thought is how to deal with the meat of the pages. Until then, I can make some more small changes like I have done throughout this page.

One thing that should be noted is that Varnish does not (yet?) send an If-Modified-Since request header when a cached ESI is refetched. Despite this, I have implemented the Cache-Control headers for the ESI files so that if such a feature is later implemented by Varnish there should be a performance increase without needing to make any changes.

Another thing I have yet to do is implement a way to purge a particular cached object. If I make a mistake and that mistake is cached, I currently need to restart Varnish (purging the entire cache) or remember the command-line syntax to ban a URL. This is something else that needs further investigation.
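For the record, the Varnish 3 ban syntax is along these lines (a sketch from memory, to be checked against the documentation before relying on it):

varnishadm "ban.url ^/robots.txt$"
varnishadm "ban req.http.host == web.johncook.uk && req.url ~ ^/inc/esi/"

The first form bans by URL regex across all hosts; the second also matches on the Host header.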

Finally, I need to find out just how Varnish can deal with ESI and gzip-compression (AKA ESI+GZIP).

In order to avoid potential pitfalls up to this point I moved gzip compression from within PHP and lighttpd (backend server) to within nginx (HTTP/HTTPS termination server).

Although this means Varnish is no longer varying by Accept-Encoding, and is thus using less memory on the occasions a client without gzip support requests a page, it is perhaps not the most efficient way of doing things.

So... how does Varnish deal with gzip-compressed ESI?

Varnish ESI+gzip

Varnish can handle gzip-compressed content that includes ESI includes. First, I need to reverse what I did earlier on this page, and then make suitable modifications.

sudo nano /etc/php5/cgi/php.ini
...
zlib.output_compression = On
...
zlib.output_compression_level = 9
sudo ~/Scripts/sync-webroot-after-update.sh
sudo service lighttpd restart
sudo nano /etc/varnish/user.vcl
...
sub vcl_recv {
...

# If an image is requested, do not cache it, nor do anything special.
  if (req.url ~ "\.(?i)(jpg|jpeg|png|bmp)$") {
    return (pipe);
  }
...

# Normalise Vary: Accept-Encoding (I believe I have disabled deflate in lighttpd).
  if (req.http.Accept-Encoding) {
    # All PHP files on the site, with few exceptions (robots.txt) do not have a dot in the path:
    if (req.url ~ "\.") {
      remove req.http.Accept-Encoding;
    } elsif (req.http.Accept-Encoding ~ "gzip") {
      set req.http.Accept-Encoding = "gzip";
    } else {
      remove req.http.Accept-Encoding;
    }
  }

# Normalise Vary: Move Accept-Encoding completely to nginx:
#  if (req.http.Accept-Encoding) {
#    remove req.http.Accept-Encoding;
#  }
...

  set req.http.Surrogate-Capability = "varnish=ESI/1.0";
  
  return(lookup);
}
sudo service varnish restart
sudo nano /etc/nginx/nginx.conf
...
http {

gzip off;
...
sudo service nginx restart
sudo nano /etc/varnish/user.vcl
...
sub vcl_hit {
  return(deliver);
}

sub vcl_miss {
# Uncomment following line to disable gzip compression between Varnish and Backend:
#  unset bereq.http.Accept-Encoding;
}

sub vcl_deliver {
...
  if (resp.http.Vary) {
    if (resp.http.Vary !~ "Accept-Encoding") {
      set resp.http.Vary = resp.http.Vary + ", Accept-Encoding";
    }
  } else {
    set resp.http.Vary = "Accept-Encoding";
  }
  return(deliver);
}

sub vcl_fetch {
# All text/* MIME types can probably be compressed (e.g. text/javascript):
  if (beresp.http.content-type ~ "^text/.+") {
    set beresp.do_esi = true;
    set beresp.do_gzip = true;
# All *xml* MIME types can probably be compressed (e.g. image/svg+xml):
  } elsif (beresp.http.content-type ~ "xml") {
    set beresp.do_esi = true;
    set beresp.do_gzip = true;
  }
...
  return(deliver);
}

sub vcl_pass {
# If req.url doesn't contain a dot, it is most likely a PHP file with stripped extension:
  if (req.url !~ "\.") {
# Comment following line to disable compression between Varnish and Backend:
    set bereq.http.Accept-Encoding = "gzip";
# All ESI includes that do not contain ESI includes themselves should be compressed.
  } elsif (req.url ~ "^/inc/esi/") {
    set bereq.http.Accept-Encoding = "gzip";
  }
}
sudo service varnish reload

Varnish now makes most requests to the backend with gzip Accept-Encoding, and uncompresses responses to check for ESI include tags. It then merges the compressed response with the compressed ESI includes using some magic I don't understand, makes it cacheable, and then sends it to the client.

The reason I have upped PHP compression from the default of -1 to 9 is that Varnish caches everything that doesn't have a Cache-Control header for 2 minutes. That means if / is requested, Varnish stores the cached copy for two minutes, and varnishstat suggests that Varnish does no compression/decompression on a cache hit even if the hit object includes ESI include tags.

Now, a look at the source code of the home page of this site shows a lot of possibilities for ESI includes:

  • Certain Internet Explorer User-Agents get sent an extra header.
  • Everything before the <title> tag is the same throughout the site.
  • The stylesheet link tag is the same for everyone.
  • The fonts stylesheet link tag is added for all clients with the fonts cookie.
  • The in-line CSS for positioning the site navigation bar at the top (and moving the lead section upwards if JavaScript is disabled) is the same for everyone.
  • The dns-prefetch link tags are the same for everyone.
  • The in-line CSS for the highlighting effects of the navigation bar are the same for everyone, but are specific to the sections of the site (e.g. Home Page, Articles, Blogs, etc.)
  • The </head> closing tag, <body> and <main> tags are the same for everyone for all pages.
  • The opening code for every included article/blog post is similar, as is the closing code, and the whole article/blog summary include is the same for everyone.
  • The tracked changes in-line JavaScript code is the same for everyone, and since the form controls for it have been commented out the JavaScript is a waste of bandwidth.
  • The InitiateFoundation() in-line JavaScript function differs by User-Agent and by fonts cookie.
  • The deferred JavaScript include which calls InitiateFoundation() after loading is the same for everyone.
  • The fonts loader in-line JavaScript is the same for everyone, but it is not included if they have the fonts cookie or are a Web crawler.
  • The fonts activated in-line JavaScript is the same for everyone.
  • The fonts cookie in-line JavaScript setter is using the wrong domain, but it is the same for all non-crawler users.
  • The Twitter share in-line JavaScript is the same for every page and user, although it does slow down page loading times.
  • The closing of elements </body> and </html> are the same for every user.

With the above list, I can see that quite a lot of work is needed, but some things will have a big impact on the number of things Varnish needs to cache.

Take the first item, for example. If I move that to an ESI include file then those versions of Internet Explorer can be merged into the normalised "IE 7-10" User-Agent.
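In user.vcl terms that would be something like the following (a sketch - the exact regex and the name of the normalised class are assumptions based on my existing User-Agent normalisation):

sub vcl_recv {
  # Once the IE-specific header comes from an ESI include, IE 7-10 can
  # collapse into a single cache variant:
  if (req.http.User-Agent ~ "MSIE (7|8|9|10)\.") {
    set req.http.User-Agent = "IE 7-10";
  }
}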

If I do the same thing for the in-line JavaScript code for out-of-date Web browsers, and for the code that differs for crawlers or varies by fonts cookie, then I can stop hashing on User-Agent and cookie except when hashing one of the aforementioned files.

Although the Vary headers will still be sent to clients, it should be possible to further amend my user.vcl file so that if ESI/1.0 support is advertised downstream of Varnish (e.g. by another proxy between nginx and the Web client) the output is sent without evaluating the ESI include statements.
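Something like the following might do it (an untested sketch; X-Downstream-ESI is a made-up marker header, and my current vcl_recv overwrites the incoming Surrogate-Capability header, so the downstream capability has to be stashed before that happens):

sub vcl_recv {
  # Remember whether something downstream advertised ESI support:
  if (req.http.Surrogate-Capability ~ "ESI/1.0") {
    set req.http.X-Downstream-ESI = "1";
  }
  set req.http.Surrogate-Capability = "varnish=ESI/1.0";
}

sub vcl_fetch {
  # Only process ESI in Varnish if nothing downstream can do it:
  if (!req.http.X-Downstream-ESI) {
    set beresp.do_esi = true;
  }
}

The complication (and the reason this is only a sketch) is that processed and unprocessed versions of the same page would need to be cached as separate variants.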

As an ESI-supporting downstream proxy would be fetching the ESI includes via Varnish, the Cache-Control and Age headers will be sent, and (if supported) If-Modified-Since request headers could further reduce bandwidth usage. That is to say, a corporate proxy could benefit both its bandwidth usage and my own.