Fixing Obscure Bugs: Apache, GZip, ETags, & Edge Compute

This post highlights an interesting use-case for using edge compute to solve an obscure performance bug with Apache's GZip module.

I was recently introduced to an obscure performance bug with Apache’s GZip module, and a clever solution using EdgeWorkers, so I made a video and this short blog post to cover it.

One thing I love about my job is getting to work with people who are way smarter than me at things I never had the interest, or the opportunity, to work on.

For example, although Apache is arguably the most popular HTTP server technology on the planet, it’s not without flaws. If you look back through the bug reports, you may find one called “Incorrect ETag on gzip:ed content”, which was reported in 2006 by Henrik Nordstrom.

Henrik noticed that entities GZipped by the mod_deflate module still carried the same ETag as the plain entities, which caused inconsistencies in ETag-aware proxy caches.

If you’re not familiar, ETags are HTTP response headers that are basically used to identify a specific version of a resource. The value of the ETag is often a hash that’s generated from the file contents, the last-modified date, or some combination of the two.

It might look something like this:

ETag: 27dc5-556d73fd7fa43
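
The value above has the shape of two hexadecimal numbers joined by a hyphen. As a rough illustration, here’s a minimal sketch of a generator that produces values with that shape. I’m assuming the two parts are the file size and the modified time; the `makeEtag` function and its parameters are hypothetical, and a real server’s scheme depends on how it’s configured.

```javascript
// Hypothetical generator: "<size in hex>-<modified time in hex>".
// This is an assumed scheme for illustration, not a confirmed server format.
function makeEtag(sizeBytes, mtimeMicros) {
  return `${sizeBytes.toString(16)}-${mtimeMicros.toString(16)}`;
}

console.log(makeEtag(255, 4096)); // "ff-1000"
```

The useful property is that the value changes whenever the file does, so comparing two ETags is a cheap way to ask “is this still the same version?”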

If an HTTP request includes an If-None-Match header whose value matches the resource’s current ETag, a proxy cache (like a CDN) can assume that the file has not been modified, and may respond with a 304 “Not Modified” status and no body.

In theory, this would be faster and save bandwidth.
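
Here’s a minimal sketch of that check, just to make the flow concrete. The `respond` function and the shape of its arguments are hypothetical, and it assumes a single ETag value in If-None-Match (real requests can carry a list, and weak validators, which this ignores).

```javascript
// Hypothetical cache-side check: choose between 304 and 200.
// Assumes If-None-Match holds exactly one ETag value.
function respond(requestHeaders, resource) {
  const ifNoneMatch = requestHeaders['if-none-match'];
  if (ifNoneMatch !== undefined && ifNoneMatch === resource.etag) {
    // The client's copy is still current, so no body is sent
    return { status: 304, headers: { ETag: resource.etag }, body: '' };
  }
  // No validator, or a stale one: send the full payload
  return { status: 200, headers: { ETag: resource.etag }, body: resource.body };
}

const page = { etag: '27dc5-556d73fd7fa43', body: '<h1>Hello</h1>' };
console.log(respond({}, page).status);                                         // 200
console.log(respond({ 'if-none-match': '27dc5-556d73fd7fa43' }, page).status); // 304
```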

When you enable the GZip compression module in Apache, it will actually append a little “-gzip” suffix to the end of these generated hash values for the ETags.

Unfortunately, when Apache goes to compare the If-None-Match header (“27dc5-556d73fd7fa43-gzip”) to the ETag value (“27dc5-556d73fd7fa43”), it doesn’t account for the suffix, so the two never match.

As a result, it’s always going to serve the entire resource payload, which is ironic considering ETags and GZip are both ways to, in theory, save bandwidth.
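
To make the mismatch concrete, here’s a small sketch that contrasts a strict comparison with one that ignores the suffix. All three helper names are hypothetical; this illustrates the idea and is not Apache’s actual comparison code.

```javascript
// Strict comparison: roughly the behavior that causes the problem.
function strictMatch(ifNoneMatch, etag) {
  return ifNoneMatch === etag;
}

// Drop a trailing "-gzip", keeping the closing quote of a quoted ETag if present.
function stripGzipSuffix(value) {
  return value.replace(/-gzip("?)$/, '$1');
}

// Suffix-tolerant comparison: normalize both sides before comparing.
function tolerantMatch(ifNoneMatch, etag) {
  return stripGzipSuffix(ifNoneMatch) === stripGzipSuffix(etag);
}

console.log(strictMatch('27dc5-556d73fd7fa43-gzip', '27dc5-556d73fd7fa43'));   // false
console.log(tolerantMatch('27dc5-556d73fd7fa43-gzip', '27dc5-556d73fd7fa43')); // true
```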

I hope I was able to explain this issue clearly, but if I wasn’t, there’s an excellent blog post entitled “Apache, ETag, and ‘Not Modified’” on Flameeyes’s Weblog. It even provides the solution.

So, if the problem is already solved, why am I talking about this today?

Well, the solution requires that you modify the Apache config. But as a developer these days, I know all too well that there are many cases where we can’t do that ourselves. I may not have permissions, or I may be working with a third-party service.

And here’s where the interesting solution comes up.

Akamai has a product called EdgeWorkers, which can basically sit in between a client and an origin server and run JavaScript logic.

So when a client makes a request, the EdgeWorker can intercept the request and modify it on its way to the origin server. Or when an origin server makes a response, the Akamai EdgeWorker can intercept that response and modify it before it gets passed along back to the client.

In theory, a developer who doesn’t have access to the Apache config can still monkey patch a temporary fix until the problem can be fixed properly at the Apache config level.

It might look something like this:

// Modify origin responses
export function onOriginResponse(request, response) {
  // getHeader() returns an array of values, or undefined if the header is absent
  const serverName = (response.getHeader('Server') || [])[0] || '';
  const etag = (response.getHeader('ETag') || [])[0] || '';
  // Match a trailing '-gzip', with or without the closing quote of a quoted ETag
  const gzipSuffix = /-gzip("?)$/;
  // Only touch Apache responses whose ETag carries the suffix
  if (serverName.startsWith('Apache') && gzipSuffix.test(etag)) {
    // Strip the suffix but keep the closing quote, if any
    response.setHeader('ETag', etag.replace(gzipSuffix, '$1'));
  }
}
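
The snippet above handles the response side. Clients that already cached a “-gzip” ETag will keep sending it back, so a request-side counterpart might also be worth having. Here’s a rough sketch under a few assumptions: the `dropGzipSuffix` helper is hypothetical, it only looks at the first If-None-Match value and treats it as a single ETag, and in a real EdgeWorker’s main.js the handler would be declared with `export`.

```javascript
// Hypothetical helper: drop a trailing "-gzip", keeping any closing quote.
function dropGzipSuffix(value) {
  return value.replace(/-gzip("?)$/, '$1');
}

// Modify requests on their way to the origin.
// In main.js this would be `export function onOriginRequest(request) { ... }`.
function onOriginRequest(request) {
  // getHeader() returns an array of values, or undefined if the header is absent
  const values = request.getHeader('If-None-Match');
  if (values && values.length > 0) {
    // Assumption: a single ETag in the first value; lists are not handled here
    request.setHeader('If-None-Match', dropGzipSuffix(values[0]));
  }
}
```

With both handlers in place, the origin would only ever see un-suffixed values in If-None-Match, which is what it compares against.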

I realize that this may not actually apply to a lot of folks out there, but I wanted to share it because it strikes a nice balance of obscure bugs, deep technical knowledge, and only a few lines of code to fix. Especially when you include the Akamai EdgeWorkers part, it’s just a cool, elegant approach that I never would have thought of before.

So I hope you also found it interesting, even if it’s not immediately relevant to you today.

Thank you so much for reading. If you liked this article, and want to support me, the best ways to do so are to share it, sign up for my newsletter, and follow me on Twitter.


Originally published on austingil.com.
