11 Varnish Web Acceleration Techniques to Shield Your Slow Legacy Web Application

August 26th, 2023

As web applications grow in popularity, they often struggle to scale to meet increasing traffic demands. The application servers become overloaded, response times suffer, and the user experience deteriorates. Many sites turn to cache solutions like Varnish to improve performance and reduce load on their backends.

Varnish is a caching HTTP reverse proxy that stores common responses in memory and serves them directly, bypassing application servers entirely for cached content. This greatly reduces the work required from the backend application servers. However, Varnish only provides significant benefit if it can serve a high percentage of requests from cache, a figure known as the hit rate.

Tuning Varnish appropriately is crucial to achieving a high hit rate. There are many techniques that can optimize Varnish to cache more content and keep it cached longer. With the right configuration, sites can often improve their hit rate from below 50% to over 90%, with corresponding reductions in application server load.

This essay will explore 11 key optimization techniques to tune Varnish caching and attain high hit rates:

1. Set Cache-Control Headers

The Cache-Control header instructs caches how to handle content. It is one of the primary mechanisms for controlling cache behavior. Varnish looks at Cache-Control to determine if a response can be cached and for how long.

Backend applications should set the max-age directive appropriately in Cache-Control. This tells caches the maximum time they can reuse a response before revalidating it. Higher max-age values allow longer caching.

For example:

Cache-Control: max-age=3600

This caches the content for one hour before revalidating. Adjust max-age based on how frequently content changes. Use the longest duration that will maintain freshness.

Setting no-cache or no-store instead prevents Varnish's built-in VCL from caching the response at all. And private tells shared caches like Varnish not to store the response.

So proper use of Cache-Control is imperative to maximize Varnish hit rates.

2. Adjust TTLs

The TTL (Time To Live) determines the cache lifetime of an object in Varnish. After the TTL expires, the object is considered stale and has to be fetched from the backend again.

Varnish computes a TTL based on Cache-Control headers and other factors. But Varnish also allows manually overriding TTLs in VCL configuration.

For example:

sub vcl_backend_response {
  # Force a 4 hour TTL for legacy URLs, whatever headers the backend sent
  if (bereq.url ~ "^/legacy/") {
    set beresp.ttl = 4h;
  }
}

This sets a 4 hour TTL for legacy application URLs that aren't cache-friendly. Overriding with longer TTLs where appropriate keeps more content cached longer.

But beware of setting TTLs too long, since Varnish will keep serving the old copy after the content has changed. Monitor how often content changes and adjust accordingly.
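
One way to guard against overly long lifetimes is to cap whatever TTL Varnish has computed. The following is a minimal sketch; the 24 hour ceiling is an illustrative assumption rather than a recommended value, and should be chosen per application:

```vcl
sub vcl_backend_response {
  # Assumed ceiling: never treat an object as fresh for more than 24 hours,
  # whatever max-age the backend sent.
  if (beresp.ttl > 24h) {
    set beresp.ttl = 24h;
  }
}
```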

3. Enable Grace Mode

Grace mode instructs Varnish to keep serving a cached object for a period after its TTL expires. This grace window prevents a "thundering herd" of requests from hitting backends all at once when popular objects expire.

For example:

sub vcl_backend_response {
  set beresp.grace = 1h;
}

This allows Varnish to serve a cached object for up to 1 hour past its TTL. The grace period gives Varnish time to fetch updated content in the background.

Grace mode greatly reduces traffic spikes to the backend. Set grace long enough to comfortably refresh objects in the background.
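
A long grace period is most useful when the backend is down, and less desirable when it is healthy. The sketch below, modelled on the pattern in the Varnish 6 user guide, keeps the long grace on the object but caps it per request while the backend is healthy. It assumes a Varnish version that supports req.grace and a backend with a health probe configured; the 10s and 1h values are illustrative:

```vcl
import std;

sub vcl_recv {
  # While the backend is healthy, accept objects at most 10s past their TTL
  if (std.healthy(req.backend_hint)) {
    set req.grace = 10s;
  }
}

sub vcl_backend_response {
  # Keep objects for up to 1 hour past their TTL in case the backend fails
  set beresp.grace = 1h;
}
```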

4. Normalize Request Headers

Varnish keeps a separate cached variant for every distinct value of a request header that the response varies on. So a request with "en-gb" in Accept-Language gets a different cached response than one with "en-us", even if the content is identical.

This cache fragmentation severely impacts hit rates. Normalizing headers combines these variations:

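sub vcl_recv {
  # Collapse en-gb, en-us and other English variants into a single value
  if (req.http.Accept-Language ~ "(?i)^en") {
    set req.http.Accept-Language = "en";
  }
}

Now Varnish will treat all English language variants the same. Apply the same normalization to headers like Cookie and User-Agent. Accept-Encoding rarely needs it, because Varnish normalizes that header itself when its gzip support is enabled.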

5. Filter Cookies

By default, Varnish's built-in VCL passes any request that carries a Cookie header to the backend, to avoid caching personalized content. This often hinders caching significantly.

But many cookies, such as analytics cookies, don't affect the response. The bluntest filter strips every cookie outside the paths that actually need them:

sub vcl_recv {
  # Only /user/ pages depend on cookies; drop them everywhere else
  if (req.url !~ "^/user/") {
    unset req.http.Cookie;
  }
}

Now only /user/ requests keep their cookies and bypass the cache. Every other request is looked up without cookies and can be served from cache.

Implement cookie filtering rules cautiously to avoid caching personalized content improperly. But huge gains are possible by filtering non-essential cookies.
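
When some pages need a session cookie and cannot be separated by path, a finer filter removes only the cookies known to be irrelevant. The following is a sketch under assumptions: the cookie names (_ga, _gid, _gat and the __utm family) are examples of analytics cookies and should be replaced with whatever the site actually sets:

```vcl
sub vcl_recv {
  if (req.http.Cookie) {
    # Remove the assumed analytics cookies, keeping all others
    set req.http.Cookie = regsuball(req.http.Cookie,
        "(^|;\s*)(_ga|_gid|_gat|__utm[a-z]+)=[^;]*", "");
    # Clean up a separator left at the start of the header
    set req.http.Cookie = regsub(req.http.Cookie, "^;\s*", "");
    # If nothing is left, drop the header so the request can be cached
    if (req.http.Cookie ~ "^\s*$") {
      unset req.http.Cookie;
    }
  }
}
```

With this in place, a request carrying only analytics cookies is looked up in cache, while a request that also carries a session cookie still passes to the backend.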

6. Watch out for Vary

The Vary header tells caches that responses differ based on specific request headers. This requires caching multiple variants.

For example:

Vary: User-Agent

This means Varnish must cache a separate copy for every observed User-Agent. Even the slightest difference between user agent strings creates another redundant cache object.

Ideally applications should not set Vary arbitrarily. But when they do, normalize the associated request headers, just as in technique 4. Otherwise cache efficiency suffers drastically.
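
For a backend that sends Vary: User-Agent, the header can be reduced to a handful of classes before the cache lookup. This is a minimal sketch; the two classes and the matching pattern are assumptions chosen for illustration, and the backend will only see the rewritten value:

```vcl
sub vcl_recv {
  # Assumed device classes; extend the pattern to suit the application
  if (req.http.User-Agent ~ "(?i)mobile|android|iphone") {
    set req.http.User-Agent = "mobile";
  } else {
    set req.http.User-Agent = "desktop";
  }
}
```

Varnish then stores at most two variants per URL instead of one per distinct browser string.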

7. Handle Uncacheable Content

Not all content can be cached, like dynamic or personalized data. But we still want an efficient caching strategy for uncacheable content.

Use pass to send requests that are known to be uncacheable straight to the backend. For example:

sub vcl_recv {
  if (req.url ~ "^/api/") {
    return(pass);
  }
}

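Now /api/ requests bypass the cache completely. Varnish won't spend any effort looking up or storing responses that can never be reused.

For responses that only turn out to be uncacheable once the backend has answered, Varnish can store a short-lived hit-for-pass or hit-for-miss marker instead. Later requests for the same URL then go to the backend in parallel rather than queueing behind one another. A hit-for-miss marker also lets the object become cacheable again as soon as the backend returns a cacheable response, for example once it stops setting cookies.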
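
A hit-for-miss marker can be sketched as follows. This assumes a Varnish version in which an uncacheable object behaves as hit-for-miss (the documented behavior from Varnish 5.0 onward); the Set-Cookie condition and the 120s lifetime are illustrative choices:

```vcl
sub vcl_backend_response {
  if (beresp.http.Set-Cookie) {
    # Remember for 2 minutes that this URL is not cacheable right now,
    # so concurrent requests are not serialized behind a single fetch
    set beresp.uncacheable = true;
    set beresp.ttl = 120s;
    return (deliver);
  }
}
```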

8. Offload Static Assets

Static assets like images, CSS and JS are cheap for a web server to deliver from disk, so caching them in Varnish gains comparatively little. Storing them there also consumes cache space that could hold expensive dynamic responses.

Offload static assets to a cookieless domain like static.example.com. Configure this domain to skip Varnish and hit backend servers directly. Varnish no longer spends space on static content.

In dynamic pages, point static asset references at the static domain. This improves cache efficiency tremendously.
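
If the static domain cannot be routed around Varnish, a rough approximation is to let its traffic flow through without being stored. The sketch below assumes the static.example.com host name from the text and that both domains share the same Varnish instance:

```vcl
sub vcl_recv {
  # Static host: fetch from the backend every time, store nothing
  if (req.http.Host == "static.example.com") {
    return (pass);
  }
}
```

This keeps the cache free for dynamic responses, although the requests still consume Varnish connections, so bypassing Varnish entirely remains the cleaner option.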

9. Limit Crawl Rate

Aggressive crawlers can wreak havoc by slamming backends when cached objects inevitably expire.

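Varnish itself has no built-in rate limiter, but the vsthrottle VMOD from the varnish-modules collection can restrict crawling activity. For example, assuming that VMOD is installed:

import vsthrottle;

sub vcl_recv {
  # Deny a bot client once it exceeds 5 requests within 1 second
  if (req.http.User-Agent ~ "(?i)bot" &&
      vsthrottle.is_denied(client.identity, 5, 1s)) {
    return (synth(429, "Too Many Requests"));
  }
}

This limits each crawler client to 5 requests per second and answers the excess with a 429 response. Configure rate limits wisely to find a balance between crawl needs and capacity.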

10. Purge Content

When content changes, the cached copy needs to be purged so that new content is fetched. Varnish can actively invalidate content in response to PURGE requests, provided the VCL handles that method and restricts who may send it.

For example, when a post is updated:

PURGE /blog/post/123

This immediately invalidates the old post so new content is cached on next request.

Purging proactively keeps content fresh rather than waiting for TTLs to expire stale content.
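
Varnish does not act on PURGE unless the VCL tells it to. A minimal sketch of such a handler follows; the ACL name purgers and the addresses in it are assumptions and should list the hosts that are really allowed to purge:

```vcl
# Hypothetical ACL: only these clients may send PURGE
acl purgers {
  "localhost";
  "192.0.2.0"/24;
}

sub vcl_recv {
  if (req.method == "PURGE") {
    if (client.ip !~ purgers) {
      return (synth(405, "Not allowed"));
    }
    # Remove the object matching this request's URL and Host
    return (purge);
  }
}
```

Because the lookup uses both URL and Host, the PURGE request has to carry the same Host header as the page it invalidates.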

11. Monitor Performance

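Continuously monitor key metrics to tune Varnish. varnishstat exposes the counters needed to derive them:

hit rate - MAIN.cache_hit / (MAIN.cache_hit + MAIN.cache_miss), target over 90%
cache miss ratio - at a 90% hit rate, no more than 1 request in 10 should miss
backend requests (MAIN.backend_req) - minimize these to ease load
object bloat (MAIN.n_object, MAIN.n_lru_nuked) - trim down needlessly cached content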

Monitor these metrics on an ongoing basis and optimize configurations to gradually improve. Don't forget, incremental gains add up over time!
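
Aggregate counters do not show which URLs are missing the cache. A common debugging aid, sketched here, is to label each response so hits and misses can be checked per request; the X-Cache header name is a convention assumed for this example, not something Varnish sets itself:

```vcl
sub vcl_deliver {
  # obj.hits is greater than zero when the object was served from cache
  if (obj.hits > 0) {
    set resp.http.X-Cache = "HIT";
  } else {
    set resp.http.X-Cache = "MISS";
  }
}
```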

Conclusion

Achieving high hit rates with Varnish requires continual optimization. Implementing techniques like the 11 covered in this essay can drastically boost performance. Sites often see hit rate gains of 40 percentage points or more with appropriate tuning.

The savings in backend load, traffic costs and latency greatly improve the user experience as traffic scales up. The difference between a 50% and a 90% cache hit rate is tremendous in actual capacity: out of 1,000 requests, the backend handles 500 at a 50% hit rate but only 100 at 90%, a fivefold reduction.

But attaining those high hit rates involves trade-offs in complexity and maintenance. There is no "one size fits all" configuration. Each application's needs and content require custom optimization.

Careful tuning and diligent monitoring form an iterative process. Work through these caching techniques step by step to improve Varnish response times and take load off struggling backends. The effort is well worth the substantial gains in site performance and scalability that result from making the most of Varnish's caching capabilities.

Frequently Asked Questions

How can I optimize Cache-Control headers?

Set the max-age directive to the longest period for which the content stays fresh. Use higher max-age values for content that rarely changes. Avoid no-cache and no-store on content you want cached, since they block caching.

Should I manually adjust TTLs in Varnish?

Yes, override default TTLs with higher values via custom VCL code to keep cacheable content cached longer. But don't use extremely long TTLs that would serve stale content.

What are the benefits of enabling grace mode?

Grace mode reduces load spikes on backends by serving stale cache briefly during revalidations. Enable grace mode to avoid thundering herd problems.

Why normalize request headers?

Normalizing headers like User-Agent collapses many cached variants of the same content into a few, which raises cache efficiency.

How can filtering cookies help?

Caching more content is possible by filtering out non-essential cookies like analytics cookies that don't impact cacheability.

What's wrong with overusing Vary?

Too much Vary usage leads to an explosion of unnecessarily cached variants. Normalize associated request headers to ease this cache fragmentation.

How should I handle uncacheable content?

Use pass for requests that are known to be uncacheable, and hit-for-pass or hit-for-miss for responses that can't be cached now but may become cacheable later.

Will Redis help increase hits?

Not directly. Redis works as a separate caching layer behind Varnish, so it does not raise Varnish's hit rate, but it can make the misses for dynamic or personalized content cheaper for the backend to answer.

Should I offload static assets?

Offloading static assets improves cache density. Serving them outside Varnish frees cache space for dynamic responses that are expensive to regenerate.

Why rate limit crawlers?

Aggressive crawler traffic can easily overwhelm backends as cached objects expire. Enforce rate limits to prevent cascade failures during spikes.

Is proactive purging beneficial?

Yes, proactively purge old content immediately when updated rather than just letting caches slowly expire stale content.

Should I monitor caching performance?

Continuously monitor key metrics like hit rate, misses, and object bloat. Use this data to further optimize Varnish configurations for even better caching.