How to cache pages by role in Varnish

Usually Varnish is used to cache pages for anonymous users but it's possible to cache pages by role as well. We have used this technique for corporate sites that are only visible under SSO. As a result, all users are authenticated but there is no user related information present on most of these pages. In this article, we'll describe how Varnish caches the pages and how the configuration can be changed to cache pages by role.

How Varnish works

Here is how Varnish works for most pages that can be cached. This is based on most Varnish configurations that I have seen being used with Drupal. When Varnish receives a page request, it removes all the cookies that are not useful for the Drupal backend, such as Google Analytics, has_js, etc. Now if there are cookies remaining such as the Drupal session cookies and ones set by Drupal modules, then Varnish does not look up the page in its cache and asks the webserver to provide that page. This is because it assumes that since there is a session cookie present, then the backend will provide a page that is specific to a user.
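The behaviour described above can be sketched in VCL roughly as follows. This is an illustration of the common pattern in Varnish 3 syntax, not a copy of any particular configuration. The regular expression (cookies whose names start with an underscore, plus has_js) is an assumption about which cookies your site can safely ignore.

```vcl
sub vcl_recv {
  if (req.http.Cookie) {
    # Drop cookies the backend does not need: has_js and cookies whose names
    # start with "_" (for example the Google Analytics ones). Assumed list.
    set req.http.Cookie = regsuball(req.http.Cookie, "(^|;\s*)(_[_a-z]+|has_js)=[^;]*", "");
    # Remove a leading "; " left behind by the substitution above.
    set req.http.Cookie = regsub(req.http.Cookie, "^;\s*", "");

    if (req.http.Cookie ~ "^\s*$") {
      # Nothing relevant is left, so the request can be looked up in the cache.
      unset req.http.Cookie;
    } else {
      # A session or module cookie remains: assume the page is user-specific.
      return (pass);
    }
  }
}
```

With a configuration like this, any request that still carries a Drupal session cookie bypasses the cache, which is exactly the behaviour the rest of this article changes.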

New configuration

In our case, even though a user is logged in (session cookie is present), we have pages that are not user-specific but are role-specific, i.e. they change from one role to another but not from one user to another if both of them have the same role. So all users who are editors will see exactly the same HTML on a page and all users who are administrators will see exactly the same HTML on a page. If there are roles for which the HTML is not identical from one user to another, then you can skip caching in Varnish for those roles. For this to work, Varnish needs to know what the role is for each page request. Unfortunately a session cookie does not provide that information. So we will set another cookie in Drupal to identify the roles and we will use it in Varnish to cache pages by role.

Here is what we did. We created a Drupal module named varnish_cache_role to enable this functionality. The module needs to set a cookie that identifies the user's roles, and it does so in hook_init(). Here is the code:

/**
 * Implements hook_init().
 */
function varnish_cache_role_init() {
  global $user;

  // All roles of the current user, separated by commas.
  $roles = implode(',', array_values($user->roles));

  // PHP replaces the dot in a cookie name with an underscore when it fills
  // $_COOKIE, so the "Drupal.roles" cookie is read back as "Drupal_roles".
  // Set the cookie (valid for 30 days) if it is missing or out of date.
  if (empty($_COOKIE['Drupal_roles']) || $_COOKIE['Drupal_roles'] != $roles) {
    setcookie('Drupal.roles', $roles, time() + 60 * 60 * 24 * 30, '/');
  }
}

We need to remove this cookie when the user logs out. This can be done in hook_user_logout().

/**
 * Implements hook_user_logout().
 */
function varnish_cache_role_user_logout() {
  // Set the expiry date of the cookie to 1 so that it expires immediately.
  setcookie('Drupal.roles', '', 1, '/');
}

Now we need to change Varnish configuration so that it can parse the Drupal.roles cookie and cache pages based on it with the option to not cache pages for some roles. For this we need to first install Varnish from source and then install Varnish's Cookie module. Once you are done with these two steps, restart Varnish and then change the default.vcl file as below.
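Before making those changes, note that VCL can only call cookie.parse() and cookie.get() once the module has been imported. A minimal sketch, assuming the Cookie module is installed as a VMOD named cookie (which is what the function names used in this article suggest):

```vcl
# Top of default.vcl, before any sub that calls the cookie functions.
# Assumes the Cookie VMOD is installed under the name "cookie".
import cookie;
```

If the import fails when Varnish compiles the VCL, the module was most likely not installed into the VMOD directory of the Varnish build you are running.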

Generally Varnish's default.vcl has the following code:

if (req.http.Authorization || req.http.Cookie) {
  /* Not cacheable by default */
  return (pass);
}

Replace it by the following:

if (req.http.Authorization) {
  /* Not cacheable by default */
  return (pass);
}

So now Varnish doesn't automatically return pass if there is a cookie present. At the end of vcl_recv(), add the following code:

if (req.http.Cookie) {
  # Use Cookie module to parse the cookies in the request.
  cookie.parse(req.http.Cookie);

  # If Drupal.roles cookie is present and is "authenticated+user",
  # then look up if this page is cached.
  if (cookie.get("Drupal.roles") == "authenticated+user") {
    return (lookup);
  }

  # If cookie is "anonymous+user", then look up if this page is cached.
  if (cookie.get("Drupal.roles") == "anonymous+user") {
    return (lookup);
  }

  # If cookie is "authenticated+user;editor", then look up if the page is cached.
  if (cookie.get("Drupal.roles") == "authenticated+user;editor") {
    return (lookup);
  }

  # For any other role, do not return content from cache and instead pass the
  # request to the backend.
  else {
    return (pass);
  }
}

The above code checks the user's roles: if the user is anonymous, authenticated or an editor, Varnish looks up the page in its cache. For any other role, it skips the cache and passes the request to the backend web server. Now change the code at the end of the vcl_hash() function so that pages are cached by user role.

if (req.http.Cookie) {
  # Parse the cookie string.
  cookie.parse(req.http.Cookie);

  # If Drupal.roles cookie is "authenticated+user", then use it in the hash.
  if (cookie.get("Drupal.roles") == "authenticated+user") {
    hash_data(cookie.get("Drupal.roles"));
  }

  # If Drupal.roles cookie is "anonymous+user", then use it in the hash.
  elseif (cookie.get("Drupal.roles") == "anonymous+user") {
    hash_data(cookie.get("Drupal.roles"));
  }

  # If Drupal.roles cookie is "authenticated+user;editor", then use it in the hash.
  elseif (cookie.get("Drupal.roles") == "authenticated+user;editor") {
    hash_data(cookie.get("Drupal.roles"));
  }

  # For any other role, do not use Drupal.roles cookie in the hash. This means 
  # pages for all the remaining roles will not be cached by role.
  else {
    hash_data(req.http.Cookie);
  }
}

Restart Varnish and you will see that all the pages for anonymous users, authenticated users and editors are cached in Varnish by role. Let us know how you liked this technique by leaving a comment below!
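One way to check that role-based caching is actually happening is to expose the cache status in a response header. The sketch below uses Varnish 3 syntax. The header name X-Cache is only a convention we are assuming here, not something the setup above requires.

```vcl
sub vcl_deliver {
  # obj.hits is 0 on a cache miss and greater than 0 when the object
  # was served from the cache.
  if (obj.hits > 0) {
    set resp.http.X-Cache = "HIT";
  } else {
    set resp.http.X-Cache = "MISS";
  }
}
```

Request the same page as two different users who share a role: the second response should carry X-Cache: HIT. If a role never produces a hit, compare the Drupal.roles value in the Cookie request header (as shown by varnishlog) with the strings used in the VCL. PHP's setcookie() URL-encodes the value, so what Varnish receives can differ from the raw role list.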

Comments

  • by debra.v (not verified)
  • Tue, 09/09/2014 - 11:54

Thanks for the excellent info.

I'd very much like to know how you handle the cookies so that pages do not miss cache. Do you unset them? If so, when and where?

I have been working on a project trying to do this very thing with Drupal forums, anonymous vs authenticated users. What I can't seem to figure out is how to handle the Session Cookie. If we want to cache pages for authenticated users (role-based), I have a role cookie set to tell me its an authenticated user much like you did above, but the presence of the role cookie invalidates cache for the page. If you unset all cookies, you get a page for anonymous users as Drupal does not receive the Session cookie.

How do you work around this? What am I overlooking?

  • by neerav.mehta
  • Sun, 09/14/2014 - 15:34

Good catch! We forgot to mention that code change in varnish.vcl. I have updated the blog article. Generally, Varnish's default.vcl has the following code:

if (req.http.Authorization || req.http.Cookie) {
  /* Not cacheable by default */
  return (pass);
}

Replace the above code by the following:

if (req.http.Authorization) {
  /* Not cacheable by default */
  return (pass);
}

So now Varnish doesn't automatically return pass if there is a cookie present. We have already added following to default.vcl at the end of vcl_recv.

if (req.http.Cookie) {
  # Use Cookie module to parse the cookies in the request.
  cookie.parse(req.http.Cookie);
 
  # If Drupal.roles cookie is present and is "authenticated+user",
  # then look up if this page is cached.
  if (cookie.get("Drupal.roles") == "authenticated+user") {
    return (lookup);
  }
 
  # If cookie is "anonymous+user", then look up if this page is cached.
  if (cookie.get("Drupal.roles") == "anonymous+user") {
    return (lookup);
  }
 
  # If cookie is "authenticated+user;editor", then look up if the page is cached.
  if (cookie.get("Drupal.roles") == "authenticated+user;editor") {
    return (lookup);
  }
 
  # For any other role, do not return content from cache and instead pass the
  # request to the backend.
  else {
    return (pass);
  }
}

In the above code, even if a Session cookie is present, the code returns lookup as long as the role is anonymous, authenticated or editor. This means that Varnish first looks in the cache and asks the back-end for the page only if it is not cached. Since we are explicitly returning lookup here, we don't need to unset the Session cookie. In fact, we should not unset the Session cookie. Here is why:

Suppose we unset the Session cookie and there is a cache miss. This means that the request goes to the back-end for retrieving the page. Since the request does not have session cookie any more, the back-end thinks that it's an anonymous user and responds with a page for anonymous user. Varnish caches this page. Note that in vcl_hash() function, we are adding the role from the Drupal.roles cookie to the hash. This means that irrespective of the role present in the Drupal.roles cookie, a page for anonymous user will be retrieved and cached. This will break the entire scheme.

Let us know if you have more questions.

  • by lhangea (not verified)
  • Thu, 01/28/2016 - 02:33

Thanks for explaining. I have another question: wouldn't your setup cache per role AND per user because of the session cookie?

By default the pages have the Vary: Cookie header, which can't really be removed, because then a logged-in user would see the same page as an anonymous user after logging in. Because of that header, I think each user with a given role will have a separate cache entry. Am I missing something?

  • by debra.v (not verified)
  • Mon, 09/15/2014 - 14:39

As strange as it sounds, I have to say, changing the following statement was a real eye-opener.

FROM:
if (req.http.Authorization || req.http.Cookie) {
  /* Not cacheable by default */
  return (pass);
}

TO:
if (req.http.Authorization) {
  /* Not cacheable by default */
  return (pass);
}

This is the standard cookie check in every VCL I've reviewed. It almost seemed blasphemous to remove it, but it brought me closer to the solution. There were a couple of other adjustments I had to make in order to get it working, but it does indeed work.

The extra changes are in the vcl_fetch(). I needed to add an additional check for the cookie (note, our version does not use the Varnish Cookie Module).

ADDED to vcl_fetch:
if (req.http.Cookie ~ "authenticated\+user") {
  if (beresp.http.Set-Cookie) { remove beresp.http.Set-Cookie; }
  if (req.http.Cookie) { remove req.http.Cookie; }
  if (beresp.http.Cache-Control ~ "(private|no-cache)") {
    remove beresp.http.Cache-Control;
    set beresp.ttl = 1h;
  }
}

The fetch executes after a request has been made to retrieve content from the Drupal backend. Even if we are successfully caching on role, when Varnish has to send a request to the Drupal backend, it would return pages for authenticated users with beresp.http.Cache-Control set to "no-cache, max-age 0". Removing Cache-Control, setting a beresp.ttl (where we know it is safe to do so) and removing all cookies made the authenticated user content cacheable.

Thanks again for the info. Your article has been immensely helpful and enlightening.

  • by Amit Jain (not verified)
  • Sat, 07/02/2016 - 08:11

Great article, but I have a question: can I bypass some specific blocks that are tied to a specific user, like "Hi Amit" in the top menu, or any other block?
