Getting Acquainted with ADAL’s Token Cache

A token cache has been one of the top requests from the development community since I have been in the business of securing remote resources. It goes back to the first days of the Web Service Enhancements; it got even more pressing with WCF, where having token instances buried in channels often led to gimmicks and hacks; its lack became obvious when WIF introduced CreatingChannelWithIssuedToken to allow (tease?) you to write your own, without providing one out of the box.

Well, my friends: rejoice! ADAL, our first client-only developer’s library, features a token cache out of the box. Moreover, it offers you a model to plug your own to match whatever storage type and isolation level your app requires.

In this post I am going to discuss concrete aspects of the cache and disregard most of the why’s behind the choices we made. There’s a more abstract discussion to be had about what “session” really means for native clients, and – although in my mind the two are inextricably entangled – I’ll make an effort to keep most theory out, and tackle that in a later post.

Why a Client Side Cache?

Let me quickly position ADAL’s cache. “Cache” is a pretty loaded term, and its use here is somewhat special: it’s important to wrap your head around what we are trying to enable.

Performance and As Little User Prompting As Possible

Quick, what’s the first purpose of a cache that comes to mind? That’s right: performance. I’ll keep data close to my heart, so that when I need it again it will be right here, FAST.
That is one of the raisons d’être of the ADAL cache, of course. Acquiring tokens entails putting stuff on and off the wire, occasionally doing crypto, and so on: that’s expensive stuff. But it’s not the only reason.
Those issues might not be as pressing for an interactive application as they are for a backend or a middle tier. Think about it: your app only needs to appear snappy to ONE user at a time, and that user’s reaction times are measured in the tenths of a second it takes for the dominoes of the synapses to fall over each other as neurotransmitters travel back and forth.

In the case of native applications, the performance requirements might be more forgiving; but performance is not the only parameter that can test the user’s patience. For example: there are few things that can annoy users more than asking them to enter and re-enter credentials. Once the user has actively participated in an authentication flow, you’d better hold on to the resulting token for as long as it is viable and ask for the user’s help again only when absolutely necessary. Fail to do so, and your app will be the recipient of their wrath (low ratings, angry reviews, uninstalls, personal attacks).

Well, good news everyone! The ADAL cache helps you to store the tokens you acquire, and retrieves them automatically when they are needed again. It keeps the number of user prompts as low as is “physically” possible.
In fact, it goes much farther than that: if the authority provides mechanisms for silently refreshing access tokens, as Windows Azure AD and Windows Server AD do, ADAL will take advantage of that feature to silently obtain new access tokens. You just keep calling AcquireToken: all of this is completely transparent to you.
Only if none of those silent methods for obtaining tokens works out does ADAL resort to showing the authentication dialog (and even in that case, there might be a cookie which will spare your user from having to do anything; more about that below).

Multiple Users and Application State

That alone would have been enough value for us to implement the feature, but in fact there’s more. Unfortunately this is the part that would benefit from the theoretical discussion I want to postpone, but I don’t think I’ll be able to avoid mentioning it at least a bit here.

In web apps you typically sign in as a given user, and you are that user for the duration of the session. When you sign out, all the artifacts in the session associated to that user are flushed out.
A native app can work that way too, if it is a client for a specific service; but in fact, that might not be the case. Sometimes rich clients will allow you to connect multiple accounts at once, from different providers (think of a mail client connecting to many web mail services, or a calendar app aggregating multiple calendar services) or even from the same one (think of an admin sometimes acting as himself, sometimes acting as his boss).

If you were dealing with a single user, flushing out a session might be implemented by simply clearing the entire cache; but for multiple users, you need finer control. When the end user wants to disconnect a specific account, you need to selectively find all the tokens associated with that account only, and get rid of them without disturbing the rest of the cache.
Now, you can repeat the same reasoning for any entity that is relevant when requesting tokens: resources (the app might aggregate multiple services), providers, everything that comes to mind.

Fortunately, ADAL can help you with all of the above – thanks to a small shift in perspective: instead of being locked up in some private variable, ADAL’s cache is fully accessible to you. You can query it using your favorite combination of LINQ and lambda notation, and do whatever you’d expect to be able to do with an IDictionary<>.

Important: as long as you don’t need advanced session manipulation, the cache remains fully transparent to you. AcquireToken will consult the cache on your behalf without asking you to know any of the underlying details. The ability to query the cache comes on top of its traditional use.

How AcquireToken Works with the Cache

ADAL comes with an in-memory cache out of the box. Unless you explicitly pass an instance of your own cache class at AuthenticationContext construction time (or null if you want to opt out of caching), ADAL will use its default implementation.

The default implementation lives in memory, is static (i.e. every AuthenticationContext created in the application shares the store and searches the same collection) and (beware!) is not thread safe. If you want a different isolation model, or to persist data, you can plug in your own. This is the only extensibility point in the entire ADAL! I’ll touch on that later.

As you use AcquireToken, you’ll be using the cache without even knowing it’s there. I already went through this in the post about AuthenticationResult, and although it’s tempting to go through this again now that you know a bit more about the cache, that would make the post grow far beyond what I intended. If you didn’t read that post, please head there and sweep through that before reading further. I’ll wait.

[…time passes…]

Welcome back!

Here is a flow chart that might help to get an idea of what AcquireToken actually does with the cache. At the cost of being boring: you don’t need to know any of that in order to use AcquireToken taking advantage of the cache. This is only meant for people who want to know more, and to help you troubleshoot if the behavior you observe is not in line with what you expect. To that end: please remember that Windows Azure AD and Windows Server AD have small differences here.
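
As a rough textual companion, the order of attempts described earlier in this post (cached token first, then silent refresh, then the prompt) can be sketched as follows. This is only an illustration under stated assumptions: CachedToken, AcquireTokenSketch and the two delegates are hypothetical stand-ins invented for this sketch. They are not ADAL’s types or its actual internals, and the sketch ignores the differences between Windows Azure AD and Windows Server AD.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a cache entry; NOT an ADAL type.
public sealed class CachedToken
{
    public string Resource { get; set; }
    public string AccessToken { get; set; }
    public string RefreshToken { get; set; }   // null when the authority issued none
    public DateTimeOffset ExpiresOn { get; set; }
}

public static class AcquireTokenSketch
{
    // Returns the access token plus a label saying which path produced it.
    public static Tuple<string, string> Acquire(
        IList<CachedToken> cache,
        string resource,
        DateTimeOffset now,
        Func<string, CachedToken> redeemRefreshToken,  // assumed: silent call to the authority
        Func<string, CachedToken> promptUser)          // assumed: interactive dialog
    {
        CachedToken entry = cache.FirstOrDefault(t => t.Resource == resource);

        // 1. A cached, unexpired access token: no network traffic, no prompt.
        if (entry != null && entry.ExpiresOn > now)
            return Tuple.Create(entry.AccessToken, "cache");

        // 2. Expired, but a refresh token is available: try to renew silently.
        if (entry != null && entry.RefreshToken != null)
        {
            CachedToken renewed = redeemRefreshToken(entry.RefreshToken);
            if (renewed != null)
            {
                cache.Remove(entry);
                cache.Add(renewed);
                return Tuple.Create(renewed.AccessToken, "refresh");
            }
        }

        // 3. Nothing silent worked: only now involve the user.
        if (entry != null) cache.Remove(entry);
        CachedToken fresh = promptUser(resource);
        cache.Add(fresh);
        return Tuple.Create(fresh.AccessToken, "prompt");
    }
}
```

The point of the ordering is that the user-facing delegate is the last resort: it runs only when the cache has nothing usable and the silent renewal did not produce a token.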

The Cache Structure

The cache structure is pretty simple; it’s an IDictionary<TokenCacheKey,string>.

You are not supposed to know, given that you would never need to look into that directly, but the Value side of the KeyValuePair contains, in fact, the entire AuthenticationResult for the entry.

The thing that should get your interest, conversely, is TokenCacheKey. Here it is:

That’s mostly a flattened view of the AuthenticationResult info, except for the actual tokens.
In the opening of the post I said that the cache serves two purposes, helping AcquireToken to prompt as little as possible and helping you to assess & manipulate the token collection (hence, the session state) of your client.
AcquireToken uses only a subset of the key members, typically the ones that affect the contract between the client and the target resource (Authority, ClientId, Resource) and the mechanics of the authentication itself (ExpiresOn, UserId). None of the other entries come into play during AcquireToken.
All the other members are there mostly for your benefit: instead of forcing you to remember those extra settings in your own store every time you get back an AuthenticationResult and later join them to the cache, we save them for you directly there. That allows you to use them in your own queries, for display purposes or for whatever other function your scenario might require.

And apropos, here are a few examples of queries you might want to run.

AuthenticationContext ac = new AuthenticationContext("hahha");
var allUsersInMyBelly = 
   ac.TokenCacheStore.GroupBy(p => p.Key.UserId).Select(grp => grp.First());

The query above returns a cache entry for each unique user (where “user” is used in the sense of UserId; see this post for an explanation of what that really means for ADFS & ACS). You might want to use this query for finding out how many/which unique users are connected to your application, for example to enumerate them in your UX.

var allTokensForAResource = 
     ac.TokenCacheStore.Where(p => p.Key.Resource == "https://localhost:9001");

The query above is very straightforward: it lists all the tokens scoped for a given resource. You might want to use that to discover which users (and/or which authorities) in your app currently have access to it.

var allUsersInMyBellyThatCanAccessAGivenResource =
   ac.TokenCacheStore.Where(
       p => p.Key.Resource == "https://localhost:9001").GroupBy(
                p => p.Key.UserId).Select(
                         grp => grp.First());

I knooow, I am terrible at formatting those things… but I have to do something to get it to fit in this silly blog’s theme! But I digress. This query combines the first two to return all the users that have access to a specific resource.

foreach (var s in ac.TokenCacheStore.Where
              (p => p.Key.UserId == "vittorio@cloudidentity.net").ToList())
    ac.TokenCacheStore.Remove(s.Key);

The query above deletes all the tokens associated to a specific user.

bool IsItGood = ac.TokenCacheStore.Where(
        p => p.Key.Resource == "https://localhost:9001").First().Key.ExpiresOn > DateTime.Now;

Finally, this one tells you whether the cached access token for a given resource is still good, that is, not yet expired. This can come in handy when you know that there are clock skews in your system (ADAL does not take clock skew into account, given that it’s largely a matter between the authority and the resource).
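
If clock skew is a concern, one possible refinement is to treat a token as usable only when it stays valid for some safety margin. The sketch below is an assumption-laden illustration: KeySketch is a hypothetical stand-in exposing just a few of the key members used in the queries above (with ExpiresOn assumed to be a DateTimeOffset), and ExpiryQueries.IsUsable is a helper invented here, not part of ADAL. It also uses Any() rather than First(), because First() throws when no entry matches the resource.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the key members the queries rely on; NOT ADAL's TokenCacheKey.
public sealed class KeySketch
{
    public string Resource { get; set; }
    public string UserId { get; set; }
    public DateTimeOffset ExpiresOn { get; set; }
}

public static class ExpiryQueries
{
    // True only if an entry exists for the resource AND it remains valid
    // for at least `margin` beyond `now`. An empty cache simply yields false.
    public static bool IsUsable(
        IDictionary<KeySketch, string> store,
        string resource,
        DateTimeOffset now,
        TimeSpan margin)
    {
        return store.Keys
            .Where(k => k.Resource == resource)
            .Any(k => k.ExpiresOn > now + margin);
    }
}
```

Passing a margin of a few minutes makes the check pessimistic, which is usually the safer direction when the clocks of the client and the resource disagree.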

Dude, Get Your Own

I expect that many, many scenarios will require a persistent cache which can survive app shutdowns and restarts. That will likely mean using different persistent store types for different apps, in all likelihood the same persistent storage you already use for your own app data.
Furthermore: different apps will require different isolation levels, perhaps segregating token cache stores per AuthenticationContext instance, tenancy boundaries and whatever else your unique scenario calls for.

That’s why we made an exception to ADAL’s otherwise adamant rule, “as few knobs as the main scenarios require”, and made the cache pluggable.

You can easily write your own cache: all you need to do is implement an IDictionary<TokenCacheKey, string>. Our most excellent SDET extraordinaire Srilatha Inavolu created a good example of a custom cache, which saves tokens in CredMan: you can see her implementation here.

Now, I heard from some early adopters that implementing IDictionary requires fleshing out a lot of methods, and that it could have been done with a far slimmer API surface. That is true: that said, we believe that there will be far more people querying the cache than people implementing custom cache classes. Furthermore, whereas cache queries will more often than not be entangled in each app’s unique logic (hence bad candidates for componentization and reuse), custom cache classes are components that might end up being implemented by a few gurus in the community and downloaded ad libitum by everybody else. And those gurus can most certainly implement the 15 required methods in their sleep.

Given the above, we chose to use an IDictionary to give you something extremely familiar to work with. Identity is complicated enough; we didn’t want you to have to learn yet another way of querying a collection.

This had other tradeoffs (KeyValuePair is a struct, which makes LINQ materialization problematic; implementing an efficient cache on the middle tier will require extra care) but after much thought we believe this will serve our mainline scenario, native clients, well. If you have feedback please let us know: we can always adjust the aim in v2!
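
To give a feel for what fleshing out IDictionary involves, here is a minimal sketch of a wrapper that guards each operation with a lock and reports every change through a callback, which is where a real implementation could write to its persistent store. Everything in it is an assumption made for illustration: NotifyingCache and its onChanged callback are hypothetical names, the key type is left generic instead of using TokenCacheKey so the sketch stands on its own, and it is not how the CredMan sample mentioned above works.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch: a lock-guarded dictionary that reports each change.
public sealed class NotifyingCache<TKey> : IDictionary<TKey, string>
{
    private readonly Dictionary<TKey, string> inner = new Dictionary<TKey, string>();
    private readonly object gate = new object();
    private readonly Action<IList<KeyValuePair<TKey, string>>> onChanged;

    // onChanged receives a snapshot after every mutation, e.g. to persist it.
    public NotifyingCache(Action<IList<KeyValuePair<TKey, string>>> onChanged)
    {
        this.onChanged = onChanged;
    }

    private List<KeyValuePair<TKey, string>> Snapshot() { lock (gate) return inner.ToList(); }
    private void Changed() { if (onChanged != null) onChanged(Snapshot()); }

    public string this[TKey key]
    {
        get { lock (gate) return inner[key]; }
        set { lock (gate) inner[key] = value; Changed(); }
    }
    public ICollection<TKey> Keys { get { lock (gate) return inner.Keys.ToList(); } }
    public ICollection<string> Values { get { lock (gate) return inner.Values.ToList(); } }
    public int Count { get { lock (gate) return inner.Count; } }
    public bool IsReadOnly { get { return false; } }

    public void Add(TKey key, string value) { lock (gate) inner.Add(key, value); Changed(); }
    public void Add(KeyValuePair<TKey, string> item) { Add(item.Key, item.Value); }
    public bool ContainsKey(TKey key) { lock (gate) return inner.ContainsKey(key); }
    public bool Contains(KeyValuePair<TKey, string> item)
    {
        lock (gate) return ((ICollection<KeyValuePair<TKey, string>>)inner).Contains(item);
    }
    public bool TryGetValue(TKey key, out string value)
    {
        lock (gate) return inner.TryGetValue(key, out value);
    }
    public bool Remove(TKey key)
    {
        bool removed;
        lock (gate) removed = inner.Remove(key);
        if (removed) Changed();
        return removed;
    }
    public bool Remove(KeyValuePair<TKey, string> item)
    {
        bool removed;
        lock (gate) removed = ((ICollection<KeyValuePair<TKey, string>>)inner).Remove(item);
        if (removed) Changed();
        return removed;
    }
    public void Clear() { lock (gate) inner.Clear(); Changed(); }
    public void CopyTo(KeyValuePair<TKey, string>[] array, int index)
    {
        Snapshot().CopyTo(array, index);
    }

    // Enumerate a snapshot, so callers can remove entries while iterating.
    public IEnumerator<KeyValuePair<TKey, string>> GetEnumerator()
    {
        return Snapshot().GetEnumerator();
    }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}
```

Because enumeration works on a snapshot, LINQ queries against such a cache do not break when entries are removed during the loop; the price is a copy of the content on every enumeration, which is probably acceptable for the handful of entries a native client holds but would need rethinking on a middle tier.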

Wrap

You asked for a client side token cache: you got it!

ADAL’s cache plays an essential role in keeping complexity out of your native applications, while at the same time taking full advantage of the OAuth2 features (like refresh tokens) and AD features (like multi-resource refresh tokens) to reduce user prompts to a minimum and keep your app as snappy as possible.

I believe that one of the reasons for which we were able to add cache support is that ADAL makes the token acquisition operation very explicit, providing a very natural plug for it. WCF buried the token acquisition in channels and proxies, but in so doing it tied the acquired token lifecycle to the lifecycle of the channel itself and made it hard to aggregate all tokens for the app.
This, coupled with the fact that REST services greatly diminish the need for a structured proxy, makes me hopeful that ADAL’s model is actually an improvement and will make your life easier in that department!