Clients shouldn’t peek inside access tokens
I am in the middle of a Twitter thread about why the Microsoft Graph, and only the Microsoft Graph, should be the one validating access tokens obtained by a client for calling the Microsoft Graph. However, I am failing to explain that effectively in 240-character quanta, so here I am, breaking a ~7-month blogging hiatus to provide a long-form explanation.
A lot of the misunderstandings and challenges arising in this area stem from the fact that many people tend to interpret specific features and implementation choices of the product/service/SDK/scenario/topology they are using as general, universal properties of the protocol. Just like assuming that all swans are white because you have never seen a black one, that approach will fail pretty often. Another form of that fallacy is the illusory correlation bias: for example, the conviction that OAuth2 tokens are JWTs (not necessarily!), or some form of novelty bias (you think OpenID Connect tokens are all JWTs? Nope).
Nowadays I tend to avoid backtracking to first principles when explaining something: there are a lot of people needing help in the queue, and they usually appreciate getting unblocked without having to become experts in protocol algebra. I tried that route in the Twitter thread, pointing out things that could break in code when pursuing the wrong approach, but this time it didn’t work, so I have to bring out the big guns and go back to the fundamentals.
OAuth2 Access Tokens
OAuth2 introduces access tokens as artifacts that a client can use to gain access to a resource. You can find the stone tablets definition here.
Here are a few facts that are obvious to sad people like me who spend a lot of time reading specs, but that might challenge the beliefs of pure practitioners.
OAuth2 access tokens have NO defined format.
That’s right. OAuth2 access tokens have NO mandated format. Read the definition I linked to earlier. Re-read it. There is NO LAW of any kind determining that an access token should be a JWT. More! There is no law of any kind establishing that an access token must contain issuers, scopes, audiences, claims, or any attribute at all. As long as a token enables a resource server to perform the authorization decisions it needs to make when receiving a call, it can use whatever (verifiable) method it wants.
Let’s drive the point home by sketching a quick scenario (which is semi fictional, I am using “Facebook” just because it’s familiar but I can’t really say how they are implemented in reality- this is more about understanding what’s possible).
Imagine a client app that wants to call the Facebook Graph. The client will hit the authorization endpoint of Facebook’s authorization server, which will authenticate the user and present a consent screen; once the user has authenticated and consented, Facebook will record (say in a DB) the fact that the user granted the client delegated access for a given set of scopes (let’s say writing on your wall). For the sake of simplicity let’s assume implicit grant: the authorization server returns an access token directly to the client.
Now the client turns around and invokes the Facebook Graph, presenting the newly obtained access token in the call. The resource server in the API retrieves the access token and uses it for deciding whether it should allow the client to post on the user’s wall.
Here’s the kicker! Given that Facebook’s authorization server and resource server can be co-located and/or access the same backend, the token could simply be the ID of the DB row where the AS originally stored the consent info; as long as the resource server can read the same DB, there’s no need to use any fancy token format.
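To make that concrete, here is a minimal sketch of the "token is just a row ID" idea. Everything in it is made up for illustration: `CONSENT_DB`, `issue_access_token` and `authorize_call` are hypothetical names, the dict stands in for the shared DB, and none of this claims to describe how Facebook (or anyone else) really does it.

```python
import secrets

# Hypothetical shared store, standing in for the DB both the AS and the RS can read.
CONSENT_DB = {}


def issue_access_token(user_id, client_id, scopes):
    """AS side: record the consent and hand back the row key as the access token."""
    token = secrets.token_urlsafe(32)  # random and unguessable, with no internal structure
    CONSENT_DB[token] = {"user": user_id, "client": client_id, "scopes": set(scopes)}
    return token


def authorize_call(token, required_scope):
    """RS side: look the token up in the shared store and make the decision."""
    row = CONSENT_DB.get(token)
    return row is not None and required_scope in row["scopes"]
```

From the client’s point of view the token is a random string: there is nothing to decode, and the only thing it can do with it is attach it to a call.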
Now, why do so many services and solutions use a specific format for access tokens in practice? Two common reasons:
- Often AS and RS are not co-located. When you protect your own web API with Azure AD, the AS is in Azure AD and the RS is wherever you host your API. In that case, the RS must be given a way to access and verify authorization info, and a format with validation rules does the trick better than providing access to a shared DB.
- Statelessness. Although consent is usually recorded long term, transient info is not a good candidate for server-side storage: which auth method was used for a given auth transaction, how long a token should be considered valid. Those are all details that are easier to manage if the token itself remembers them, rather than having to store and retrieve them server side. This is one of the reasons why even tokens meant for co-located AS and RS might be in a readable format (like JWT), but that is, I cannot stress it enough, happenstance: it’s absolutely NOT an invitation to process that token on the client, for all the reasons discussed in this post.
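Here is a rough sketch of the second option: a self-contained token that the RS can verify without calling back to the AS. I am deliberately NOT using JWT here, to stress that the format is a private matter between AS and RS. The format (base64 payload, a dot, an HMAC), the `SHARED_KEY` value, and the `mint_token`/`validate_token` names are all assumptions made up for this example.

```python
import base64
import hashlib
import hmac
import json
import time

# Assumption: a secret known only to the AS and the RS, never to the client.
SHARED_KEY = b"demo-key-known-to-AS-and-RS"


def mint_token(claims, key=SHARED_KEY):
    """AS side: serialize the transient info and sign it."""
    payload = base64.urlsafe_b64encode(json.dumps(claims).encode())
    signature = hmac.new(key, payload, hashlib.sha256).hexdigest().encode()
    return (payload + b"." + signature).decode()


def validate_token(token, key=SHARED_KEY, now=None):
    """RS side: check the signature first, then the expiry. Returns claims or None."""
    payload, _, signature = token.encode().rpartition(b".")
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest().encode()
    if not hmac.compare_digest(signature, expected):
        return None  # tampered with, or not minted with our key
    claims = json.loads(base64.urlsafe_b64decode(payload))
    now = time.time() if now is None else now
    if claims.get("exp", 0) <= now:
        return None  # expired (or no expiry recorded at all)
    return claims
```

Note who holds the key: the AS and the RS. The client could base64-decode the payload today, but it cannot verify anything, and the AS and RS remain free to swap this format for another one tomorrow.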
OpenID Connect does NOT redefine OAuth2 access tokens to be JWTs.
OpenID Connect does lots of amazing things, including introducing id_tokens which ARE meant to be read by clients (that’s the entire point) and ARE defined as JWTs.
But OpenID Connect does NOT say anything about access token format: it introduces new grants and response types so that id_tokens can be requested alongside access tokens, but that doesn’t override the existing flows and properties defined in the original OAuth2 spec (including the shapelessness of access tokens).
OAuth2 access tokens’ sole purpose is to relay authorization info to the resource server
That’s why a client requests an access token: to gain access to the corresponding resource. That’s the only contract that needs to be respected when issuing access tokens, and it’s a contract between the AS and the RS- no one else, and in particular NOT the client, which should always treat access tokens as perfectly opaque.
Every other collateral functionality that might be achieved on the client by processing the access token is NOT part of the contract. For example: if you are inspecting the access token on the client to determine what scopes have been granted to your client for that resource, your code is going to break when the token changes format. Often those changes are unrecoverable. Say that the RS starts requiring PII about the user, which cannot be seen by anyone but the RS: the AS might start to encrypt the entire token, so that no one but the RS (and in particular, not the client) can gain access to the token content. Note, this isn’t to say that knowing what scopes have been granted is not a legitimate functionality your client app might need: this is to say that extracting that info from an access token is brittle and very likely to end in tears, in production, no less.
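If your client does need to know the granted scopes, one place to look that IS part of the contract is the token response itself: the OAuth2 spec defines a space-delimited `scope` parameter there, which the AS may omit when the granted scope is identical to the one the client requested. A minimal sketch, where `granted_scopes` and its arguments are hypothetical names and `token_response` is assumed to be the already-parsed response:

```python
def granted_scopes(token_response, requested_scopes):
    """Work out the granted scopes from the token response, never from the token itself."""
    scope_field = token_response.get("scope")
    if scope_field is None:
        # Per the OAuth2 spec, an omitted scope means "same as what you asked for".
        return set(requested_scopes)
    return set(scope_field.split())
```

This keeps working whether the access token is a JWT, an encrypted blob, or a row ID, because the function never touches the `access_token` value.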
The resource server is the only entity that can decide whether the token is enough to authorize the call
This is less of a protocol consideration and more of an architectural one.
A client trying to divine from the access token content whether a call to an API will succeed is, again, destined for a great deal of pain.
An API that until yesterday was accepting calls from all regions, today might start to restrict to specific countries; an API that was happy with any authentication method might suddenly start to accept only tokens obtained via 2FA; and so on. Same token, different authorization outcome.
Once again: this does not delegitimize the need to give your users a good experience and minimize failures or unnecessary prompts. But a client looking inside an access token and trying to guess the outcome cannot succeed in the general case, hence it is not a viable solution for that problem. Even without invoking the risk of the token content disappearing behind an encryption event horizon, as described earlier, the client simply isn’t privy to the semantics the RS will assign to attributes, nor to the authorization decisions that will come with them.
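What the client CAN do is make the call and react to what the RS says. Below is a minimal sketch under a few assumptions: `send_request` is a hypothetical callable that takes a dict of headers and returns a `(status, body)` pair, and the action names are invented. The status codes follow the bearer token usage spec, where 401 signals a missing or rejected token and 403 signals insufficient scope.

```python
def call_resource(send_request, access_token):
    """Present the token as-is and let the resource server decide."""
    status, body = send_request({"Authorization": f"Bearer {access_token}"})
    if status == 401:
        # The RS rejected the token: get a new one (possibly prompting the user).
        return ("reacquire_token", None)
    if status == 403:
        # The token was accepted but does not carry enough privileges for this call.
        return ("insufficient_permissions", None)
    return ("ok", body)
```

The token is passed through untouched; whatever policy the RS enforces today or starts enforcing tomorrow, the client finds out from the response rather than from guesswork.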
So, what about validating tokens for the Microsoft Graph (or any other API) on a client?
At this point, you know that attempting validation of an access token for the Microsoft Graph or any other API is entirely pointless on the client:
- There is no guarantee that the token will be in JWT format (or any format). It might be today, but that can change (or become impossible to ascertain) at any second, without any warning or remediation, because the token format isn’t part of the contract with the client.
- No amount of validation or inspection will say anything certain about whether calls to the Graph using that token will succeed. API access policies change without any way for the client to know. And of course, it really bears repeating, having access to the token content from the client is pure happenstance and can stop working at any second.
The recipient of an access token is responsible for any validation and authorization enforcement; if you are that recipient, as is the case when you expose your own API, you can and should totally process the access token accordingly. But if you are a client of another API, I hope this post convinced you that you should not peek inside access tokens not meant for you.
A client is meant to do two things with an access token:
- request it
- use it to call the resource it is meant for
- renew it
Ok, that was 3 (tho the 3rd is really a special case of the 1st).
Anything else is prone to painful errors when the AS makes any legitimate change, and we know that software changes are far more common than black swans.
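For what it’s worth, the request/use/renew loop needs no knowledge of the token’s insides. Here is a minimal sketch, assuming a hypothetical `request_token` callable that returns the parsed token response; `access_token` and `expires_in` are the standard OAuth2 response fields, while `TokenCache`, `clock` and `skew` are names I made up for the example.

```python
import time


class TokenCache:
    """Requests, hands out, and renews an access token without ever parsing it."""

    def __init__(self, request_token, clock=time.monotonic, skew=60):
        self._request_token = request_token  # hypothetical: returns a token response dict
        self._clock = clock
        self._skew = skew  # renew a bit early to allow for clock drift and latency
        self._token = None
        self._expires_at = 0.0

    def get(self):
        if self._token is None or self._clock() >= self._expires_at:
            response = self._request_token()
            self._token = response["access_token"]
            # Lifetime comes from the response, not from an "exp" claim inside the token.
            # If expires_in is missing, the token is simply re-requested on every call.
            lifetime = response.get("expires_in", 0)
            self._expires_at = self._clock() + max(lifetime - self._skew, 0)
        return self._token
```

The expiry bookkeeping relies only on `expires_in`, so it survives any change in token format.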