Validating JWT from multiple identity providers

2023-02-19

I was recently tasked with finding a way to validate JWTs that can come from multiple different identity providers in the same API. This tends to happen when companies buy each other and try to merge their products and user bases. So, first, let's be clear about one thing: wrangling multiple IdPs is the path of pain. You would think that it would avoid painful migrations and get you up and running quickly, but you will pay a huge complexity tax when juggling users and roles from multiple systems interacting in your application.

Validating the JWTs, though, is manageable, but I've not seen good advice on how to verify them when they were produced by multiple identity providers. So, let's find a good process for that.

First, some quick terminology in case you're not familiar with it:

JWT (JSON Web Token, pronounced "jot"): a JSON-based format for signed data, commonly used to authenticate API clients
JWKS (JSON Web Key Set): a JSON-based format meant to carry cryptographic keys, used by identity providers to publish the keys that can verify the JWTs they produce
IdP (identity provider): a service that manages user identities, often used to delegate actual user authentication (emails, password, 2FA, etc) to a separate service, that then vouches for these identities using signed tokens

Why is it hard to verify a JWT?

A lot has already been written about the pitfalls of JWT verification. They boil down to: you should not trust what the token tells you about its signature algorithm. There were multiple flaws due to tokens coming in with the "none" algorithm, which libraries happily understood as a token they don't need to verify (most libraries have fixed that now), or more subtle ones, where a token references an RSA key but declares an HMAC algorithm, which results in the library verifying that HMAC using the RSA public key as the secret. Since the public key is, well, public, anyone can forge such a token.
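To make the two flaws concrete, here is a minimal sketch of a deliberately vulnerable verifier, using only the Python standard library. The names naive_verify, forge_token and PUBLIC_KEY_PEM are hypothetical, the key material is a placeholder, and no real library is claimed to behave exactly like this.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Base64url-decode, restoring the padding that JWTs strip."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


# Placeholder standing in for the IdP's *public* RSA key. Anyone can download it.
PUBLIC_KEY_PEM = b"-----BEGIN PUBLIC KEY-----\n(placeholder)\n-----END PUBLIC KEY-----\n"


def forge_token(claims: dict) -> str:
    """Attacker side: declare HS256 and use the public key bytes as the HMAC secret."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = b64url(json.dumps(header).encode()) + "." + b64url(json.dumps(claims).encode())
    tag = hmac.new(PUBLIC_KEY_PEM, signing_input.encode(), hashlib.sha256).digest()
    return signing_input + "." + b64url(tag)


def naive_verify(token: str, key: bytes) -> bool:
    """Vulnerable on purpose: the algorithm is chosen by the token, not by the key."""
    header_b64, payload_b64, sig_b64 = token.split(".")
    header = json.loads(b64url_decode(header_b64))
    if header["alg"] == "none":
        return True  # the historical "none" flaw: nothing is checked
    if header["alg"] == "HS256":
        expected = hmac.new(key, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256).digest()
        return hmac.compare_digest(expected, b64url_decode(sig_b64))
    return False  # RSA verification is left out of this sketch


forged = forge_token({"sub": "admin"})
print(naive_verify(forged, PUBLIC_KEY_PEM))
```

The forged token passes because the verifier lets the header decide how the key bytes are interpreted. The fix is to decide the algorithm from the key side, which is what the rest of this post is about.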

So you should not trust the token, but the specification does not help you here, and the behaviour of some identity providers does not help either.

Let's get into the details! A JWT has a JOSE header that contains parameters about the signature algorithm:

alg: indicates the signature algorithm in text form, like "RS256" for RSASSA-PKCS1-v1_5 with SHA-256. The list of possible values is defined in RFC 7518. This parameter is mandatory
kid: a string id that indicates which key should be used to verify the token. This parameter is optional (and that's one of the problems we will deal with)
we do not care about the other fields for this
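Reading those two parameters only requires decoding the first dot-separated segment of the token. A minimal sketch with the standard library follows; the function names and the example token are made up for illustration, and nothing here verifies the token.

```python
import base64
import json


def read_jose_header(token: str) -> dict:
    """Decode the JOSE header. This does NOT verify anything."""
    header_b64 = token.split(".", 1)[0]
    padded = header_b64 + "=" * (-len(header_b64) % 4)  # JWTs strip base64 padding
    return json.loads(base64.urlsafe_b64decode(padded))


def kid_and_alg(token: str) -> tuple:
    """Return (kid, alg). kid may be None since it is optional; alg must be present."""
    header = read_jose_header(token)
    alg = header.get("alg")
    if not isinstance(alg, str):
        raise ValueError("JOSE header has no usable alg parameter")
    return header.get("kid"), alg


# Hypothetical token: a real header, an empty payload ("e30" is "{}"), a dummy signature.
example_header = {"alg": "RS256", "kid": "key-1", "typ": "JWT"}
example_token = (
    base64.urlsafe_b64encode(json.dumps(example_header).encode()).rstrip(b"=").decode()
    + ".e30.c2ln"
)
print(kid_and_alg(example_token))
```

Everything read this way is attacker-controlled input, so it should only be used to look up keys, never to decide on its own how the signature gets checked.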

So in theory, we should take the kid, look up the key we want to use, then make sure the alg from the token matches the alg from the key, and then verify the token.

Now let's look at the JWKS content. It contains an array of key objects, and each object contains:

kid: used to match the key to the JWT. This parameter is optional (another problem)
alg: same format as in the JWT. This parameter is optional (a BIG problem)
kty: indicates the type of key. EC for elliptic curve, RSA for RSA, oct for symmetric algorithms like HMAC. This parameter is mandatory
crv: for elliptic curve algorithms, indicates which curve is used. The possible values are defined in RFC 7518. This parameter is mandatory if kty=EC
other fields we do not care about here
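As an illustration, here is what such a document can look like, with a small helper that groups keys by kid. The JWKS below is invented, its key material is elided as "...", and index_by_kid is a hypothetical helper rather than part of any library.

```python
import json
from collections import defaultdict

# Invented JWKS. Only the first key declares alg, and the last one has no kid at all.
EXAMPLE_JWKS = json.loads("""
{
  "keys": [
    {"kty": "RSA", "kid": "2023-01", "alg": "RS256", "n": "...", "e": "AQAB"},
    {"kty": "EC", "kid": "2023-02", "crv": "P-256", "x": "...", "y": "..."},
    {"kty": "oct", "k": "..."}
  ]
}
""")


def index_by_kid(jwks: dict) -> dict:
    """Group keys by kid. Keys without a kid end up under None."""
    by_kid = defaultdict(list)
    for key in jwks.get("keys", []):
        if "kty" not in key:
            continue  # kty is mandatory, so skip malformed entries
        if key["kty"] == "EC" and "crv" not in key:
            continue  # crv is mandatory for EC keys
        by_kid[key.get("kid")].append(key)
    return dict(by_kid)


index = index_by_kid(EXAMPLE_JWKS)
print(sorted(index, key=str))
```

The values are lists on purpose: nothing prevents two keys from sharing a kid, or from having none.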

Some additional context on what an IdP could do with a JWKS:

provide multiple keys, possibly with different key types: normal, and useful if they want to migrate from one key to another. You would hope that they set the kid field in that case
provide keys without the alg field: this is one of the big issues here. As an example, the specification allows us to generate a key with kty=RSA, then sign tokens either with alg=RS256 or alg=RS384. There's nothing really wrong with that, it just makes everything more confusing
provide multiple keys with different kty but the same kid, because why the hell not

Matching these with what the JWT provides is indeed a challenge. One piece of advice I've seen here and there is to restrict algorithms to a reasonable set, like RS256 and HS256, but that won't be enough if the JWKS contains keys of both RSA and oct types with the same kid.

And the fun is not over: now you need to get a JWKS from each of multiple providers! Is it possible that different identity providers give keys with colliding kid and kty? In theory, yes, but I've not seen it yet, because large IdPs tend to generate random-looking kid values. It's entirely possible that a homegrown IdP would generate incrementing kid values though.
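If you want to know whether this happens in your setup, a small check over the JWKS you have fetched is enough. This is a sketch: colliding_kids is a hypothetical helper, and the issuer URLs are placeholders.

```python
from collections import defaultdict


def colliding_kids(jwks_by_issuer: dict) -> dict:
    """Map each (kid, kty) pair published by more than one issuer to those issuers."""
    seen = defaultdict(set)
    for issuer, jwks in jwks_by_issuer.items():
        for key in jwks.get("keys", []):
            if key.get("kid") is not None:
                seen[(key["kid"], key.get("kty"))].add(issuer)
    return {pair: sorted(issuers) for pair, issuers in seen.items() if len(issuers) > 1}


# Two made-up providers that both number their keys incrementally.
jwks_by_issuer = {
    "https://idp-a.example": {"keys": [{"kty": "RSA", "kid": "1"}, {"kty": "RSA", "kid": "2"}]},
    "https://idp-b.example": {"keys": [{"kty": "RSA", "kid": "1"}, {"kty": "EC", "kid": "2", "crv": "P-256"}]},
}
print(colliding_kids(jwks_by_issuer))
```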

How do we match a JWT to a key?

Since we may not have a direct match, we need a deterministic process to choose the key, and make sure there is no algorithm confusion. Here is the process we came up with for the Apollo Router:

get the kid and alg from the JWT
open all the JWKS and look up all the keys matching that kid or, as fallback, all the available keys
select the key with an alg field that matches the JWT's exactly. If there is one, skip the rest and validate the token directly with that key
select the keys with a kty field matching the alg field from the JWT:
- kty=RSA with alg=RS256, alg=RS384, alg=RS512, alg=PS256, alg=PS384, or alg=PS512
- kty=oct with alg=HS256, alg=HS384 or alg=HS512
- kty=EC with crv=P-256: alg=ES256
- kty=EC with crv=P-384: alg=ES384
- kty=EC with crv=P-521 (not a typo, it's 521): alg=ES512
- kty=OKP with crv=Ed25519: alg=EdDSA (RFC 8037 registers Ed25519 keys under kty=OKP, not kty=EC)
- any other combination, refuse the token
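The steps above can be sketched in a few lines of Python. This is an illustration of the selection logic only, not the Apollo Router's actual code. The names are hypothetical, and it makes two assumptions: the fallback to all keys applies both when the token has no kid and when no key carries that kid, and Ed25519 keys are matched on kty=OKP, the key type RFC 8037 assigns them.

```python
# JWT alg -> (kty, crv) that a key must have before we try it with that algorithm.
ALG_TO_KEY_TYPE = {
    "RS256": ("RSA", None), "RS384": ("RSA", None), "RS512": ("RSA", None),
    "PS256": ("RSA", None), "PS384": ("RSA", None), "PS512": ("RSA", None),
    "HS256": ("oct", None), "HS384": ("oct", None), "HS512": ("oct", None),
    "ES256": ("EC", "P-256"),
    "ES384": ("EC", "P-384"),
    "ES512": ("EC", "P-521"),
    "EdDSA": ("OKP", "Ed25519"),  # RFC 8037 puts Ed25519 under kty=OKP
}


def key_fits_alg(key: dict, jwt_alg: str) -> bool:
    """True if the key's kty (and crv, where relevant) is compatible with jwt_alg."""
    kty, crv = ALG_TO_KEY_TYPE[jwt_alg]
    return key.get("kty") == kty and (crv is None or key.get("crv") == crv)


def candidate_keys(jwt_kid, jwt_alg: str, jwks_list: list) -> list:
    """Return the keys that are safe to try for a token with this kid and alg."""
    if jwt_alg not in ALG_TO_KEY_TYPE:
        return []  # unknown algorithm, including "none": refuse the token

    all_keys = [key for jwks in jwks_list for key in jwks.get("keys", [])]

    # Keys matching the kid or, as a fallback, every available key.
    same_kid = [k for k in all_keys if jwt_kid is not None and k.get("kid") == jwt_kid]
    pool = same_kid or all_keys

    # A key that declares exactly the token's alg wins outright.
    exact = [k for k in pool if k.get("alg") == jwt_alg]
    if exact:
        return exact

    # Otherwise keep only the keys whose type is compatible with the alg.
    return [k for k in pool if key_fits_alg(k, jwt_alg)]


jwks_a = {"keys": [{"kty": "RSA", "kid": "1"}, {"kty": "oct", "kid": "1"}]}
jwks_b = {"keys": [{"kty": "EC", "crv": "P-256", "kid": "2", "alg": "ES256"}]}
print(candidate_keys("1", "HS256", [jwks_a, jwks_b]))
```

With the two keys sharing kid "1", an HS256 token can only ever be checked against the oct key, never against the RSA one, which is the algorithm confusion we wanted to rule out.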

Hopefully, at this point you either have no matching key, and you refuse the token, or you have one or more matching keys. It is still possible to get multiple candidate keys, due to the specification's (lack of) constraints. But at this point the keys you have chosen are safe to test, and if one of them verifies the token, then it's ok.

There is one final step after that: check that the issuer claim iss in the deserialized JWT, if present, matches the issuer of the JWKS that provided the key. If it does not, you should ask your IdP what they are doing.
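Here is a sketch of those last two steps, trying each candidate and then checking iss. To stay within the standard library it is restricted to HS256. CandidateKey, its issuer field (the issuer you configured for the JWKS the key came from) and both functions are hypothetical names, and expiry and other claim checks are left out.

```python
import base64
import hashlib
import hmac
import json
from dataclasses import dataclass


@dataclass
class CandidateKey:
    issuer: str    # issuer configured for the JWKS this key came from (assumed config)
    secret: bytes  # decoded "k" value of an oct key


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_hs256(claims: dict, secret: bytes) -> str:
    """Helper for the example: produce an HS256 token."""
    signing_input = _b64url(json.dumps({"alg": "HS256"}).encode()) + "." + _b64url(json.dumps(claims).encode())
    tag = hmac.new(secret, signing_input.encode(), hashlib.sha256).digest()
    return signing_input + "." + _b64url(tag)


def verify_hs256(token: str, candidates: list) -> dict:
    """Try every candidate key, then check iss against the issuer of the winning key."""
    header_b64, payload_b64, sig_b64 = token.split(".")
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("this sketch only handles HS256")
    signature = _b64url_decode(sig_b64)
    signing_input = f"{header_b64}.{payload_b64}".encode()

    for key in candidates:
        expected = hmac.new(key.secret, signing_input, hashlib.sha256).digest()
        if not hmac.compare_digest(expected, signature):
            continue  # wrong key, try the next candidate
        claims = json.loads(_b64url_decode(payload_b64))
        iss = claims.get("iss")
        if iss is not None and iss != key.issuer:
            raise ValueError(f"token claims issuer {iss!r} but was signed by {key.issuer!r}")
        return claims
    raise ValueError("no candidate key verifies this token")


key_a = CandidateKey("https://idp-a.example", b"secret-a")
key_b = CandidateKey("https://idp-b.example", b"secret-b")
token = sign_hs256({"iss": "https://idp-b.example", "sub": "u1"}, key_b.secret)
print(verify_hs256(token, [key_a, key_b]))
```

The claims are only parsed after a signature has verified, so nothing in the payload influences which key gets used.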

I'm not too happy about it: it is a complex process that could have been avoided if:

kid was mandatory in JWT
kid was mandatory in JWKS
alg was mandatory in JWKS (if you want to use the same key for different algorithms, store it under two different kid)

If you wonder how this is handled with Biscuit tokens:

tokens have an optional key id field, that is a u64. You're not obligated to make it incremental, it can be a random number
there is only one signature algorithm right now, Ed25519
we are looking at adding ECDSA too, without adding a risk of confusion