SAML – what can go wrong? Security check
What you should consider when trying to securely integrate with a SAML Identity Provider.
Single sign-on is getting more and more attention due to its many benefits – it not only encourages users to use strong passwords, but also makes it possible to use multi-factor authentication across many applications. One of the key words you will definitely come across when reading about SSO is Security Assertion Markup Language, aka SAML. The SAML standard has been around for almost 20 years and it is still gaining popularity – we regularly stumble upon SAML in penetration tests. As usual, vulnerabilities happen – reality proves that the maturity of a standard does not have much correlation to the security of its implementation; after all, we still find SQL Injections.
Last year I tested multiple SAML implementations and I found a full range of vulnerabilities – including even the least expected ones. During my research I found many fantastic resources, but it still took me some time to get the whole idea of SAML, including the associated threats and possible attacks. Now I really enjoy SAML pentests – they are quite fun even though they may appear similar.
My article is not a know-it-all SAML encyclopedia, but rather a set of ideas you should consider when integrating with a SAML Identity Provider. I particularly wanted to answer my own questions from one year ago – I hope you will find your answers too. If your company wants to introduce SAML authentication, this article is a good place to start – you can not only learn how SAML works, but also peek into the attacker’s mind and understand the attacks they will try to perform. If you are a penetration tester – you can either read the whole article (I believe you won’t be disappointed) or just skip to the bonus section at the end.
Also, please note that I will only discuss vulnerabilities that I have actually found when performing penetration tests during the last year. Just some food for thought.
How does SAML work?
SAML authentication – ELI5-style high level overview
I’m a big fan of Explain Like I’m Five, so I will try it first. Let’s say you want to visit a very special, invite-only library with rare books and manuscripts to do some research, but it requires a confirmation of your identity. The library has a deal with the government, so they can confirm a person’s identity per their special request. However, the library has no computer system in place and there is no government facility in the building, so they can’t communicate directly – only via letters. In order to make sure that nobody opens the letters (or replaces them!), they seal the envelopes with custom, unique seals.
When you come to the library, the librarian will give you a letter with the library’s seal. You then have to deliver the letter to a government facility. A government official will confirm the authenticity of the seal, check whether the letter has been tampered with in any way, and if everything is fine, they will write you another letter confirming your identity. Now you have to deliver it to the library. This time the librarian will confirm the authenticity of the government’s seal. Then they compare your identity with the invite list – and if all these conditions are met, they can finally let you in. If another one of these libraries opens in another city, the library representatives will visit the government facility first and exchange their seals – and they can even have a completely different invite list!
In conclusion, we have two parties – a Service Provider (the library) and the Identity Provider (the government). They want to somehow exchange information about identity in a secure way. In order to do so, they choose a standardized way of communicating using letters with unique seals – and that’s SAML.
SAML 101… in a technical way
Now we can take another look at the whole SAML issue. This time we will discuss the flow as IT specialists (we can call it Explain Like I’m an IT Junior).
Let’s say that your company has created a website using WordPress and configured SSO, so you can log in using your corporate account. When your friend Charlie from the marketing department wants to log in, the flow would go like this:
- Charlie visits the WordPress login page. WordPress, the Service Provider (SP), creates the AuthnRequest (an XML containing information such as who issued the request, who is the recipient of the request, how the authentication must be performed etc.) and sends it back to Charlie.
- Charlie is redirected to the Identity Provider’s (IdP) website. If Charlie is not already authenticated on the IdP’s website (meaning that there is no valid session cookie), Charlie is presented with the IdP’s login page.
- Charlie logs in using his IdP credentials and enters a 2FA code, if required. If everything is fine, the IdP creates a SAML Response (an XML document confirming Charlie’s identity), signs it, and sends it back to Charlie.
- Charlie is redirected to WordPress again. WordPress verifies the signature and if Charlie has permission to log in to WordPress, he receives a WordPress session cookie.
You should know that the flow I described is not the only one possible – however, it’s the most popular, so we will stick to it for now.
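To make the first two steps of this flow more tangible, here is a minimal Python sketch of how an SP could build an AuthnRequest and pack it into the redirect URL using the SAML HTTP-Redirect binding (raw DEFLATE, then base64, then URL encoding). The entity ID, the endpoint URLs and all function names are made up for illustration, and the request is unsigned; in a real integration a SAML library should do this for you.

```python
import base64
import uuid
import zlib
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlencode, urlparse
from xml.sax.saxutils import escape, quoteattr

# Hypothetical values, for illustration only.
SP_ENTITY_ID = "https://wordpress.example.test/saml/metadata"
SP_ACS_URL = "https://wordpress.example.test/saml/acs"
IDP_SSO_URL = "https://idp.example.test/sso"


def build_authn_request() -> str:
    """Return a minimal, unsigned AuthnRequest as an XML string."""
    request_id = "_" + uuid.uuid4().hex  # XML IDs must not start with a digit
    issue_instant = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return (
        "<samlp:AuthnRequest"
        ' xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol"'
        ' xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion"'
        f' ID={quoteattr(request_id)} Version="2.0"'
        f" IssueInstant={quoteattr(issue_instant)}"
        f" Destination={quoteattr(IDP_SSO_URL)}"
        f" AssertionConsumerServiceURL={quoteattr(SP_ACS_URL)}>"
        f"<saml:Issuer>{escape(SP_ENTITY_ID)}</saml:Issuer>"
        "</samlp:AuthnRequest>"
    )


def build_redirect_url(authn_request_xml: str, relay_state: str) -> str:
    """Encode the request for the HTTP-Redirect binding."""
    compressor = zlib.compressobj(wbits=-15)  # negative wbits = raw DEFLATE
    deflated = compressor.compress(authn_request_xml.encode("utf-8"))
    deflated += compressor.flush()
    query = urlencode({
        "SAMLRequest": base64.b64encode(deflated).decode("ascii"),
        "RelayState": relay_state,
    })
    return f"{IDP_SSO_URL}?{query}"


def decode_saml_request(url: str) -> str:
    """Reverse the encoding, e.g. to inspect a request during a pentest."""
    encoded = parse_qs(urlparse(url).query)["SAMLRequest"][0]
    return zlib.decompress(base64.b64decode(encoded), -15).decode("utf-8")
```

The decoding helper is the part a penetration tester uses most often: it turns the opaque SAMLRequest parameter back into readable XML.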
SAML assertions – plaintext or encrypted?
Let’s focus on the anatomy of SAML Responses for a moment. A SAML Response contains SAML Assertions. It is usually just one assertion, but in some cases there can be more. Moreover, it is also possible to encrypt the SAML Assertions for some extra hardening.
To sum up:
- SAML Response contains at least one SAML Assertion.
- SAML Responses can be signed or unsigned.
- SAML Assertions can also be signed or unsigned.
- SAML Assertions can be encrypted or unencrypted.
When you read about SAML (or when you encounter SAML SSO in the wild), you will most likely come across a single plaintext, signed SAML Assertion wrapped in a signed SAML Response. You can find examples of all the options I described here.
It is not by any means required to encrypt the assertions. However, if you use encryption and your Service Provider is vulnerable to any SAML authentication bypass, it will be harder (if not impossible) to exploit it. It can also be useful if you want to pass any extra sensitive information. Overall, in most cases it is purely optional – but if both your Service Provider and your Identity Provider offer assertion encryption, it won’t hurt anybody to enable it.
Where is the SAML token stored?
Actually, there are two different options here, which is rarely discussed.
The usual way to handle the SAML token is to send it back to the Service Provider, who verifies the authenticity and integrity of the SAML Response, and then generates their own session token.
Another way is to treat the SAML Response as a stateless session token by itself and send it via e.g. Authorization header. This way can be useful for web services, which do not handle any session logic by themselves.
Depending on how you want to use the SAML Response, it should be treated differently:
|Session created based on SAML Response|SAML Response as session token|
|It should be valid for 3-5 minutes|Valid like a normal session, e.g. 8-16 hours|
|It should not be possible to use the same SAML Response twice|Reusable|
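For the first column (a session created based on the SAML Response), the two rules can be sketched in Python as follows. This is only an illustration under my own assumptions: the class and parameter names are invented, the cache lives in memory (a multi-node SP would need a shared store), and clock skew is not handled.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional


class OneTimeAssertionGuard:
    """Accept each assertion ID at most once, and only inside a short window."""

    def __init__(self, max_lifetime: timedelta = timedelta(minutes=5)) -> None:
        self._max_lifetime = max_lifetime
        self._seen: Dict[str, datetime] = {}  # assertion ID -> expiry time

    def accept(
        self,
        assertion_id: str,
        not_before: datetime,
        not_on_or_after: datetime,
        now: Optional[datetime] = None,
    ) -> bool:
        now = now or datetime.now(timezone.utc)
        # Forget IDs whose assertions have expired anyway.
        self._seen = {k: exp for k, exp in self._seen.items() if exp > now}

        if not (not_before <= now < not_on_or_after):
            return False  # outside the validity window
        if not_on_or_after - not_before > self._max_lifetime:
            return False  # the IdP issued an overly long-lived assertion
        if assertion_id in self._seen:
            return False  # replay of an already consumed assertion
        self._seen[assertion_id] = not_on_or_after
        return True
```

The timestamps correspond to the NotBefore and NotOnOrAfter values found in the assertion; they must only be trusted after the signature has been verified.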
The many faces of SAML integration security
Let’s get this straight – SAML is just a standard and it’s up to you how you make use of it. In this article I would like to talk about both dumb and less obvious ways to mess up SAML in your application, including:
- Basic mistakes that still happen,
- Why you should check the XML schema of your SAML Response,
- Why out-of-the-box is not always okay,
- Nasty secrets of XML comments in SAML,
- What you could still forget about even though you took care of all the above.
Let’s get into the attacker’s mindset and dive into the pool of SAML vulnerabilities from the past!
What if I just modified the assertion?
One of the scariest situations is when you can just modify the SAML Response and nobody cares… because the Service Provider does not check the signature. There are many possible causes of this vulnerability – my favourite is the case in GitHub Enterprise, in which neither the validity nor the presence of the signature was checked due to a mistake in a variable name. Remember – always check if the signature is valid!
Hello, is the signature still there?
Then there is another case – when the signature is verified, but only if it is actually there. This was the case for Uber’s internal chat and the OneLogin SAML SSO WordPress plugin – making it possible for an anonymous user to log in to the application. Yikes!
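The difference between "verified if present" and "required and verified" is easy to miss in code review, so here is a minimal Python sketch of both. The function names are invented, and verify_signature is a placeholder callable standing in for whatever your SAML library provides; this is not the code of any of the affected products. The sketch only covers the presence check: it says nothing about which element the signature actually covers.

```python
import xml.etree.ElementTree as ET
from typing import Callable

DS_NS = "http://www.w3.org/2000/09/xmldsig#"

# Placeholder type: takes the parsed Response and the Signature element.
Verifier = Callable[[ET.Element, ET.Element], bool]


class SamlRejected(Exception):
    """Raised when a SAML Response must not be trusted."""


def accept_response_buggy(response_xml: str, verify_signature: Verifier) -> ET.Element:
    # NOTE: use a hardened XML parser for untrusted input in real code.
    root = ET.fromstring(response_xml)
    signature = root.find(f".//{{{DS_NS}}}Signature")
    # BUG: the check only runs when a signature exists, so an attacker
    # can simply delete the Signature element.
    if signature is not None and not verify_signature(root, signature):
        raise SamlRejected("invalid signature")
    return root


def accept_response_strict(response_xml: str, verify_signature: Verifier) -> ET.Element:
    root = ET.fromstring(response_xml)
    signature = root.find(f".//{{{DS_NS}}}Signature")
    if signature is None:
        raise SamlRejected("missing signature")
    if not verify_signature(root, signature):
        raise SamlRejected("invalid signature")
    return root
```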
SAML Response, Picasso style
The standard SAML Response is usually quite simple – there is a signature linked to either a Response or an Assertion entity and some data (Subject, AttributeStatement…). But what if we move the pieces around a bit and add another entity here and there? After some operations, we achieve a SAML Response that no longer resembles the original – but sometimes can slip through anyway, leading to privilege escalation! The Picasso style modification of a SAML Response is called a Signature Wrapping Attack (XSW).
When I was checking for XSW during penetration tests, I always did it without much hope, as it seemed kind of abstract to me. Imagine my surprise when the attack was actually successful! Conclusion: check your XML Schema. Also, if you are a penetration tester: always test for XSW!
For some real-life examples, check out these known vulnerabilities:
👉 XSW in GitHub Enterprise
👉 XML Signature Validation Bypass in simpleSAMLphp and xmlseclibs
👉 XSW in Samlify
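One way to think about the defence is that the SP should accept exactly one document shape and make sure the element it reads is the element the signature refers to. The Python sketch below shows such a structural check under my own assumptions (a single signed Assertion directly under the Response; all names are invented). It does not verify the signature cryptographically and it is not a replacement for schema validation or a maintained SAML library.

```python
import xml.etree.ElementTree as ET

SAML_NS = "urn:oasis:names:tc:SAML:2.0:assertion"
DS_NS = "http://www.w3.org/2000/09/xmldsig#"


class UnexpectedStructure(Exception):
    """Raised when the Response does not have the one shape we accept."""


def select_signed_assertion(response_xml: str) -> ET.Element:
    # NOTE: use a hardened XML parser for untrusted input in real code.
    root = ET.fromstring(response_xml)

    direct = root.findall(f"{{{SAML_NS}}}Assertion")      # children of Response
    anywhere = list(root.iter(f"{{{SAML_NS}}}Assertion"))  # at any depth
    if len(direct) != 1 or len(anywhere) != 1:
        raise UnexpectedStructure("expected exactly one top-level Assertion")
    assertion = direct[0]

    signatures = assertion.findall(f"{{{DS_NS}}}Signature")
    if len(signatures) != 1:
        raise UnexpectedStructure("expected one Signature inside the Assertion")

    reference = signatures[0].find(f"{{{DS_NS}}}SignedInfo/{{{DS_NS}}}Reference")
    assertion_id = assertion.get("ID")
    if (
        reference is None
        or not assertion_id
        or reference.get("URI") != "#" + assertion_id
    ):
        raise UnexpectedStructure("Signature does not reference this Assertion")

    # Only this element may be passed on to signature verification and
    # then used to read the Subject and the attributes.
    return assertion
```

The important design choice is that the same element object is used for verification and for data extraction; XSW works precisely when those two steps look at different elements.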
The SAML security checks are there, but… disabled?!
The most important rule is: never try to write the SAML library by yourself. Use a dedicated library, preferably in the newest stable version (and remember to update it regularly).
Having said that, even if you take a ready-to-go box of software, you should still review the configuration – because important security checks can simply be disabled. Examples include Oracle WebLogic (CVE-2018-2998/CVE-2018-2933) – in the default configuration, the signature was not required, meaning that it was possible to send an unsigned SAML Response and sign in. Yes, this is an extreme case, but there can also be seemingly minor security features (such as verifying the validity of your SAML Response) which are disabled by default, and which together add up to the security of your application.
Signatures and comments – how not to parse your SAML Responses
Let’s begin with how you sign a SAML Response. In most cases IdP and SP do not communicate directly with each other – they have to rely on the signatures.
How is a SAML Response signed?
Our goal is to achieve a signature that looks as follows:
As you can see in the Reference URI field, the above signature corresponds to entity 7b7da352-0e56-4fea-9940-060f5142a4d8. There are also a lot of references to the algorithms used. Our goal is to calculate values that we can put in the DigestValue and SignatureValue entities.
- Perform XML normalization
First, take a look at the following XML entities:
<Example A="1" B="2">Hello!</Example>
<Example B="2" A="1">Hello!</Example>
<Example B="2" A="1">Hello!<!--comment--></Example>
<Example B="2" A="1" >Hello!</Example>
Even though they are different from each other, they are all logically the same. That’s why you have to normalize the XML first, which means producing textually equal XMLs from logically equal XMLs. The algorithm used for normalization is described in the CanonicalizationMethod entity.
- Calculate the DigestValue
The second step is to calculate a hash of the normalized XML (e.g. SHA256). The DigestValue is just the base64 encoding of the raw hash bytes (not of their hexadecimal representation).
- Sign the SignedInfo
Then you can finally sign the whole (normalized) SignedInfo entity using the SignatureMethod – in this case http://www.w3.org/2001/04/xmldsig-more#rsa-sha256. The certificate used for signing should be put in the X509Certificate entity – however, as a Service Provider you should always check if this is really the correct certificate of your Identity Provider.
- Voila! The SAML Response is ready for shipment!
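The normalization and digest steps above can be tried out with nothing but the Python standard library. One caveat: the sketch uses `xml.etree.ElementTree.canonicalize`, which implements C14N 2.0, while SAML signatures typically name a different algorithm (such as Exclusive C14N) in CanonicalizationMethod. It is swapped in here only because it ships with Python and shows the same idea. The signing step itself is left out, as it needs a cryptography library and the IdP’s private key.

```python
import base64
import hashlib
from xml.etree.ElementTree import canonicalize

# The four textually different but logically equal entities from above.
VARIANTS = [
    '<Example A="1" B="2">Hello!</Example>',
    '<Example B="2" A="1">Hello!</Example>',
    '<Example B="2" A="1">Hello!<!--comment--></Example>',
    '<Example B="2" A="1" >Hello!</Example>',
]


def digest_value(xml_text: str) -> str:
    """Normalize the XML, hash it with SHA256 and base64-encode the raw hash."""
    # Attributes are sorted and, by default, comments are dropped.
    canonical = canonicalize(xml_text)
    raw_hash = hashlib.sha256(canonical.encode("utf-8")).digest()  # bytes, not hex
    return base64.b64encode(raw_hash).decode("ascii")


for variant in VARIANTS:
    print(canonicalize(variant), digest_value(variant))
```

All four lines of output are identical, which is exactly why the signature survives such changes, including the added comment.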
What could go wrong when SAML Response is parsed?
As I already told you, in order to sign an XML, we have to normalize it. You might have noticed in the previous example that we ignored the XML comment during normalization. It actually depends on the canonicalization algorithm used, but in most cases the comments are ignored – they are not actual content. This fact is sometimes forgotten.
Let’s say that there’s an email@example.com account in our WordPress application. When someone logs in, WordPress extracts the e-mail from the SAML Response and checks if the user exists in their database. The entity which contains the e-mail can look like this:
<saml:NameID>email@example.com</saml:NameID>
However, due to normalization, these entities are also logically the same:
<saml:NameID><!--comment-->email@example.com</saml:NameID>
<saml:NameID>em<!--comment-->ail@example.com</saml:NameID>
<saml:NameID>email@example.com<!--comment--></saml:NameID>
In order to get the email from the NameID, you have to extract the inner text of the entity. However, when you treat the XML as a tree and extract the first or the last subnode of NameID (there should only be one, so it doesn’t matter, right?), a seemingly innocent comment can cause a lot of trouble.
What if the attacker changes their own email to an address that merely contains email@example.com, with an extra prefix or suffix that they control? After adding a perfectly fine comment at the boundary between the two parts, the NameID entity consists of two text subnodes separated by the comment, and one of them is exactly email@example.com.
So if the application extracts the first or the last subnode instead of the whole inner text… The attacker can now log in as email@example.com! This is definitely an edge case you should consider when parsing the SAML Response. And I actually encountered this vulnerability during a penetration test a few months ago!
The issue with XML comments in SAML was researched and presented by Kelby Ludwig at AppSecUSA 2018. The vulnerability was found in multiple SAML libraries.
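The parsing mistake is easy to reproduce with Python’s `xml.dom.minidom`, which keeps comments as separate nodes. In the sketch below the attacker-controlled parts of the addresses (`.attacker.test` and `attacker.`) are hypothetical values chosen for illustration, as are the function names; the point is only to contrast subnode extraction with reading the whole inner text.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

SAML_NS = "urn:oasis:names:tc:SAML:2.0:assertion"

# Hypothetical attacker addresses with a comment placed at the boundary.
SUFFIX_CASE = (
    f'<saml:NameID xmlns:saml="{SAML_NS}">'
    "email@example.com<!--comment-->.attacker.test</saml:NameID>"
)
PREFIX_CASE = (
    f'<saml:NameID xmlns:saml="{SAML_NS}">'
    "attacker.<!--comment-->email@example.com</saml:NameID>"
)


def name_id_element(xml_text: str):
    document = parseString(xml_text)
    return document.getElementsByTagNameNS(SAML_NS, "NameID")[0]


def first_subnode(xml_text: str) -> str:
    """Vulnerable: reads only the text in front of the first comment."""
    return name_id_element(xml_text).firstChild.data


def last_subnode(xml_text: str) -> str:
    """Vulnerable: reads only the text after the last comment."""
    return name_id_element(xml_text).lastChild.data


def whole_inner_text(xml_text: str) -> str:
    """Safer: concatenates every text node and skips the comments."""
    return "".join(
        child.data
        for child in name_id_element(xml_text).childNodes
        if child.nodeType in (Node.TEXT_NODE, Node.CDATA_SECTION_NODE)
    )


print(first_subnode(SUFFIX_CASE))     # the victim's address
print(last_subnode(PREFIX_CASE))      # the victim's address again
print(whole_inner_text(SUFFIX_CASE))  # the attacker's real address
```

Another option on the SP side is to reject any NameID or attribute value that contains a comment node at all, since a legitimate IdP has little reason to emit one there.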
Conceptual vulnerabilities – when the SAML is all right, but there’s something you forgot about
There are also some other questions you should ask yourself when implementing SAML.
Probably the most important – do you use the same attribute to identify the user as your Identity Provider?
I recently encountered this case in the wild during a penetration test. The setup was as follows:
- Service Provider (custom web app) identified the users based on their emails.
- Identity Provider identified the users based on logins. The SAML Response included the login as well as email, name, and surname.
As soon as I noticed this case, I logged in to IdP and tried to change my email. (Un)fortunately, it checked for duplicates, so I couldn’t use the email of the Service Provider admin. That’s when it gets even more interesting – the Service Provider was vulnerable to the XML comments issue. Voila!
You should also check if your Identity Provider allows anonymous registration, as it could really mess up your access control. If you perform a penetration test, it is always a good idea to look for forgotten register endpoints!
“This other SAML” – briefly about an alternative called SAML Artifacts
Remember when I told you that in most cases IdP and SP do not directly communicate with each other? Here’s the edge case – allow me to introduce you to the SAML Artifacts.
Let’s get back to my library example. As previously, if you want to enter the library, you receive a letter from the librarian, which you deliver to the government facility. However, this time the government official will not hand the letter with the information about your identity back to you – they will send it as a priority letter via the post office and only give you a shipping confirmation. Then, when you deliver the confirmation to the library, they will look for the letter in their mailbox based on the shipping number. If they find it and everything is fine – you can get in. This time it is not possible to tamper with the letter in any way – you don’t even touch it. It is shipped via an independent channel, directly from the government facility to the library.
As you probably noticed, the SAML Artifacts flow is more bulletproof. The attacker won’t even see the SAML Response – forget about tampering with it. However, the SAML Artifacts flow has one downside – the Service Provider has to communicate directly with the Identity Provider, which is not always possible (e.g. when you use an internal Identity Provider). Unless that is your case, you can surely go for it – it is always a good idea to reduce the attack surface. I also tested SAML Artifacts a few times and have nothing bad to say about it.
Bonus: SAML Penetration Testing Checklist
During my research I created a personal checklist for penetration tests, which I decided to share with you. Enjoy! Remember that the important data (username, organization name, etc.) can be included either in the AttributeStatement or in the NameID entity.
|☑️ Check if it is possible to modify the assertion|
|☑️ Check if it is possible to remove the signature|
|☑️ Perform Signature Wrapping Attacks (XSW)|
|☑️ Analyse the application behaviour when adding XML comments at the beginning, in the middle, and at the end of an attribute (such as username)|
|☑️ Sign the SAML Response with your own certificate (depending on the case: assertion, message, or both)|
|☑️ Perform XXE and XSLT attacks|
|☑️ Check if there are any known vulnerabilities for the SAML library or software in use|
|☑️ Check if the SP uses the same attribute as IdP to identify the user|
|☑️ Check if IdP allows anonymous registration|
|☑️ Verify Single Log Out (if required)|
|☑️ Check if the validity time window is short (3-5 minutes)|
|☑️ Check if the time window is validated (try to use the same SAML Response after it has expired)|
|☑️ Check for Cross-Site Request Forgery attack (Unsolicited Response)|
|☑️ Check if the recipient is validated (Token Recipient Confusion)|
|☑️ Check if it is possible to send the same SAML Response twice (Replay Attack)|
|☑️ Check for Open Redirect in RelayState|
|☑️ Check the signature algorithm in use|
The great news is: you can perform most of the above checks using SAML Raider – my favourite Burp Extension for SAML.
Remember that you have to test each endpoint that receives the SAML Response!
Other resources and final conclusions
If you want to check out some other resources about SAML, here are some recommendations:
🔗 Epi052 series: Part 1, Part 2, Part 3 – where you can read more about the attacks I mentioned and also learn how to use SAML Raider.
🔗 Fun with SAML SSO Vulnerabilities and Footguns – some more information about the attacks I mentioned from the developer’s perspective.
🔗 Securing XML implementations across the web – an article about vulnerable XML implementations and how it affects SAML.
🔗 OWASP SAML Security Cheat Sheet – always a good place to check.
I hope you enjoyed reading this article just as much as I enjoyed testing SAML over the course of last year.
But remember that…
Authentication is a critical feature of every application. I believe you should always perform an extensive penetration test of your authentication mechanism – including SAML. After all, that’s a place where you definitely don’t want vulnerabilities to happen, right?