Hi, I’m canalun (@i_am_canalun), a security researcher at GMO Flatt Security Inc.
This article explores the question: “Why Does XSS Still Occur So Frequently?” We will delve into why this notorious and classic vulnerability persists despite the widespread adoption of built-in XSS countermeasures in modern development frameworks.
The world of web development, especially frameworks, is evolving at a rapid pace, bringing improvements not only in development efficiency but also in security.
In particular, defensive mechanisms against XSS have become increasingly robust. Today, many frameworks automatically escape HTML content and attribute values. Also, interfaces that could lead to the dangerous use of innerHTML are given names that signal their risk (e.g., unsafeHTML in Lit, dangerouslySetInnerHTML in React). Recently, React v19 added the ability to disable javascript: scheme URLs [1], neutralizing yet another XSS attack vector.
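To make this concrete, here is a minimal sketch using React (the component and prop names are illustrative, not from any particular codebase): interpolated values are escaped by default, and the opt-out API is deliberately hard to reach for by accident.

// By default, React escapes interpolated children, so a payload like
// "<img src onerror=alert(document.domain)>" is rendered as harmless text.
function Comment({ text }) {
  return <p>{text}</p>;
}

// The escape hatch is named to signal risk: the raw string is parsed as HTML,
// so this is only safe if `text` is trusted or sanitized.
function RawComment({ text }) {
  return <p dangerouslySetInnerHTML={{ __html: text }} />;
}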
This progress might lead some developers to believe, “If I’m using a modern framework, I don’t need to worry about XSS.”
However, the reality is quite different. Even in this era of advanced frameworks, data shows that XSS remains one of the top vulnerabilities in terms of both severity and frequency. It sounds like a joke, but unfortunately, it’s the reality.
Data on the Frequency and Severity of XSS
Let’s examine three data sources that demonstrate how XSS remains a severe and frequent vulnerability.
CWE Top 25 Archive
The MITRE Corporation, a U.S. non-profit organization, manages global vulnerabilities through CVE (Common Vulnerabilities and Exposures) and categorizes them using CWE (Common Weakness Enumeration). For example, SQL Injection is CWE-89, and XSS is CWE-79.
The “CWE Top 25 Archive” is an annual report that ranks the most dangerous software weaknesses by a score that reflects both severity and prevalence. The table below shows the top three weaknesses from the last five years.
As the table shows, XSS has consistently ranked first or second over the past five years, underscoring its status as a top-tier vulnerability!
Vulnerability Reports by Industry on HackerOne
Next, let’s look at industry-specific data from the “8th Annual Hacker-Powered Security Report 2024/2025” [7], published by HackerOne, one of the most famous bug bounty platforms. The graph below shows statistics for the top 10 reported vulnerabilities from 2023 to 2024, broken down by industry. The graph is our own compilation of the report’s data.
This data reveals that XSS accounts for approximately 20% of reported vulnerabilities in all industries except for Crypto. This suggests that XSS is a universal challenge, affecting everything from government agencies to e-commerce.
Vulnerability Detection Trends at GMO Flatt Security
Finally, we’ll share trends from our own vulnerability assessment data at GMO Flatt Security [8]. When we categorized the vulnerabilities we discovered in 2023 by volume, XSS was the third most common, following authentication/authorization flaws and business logic vulnerabilities.
While this data reflects the prevalence of B2B SaaS applications among our clients, leading to a high number of authentication-related findings, XSS still outpaces other common vulnerabilities like CSRF and misconfigured security headers.
These three data sources indicate that XSS remains a consistently discovered vulnerability with significant severity still today.
Why XSS Still Occurs
Now, we will address the central question: “Why does XSS still happen in the modern era?” We’ll explore the technical reasons behind these numbers.
Premise 1: XSS Sinks Are Extremely Diverse
Two important concepts for understanding vulnerabilities are “sources” and “sinks”. A source is an entry point for attacker-controllable data (e.g., a form, a filename), while a sink is a location in the program where that data ends up and causes a vulnerability (e.g., embedding in HTML, execution as JavaScript).
For example, a typical sink for SQL injection is raw query construction such as db.query("SELECT * FROM users WHERE name = '" + userInput + "'");. For path traversal, a typical sink looks like readFile("/user/uploads/" + userInput);.
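To tie the two concepts together, here is a minimal browser-side sketch (the query parameter and element id are illustrative):

// Source: attacker-controllable data enters the program,
// e.g. via a URL like https://example.com/?q=<img src onerror=alert(document.domain)>
const userInput = new URLSearchParams(location.search).get('q');

// Sink: the data ends up somewhere it can do damage, here causing DOM-based XSS.
document.getElementById('result').innerHTML = 'You searched for: ' + userInput;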
XSS sinks are incredibly diverse, leading to vulnerabilities in various contexts. Here are some representative sink patterns:
HTML: Inserting untrusted data as HTML.
Example: document.getElementById('output').innerHTML = [userInput]; (XSS occurs if userInput is <img src onerror=alert(document.domain)>)
HTML Attribute (non-URL): Inserting untrusted data as an HTML attribute value.
Example: <input type="text" data-value="[userInput]"> (XSS occurs if userInput is "><script>alert(document.domain)</script>)
HTML Attribute (URL): Using untrusted data as a URL in an HTML attribute.
Example: <a href=[userInput]> (XSS occurs if userInput is javascript:alert(document.domain))
JavaScript (URL): Using untrusted data as a URL within JavaScript code.
Example: window.location.href = [userInput]; (XSS occurs if userInput is javascript:alert(document.domain))
JavaScript (Function Construction): Using untrusted data to construct a JavaScript function.
Example: eval('var data = "' + userInput + '";'); (XSS occurs if userInput is ";alert(document.domain);//)
Premise 2: Frameworks Only Address a Limited Set of Sinks
The XSS countermeasures provided by most web frameworks primarily focus on escaping HTML output. This is intended to prevent data from being unintentionally interpreted as HTML or JavaScript.
In other words, these features mainly target the “HTML” and “HTML Attribute (non-URL)” sinks. “HTML Attribute (URL)” and the JavaScript-related sinks are often not covered by a framework’s standard protection.
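For example, here is a minimal sketch, assuming a React version that does not block javascript: URLs (the component is illustrative): the text interpolation is escaped, but nothing verifies that the attacker-controllable string is a safe URL.

// `title` is escaped as text by the framework, so "<script>" inside it is harmless...
function ProfileLink({ title, url }) {
  // ...but `url` is written into href as-is. If it is "javascript:alert(document.domain)",
  // clicking the link executes script. This sink sits outside the HTML-escaping protection.
  return <a href={url}>{title}</a>;
}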
Roadmap and Summary
Based on these premises, we will detail the causes of XSS from two perspectives.
The first question is: why can’t HTML-related XSS (i.e., “HTML” and “HTML Attribute (non-URL)”) be fully prevented even with a framework’s built-in escaping mechanisms? Yes, this type of XSS can happen even when a framework is in use.
The second question is: in what scenarios do the other sinks (i.e., “HTML Attribute (URL)” and the JavaScript-related ones) appear? Some of you may be wondering, “When would we ever need to set a user-supplied URL as an href?”
We will explore these two questions with concrete examples from real-world reports on HackerOne :) Specifically, we pick some interesting cases from those reported on HackerOne in 2024 and assigned the XSS CWE. If you’d like to see the raw result of categorizing those reports by cause, you can refer to this repository: https://github.com/canalun/h1-2024-xss-categorization
Well, before jumping into the detailed analysis, here is a summary of the modern XSS patterns:
Data that was assumed to be safe was, in fact, not.
A custom sanitizer was bypassed, or an open-source sanitizer was used incorrectly.
The framework’s built-in defense mechanisms were not used correctly or were intentionally avoided.
The implementation contained a sink that was outside the framework’s protection scope and not widely recognized.
A library’s specifications were misunderstood, leading to incorrect usage.
Why Can’t Frameworks Prevent HTML-Related XSS?
First, let’s explore why XSS targeting “HTML” and “HTML Attribute (non-URL)” sinks still occurs, even with framework-level HTML escaping. This can be broken down into two main factors.
A. Cases Where Framework Escaping Mechanisms Cannot Be Used
When developing a somewhat complex system or one that integrates various modules, you’ll encounter cases where it’s overwhelmingly better, or even necessary, to set HTML directly without using the framework’s escaping mechanism. For example, code that exists outside of the framework cannot use its escaping features. In other cases, while you could technically parse the provided HTML and fit it into JSX to escape it, doing so is often unrealistic from a development cost perspective.
This is the first point: there are situations where the framework’s escaping mechanism is simply unavailable.
This issue often arises when incorporating modules that output HTML, such as a WYSIWYG editor or a Markdown parser. If you’re using a trusted CMS, you might implement a feature that directly inserts the HTML fetched from it. You might also encounter mechanisms that are difficult for frameworks to handle, like cross-frame communication.
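As a concrete illustration of the Markdown case, here is a minimal sketch assuming the marked parser and the DOMPurify sanitizer: the parser’s output is an HTML string, so it cannot go through JSX escaping without destroying the markup and has to be inserted directly.

import { marked } from 'marked';
import DOMPurify from 'dompurify';

function MarkdownPreview({ source }) {
  // The parser returns ready-made HTML; applying the framework's escaping to it
  // would turn the markup into visible text, so it is inserted as raw HTML...
  const rawHtml = marked.parse(source);
  // ...which is only acceptable if the source is fully trusted or the output is sanitized.
  const safeHtml = DOMPurify.sanitize(rawHtml);
  return <div dangerouslySetInnerHTML={{ __html: safeHtml }} />;
}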
Now, I imagine you’re all thinking:
“I know that directly inserting HTML is dangerous! When I absolutely have to do it, I’ll make sure the input is safe or run it through a sanitizer!”
And that’s exactly the point. The core of this issue is whether you can truly get by with just verifying input safety or using a sanitizer. Let’s explore that.
A-1. Errors in Determining Data Safety
Now, let’s first discuss the act of verifying that an input is safe. In the paper “Securing the Tangled Web” [9], authored by Google engineer Christoph Kern, mistakes in this very verification process are cited as one of the primary causes of Cross-Site Scripting (XSS).
In this paper, the author examines the causes of XSS based on real-world cases, specifically, XSS reported through the Google VRP [10]. It argues that verifying the safety of values in web applications is extremely difficult. The main reasons cited are as follows [11]:
User inputs and fetched data from various sources are combined through complex conditional branches before being used.
The source code is constantly changing over time.
An interesting related topic mentioned is the issue of responsibility. Bugs can also arise from a failure to correctly establish whether the front end or the back end is responsible for ensuring the safety of a value. In practice, this often stems from a misunderstanding between front-end and back-end developers about who handles data validation and sanitization.
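A minimal sketch of how such a misunderstanding can look in practice (all names are illustrative):

// Back-end team: "the front end escapes everything before rendering."
// Front-end team: "the API already returns sanitized display names."
function renderGreeting(user) {
  // Two differently handled sources are merged by a conditional:
  // `nickname` happens to be sanitized server-side, `displayName` is not,
  // yet both branches feed the same sink.
  const name = user.nickname ?? user.displayName;
  document.getElementById('greeting').innerHTML = 'Hello, ' + name + '!';
}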
Let’s look at a HackerOne example, CVE-2021-20323. It was reported to the U.S. Dept. of Defense program and was triggered by the payload below.
As the payload suggests, a JSON key (field name) was interpreted as HTML, causing XSS. It’s plausible that even developers who carefully handle values incorrectly assumed the keys were safe to render.
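The pattern might look something like this minimal sketch (hypothetical code, not taken from the actual report): values are escaped conscientiously, but the keys that label them are concatenated into the markup as-is.

// A simple escape helper, applied only to values.
const escapeHtml = (s) => s.replace(/[&<>"']/g, (c) => `&#${c.charCodeAt(0)};`);

function renderFields(json) {
  let html = '';
  for (const [key, value] of Object.entries(json)) {
    // The value is escaped, but the key ("field name") is assumed safe and inserted raw:
    // a key like '<img src onerror=alert(document.domain)>' triggers XSS.
    html += `<dt>${key}</dt><dd>${escapeHtml(String(value))}</dd>`;
  }
  document.getElementById('details').innerHTML = html;
}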
A-2. Bypassing Custom Sanitizers
Now, let’s turn our attention to sanitizers. If direct insertion is unavoidable and the input values cannot be trusted, what about sanitizing them?
First, regarding custom-built sanitizers: if you asked 100 security engineers whether they would recommend writing one, all 100 would likely tell you to stop. That’s because hand-rolled HTML sanitizers turn out to be flawed in nine cases out of ten.
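To see why, consider this minimal sketch of a hand-rolled sanitizer (purely illustrative) and two ways it falls apart.

// A naive blocklist sanitizer: strip <script> elements and call it a day.
function naiveSanitize(html) {
  return html.replace(/<script[\s\S]*?<\/script>/gi, '');
}

// Bypass 1: no <script> needed at all; event handlers pass through untouched.
naiveSanitize('<img src onerror=alert(document.domain)>');

// Bypass 2: nesting defeats single-pass removal. Stripping the inner matches
// reassembles an intact script element in the output:
// '<script>alert(document.domain)</script>'
naiveSanitize('<scr<script></script>ipt>alert(document.domain)</scr<script></script>ipt>');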
Let’s look at a real-world example. In HackerOne report #1675516, a vulnerability in a sanitizer was exploited. Specifically, the attacker managed to bypass the sanitizer by following multiple unclosed