The Anatomy of a URL
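Every URL follows the generic syntax defined in RFC 3986. The RFC names five top-level components (scheme, authority, path, query, and fragment); splitting the authority into username, password, host, and port gives up to eight parts in a full URL: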
https://user:pass@example.com:443/path/to/page?q=search&lang=en#section
scheme: https
username: user
password: pass
host: example.com
port: 443
path: /path/to/page
query: q=search&lang=en
fragment: section
Most URLs you work with day-to-day use only a subset: scheme, host, path, and query are the most common. But knowing the full anatomy helps when parsing non-standard or legacy URLs.
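As a quick check of the breakdown above, here is a minimal Python sketch that splits the same example URL with the standard library's urlsplit. The `components` dictionary and its key names are illustrative choices that mirror the list above, not part of any API.

```python
from urllib.parse import urlsplit

# The example URL from the breakdown above.
parts = urlsplit("https://user:pass@example.com:443/path/to/page?q=search&lang=en#section")

# Map the parsed attributes onto the eight names used in this article.
components = {
    "scheme": parts.scheme,
    "username": parts.username,
    "password": parts.password,
    "host": parts.hostname,
    "port": parts.port,        # an int, or None when no port is given
    "path": parts.path,
    "query": parts.query,      # no leading "?"
    "fragment": parts.fragment,  # no leading "#"
}

for name, value in components.items():
    print(f"{name}: {value}")
```

Any part that is absent from a URL comes back as None (username, password, host, port) or as an empty string (path, query, fragment), so the same code also works for the shorter URLs you see day-to-day.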
The Query String
The query string starts after the ? and contains key-value pairs separated by &. A key can appear multiple times (?tag=js&tag=web), a key can have no value (?debug), and special characters in keys and values should be percent-encoded. Modern parsers handle all of this, but manual parsing with string splits will break on these edge cases.
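To make those edge cases concrete, the sketch below compares a naive split-based parser with Python's parse_qs and parse_qsl. The sample query string is made up for illustration; it combines a repeated key, a key with no value, and a percent-encoded ampersand.

```python
from urllib.parse import parse_qs, parse_qsl

# Illustrative query: repeated key, valueless key, and an encoded "&" (%26).
query = "tag=js&tag=web&debug&q=a%26b"

# Naive approach: split on "&" and "=" and store in a plain dict.
naive = {}
for pair in query.split("&"):
    key, _, value = pair.partition("=")
    naive[key] = value  # a repeated key overwrites the earlier value
# naive == {"tag": "web", "debug": "", "q": "a%26b"}
# The first tag is lost and the value of q is still percent-encoded.

# Standard-library parsing: every value is kept and decoded.
# keep_blank_values=True is needed, otherwise "debug" is dropped.
parsed = parse_qs(query, keep_blank_values=True)
# parsed == {"tag": ["js", "web"], "debug": [""], "q": ["a&b"]}

# parse_qsl returns an ordered list of pairs instead of a dict of lists.
pairs = parse_qsl(query, keep_blank_values=True)
# pairs == [("tag", "js"), ("tag", "web"), ("debug", ""), ("q", "a&b")]
```

Note that parse_qs silently discards valueless keys unless keep_blank_values=True is passed, which is easy to miss when a flag like ?debug matters.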
Parsing URLs in JavaScript
The URL constructor is available in browsers and Node.js and handles parsing correctly per the spec:
const url = new URL("https://example.com:8080/path?q=test#top");
url.protocol // "https:"
url.hostname // "example.com"
url.port // "8080"
url.pathname // "/path"
url.search // "?q=test"
url.hash // "#top"
// Query params parsed with URLSearchParams
const params = new URLSearchParams(url.search);
params.get("q") // "test"
Array.from(params) // [["q", "test"]]
The URL constructor throws a TypeError on invalid URLs, so wrap it in try/catch when parsing user input.
Parsing URLs in Python
from urllib.parse import urlparse, parse_qs
url = urlparse("https://example.com:8080/path?q=test#top")
url.scheme # "https"
url.hostname # "example.com"
url.port # 8080
url.path # "/path"
url.query # "q=test"
url.fragment # "top"
# Parse query string
params = parse_qs(url.query) # {"q": ["test"]}
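Unlike the JavaScript URL constructor, urlparse is lenient: it accepts most strings without raising, so user input needs an explicit check. The following is a minimal sketch under stated assumptions. The helper name parse_http_url and its rules (http or https only, a host must be present) are hypothetical choices for illustration, not a complete validator.

```python
from urllib.parse import urlparse

def parse_http_url(text):
    """Return the parsed URL for an http(s) address, or None if it looks invalid.

    Hypothetical helper: the accepted schemes and the host requirement are
    assumptions made for this example.
    """
    try:
        parts = urlparse(text)
        # Reading .port raises ValueError for a non-numeric or out-of-range port.
        parts.port
    except ValueError:
        return None
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return None
    return parts

ok = parse_http_url("https://example.com:8080/path?q=test#top")
bad = parse_http_url("not a url")  # no scheme or host, so this is None
```

Returning None keeps the call site simple (`if parts is None: ...`); raising a custom exception would be an equally reasonable design if callers need to know why a URL was rejected.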
Common URL Pitfalls
- Double slashes: In https://example.com//path, the path has a leading empty segment. Most servers normalize this, but not all.
- Trailing dot in hostname: https://example.com./path is technically valid in DNS but treated differently by some browsers.
- Case sensitivity: Scheme and host are case-insensitive. Path, query, and fragment are case-sensitive per the HTTP spec, but individual servers may vary.
- Encoding: Non-ASCII characters in the path or query must be percent-encoded. Use encodeURIComponent() for query values and encodeURI() for full URLs.
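The same pitfalls can be observed from Python. The sketch below uses quote and urlencode as rough counterparts to the JavaScript encoding functions (an approximation: their sets of unescaped characters are not identical), and urlsplit to show how case and double slashes come out of the parser. The sample strings are made up for illustration.

```python
from urllib.parse import quote, urlencode, urlsplit

# Encoding a single query value: safe="" also escapes "/" and "&".
encoded_value = quote("café & crème", safe="")
# encoded_value == "caf%C3%A9%20%26%20cr%C3%A8me"

# Encoding a path: the default safe="/" keeps the separators intact.
encoded_path = quote("/docs/résumé.pdf")
# encoded_path == "/docs/r%C3%A9sum%C3%A9.pdf"

# Building a whole query string: urlencode escapes keys and values,
# and writes spaces as "+".
encoded_query = urlencode({"q": "café & crème"})
# encoded_query == "q=caf%C3%A9+%26+cr%C3%A8me"

# Case: scheme and hostname come back lowercased, the path is left alone.
mixed = urlsplit("HTTPS://Example.COM/Path")
# mixed.scheme == "https", mixed.hostname == "example.com", mixed.path == "/Path"

# Double slashes: the parser does not normalize, so the empty segment survives.
segments = urlsplit("https://example.com//path").path.split("/")
# segments == ["", "", "path"]
```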
Parse Any URL Instantly
Paste a URL into ToolsVito's URL Parser to see every component broken down in a clean table — protocol, host, port, path, query params (each key-value pair), and fragment. All parsed in your browser.