
URL Parsing Explained: Break Down Any URL Into Its Components

Learn how URLs are structured — protocol, host, port, path, query string, fragment — and how to parse them programmatically in JavaScript, Python, and the browser.

ToolsVito Team

The Anatomy of a URL

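URLs follow the generic syntax defined in RFC 3986 (browsers implement the closely related WHATWG URL Standard). RFC 3986 groups the username and password together as "userinfo", and userinfo, host, and port together as the "authority". Counted individually, a full URL can contain up to eight components: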

https://user:pass@example.com:443/path/to/page?q=search&lang=en#section

scheme:   https
username: user
password: pass
host:     example.com
port:     443
path:     /path/to/page
query:    q=search&lang=en
fragment: section

Most URLs you work with day-to-day use only a subset: scheme, host, path, and query are the most common. But knowing the full anatomy helps when parsing non-standard or legacy URLs.
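As a quick check of the breakdown above, here is a minimal Python sketch that splits the same example URL with the standard library's urlsplit. The components dictionary and its key names are only illustrative; they mirror the labels used in this article.

```python
from urllib.parse import urlsplit

parts = urlsplit(
    "https://user:pass@example.com:443/path/to/page?q=search&lang=en#section"
)

# Map the article's eight labels onto the attributes urlsplit exposes.
components = {
    "scheme": parts.scheme,
    "username": parts.username,
    "password": parts.password,
    "host": parts.hostname,
    "port": parts.port,        # an int, or None when the URL has no port
    "path": parts.path,
    "query": parts.query,
    "fragment": parts.fragment,
}

for name, value in components.items():
    label = name + ":"
    print(f"{label:<10}{value}")
```

Note that Python reports the port exactly as written in the URL, so 443 is returned here even though it is the default for https.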

The Query String

The query string starts after the ? and runs to the # (or to the end of the URL if there is no fragment). It contains key-value pairs separated by &. A key can appear more than once (?tag=js&tag=web), a key can have no value (?debug), and special characters in keys and values should be percent-encoded. Modern parsers handle all of this, but manual parsing with string splits breaks on exactly these edge cases.
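To make those edge cases concrete, the sketch below compares a naive split-based parse with Python's parse_qsl on one made-up query string. The sample string and variable names are assumptions chosen for illustration.

```python
from urllib.parse import parse_qsl

# Repeated key, a key with no value, and a percent-encoded "&" and "=".
query = "tag=js&tag=web&debug&q=a%26b%3Dc"

# Naive approach: split by hand and collect into a dict.
# It keeps only the last "tag", drops "debug", and leaves "%26" undecoded.
naive = dict(pair.split("=") for pair in query.split("&") if "=" in pair)

# Standard-library parser: returns an ordered list of (key, value) tuples.
# keep_blank_values=True preserves keys such as "debug" that have no value.
parsed = parse_qsl(query, keep_blank_values=True)

print(naive)
print(parsed)
```

By default parse_qsl (and parse_qs) silently drops keys without a value, so keep_blank_values=True matters whenever flags like ?debug carry meaning.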

Parsing URLs in JavaScript

The URL constructor is available in browsers and Node.js and handles parsing correctly per the spec:

const url = new URL("https://example.com:8080/path?q=test#top");

url.protocol   // "https:"
url.hostname   // "example.com"
url.port       // "8080"
url.pathname   // "/path"
url.search     // "?q=test"
url.hash       // "#top"

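// Query params are exposed as a URLSearchParams object
const params = url.searchParams;   // same as new URLSearchParams(url.search)
params.get("q")         // "test"
params.getAll("q")      // ["test"], every value for a repeated key
Array.from(params)      // [["q", "test"]]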

The URL constructor throws a TypeError on invalid input, including relative URLs passed without a base, so wrap it in try/catch when parsing user input.

Parsing URLs in Python

from urllib.parse import urlparse, parse_qs

url = urlparse("https://example.com:8080/path?q=test#top")
url.scheme   # "https"
url.hostname # "example.com"
url.port     # 8080
url.path     # "/path"
url.query    # "q=test"
url.fragment # "top"

# Parse query string
params = parse_qs(url.query)  # {"q": ["test"]}
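Unlike the JavaScript URL constructor, urlparse and urlsplit are lenient: they accept most strings without raising, so input that is not really a URL comes back with empty fields. The following is a minimal validation sketch under that assumption. parse_http_url is a hypothetical helper name, and restricting the scheme to http and https is an illustrative choice, not a rule from the standard library.

```python
from urllib.parse import urlsplit


def parse_http_url(raw):
    """Hypothetical helper: return the split result or raise ValueError."""
    parts = urlsplit(raw)  # can itself raise ValueError, e.g. on a malformed IPv6 host
    if parts.scheme not in ("http", "https"):
        raise ValueError(f"unsupported or missing scheme: {raw!r}")
    if not parts.hostname:
        raise ValueError(f"missing host: {raw!r}")
    # Reading .port raises ValueError when the port is not a number in range.
    _ = parts.port
    return parts


for candidate in ["https://example.com:8080/path?q=test#top", "example.com/path"]:
    try:
        print(parse_http_url(candidate).hostname)
    except ValueError as exc:
        print("rejected:", exc)
```

The second candidate is rejected because, without a scheme, urlsplit treats the whole string as a path and leaves the host empty.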

Common URL Pitfalls

  • Double slashes: https://example.com//path — the path has a leading empty segment. Most servers normalize this, but not all.
  • Trailing dot in hostname: https://example.com./path — technically valid in DNS but treated differently by some browsers.
  • Case sensitivity: Scheme and host are case-insensitive. Path, query, and fragment are case-sensitive per the HTTP spec but individual servers may vary.
  • Encoding: Non-ASCII characters in the path or query must be percent-encoded as UTF-8 bytes. In JavaScript, use encodeURIComponent() for individual query values and encodeURI() for a full URL.
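The encoding and case-sensitivity points above can be sketched in Python as well, using quote and urlencode from the standard library. normalize_case is a hypothetical helper written for this example, and the sample strings are made up.

```python
from urllib.parse import quote, urlencode, urlsplit, urlunsplit

# Encode a path segment and a query separately: each has its own reserved set.
path = "/" + quote("café menu", safe="")
query = urlencode({"q": "a&b=c", "city": "Zürich"})
print(path)
print(query)


def normalize_case(raw):
    """Hypothetical helper: lowercase only the case-insensitive parts."""
    parts = urlsplit(raw)  # urlsplit already lowercases the scheme
    # Lowercase host and port but leave any userinfo before "@" untouched.
    userinfo, at, hostport = parts.netloc.rpartition("@")
    netloc = userinfo + at + hostport.lower()
    return urlunsplit(parts._replace(netloc=netloc))


print(normalize_case("HTTPS://Example.COM/Path?Q=Test#Top"))
```

Path, query, and fragment are passed through unchanged, which matches the rule that only scheme and host are safe to lowercase.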

Parse Any URL Instantly

Paste a URL into ToolsVito's URL Parser to see every component broken down in a clean table — protocol, host, port, path, query params (each key-value pair), and fragment. All parsed in your browser.
