JAVASCRIPT
Extract URL Components (Protocol, Host, Path)
Parse an http or https URL string in JavaScript and extract its protocol, hostname, optional port, and path with a single regular expression.
function parseUrlComponents(url) {
  const urlRegex = /^(https?):\/\/([a-zA-Z0-9.-]+)(:\d+)?(\/[^\s]*)?$/;
  const match = url.match(urlRegex);
  if (match) {
    return {
      protocol: match[1],
      hostname: match[2],
      port: match[3] ? match[3].substring(1) : undefined, // Remove leading ':'
      path: match[4] || '/' // Default path to '/' if not present
    };
  }
  return null;
}
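// Example usage:
console.log(parseUrlComponents("https://www.example.com:8080/path/to/page?query=1"));
// { protocol: 'https', hostname: 'www.example.com', port: '8080', path: '/path/to/page?query=1' }
console.log(parseUrlComponents("http://localhost/"));
// { protocol: 'http', hostname: 'localhost', port: undefined, path: '/' }
console.log(parseUrlComponents("ftp://example.com")); // null, as it only matches http/https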
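How it works: The `parseUrlComponents` function matches the URL against a regular expression with four capture groups: the protocol (`http` or `https`), the hostname (letters, digits, dots, and hyphens), an optional port, and an optional path. The port is returned as a string with its leading `:` removed, or `undefined` when absent. The path group captures everything from the first `/` to the end of the string, so it includes any query string and fragment, and it defaults to `/` when missing. Input that does not match, such as another scheme, a hostname with other characters, or a query string with no preceding path, returns `null`. The resulting object is convenient for tasks like simple routing or analytics on URLs in a known format.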
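For comparison, here is a minimal sketch that returns the same object shape but delegates parsing to the built-in `URL` class available in browsers and Node.js. The function name `parseUrlComponentsWithUrlApi` is hypothetical and not part of the original snippet; it assumes you want the same http/https restriction and the same "path includes query and fragment" behavior as the regex version.

```javascript
// Hypothetical helper (not part of the original snippet): same return shape
// as parseUrlComponents, but parsing is delegated to the built-in URL class.
function parseUrlComponentsWithUrlApi(url) {
  let parsed;
  try {
    parsed = new URL(url); // throws a TypeError on input it cannot parse
  } catch (err) {
    return null;
  }
  // Mirror the regex version's restriction to http and https.
  if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
    return null;
  }
  return {
    protocol: parsed.protocol.slice(0, -1), // drop the trailing ':'
    hostname: parsed.hostname,
    port: parsed.port || undefined, // '' when absent or the scheme's default port
    // Keep query string and fragment in the path, as the regex version does.
    path: parsed.pathname + parsed.search + parsed.hash
  };
}

console.log(parseUrlComponentsWithUrlApi("https://www.example.com:8080/path/to/page?query=1"));
console.log(parseUrlComponentsWithUrlApi("ftp://example.com")); // null
```

The two versions are not identical in every case. `URL` normalizes its input, so it lowercases the hostname and reports an empty port when the port is the scheme's default (for example `:443` on https), whereas the regex returns the text exactly as written.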