JAVASCRIPT

Validate and Extract URLs from Text

Discover a robust regular expression to validate URL formats and efficiently extract all valid URLs from a block of text in JavaScript, enhancing content processing for links.

function findAndValidateUrls(text) {
  const urlRegex = /(https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|www\.[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|https?:\/\/[a-zA-Z0-9]+\.[^\s]{2,}|[a-zA-Z0-9]+\.[^\s]{2,})/g;
  const urls = text.match(urlRegex) || [];
  return urls;
}

// Examples
const text = "Visit our website at https://www.example.com or check out http://blog.example.org. Also, try example.net for more info.";
console.log(findAndValidateUrls(text));
// Expected: ["https://www.example.com", "http://blog.example.org", "example.net"]
console.log(findAndValidateUrls("Invalid URL example. this is not a url")); // []
How it works: The `findAndValidateUrls` function utilizes a comprehensive regular expression to identify and extract URLs from a given string. The `urlRegex` pattern is designed to capture URLs starting with `http://`, `https://`, `www.`, or even just a domain name followed by a TLD. The `g` flag ensures that all occurrences are found, and `match()` returns an array of all matching URLs, or an empty array if none are found.

Need help integrating this into your project?

Our team of expert developers can help you build your custom application from scratch.

Hire DigitalCodeLabs